PREFERENCE BASED TERM WEIGHTING FOR ARABIC FIQH DOCUMENT RANKING
Khadijah Fahmi Hayati Holle; Agus Zainal Arifin; Diana Purwitasari
Jurnal Ilmu Komputer dan Informasi Vol 8, No 1 (2015): Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)
Publisher : Faculty of Computer Science - Universitas Indonesia

DOI: 10.21609/jiki.v8i1.283

Abstract

In document retrieval, besides the match between the query and the search results, a subjective user assessment is also expected to be a deciding factor in document ranking. This preference aspect arises when searching fiqh documents: people tend to prefer a certain fiqh methodology without rejecting the others. It is therefore necessary to investigate a preference factor in addition to the relevance factor in document ranking. This research proposes a preference-based term weighting method that ranks documents according to user preference. The proposed method, Inverse Preference Frequency with an α value (IPFα), is combined with term weighting based on the document index and the book index so that it accounts for both relevance and preference. In this method, the preference value is calculated by IPF term weighting, and the preference values of terms that match the query are then multiplied by α. Combining IPFα with the existing weighting methods yields TF.IDF.IBF.IPFα. The experiments use a dataset of several Arabic fiqh documents, and the evaluation uses recall, precision, and F-measure. The proposed term weighting method ranks the documents in the right order according to user preference, with recall reaching 75%, precision 100%, and F-measure 85.7%.
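The three reported figures are related: the F-measure is the harmonic mean of precision and recall. The sketch below assumes the balanced (F1) form, which the abstract does not state explicitly, and uses a hypothetical helper name `f_measure` to check that the reported precision and recall are consistent with the reported 85.7%.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
score = f_measure(precision=1.00, recall=0.75)
print(f"{score:.1%}")  # 85.7%
```

Because the harmonic mean is pulled toward the smaller value, the perfect precision cannot fully compensate for the 75% recall.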
COVERAGE, DIVERSITY, AND COHERENCE OPTIMIZATION FOR MULTI-DOCUMENT SUMMARIZATION
Khoirul Umam; Fidi Wincoko Putro; Gulpi Qorik Oktagalu Pratamasunu; Agus Zainal Arifin; Diana Purwitasari
Jurnal Ilmu Komputer dan Informasi Vol 8, No 1 (2015): Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)
Publisher : Faculty of Computer Science - Universitas Indonesia

DOI: 10.21609/jiki.v8i1.278

Abstract

A good summary of multiple documents on similar topics can help users obtain useful information. A good summary must have extensive coverage, minimal redundancy (high diversity), and smooth connections among sentences (high coherence). Therefore, multi-document summarization that considers the coverage, diversity, and coherence of the summary is needed. In this paper we propose a novel multi-document summarization method that simultaneously optimizes coverage, diversity, and coherence among the summary's sentences. It uses the self-adaptive differential evolution (SaDE) algorithm to solve the optimization problem. A sentence ordering algorithm based on a topical closeness approach is performed within the SaDE iterations to improve coherence among the summary's sentences. Experiments were performed on the Text Analysis Conference (TAC) 2008 data sets. The experimental results show that the proposed method generates summaries whose average coherence and ROUGE scores are, respectively, 29-41.2 times and 46.97-64.71% better than those of other methods that consider only coverage and diversity.
Weighting Optimization in Query Expansion with Term Relatedness to Query-Entropy based (TRQE) [Optimasi Pembobotan pada Query Expansion dengan Term Relatedness to Query-Entropy based (TRQE)]
Resti Ludviani; Khadijah F. Hayati; Agus Zainal Arifin; Diana Purwitasari
Jurnal Buana Informatika Vol. 6 No. 3 (2015): Jurnal Buana Informatika Volume 6 Nomor 3 Juli 2015
Publisher : Universitas Atma Jaya Yogyakarta

DOI: 10.24002/jbi.v6i3.433

Abstract

Selecting appropriate terms for expanding a query is very important in query expansion. Term selection optimization is therefore added to improve query expansion performance in a document retrieval system. This study proposes a new approach named Term Relatedness to Query-Entropy based (TRQE), which optimizes weighting in query expansion by considering semantic and statistical aspects of the relevance evaluation of pseudo feedback in order to improve document retrieval performance. The proposed method has three main modules: relevance feedback, pseudo feedback, and document retrieval. TRQE is implemented in the pseudo feedback module to optimize term weighting in query expansion. The evaluation shows that TRQE retrieves documents with a best result of 100% precision and 22.22% recall. TRQE-based weighting optimization of query expansion is shown to improve the relevance of document retrieval.
Keywords: TRQE, query expansion, term weighting, term relatedness to query, relevance feedback
Bot Spammer Detection on Twitter Based on Sentiment Analysis and Time Interval Entropy [Deteksi Bot Spammer pada Twitter Berbasis Sentiment Analysis dan Time Interval Entropy]
Christian Sri Kusuma Aditya; Mamluatul Hani’ah; Alif Akbar Fitrawan; Agus Zainal Arifin; Diana Purwitasari
Jurnal Buana Informatika Vol. 7 No. 3 (2016): Jurnal Buana Informatika Volume 7 Nomor 3 Juli 2016
Publisher : Universitas Atma Jaya Yogyakarta

DOI: 10.24002/jbi.v7i3.656

Abstract

Spam is an abuse of messaging that is unwanted by its recipients, and those who send spam are called spammers. The popularity of Twitter has attracted spammers to use it as a means of disseminating spam messages. Spam tweets are characterized by a neutral emotional sentiment, or no particular preference toward a perspective on the part of the user posting them. In addition, periodic regularity in tweeting behavior indicates automation performed by a bot. This study proposes a new method to differentiate between bot spammer and legitimate user accounts by integrating emotion-based sentiment analysis (SA) and time interval entropy (TIE). A combination of knowledge-based and machine learning-based approaches is used to classify tweets as having positive, negative, or neutral sentiment. Furthermore, the collection of timestamps is used to calculate the time interval entropy of each account. The results show that the precision and recall of the proposed method reach 83% and 91%, respectively. This shows that combining SA and TIE can optimize overall system performance in detecting bot spammers.
Keywords: bot spammer, twitter, sentiment analysis, polarity, entropy
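To illustrate why regular posting times point to automation, here is a minimal sketch of an entropy computed over the gaps between consecutive timestamps. The abstract does not give the authors' exact TIE formula, so the binning scheme, the `bin_seconds` parameter, and the function name `time_interval_entropy` are assumptions made for illustration only.

```python
import math
from collections import Counter


def time_interval_entropy(timestamps, bin_seconds=60):
    """Shannon entropy (in bits) of the binned gaps between consecutive posts.

    timestamps: posting times in seconds; bin_seconds is an assumed bin width.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return 0.0  # fewer than two posts: no intervals to measure
    bins = Counter(int(gap // bin_seconds) for gap in gaps)
    total = len(gaps)
    return sum(-(count / total) * math.log2(count / total)
               for count in bins.values())


# Hypothetical accounts: one posts every 10 minutes, one posts irregularly.
bot_like = [0, 600, 1200, 1800, 2400]
human_like = [0, 45, 900, 4000, 4100]

print(time_interval_entropy(bot_like))
print(time_interval_entropy(human_like))
```

In this sketch a perfectly periodic account puts every gap in the same bin and scores zero, while irregular gaps spread over several bins and score higher.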
Question-Answer Retrieval Using a Convolutional Neural Network on Religious Topics in Indonesian [Pencarian Question-Answer Menggunakan Convolutional Neural Network Pada Topik Agama Berbahasa Indonesia]
Rizqa Raaiqa Bintana; Chastine Fatichah; Diana Purwitasari
Ultimatics : Jurnal Teknik Informatika Vol 10 No 1 (2018): Ultimatics : Jurnal Teknik Informatika
Publisher : Faculty of Engineering and Informatics, Universitas Multimedia Nusantara

DOI: 10.31937/ti.v10i1.842

Abstract

Community-based question answering (CQA) helps people find the information they need through a community. When people cannot obtain the information they need in CQA, they post a new question, which can cause the CQA archive to grow with duplicated questions. It therefore becomes important to find questions in the CQA archive that are semantically similar to a new question. In this study, we use a convolutional neural network for semantic modeling of sentences to obtain words that represent the content of the documents and the new question. Finding questions semantically similar to a new question (query) in the question-answer document archive with the convolutional neural network yields a mean average precision of 0.422, whereas the vector space model, used as a comparison, yields a mean average precision of 0.282.
Index Terms: community-based question answering, convolutional neural network, question retrieval
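For readers unfamiliar with the metric, the following sketch shows one common way mean average precision is computed from ranked results. It assumes every relevant item appears somewhere in the ranking and uses made-up relevance flags; it is not the authors' evaluation script, and the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: 0/1 flags in rank order (1 = relevant result).
    Assumes all relevant items for the query appear in the ranking.
    """
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)  # precision at this relevant hit
    return sum(precisions) / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precision values."""
    return sum(average_precision(run) for run in runs) / len(runs)


# Hypothetical rankings for two queries.
runs = [[1, 0, 1], [0, 1]]
print(mean_average_precision(runs))
```

A higher value means relevant questions tend to appear nearer the top of each ranked list.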
ANALYSIS OF RAW MATERIAL INVENTORY PREDICTION FOR PLASTIC ORE USING A COMBINATION OF CAUSALITY AND TIME SERIES METHODS: A CASE STUDY IN A TEXTILE INDUSTRY COMPANY
Frangky Rawung; Agus Budi Raharjo; Diana Purwitasari
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 1 (2024): JUTIF Volume 5, Number 1, February 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.1.1809

Abstract

Raw material inventory is a valuable company asset in production activities. Inadequate or excessive availability can lead to production failures or cost wastage. This research aims to predict raw material inventory based on factors such as initial stock, receipts, usage, final stock, and differences in usage. A causality-based approach with Multiple Linear Regression (MLR) is used as the basis, complemented by a time series data approach that processes data trends using the Bidirectional Long Short-Term Memory (BiLSTM) algorithm. The prediction results from both models are then combined using the harmonic mean. This research utilizes a dataset of raw material inventory and applies the Root Mean Squared Error (RMSE) and R-squared (R²) performance parameters for model evaluation. The research is expected to provide useful information for companies in managing their raw material inventory and improving the efficiency of their production processes. Results show that, in the BiLSTM deep learning model, Polyethylene Terephthalate (PET) raw materials yielded an RMSE of 6.53 and an R² of 0.93. These results indicate that PET raw materials have a higher predictive value than other materials.
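The combination and evaluation steps can be sketched as follows. This assumes an element-wise harmonic mean of two positive prediction series and the standard definitions of RMSE and R²; the numbers are invented for illustration and the function names are hypothetical, not taken from the paper.

```python
import math


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two positive predictions."""
    return 2 * a * b / (a + b) if a + b != 0 else 0.0


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot if ss_tot else 0.0


# Hypothetical stock figures and model outputs (not from the paper).
actual = [105.0, 118.0, 92.0]
mlr_pred = [100.0, 120.0, 90.0]      # stand-in for the regression model
bilstm_pred = [110.0, 115.0, 95.0]   # stand-in for the BiLSTM model

combined = [harmonic_mean(m, b) for m, b in zip(mlr_pred, bilstm_pred)]
print(rmse(actual, combined), r_squared(actual, combined))
```

The harmonic mean always lies between the two inputs and leans toward the smaller one, so it is only meaningful here for positive quantities such as stock levels.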
A STUDY ON RANKING METHOD IN RETRIEVING WEB PAGES BASED ON CONTENT AND LINK ANALYSIS: COMBINATION OF FOURIER DOMAIN SCORING AND PAGERANK SCORING
Diana Purwitasari
JUTI: Jurnal Ilmiah Teknologi Informasi Vol 7, No 1, Januari 2008
Publisher : Department of Informatics, Institut Teknologi Sepuluh Nopember

DOI: 10.12962/j24068535.v7i1.a57

Abstract

The ranking module is an important component of the search process that sorts through relevant pages. Because a collection of Web pages carries additional information inherent in the hyperlink structure of the Web, that information can be represented as a link score and combined with the content score of conventional information retrieval techniques. In this paper we report our study of a ranking score for Web pages that combines link analysis (PageRank Scoring) and content analysis (Fourier Domain Scoring). Our experiments use a collection of Wikipedia pages related to the subject of statistics, with the objective of checking the correctness and evaluating the performance of the combined ranking method. The evaluation of PageRank Scoring shows that the highest score does not always relate to statistics. Because links within Wikipedia articles exist so that users are always one click away from more information on any linked point, it is possible that topics unrelated to statistics are frequently mentioned in the collection. The combined method shows that a link score given a proportional weight relative to the content score of Web pages does affect the retrieval results.
ANNOTATION GENERATING SYSTEM FOR ILLUSTRATED ARTICLES USING A CONTEXTUAL APPROACH [SISTEM PEMBANGKIT ANOTASI PADA ARTIKEL BERGAMBAR DENGAN PENDEKATAN KONTEKSTUAL]
Diana Purwitasari; Dian Saputra; Esti Yuniar; Umi Laili Yuhana; Daniel Siahaan
JUTI: Jurnal Ilmiah Teknologi Informasi Vol 9, No 1, Januari 2011
Publisher : Department of Informatics, Institut Teknologi Sepuluh Nopember

DOI: 10.12962/j24068535.v9i1.a64

Abstract

The development of e-learning sites and their materials makes it necessary to help users find the materials they want. A context-based search engine helps users with this task. However, that kind of search can only be done for learning materials that have been semantically marked up or annotated. Annotations are given for an article's content or for the images within the article. Providing annotations for learning articles manually faces many constraints, so a method for automatically generating metadata or annotations is needed. This paper discusses an annotation generating system with two subsystems: an annotation recommender for learning materials that uses contextual analysis, and an image metadata generator. The contextual analysis methods are Latent Semantic Analysis (LSA) and the use of the WordNet lexical dictionary. Our experimental results show that the subsystems can be used to generate annotations for articles and for the images in them, although we have not yet combined the two subsystems.
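DESIGN AND DEVELOPMENT OF AN AUTOMATIC NEWS RETRIEVAL APPLICATION USING XML-BASED CONTENT SYNDICATION ON THE MICROSOFT .NET PLATFORM [RANCANG BANGUN APLIKASI PENGAMBILAN BERITA SECARA OTOMATIS MENGGUNAKAN CONTENT SYNDICATION BERBASIS XML DENGAN PLATFORM MICROSOFT .NET]
Diana Purwitasari; Febriliyan Samopa; Ade Afrian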
JUTI: Jurnal Ilmiah Teknologi Informasi Vol 3, No 1 Januari 2004
Publisher : Department of Informatics, Institut Teknologi Sepuluh Nopember

DOI: 10.12962/j24068535.v3i1.a128

Abstract

The great demand for information on the internet pushes news site providers to always deliver the latest news. One alternative solution is content syndication. Content syndication is the process by which news content is sent or made available, usually for a fee, from news providers, usually called originators, to the market that needs it, or subscribers. RSS (Rich Site Summary) is the format generally used to do this. RSS is essentially a file located on a site that provides information about the site's content. The file is usually called an RSS feed and can be fetched and processed to obtain information about the site's content. An application was built to retrieve news sites automatically using content syndication; it requires a background process that fetches RSS feeds periodically on a computer acting as the server. The server that fetches news from provider sites consists of an application that manages the news configuration and a Windows service that fetches the RSS feeds and then processes them automatically. The application for reading news from the RSS server resides on the client as a plug-in component. The first test checked whether the application could successfully manage the schema configuration, table attributes, and the category settings of RSS provider sites. The second test compared the news search results obtained from the program with news from other sites that do not apply content syndication. The test results show that the application with content syndication is able to search for news and gives better results.
Keywords: Content Syndication, RSS, Windows Service, Band Object.
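The idea of processing an RSS feed to learn about a site's content can be sketched in a few lines. The paper's application runs on Microsoft .NET as a Windows service; the Python below is only an illustration of the parsing step, using the standard library's `xml.etree.ElementTree` on an invented RSS 2.0 sample, with a hypothetical `read_items` helper.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real service would fetch this text periodically.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>First headline</title><link>https://example.com/1</link></item>
    <item><title>Second headline</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def read_items(feed_xml: str):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


for title, link in read_items(SAMPLE_FEED):
    print(title, link)
```

A background process of the kind described above would run a step like this on a schedule and store the extracted items for the client component to read.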
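CATEGORIZATION OF INDONESIAN-LANGUAGE NEWS CONTENT USING A DECISION TREE-BASED SYMBOLIC RULE INDUCTION ALGORITHM [PENGKATEGORIAN ISI BERITA BERBAHASA INDONESIA MENGGUNAKAN ALGORITMA SYMBOLIC RULE INDUCTION BERBASIS DECISION TREE]
Yudhi Purwananto; Diana Purwitasari; Yos Nugroho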
JUTI: Jurnal Ilmiah Teknologi Informasi Vol 3, No 1 Januari 2004
Publisher : Department of Informatics, Institut Teknologi Sepuluh Nopember

DOI: 10.12962/j24068535.v3i1.a131

Abstract

Text categorization is very important for managing and retrieving the knowledge contained in text. Categorizing text manually consumes a great deal of time and money, so a system that can categorize text automatically is needed. This research attempts to categorize text using a decision tree-based symbolic rule induction algorithm. The categorization is performed on Indonesian-language news. From the news text, features relevant to each category are selected based on the Information Gain criterion. Using those features, a decision tree is built through an induction process. Pruning is performed to improve the accuracy of the decision tree. The next step generates rules that are logically equivalent to the decision tree by making use of the sibling criterion. The algorithm was tested using news data from the Detik site. The tests examined the effect of the number of features, the amount of data, and the maximum value of a feature on the F1 score and computation time. The test results show that increasing the number of features and the amount of training data tends to increase the F1 score.
Keywords: Text Categorization, DTree, Sibling Criterion, Decision Tree, Symbolic Rule Induction
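The feature selection step relies on Information Gain, which measures how much knowing whether a term occurs reduces uncertainty about the category. The sketch below uses the textbook definition with binary term presence; the labels, terms, and function names are invented, and the paper's exact feature representation may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of category labels."""
    total = len(labels)
    return sum(-(count / total) * math.log2(count / total)
               for count in Counter(labels).values())


def information_gain(has_term, labels):
    """Entropy reduction from splitting documents on presence of a term.

    has_term: one boolean per document; labels: one category per document.
    """
    total = len(labels)
    with_term = [lab for flag, lab in zip(has_term, labels) if flag]
    without_term = [lab for flag, lab in zip(has_term, labels) if not flag]
    remainder = sum(len(part) / total * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder


# Hypothetical corpus of four news items in two categories.
labels = ["sport", "sport", "politics", "politics"]
informative_term = [True, True, False, False]     # appears only in sport news
uninformative_term = [True, False, True, False]   # appears in both equally

print(information_gain(informative_term, labels))
print(information_gain(uninformative_term, labels))
```

Terms with higher gain separate the categories better, which is why they are preferred as decision tree features.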