Articles

Found 36 Documents
Rhetorical Sentences Classification Based on Section Class and Title of Paper for Experimental Technical Papers Afrida Helen; Ayu Purwarianti; Dwi H. Widyantoro
Journal of ICT Research and Applications Vol. 9 No. 3 (2015)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2015.9.3.5

Abstract

Rhetorical sentence classification is an interesting approach for making extractive summaries, but this technique still needs to be developed because the performance of automatic rhetorical sentence classification is still poor. Rhetorical sentences are sentences that contain rhetorical words or phrases. Rhetorical sentences appear not only in the body of a paper but also in its title. In this study, features related to the section class and title class proposed in previous research were further developed. Our method uses a different technique for automatic section class extraction, for which we introduce new format-based features. Furthermore, we propose automatic rhetorical phrase extraction from the title. The corpus used was a collection of technical-experimental scientific papers. Our method uses the Support Vector Machine (SVM) and Naïve Bayes algorithms for classification. The four categories used were: Problem, Method, Data, and Result. It was hypothesized that these features would improve classification accuracy compared to previous methods. The F-measure for these categories reached up to 14%.
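As a rough illustration of the classification setup described above, the following Python sketch trains SVM and Naïve Bayes classifiers on the four rhetorical categories. The sentences, labels, and TF-IDF features are invented stand-ins, not the section-class and title features used in the paper.

# Hedged sketch: SVM and Naive Bayes for rhetorical sentence classification.
# The toy sentences, labels, and TF-IDF features are illustrative stand-ins,
# not the section-class / title features described in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

sentences = [
    "The main problem addressed is low recall in sentence classification.",
    "We propose a feature extraction method based on section headings.",
    "The corpus consists of 100 experimental technical papers.",
    "The proposed features improve the F-measure on all categories.",
]
labels = ["Problem", "Method", "Data", "Result"]  # the four rhetorical categories

for name, clf in [("SVM", LinearSVC()), ("Naive Bayes", MultinomialNB())]:
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(sentences, labels)                      # train on labelled sentences
    pred = model.predict(["We introduce new format-based features."])
    print(name, "->", pred[0])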
Improvement of Fuzzy Geographically Weighted Clustering-Ant Colony Optimization Performance using Context-Based Clustering and CUDA Parallel Programming Nila Nurmala; Ayu Purwarianti
Journal of ICT Research and Applications Vol. 11 No. 1 (2017)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2017.11.1.2

Abstract

Geo-demographic analysis (GDA) is the study of population characteristics by geographical area. Fuzzy Geographically Weighted Clustering (FGWC) is an effective algorithm used in GDA. FGWC has previously been improved by integrating a metaheuristic algorithm, Ant Colony Optimization (ACO), as a global optimization tool to increase clustering accuracy in the initial stage of the FGWC algorithm. However, using ACO in FGWC increases the running time compared to the standard FGWC algorithm. In this paper, context-based clustering and CUDA parallel programming are proposed to improve the performance of the improved algorithm (FGWC-ACO). Context-based clustering is a method that focuses on grouping data based on certain conditions, while CUDA parallel programming uses the graphics processing unit (GPU) as a parallel processing tool. The Indonesian Population Census 2010 was used as the experimental dataset. It was shown that the proposed methods were able to improve the performance of FGWC-ACO without reducing the clustering quality of the original method. The clustering quality was evaluated using a clustering validity index.
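A minimal sketch of the context-based idea follows: cluster each context subset separately, here with a plain fuzzy c-means update standing in for the FGWC core. The geographic weighting, ACO initialisation, and CUDA kernels from the paper are omitted, and the data and "context" attribute are synthetic.

# Minimal sketch: context-based clustering with a plain fuzzy c-means stand-in.
# Geographic weighting, ACO, and the CUDA implementation are omitted.
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # fuzzy memberships sum to 1 per point
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)      # re-normalise memberships
    return centers, U

rng = np.random.default_rng(1)
X = rng.random((100, 3))                       # toy census-like features
context = rng.integers(0, 2, size=100)         # hypothetical context attribute (e.g. urban/rural)

for ctx in np.unique(context):                 # context-based clustering: one run per context
    centers, U = fuzzy_cmeans(X[context == ctx])
    print("context", ctx, "centers:\n", centers.round(3))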
Efficient Utilization of Dependency Pattern and Sequential Covering for Aspect Extraction Rule Learning Fariska Zakhralativa Ruskanda; Dwi Hendratmo Widyantoro; Ayu Purwarianti
Journal of ICT Research and Applications Vol. 14 No. 1 (2020)
Publisher : LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2020.14.1.4

Abstract

The use of dependency rules for the aspect extraction task in aspect-based sentiment analysis is a promising approach. One problem with this approach is incomplete rules. This paper presents an aspect extraction rule learning method that combines dependency rules with the Sequential Covering algorithm. Sequential Covering is known for constructing rules that increase the number of positive examples covered and decrease the number of negative ones. This property is vital to ensure that the rule set has high performance, though not necessarily high coverage, which is a characteristic of the aspect extraction task. To test the new method, four datasets from four product domains were used, along with three baselines: Double Propagation, Aspectator, and a previous work by the authors. The results show that the proposed approach performed better than the three baseline methods on the F-measure metric, with the highest F-measure value at 0.633.
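A skeleton of the Sequential Covering idea is sketched below: greedily keep the candidate rule that covers the most remaining positive examples while penalising negatives, remove the covered positives, and repeat. The candidate "dependency rules" here are placeholder predicates over invented triples, not the paper's actual pattern set.

# Skeleton of Sequential Covering over placeholder dependency-pattern rules.

def sequential_covering(candidates, positives, negatives, min_gain=1):
    learned, remaining = [], set(positives)
    while remaining:
        def score(rule):
            pos = sum(rule(x) for x in remaining)
            neg = sum(rule(x) for x in negatives)
            return pos - neg                     # reward covered positives, penalise negatives
        best = max(candidates, key=score)
        if score(best) < min_gain:               # stop when no rule helps enough
            break
        learned.append(best)
        remaining = {x for x in remaining if not best(x)}   # drop covered positives
    return learned

# Toy data: "examples" are (head_pos, dependent_pos, relation) triples.
positives = [("NN", "JJ", "amod"), ("NN", "VB", "nsubj")]
negatives = [("VB", "RB", "advmod")]
candidates = [
    lambda t: t[2] == "amod",                    # candidate rule 1
    lambda t: t[2] == "nsubj" and t[0] == "NN",  # candidate rule 2
    lambda t: t[2] == "advmod",                  # candidate rule 3 (covers only negatives)
]
print(len(sequential_covering(candidates, positives, negatives)), "rules learned")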
Detailed Analysis of Extrinsic Plagiarism Detection System Using Machine Learning Approach (Naive Bayes and SVM) Zakiy Firdaus Alfikri; Ayu Purwarianti
Indonesian Journal of Electrical Engineering and Computer Science Vol 12, No 11: November 2014
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v12.i11.pp7884-7894

Abstract

In this paper, we propose a detailed-analysis method for an extrinsic plagiarism detection system using a machine learning approach. We used Naive Bayes and Support Vector Machine (SVM) as the learning algorithms. The learning features used in the method are word similarity, fingerprint similarity, latent semantic analysis (LSA) similarity, and word pairs. These features were selected to capture information from state-of-the-art detailed-analysis methods (word similarity, fingerprinting, and LSA) and to integrate the strengths of each method in detecting plagiarism. Several experiments were conducted to test the performance of the proposed method on many cases of plagiarism. The experiments used test data containing cases of literal plagiarism, partial literal plagiarism, paraphrased plagiarism, plagiarism with changed sentence structure, and translated plagiarism, as well as cases of non-plagiarism on different topics and non-plagiarism on the same topic. The experiments using SVM showed an average accuracy of 92.86% (reaching 95.71% without the word similarity feature), while those using Naive Bayes showed an average accuracy of 54.29% (reaching 84.29% without the word pair features).
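A hedged sketch of the pair-feature idea follows: compute two of the described features (word-overlap similarity and LSA cosine similarity) per document pair and train an SVM on them. The fingerprint and word-pair features are omitted, and the labelled pairs are illustrative only.

# Hedged sketch: per-pair similarity features fed to an SVM plagiarism classifier.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import SVC

pairs = [
    ("the cat sat on the mat", "the cat sat on the mat", 1),          # literal copy
    ("the cat sat on the mat", "a feline rested on the rug", 1),      # paraphrase
    ("stock prices fell sharply", "the cat sat on the mat", 0),       # unrelated
    ("we evaluate three models", "results of three models compared", 0),
]

docs = [d for a, b, _ in pairs for d in (a, b)]
tfidf = TfidfVectorizer().fit(docs)
lsa = TruncatedSVD(n_components=2, random_state=0).fit(tfidf.transform(docs))

def features(a, b):
    wa, wb = set(a.split()), set(b.split())
    overlap = len(wa & wb) / len(wa | wb)                  # word-overlap (Jaccard) similarity
    va, vb = lsa.transform(tfidf.transform([a, b]))
    lsa_sim = cosine_similarity([va], [vb])[0, 0]          # LSA cosine similarity
    return [overlap, lsa_sim]

X = np.array([features(a, b) for a, b, _ in pairs])
y = [label for _, _, label in pairs]
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([features("the cat sat on a mat", "the cat sat on the mat")]))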
Sebuah Survey: Tingkat Kepercayaan Pengguna Terhadap Informasi di Sosial Media Titin Pramiyati; Iping Supriana; Ayu Purwarianti
Jurnal Sistem Informasi Vol 7, No 1 (2015)
Publisher : Universitas Sriwijaya

DOI: 10.36706/jsi.v7i1.1979

Abstract

Information trustworthiness can be assessed based on the confidence level (trust) in, or the reputation of, the source of information. Nowadays, most people use information derived from social media; however, finding a reliable source of information can be troublesome. This paper discusses the results of determining the level of trust in information presented on social media. The media used as sources of information in this research were Facebook, Google+, Twitter, and LinkedIn. This is a descriptive study, used to recognize the behavior of social media users with respect to the trust level of information sources. The respondents were divided into two groups, civilians and military officers, to obtain their opinions on which social media have trustworthy information. Data were gathered through a personally administered questionnaire distributed directly to the respondents, which is sufficient for a limited survey. The confidence level was measured using graphical and numerical measurements, complemented by a chi-squared hypothesis test. Based on the data analysis, Twitter and Google+ were found to be the most trusted sources of information. Keywords: information trust level; graphical measurement; numerical measurement; chi-squared hypothesis test
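For concreteness, the chi-squared test used in the survey analysis could be computed as below. The contingency table (respondent group by most-trusted platform) is entirely hypothetical; only the form of the test follows the abstract.

# Hedged sketch of the chi-squared hypothesis test; the counts are hypothetical.
from scipy.stats import chi2_contingency

#        Facebook  Google+  Twitter  LinkedIn
table = [
    [12,  25,  30,  8],    # civilian respondents (hypothetical counts)
    [10,  22,  28,  5],    # military respondents (hypothetical counts)
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.3f}, dof={dof}, p={p_value:.3f}")
# A large p-value would suggest no significant difference between the two groups'
# platform preferences; a small one would suggest the preferences differ.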
Tantangan dan Peluang pada Question Generation Wiwin Suwarningsih; Iping Supriana; Ayu Purwarianti
Jurnal Sistem Informasi Vol 6, No 2 (2014)
Publisher : Universitas Sriwijaya

DOI: 10.36706/jsi.v6i2.1718

Abstract

In this paper, we survey research on question generation (QG). QG is the task of generating natural-language questions from a sentence or text. We examine a conceptual outline of question generation consisting of three categories: syntax-based, semantic-based, and template-based. QG systems in the syntactic category often use semantic elements and vice versa, while template-based systems use some level of syntactic and/or semantic information. The final result of this survey is a review of challenges and opportunities for future research, namely: (a) challenges concerning lexical-semantic and syntactic issues, (b) the use of alternatives such as the Vauquois triangle and shallow parsers, and (c) syntactic representation with phrase-structure trees. Keywords: question generation; lexical; syntax; sentence transformation; Vauquois triangle
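As a toy illustration of the template-based category discussed in the survey, the sketch below pairs a regex pattern with a question template. It is a hypothetical example, not one of the surveyed systems, and it ignores syntactic and semantic analysis entirely.

# Toy template-based question generation: "<X> is <Y>."  ->  "What is <X>?"
import re

PATTERN = re.compile(r"^(?P<topic>.+?) is (?P<definition>.+)\.$")

def generate_question(sentence):
    match = PATTERN.match(sentence)
    return f"What is {match.group('topic')}?" if match else None

print(generate_question("Question generation is the task of generating questions from text."))
# -> What is Question generation?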
Monitoring Indonesian online news for COVID-19 event detection using deep learning Purnomo Husnul Khotimah; Andria Arisal; Andri Fachrur Rozie; Ekasari Nugraheni; Dianadewi Riswantini; Wiwin Suwarningsih; Devi Munandar; Ayu Purwarianti
International Journal of Electrical and Computer Engineering (IJECE) Vol 13, No 1: February 2023
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v13i1.pp957-971

Abstract

Even though coronavirus disease 2019 (COVID-19) vaccination has been carried out, preparedness for a possible next outbreak wave is still needed given new mutations and virus variants. A near real-time surveillance system is required to enable stakeholders, especially the public, to respond in a timely manner. Due to its hierarchical structure, epidemic reporting is usually slow, particularly when it crosses jurisdictional borders. This condition can lead to time gaps in public awareness of new and emerging infectious disease events. Online news is a potential source for COVID-19 monitoring because it reports almost every infectious disease incident globally. However, the news does not report only COVID-19 events but also various information related to COVID-19 topics, such as economic impacts, health tips, and others. We developed a framework for online news monitoring and applied sentence classification to news titles using deep learning to distinguish COVID-19 event news from non-event news. The classification results showed that fine-tuned bidirectional encoder representations from transformers (BERT) trained on Bahasa Indonesia achieved the highest performance (accuracy: 95.16%, precision: 94.71%, recall: 94.32%, F1-score: 94.51%). Interestingly, our framework was able to identify news reporting the new COVID strain from the United Kingdom (UK) as event news 13 days before Indonesian officials closed the border to foreigners.
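A hedged sketch of title classification with a BERT-style model via the Hugging Face transformers library is given below. The checkpoint name "indobenchmark/indobert-base-p1" is an assumption (a publicly available Indonesian BERT), not necessarily the one used in the paper, and the classification head here is untrained; real use would require fine-tuning on labelled titles.

# Hedged sketch: event vs non-event news-title classification with a BERT model.
# The model name is an assumption and the classification head is untrained.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "indobenchmark/indobert-base-p1"          # assumed Indonesian BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2)                          # 2 classes: event vs non-event news

titles = [
    "Kasus baru COVID-19 terdeteksi di Jakarta",       # hypothetical event-style title
    "Tips menjaga kesehatan selama pandemi",           # hypothetical non-event title
]
enc = tokenizer(titles, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits                       # shape: (batch, 2)
pred = logits.argmax(dim=-1)
print(pred.tolist())                                   # 0/1 class indices (untrained head)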
PENILAIAN ESAI JAWABAN BAHASA INDONESIA MENGGUNAKAN METODE SVM - LSA DENGAN FITUR GENERIK Rama Adhitia; Ayu Purwarianti
Jurnal Sistem Informasi Vol. 5 No. 1 (2009): Jurnal Sistem Informasi (Journal of Information System)
Publisher : Faculty of Computer Science Universitas Indonesia

DOI: 10.21609/jsi.v5i1.260

Abstract

This paper examines a solution to the problem of automatic essay answer scoring by combining a support vector machine (SVM) as an automatic text classification technique with latent semantic analysis (LSA) to handle synonymy and polysemy between index terms. Unlike typical essay scoring systems, which use index terms as features, the features used in the essay answer scoring process are generic features, which allow an essay scoring model to be tested on a variety of different questions. With these generic features, retraining is not needed when scoring essay answers to new questions. The features include the percentage of keyword occurrences, the similarity of the essay answer to a reference answer, the percentage of key ideas, the percentage of incorrect ideas, and the percentage of keyword synonyms. The test results show that the proposed method achieves higher scoring accuracy than other methods, such as SVM or LSA using index terms as machine learning features.
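A hedged sketch of the generic-feature idea follows: question-independent features such as the percentage of keywords present and the similarity to a reference answer, fed to an SVM. The keywords, reference answer, answers, and grade labels are invented, and the LSA-based synonym handling is omitted.

# Hedged sketch: generic features (keyword %, similarity to reference) + SVM scoring.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import SVC

reference = "photosynthesis converts light energy into chemical energy in plants"
keywords = {"photosynthesis", "light", "energy", "plants"}     # hypothetical key terms

def generic_features(answer):
    words = set(answer.lower().split())
    keyword_pct = len(words & keywords) / len(keywords)        # fraction of keywords present
    tfidf = TfidfVectorizer().fit_transform([answer, reference])
    similarity = cosine_similarity(tfidf[0], tfidf[1])[0, 0]   # similarity to reference answer
    return [keyword_pct, similarity]

answers = [
    "photosynthesis turns light energy into chemical energy in plants",
    "plants need water to grow",
]
grades = ["good", "poor"]                                      # invented grade labels

model = SVC(kernel="linear").fit([generic_features(a) for a in answers], grades)
print(model.predict([generic_features("light energy becomes chemical energy via photosynthesis")]))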
Tuning Hyperparameter pada Gradient Boosting untuk Klasifikasi Soal Cerita Otomatis Umi Laili Yuhana; Ayu Purwarianti; Imamah Imamah
JEPIN (Jurnal Edukasi dan Penelitian Informatika) Vol 8, No 1 (2022): Volume 8 No 1
Publisher : Program Studi Informatika

DOI: 10.26418/jp.v8i1.50506

Abstract

A test item is a set of questions designed to assess student learning. For humans, distinguishing an addition word problem from a subtraction word problem is very easy, but not for machines; a machine needs to learn to recognize whether a word problem involves addition or subtraction. The need for machines to recognize word problems typically arises when building e-learning systems. Based on this problem, a gradient boosting method was used to classify word problems. The target classes of the classification were addition, subtraction, multiplication, division, and mixed operations. The word problems were taken from mathematics textbooks for grades three to six of elementary school. Elementary school teachers labeled the word problems, and the labeled problems were used as the dataset for machine learning. The dataset was preprocessed, features were extracted using TF-IDF, and the data were then split into training and testing data using K-fold cross-validation with K = 5, 10, and 20. The performance of the gradient boosting method in classifying the math problems was measured using accuracy, obtained by comparing the labels predicted by the model with the labels assigned by the experts, i.e., the elementary school teachers. Based on experiments on 500 word problems, the best accuracy of 75.8% was obtained at K = 20 with gradient boosting hyperparameters n_estimators = 100, max_depth = 9, and learning rate = 0.15.
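A hedged sketch of the reported pipeline is shown below: TF-IDF features plus a gradient boosting classifier, evaluated with K-fold cross-validation. The toy word problems and labels are invented; only the hyperparameters (n_estimators = 100, max_depth = 9, learning_rate = 0.15) follow the values reported in the abstract, and a small K is used because the toy dataset is tiny.

# Hedged sketch: TF-IDF + gradient boosting for word-problem classification with K-fold CV.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

problems = [
    "Budi punya 3 apel lalu membeli 4 apel lagi, berapa apel Budi sekarang?",
    "Siti memiliki 10 permen dan memberikan 4 kepada adiknya, berapa sisanya?",
    "Ani membeli 2 kotak berisi 6 pensil, berapa jumlah pensil Ani?",
    "Sebuah kue dibagi rata kepada 4 anak dari 12 potong, berapa potong tiap anak?",
] * 5                                       # repeated toy data so K-fold has enough samples
labels = ["penjumlahan", "pengurangan", "perkalian", "pembagian"] * 5

model = make_pipeline(
    TfidfVectorizer(),
    GradientBoostingClassifier(n_estimators=100, max_depth=9, learning_rate=0.15),
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, problems, labels, cv=cv, scoring="accuracy")
print("mean accuracy:", round(scores.mean(), 3))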
Sistem Rekomendasi Lokasi Optimal dan Potensi Penghematan Energi Pemasangan PLTS Atap Berbasis AI di Pulau Jawa Aminuddin, Amir; Supanto, Supanto; Saputra, Hadi; Herawati, Neng Ayu; Purwarianti, Ayu; Utama, Nugraha Priya
Jurnal Infomedia: Teknik Informatika, Multimedia, dan Jaringan Vol 10, No 2 (2025): Jurnal Infomedia
Publisher : Politeknik Negeri Lhokseumawe

DOI: 10.30811/jim.v10i2.7219

Abstract

The transition to renewable energy in Indonesia demands a data-driven approach to determining optimal locations for rooftop solar photovoltaic (PLTS Atap) installations and to estimating their economic impact. This study develops an Artificial Intelligence (AI)-based recommendation system that integrates solar irradiance data from BMKG and electricity consumption data from PLN to support rooftop solar planning in Java. The approach uses three main machine learning methods: classification to evaluate customer suitability, regional segmentation using clustering algorithms, and regression to predict potential energy savings. Five classification algorithms were compared, with LightGBM showing the best performance (87% accuracy). The optimal regional segmentation was obtained with KMeans (silhouette score 0.5566). The most accurate energy-savings estimates were produced by an XGBoost regressor with a coefficient of determination (R²) of 0.9999. These results show that an integrative AI-based approach can provide accurate and actionable predictive information for rooftop solar promotion and investment strategies, while also giving customers quantitative estimates of the benefits. This study contributes significantly to the development of decision support systems for renewable energy based on spatial and consumption-behavior analytics.
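Two pieces of the described pipeline can be sketched as follows: KMeans regional segmentation evaluated with the silhouette score, and an XGBoost regressor for energy-savings estimation (this requires the xgboost package). The synthetic features below merely stand in for the BMKG irradiance and PLN consumption data; the hyperparameters are illustrative, not those of the study.

# Hedged sketch: KMeans segmentation + XGBoost regression on synthetic stand-in data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import r2_score, silhouette_score
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.random((300, 3))                       # stand-in features: [irradiance, consumption, tariff]
savings = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.1, 300)   # synthetic savings target

# Regional segmentation: KMeans clusters scored with the silhouette index.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print("silhouette:", round(silhouette_score(X, kmeans.labels_), 4))

# Energy-savings regression with XGBoost, evaluated by R^2.
X_train, X_test, y_train, y_test = train_test_split(X, savings, random_state=0)
reg = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1).fit(X_train, y_train)
print("R2:", round(r2_score(y_test, reg.predict(X_test)), 4))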