Articles
Transfer Learning of Pre-trained Transformers for Covid-19 Hoax Detection in Indonesian Language
Lya Hulliyyatus Suadaa;
Ibnu Santoso;
Amanda Tabitha Bulan Panjaitan
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 15, No 3 (2021): July
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.
DOI: 10.22146/ijccs.66205
The internet has become the most popular source of news. However, the validity of online news articles is difficult to assess, that is, whether an article is fact or hoax. Hoaxes related to Covid-19 have had a problematic effect on human life. An accurate hoax detection system is important for filtering the abundant information on the internet. In this research, a Covid-19 hoax detection system was proposed based on transfer learning of pre-trained transformer models. Fine-tuned original pre-trained BERT, multilingual pre-trained mBERT, and monolingual pre-trained IndoBERT were used to solve the classification task in the hoax detection system. Based on the experimental results, fine-tuned IndoBERT models trained on a monolingual Indonesian corpus outperform fine-tuned original and multilingual BERT with uncased versions. However, the fine-tuned mBERT cased model trained on a larger corpus achieved the best performance.
Cluster-Based Measurement of Document Similarity (Pengukuran Tingkat Kemiripan Dokumen Berbasis Cluster)
Ibnu Santoso;
Lya Hulliyyatus Suadaa
KLIK- KUMPULAN JURNAL ILMU KOMPUTER Vol 6, No 1 (2019)
Publisher : Lambung Mangkurat University
DOI: 10.20527/klik.v6i1.181
Document similarity can be measured and used to discover similar documents in a document collection (corpus). In a small corpus, measuring document similarity is not a problem. In a larger corpus, comparing similarity between documents can be time-consuming. A clustering method can reduce the number of documents that must be compared with a given document, which saves time. This research aims to determine the effect of clustering on document similarity measurement and to evaluate its performance. The corpus consisted of 2,049 undergraduate theses by Politeknik Statistika STIS students from 2007 to 2016. The documents were represented with a bag-of-words model and clustered using the k-means method. Similarity was measured with cosine similarity. In the simulation, clustering into 3 clusters needs a longer preparation time (17.32%) but results in faster query processing (77.88%), with an accuracy of 0.98. Clustering into 5 clusters needs a longer preparation time (31.10%) but results in faster query processing (83.79%), with an accuracy of 0.86. Clustering into 7 clusters needs a longer preparation time (45.10%) but results in faster query processing (85.30%), with an accuracy of 0.98.
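The idea of comparing a query only against documents in its nearest cluster can be illustrated with a minimal sketch. It assumes whitespace tokenization and raw term counts, and it hard-codes the cluster assignment instead of running k-means, purely to stay short. The names `CLUSTERS` and `most_similar` and the sample documents are hypothetical and do not come from the paper.

```python
import math
from collections import Counter

# Hypothetical, hand-assigned clusters (the paper derives clusters with k-means).
CLUSTERS = {
    "regression": [
        "regression analysis of poverty data",
        "panel regression of regional poverty rates in java",
    ],
    "web": [
        "web based information system design",
        "mobile web information system for survey",
    ],
}

def bag_of_words(text):
    """Count term frequencies after lowercasing and whitespace splitting."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(vectors):
    """Average term-weight vector of the documents in one cluster."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {term: count / len(vectors) for term, count in total.items()}

def most_similar(query, clusters):
    """Find the nearest centroid, then search only that cluster's documents."""
    q = bag_of_words(query)
    vectors = {name: [bag_of_words(d) for d in docs] for name, docs in clusters.items()}
    nearest = max(vectors, key=lambda name: cosine(q, centroid(vectors[name])))
    scored = [(cosine(q, v), d) for v, d in zip(vectors[nearest], clusters[nearest])]
    score, doc = max(scored)
    return nearest, doc, score

name, doc, score = most_similar("poverty regression model", CLUSTERS)
print(name, "|", doc, "|", round(score, 3))
```

The query is compared with two centroids and then with two documents, rather than with all four documents. The saving grows with corpus size, at the cost of missing similar documents that fall in another cluster.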
Hoax Detection in Indonesian-Language News about COVID-19 (Deteksi Hoaks Pada Berita Berbahasa Indonesia Seputar COVID-19)
Amanda Tabitha Bulan Panjaitan;
Ibnu Santoso
FORMAT Vol 10, No 1 (2021)
Publisher : Universitas Mercu Buana
DOI: 10.22441/format.2021.v10.i1.007
Advances in technology bring many conveniences to users, but they also accelerate the spread of fake news on the internet. Fake news, also known as hoaxes, is misleading and dangerous information because it distorts people's perception by presenting false information as truth. A hoax may aim to influence readers with false information so that they act in line with its content. An intelligent system is therefore needed that can quickly classify news spreading over the internet so that it does not mislead readers. This research began by scraping news articles that had already been labeled as hoax or valid. The dataset was split into training data and test data. Pre-processing consisted of case folding, tokenizing, filtering, and stemming. The study compared the effect of applying feature engineering. The accuracy results show that applying feature engineering improved the accuracy of all five classification methods. The random forest method with feature engineering achieved an accuracy of 96.05%.
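The four pre-processing steps named in the abstract (case folding, tokenizing, filtering, and stemming) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the stopword list is a tiny hypothetical sample, and the stemmer is a naive suffix stripper standing in for a real Indonesian stemmer, which the abstract does not name.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}

# Naive stand-in for a real Indonesian stemmer: strips a few suffixes only.
SUFFIXES = ("kan", "nya", "an")

def case_fold(text):
    """Case folding: convert all characters to lowercase."""
    return text.lower()

def tokenize(text):
    """Tokenizing: keep runs of letters and digits, drop punctuation."""
    return re.findall(r"[a-z0-9]+", text)

def filter_stopwords(tokens):
    """Filtering: remove tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

def stem(token):
    """Stemming (naive): strip one known suffix if at least 4 letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run the four steps in the order given in the abstract."""
    tokens = filter_stopwords(tokenize(case_fold(text)))
    return [stem(t) for t in tokens]

print(preprocess("Berita ini menyebarkan informasi yang salah"))
# ['berita', 'menyebar', 'informasi', 'salah']
```

The resulting token lists would then feed a feature extraction step and a classifier such as random forest. Those stages are omitted here because the abstract does not describe their configuration.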
Application of Named Entity Recognition via Twitter on SpaCy in Indonesian (Case Study : Power Failure in the Special Region of Yogyakarta)
Rizka Maulida Yanti;
Ibnu Santoso;
Lya Hulliyyatus Suadaa
Indonesian Journal of Information Systems Vol. 4 No. 1 (2021): August 2021
Publisher : Program Studi Sistem Informasi Universitas Atma Jaya Yogyakarta
DOI: 10.24002/ijis.v4i1.4677
SpaCy is a tool that can efficiently handle Natural Language Processing (NLP) problems, one of which is Named Entity Recognition (NER). NER is used to extract and identify named entities in a text. However, SpaCy has not yet officially released a pre-trained NER model for Indonesian. Meanwhile, according to the 2019 PLN statistical report, the Province of D.I. Yogyakarta often experiences power failures, and many public complaints about power failures in the province are found on Twitter. There is no research yet on extracting information related to electrical disturbances, and research on NER using SpaCy for Indonesian is still rare. This study therefore extracts information related to power failures in the Province of D.I. Yogyakarta from Twitter using SpaCy for Indonesian. The study achieves good performance, with 95.52% precision, 93.27% recall, and a 94.38% F1-score. Mapping is then carried out based on the location entities contained in tweets related to electrical disturbances. This process found that the location mentioned most often in tweets related to power failures was Sleman Regency, while the location mentioned least often was Gunung Kidul Regency. The month with the most power failures was March 2020, while the month with the fewest was July 2020.
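The reported F1-score is consistent with the reported precision and recall, since F1 is their harmonic mean. A short check, using only the figures from the abstract (the function name `f1_score` is chosen here for illustration):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
print(round(f1_score(95.52, 93.27), 2))  # 94.38
```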
Sentiment Analysis of Badan Pusat Statistik Based on Online Media (Analisis Sentimen Badan Pusat Statistik Berdasarkan Media Online)
Rizki Adriansah;
Ibnu Santoso
Seminar Nasional Official Statistics Vol 2020 No 1 (2020): Seminar Nasional Official Statistics 2020
Publisher : Politeknik Statistika STIS
DOI: 10.34123/semnasoffstat.v2020i1.359
Real-time information for evaluating the work produced by government institutions is still scarce. One example is the evaluation of the statistical figures released by BPS. A technique is therefore needed to address this problem. Using news articles in combination with technology can help overcome the long time gap between publication and evaluation. A total of 1,410 news articles were collected: 699 from detik.com, 323 from kompas.com, and 388 from tempo.co. The sentiment analysis results show that the statistical figures released by BPS that are listed in the 2019 BRS release schedule are, in general, fairly good. The results also show that news can be used as material for assessing the statistical figures published by BPS, because the findings are quite consistent with actual conditions.
Job Vacancy Analysis (Analisis Lowongan Pekerjaan)
Eka Majida Agustyani;
Ibnu Santoso
Seminar Nasional Official Statistics Vol 2020 No 1 (2020): Seminar Nasional Official Statistics 2020
Publisher : Politeknik Statistika STIS
DOI: 10.34123/semnasoffstat.v2020i1.362
Job vacancies should provide information that helps job seekers, especially those who have just finished their education, because some of them have no career plan and need an overview of the vacancies that may be available. However, job seekers are sometimes confused instead, because the vacancies do not include complete information. Jobstreet is the most-accessed job vacancy portal in Indonesia and is dominated by vacancies from DKI Jakarta Province. This research aims to obtain an overview of the characteristics of Jobstreet job openings located in DKI Jakarta Province. The analysis is supported by clustering the vacancies, and the discussion focuses on the most sought-after educational backgrounds. Clustering was performed with the Hierarchical Agglomerative Clustering method and produced 5 clusters. Many vacancies on Jobstreet seek applicants with an Information Systems educational background, a bachelor's degree (S1), and at least 1 year of experience. Many of the advertising companies operate in financial services and are located in South Jakarta. Vacancies that require an Information Systems background mostly come from cluster 2 and require at least a bachelor's degree and 1 year of work experience, with companies operating in the field of Informatics Engineering and located in South Jakarta.
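Hierarchical Agglomerative Clustering starts with every item in its own cluster and repeatedly merges the two closest clusters. The sketch below assumes average linkage and Euclidean distance, neither of which is stated in the abstract, and uses hypothetical two-feature vacancies encoded as (minimum years of experience, education level).

```python
import math

def average_linkage(a, b):
    """Mean pairwise Euclidean distance between two clusters of points."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def agglomerate(points, n_clusters):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs,
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

# Hypothetical vacancies: (minimum years of experience, education level code).
vacancies = [(0, 1), (1, 1), (5, 3), (6, 3), (10, 5)]
for cluster in agglomerate(vacancies, 3):
    print(cluster)
```

Real vacancy data mixes categorical fields such as education and industry, so a practical analysis would need an encoding and a distance measure suited to that data.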
Development of a Knowledge Management System for Politeknik Statistika STIS Students (Pembangunan Knowledge Management System Mahasiswa Politeknik Statistika STIS)
Annisa Adytia Putri;
Ibnu Santoso
Seminar Nasional Official Statistics Vol 2022 No 1 (2022): Seminar Nasional Official Statistics 2022
Publisher : Politeknik Statistika STIS
DOI: 10.34123/semnasoffstat.v2022i1.1482
A Knowledge Management System (KMS) is a system for managing a collection of knowledge, from identification to dissemination. Universities can use a KMS to manage knowledge, facilitate access to information, and improve administrative processes. This research focuses on the development of a web-based KMS for Polstat STIS students using the KMSLC development method. The knowledge covered is non-academic knowledge at Polstat STIS. A total of 48 knowledge items were collected and verified. The KMS was evaluated with black-box testing, PSSUQ, Webqual 4.0, and a test of the search feature. The PSSUQ score for overall satisfaction is 1.8; the closer the score is to one, the better the result. In the Webqual evaluation, 98.7% of respondents agreed with the quality of knowledge. For the search feature, 99.1% of respondents agreed with the two statements in the search feature test. Based on these results, the KMS that was built achieved the purposes of this research.
Development of a Website-Based Official Travel Letter Information System (Pembangunan Sistem Informasi Surat Perjalanan Dinas Berbasis Website)
Nugroho Purnomo Aji;
Ibnu Santoso
Seminar Nasional Official Statistics Vol 2022 No 1 (2022): Seminar Nasional Official Statistics 2022
Publisher : Politeknik Statistika STIS
DOI: 10.34123/semnasoffstat.v2022i1.1517
Official travel is a trip made by an employee of an agency in connection with official work assignments. An official trip requires an assignment order and an official travel letter, which serve as an introduction when the employee travels on official business within or outside the city. Currently, BPS Sragen Regency still faces several obstacles in preparing and storing data related to official travel letters: letters take a long time to prepare, data on letters and report files is not recorded or stored, and official travel expenses are not recapitulated. These obstacles hamper the preparation of accountability reports for official travel. To address these problems, an information system integrated with a database was created so that the creation, recording, and storage of data related to official travel letters is well organized. Based on the Software Requirements Specification (SRS), the system can create and record letter data and provides a total cost recap. The system was tested using the black-box method, which showed that all of its functions run well. The average time needed to make a letter of assignment was 50.24 over 10 trials. The SUS test score is 83.5, which indicates that the system functions properly.
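For context on the SUS figure, the standard System Usability Scale scoring converts ten answers on a 1 to 5 scale into a score from 0 to 100: odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5. A minimal sketch follows; the sample answers are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1..5) on a 0..100 scale."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs ten answers between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:   # items 1, 3, 5, 7, 9 are positively worded
            total += answer - 1
        else:                # items 2, 4, 6, 8, 10 are negatively worded
            total += 5 - answer
    return total * 2.5

# Hypothetical respondents; a study-level score is the mean over respondents.
respondents = [[4, 2] * 5, [5, 1] * 5]
scores = [sus_score(r) for r in respondents]
print(scores, sum(scores) / len(scores))  # [75.0, 100.0] 87.5
```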
Development of a Mobile Web-Based E-Canteen Information System at Politeknik Statistika STIS (Pembangunan Sistem Informasi E-Canteen Berbasis Web Mobile di Politeknik Statistika STIS)
Erik Rihendri Candra Adifa;
Ibnu Santoso
Seminar Nasional Official Statistics Vol 2022 No 1 (2022): Seminar Nasional Official Statistics 2022
Publisher : Politeknik Statistika STIS
DOI: 10.34123/semnasoffstat.v2022i1.1519
Common problems in the Politeknik Statistika STIS canteen include long queue times, limited food stock, and the need for accurate sales recapitulation. The government's push to digitize MSMEs gives Politeknik Statistika STIS an opportunity to digitize the canteen system. The system that was built lets buyers check menu availability and order a meal before arriving at the canteen. Buyers can also provide ratings and suggestions as an evaluation. Sellers benefit as well, from automatic recapitulation and sales statistics. The system was developed with the System Development Life Cycle (SDLC) Waterfall approach. The black-box test results show that all functions run as expected. The System Usability Scale (SUS) score is 74.64, which indicates that the E-Canteen information system is well received. The Post-Study System Usability Questionnaire (PSSUQ) test also showed good results.
Development of a Web-Based Field Work Practice Information System (Pembangunan Sistem Informasi Praktik Kerja Lapangan Berbasis Web)
Viona Febriana;
Ibnu Santoso
Seminar Nasional Official Statistics Vol 2022 No 1 (2022): Seminar Nasional Official Statistics 2022
Publisher : Politeknik Statistika STIS
DOI: 10.34123/semnasoffstat.v2022i1.1526
BPS Malang City organizes field work practice (Praktik Kerja Lapangan, PKL) activities with a limit of 16 participants per day. Registration, attendance, and the recording of participants' daily activity reports are still done conventionally, which makes the registration process inefficient. In addition, there are errors in calculating attendance, and the daily activity reports are not well documented. So that PKL activities can run effectively and efficiently in line with current technological developments, the researchers plan to build a web-based PKL information system that provides registration information, attendance, and daily activity reports. The method used is the System Development Life Cycle (SDLC) waterfall model, with the CodeIgniter 4 framework. The black-box test results show that all system features run as expected. The final usability score on the System Usability Scale (SUS) is 74.791, which places the PKL information system built at BPS Malang in the Acceptable category and indicates that it is well received by users.