Computer Science and Information Technologies
Vol 4, No 2: July 2023

The observed preprocessing strategies for doing automatic text summarizing

Muhammad Farhan Juna (Universitas Amikom Yogyakarta)
Mardhiya Hayaty (Universitas Amikom Yogyakarta)



Article Info

Publish Date
01 Jul 2023

Abstract

The explosive growth of digital information makes it difficult for people to keep up with what is being produced. Automatic text summarization addresses this by extracting the meaningful information from a written document. This study asks how much preprocessing affects the quality of automatically generated summaries. To answer that question, an IndoBERT model is applied across 16 experimental settings, each using a different combination of preprocessing techniques: data cleaning, stopword removal, stemming, and case folding. The results are measured with Recall-Oriented Understudy for Gisting Evaluation (ROUGE). The findings show that the best performance is achieved by combining data cleaning and case folding, with scores of 0.78, 0.60, and 0.68 for ROUGE-1, ROUGE-2, and ROUGE-L, respectively.
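The experimental design described in the abstract can be illustrated with a minimal sketch. It assumes the 16 settings correspond to every on/off combination of the four techniques (2^4 = 16), which the abstract implies but does not state. The function names, the tiny stopword list, the toy suffix-stripping "stemmer", and the simplified ROUGE-N routine are all hypothetical stand-ins, not the authors' implementation; the IndoBERT summarization step is omitted entirely.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"yang", "dan", "di"}

def clean(text):
    # Data cleaning: replace non-word, non-space characters with spaces,
    # then collapse runs of whitespace.
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", " ", text)).strip()

def remove_stopwords(text):
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)

def stem(text):
    # Toy placeholder, NOT a real Indonesian stemmer: strips a trailing "nya".
    return " ".join(w[:-3] if len(w) > 5 and w.endswith("nya") else w
                    for w in text.split())

def case_fold(text):
    return text.lower()

# Fixed application order (an assumption; the abstract does not give one).
STEPS = [("cleaning", clean), ("stopwords", remove_stopwords),
         ("stemming", stem), ("case_folding", case_fold)]

def all_settings():
    # Every subset of the four techniques, including "no preprocessing".
    names = [name for name, _ in STEPS]
    return [combo for r in range(len(names) + 1)
            for combo in combinations(names, r)]

def preprocess(text, setting):
    # Apply only the steps enabled in this setting, in STEPS order.
    for name, fn in STEPS:
        if name in setting:
            text = fn(text)
    return text

def rouge_n(candidate, reference, n=1):
    # Simplified ROUGE-N: clipped n-gram overlap on whitespace tokens.
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate.split()), ngrams(reference.split())
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

if __name__ == "__main__":
    for setting in all_settings():
        print(setting or ("none",), "->", preprocess("Halo, Dunia!", setting))
```

In a full experiment, each setting would be applied to the corpus before summarization, and the generated summaries scored against reference summaries. The abstract does not say whether its reported ROUGE values are recall, precision, or F1, so the sketch returns all three.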

Copyrights © 2023






Journal Info

Abbrev

csit

Publisher

Subject

Computer Science & IT Engineering

Description

Computer Science and Information Technologies (ISSN 2722-323X, e-ISSN 2722-3221) is an open-access, peer-reviewed international journal that publishes original research articles, review papers, and short communications that will have an immediate impact on ongoing research in all areas of Computer ...