Journal of Applied Data Sciences
Vol 7, No 2: May 2026

PRAKE: A Modified RAKE Model for Keyword Extraction in Accreditation Assessment Descriptions

Irmanda, Helena Nurramdhani
Hartati, Sri
Mulyana, Sri



Article Info

Publish Date
04 May 2026

Abstract

Study program accreditation requires aligning assessment criteria with the Self-Evaluation Sheet (LED), which is usually written as a lengthy and complex narrative. Finding relevant information in it requires a method that can automatically extract keywords from the assessment descriptions to represent the criteria. One option is Rapid Automatic Keyword Extraction (RAKE), a simple technique that works without labeled data. However, standard RAKE segments candidate phrases using stopwords as delimiters, which makes it less effective for complex sentences such as those found in accreditation assessment descriptions. Because a single sentence may contain several ideas, the extraction process should split, merge, or extend phrases according to their structure and meaning. To address this limitation, this study introduces PRAKE (Phrase-Refined RAKE), a modified RAKE algorithm that enhances candidate phrase extraction. The modifications are made at the Candidate Phrase Extraction stage through three techniques: Phrase Completion, which completes a short phrase by prepending the prefix of the preceding phrase; Phrase Restructuring, which rearranges phrases by merging or separating them based on structure and meaning; and Semantic Phrase Composition, which forms new phrases from different elements that are semantically related. Additionally, domain term weighting based on term frequency is integrated into the scoring calculation to strengthen the relevance of terms to the accreditation context. The model achieved a precision of 0.90, a recall of 0.83, and an F1-score of 0.85, each averaged across all 101 assessment descriptions evaluated in this study. The results demonstrate that PRAKE adapts better to accreditation terminology and improves keyword relevance and extraction efficiency. These findings indicate that PRAKE provides a foundation for automated evaluation and can be extended to cross-domain document analysis.
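For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal sketch of standard RAKE (stopword-delimited candidate phrases scored by word degree over word frequency), not the PRAKE method itself, whose three refinement techniques and domain term weighting are not specified in enough detail here to reproduce. The stopword list, the function names (`candidate_phrases`, `rake_scores`, `macro_prf`), and the sample sentence are all illustrative assumptions. The last function shows one plausible reading of "averaged across all 101 assessment descriptions": precision, recall, and F1 are computed per description and then averaged.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real RAKE uses a far larger one).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is",
             "of", "on", "that", "the", "to", "with"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text at punctuation and stopwords, as standard RAKE does."""
    phrases = []
    for fragment in re.split(r'[.,;:!?()\[\]"]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", fragment):
            if word in stopwords:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(phrases):
    """Score each phrase as the sum of degree(word) / frequency(word)."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurring words in the phrase, including the word itself.
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}


def macro_prf(pairs):
    """Average per-document precision, recall, and F1 over (predicted, gold) pairs.

    Assumes `pairs` is non-empty.
    """
    rows = []
    for predicted, gold in pairs:
        predicted, gold = set(predicted), set(gold)
        tp = len(predicted & gold)
        p = tp / len(predicted) if predicted else 0.0
        r = tp / len(gold) if gold else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        rows.append((p, r, f))
    return tuple(sum(col) / len(rows) for col in zip(*rows))


# Hypothetical sample sentence, not taken from the study's data.
sample = ("The curriculum is aligned with graduate learning outcomes "
          "and the vision of the study program.")
scores = rake_scores(candidate_phrases(sample))
```

In this sketch, the phrase "graduate learning outcomes" stays intact only because no stopword falls inside it; a phrase such as "vision of the study program" is cut at "of" and "the", which illustrates the segmentation limitation the abstract describes. Note also that under per-description averaging, the averaged F1 need not equal the harmonic mean of the averaged precision and recall.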

Copyright © 2026

Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...