Aprilina Tarigan, Mida
Unknown Affiliation

Job Classification Based on Skills and Qualifications Using Natural Language Processing and Ensemble Learning Methods
Oktasia Nasution, Hafiza; Ramadhani, Dian; Aprilina Tarigan, Mida; Andreas, Prima; Suryati Ningsih, Dewita; Pramadewi, Arwinence
IT Journal Research and Development Vol. 10 No. 2 (2025)
Publisher : UIR PRESS

DOI: 10.25299/itjrd.2025.25550

Abstract

This study proposes a job classification framework that uses Natural Language Processing (NLP) and ensemble learning to classify job roles from their required skills and qualifications. A large-scale open-source dataset of 1,048,576 job postings was used, with attributes including job title, qualifications, skills, company profile, and role. Only the relevant attributes were retained: skills and qualifications as input features, and role as the target label. The data were filtered to three major job roles (Management, IT, and Digital), leaving 489,651 entries. Skills were extracted and standardized using GROK AI, then transformed with MultiLabelBinarizer into binary (one-hot style) indicator features. The XGBoost algorithm was applied for classification under five data split configurations (70:15:15, 80:10:10, 70:30, 80:20, and 90:10) with random_state=42 and evaluated with multi-class log loss. The 90:10 configuration achieved the highest accuracy (74.18%), followed by 80:20 (68.44%). These results indicate that ensemble learning can handle high-dimensional categorical job data and provide a foundation for automated job classification systems and labor market analysis.
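
The encoding, splitting, and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names (multi_hot, split_indices, multiclass_log_loss) and the toy skill lists are hypothetical, multi_hot is a plain-Python stand-in for what scikit-learn's MultiLabelBinarizer produces, and the XGBoost classifier itself is left out.

```python
import math
import random


def multi_hot(skill_lists):
    """Encode each posting's skills as a binary indicator row.

    Plain-Python stand-in for MultiLabelBinarizer: one column per
    distinct skill, columns in sorted order.
    """
    vocab = sorted({skill for skills in skill_lists for skill in skills})
    column = {skill: i for i, skill in enumerate(vocab)}
    rows = []
    for skills in skill_lists:
        row = [0] * len(vocab)
        for skill in skills:
            row[column[skill]] = 1
        rows.append(row)
    return vocab, rows


def split_indices(n, ratios, seed=42):
    """Shuffle range(n) with a fixed seed and cut it by ratios, e.g. (70, 15, 15)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    parts, start = [], 0
    for k, ratio in enumerate(ratios):
        # The last part takes the remainder so every index is used exactly once.
        end = n if k == len(ratios) - 1 else start + (n * ratio) // total
        parts.append(indices[start:end])
        start = end
    return parts


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        total -= math.log(min(max(row[label], eps), 1.0))
    return total / len(y_true)


# Hypothetical toy postings; the real dataset has 489,651 filtered entries.
toy_skills = [["python", "sql"], ["seo"], ["sql", "leadership"]]
vocab, features = multi_hot(toy_skills)

# The five split configurations listed in the abstract.
configs = [(70, 15, 15), (80, 10, 10), (70, 30), (80, 20), (90, 10)]
split_sizes = {c: [len(p) for p in split_indices(20, c)] for c in configs}
```

In a full pipeline, the indicator rows would be passed to a gradient-boosted classifier and the predicted class probabilities scored with the log loss function; the fixed seed mirrors the random_state=42 setting so that each split is reproducible.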