This study examines ten regional languages of Indonesia, namely Jambi Malay (jax), Kerinci (kvr), Minangkabau (min), Banjar (bjn), Mentawai (mwv), Sasak (sas), Javanese (jav), Toba (tob), Angkola (akb), and Mandailing (btm), together with Indonesian (ind) as the lingua franca. It addresses three questions: the lexical similarity among these languages, the genetic status of the regional languages, and the separation times among them. Data were collected through field observations at ten locations, with three informants per language, covering 257 glosses drawn from core (L1), nature (L2), general (L3), and cultural (L4) vocabulary. The analysis proceeded in three stages: first, synchronic lexical similarity was calculated using the Jaccard method; second, genetic relationships were analyzed through lexicostatistics based on the L1 and L2 vocabulary; third, glottochronology was used to estimate separation times among the regional languages. The results indicate that no language pair shows high similarity; most fall into the low-to-moderate category. Lexicostatistical analysis reveals that the core languages (jax, kvr, min, and bjn) and the peripheral languages (tob, akb, and btm) form distinct genetic families, while mwv is the most lexically isolated language. The separation-time estimates indicate that the core languages diverged more recently, whereas other languages such as mwv, jav, and tob show earlier divergence. These findings confirm that geographic proximity does not always correlate with linguistic relatedness and suggest that the classification of Indonesian languages in online databases needs revision, particularly the position of mwv, which should be reclassified as part of the Barrier Island language group rather than the Sumatran group. The study also highlights the importance of primary data in language documentation for producing a more accurate map of the linguistic evolution of regional languages in the archipelago.
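The two quantitative steps named in the abstract, Jaccard similarity and glottochronological dating, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the placeholder word forms, and the retention rate of 0.805 per millennium (a value commonly quoted for 200-item lists in the classical formulation) are assumptions made for illustration; the abstract does not state which constant or which cognate-coding procedure was used.

```python
import math


def jaccard_similarity(forms_a, forms_b):
    """Jaccard index of two collections of lexical forms: |A ∩ B| / |A ∪ B|."""
    a, b = set(forms_a), set(forms_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty lists
    return len(a & b) / len(union)


def separation_time(cognate_share, retention_rate=0.805):
    """Classical glottochronological estimate t = ln(C) / (2 ln(r)).

    cognate_share  -- proportion C of shared cognates, in (0, 1]
    retention_rate -- assumed retention r per millennium (hypothetical default)
    Returns the estimated separation time in millennia.
    """
    if not 0.0 < cognate_share <= 1.0:
        raise ValueError("cognate_share must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(cognate_share) / (2 * math.log(retention_rate))


# Placeholder forms only; these are not data from the study.
lang_x = ["form_1", "form_2", "form_3"]
lang_y = ["form_2", "form_3", "form_4"]

similarity = jaccard_similarity(lang_x, lang_y)
years = separation_time(0.70) * 1000
print(f"Jaccard similarity: {similarity:.2f}")
print(f"Estimated separation at 70% shared cognates: {years:.0f} years")
```

A lower shared-cognate proportion yields a larger estimated separation time, which is the relationship behind the contrast the abstract draws between the recently diverged core languages and the earlier-diverging mwv, jav, and tob.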
Copyright © 2026