In written communication on social media, which emphasizes the speed of information dissemination, non-standard language often occurs at the level of sentences, clauses, phrases, and words. As a data source, social media exhibiting this phenomenon presents challenges for information extraction. Normalization of non-standard language into standard language begins with word normalization, in which non-standard words (NSW) are normalized to their standard forms (standard words, SW). Normalization using edit distance is limited by its static weighting of the mismatch, match, and gap values. In calculating the mismatch value, static weighting cannot assign a different weight to errors caused by incorrect keystrokes on the keyboard, especially on adjacent keys. To address this limitation, this research proposes a dynamic weighting method for mismatch weights. The result is a new dynamic weighting method, based on the position of keyboard keys, that can be used to normalize NSW using approximate string matching.
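As an illustration of the idea, the following is a minimal sketch of an edit distance whose mismatch weight depends on key position. It is not the paper's actual weighting scheme. It rests on several assumptions: a QWERTY layout, approximate row offsets, a distance threshold of 1.5 key widths for "adjacent", and hypothetical weights of 0.5 (adjacent keys), 1.0 (other mismatches), and 1.0 (gap). The function names `mismatch_cost`, `weighted_edit_distance`, and `normalize` are illustrative only.

```python
import math

# Assumed QWERTY letter rows; offsets approximate the physical row stagger.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
OFFSETS = [0.0, 0.25, 0.75]

# Map each letter to an approximate (row, column) position on the keyboard.
KEY_POS = {
    ch: (r, c + OFFSETS[r])
    for r, row in enumerate(ROWS)
    for c, ch in enumerate(row)
}


def mismatch_cost(a, b, near=0.5, far=1.0, threshold=1.5):
    """Hypothetical dynamic mismatch weight: cheaper for adjacent keys."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 0.0  # match
    if a not in KEY_POS or b not in KEY_POS:
        return far  # characters outside the layout get the static weight
    (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
    distance = math.hypot(r1 - r2, c1 - c2)
    return near if distance <= threshold else far


def weighted_edit_distance(s, t, gap=1.0):
    """Edit distance with a static gap weight and a dynamic mismatch weight."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * gap
    for j in range(1, n + 1):
        d[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + gap,  # deletion
                d[i][j - 1] + gap,  # insertion
                d[i - 1][j - 1] + mismatch_cost(s[i - 1], t[j - 1]),
            )
    return d[m][n]


def normalize(nsw, lexicon):
    """Return the standard word closest to the non-standard word."""
    return min(lexicon, key=lambda sw: weighted_edit_distance(nsw, sw))
```

With static weights, the typo "cay" is equally distant from "car" and "cat". Under the assumed dynamic weights, `normalize("cay", ["car", "cat"])` prefers "cat", because "y" is adjacent to "t" on the keyboard but not to "r".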
Copyrights © 2021