Terms, or tokens, are the main components of an information retrieval system. Their use as index and query terms affects the performance of the system. This research observes how a similarity thesaurus improves the performance of an Indonesian information retrieval system through query expansion. Using 30 sets of queries and 1,000 documents, a series of tests was conducted with different query term weights to measure the performance of the system before and after query expansion. The terms used in query expansion are determined using cosine as the similarity measure together with the weights of the query terms. Two treatments were used: taking the 5 (TH5) or the 10 (TH10) terms that have the highest similarity value with the query. Overall, query expansion improves the performance of the system compared to the system without query expansion (NoTH). However, the improvement also depends on the weight of the terms in the query. Across three experiments combined with NoTH, TH5, and TH10, the results show that idf is the better choice of query term weight for improving the performance of the system, both with and without query expansion.
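The expansion step described above can be illustrated with a minimal sketch. It is not the system used in this research. It assumes that each term is represented as a vector of raw term frequencies over the documents, that the similarity between a query and a candidate term is the idf-weighted sum of cosine similarities between each query term and the candidate, and that tokens are split on whitespace. The function names (`term_vectors`, `idf_weights`, `expand_query`) and the toy documents are hypothetical. Setting `k=5` or `k=10` would correspond to the TH5 and TH10 treatments.

```python
import math
from collections import Counter


def term_vectors(docs):
    """Map each term to a sparse vector {doc_id: term frequency}."""
    vectors = {}
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(text.lower().split()).items():
            vectors.setdefault(term, {})[doc_id] = tf
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def idf_weights(vectors, n_docs):
    """Assumed weighting: idf(t) = log(N / df(t))."""
    return {t: math.log(n_docs / len(v)) for t, v in vectors.items()}


def expand_query(query_terms, vectors, weights, k):
    """Return the k terms most similar to the query (query terms excluded)."""
    scores = {}
    for cand, cand_vec in vectors.items():
        if cand in query_terms:
            continue
        # Query-to-term similarity: weighted sum over the query terms.
        scores[cand] = sum(
            weights.get(q, 0.0) * cosine(vectors[q], cand_vec)
            for q in query_terms
            if q in vectors
        )
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy collection (hypothetical data, for illustration only).
docs = [
    "harga beras naik",
    "harga beras turun",
    "harga minyak naik",
    "cuaca hujan deras",
]
vectors = term_vectors(docs)
weights = idf_weights(vectors, len(docs))
print(expand_query(["beras"], vectors, weights, k=2))
```

The selected terms would then be appended to the original query before retrieval. How the expansion terms themselves are weighted in the expanded query is left open in this sketch.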
Copyrights © 2015