JuTISI (Jurnal Teknik Informatika dan Sistem Informasi)
Vol 1 No 2 (2015): JuTISI

Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Oscar Karnalim (Maranatha Christian University)



Article Info

Publish Date
30 Aug 2015

Abstract

Using byte code as the information source is a novel approach that enables a Java archive search engine to be built without relying on any resource other than the Java archive itself [1]. Unfortunately, its effectiveness is limited, since some relevant documents may not be retrieved because of vocabulary mismatch. In this research, a vector space model (VSM) is extended with semantic relatedness to overcome the vocabulary mismatch issue in a Java archive search engine. Aiming for the most effective retrieval model, several equation variants are also proposed and evaluated: summing up all related terms, substituting a non-existing term with its most related term, logarithmic normalization, context-specific relatedness, and ranking query-related documents as low-rank retrieved documents. In general, semantic relatedness improves recall at the cost of reduced precision. We also propose a scheme that takes advantage of relatedness without being affected by its disadvantage: VSM in which non-retrieved documents are treated as low-rank retrieved documents using semantic relatedness. This scheme ensures that documents matched only through relatedness are ranked lower than documents with a standard exact-match score. It yields 1.754% higher effectiveness than our standard VSM.
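The proposed scheme can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the actual equations, so the TF-IDF weighting, the cosine scoring, the `rank` function, and the toy `relatedness` table below are all assumptions made for illustration. The sketch shows only the ranking idea: documents with an exact-match VSM score come first, and documents that would otherwise not be retrieved are appended afterwards, ordered by their best semantic relatedness to the query.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build TF-IDF vectors (as dicts) for a list of tokenized documents.

    The idf formula log(n / df) + 1 is an assumed, commonly used variant.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / df[term]) + 1.0 for term in df}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, relatedness):
    """Return document indices: exact matches first, related-only after.

    `relatedness(q, d)` is a hypothetical function returning a score in
    [0, 1] for a query term q and a document term d.
    """
    vectors, idf = tf_idf_vectors(docs)
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    exact, fallback = [], []
    for i, (doc, vec) in enumerate(zip(docs, vectors)):
        score = cosine(q_vec, vec)
        if score > 0:
            exact.append((score, i))
            continue
        # Not retrieved by exact match: use the best term-to-term relatedness.
        best = max(
            (relatedness(q, d) for q in query for d in set(doc)),
            default=0.0,
        )
        if best > 0:
            fallback.append((best, i))
    # Sort each group by descending score, breaking ties by document index.
    exact.sort(key=lambda pair: (-pair[0], pair[1]))
    fallback.sort(key=lambda pair: (-pair[0], pair[1]))
    # Related-only documents always rank below every exact match.
    return [i for _, i in exact] + [i for _, i in fallback]


# Toy relatedness table (assumed values, for illustration only).
RELATED = {("zip", "archive"): 0.8}


def toy_relatedness(q, d):
    return RELATED.get((q, d), RELATED.get((d, q), 0.0))


docs = [["zip", "compress"], ["archive", "extract"], ["render", "image"]]
result = rank(["zip"], docs, toy_relatedness)
print(result)  # [0, 1]
```

In this toy run, document 0 matches the query term exactly, document 1 is recovered only because "archive" is related to "zip", and document 2 stays unretrieved. Keeping the two groups in separate lists, rather than mixing relatedness into the VSM score, is what guarantees that a relatedness-based match never outranks an exact match.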

Copyrights © 2015






Journal Info

Abbrev

jutisi

Publisher

Subject

Computer Science & IT

Description

Paper topics that can be included in JuTISI are as follows, but are not limited to: • Artificial Intelligence • Business Intelligence • Cloud & Grid Computing • Computer Networking & Security • Data Analytics • Datawarehouse & Datamining • Decision Support System • E-Systems (E-Gov, ...