Abstract
Word sense disambiguation (WSD) is one of the most difficult problems in artificial intelligence, often described as AI-hard or AI-complete. Many tasks can benefit from a word sense disambiguation approach, including sentiment analysis, machine translation, search engine relevance, coherence, anaphora resolution, and inference. This research addresses the WSD problem using two small corpora, which are developed using Word2vec and Wikipedia. Once the corpora are built, the similarity between a sentence and each corpus is measured using cosine similarity to determine the meaning of the ambiguous word. Finally, to improve accuracy, the Lesk algorithm and Wu-Palmer similarity are used to handle cases in which no word from the sentence appears in the corpus. The results show an accuracy rate of 85.51%, with semantic similarity improving the accuracy rate by 8.02% in determining the meaning of ambiguous words.
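The cosine-similarity step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bag-of-words representation, and the toy sense corpora for the word "bank" are all assumptions made for illustration, and the abstract does not specify how the real corpora are vectorized.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(sentence: str, sense_corpora: dict):
    """Return (best_sense, scores); best_sense is None when no corpus shares a word."""
    sentence_vec = Counter(sentence.lower().split())
    scores = {
        sense: cosine_similarity(sentence_vec, Counter(words))
        for sense, words in sense_corpora.items()
    }
    best = max(scores, key=scores.get)
    # A zero score means the sentence has no word in any corpus.
    return (best if scores[best] > 0 else None), scores


# Hypothetical toy corpora, one per sense of the ambiguous word "bank".
SENSE_CORPORA = {
    "bank_financial": ["money", "deposit", "loan", "account", "interest"],
    "bank_river": ["river", "water", "shore", "mud", "fishing"],
}

sense, scores = disambiguate(
    "He opened an account at the bank to deposit money", SENSE_CORPORA
)
print(sense, scores)
```

When `disambiguate` returns `None`, the sentence shares no word with either corpus. That is the case the abstract says is handled with the Lesk algorithm and Wu-Palmer similarity, which this sketch does not attempt to reproduce.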
| Original language | English |
|---|---|
| Pages (from-to) | 1239-1246 |
| Number of pages | 8 |
| Journal | Indonesian Journal of Electrical Engineering and Computer Science |
| Volume | 12 |
| Issue number | 3 |
| Publication status | Published - Dec 2018 |
Keywords
- Lesk
- Wikipedia
- Word sense disambiguation
- Word2vec
- Wu palmer
Fingerprint
Dive into the research topics of 'Developing corpora using word2vec and wikipedia for word sense disambiguation'. Together they form a unique fingerprint.