Daniel Lemire's blog


Disambiguate words using Wikipedia

A common problem in information retrieval is that words are ambiguous. That is a fancy way of saying that you cannot tell the meaning of a word when you take it out of context. Some people claim that this problem must be solved by using the Semantic Web. I have long argued that the Semantic Web is more of a solution in search of a problem.

We already have some good strategies for disambiguation, but I have wondered recently why we can't use Wikipedia to disambiguate words. After all, Wikipedia knows the difference between Java (the island) and Java (the programming language). It turns out that Google has implemented and patented this very idea!

Bunescu, R. and Pasca, M., Using Encyclopedic Knowledge for Named Entity Disambiguation, EACL-06, 2006.
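
To make the idea concrete, here is a minimal sketch in Python. It is not the method from the paper above, just a plain word-overlap heuristic: pick the candidate sense whose article text shares the most words with the context. The `SENSES` dictionary, its hand-written article snippets, and the function names are all assumptions made up for illustration; a real system would pull candidate articles from Wikipedia itself.

```python
import re
from collections import Counter

# Hypothetical stand-ins for Wikipedia article text, keyed by article title.
# These snippets are hand-written for illustration, not taken from Wikipedia.
SENSES = {
    "Java (island)": (
        "Java is an island of Indonesia between Sumatra and Bali "
        "with volcanoes and the city of Jakarta"
    ),
    "Java (programming language)": (
        "Java is an object-oriented programming language whose "
        "compiled code runs on a virtual machine"
    ),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def disambiguate(context, senses=SENSES):
    """Return the sense title whose article shares the most words with context."""
    context_words = Counter(tokenize(context))

    def overlap(article):
        article_words = set(tokenize(article))
        # Count every context word occurrence that also appears in the article.
        return sum(n for word, n in context_words.items() if word in article_words)

    return max(senses, key=lambda title: overlap(senses[title]))


if __name__ == "__main__":
    print(disambiguate("I wrote a compiler for the Java virtual machine"))
    print(disambiguate("We flew to Jakarta and hiked volcanoes on Java"))
```

Common words such as "the" count toward the overlap here, so a serious version would at least drop stop words or weight rare words more heavily.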

See? Who needs RDF to disambiguate words?

(Source.)