Daniel Lemire's blog

Is the cosine similarity transitive?

1 min read

A simple enough similarity measure is the cosine similarity measure. It is used often in Information Retrieval and it works well. It is also quite simple: cos(v,w)=<v/|v|,w/|w|>. Clearly, it is reflexive (cos(v,v)=1) and symmetric (cos(v,w)=cos(w,v)). But it is also transitive: if cos(v,w) is…
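As an illustration of the formula cos(v,w)=<v/|v|,w/|w|>, here is a minimal Java sketch. The class name `CosineSimilarity` and the method `cosine` are hypothetical names chosen for this example, and the choice to reject zero vectors is an assumption, since the measure is undefined when |v| or |w| is 0.

```java
final class CosineSimilarity {
    private CosineSimilarity() {}

    /**
     * Computes <v/|v|, w/|w|>, written as dot(v,w) / (|v| * |w|).
     * Assumption: both arrays have the same length and neither is the zero vector.
     */
    static double cosine(double[] v, double[] w) {
        if (v.length != w.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normV = 0.0;
        double normW = 0.0;
        for (int i = 0; i < v.length; i++) {
            dot += v[i] * w[i];      // inner product <v,w>
            normV += v[i] * v[i];    // |v|^2
            normW += w[i] * w[i];    // |w|^2
        }
        if (normV == 0.0 || normW == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normV) * Math.sqrt(normW));
    }
}
```

Calling `CosineSimilarity.cosine(v, v)` should return 1 up to floating-point rounding, and swapping the two arguments should not change the result, which matches the reflexivity and symmetry noted above.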

Finally giving up on PDAs

2 min read

I was one of the early adopters of PDAs. I always had a pocket computer with me circa 1985. I have been a PalmOS user for about 7 years now. But the sorry state of the market, the much improved free online offerings (such as Google Calendar), and the wider availability of WiFi make PDAs less…

PDFView is dead, vive Skim!

1 min read

PDFView, my trusty MacOS PDF viewer, is dead. But fortunately, Skim comes to the rescue. Skim has pretty much the same features as PDFView. For example, you can tell it to automatically reload a PDF file when it changes on disk, which is a needed feature if you are going to use LaTeX seriously.…

Computing the Hamming distance between two strings in Java?

1 min read

Odd. I was looking this morning for some Java code to compute the Hamming distance between any two strings, and could not find any. There are plenty of code samples for the Hamming distance between integers, but I am really looking for something that can process String objects. Anyone knows…
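For reference, here is a minimal sketch of what such a function could look like. The class name `Hamming` and the method `distance` are hypothetical, and the sketch assumes the usual definition: both strings must have the same length, and the distance is the number of positions at which they differ. It compares Java `char` values (UTF-16 code units), so characters outside the Basic Multilingual Plane count as two positions.

```java
final class Hamming {
    private Hamming() {}

    /**
     * Counts the positions at which a and b hold different char values.
     * Assumption: the Hamming distance is only defined for equal-length strings.
     */
    static int distance(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("strings must have the same length");
        }
        int count = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                count++; // one more mismatching position
            }
        }
        return count;
    }
}
```

For example, `Hamming.distance("karolin", "kathrin")` returns 3, because the two strings differ at the third, fourth, and fifth characters.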

Slope One in Scala

1 min read

Steve Jenson (of Blogger fame) implemented the collaborative filtering algorithm Slope One in Scala. You can see his code online.

The Web is a distinct society

2 min read

It occurred to me lately that the Web forms a world of its own, with its own political views. It always strikes me how little government presence there is on the Web. In most Western economies, the government accounts for a large share of the economy (certainly above 20%). On the Web, I would say…

Babylon 5: The Lost Tales – Voices In The Dark

1 min read

We got the latest Babylon 5 DVD (The Lost Tales – Voices In The Dark). It is not a secret that I am a big scifi fan. Babylon 5 was a turning point in adult scifi. For this reason alone, I had to preorder the DVD. According to the Amazon web page, the DVD still ranks in 8th position with respect…

On the upcoming collapse of peer review

2 min read

I just read Is Peer Review in Decline? (PDF warning!) by Glenn Ellison. Here is how the author starts: Comparing the early 1990’s with the early 2000’s, there is a decline in the share of papers in the top general-interest journals (and the absolute number) written by faculty members from the…

Does anyone have a Nokia Tablet PC?

1 min read

I am thinking about buying a Nokia N800 Internet Tablet PC. This seems like a cool replacement for my Palm m505. There is a huge number of free applications for it. There are a few gotchas, no doubt, but I’d be interested in hearing from anyone who owns such a device.

Paper publications and climate change

1 min read

Oh! The irony. I am a member of SIAM. I can’t recall why I joined, probably to attend a conference. Anyhow, I am using the June 2007 edition of SIAM News as a mouse pad right now, and I can read on it “Mathematicians Confront Climate Change.” (In case you didn’t know, paper makes great…

Kasparov explains Russia

1 min read

Kasparov is arguably the greatest chess player in history. It takes a man with a healthy brain, lots of memory, and incredible determination to achieve such a mythical status. Unfortunately for Putin’s government, Kasparov is their opponent. Here is what Kasparov suggests we do to understand…

Eclipse Search Dialog box is killing me

1 min read

Eclipse is a great IDE. Up until I tried Eclipse, I thought that IDEs were for wannabe programmers. You know the type: why does programming have to be sooo hard, why can’t I just click and click? Well, programming is a design task and design is hard. So, just like there is no automated tool to…