Daniel Lemire's blog

Taporware: have fun with words

1 min read

This is great fun. Taporware: prototype of text analysis tools. Their “about” page is probably slightly obsolete, but the gist of it is there: TAPoRware is a set of text analysis tools that enables users to perform text analysis on HTML, XML and plain text files, using documents from the…

Academic Career Advice from Curt Bonk

1 min read

This must be the longest post I have read this year, but Curt Bonk wrote a brilliant post on how to get tenure. His advice is good. He clearly thought this through. The gist of it: Keep at it: fine-tune your papers, fine-tune them again, keep resubmitting them, draw beautiful pictures and diagrams.…

Duck Typing, MustIgnore and the Web

1 min read

Thanks in part to a private email from Stephen Downes, who seems to have enjoyed my post on Duck Typing and AI, I now see that the MustIgnore principle in XML, the Web mantra “Be strict in what you send, generous in what you accept”, and duck typing are the same idea. A very important idea…
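To make the parallel concrete, here is a minimal Python sketch. It is only an illustration, not code from the original post: the tag names, the `KNOWN_TAGS` vocabulary and the `read_entry` and `make_it_quack` helpers are all hypothetical. The reader keeps the XML elements it understands and ignores the rest (MustIgnore), while the caller only asks that an object has the one method it needs (duck typing).

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the only elements this reader understands.
KNOWN_TAGS = {"title", "author"}


def read_entry(xml_text):
    """Keep the child elements we understand and silently skip the rest (MustIgnore)."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root if child.tag in KNOWN_TAGS}


class Duck:
    def quack(self):
        return "Quack"


class Robot:
    def quack(self):
        return "Beep"


def make_it_quack(thing):
    # Duck typing: no type check, we only rely on the method we actually call.
    return thing.quack()


# The unknown <rating> element is ignored instead of causing an error.
entry = read_entry(
    "<entry><title>Duck Typing</title><rating>5</rating><author>DL</author></entry>"
)
sounds = [make_it_quack(thing) for thing in (Duck(), Robot())]
```

In both cases the consumer accepts anything that provides what it needs and does not complain about the extras.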

Saner rules for udev

1 min read

My favorite Linux brain teaser, udev, has changed its rule language somewhat. Previously, I had the following rule: BUS="usb", SYSFS{product}="Palm…

Got XFig to work under MacOS with fink

2 min read

Finally! I got XFig working perfectly under MacOS. Just do “fink install xfig323”. This installs an older version of XFig which does not freeze on you within seconds. I find it scary how XFig-dependent I am. XFig is ancient. Yet, nobody seems to be able to build quite the same type of drawing…


1 min read

VirtueDesktops is an open source virtual desktop application. It did crash on me, but all I had to do was restart it and everything was fine: it was a graceful crash. It is pretty sharp software. It requires Mac OS X 10.4 or higher, and will run on both PowerPC and Intel-based Macs.

Advice for a student going into Computer Science

1 min read

I was recently asked what kind of advice I would give to someone who wants to study Computer Science. Where to go, what to study? The best article on the matter, at least the best I ever saw, is Undergraduation by Paul Graham. Here is an amusing quote: The social sciences are also fairly bogus,…

Tagging as a new information retrieval paradigm

2 min read

The Sydney Morning Herald is reporting that tagging is popular. Tags are a Web 2.0 feature popularized by the Canadian Web site Flickr (possibly the largest and most popular multimedia database ever built, before YouTube came along). Essentially, tags allow us to replace semantically rigid…

The paperless office finally coming?

2 min read

In the seventies, some predicted that we would soon have paperless offices. What happened, of course, is that we started to use inexpensive printers and paper consumption increased instead of decreasing. There are still people, many people, who print every email, essentially using email…

Why building software is hard

3 min read

Why is building software difficult? Why do so many projects fail? I recently had an argument with a colleague who thinks that the problem is that the software industry is unable to follow due process… to take the requirements, make up a plan and follow it. Well. There is no such thing as “a…

The Big Bang is Intelligent

1 min read

Scott Adams has a series of posts on intelligence (1, 2, 3). He is arguing that the Big Bang is intelligent, but in fact, he is arguing for a bit more than this. His logic is fascinating. Anything that creates (is the cause of) literature is intelligent. That’s just a specific version of the…

The death of computing

1 min read

Just read this fascinating article by Neil McBride: The death of computing. (Disclaimer: I would describe Neil as an IT professor, not as a Computer Science professor.) He tells us about the upcoming death of Computer Science, mostly due to the lack of interest from students. It’s easy to think…

Podcasting using ccPublisher

1 min read

OK. I have decided to continue experimenting with podcasting. This time, I tried using a tool called ccPublisher to upload my audio file to archive.org. It worked beautifully.