Daniel Lemire's blog

I’m leaving for Houston (ICDM’05)

, 1 min read

From Nov. 26th to Dec. 1st 2005, I’ll be in Houston for ICDM’05, where I’ll present our paper An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation. For a limited time, my slides are available on the Web. (If you are a thief, I’ve got two sons guarding my house, so don’t…

Java OLAP Interface (JOLAP) is dead?

, 1 min read

It looks like JOLAP is dead. The final specification was approved on June 15th 2004. However, to this day, except for Mondrian and Xelopes, I know of no implementation of JOLAP. According to this thread, Oracle has no intention of ever supporting JOLAP. On the other hand, Oracle doesn’t…

Tabs are evil

, 1 min read

I thought I had written a piece about this, but no. So, there you go. Tabs are evil in text files. Why? Because the tab character (\t) has vaguely defined semantics. It means “insert x spaces” where x depends on the text editor and the preferences of the user. The solution? Tell your text…
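
To illustrate the point about vaguely defined semantics, here is a minimal Python sketch. It assumes the common editor convention that a tab advances to the next tab stop; the helper name `render` and the sample line are made up for this illustration. The same bytes come out with a different layout depending on the chosen tab width.

```python
def render(line: str, tab_width: int) -> str:
    """Expand tabs to spaces as an editor configured with tab_width might.

    Assumption: a tab advances to the next multiple of tab_width,
    which is what str.expandtabs implements.
    """
    return line.expandtabs(tab_width)


# Hypothetical sample line containing a single tab character.
sample = "a\tb"

for width in (2, 4, 8):
    # repr() makes the number of inserted spaces visible.
    print(width, repr(render(sample, width)))
```

The file never changes, yet the gap between `a` and `b` is one, three or seven spaces depending on the setting.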

IBM, Oracle and Microsoft freeing their databases

, 1 min read

Oracle has recently made available their Oracle Database 10g Express Edition. It is limited to one processor, 4GB of disk space and 1GB of memory. It is not sufficient for even a small data warehousing project, but it is great for teaching a class. It is…

Idea for a cool AJAX-based project: a web-based slide projector

, 1 min read

Thanks to tools like HTML Slidy and S5, you can build nice slide shows using HTML, CSS and some JavaScript. But what if you are lecturing at a distance? Imagine people get to watch you by videoconference while watching your slides. How do they know when to go to the next slide and so on? The…

Cross-platform videoconferencing/slides sharing: still a long way to go?

, 2 min read

With Owen Kaser and Yuhong Yan, I am organizing our eBusiness Technologies course for this winter. Now, Yuhong is in Fredericton, Owen is in Saint John and I’m in Montreal. How do we give one course all together? Last time was through expensive videoconferencing equipment, but this time, we are…

Me with my new son Louka!

, 1 min read

Finally, a picture of me with my new son, Louka, by my fireplace, no less. I had to remove red eyes using GIMP.

Can you infer tags from text?

, 1 min read

The buzz is all about tags these days. Tagyu is an interesting tool which claims to suggest tags based on the text content of the page. I’d like to see a description of the algorithm, but I see none. http://www.daniel-lemire.com/ gets the tags “firefox” and “web2.0”…

Mondrian 2.0 is out!

, 1 min read

Version 2.0 of the most complete Open Source OLAP engine on the market has been released. Mondrian 2.0 has several major new features since Mondrian 1.2, including aggregate tables and user-defined functions. This release is a release candidate. It is not suitable for production applications.

Mondrian to partner up with Pentaho for Open Source Business Intelligence

, 1 min read

In an earlier post, I asked whether Open Source was ready for Business Intelligence. As it turns out, yesterday, the Mondrian team announced that they were partnering up with Pentaho which they claim to be the world’s leading provider of open source Business Intelligence (BI).

Numerical Python versus SciPy core

, 1 min read

If you are like me and use Python to do actual research, you probably know about Numerical Python which provides you with basic linear algebra, complex numbers, FFT and related code. Two days ago, I had to reinstall Numerical Python and saw that they are promoting a replacement called SciPy core.…

StumbleUpon: collaborative filtering meets Firefox

, 1 min read

StumbleUpon is a Firefox extension which does collaborative filtering over visited web pages. From the toolbar, it looks like you can give a thumbs up or down to a web page. It seems to be a centralized effort. The troubles with centralized collaborative filtering (one database for all people)…

Grep is not just for matching lines anymore

, 1 min read

Thanks to Owen Kaser, I’ve learned that grep can return just the match, and not the whole line. These examples say it all:

    $ grep Ab phone
    505-837-2938 Abby Abbott
    212-940-2039 Abel Baker
    301-302-3030 Abigail Adams
    $ grep -o 'Ab[^ ]*' phone
    Abby
    Abbott
    Abel
    Abigail

So, you can get all…
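
For readers without grep at hand, the same "print only the match" behaviour can be sketched in Python with the standard `re` module. This is an illustration rather than anything from the original post: the list `PHONE_LINES` stands in for the `phone` file and holds only the three lines visible in the output above (the real file may contain more), and `only_matches` is a made-up helper name.

```python
import re

# Assumption: stand-in for the `phone` file, limited to the lines shown above.
PHONE_LINES = [
    "505-837-2938 Abby Abbott",
    "212-940-2039 Abel Baker",
    "301-302-3030 Abigail Adams",
]


def only_matches(pattern: str, lines: list[str]) -> list[str]:
    """Return every non-overlapping match, one entry per match,
    roughly what `grep -o` prints one per line."""
    compiled = re.compile(pattern)
    return [match for line in lines for match in compiled.findall(line)]


matches = only_matches(r"Ab[^ ]*", PHONE_LINES)
print("\n".join(matches))
```

As with `grep -o`, a line holding two matches ("Abby Abbott") contributes two separate entries.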

Boyer-Moore Fast String Searching Example

, 1 min read

Seems like Moore, of the Boyer-Moore algorithm, created a nice HTML demo of his algorithm. Very cool indeed. (Source: Wikipedia entry on Boyer-Moore.)
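
To give a feel for the idea behind the demo, here is a minimal Python sketch. It is not Moore's demo and not the full Boyer-Moore algorithm: it is the simpler Horspool variant, which keeps the right-to-left comparison and a bad-character shift table but drops the good-suffix rule. The function name `horspool_search` is made up for this illustration.

```python
def horspool_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Horspool simplification of Boyer-Moore: compare right to left, and on a
    mismatch shift according to the text character aligned with the last
    pattern position.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0

    # For each character of the pattern except the last one, record the
    # distance from its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    pos = 0
    while pos + m <= n:
        j = m - 1
        # Compare the pattern against the current window from right to left.
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Characters absent from the table allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return -1


print(horspool_search("here is a simple example", "example"))
```

The speed-up comes from the shift table: when the character under the end of the pattern does not occur in the pattern at all, the search skips a full pattern length without examining the characters in between.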