From Nov. 26th to Dec. 1st 2005, I’ll be in Houston for ICDM’05, where I’ll present our paper An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation. For a limited time, my slides are available on the Web.
(If you are a thief, I’ve got two sons guarding my house, so don’t…
It looks like JOLAP is dead. The final specification was approved on June 15th, 2004. However, to this day, except for Mondrian and Xelopes, I know of no implementation of JOLAP. According to this thread, Oracle has no intention of ever supporting JOLAP.
On the other hand, Oracle doesn’t…
I thought I had written a piece about this, but no. So, there you go. Tabs are evil in text files. Why? Because the tab character (\t) has vaguely defined semantics. It means “insert x spaces” where x depends on the text editor and the preferences of the user.
The solution? Tell your text…
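To make the ambiguity concrete, here is a minimal shell sketch. It assumes the standard `expand` utility is available and uses its `-t` option as a stand-in for an editor's tab-width preference; the sample text and variable names are made up for illustration.

```shell
# One line containing a single tab character between two words.
line=$(printf 'if\tx')

# expand replaces each tab with spaces up to the next tab stop;
# -t sets the tab stop width, standing in for an editor preference.
as_width_4=$(printf '%s\n' "$line" | expand -t 4)
as_width_8=$(printf '%s\n' "$line" | expand -t 8)

# Brackets make the differing amounts of whitespace visible.
printf '[%s]\n' "$as_width_4"
printf '[%s]\n' "$as_width_8"
```

The same bytes render with two spaces under one setting and six under the other, which is why indentation that lines up for one reader can look broken to another.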
Oracle has recently made available their Oracle Database 10g Express Edition. It is limited to one processor, 4GB of disk space and 1GB of memory. It is not sufficient for even a small data warehousing project, but it is great for teaching a class. It is…
Thanks to tools like HTML Slidy and S5, you can build nice slide shows using HTML, CSS and some Javascript.
But what if you are lecturing at a distance? Imagine people get to watch you by videoconference while watching your slides. How do they know when to go to the next slide and so on? The…
With Owen Kaser and Yuhong Yan, I am organizing our eBusiness Technologies course for this winter.
Now, Yuhong is in Fredericton, Owen is in Saint John and I’m in Montreal. How do we give one course all together? Last time was through expensive videoconferencing equipment, but this time, we are…
The buzz is all about tags these days. Tagyu is an interesting tool which claims to suggest tags based on the text content of the page. I’d like to see a description of the algorithm, but I see none.
http://www.daniel-lemire.com/ gets the tags “firefox” and “web2.0”.…
Version 2.0 of the most complete Open Source OLAP engine on the market has been released.
Mondrian 2.0 has several major new features since Mondrian 1.2, including aggregate tables and user-defined functions. This release is a release candidate. It is not suitable for production applications.
In an earlier post, I asked whether Open Source was ready for Business Intelligence. As it turns out, yesterday, the Mondrian team announced that they were partnering up with Pentaho, which they claim to be the world’s leading provider of open source Business Intelligence (BI).
If you are like me and use Python to do actual research, you probably know about Numerical Python which provides you with basic linear algebra, complex numbers, FFT and related code.
Two days ago, I had to reinstall Numerical Python and saw that they are promoting a replacement called SciPy core.…
StumbleUpon is a Firefox extension which does collaborative filtering over visited web pages. From the toolbar, it looks like you can give a thumbs up or down to a web page.
It seems to be a centralized effort. The troubles with centralized collaborative filtering (one database for all people)…
Thanks to Owen Kaser, I’ve learned that grep can return just the match, and not the whole line.
These examples say it all:
$ grep Ab phone
505-837-2938 Abby Abbott
212-940-2039 Abel Baker
301-302-3030 Abigail Adams
$ grep -o 'Ab[^ ]*' phone
Abby
Abbott
Abel
Abigail
So, you can get all…
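Since each match comes out on its own line, the output is easy to pipe into other tools. Here is a small sketch that counts matches rather than matching lines. It assumes a grep that supports `-o` (GNU and BSD grep both do, though the option is not part of POSIX), and it recreates the sample file in a temporary location; the variable names are made up for illustration.

```shell
# Recreate the sample file from the example above in a temporary location.
phone=$(mktemp)
printf '%s\n' \
  '505-837-2938 Abby Abbott' \
  '212-940-2039 Abel Baker' \
  '301-302-3030 Abigail Adams' > "$phone"

# grep -c counts matching lines; grep -o prints one match per line,
# so piping it to wc -l counts individual matches instead.
line_count=$(grep -c 'Ab' "$phone")
match_count=$(grep -o 'Ab[^ ]*' "$phone" | wc -l | tr -d ' ')

echo "matching lines: $line_count"
echo "matches: $match_count"
rm -f "$phone"
```

The two counts differ here because the first line of the file contains two matches.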