Daniel Lemire's blog

Open Access is the short-sighted fight

3 min read

My colleague Stevan Harnad thinks it is silly to boycott for-profit journals. My ex-colleague Stephen Downes admits to being a boycotter, but he claims not to be silly. Both of them are silly. Stephen Downes has worked outside the realm of prestigious academic journals (so he says). He claims that…

How to win academic debates

1 min read

You cannot get rid of your tenured colleagues, even the idiots who got tenure by luck or by cheating. So you must get your points past them. Seb pointed me to a post about meeting power. It got me to reflect on my techniques: Frame the debate along an axis that favors you. Find a way to divide…

Working with industry helps researchers?

1 min read

Is it a good idea for an academic researcher to work with industry R&D projects? Yes, in small doses: We find that university-industry relations exercise a positive effect on university scientific productivity only when (…) these activities do not exceed 15% of the researcher’s total…

Getting a Ph.D. for the money?

2 min read

Many of my Ph.D. students have admitted to being motivated by financial gain. Stanford is famous for their graduate-students-turned-entrepreneurs. Sergey Brin and Larry Page of Google fame come to mind. But I cannot yet point to a comparable success story. Still, I have seen several variations over…

What is more fundamental: Physics or Computer Science?

1 min read

Computer Science can be taken as a natural science: the study of how the universe processes information. If it is a natural science, then does it build on Physics? Or does Physics build on Computer Science? The answer is obvious (to me): Without algorithms there would be no Physics! Physics is built…

Sensible hashing of variable-length strings is impossible

1 min read

Consider the problem of hashing an infinite number of keys—such as the set of all strings of any length—to the set of numbers in {1,2,…,b}. Random hashing proceeds by first picking at random a hash function h in a family of H functions. I will show that there is always an infinite…

A Simplified Open Publishing Manifesto

1 min read

Bill Gasarch is proposing a manifesto on Open Scholarship. What a great idea! Imagine thousands of researchers openly agreeing on practices making research more effective! We could change the culture of scientific research without having to convince publishers or funding agencies. To keep the ball…

The most important Theoretical Computer Science problem is inconsequential

2 min read

Some consider the P = NP problem to be the most important Theoretical Computer Science problem. It asks whether all problems whose solution can be verified quickly can also be solved quickly. If you can answer this question, you win one million dollars. The catch is that quickly is defined as in…

Relational databases: are they obsolete?

2 min read

Michael Stonebraker is predicting that the dominance of the generic relational database is coming to an end. Having recently founded several database companies, he has a vested interest in this prediction. Here is Stonebraker's logic: we can outperform relational databases with specialized…

The hard truth about research grants

2 min read

You must do many silly things to get a large research grant: You must know precisely what you will do for the next five years. Yet, in my experience, good researchers only have a vague idea of where they will be in 5 years. If you know the promising research directions you will encounter,…

How things change: Cheaters are Innovators

1 min read

If you seek approval above all else, you are unlikely to innovate outside the rigid bounds of the current system: You do not convince existing journals to give more respect to this new field you created. You go out and create your own journals and conferences. John von Neumann did not wait for his…

Where do the best mathematicians come from?

1 min read

Americans think that the best scientists come from their best universities. To learn more, consider where the influential mathematicians come from: got their degree in the USA, 33%; got their Ph.D. in the USA, 58%; working in the USA, 68%. The American scientific dominance relies partially on the…

Changing your perspective: horizontal, vertical and hybrid data models

2 min read

Data has natural layouts: text is written from the first to the last word, database tables are written one row at a time, Google presents results one document at a time, the early recommender systems compared users to other users, discussions are organized in newsgroups and posting boards by…

Toward author-centric science

2 min read

Too many research papers in Computer Science are nonsense: they convey no worthy message. Yet they pass a Turing test of sorts: at a glance, they are indistinguishable from interesting research papers. In fact, they are designed as nonsense from the beginning: the authors mimic the output of good…

Why I am not publishing in PLoS One, yet

2 min read

PLoS One is a new peer-reviewed journal (2006) with many interesting features: The board includes many respected Computer Scientists: Ananth Grama, Johan Bollen, Josh C. Bongard, Robert P. Futrelle, etc. It is the largest Open Access journal in the world: 2,800 articles published in 2008. They…

Attributes of good research

1 min read

Paul Graham gives a list of attributes characterizing start-ups. It strikes me that many of these attributes could describe research projects as well: Good research projects fail. If there is no risk of failure, you are doing unoriginal research. (Except that out of my biggest failures have come…

Trading compression for speed with vectorization

2 min read

Bitmap indexes are used by search engines (such as Apache Lucene), and they are available in DBMSes such as Oracle and PostgreSQL. They are used in column stores such as the Open Source engines Eigenbase and C-Store, as well as by many commercial solutions such as Vertica. Bitmap indexes are silly data…

To be smarter, ignore external rewards

2 min read

Last night, I watched a great talk by Dan Pink—author of several self-help books. He made a compelling point and he cited research papers. I went and read these research papers and I had great fun. Essentially, boosting your motivation with external rewards can lower the…