About me
Name: Daniel Lemire
Location: Montreal, Canada
Occupation: Computer Scientist (full professor)
Home page: http://lemire.me/en/
My Setup: http://daniel.lemire.usesthis.com/ (2013)
Research papers: Google Scholar profile, arXiv, DBLP
Affiliation: Data Science Laboratory, Université du Québec (TELUQ); LATECE, UQAM
Email: lemire at gmail dot com
Headshot: see my home page
Keywords: Data Science, Indexing and Software Performance.
I have been an entrepreneur, a government researcher, and a university professor. I once designed, built and sold software for a living. I have a working knowledge of C, C++, Go, Java, JavaScript, Swift, Rust and Python. As a researcher, I have worked on many problems, from medical diagnostics to collaborative data processing. My current interest is software performance.
My Slope One recommender algorithm is a standard reference in the field of recommender systems. My work on bitmap indexes is used by companies like eBay, Facebook, LinkedIn and Netflix to accelerate their data processing. It is also part of platforms such as Apache Hive, Druid, Apache Spark, LinkedIn Pinot, Netflix Atlas and Apache Kylin. The version control system Git as used by GitHub is also accelerated by our compressed bitmaps. Some of my techniques have been adopted by Apache Lucene, the search engine behind sites such as Wikipedia or platforms such as Solr and Elastic. Some of my compression software is used by Apache Arrow and Apache Impala.
We wrote the fastest JSON parser in the world: simdjson. The simdjson library runs nearly everywhere and it is pioneering many new techniques.
I love to write: my blog has been featured on Reddit, Hacker News and Slashdot (1, 2).
I have written over 75 peer-reviewed papers. My work has been cited thousands of times. I have held a research grant (NSERC) from the federal government for nearly 20 years. I was co-chair of the NSERC computer science committee in 2019-2020 and in 2020-2021.
I also have two sons, two cats, and a beautiful wife. My dog has its own YouTube channel. I make my own bread, my own yogurt, my own beer, my own wine, my own port, my own furniture, my own drinks (I like daiquiris). I build robots, radio-controlled sailboats and trucks. I grow my own vegetables in the summer using square gardening. I love science fiction, both in book and TV format.
- Parsing numbers at a gigabyte per second (MIT Fast Code Seminar 2021)
- Floating-point Number Parsing w/Perfect Accuracy at GB/sec (Go Systems Conf SF 2020)
- Data Engineering at the Speed of Your Disk (Performance Summit III 2020)
- Parsing JSON Really Quickly: Lessons Learned (QCon San Francisco 2019)
- Next Generation Indexes For Big Data Engineering (ODSC East 2018, Boston)
- Engineering Fast Indexes for Big Data Applications (Spark Summit East 2017, Boston)
- Engineering Fast Indexes for Big Data Applications (deep dive) (Spark Summit East 2017, Boston)
- Algorithms, how content finds ‘you’ (Discoverability Summit, Toronto, 2016)
My 140-character bio:
Computer science professor at the University of Quebec, contributor to major data-science open-source projects, and long-time blogger.
When requested to provide a formal bio, I use this paragraph:
Daniel Lemire has a B.Sc. and an M.Sc. in Mathematics from the University of Toronto, and a Ph.D. in Engineering Mathematics from the Ecole Polytechnique and the Université de Montréal. He is a computer science professor at the Data Science Laboratory of the Université du Québec (TELUQ). He has also been a research officer at the National Research Council of Canada and an entrepreneur. He has written over 75 peer-reviewed publications, including more than 45 journal articles. He has held competitive research grants for nearly 20 years. He was co-chair of the NSERC computer science committee from 2019 to 2021. He serves on the program committees of leading computer science conferences (e.g., ACM CIKM, WWW, ACM WSDM, ACM SIGIR, ACM RecSys). His open source software has been used by major corporations such as Google, LinkedIn, Netflix and Facebook. In 2020, he received the University of Quebec’s Award of Excellence for Achievement in Research for his work on the acceleration of JSON parsing. He joined the Circle of Excellence of the University of Quebec in 2019. His research interests include databases, information retrieval, and high-performance programming. He blogs regularly on computer science at http://lemire.me/blog/.
I also use the following shorter one:
Daniel Lemire is a computer science professor at the Data Science Laboratory of the University of Quebec (TELUQ). He is among the top 500 GitHub users worldwide in terms of follower count. He has published over 80 peer-reviewed research papers and has been cited thousands of times. He is an editor at the journal Software: Practice and Experience (Wiley, established in 1971). In 2020, he received the University of Quebec’s Award of Excellence for Achievement in Research for his work on the acceleration of JSON parsing. His research interests include high-performance programming. He is @lemire on Twitter, and he blogs weekly at https://lemire.me/blog
For academic correspondence, you can use the following address:
Prof. Daniel Lemire
Data Science Laboratory
TELUQ, Université du Québec
5800 Saint-Denis
Office 1105
Montreal (Quebec)
H2S 3L5 Canada
Office number: 12.166 (if you come visit)
Email: lemire (at) gmail (dot) com
- I never memorized the multiplication tables. I never memorized the quadratic formula. I never memorized most trigonometric identities. I do not know my office door number or phone number. In general, I avoid memorizing facts; I prefer to write them down where I and others will find them. That is why I write so much.
- Up until 2017, I had never owned a cell phone.
- I get lost easily. Even after living for years in the same place, I still cannot locate the major streets by name. I got lost once or twice looking for my own office.
- I hear that people are sometimes nervous before addressing a crowd. I have no such stress: I am sure I could talk in front of tens of thousands of people without breaking a sweat. I will probably even tell jokes, though it is unlikely many people will find me funny.
- I failed kindergarten and was put in a class for students with learning disabilities in first grade. When offered the chance to rejoin a regular class, I chose to remain in the special class. Among the reasons they failed me in kindergarten were that I would not memorize my phone number and that I had trouble tying my shoelaces. I still walk with my shoelaces undone most of the time. I have never memorized my phone number. So I might still fail kindergarten.
- I lost all the electronic copies of my Ph.D. thesis the same day I sent the second revised version to the printer. Though I had backups, I accidentally overwrote all of them with an empty file within a few minutes. Had the school asked for a revision, I would have had to retype it.
- Academically, I consider myself almost entirely self-taught. I am an autodidact with a Ph.D. Though I have been a tenured college professor in computer science for over a decade, with continuous research funding (in computer science) provided by the federal government through competitive grants, I have never taken a computer science class in college. I once attended a first-year college computer course for a couple of weeks, but I gave up quickly. I was once formally offered a job as a business professor at a good university, despite the fact that I never took a business class in college. I have three degrees in Mathematics (B.Sc., M.Sc., Ph.D.), but I derived very little use from the classes I took. It should be said that I skipped class on every occasion.
- Going way back to my childhood, I always felt bad about competition. When playing board games, I would often intentionally avoid winning… trying to keep the game going.
- I have been programming computers since I was twelve. I still program most days. I read code all the time.
- I play video games almost daily. It is my favorite hobby. I tend to play games on a dedicated console.