Daniel Lemire's blog


Science and Technology links (April 22nd, 2018)

  1. You probably can’t write the two forms of the letter g, even if you have seen them thousands and thousands of times.
  2. Some neurodegenerative diseases might result from a fungal infection. This would include diseases like Parkinson’s. The theory seems to be that many of us get infected with nasty things that remain dormant in most of us, but not in the weakest or unluckiest among us. That is, you may already be infected with whatever is going to kill you later. Scary.
  3. Mangan suggests that the sometimes-reported health benefits of moderate alcohol consumption might have to do with its bactericidal effect.
  4. Researchers use machine learning to identify cells under a microscope without having to intervene on the cells themselves (e.g., with fluorescent labels). This might turn out to be a big deal as it could reduce the cost of research drastically.
  5. Google appears to be able to isolate voices when several people talk at the same time.
  6. Google launches Talk to Books. It is a tool where you can ask questions and get back answers from a large collection of books. It works well and reminds me of Vernor Vinge’s Rainbows End, where some entrepreneurs wanted to digitize books to build an artificial intelligence. It seems that we are well on schedule to realize the program set forth in the novel (set in 2025).
  7. Two eggs per day do not adversely affect the biomarkers associated with heart diseases, but increase satiety throughout the day in a young healthy population.
  8. Microsoft will distribute its own version of Linux. Bill Gates, Microsoft’s founder, repeatedly opposed the GPL, the license under which Linux is distributed. So it is definitely a post-Bill-Gates era at Microsoft. I have mixed feelings regarding the GPL myself, but I like the new directions that Microsoft is taking. It is much easier to like Microsoft today; they seem much less confrontational.
  9. An extra robot per 1,000 workers reduces the employment to population ratio by 0.18-0.34 percentage points and wages by 0.25-0.5%. Should you worry? Maybe not:

These are sizable effects. But it should also be noted that even under the most aggressive scenario, we are talking about a relatively small fraction of employment in the US economy being affected by robots. There is nothing here to support the view that new technologies will make most jobs disappear and humans largely redundant.

  1. Google published do-it-yourself artificial intelligence maker kits.
  2. Coffee consumption may reduce the total cancer incidence, and it also has an inverse association with some types of cancer. (Credit: P. D. Mangan)
  3. When I was a teenager, I assumed that strong people and smart people were distinct groups. Not so, it seems. Grip strength is associated with cognitive performance, according to a study of over half a million participants. It is unclear which way the causality goes. Fit people are likely to be both smart and strong. I don’t know how you can get smarter directly, but I know how you can get stronger (lift weights and such). Could it be that getting stronger makes you smarter? For older people, this might very well be true.
  4. Patients with major depression exhibited higher epigenetic aging in blood and brain tissue, suggesting that they are biologically older than their corresponding chronological age.
  5. Lenovo sells a high-quality virtual reality headset for $200.
  6. According to Ceci and Williams, current policies to increase female representation in science are misguided:

Explanations for women’s underrepresentation in math-intensive fields of science often focus on sex discrimination in grant and manuscript reviewing, interviewing, and hiring. Claims that women scientists suffer discrimination in these arenas rest on a set of studies undergirding policies and programs aimed at remediation. More recent and robust empiricism, however, fails to support assertions of discrimination in these domains. To better understand women’s underrepresentation in math-intensive fields and its causes, we reprise claims of discrimination and their evidentiary bases. Based on a review of the past 20 years of data, we suggest that some of these claims are no longer valid and, if uncritically accepted as current causes of women’s lack of progress, can delay or prevent understanding of contemporary determinants of women’s underrepresentation. We conclude that differential gendered outcomes in the real world result from differences in resources attributable to choices, whether free or constrained, and that such choices could be influenced and better informed through education if resources were so directed. Thus, the ongoing focus on sex discrimination in reviewing, interviewing, and hiring represents costly, misplaced effort: Society is engaged in the present in solving problems of the past, rather than in addressing meaningful limitations deterring women’s participation in science, technology, engineering, and mathematics careers today.

In fact, it seems that there are strong biases favoring women as it is:

Comparisons of oral non-gender-blind tests with written gender-blind tests for about 100,000 individuals observed in 11 different fields over the period 2006-2013 reveal a bias in favor of women that is strongly increasing with the extent of a field’s male-domination. This bias turns (…) to about 10 percentile ranks for women in math, physics, or philosophy.

If I had to guess, I would think that I have a bias favorable to women when it comes to recruiting. So the question is: why so few women?

A colleague of mine works on female equity issues. She gave me a tip on how to recruit women: stress that there are already other women around. Women will be hesitant to join a lab that is made exclusively of men. Sadly, that’s kind of a vicious circle.

  1. Cloning horses appears to be a profitable business.
  2. According to the New York Times, robots can assemble Ikea furniture semi-autonomously. It is not clear that the robots actually start from Ikea’s instructions. And they clearly start from already laid out parts. (Credit: Peter Turney)
  3. Gary Bernhardt writes:

Reminder to people whose “big data” is under a terabyte: servers with 1 TB RAM can be had about $20k. Your data set fits in RAM.

Gary is correct. In 2011, I wrote:

Many information systems have storage costs which are proportional to the number of individuals. I call them sapien-bound systems. (…) Soon, all sapien-bound systems will fit in RAM cheaply.

To describe my own research, I prefer to use the term “data engineering” rather than “big data”. However, processing data quickly and efficiently is not a solved problem. Having a terabyte of RAM does not make your computing problems go away and, in some sense, it highlights them.
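The arithmetic behind the “fits in RAM” claim can be sketched quickly. The snippet below is only an illustration: the population sizes are hypothetical, the helper name `bytes_per_person` is mine, and I assume a decimal terabyte (10^12 bytes).

```python
def bytes_per_person(ram_bytes: int, population: int) -> float:
    """Return the RAM budget available per individual if all data stays in memory."""
    return ram_bytes / population


TB = 10**12  # assumption: decimal terabyte; a vendor may mean 2**40 bytes instead

# Hypothetical population sizes, chosen for illustration only.
scenarios = [
    ("1 million customers", 10**6),
    ("300 million residents", 3 * 10**8),
]

for label, population in scenarios:
    budget = bytes_per_person(TB, population)
    print(f"{label}: about {budget:,.0f} bytes each")
```

Under these assumptions, a system tracking a million people has about a megabyte per person to work with, and one tracking 300 million people still has a few kilobytes per person. Whether you can process that data quickly is a separate question.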

  1. Our atmosphere is filled with trillions of viruses according to the New York Times:

Generally it’s assumed these viruses originate on the planet and are swept upward, but some researchers theorize that viruses actually may originate in the atmosphere. Whatever the case, viruses are the most abundant entities on the planet by far. While Dr. Suttle’s team found hundreds of millions of viruses in a square meter, they counted tens of millions of bacteria in the same space.

(Source: P.D. Mangan)

  1. Human beings transformed the planet in deeper ways than is often assumed:

Our assumption that modern ecosystems are normal is flawed, says Theodor. They’re not necessarily functioning in the way that they did even 11,000 years ago.

  1. Machine learning and artificial intelligence are popular today, and people tend to want to use them everywhere. Top researchers in these fields are paid millions of dollars. It is possible that current machine-learning techniques are overrated. Makridakis et al. found that, on prediction tasks, the accuracy of machine-learning models (“artificial intelligence”) is below that of simple old-school statistical models. The motivation for their work is interesting:

The motivation for writing this paper was an article published in Neural Networks in June 2017. The aim of the article was to improve the forecasting accuracy of stock price fluctuations and claimed that “the empirical results show that the proposed model indeed display a good performance in forecasting stock market fluctuations”. In our view, the results seemed extremely accurate for stock market series that are essentially close to random walks so we wanted to replicate the results of the article and emailed the corresponding author asking for information to be able to do so. We got no answer and we, therefore, emailed the Editor-in-Chief of the Journal asking for his help. He suggested contacting the other author to get the required information. We consequently, emailed this author but we never got a reply. Not being able to replicate the result and not finding research studies comparing ML methods with alternative ones we decided to start the research leading to this paper.

A major fault in contemporary research is that people fail to honestly and fairly compare their “fancy” work with simpler alternatives, as if simplicity were a fault. It is not! You always want to use the simplest thing that works.
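To make the point concrete, here is a minimal sketch of what comparing against a simple alternative can look like. It is not the method used by Makridakis et al.: the data is a synthetic random walk that I generate myself, the “candidate” is a plain moving average standing in for any fancier model, and all function names are hypothetical.

```python
import random


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecasts(series, start):
    """Forecast series[t] with the previous value series[t - 1], for t >= start."""
    return [series[t - 1] for t in range(start, len(series))]


def moving_average_forecasts(series, start, window):
    """Forecast series[t] with the mean of the `window` values that precede it."""
    return [sum(series[t - window:t]) / window for t in range(start, len(series))]


# Synthetic random walk (an assumption for illustration, not real market data).
random.seed(0)
series = [0.0]
for _ in range(1000):
    series.append(series[-1] + random.gauss(0.0, 1.0))

window = 5
actual = series[window:]
naive_mae = mean_absolute_error(actual, naive_forecasts(series, window))
candidate_mae = mean_absolute_error(
    actual, moving_average_forecasts(series, window, window)
)

print(f"naive baseline MAE: {naive_mae:.3f}")
print(f"moving-average MAE: {candidate_mae:.3f}")
```

On a series that behaves like a random walk, the naive “repeat the last value” forecast is hard to beat, so a model that reports much better accuracy on such data deserves a second look. Any candidate model can be dropped in place of the moving average, as long as it is scored on the same points as the baseline.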

  1. Jordan has a worthy essay entitled Artificial Intelligence — The Revolution Hasn’t Happened Yet. He points out that what we have is good statistical algorithms, but that it remains to build a new kind of engineering which we might call “artificial intelligence engineering”.

I’m also a computer scientist, and it occurred to me that the principles needed to build planetary-scale inference-and-decision-making systems of this kind, blending computer science with statistics, and taking into account human utilities, were nowhere to be found in my education. And it occurred to me that the development of such principles — which will be needed not only in the medical domain but also in domains such as commerce, transportation and education — were at least as important as those of building AI systems that can dazzle us with their game-playing or sensorimotor skills.

  1. A technique called “repetition lag training” can significantly improve performance on a particular type of memory task:

Repetition lag training improved objective measures of recollection, eliminated the age-related recollection decrement, and these improvements maintained over three months.

That is, elderly people who train become as good as young people at recollection tests. The results appear robust even with people who suffer from mild cognitive impairment. However, participants fail to report an improvement in their daily life. That is, you can train for a memory task and improve to a youthful level even if you are very old, but you only improve what you train. This is often viewed in a pessimistic light, but I think it suggests that “training” becomes increasingly important as you get older, or if you suffer from cognitive limitations.

  1. Prescription drugs rapamycin and metformin regenerate hair.
  2. Matt Ridley points out that we have not extended the maximal lifespan of human beings, and I call him out on his pessimism on Twitter. He wrote a book entitled The Rational Optimist.