Daniel Lemire's blog

9 min read

Science and Technology links (November 13th 2021)

9 thoughts on “Science and Technology links (November 13th 2021)”

  1. foobar says:

    I’m not surprised that automatic scanning for illegal content in images can fail (unimpressively) at categorising doctored “innocent” images, but this is hardly the only problem with such black-box technological solutions to complex problems. Beyond the technological issues there are also cultural ones, and no matter how “inclusive” the companies behind these solutions are, they are unlikely to fully understand these questions, at least as long as they operate on the American, that is the US, mindset.

    I suspect that from the puritan US viewpoint, most family albums of Nordic families, especially those with somewhat older photos, would easily be categorised by these algorithms as containing child porn. What could be more innocent than memories of your kids playing naked on the lakeshore when they were small, a Nordic person might ask? What would be the problem if an older relative appeared in the same photo?

    Well, as far as I, as a Nordic person, understand, many Americans have a *very* different view of nudity. If you train the model to see nude kids as suspicious content, those Nordic families will soon get into trouble, at least occasionally, for snapping summer photos; and if you don’t, those American puritans will complain that the models are too lax. And frankly, don’t even try the American viewpoint that your morals are somehow less fallible than those of others. Differences in attitudes even within the Western societal sphere can be quite large.

    If models are incapable of coping gracefully with practically invisible technical attacks that turn photos of buildings into illegal content, I highly doubt they will soon be nuanced enough to reliably infer legality – which is quite tightly bound to the social acceptability of a particular use in a specific cultural environment – from the image content alone for photos such as those mentioned above. When we are speaking of billions and billions of snaps in this age of ubiquitous cameras, even a one percent false positive rate in a specific category can put lots of people in unnecessary trouble.
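To put that last sentence into rough numbers (both the photo count and the one percent rate are the comment’s hypotheticals, not measured figures), a back-of-the-envelope calculation:

```python
# Back-of-the-envelope base-rate arithmetic; the photo count and the
# 1% false positive rate are the comment's hypotheticals, not data.
photos_scanned = 10_000_000_000      # "billions and billions" of snaps
false_positive_rate = 0.01           # the hypothetical one percent

false_positives = photos_scanned * false_positive_rate
print(f"{false_positives:,.0f} photos wrongly flagged")  # 100,000,000
```

Even a rate that sounds tiny per photo becomes an enormous absolute number at this scale, which is the base-rate point being made.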

    1. Grommit says:

      The perceptual hashing algorithms are hashing algorithms: that is, they look for known images. They are very unlikely to be triggered by random other images, even images of nude children.

      That adversarial images can be created to fool the algorithm does not really change this.
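The distinction Grommit draws – matching known images rather than judging content – can be sketched with a toy “average hash”. This is not Apple’s NeuralHash or any production algorithm; the 8×8 grid and the brightness shift below are illustrative assumptions only:

```python
# Toy "average hash" (aHash), a minimal perceptual hash sketch.
# Detection with such hashes means comparing against a database of
# hashes of *known* images, not classifying what the image depicts.

def average_hash(pixels):
    """pixels: 2D list of grayscale values (here a hypothetical 8x8
    downscaled image). Returns a 64-character bit string: '1' where
    the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return ''.join('1' if p > mean else '0' for p in flat)

def hamming(a, b):
    """Count of differing bits; a small distance means the hashes
    consider the images perceptually similar."""
    return sum(x != y for x, y in zip(a, b))

# Two "images": the second is a uniformly brightened copy of the first.
img = [[(x * y) % 256 for x in range(8)] for y in range(8)]
brighter = [[min(255, p + 3) for p in row] for row in img]

h1, h2 = average_hash(img), average_hash(brighter)
print(hamming(h1, h2))  # prints 0: a uniform shift leaves this hash unchanged
```

An unrelated image would almost certainly land far away in Hamming distance, which is why false matches on random photos are rare – and why the interesting attacks are adversarial images deliberately crafted to collide, as mentioned above.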

      I’m not sure it’s great that we are using people’s phones to spy on them, but this type of snooping is relatively unobtrusive.

      1. “this type of snooping is relatively unobtrusive”

        It is also sure to be used by totalitarian governments to track people who seek to oppose them. If you want to sell phones in China, and you have the technology to track unwanted pictures and report them to the government, you can be certain that it will be used for that purpose, or you won’t sell your phones there.

      2. foobar says:

        Thanks for pointing this out! I have been under the impression that Apple, of all companies, while claiming to be privacy-conscious, hasn’t really been too clear about its on-device illegal-content detection and the limits it would put on itself in the future. Maybe they’re trying to do a balancing act between their strong crypto approach and the interest of governments (particularly the US government), so as to maintain a legal environment favourable enough to keep that crypto running on phones…

        Nonetheless, even hashing can be worrisome. What if a government has an interest in tracking down the origin of a photo it finds problematic for purely political reasons, or someone gets access to your family album, along with millions of others, and trawls it for content they find a way to exploit for profit or power in a legally questionable context? Suddenly you may end up in a situation where you have to prove your innocence instead of them proving your guilt, and it may be wishful thinking that, once such technological powers have been given to the system, you would actually have good chances of succeeding.

  2. foobar says:

    Regarding Nordic countries being innovative, I find it a bit questionable that the authors of the referred column (all three of whom are relatively “famous” in Finland) seem to refer only to research involving data from the late 2010s, over a decade ago, in particular around 2007. This was the period just before the fall of Nokia, which completely dominated R&D in Finland back then, being an unnaturally large part of the economy for a single developed nation. Finland might finally have recovered from that fall (consider, as an example, the 7 billion euro acquisition of Wolt by Doordash announced a couple of days ago, more valuable than Microsoft’s acquisition of Nokia’s phone business), but the column’s selection of data reflects a questionably preferential choice of time frame, if you ask me…

    1. foobar says:

      Eh, of course I meant the late 2000s (the decade, that is). The late 2010s would have been fine!

    2. I realize that the evidence is weak and I paused before including it in my list, but the important point for me is that this argument was interesting. It may be wrong.

      1. foobar says:

        I don’t think it can easily be argued to be “wrong” or even necessarily “weak”, and the caveats I see regarding the Finnish numbers don’t really apply to the other countries mentioned in the column. I just suspect that the time frame might have been picked to paint Finnish innovation in a somewhat rosier tint than it would appear in a fresher study. Nokia, with its subcontractors, really was a behemoth of its time.

        I do believe there’s a point in the column despite all this.

      2. KK says:

        Don’t include something because it might be ‘interesting’ if the evidence is weak.