Kashmir Hill, reporting for The New York Times:
Mark noticed something amiss with his toddler. His son’s penis
looked swollen and was hurting him. Mark, a stay-at-home dad in
San Francisco, grabbed his Android smartphone and took photos to
document the problem so he could track its progression.
It was a Friday night in February 2021. His wife called an advice
nurse at their health care provider to schedule an emergency
consultation for the next morning, by video because it was a
Saturday and there was a pandemic going on. The nurse said to send
photos so the doctor could review them in advance. […]
With help from the photos, the doctor diagnosed the issue and
prescribed antibiotics, which quickly cleared it up. But the
episode left Mark with a much larger problem, one that would cost
him more than a decade of contacts, emails and photos, and make
him the target of a police investigation. Mark, who asked to be
identified only by his first name for fear of potential
reputational harm, had been caught in an algorithmic net designed
to snare people exchanging child sexual abuse material.
Just an awful story, but filled with nothing but good intentions. Hill has done yeoman’s work reporting this story out. You can imagine how reluctant a source might be to talk about such an incident, even with a promise of using only their first name.
Basically, there are two main methods by which major cloud hosts identify CSAM. The first is comparing a cryptographic hash of a given image against the National Center for Missing and Exploited Children’s database of hashes for known CSAM imagery. This method is also known as “fingerprinting”. That’s the method Apple, controversially, proposed introducing for iCloud Photos last year but has shelved until further notice. It’s essentially a method for identifying known CSAM without distributing the actual known CSAM imagery.
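For the curious, the fingerprint-matching idea can be sketched in a few lines. This is only an illustration under loud assumptions: SHA-256 stands in for whatever fingerprint function a real service uses, the names `fingerprint`, `is_known`, and `known_fingerprints` are hypothetical, and the "database" is a toy set built from placeholder bytes. An exact digest like SHA-256 changes completely if an image is resized or recompressed, so a production system would presumably need something more tolerant.

```python
import hashlib


def fingerprint(image_bytes: bytes) -> str:
    """Return a hex digest for the raw image bytes.

    Assumption: SHA-256 is a stand-in for a real service's fingerprint function.
    """
    return hashlib.sha256(image_bytes).hexdigest()


def is_known(image_bytes: bytes, known_fingerprints: set[str]) -> bool:
    """Check an upload against a set of fingerprints of already-known images."""
    return fingerprint(image_bytes) in known_fingerprints


# Hypothetical database: it holds only fingerprints, never the images themselves.
known_fingerprints = {fingerprint(b"placeholder bytes for a known image")}

# A byte-for-byte copy of a known image matches; a brand-new photo does not.
print(is_known(b"placeholder bytes for a known image", known_fingerprints))
print(is_known(b"a brand-new photo", known_fingerprints))
```

The point of the design is visible even in the toy version: the party doing the matching needs only the list of fingerprints, and a photo that has never been seen before cannot match anything in it.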
The other method is using machine learning models to flag uploaded images simply because the trained AI model identifies them as suspicious. It’s essentially a search for new CSAM imagery — photos and videos that aren’t (yet) in the NCMEC fingerprint database. This method is what ensnared Mark, the subject of Hill’s story.
To my knowledge, no innocent person has been falsely flagged and investigated like Mark using the NCMEC fingerprint database. It could happen. But I don’t think it has. It seems uncommon for an innocent person like Mark to be flagged and investigated by the second method, but as Hill reports, we have no way of knowing how many like Mark there are who’ve been wrongly flagged, because for obvious reasons they’re unlikely to go public with their stories.
Near the end of Hill’s report:
Dr. Suzanne Haney, chair of the American Academy of Pediatrics’
Council on Child Abuse and Neglect, advised parents against
taking photos of their children’s genitals, even when directed by
a doctor.
“The last thing you want is for a child to get comfortable with
someone photographing their genitalia,” Dr. Haney said. “If you
absolutely have to, avoid uploading to the cloud and delete them
immediately.”
She said most physicians were probably unaware of the risks in
asking parents to take such photos.
“Avoid uploading to the cloud” is difficult advice for most people to follow. Just about everyone uses their phone as their camera, and most phones from the last decade or so, iPhones and Android phones alike, upload photos to the cloud automatically. When on Wi-Fi, as almost everyone is at home, the uploads to the cloud are often nearly instantaneous. I think the only advice to take away from this story is the first suggestion: to never take photos of your children’s genitals, even when directed by a doctor. Photos taken for a doctor, trying to show a rash or other skin condition, seem far more likely to be wrongly flagged than, say, photos of a baby playing in the bathtub. But I don’t know if I’d even take bath time photos of a child today. I certainly wouldn’t upload them to Google or Facebook.
Google’s system was seemingly in the wrong in Mark’s case, and the company’s checks and balances failed as well. (Google permanently deleted his account, including his Google Fi cellular plan, so he lost both his longtime email address and his phone number, along with all the other data he’d stored with Google.) But it’s worth noting that Apple’s proposed fingerprinting system generated several orders of magnitude more controversy than Google’s already-in-place system ever has, simply because Apple’s proposal involved device-side fingerprinting, and Google’s system runs on its servers.
The on-device vs. on-server debate is legitimate and worth having. But I think it ought to be far less controversial than Google’s already-in-place system of trying to identify CSAM that isn’t in the NCMEC known database.