From Apple, to Anomaly, to ImageNet

I stepped through a metal doorway into a dark corridor. To my right, white text stood illuminated against a black wall that stretched into an endless expanse of square photographs. An invitation and an introduction, the glowing words invited the museum-goer to “take a critical look at how artificial intelligence networks are taught to ‘perceive’ and ‘see’ the world by engineers who provide them with vast training sets of images and words…”. I quickly read the rest of the text, passing over words like ImageNet and algorithms, as my mind was sucked into the vortex of photographs swirling in the distance. The arrangement starts with a single image of an apple, then continually expands in both the sheer number of images and the controversy surrounding their labels. About midway, the pictures have multiplied to fill the wall from floor to ceiling, an overwhelming spectacle of organization amid apparent chaos. The categories range from babies, coffee, and jawbreakers to subarachnoid space, heathen, and schemer, ending with the final category: Anomaly.

This vortex of vivid imagery is Trevor Paglen’s exhibit, “From ‘Apple’ to ‘Anomaly’”, currently at the Barbican in central London. Looking at this piece, I felt an overwhelming sense of smallness as I stood just inches from the image wall, craning my neck to stare at photographs hanging entire body lengths above me. But beyond the physical fact of my insignificance in that room was the hazy fog of confusion that clouded my understanding of what exactly I was looking at. Sure, I was completely fascinated, but by what? What was Trevor Paglen trying to tell me?

Emerging from the darkness, I spoke to a fellow museum-goer, Samantha-Kay, about her experience in the exhibit. She said that the message she took from the exhibit was a single emotion: worry. She felt she had awakened to the biases that exist in machine learning, commenting that when a machine makes a decision, it really comes from whoever made the program or the machine: “it’s from their mind and from their mind-view only”. Samantha also asked, “So what’s exactly going on in that super mind, or that ‘artificial’ super mind?”. To me, she seemed to get at the emotional truth of Paglen’s exhibit, one that feels almost eerie and apocalyptic: a slow deterioration of sense and logic as the categories escalate from a harmless apple to the attempt to render something as abstract as an anomaly in tangible, visible form.

Inspired by Samantha’s insightful interpretation of Paglen’s piece, I decided to embark on a journey to discover my own. I recalled the brief introduction at the entrance of the exhibit, remembering the mention of image sets and machine learning. I began to speculate that the images before me had likely been seen through the eyes of an algorithm. Perhaps, I thought, Paglen was commenting on the way algorithms are used to label abstract concepts, like anomalies, as if they were no different from an apple. I took these images to be the output of such faulty algorithmic identification, the result of machine learning and all the biases that come with it.

Spoiler alert: I was wrong.

It was a few simple words from an employee at the Barbican that completely dismantled my interpretation of “From ‘Apple’ to ‘Anomaly’”. I was told, “People think they understand it, but have gotten the wrong order…they think this is the output of the search rather than the input of the search.” I was part of the masses. I was one of those people who got it completely and utterly wrong. In a matter-of-fact, nonchalant utterance, the employee said: it’s about ImageNet. ImageNet, the very word I had so wrongly glossed over as I read the introduction to the exhibit just hours before.

In a flurry of curiosity, I quickly whipped out my phone and fell into an abyss of Google searches: What is ImageNet? Is ImageNet a bank of images? Is ImageNet a machine? Where did ImageNet’s images come from? My questions were endless, and the answers I found were largely inconclusive. What I did learn is that ImageNet, in a few words, is a data set. But not just any data set: it is arguably the most influential data set in the history of machine learning. In a Quartz article, Dave Gershgorn calls ImageNet “the data that transformed AI research—and possibly the world”. And, to blow your mind even further, over the course of the annual competition built around ImageNet, the accuracy of the winning algorithms at correctly identifying images rose from 71.8% to a whopping 97.3%, which Gershgorn says surpasses human abilities.

So, in other words, think of ImageNet as the world’s best box of flashcards, and machines and algorithms all over the world are begging to be its students. However, these flashcards were not made by a machine. They were made by people.
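For the technically curious, here is a minimal sketch of what studying those flashcards looks like in code. It is written in PyTorch with random stand-in tensors rather than real ImageNet photographs, and the tiny model is purely illustrative, but the shape of the lesson is the real one: an image goes in, the machine guesses a label, and it is corrected against the answer a person wrote down in advance.

```python
# A minimal sketch of the "flashcard" idea behind ImageNet-style training.
# The data here is random stand-in tensors, not real ImageNet images.
import torch
import torch.nn as nn

# Hypothetical stand-ins for flashcards: each "card" pairs a 3x224x224
# image tensor with an integer class label chosen by a person.
images = torch.randn(8, 3, 224, 224)   # the picture side of the card
labels = torch.randint(0, 1000, (8,))  # the human-written answer side

# A deliberately tiny classifier; real ImageNet models are far deeper.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 224 * 224, 1000),    # 1000 classes, as in ImageNet
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(5):
    logits = model(images)             # the model's guess for each card
    loss = loss_fn(logits, labels)     # how far off the guesses are
    optimizer.zero_grad()
    loss.backward()                    # work out how to correct the model
    optimizer.step()                   # nudge it toward the human answers
    print(f"step {step}: loss {loss.item():.3f}")
```

Notice that nothing in this loop ever questions the labels themselves; whatever assumptions or biases people folded into the answer side of the cards, the student dutifully absorbs.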

With this new knowledge, I turned Paglen’s work over to expose a new side. I disconnected it from the flawed and backwards understanding I had when I saw it for the first time. I saw “From ‘Apple’ to ‘Anomaly’” for what it really is: a graveyard of inputs. Despite my initial misunderstanding, I think Paglen’s choice to give ImageNet’s usually invisible, virtual existence a tangible, visual form allows us to see the vast influence this bank of images has on artificial intelligence. It exposes the flaws and innate biases within ImageNet to the eyes of the average passer-by. In a recent interview, Paglen said that “It is important for us to look at these images and to think about the kinds of politics that are built into technical systems. I think that showing those images and labels is itself an indictment of the process—a particular kind of indictment that can only really be done effectively by looking”. Paglen has used ImageNet as source material in a way the data set is rarely subjected to. ImageNet is in front of the camera, not behind the scenes, and this stark exposure awakens the viewer to the innate bias of the data that is teaching machines all over the world to see.