When a computer science professor at the University of Virginia examined image-recognition software he was developing, he began to notice a pattern. The software showed a significant gender bias when interpreting the context of the images it was shown. For example, when presented with a photo of a kitchen, it was more likely to associate the scene with women.
Research traced the cause of this bias to the collection of images used to train the software in the first place. The computer was trained on ImSitu, a data set curated by the University of Washington. It is one of the most widely used image collections for research, alongside COCO, which was initially coordinated by Microsoft and is now cosponsored by Facebook. Both collections contain over 100,000 images, with more images of men than of women. Both push outdated stereotypes through a heavy association of women with, say, kitchen objects and of men with sporting equipment. For instance, the study found that in the ImSitu collection, depictions of people cooking were 33% more likely to involve women than men.
The system did not just mimic this bias; it amplified it. While the initial data set associated women with shopping and men with sports, the computer processed that data and produced an even stronger association. Even when it was shown a photo of a man in a kitchen, the software labeled the person a “woman.”
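The amplification effect can be illustrated with a toy sketch. Suppose a classifier simply predicts the majority gender it saw for each scene during training; a modest skew in the data then becomes an absolute skew in the output. The counts below are hypothetical, chosen only to make the point, not taken from the study:

```python
from collections import Counter

# Hypothetical (scene, gender) training counts -- illustrative numbers only.
train = {
    ("kitchen", "woman"): 66, ("kitchen", "man"): 34,
    ("stadium", "man"): 70,   ("stadium", "woman"): 30,
}

def majority_label(scene):
    """Predict the gender most often paired with this scene in the training data."""
    counts = Counter()
    for (s, gender), n in train.items():
        if s == scene:
            counts[gender] += n
    return counts.most_common(1)[0][0]

# The training data is 66% "woman" for kitchens, but the output is 100% "woman":
# every kitchen photo, including those showing men, gets labeled "woman".
print(majority_label("kitchen"))  # prints "woman"
print(majority_label("stadium"))  # prints "man"
```

The 66/34 split in the data becomes a 100/0 split in the predictions, which is exactly the amplification the researchers observed.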
This isn’t the first time computer programs have exhibited bias. Last year, researchers at Boston University studied a system that showed a similar bias with words. When given the phrase “man is to computer programmer as woman is to x,” the system responded with “x = homemaker.” On another occasion, it completed “father is to doctor as mother is to x” with “nurse.”
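The analogy completion comes from simple arithmetic on word vectors: the system computes programmer - man + woman and returns the word whose vector lies nearest the result. A minimal sketch with hand-made two-dimensional vectors (purely illustrative; real systems learn embeddings with hundreds of dimensions from large text corpora, which is where the bias creeps in):

```python
import math

# Toy 2-D word vectors, hand-placed so the second axis encodes a gender skew.
vecs = {
    "man":        (0.0,  1.0),
    "woman":      (0.0, -1.0),
    "programmer": (5.0,  1.0),
    "homemaker":  (5.0, -1.0),
    "doctor":     (3.0,  1.0),
    "nurse":      (3.0, -1.0),
}

def analogy(a, b, c):
    """Solve 'a is to b as c is to x': find the word closest to b - a + c."""
    target = tuple(vecs[b][i] - vecs[a][i] + vecs[c][i] for i in range(2))
    candidates = (w for w in vecs if w not in (a, b, c))
    return min(candidates, key=lambda w: math.dist(vecs[w], target))

print(analogy("man", "programmer", "woman"))  # prints "homemaker"
print(analogy("man", "doctor", "woman"))      # prints "nurse"
```

Because the vectors encode whatever gender associations the training text contained, the arithmetic faithfully reproduces them.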
This phenomenon extends to other kinds of bias, like racial bias. A previous study found that a machine trained on words taken from online material was more likely to rate a CV highly if it belonged to a person with a traditionally European name.
This doesn’t mean that we should be putting a bat to our computer screens in the name of equality. In truth, the problem isn’t the machinery at all; it’s us. The machine is not innately biased. It is simply taking in the biased input that programmers feed it and mimicking it. When a system is introduced only to skewed and limited portrayals of women and people of color, it is bound to reflect that in the way it functions.
I would argue that people work the same way. Call me an optimist, but I don’t think that people are innately sexist or racist. I believe that we are born into a society that portrays women and people of color in a limited way, and that this shapes the way we think. When women in the media are shown only as damsels in distress or sex symbols, those portrayals are reflected in the ubiquity of sexual harassment in the workplace. When people of color are villainized in the news, it is reflected in high incarceration rates. To some extent, we are input-output machines as well.
Most of the researchers agreed that the way to address the bias in the machines studied is to eliminate the bias in the information they receive. This is particularly difficult, both in a lab and in society, because implicit biases aren’t always noticeable. It will likely take time and a lot of reflection for us to rid those machines of bias, and ourselves as well.