By Lisa Rein
A lot of folks were wondering what Chelsea Manning meant when she discussed a “Code of Ethics” during her SXSW talk last March. Well, there’s no need to wonder, because Chelsea discussed this in detail with her co-panelists Kristian Lum (Human Rights Data Analysis Group) and Caroline Sinders (Wikimedia Foundation) during the Ethical Algorithms track at the last Aaron Swartz Day at the Internet Archive.
Here’s a partial transcript to shed some light on the situation.
The panel discussed the issue of valid data and how companies have become more concerned with whether they can sell an algorithm-driven software product, and less concerned with whether the damned thing works and who its results might affect, should they be inaccurate.
Link to the complete video for Ethical Algorithms panel.
Lisa Rein: Okay, first question: Are the software companies that are making these algorithm-based products just selling them to whomever they can, for whatever they can apply them to? And if so, how can we stop this from happening?
These companies are powerful and getting larger and more powerful all the time. Yet no one seems to care, not even the companies buying the snake oil products, as long as they can resell the services somehow and the money keeps coming in. What can we do?
Chelsea Manning: Me personally, I think that we in technology have a responsibility to make our own decisions in the workplace – wherever that might be. And to communicate with each other, share notes, talk to each other, and really think – take a moment – and think about what you are doing. What are you doing? Are you helping? Are you harming things? Is it worth it? Is this really what you want to be doing? Are deadlines being prioritized over – good results? Should we do something? I certainly made a decision in my own life to do something. It’s going to be different for every person. But you really need to make your own decision as to what to do, and you don’t have to act individually. We can work as a community. We can work as a collective entity. People in technology, we’re not going to be able to explain to people – all this stuff. People know that everything’s messed up, and they know that things are messed up because of the algorithms that we have. We’ve educated them on that. They understand that. They understand that viscerally, because they see the consequences and the results of these things that are happening every day.
The problem is that people in technology aren’t paying attention, and even some of us who are paying attention, aren’t doing anything about it. We’re waiting for somebody to do something about it. Somebody else isn’t going to do it. We’re going to have to do it ourselves.
Caroline Sinders: Even if you feel like a cog in the machine, as a technologist, you aren’t. There are a lot of people like you trying to protest the systems you’re in. Especially in the past year, we’ve heard rumors of widespread groups and meetings of people inside of Facebook, inside of Google, really talking about the ramifications of the U.S. Presidential election, of questioning, “how did this happen inside these platforms?” – of wanting there even to be accountability inside of their own companies. I think it’s really important for us to think about that for a second. That that’s happening right now. That people are starting to organize. That they are starting to ask questions.
I think especially looking at where we are right now in San Francisco – inside the hub of Silicon Valley – in a space that’s very amenable to protest and very amenable to supporting ethical technology. How do we build more support for other people? Is it going to spaces we’re not usually in? Is it going to other tech meetups? Maybe. Is it having hard conversations with other technologists? Probably. How do we push the politics of our community into the wider community? We have to go and actually evangelize that, I think.
Is the company even using the algorithm in the way it was intended to be used? Often a company purchases an algorithm that is made for one kind of analytics and it gets used for a completely different thing, and then you get these really skewed results.
Lisa Rein: Wait a minute. I can’t believe I’m asking this, but, are you saying that, as long as they like the results, nobody cares if the results are accurate?
Caroline Sinders: ‘How sure are we that it’s true?’ is not the question that I’m hearing in the conference room. It’s more like ‘We’ve gotten these results and these people have purchased it,’ or ‘It’s selling really well.’ ’Cause we are in the age of people building software as a product, capabilities as a product, APIs as a product. (Meaning that you buy access to an API that’s like a pipeline.) And if it’s returning certain results that a company can then use and put in a portfolio to sell to other kinds of clients, like, it doesn’t actually matter how well it works, if it has the appearance of working; if it’s pumping out ‘results.’ So, I can’t speak to, like, the academic verifiability of different kinds of APIs. I can speak to the fact that I have not ever heard people really talk about that.
Chelsea Manning: Yeah. I’ve had experience with this in particular… For verifiability of data, it’s purely academic. That’s what I’ve found. Whether you are working in a corporate or a business setting, or whether you are working in a government setting – military context or whatever – it’s ‘results, results, results.’ Nobody cares how verifiable the data is. Everybody’s cutting corners. Everybody’s trying to meet deadlines. That’s why we – people in technology – need to be thinking more ethically. We need to be very cognizant of the systems that we’re building, and not just sitting there, continually meeting deadlines and meeting priorities that are set by our leadership, or by clients, or by senior corporate, ya know, C-suite people.
You really need to think about what you’re doing. What the consequences of what you’re doing are. Because these (questions) are not happening, and they should be happening. In many cases, for some of these systems, maybe the question of whether we should be doing a system like this at all is a question that should be asked – at least asked, in some of these rooms. It’s not, and it’s not going to be.
Caroline Sinders: I think there is a big push though, if you work in industry software, to really understand the ethical ramifications of the products you’re using, or the software that you’re using, and how it affects your users. And how this affects even unintended bystanders – people that have not opted in to the system, or into the product, right? And that’s where you get into, like, different surveillance systems, or systems that are in the whole vein of the Internet of Things, right? How many people are “accidentally” a part of a data set that they didn’t get to opt in to?
Kristian Lum: And in some cases, remember, you yourself can act as a sponge for accountability. Because now, let’s say you have a system that’s been purchased, that’s been created by “peer reviewed science” or very expensive technology, and it’s saying to do the thing that your organization kind of wants to do anyway. Well, maybe do some research and show the people you’re working with, and say, “Hey, we may be over-policing this community.” Because, otherwise, it’s like “Hey, this software we spent all this money on is telling us to do it,” which gives them justification to do what they want to do anyway. So, try to maybe act like a buffer between these viewpoints, by being able to ask, and question, ‘Why are you doing that?’
Lisa Rein: Would opening up the “black box” solve everything?
Chelsea Manning: It’s not just that it’s a black box. Even when the code is available to you, sometimes how it’s actually coming up with the predictions it’s coming up with – apart from doing pure math – can escape you when you’re trying to come up with an explanation that’s understood by humans. So I think that’s one of the dangers of depending on showing the entire algorithm. We have to fully understand these algorithms and not just see how they work from a code perspective or an algorithmic perspective.
What scares me with that is some of these algorithms being used in, like, bail hearings… You’re literally changing this person’s life, because they are going to stay in jail, because they are in a bail hearing where an algorithm – made by some company – decided that they’re predicted to be more likely to be arrested. It’s not evidentiary in any way, but it’s being used in an evidentiary manner. It’s just a mathematical prediction based on false data – or poor data – and it’s actually tearing people’s lives apart. And it’s also feeding into this feedback loop, because they’re seen as being re-arrestable. Therefore, it reinforces the data set.
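(Editor’s sketch: The feedback loop described here can be illustrated with a toy simulation. Nothing below comes from the panel or from any real risk-assessment product; the function names, the two groups, and every number are hypothetical. The sketch assumes that scrutiny is split in proportion to arrests already on record, and that both groups behave identically.)

```python
def simulate_feedback(initial_records, true_rate, total_scrutiny, rounds):
    """Hypothetical model: scrutiny follows past records, new records follow scrutiny."""
    records = [float(r) for r in initial_records]
    history = [tuple(records)]
    for _ in range(rounds):
        total = sum(records)
        # Scrutiny is split in proportion to the arrests already on record.
        scrutiny = [total_scrutiny * r / total for r in records]
        # Both groups share the same true_rate, so new recorded arrests
        # differ only because the amount of scrutiny differs.
        new_records = [s * true_rate for s in scrutiny]
        records = [r + n for r, n in zip(records, new_records)]
        history.append(tuple(records))
    return history


def share_of_first_group(records):
    """Fraction of all recorded arrests that belong to the first group."""
    return records[0] / sum(records)


# Made-up starting point: the historical data is already skewed 60/40.
history = simulate_feedback(initial_records=(60, 40), true_rate=0.1,
                            total_scrutiny=100, rounds=5)
for step, records in enumerate(history):
    gap = records[0] - records[1]
    print(step, round(share_of_first_group(records), 3), round(gap, 2))
```

In this toy run the first group’s share of recorded arrests never moves back toward half, and the absolute gap between the groups widens every round, even though the two groups behave identically by construction. The skew in the starting data is the only thing the model ever learns from.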
Kristian Lum: There are a lot of models now predicting whether an individual will be re-arrested in the future. Here’s a question: What counts as a “re-arrest”? Say someone fails to appear for court and a bench warrant is issued, and then they are arrested. Should that count? So I don’t see a whole lot of conversation about this data munging.
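(Editor’s sketch: To make the “what counts as a re-arrest?” question concrete, here is a minimal example of how that one labeling choice changes the number a model is trained to predict. The records, field names, and reason codes are invented for illustration and do not describe any real data set or product.)

```python
from dataclasses import dataclass


@dataclass
class Arrest:
    person_id: int
    reason: str  # hypothetical codes: "new_offense" or "bench_warrant_fta"


def rearrest_rate(person_ids, arrests, count_bench_warrants):
    """Share of people labeled 're-arrested' under a given labeling rule."""
    rearrested = {
        a.person_id
        for a in arrests
        if count_bench_warrants or a.reason != "bench_warrant_fta"
    }
    return len(rearrested) / len(person_ids)


# Ten made-up people and five made-up arrest records.
people = list(range(1, 11))
arrests = [
    Arrest(1, "new_offense"),
    Arrest(2, "new_offense"),
    Arrest(2, "bench_warrant_fta"),
    Arrest(3, "bench_warrant_fta"),  # arrested only for a missed court date
    Arrest(4, "bench_warrant_fta"),
]

strict = rearrest_rate(people, arrests, count_bench_warrants=False)
broad = rearrest_rate(people, arrests, count_bench_warrants=True)
print("new offenses only:", strict)
print("including bench warrants:", broad)
```

With these invented records, the “re-arrest” rate is 20 percent if only new offenses count and 40 percent if bench-warrant arrests for failing to appear are included. The same people and the same events produce two different training labels, which is the kind of data-munging decision that rarely gets discussed.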
Caroline Sinders: I think some of the investigative reporting that ProPublica has done specifically on this is really worth highlighting.
(Editor’s Note: Parts of this partial transcript were rearranged slightly for flow and readability.)