Source: Steering the right course for AI from Google Cloud
Over the past year, I’ve met with hundreds of customers to talk about how AI has changed their sense of what’s possible. We’ve discussed the way algorithms are reducing workloads for doctors and nurses by intelligently triaging patients, connecting journalists to global audiences with accurate language translation, and even reducing customer service wait times by automatically responding to common requests. I’ve been amazed at how we can apply AI to help solve so many business problems for our customers, but these same customers express a level of uncertainty and concern about AI, as well.
For all the amazing things this technology enables, it brings with it the potential for unintended consequences. This has many of our customers asking: how can we benefit from AI while avoiding its challenges?
To put this discussion in perspective, I often start by presenting the above image. It’s the Mosaic web browser as it appeared in 1994, and I think it’s an apt metaphor for where AI stands in 2018. Like the web of the mid-’90s, today’s AI is rapidly transitioning from an academic niche to a mainstream technology. The internet revolution brought about benefits and risks alike, and we have an obligation to consider the full range of possibilities before us. After all, while it was easy to see that technologies like email and text messaging would help us stay in touch, it was harder to imagine their role in the spread of malicious software or cyberbullying.
The coming decade will likely pose challenges even more complex than those of the early web, but I’m heartened by the eagerness our customers have shown to address them proactively. In fact, the same questions tend to come up again and again:
Unfair bias: How can we be sure our machine learning models treat every user fairly and justly?
Interpretability: How can we make AI more transparent, so we can better understand its recommendations?
Changing workforce: How can we responsibly harness the power of automation while ensuring today’s workforce is prepared for tomorrow?
Doing good: Finally, how can we be sure we’re using AI for good?
It’s tempting to imagine algorithms as infallible and objective, but the truth is that machine learning models are only as reliable as the data they’re trained on. And because humans are responsible for finding, organizing, and labeling that data, it’s all too easy for even the slightest irregularity to make a measurable difference in the result. Worse still, since algorithms perform at superhuman speeds and global scales, unfair bias isn’t just duplicated—it’s amplified.
Although unfair bias can be the product of deliberate prejudice, our blind spots play a far more pervasive role. For instance, we have a natural tendency to gravitate towards people and ideas that confirm our beliefs while avoiding those that challenge them. It’s a phenomenon known as confirmation bias, and it can distort the perception of even the most well-intentioned developer.
Additionally, because unfair biases are already found in the world around us, even faithfully collected data can reflect them. For instance, historical volumes of text—often used to train machine learning models that deal with natural language processing or translation—can perpetuate harmful stereotypes if left uncorrected. Seminal work by Bolukbasi et al. quantified this phenomenon with disturbing clarity, demonstrating how easily statistical language models can “learn” outdated assumptions about gender, such as “doctor” being “male” and “nurse” being “female.” Similar issues, known as embedded biases, have been demonstrated with respect to race as well.
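To make the embedded-bias finding concrete, here is a minimal sketch using tiny hand-made vectors. All values are hypothetical and purely for illustration; real word embeddings, like those studied by Bolukbasi et al., are learned from large text corpora. Projecting occupation words onto a "he minus she" direction shows how an embedding can encode gendered associations:

```python
import numpy as np

# Toy word vectors (hypothetical values for illustration only; real
# embeddings such as word2vec are trained on billions of words).
vectors = {
    "he":     np.array([ 1.0, 0.2, 0.1]),
    "she":    np.array([-1.0, 0.2, 0.1]),
    "doctor": np.array([ 0.4, 0.9, 0.3]),
    "nurse":  np.array([-0.5, 0.8, 0.3]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# A simple gender direction: he - she. Projecting occupation words
# onto it reveals how strongly the embedding associates them with gender.
gender_direction = vectors["he"] - vectors["she"]

for word in ("doctor", "nurse"):
    score = cosine(vectors[word], gender_direction)
    print(f"{word}: gender projection = {score:+.2f}")
```

Debiasing techniques in this line of research work by identifying such a direction and removing its component from words that should be gender-neutral.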
We’re approaching these issues on multiple fronts, and awareness is among the most important. To foster a wider understanding of the need for fairness in technologies like machine learning, we’ve created educational resources like ml-fairness.com and the recently announced fairness module in our Machine Learning Crash Course.
We’ve also seen an encouraging trend towards documentation as a means to better understand what goes on inside a machine learning solution. Earlier this year, researchers suggested a formal approach to documenting datasets, especially when they contain human-centric or demographically sensitive information. Building on this idea, researchers at Google have proposed “model cards,” a standardized format for describing the goals, assumptions, performance metrics, and even ethical considerations of a machine learning model. At a glance, model cards are intended to help developers—regardless of ML expertise—to make informed decisions about using a given component responsibly.
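As a rough illustration of the idea, a model card can be thought of as a structured record that travels with the model. The sketch below is hypothetical: the field names and values are invented, and the actual model cards proposal defines a richer set of sections.

```python
from dataclasses import dataclass, field

# A minimal, hypothetical sketch of a model card record; the real
# proposal covers intended use, performance across subgroups, caveats,
# ethical considerations, and more.
@dataclass
class ModelCard:
    name: str
    intended_use: str
    training_data: str
    metrics: dict = field(default_factory=dict)          # e.g. accuracy per subgroup
    ethical_considerations: list = field(default_factory=list)

card = ModelCard(
    name="toy-sentiment-classifier",
    intended_use="Ranking customer feedback by sentiment; not for employment decisions.",
    training_data="Public product reviews, English only.",
    metrics={"accuracy_overall": 0.91, "accuracy_non_native_speakers": 0.84},
    ethical_considerations=["Lower accuracy on non-native English text."],
)

# A reader can see at a glance where the model is (and isn't) appropriate.
print(card.name, card.metrics["accuracy_overall"])
```

The value of the format lies less in any particular schema than in making assumptions and known limitations explicit before a component is reused.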
Of course, we’ve always committed to empowering developers with the tools they can rely on, and the challenge of bias is no different. This begins with embedded documentation like our Inclusive ML Guide, integrated throughout AutoML, and extends into tools like TensorFlow Model Analysis (TFMA) and the What-If Tool, which give developers the analytic insights they need to be confident their models will treat all users fairly. TFMA makes it easy to visualize the performance of a model across a range of circumstances, features and subsets of its user population, while What-If allows a developer to easily run counterfactuals, shedding light on what might happen if key characteristics were reversed, such as the demographic attributes of a given user. Both tools provide immersive, interactive ways to explore machine learning behavior in detail, helping you to identify lapses in fairness and representation.
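The core idea behind slicing tools like TFMA can be sketched in a few lines: compute the same metric separately for each subset of the user population rather than a single aggregate. The subgroup names and predictions below are invented for illustration, but they show why per-slice evaluation matters:

```python
from collections import defaultdict

# Hypothetical evaluation records: (subgroup, predicted_label, true_label).
results = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 1, 0),
    ("group_b", 0, 1), ("group_b", 0, 1), ("group_b", 1, 1), ("group_b", 0, 0),
]

correct = defaultdict(int)
total = defaultdict(int)
for group, pred, truth in results:
    total[group] += 1
    correct[group] += int(pred == truth)

# Per-slice accuracy surfaces a gap that the overall number would hide:
# the aggregate accuracy is 62.5%, but the two groups fare very differently.
accuracy = {g: correct[g] / total[g] for g in total}
print(accuracy)
```

Tools like TFMA do this at production scale and across many metrics at once, but the principle is the same: a model that looks fine on average can still underserve a specific slice of its users.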
Finally, we’re harnessing the power of community through Kaggle, our data science platform. Our recently announced Inclusive Images Challenge tackles the issue of skewed geographical diversity in image training sets, which has resulted in classifiers that often struggle with depictions of people from underrepresented regions. Contestants are challenged to build models that better generalize across geography—without incorporating new data—leading to more inclusive, robust tools that better serve a global user base. We’re optimistic that progress in this task will have applications elsewhere, and we’re excited to present the results at the 2018 Conference on Neural Information Processing Systems.
I’m proud of the steps we’re taking, and I believe the knowledge and tools we’re developing will go a long way towards making AI more fair. But no single company can solve such a complex problem alone. The fight against unfair bias will be a collective effort, shaped by input from a range of stakeholders, and we’re committed to listening. As our world continues to change, we’ll continue to learn.
As pressing as the challenge of unfair bias is, however, it’s part of an even more fundamental question: how can AI truly earn our trust? As machine learning plays a growing role in decisions that were once the exclusive domain of humans, the answer will depend more and more on a crucial factor: accountability.
Since their inception, many deep learning algorithms have been treated like black boxes, as even their creators struggle to articulate precisely what happens between input and output. We cannot expect to gain people’s trust if we continue to treat AI like a black box, as trust comes from understanding. While the logic of traditional software can be laid bare with a line-by-line examination of the source code, a neural network is a dense web of connections shaped by exposure to thousands or even millions of training examples. The result is a tradeoff: flexibility is gained at the cost of an intuitive explanation.
Progress is being made with the establishment of best practices, a growing set of tools, and a collective effort to aim for interpretable results from the start of the development cycle. In fact, when we published our own principles for building responsible AI systems earlier this year, interpretability was among its four most fundamental pillars.
Already, we’re seeing exciting efforts to bring interpretability to real-world problems. In the case of image classification, for instance, recent work from Google AI demonstrates a method to represent human-friendly concepts, such as striped fur or curly hair, then quantify the prevalence of those concepts within a given image. The result is a classifier that articulates its reasoning in terms of features most meaningful to a human user. An image might be classified “zebra”, for instance, due in part to high levels of “striped” features and comparatively low levels of “polka dots”. In fact, researchers are experimenting with the application of this technique to diabetic retinopathy diagnosis, making output more transparent—and even allowing the model to be adjusted when a specialist disagrees with its reasoning.
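A toy sketch of that idea, with invented numbers: represent each human-friendly concept as a direction in the model’s feature space, then score an input’s features against those directions. The real technique learns these concept directions from example images; everything here is a hypothetical simplification.

```python
import numpy as np

# Hypothetical concept directions in a model's internal feature space
# (in practice these would be learned from labeled example images).
concepts = {
    "striped":    np.array([0.9, 0.1, 0.0]),
    "polka_dots": np.array([0.0, 0.2, 0.9]),
}

# Feature activations for one input image (illustrative values).
features = np.array([0.8, 0.3, 0.1])

# Scoring features against each concept yields a human-readable account
# of the classifier's reasoning, e.g. "mostly striped, barely dotted".
scores = {name: float(np.dot(features, vec)) for name, vec in concepts.items()}
print(scores)
```

A "zebra" prediction accompanied by a high "striped" score and a low "polka dots" score gives a specialist something concrete to agree or disagree with.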
There’s no denying our relationship to work is changing, and many of our customers are wondering how they should balance the potential of automation with the value of their workforce.
I don’t see the future of automation as a zero-sum game, however. In a recent PwC report, 67% of executives said that AI will help humans and machines work together more effectively, combining artificial and human intelligence.
It’s also important to remember that jobs are rarely monolithic. Most consist of countless distinct tasks, ranging from high-level creative work to routine repetition, and each will be affected by automation to a different degree. In radiology, for instance, algorithms are playing a supporting role: by automating the evaluation of simple, well-known symptoms, they free a human specialist to focus on more challenging tasks while working faster and more consistently.
However, some job categories face more immediate change than others, and much can be done to ease the transition. To this end, Google.org has dedicated a $50 million fund to support nonprofits preparing for the future of work across three broad endeavors:
Providing lifelong training and education to keep workers in demand
Connecting potential employees with ideal job opportunities based on skills and experience
Supporting workers in low-wage employment
Of course, this is just a first step, and we look forward to supporting a growing range of similar initiatives in the coming years.
Finally, there’s the question that transcends everything: “how can I be sure I’m using AI to make a positive difference in people’s lives?”
This is a difficult question to answer, made all the harder by our tendency to focus on how AI behaves at the extremes. For example, few would deny that using AutoML to affordably monitor endangered species, as the Zoological Society of London has done, is unambiguously good. We have also seen how TensorFlow, Google’s open-source machine learning framework, is helping Rainforest Connection fight illegal deforestation, helping farmers identify diseased plants, and helping predict the likelihood of forest wildfires. Furthermore, our AI for Social Good program recently announced a $25 million grant to help fund AI research that will tackle humanitarian and environmental challenges. And our Data Solutions for Change program continues to help non-profit organizations and NGOs use purpose-driven analytics to fight unemployment, detect Alzheimer’s, create more sustainable food systems, and optimize community programming.
But there’s an enormous grey area, especially in controversial domains like AI for weaponry, an application of this technology that, as stated in our AI Principles, we have decided not to pursue. Our customers occupy a variety of positions along the spectrum of controversial use cases, and they are looking to us to help them think through what AI means for their business.
We are working with both our customers and our product teams to navigate these areas. To bring an informed, outside perspective to this question, we enlisted the help of technology ethicist Shannon Vallor, who consults across Cloud AI to help shape our understanding of this ever-evolving grey area and how our work fits into it. From internal educational programs on best practices in AI ethics to consultation on real-world implementations of our AI Principles, she brings an expert perspective to Cloud AI on how this technology can be guided by ethical design, analysis, and decision-making. For example, ethical design principles can help us build fairer machine learning models. Careful ethical analysis can help us understand which potential uses of vision technology are inappropriate, harmful, or intrusive. And ethical decision-making practices can help us reason through challenging dilemmas and complex value tradeoffs, such as whether to prioritize transparency or privacy in an AI application when providing more of one may mean less of the other.
For all the uncertainties that lie ahead, one thing is clear: the future of AI will be built on much more than technology. This will be a collective effort, equally reliant on tools, information, and a shared desire to make a positive impact on the world.
That’s why this isn’t a declaration—it’s a dialogue. Although we’re eager to share what we’ve learned after years at the forefront of this technology, no one knows the needs of your customers better than you, and both perspectives will play a vital role in building AI that’s fair, responsible and trustworthy. After all, every industry is facing its own AI revolution, which is why every industry deserves a role in guiding it. We look forward to an ongoing conversation with you on how to make that promise a reality.