Stereotypes can exist in pre-trained AI models, even in the most commonly used programs offered by Microsoft and Google. As the national conversation on social justice and race intensifies, the tech community is examining some glaring issues found in natural language processing (NLP) techniques. Mission Critical magazine interviewed Beerud Sheth, CEO of Gupshup, to find out how technology leaders can use their expertise to counteract gender and race biases.

“A good use of AI is to use NLP techniques to eliminate latent racism or unconscious bias in the written word — court judgments, legislation, news articles, online documents, social media posts, etc. — that all of us need to be more consciously aware of and work toward fixing,” said Sheth.

According to him, AI technology has proven time and again that it can help course-correct humanity. Let’s find out how. 

Mission Critical: For those who are unfamiliar, can you define what natural language processing (NLP) is?

Sheth: It's the ability of computer software to process and understand human (natural) language, which enables computers to communicate with humans. NLP lies at the foundation of Amazon Alexa, Google Search, Google Translate, and so on.

Without getting too technical, can you describe the AI behind it and what goes into that programming?

Sheth: An AI program reads and analyzes billions of documents to learn the patterns in language. It identifies statistical patterns in word usage so it can deconstruct sentences and understand their meanings. In a way, it mimics the human brain, which builds up the word associations, correlations, and grammar rules of communication. The human brain, or an AI like it, learns, for example, that queens rule, athletes run, and birds fly.
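To make the "statistical patterns" idea concrete, here is a minimal, illustrative Python sketch (ours, not Sheth's) that counts which words co-occur in a toy corpus; real language models learn far richer versions of these statistics from billions of documents.

```python
from collections import Counter
from itertools import combinations

# Toy corpus standing in for the billions of documents a real model reads.
corpus = [
    "the queen rules the kingdom",
    "the athlete runs the race",
    "the bird flies south",
    "a young queen rules wisely",
]

# Count how often each pair of words appears in the same sentence.
cooccurrence = Counter()
for sentence in corpus:
    words = sorted(set(sentence.split()))
    for pair in combinations(words, 2):
        cooccurrence[pair] += 1

# Words that frequently appear together ("queen" with "rules") become
# statistically associated: the raw material of word associations.
for pair, count in cooccurrence.most_common(5):
    print(pair, count)
```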

How is NLP used across various industries?

Sheth: NLP is used to provide a human interface. Instead of humans behaving like computers, computers can now behave like humans. Instead of humans clicking buttons and screens to drive computers, humans can just write or speak to them. It's a bit like Star Trek, where the characters just spoke naturally to computers. It's used for consumer devices like Amazon Alexa, in the legal industry to analyze case law, in the media industry to write articles, by search engines to understand queries, on mobile devices like Google Assistant, in cars for hands-free directions, and more.

Where would you say it's most prevalent?  

Sheth: It is most prevalent in communication-intensive industries (call centers, mobile devices, text messaging, search, etc.) and document-intensive industries (legal, financial, media, etc.). 

Talk about some of the advancements that have been made with NLP over the years and how the technology has been used successfully.

Sheth: NLP has seen a spurt of progress in the last few years with the application of deep learning to develop advanced language models. Models such as ULMFiT, BERT, GPT, RoBERTa, and many others are rapidly advancing the state of the art.
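As one illustration of how accessible these pre-trained models have become, the sketch below loads BERT through the open-source Hugging Face transformers library (our choice of tooling; the interview names no specific stack) and asks it to fill in a masked word.

```python
# Requires: pip install transformers torch
from transformers import pipeline

# Load a pre-trained BERT model for masked-word prediction.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# The model ranks the statistically most likely words for the blank.
for prediction in unmasker("The bird [MASK] south for the winter."):
    print(prediction["token_str"], round(prediction["score"], 3))
```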

Now, let's switch gears a little bit. Recent events have caused the tech community to identify race and gender bias issues in NLP techniques. Can you elaborate on that?

Sheth: Language models learn patterns from human-created documents. To the extent human language or history has biases, the AI will learn (or inherit?) those biases. For example, language models will automatically learn that engineers tend to be men and receptionists tend to be women. The bias in AI models is the inevitable accumulation of biases in human history and language. 
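Sheth's engineer/receptionist example can be observed directly in a pre-trained model. The following sketch (ours, not his) reuses the fill-mask setup above to compare the pronoun the model predicts for each occupation; exact results vary by model and phrasing.

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Compare which pronoun the model predicts for each occupation.
# Inherited bias often shows up as "he" scoring highest for "engineer"
# and "she" scoring highest for "receptionist".
for occupation in ("engineer", "receptionist"):
    sentence = f"The {occupation} said that [MASK] was running late."
    top = unmasker(sentence)[0]  # highest-scoring completion
    print(occupation, "->", top["token_str"], round(top["score"], 3))
```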

However, we know better. We want AI models in 2020 to forget older biases and reflect our current values. This requires conscious effort to selectively debug and eliminate that bias. 

How can NLP auditing be used to eliminate the stereotypical bias that we see today?

Sheth: There are known techniques for de-biasing the models, and the AI community is increasingly conscious of applying them before deploying these models. A few techniques include creating a more representative training set (by consciously adding and removing documents), adding synthetically created documents to diversify the training set (for example, by replacing "he" with "she" in a document), or tweaking the model parameters (by programming titles like "engineer" to be gender-neutral or race-neutral).
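The second technique Sheth mentions, synthetically diversifying the training set, can be as simple as counterfactual swapping of gendered words. Below is a minimal sketch assuming a small word-level swap list; production pipelines handle casing, grammar, names, and context far more carefully.

```python
# Gendered word pairs; a real system would use a much larger,
# linguistically vetted list and handle casing and grammar.
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "her": "him",
    "man": "woman", "woman": "man",
}

def counterfactual(document: str) -> str:
    """Return a copy of the document with gendered words swapped."""
    return " ".join(SWAPS.get(word.lower(), word) for word in document.split())

doc = "she asked him whether the report was ready"
# Train on both versions so the model sees each role filled
# by either gender equally often.
augmented_training_set = [doc, counterfactual(doc)]
print(augmented_training_set)
```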

Can you dissect the way machine learning is developed and how it evolves over time, including the challenges?

Sheth: The combination of continuously improving hardware (with capability growing according to Moore's law), availability of more digital data (as computers record and digitize all aspects of human activity), and new software insights (deep learning and other similar models) is accelerating the growth of machine learning. We've barely scratched the surface, but it's already showing promising results. The potential is immense. But there are concerns, too, around the risk of abuse of this superpower. 

Define AI ethics and how they can be applied to technology to course-correct humanity.

Sheth: AI models need to be embedded with the values we hold dear. Just as the constitution's declaration of equality lies at the foundation of our country's systems, AI systems will need to be designed with important values, guiding principles, and preferences in mind.

Realistically, if humans program the machines, can AI ever be 100% objective, and, if so, how long will it take to get there?

Sheth: AI is inherently statistical. AI systems are designed to be probably approximately correct, a bit like humans. AI can certainly be better than humans at ignoring biases, since it's easier to test and modify software. However, we're heading into a new world with unforeseen implications. Societies often use biases like affirmative action to compensate for other historical biases like racism. Will the elimination of all bias lead to fairness? What does fairness even mean? There is much more to think about and discuss than can be covered in this response.