Supa CEO Mark Koh discusses the importance of diversity in the machine learning process in order to create ethical AI systems.
By Mark Koh
The artificial intelligence industry is expected to grow more than tenfold in the next decade, and as machines equipped with AI become more ubiquitous, it’s imperative that they operate in a trustworthy manner.
For AI models to be fully functional, we have to look at more than just the technical issues. There are also ethical issues to consider. Although we may only be in the early stages of an AI boom, we must integrate ethics into AI development, starting now.
There are many steps in the AI development process, but they can be boiled down to three main parts: learning, reasoning and self-correcting. At each of these points, there are algorithms involved.
In the learning stage, building the algorithms for an AI model requires data acquisition and labeling. In the reasoning stage, the AI chooses the best algorithm for a specific situation. It then self-corrects, continually improving until it achieves its purpose.
At every stage, from planning for data collection early in the process to adding further improvements to the AI, there is the potential for bias to creep into the final product. These biases often arise from a lack of diversity within the industry and often result in mistakes that are unacceptable in a machine that is supposedly fully functional.
When AI mistakes become dangerous
Some mistakes, like a chatbot responding oddly to a comment, may be considered funny. Other mistakes are discriminatory, like face recognition software not working as well for women as for men, or incorrectly labeling someone in a kitchen as a “woman”.
Some mistakes can be perilous. For example, in 2017, a Palestinian man was arrested and questioned by the Israel Police after Facebook’s automated translation interpreted his “good morning” caption as “to hurt”.
Research conducted on the healthcare industry has shown that there have been “racial disparities” in pain management for children, where African-American children were less likely to be given medications for moderate to severe pain. Imagine an AI healthcare system that learns from these records.
These sorts of mistakes are more than just discriminatory or unfair. They are dangerous. And every single time an AI makes a decision, there is a possibility of detrimental results. Should only a limited set of demographics be represented across the entire process of AI development?
If we are looking to integrate ethics into AI development, we must start by introducing diversity in every step of the process — from data collection all the way to product testing.
At the data collection stage, it’s important to think about how the data is collected, processed and labeled. Have things like cultural bias been taken into account when collecting data? Is the data reliable? How is the data processed so that it is representative of all the situations a machine might encounter?
When collecting and processing training data, it’s important that data scientists are aware of possible biases. One way to counter them is to ensure that enough data is collected from varied samples. This in itself is a lengthy process that requires mindfulness.
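As a rough illustration of what checking for varied samples could look like in practice, the sketch below counts how much of a dataset each group contributes and flags any group that falls under a chosen threshold, including groups that are missing altogether. The function name `representation_gaps`, the `age_band` field, the 10% threshold and the toy records are all assumptions made up for this example, not a standard or a recommended cut-off.

```python
from collections import Counter


def representation_gaps(samples, group_key, expected_groups, min_share):
    """Return the share of each expected group that falls below min_share.

    Groups that never appear in the data are reported with a share of 0.0.
    """
    counts = Counter(sample[group_key] for sample in samples)
    total = sum(counts.values())
    if total == 0:
        # An empty dataset represents nobody.
        return {group: 0.0 for group in expected_groups}
    return {
        group: counts[group] / total
        for group in expected_groups
        if counts[group] / total < min_share
    }


# Hypothetical toy dataset: each record carries an age band.
samples = (
    [{"age_band": "18-29"}] * 70
    + [{"age_band": "30-49"}] * 25
    + [{"age_band": "50-64"}] * 5
)

underrepresented = representation_gaps(
    samples,
    group_key="age_band",
    expected_groups=["18-29", "30-49", "50-64", "65+"],
    min_share=0.10,
)
print(underrepresented)  # {'50-64': 0.05, '65+': 0.0}
```

A check like this only covers the attributes someone thought to record, so it supports the questions a diverse team would ask rather than replacing them.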
From the beginning, someone in charge of the process would have to ask questions like: do we have enough data, are there existing datasets that we can use, and how do we generate data that we can use? If there is enough data, would we need to improve existing models? Or do we need more labeled data for better machine learning?
At the data labeling phase, having a diverse labeling team can help to eliminate bias in training data sets, which results in data sets that are truly accurate and of high quality.
Most people think of gender when diversity is brought up, but it’s more extensive than that. Race, age, religion, culture and even income can be factors that might affect how AI can be applied.
For example, teenagers might use emojis to mean something different from what a 40-something person might use them for. An AI-powered car safety system that is trained using only male-centric data for body weight and size could make fatal errors in the case of female users, who generally have lower body weight and smaller size.
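One way a team might surface this kind of gap before release is to report accuracy separately for each group instead of as a single overall number. The sketch below is a minimal example of that idea. The helper `accuracy_by_group`, the group names and the evaluation records are invented for illustration and do not come from any real system.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy separately for each group.

    Each record is a (group, predicted_label, true_label) tuple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical evaluation results, invented for illustration only.
records = (
    [("men", "safe", "safe")] * 9
    + [("men", "safe", "unsafe")] * 1
    + [("women", "safe", "safe")] * 6
    + [("women", "safe", "unsafe")] * 4
)

overall = sum(predicted == actual for _, predicted, actual in records) / len(records)
scores = accuracy_by_group(records)

print(overall)  # 0.75
print(scores)   # {'men': 0.9, 'women': 0.6}
```

In this made-up case the overall figure of 0.75 hides the fact that the model is wrong four times as often for one group as for the other, which is the kind of disparity an aggregate score can mask.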
As AI systems play larger roles in decision-making processes, it’s imperative that they are built on inclusive models. In order to do this, everyone who plays a role in the development process — no matter how big or small — must do their part to call out biases or be aware of disparities.
And leaders in the tech industry, whether it’s a founder or a CEO or a CTO, must cultivate a work environment that rewards diversity, curiosity and collaboration. This way, we will find ourselves with AI systems that can be truly “user-friendly” for humans.
It may seem like AI isn’t something that’s part of our daily lives. But it is — from our search engines to face ID unlocking on our phones. As AI becomes more and more ubiquitous, taking over certain services in the future, it’s vital that we get it right.
As this technology evolves, we mustn’t find ourselves in a position where a single person has to decide how AI development is conducted.
It’s too large a responsibility and there are simply too many steps within the process, with too many risks.
It is also a collective responsibility to have diversity and inclusivity built into our part of the process so that we can get the best model at the end.
Ed. Recently, the AI debate has intensified due to the release of ChatGPT on the market and the intensifying AI competition between the tech giants. Photo courtesy of Unsplash+ and Alex Shuper.