AI images and the 'Diversity Error'

Is Artificial Intelligence (AI) woke? If the art that Large Language Models (LLMs) such as ChatGPT and Google’s Gemini generate is anything to go by, perhaps yes. They are busy creating versions of reality that are very different from reality itself.

For perspective, think of the ‘Diversity Error’. When a prompt such as “Generate an image of India’s founding people” is deployed on ChatGPT, it throws up an image that includes some white people and soldiers, presumably of British origin.

This, it seems, is because these models have been coded to believe that people of all ethnicities must be included in the pictures they generate. So, when a question such as “Who are the white people here?” is asked, the response is a comforting one that begins, “They were not intended to be part of the depiction of India’s founding people and independence movement.” ChatGPT then explains, in a roundabout way, that this is for the sake of diversity, so that we get a sense of the larger cast of characters that existed back then.

This wokeness isn’t exclusive to ChatGPT. Google’s Gemini has done worse in the past, insisting on depicting Nazi soldiers as people of every colour and ethnicity except white. Much outrage followed, and Google had to pull the plug on some of Gemini’s image-generation capabilities.

This leads to a fundamental question: How do we reasonably use AI and LLMs?

Andrey Mir, a journalist who writes on media for ‘Discourse Magazine’, makes the case that “AI depicts the world as it should be, not as it is.” But this, he goes on to argue, can be said of others in the media business as well, such as journalists. Stretch this argument a little further and advertising professionals make the cut too: journalists strive for an idealistic world, while those in advertising aim for an idealised one. The LLMs that power tools such as ChatGPT and Gemini work on the basis of the training they get from various inputs, which include journalistic accounts and advertisements.

Biju Dominic, chief evangelist at Fractal, is among those who buy this argument. “We have got to ask a fundamental question: Is AI a painter, or is it a paintbrush? I believe it is a paintbrush.” Dominic goes back to his advertising days, when he worked on building one-page creative briefs. Work would begin by answering 10 questions; the creative brief would follow from that. “Now,” he says, “the prompt has replaced the questions.” It wasn’t too long ago that everyone imagined prompt-writing to be an engineering skill. “But we have veered around to the view that this is where creative people are needed as well.”

To test whether there is merit in Dominic’s hypothesis, some simple prompts were punched into ChatGPT: “Draw an image of a rich woman from Kerala. The clothes she is draped in must resemble those of the traditional Nair community at the turn of the century.”

Except for the background and some elements on her person, there is nothing to suggest that the image generated is of a Nair woman from that period. In much the same way, when the model was asked to generate an image of a woman from Uttar Pradesh (UP), the result was no more convincing.

Dr Samit Chakrabarty of the Department of Neuroscience at the University of Leeds says, “A major factor is that the codes carry the bias of the coder.” Then, he says, there is also a lack of pictorial data for the models to learn from. It is possible, Chakrabarty says, that if the prompts contained more granular detail about the period, these images may have turned out differently. By way of example, what might a woman from Travancore have looked like? Or, what would an image of a woman from UP look like from the time when Mughal influence there was at its height?
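
To see what that kind of granularity looks like in practice, here is a minimal sketch of how a vague brief and a detailed one might be sent to an image-generation service. The openai Python client, the model name and the wording of the prompts are illustrative assumptions, not part of the experiment described above; any comparable image API would do.

    # A minimal sketch, assuming the openai Python client (v1.x) and an
    # image model named "dall-e-3"; both are illustrative choices.
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY env var

    # The kind of vague brief tested above.
    vague = "Draw an image of a rich woman from Kerala."

    # The kind of granular brief Dr Chakrabarty suggests: period, region
    # and dress spelled out rather than left for the model to guess.
    granular = (
        "Portrait of a wealthy Nair woman from Travancore, circa 1900, "
        "draped in a traditional off-white mundum neriyathum with a gold "
        "border, wearing period jewellery."
    )

    for prompt in (vague, granular):
        result = client.images.generate(
            model="dall-e-3",  # assumed model name
            prompt=prompt,
            size="1024x1024",
            n=1,
        )
        print(prompt[:40], "->", result.data[0].url)

The contrast between the two prompts is the point: the second does the creative brief’s work of pinning down period, region and dress before the model ever starts painting.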

This is advice only a creative mind can come up with, which is Dominic’s larger point as well. What it means in the long term is that the creative brief will have to work harder and get more creative too. As for the biases that creep into code, which Dr Chakrabarty pointed to, if they can be reined in, businesses such as advertising and vocations such as journalism stand to be upended, yet again.

This piece was first published by Hindustan Times. All copyrights vest with the newspaper.