With the rapid advancement of artificial intelligence (AI) models over the past year, text-to-image AI has become one of the most popular technologies, grabbing the interest of the world even beyond those working in technology fields. An exciting promise of these models is the ability to create any image someone can imagine from a text description. But as anyone who has spent much time using text-to-image models can attest, you don’t always get what you’re looking for if you’re not using the right words. Even if you are, inherent biases appear in even the simplest image generations.
Vanderbilt Data Science Institute Chief Data Scientist Jesse Spencer-Smith hosted a deep-dive conversation on Feb. 27 about the science behind text-to-image models and how to recognize and mitigate biases in generated images through careful prompt engineering. In his talk, Dr. Spencer-Smith also discussed biases we have recently encountered here at the DSI while generating our own images with AI.
In the examples below, we used the same prompt and changed only the words “men” and “women.” It’s easy to see that the AI-generated image for the female version is more revealing than the one for the men.
To work around this bias, Dr. Spencer-Smith discussed adding phrases like “dressed for office” and “professional” to the prompt. This prompt engineering not only produces a more professional image; by explicitly requesting a professional representation of women, it also steers the model away from those inherent biases.
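To make the idea concrete, here is a minimal sketch of what that kind of prompt engineering can look like in code, using the open-source Hugging Face diffusers library with a Stable Diffusion checkpoint. The model name and prompts below are illustrative assumptions for this example only; they are not the specific tools or prompts used in the talk.

```python
# A minimal sketch of bias-aware prompt engineering with Hugging Face diffusers.
# The checkpoint name and prompts are assumptions for illustration, not from the talk.
import torch
from diffusers import StableDiffusionPipeline

# Load a publicly available Stable Diffusion checkpoint (assumed for this example).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# A bare prompt, which may reproduce biases present in the model's training data.
bare_prompt = "a woman giving a business presentation"

# The same prompt with cues like "professional" and "dressed for office"
# to steer the model toward a workplace-appropriate depiction.
engineered_prompt = (
    "a woman giving a business presentation, professional, dressed for office"
)

# Generate one image per prompt so the two outputs can be compared side by side.
bare_image = pipe(bare_prompt).images[0]
engineered_image = pipe(engineered_prompt).images[0]

bare_image.save("bare_prompt.png")
engineered_image.save("engineered_prompt.png")
```

Running both prompts and comparing the saved images is a simple way to see for yourself how a few added words can shift what the model produces.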
Dr. Spencer-Smith’s talk goes deeper into the biases and problems with text-to-image models. You can view the entire workshop on our YouTube page.