How does AI perceive people that are blind or have low vision?
Have you ever wondered how AI, particularly Midjourney, perceives and represents people who are blind or have low vision?
Let’s delve into this topic and explore how Midjourney, a generative AI tool that can convert natural language prompts into images, can assist in creating photos of this specific group.
Before we dive in, allow us to provide some brief context. As a small business, we face various challenges, one of which is obtaining a diverse range of photos showing the Envision Glasses being used by our target group. Due to time constraints, limited resources, and a lack of specialized equipment like high-quality cameras and proper lighting screens, we often find ourselves reusing the same photos from a campaign we ran over a year ago.
That’s where the design team at Envision decided to explore the use of Midjourney. We wanted to investigate if this AI tool could help us generate more varied content. However, we won’t keep you in suspense until the end of this article. The current state of AI, particularly Midjourney, doesn’t allow us to create consistent and realistic depictions of our product in use (yet!).
But let’s take a step back for a moment. Suppose we want to share something relevant for Global Accessibility Awareness Day and include an image of a person who is blind or has low vision to provide additional context in a social media post. Can we, at least, achieve that?
In this article, we’ll showcase how AI interprets people who are blind or have low vision. The objective here is to create images that accurately represent them.
We do want to emphasize that many people who are blind or have low vision don’t necessarily rely on white canes or guide dogs, or wear sunglasses at all. However, for certain social media posts, we need to depict a person using a white cane or accompanied by a guide dog, as this helps sighted individuals unfamiliar with the industry understand that the person is blind or has low vision and can benefit from our product, whether that is our (free!) Envision App or the Envision Glasses.
We will also share the prompts we used and the corresponding image outputs from Midjourney. Keep in mind that Midjourney always returns four different images for each prompt.
Prompt 1: A person that is blind
Now, let’s shift our attention to the output generated by Midjourney. In this particular case, the images portray three individuals with blindfolds or cloth covering their eyes, along with one person sporting sunglasses. Interestingly, all the photos appear slightly dark, and they predominantly feature men.
Prompt 2: A blind person
For Prompt 2, the results show four photos, all featuring men wearing glasses. One is standing in front of a white cloth, another has a guide dog, and yet another appears to be wearing a robe or djellaba. However, none of these images meet the requirements we need for social media — they simply aren’t usable.
Prompt 3: A blind person holding a white cane
Prompt 3 specifies a blind person holding a white cane. Surprisingly, the resulting photos only show individuals in white dresses holding canes, and they don’t even include faces. Once again, these images are far from what we’re looking for.
Prompt 4: A blind person holding a white cane and a guide dog
By giving it more context, we already get somewhat better results. All four photos show a person wearing sunglasses, accompanied by a dog and holding some kind of cane. Still, none of them accurately depicts a white cane or gives any hint that the dog is a guide dog. We’re not quite there yet.
Let’s give it even more context.
Prompt 5: full body photo of a blind man who has a white cane in his right hand and a guide dog on his left hand.
With the extra detail in Prompt 5, we start to see some improvements. Among the four resulting photos, one actually shows a person with a dog and a somewhat recognizable white cane. The others, unfortunately, depict the cane as more of a wooden stick or even interpret it as the dog’s leash.
Prompt 6: a full body photo of a blind woman on the street with a guide dog
The results for Prompt 6, which requests a full body photo of a blind woman on the street with a guide dog, don’t effectively differentiate between a regular dog and a guide dog. They fail to convey that the person is blind or has low vision and is accompanied by a guide dog. It is fascinating to observe that AI often portrays people who are blind as wearing sunglasses.
Prompt 7: a full body photo of a blind woman on the street holding a white cane
Moving on to Prompt 7, we encounter another setback. Although the four resulting images show women holding canes, the canes don’t resemble regular white canes at all. Furthermore, the AI seems unsure whether it should be a single cane or multiple ones resembling crutches.
Prompt 8: a full body photo of a blind woman on the street holding one white cane in her right hand
Prompt 8 finally presents some improvement. The images show a woman with sunglasses holding a cane, and in two of them, the cane is white and relatively suitable for social media use.
Although white canes come in various colors and sizes, the most common ones typically feature a ball tip or a red section, are relatively thin, and are often foldable. Surprisingly, none of the AI-generated photos so far has depicted these characteristic features.
Prompt 9: a full body photo of a blind woman walking on the street while holding 1 white cane on her right hand
The results of the previous prompt felt somewhat static, so Prompt 9 attempts to add more movement by requesting a full body photo of a blind woman walking on the street while holding one white cane in her right hand. The results still fall short in terms of accurately depicting a white cane. However, the image in the bottom left does have a nice feel to it. Interestingly, the AI tends to dress the people in white attire. Perhaps it’s time to switch things up.
Prompt 10: a full body photo of a happy blind woman with colourful clothes on the street while holding 1 white cane in her right hand
For Prompt 10, the results once again feature four women, this time in vibrant clothing. However, the white cane still doesn’t resemble a real white cane in any of the images.
...and we could continue generating and experimenting with the given prompts, but the overall conclusion remains that Midjourney is not reliable when it comes to creating highly specific scenarios.
So far, we have not been able to generate a photo of a blind or low-vision person holding a white cane and accompanied by a guide dog that would convincingly pass as a real photo.
This also suggests that agencies and photographers need not fear losing their jobs just yet. Instead, we envision a future where AI and human creativity coexist harmoniously. Campaigns can still be captured with real users and genuine products, while afterward, some enhancements and combinations using AI could be made to preserve certain details.
So we'll wait for artificial intelligence to evolve. Until then, all the photos you see on our website show real people we know, who we know use our product. And if you wonder... this article was written by a human too.