AI will change the way blind people see the world | Wired

To celebrate her 38th birthday, Chela Robles and her family went to One House, her favorite bakery in Benicia, California, for a steak sandwich and brownies. On the car ride home, she tapped a small touchscreen on her temple and asked it to describe the world outside. "A cloudy sky," came the reply through her Google Glass smart glasses.

Robles lost sight in her left eye at age 28, and in her right eye a year later. Blindness, she says, denies you the small details that help people connect with one another, such as facial expressions and cues. Her father tells a lot of dry jokes, for example, so she can't always tell when he is being serious. "If a picture can tell a thousand words, just imagine how many words an expression can tell," she says.

AI in the service of the visually impaired

Robles tried a few services that put her in touch with sighted people for help. But in April she signed up for a trial of Ask Envision, an AI assistant that uses OpenAI's GPT-4, a multimodal model that can take in images and text and respond conversationally. The system is one of several assistive products for blind people that are beginning to integrate language models, promising to give users far more visual detail about the world around them, and far more independence.
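To make the idea concrete, here is a minimal, hypothetical Python sketch of the kind of request an assistant in this vein might make: a camera frame and a question are sent to a GPT-4-class multimodal model through OpenAI's chat API, and a conversational answer comes back. The model name, file name, and prompt are illustrative assumptions, not Envision's actual integration.

    import base64

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Encode a captured camera frame so it can be sent inline with the request.
    with open("camera_frame.jpg", "rb") as f:  # hypothetical file name
        frame_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any GPT-4-class vision model
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Briefly describe what is in front of me."},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"},
                    },
                ],
            }
        ],
        max_tokens=100,
    )

    # The text reply would then be read aloud by the device's speaker.
    print(response.choices[0].message.content)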

Envision launched in 2018 as a smartphone app that reads the text in photos, and on Google Glass in early 2021. Earlier this year, the company began integrating an open-source conversational model that could answer basic questions. Then Envision incorporated OpenAI's GPT-4 for image-to-text descriptions.


Be My Eyes, a 12-year-old app that helps users identify objects around them, added GPT-4 in March. Microsoft, one of OpenAI's major investors, has begun testing GPT-4 integration in its Seeing AI service, which offers similar functionality, according to Sarah Bird, who leads responsible AI work at Microsoft.

In its previous version, Envision read out the text in an image from beginning to end. Now it can summarize the text in a photo and answer follow-up questions. That means Ask Envision can read a menu and answer questions about things like prices, dietary restrictions, and dessert options.
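As a rough sketch of how such a follow-up flow can work under the hood (again hypothetical: the model name, prompts, and file name are assumptions, not Envision's code), the example below keeps the menu photo and the model's summary in the message history, so a second question about prices is answered in the context of the same image.

    import base64

    from openai import OpenAI

    client = OpenAI()

    with open("menu.jpg", "rb") as f:  # hypothetical photo of a menu
        menu_b64 = base64.b64encode(f.read()).decode("utf-8")

    # Turn 1: ask for a summary of the menu rather than a word-for-word readout.
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this menu for me."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{menu_b64}"},
                },
            ],
        }
    ]
    summary = client.chat.completions.create(
        model="gpt-4o", messages=messages, max_tokens=200
    )
    messages.append(
        {"role": "assistant", "content": summary.choices[0].message.content}
    )

    # Turn 2: a follow-up question answered against the same photo,
    # which is resent as part of the accumulated message history.
    messages.append({"role": "user", "content": "Which desserts are under five dollars?"})
    followup = client.chat.completions.create(
        model="gpt-4o", messages=messages, max_tokens=100
    )
    print(followup.choices[0].message.content)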

Richard Beardsley, another early user of Ask Envision, says he typically uses the service to do things like find contact information on a bill or read the ingredient lists on boxes of food. The hands-free option through Google Glass means he can use the service while holding his guide dog's leash and a cane. "Before, you couldn't jump to a specific part of the text," he says. "Having this makes life so much easier, because you can jump to exactly what you're looking for."
