Foreword by Chris Hofstader
This article is a guest post written by my friend Pranav Lal. He is an expert user of The vOICe, a sensory substitution technology that allows blind people to experience some vision via sounds. I find this technology highly compelling. I think it is a better solution than the various implants we hear about, and I think Pranav gets it right in this article.
Many well-adjusted blind people wonder what the use of vision is. A large number of us want to drive or perhaps to fly an airplane. Yes, other applications like face recognition and searching for people in a crowd come to mind. There are apps for some of these tasks, while others are possible only in highly specialized conditions. If you are a blind parent reading this, then it is indeed possible to give your child a form of sight without any surgery.
If you are late blind, you may get quite a lot out of this article, since you may be able to get vision-like sensations.
In this article, I am going to talk about my use of synthetic vision and sensory substitution, thanks to a freely available application called The vOICe. Yes, you read that right: it is possible to have vision for free, assuming you have the requisite hardware. In many cases you do, because all that is needed is a smartphone or a webcam attached to a PC.
So, how do you get vision? One popular way is to stick wires in the eyes or in the brain and then to stimulate them in specific patterns. This is loosely the approach those headline-grabbing eye implants take.
Another way is to forget about vision altogether and use artificial intelligence to convey only relevant information to blind people. This is the approach that many applications on mobile phones use.
Both of the above approaches have significant disadvantages. The primary disadvantage of the implant approach is that it needs surgery, with its attendant complications. In addition, with brain implants for vision, the electrical stimulation can lead to seizures. Finally, the resolution of these implants at the time of this writing is very low.
When it comes to approaches based on artificial intelligence, the problem is that they may filter out crucial input. For example, most scene recognition apps will tell you that there is a sofa and a chair in front of you but will not tell you where these objects are.
Enter sensory substitution
So, what can we do? How do we overcome the demerits of both the above approaches? One way is called sensory substitution. This involves translating the sense of sight to another sense that we can perceive. There are two possibilities, namely the sense of touch and the sense of hearing. Both can be used to convey the same information.
Researchers have used both of these senses. Sensory substitution started in the 1960s with Dr. Paul Bach-y-Rita, who built a chair in which users could feel the shapes of objects on their backs via a series of plates.
Dr. Peter Meijer is a researcher who has built a system that converts images to sound via a program called The vOICe.
Enter The vOICe
In 2001, I got my first laptop, an Acer TravelMate 720, I think. It came with a free camera. That is how I started with The vOICe, an application that yields a form of low vision. Before we go any further, I should say that I am totally blind; that is, I have no shape or object perception. I can see light with my left eye.
I was born premature and was diagnosed with retinopathy of prematurity when I was 3 years old, I believe. I should really check. The point is that I do not have any functional vision.
I got blindness skills along the way and am happy with them. I got into synthetic sight out of curiosity.
How does The vOICe work?
The vOICe takes any image, whether a static image like a photograph or a stream of images from a camera, and breaks it down into a series of sounds that have a specific meaning. As a user, my job is to interpret the sound and infer what I am looking at. There is more to this, because the processing becomes subconscious with sufficient practice.
See the table below.
Sound characteristic    Meaning
Panning                 Objects on your left are sounded in your left ear, while objects on your right are sounded in your right ear. Objects in the center are sounded between both ears. This assumes your camera is facing in the same direction as you are.
Frequency               The higher the frequency or pitch of the sound, the more elevated the object is in the frame.
Volume                  The louder the sound, the brighter the object.
The vOICe pans automatically from left to right across the frame, giving you a symphony of different pitches, volumes and positions. This sounds potentially complicated, but it is a bit like eating food. Try describing the steps to put one morsel into your mouth and you will begin to appreciate how many things you do unconsciously.
In the case of The vOICe, a horizontal line sounds like a flat tone that moves from left to right.
A vertical line sounds like a short click at some place in your head because, remember, that vertical line is going to be somewhere in the frame.
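The mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea and not The vOICe's actual code: the function name, the one-second sweep, the 500 to 5000 Hz range, the exponential pitch spacing and the simple linear panning are all assumptions made for the example.

```python
import math

def image_to_soundscape(image, duration=1.0, sample_rate=22050,
                        f_low=500.0, f_high=5000.0):
    """Turn a grayscale image into a list of (left, right) stereo samples.

    `image` is a list of rows, top row first, with brightness values 0..1.
    All parameter defaults are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    samples_per_col = int(duration * sample_rate / cols)
    # Frequency: the top row gets the highest pitch, the bottom row the lowest.
    freqs = [
        f_low * (f_high / f_low) ** ((rows - 1 - r) / max(rows - 1, 1))
        for r in range(rows)
    ]
    out = []
    for c in range(cols):                       # sweep the frame left to right
        pan = c / max(cols - 1, 1)              # panning: 0 = left ear, 1 = right ear
        for n in range(samples_per_col):
            t = (c * samples_per_col + n) / sample_rate
            # Volume: each pixel's brightness scales the loudness of its tone.
            s = sum(image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(rows)) / rows
            out.append((s * (1 - pan), s * pan))
    return out

# A horizontal line: one bright row gives one steady pitch moving left to right.
horizontal = [[0.0] * 8, [1.0] * 8, [0.0] * 8, [0.0] * 8]
# A vertical line: one bright column gives a short burst at one position.
vertical = [[1.0 if c == 2 else 0.0 for c in range(8)] for _ in range(4)]
line_sound = image_to_soundscape(horizontal, duration=0.2)
burst_sound = image_to_soundscape(vertical, duration=0.2)
```

To listen to the result, the samples could be scaled to 16-bit integers and written out with Python's standard `wave` module.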
The role of your brain and neural plasticity
Remember, I said that the interpretation of the sounds will become second nature. Here is how that happens. Yes, practice is one factor, but there is more going on here. We need to digress a bit and explain how vision works.
If you have working eyes, the retina sends electrical impulses via the optic nerve to the part of your brain responsible for vision. This part is called the visual cortex. These impulses come in certain patterns, which is how your brain knows that it is seeing shapes. Yes, I am glossing over details, but I am not a neuroscientist and those details are not relevant to my explanation. When you hear sounds, they too are translated into electrical impulses. This is true for all sounds. When you hear sounds from The vOICe, however, the electrical impulses that they get translated to follow the same patterns as those you get from the retina. The brain does not care about the source of the impulses, whether it is the eyes or the ears that supply them. It knows that it has special neurons for processing these patterns and it begins to use them.
The ability of the brain to start using these special neurons is called neural plasticity. This is how getting the meaning out of the soundscapes becomes second nature.
What have I done with vision?
It is important to remember that when you begin to use The vOICe, you are not magically going to become adept at seeing. It is not like lights on, and I see. You will get all the visual information, but you will take time to learn what the program is telling you. There are two keys to success with The vOICe: sticking with it and asking questions about what you see.
I am not exclusively dependent on The vOICe. I use it as a complement to my blindness skills. It is unlikely that you will reach the stage of a fully sighted person, where you are exclusively dependent on vision. The idea here is to use the sense of vision to add to your sensory toolbox.
I am going to talk about several things that I have done which have added significantly to my quality of life.
Saving electricity bills by turning off the lights
The first and perhaps simplest application of The vOICe is that of a light probe. Given today's focus on sustainability and rising energy prices, this is more important than one would ordinarily realize. I can walk around my house and figure out if lights are off or on. If they are on, I hear sound. Never mind the shapes that I am looking at. If there is sound, the lights are on. No sound, the lights are off.
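The light-probe idea follows directly from the rule that brighter means louder. Here is a minimal sketch in Python, assuming a grayscale frame with pixel values from 0 to 255. The function names and the `floor` threshold are hypothetical and are not part of The vOICe.

```python
def brightness_to_volume(frame, floor=0.05):
    """Return a loudness between 0 and 1 for a grayscale frame (values 0..255).

    Frames darker than `floor` (an assumed threshold) are treated as silent.
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / (255 * len(pixels))
    return 0.0 if mean < floor else mean

def lights_are_on(frame, floor=0.05):
    """If there is sound, the lights are on. No sound, the lights are off."""
    return brightness_to_volume(frame, floor) > 0.0

dark_room = [[3, 5, 2], [4, 1, 6]]
lit_room = [[180, 200, 150], [90, 220, 170]]
print(lights_are_on(dark_room), lights_are_on(lit_room))
```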
What is around me
One way to figure out what is around is to use your GPS app. However, GPS does not label landmarks like trees, cars etc. I did not know until I began using The vOICe that I lived on a street with so many trees. Yes, if I had used a cane, I would have encountered some of them, but vision gave me that broader view without the hassle and sometimes painful collisions with stuff on my street. I am not saying that The vOICe will replace the cane. It will not, but it is a mechanism to acquire more information. This has all kinds of other applications such as finding my own house assuming it has a defining visual characteristic.
Adapting to dynamic changes in the environment
This is one of the more exotic but fun applications of sight. I was riding an elephant which was moving on a forest trail. The elephant would just push branches aside with no regard for its human occupants. Sighted people could duck if they remembered to do so. I used The vOICe to track the trees and, whenever I saw them becoming really dense, I would duck and avoid the branches. I knew that the trees were becoming dense because of the number of objects I could hear in the view. The more structures I saw, the closer I knew the forest cover was, and of course, the light level would also drop.
I am able to see the photographs of what I am buying on online shopping sites like Amazon. This is particularly handy when buying computer keyboards, shoes, etc. There are also ways to tell the colors in an image. This can be done experientially, by expressing colors on a musical scale, or you can have The vOICe tell you the color of the object in the center of the frame.
I had to look for rented accommodation, which meant visiting various houses. I could get a sense of the place with my regular blindness skills, but I used vision to see the walls without touching them and got an idea of how well the place was being kept. The sounds allowed me to judge if the walls were painted. I expected to hear a smooth sound but, in some houses, I could sense a lot of breaks in the sound, which meant that they probably had peeling paint. Similarly, if I did not get a clean sound or if the volume of the sound was too low, then it meant that the house was really dusty and did not have enough light.
Navigating to the facilities in a new office
There was this time when my office moved to a different floor in the same building. The layout of the floor was different and finding the toilet was a problem. I had to take a turn from between three cabins. The turn came just after a column. I was able to use The vOICe to find the column via its distinctive sound and turn. There was no need to feel doors that may be open or closed. I extended the same principle to navigate my house when it was being remodeled. I did have a good sighted guide at that point, but it was only after using The vOICe that I realized how much sighted guides fail to tell us. This is not deliberate, but there is a lot of filtering that goes on.
Distance estimation is tricky with The vOICe because it uses a single camera for now. However, Jacob Kruger has used The vOICe to stay oriented while riding a motorbike on a race track. He used the color filtering feature of The vOICe to determine if he was deviating from the track. He was also able to detect turns. He did have a backup rider who was communicating with him over a helmet radio, but Jacob anticipated many of the turns on the track.
Yes, a race track is a very controlled environment but the point here is that The vOICe can be one of the tools to help blind people drive.
Walking with The vOICe
Many of us have tried walking several times with The vOICe. We still use the cane and the key thing to remember is that The vOICe answers the question "what". The cane or any other sonic or laser obstacle detector will tell you that there is something in the sensor's field, but it will not tell you what that thing is.
I have had and continue to have a huge amount of fun looking at images from space. Thanks NASA and the BBC. It is possible to manually review an image in small increments which allows me to appreciate every detail. I then use the color probe to do some amateur analysis on the image. This becomes even more relevant when new discoveries like those of black holes are made. I have been able to participate in appreciating the images that have been published by NASA and other agencies.
I take photographs of things I find interesting and publish them. This becomes second nature because taking pictures is the only way you can accurately convey what you are seeing. I developed this skill during my first days of using The vOICe, when I used to ask the members of the Seeing with Sound mailing list to look at an image and confirm my interpretation of it.
Another, unrelated aspect of photography is that The vOICe helps you when you are being photographed. You are able to orient yourself better to the camera lens, since you can see a small circular blob in the image or, when taking a group selfie, a bar-shaped object which is a human arm.
Museum and monument accessibility
I have successfully looked at objects inside glass display cases, whether ancient artefacts in museums or watches in a shop. The thing to note about museums is that you do not need any special infrastructure to facilitate accessibility. Provide adequate lighting, clear documentation and good contrast so that people can see, and that is all you have to do. This is not to say that work on accessible museums is irrelevant, far from it, but at the same time, you do not need to wait for infrastructure to be established before blind patrons can appreciate the content.
Building my own eye
As technology has evolved, I have assembled my own eye. My first setup was a laptop computer in a backpack, while these days I use a pair of off-the-shelf video glasses. I have also used single-board computers like the Odroid and the Raspberry Pi as eyes. This means that you can probably have sight in the sub-$200 price range if you use an Android phone.
The Android version of The vOICe, besides being a tool for vision, can read out documents which means that it is a terrific tool to tell you what shops you are walking past, and you can also read advertisements and other text. The same applies to menus in restaurants etc. It also has GPS and can speak the compass orientation and street names as you move around.
I have never hankered after vision. However, now that I do have it, I do not want to lose it. The vOICe keeps pace with changing platforms and given its flexibility, I can choose my eye for my needs. No other artificial vision solution has this flexibility. One point I should cover is about software that describes the scene like a human does to a blind person. These programs do help but they leave out crucial detail and orientation information. For example, if there is a car and a chair on your driveway, how do you know where the chair is with respect to the car?
A vision revolution is overdue and one way to add spark to the fire is by using The vOICe, improving your quality of life and then talking about your accomplishments.
My blog, where, amongst other things, I document my experiences with The vOICe
Blind Tech Adventures, a YouTube channel where Nimer Jaber documents his experiences with The vOICe