A Review of the vOICe

Summary: The vOICe, whose name spells out “OIC” (“Oh, I see”), is a programme intended to help individuals with varying degrees of blindness to see by associating black-and-white images with soundscapes. The programme was developed by Dr Peter Meijer, of the Netherlands, and is hosted at the SeeingWithSound.com website. At present, the vOICe runs only on Windows and Android; versions for other platforms have not yet been developed.
Description: The vOICe requires a few pieces of equipment to be used successfully. A portable computer can be configured to keep running while the lid is closed, and the user connects head-mounted camera glasses to the computer with a suitably long USB cable before starting the application. The programme begins by testing the headphones, speaking “left” and “right” in the corresponding ears. The application then scans the image in front of the camera from left to right and converts it into a unique set of computer-generated sounds, consisting mainly of tones and noise of varying texture. Higher frequencies correspond to the upper portion of the camera’s view, and lower frequencies to the lower portion. The brighter an object appears to the camera, the louder its sound. As an object moves closer to the camera it fills more of the frame, and the background sounds gradually diminish until the object dominates the view; this proportional relationship is one that many people misunderstand. Each soundscape is generated from the video in about one second, using sixty-four pixels of vertical resolution mapped across a predetermined range of audible frequencies. For a reader who has been blind since birth, the notion of a pixel may be unfamiliar: in brief, a pixel is one small square in the grid that makes up an image, which is why some sighted people can faintly see the grid lines when staring at certain screens.
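To make that mapping concrete, the following sketch (a minimal illustration in Python with NumPy, not the vOICe’s actual implementation; the frequency range and timing are assumed values) converts a 64-row greyscale image into audio by scanning its columns left to right, with row position setting pitch and brightness setting loudness:

    import numpy as np

    def image_to_soundscape(image, duration=1.0, sample_rate=44100,
                            f_low=500.0, f_high=5000.0):
        """Toy left-to-right image-to-sound conversion (not the vOICe's own code).

        image: 2-D array of shape (64, columns), values 0.0 (dark) to 1.0 (bright).
        Row 0 is the top of the image and receives the highest frequency;
        each pixel's brightness sets the loudness of its row's tone.
        """
        rows, cols = image.shape
        samples_per_col = int(duration * sample_rate / cols)
        # Exponentially spaced frequencies, highest at the top row.
        freqs = f_high * (f_low / f_high) ** (np.arange(rows) / (rows - 1))
        phase = np.zeros(rows)
        audio = []
        for c in range(cols):
            t = np.arange(samples_per_col) / sample_rate
            # One sine per row, weighted by that pixel's brightness, then summed.
            column = image[:, c][:, None] * np.sin(
                2 * np.pi * freqs[:, None] * t + phase[:, None])
            audio.append(column.sum(axis=0) / rows)
            # Carry phase forward so tones stay continuous across columns.
            phase = (phase + 2 * np.pi * freqs * samples_per_col / sample_rate) % (2 * np.pi)
        return np.concatenate(audio)

    # A bright diagonal line from bottom-left to top-right becomes a rising sweep.
    img = np.zeros((64, 64))
    img[np.arange(63, -1, -1), np.arange(64)] = 1.0
    sound = image_to_soundscape(img)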
The vOICe can also serve as a way of learning to use the mouse. Using the mouse is like moving a probe across a large square grid: you learn to associate the sound under the cursor with the items on that grid. Then, as you move the mouse, the sound moves with it, in accordance with wave field synthesis (holophony). When you find a major area of interest, you click to zoom into the smaller area of the grid around it, gaining closer access to that region; from there you can zoom in on yet another smaller detail within it. It is like opening a huge box, finding the box you want inside it, opening that one and hunting for the next, and so on, each box smaller than the last.
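As a rough illustration of that nested zooming (a hypothetical sketch, not the vOICe’s own interface code; the region format and zoom factor are assumptions), each click can be modelled as replacing the current view with a smaller region centred on the cursor:

    # Hypothetical sketch of click-to-zoom region selection (not the vOICe's code).

    def zoom_region(region, click_x, click_y, factor=2.0):
        """Return a new view 'factor' times smaller, centred on the click.

        region: (left, top, width, height) of the current view, in image pixels.
        click_x, click_y: click position as fractions of the view, 0.0 to 1.0.
        """
        left, top, width, height = region
        new_w, new_h = width / factor, height / factor
        # Centre the new region on the clicked point, clamped inside the old one.
        new_left = min(max(left + click_x * width - new_w / 2, left), left + width - new_w)
        new_top = min(max(top + click_y * height - new_h / 2, top), top + height - new_h)
        return (new_left, new_top, new_w, new_h)

    # Each click narrows the view: the "box within a box" described above.
    view = (0, 0, 640, 480)
    view = zoom_region(view, 0.75, 0.25)   # zoom toward the upper right
    view = zoom_region(view, 0.50, 0.50)   # zoom again on the centre of that region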
Along with the vOICe, researchers around the world have developed other sensory substitution devices that serve the same or a similar purpose. One example is the Hebrew University’s EyeMusic image converter, which uses a C-major pentatonic scale. Researchers have also gathered at various meetings to discuss how people with disabilities could benefit from technology adapted to their needs. Recently, a study published in the journal Cognitive Science showed how the vOICe could help those who want to see the world using their ears.
Purpose: With ongoing work on regrowing nerves and other tissue in the eye, it is hoped that sensory substitution devices can be used in visual rehabilitation, training the brain to see either for the first time or after a prolonged absence of vision. There is also talk of devices that could transmit images from one brain to another using light-sensitive proteins, an approach known as optogenetics. In this way, blind people would have all the colours available in their minds, and it would be up to them to form their own mental images once this new sensation is given to them.
Disadvantages and negativities: At present, the vOICe, like any form of autonomous sensory substitution, cannot be relied upon for high-risk activities because of the danger it could pose to the wearer. This includes travelling in unfamiliar environments, crossing streets, and so on. Each form of sensory substitution has its own strengths and weaknesses, so it is important that each person find the one that best fits their needs. Current devices are designed to be easily customised by the end user, though the number of options can become overwhelming. The goal, after all, is to simulate a form of natural eyesight, which cannot be customised apart from wearing special glasses and the like. It is worth researching Aira and Be My Eyes as possible alternatives, though, again, they are not a substitute for safety.
Comments: I have been using the vOICe off and on, but I find it difficult to grasp the whole concept at once. Through experimentation and research, I found that the vOICe focuses primarily on associating visual images with audible soundscapes, rather than mapping colours to sounds as some other devices do. Those other devices, in turn, may convey only the colour of an object and not its shape.
Another problem is that shades of grey cannot be distinguished by pitch, since pitch is already used to convey the height of an object. I have found that many people are familiar with associating colours with musical keys, and it is those keys that give us specific emotions which can be correlated with colours. Another possibility is to use certain cadences that resolve to a particular key, or to let combinations of three or more colours create new harmonic content that conveys an emotion, as with pastel colours.
Suggestions and improvements: Most of these have already been considered and are being worked on, but they are still worth addressing here. As of now, the most effective way to receive the sound is through stereo headphones. One suggestion is to use a more sophisticated system based on wave-field synthesis to enhance the listener’s spatial awareness. Instead of having pitch tell you how high an object is, the sound itself would come from the upper portion of the loudspeaker array, and vice versa. That would free pitch for other uses, such as identifying colours, shades, and tints.
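As a rough sketch of that idea (a simplification for illustration: plain amplitude panning across a hypothetical vertical column of loudspeakers stands in for true wave-field synthesis, and the speaker count and frequency range are assumed values), a pixel’s row could decide which speakers carry its sound while its hue sets the pitch:

    import numpy as np

    NUM_SPEAKERS = 8               # hypothetical vertical loudspeaker column (index 7 = topmost)
    F_LOW, F_HIGH = 300.0, 3000.0  # illustrative pitch range, now reserved for hue

    def pixel_to_output(row, num_rows, hue, brightness,
                        duration=0.05, sample_rate=44100):
        """Return (per-speaker gains, mono tone) for one pixel."""
        # Elevation: map the pixel row onto a position along the speaker column
        # (row 0 = top of the image = top speaker) and pan between the two nearest.
        pos = (1.0 - row / (num_rows - 1)) * (NUM_SPEAKERS - 1)
        lower, frac = int(pos), pos - int(pos)
        gains = np.zeros(NUM_SPEAKERS)
        gains[lower] = 1.0 - frac
        if lower + 1 < NUM_SPEAKERS:
            gains[lower + 1] = frac
        # Pitch now encodes hue instead of height; brightness still sets loudness.
        freq = F_LOW + hue * (F_HIGH - F_LOW)
        t = np.arange(int(duration * sample_rate)) / sample_rate
        tone = brightness * np.sin(2 * np.pi * freq * t)
        return gains, tone

    # A bright pixel near the top of the image: most of its sound comes from
    # the upper speakers, while its pitch conveys the hue.
    gains, tone = pixel_to_output(row=10, num_rows=64, hue=0.6, brightness=0.8)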
A discussion regarding stereoscopic vision raised the possibility of wearing two cameras, one above each eye, that would move in accordance with the person’s eye movements (provided they do not have nystagmus, amblyopia, or another neurological disorder). This could be combined with the surround-sound system, so that the left channel depends on the left camera while the right channel depends on the right one.
The ability to focus on and track an object with the camera would be helpful for knowing when a particular object is moving. This would mean stopping the scanning process and having the programme generate a soundscape for that image in real time. When the object shifts to one side, the sound would move in the direction it went; this can be rendered with head-related transfer functions (HRTF), as found in modern augmented reality systems, though the concept is primarily psychoacoustic. The purpose of focusing on the object is to hear its soundscape while moving the camera to study the details of each side. For example, suppose you want to study a painting. Using the vOICe, you can stop the scanning and concentrate on the image. Then, in a probe-like fashion, you would manually scan the painting with the camera(s) and hear the soundscapes change in volume, pitch, timbre, or texture.
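As a simplified stand-in for HRTF rendering (a sketch only: constant-power stereo panning, without the direction-dependent filtering and interaural time differences a real HRTF would apply; the tone and positions are arbitrary), the tracked object’s horizontal position in the frame can steer its sound between the ears:

    import numpy as np

    def pan_tracked_object(tone, x_position):
        """Pan a mono tone by the object's horizontal position (0.0 = left, 1.0 = right)."""
        angle = x_position * (np.pi / 2)        # 0 = hard left, pi/2 = hard right
        left = np.cos(angle) * tone
        right = np.sin(angle) * tone
        return np.stack([left, right], axis=1)  # stereo samples, shape (n, 2)

    sample_rate = 44100
    t = np.arange(sample_rate // 10) / sample_rate
    tone = 0.5 * np.sin(2 * np.pi * 880.0 * t)          # sound assigned to the tracked object
    stereo = pan_tracked_object(tone, x_position=0.8)   # object near the right edge of the frame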
Brainwave entrainment could also be added to synchronise the frequencies of a person’s brainwaves, so that they can relax and concentrate more easily. Such entrainment aims at meditative states of consciousness, akin to twilight or dawn, sometimes discussed in terms of ‘qualia’. These states can be reached not only through visual brainwave synchronisation but also through auditory stimulation, including beats (binaural or monaural) and isochronic pulses with added harmonics. Such stimulation can also use chords not found in conventional music, and therefore has no organised time or rhythm.
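As a concrete example of the auditory side (a minimal sketch; the 200 Hz carrier and the 10 Hz beat and pulse rates are illustrative values only), a binaural beat is produced by playing slightly different frequencies in each ear, while an isochronic pulse gates a single tone on and off:

    import numpy as np

    SAMPLE_RATE = 44100

    def binaural_beat(carrier=200.0, beat=10.0, duration=5.0):
        """Slightly different tones in each ear; the difference is perceived as a beat."""
        t = np.arange(int(duration * SAMPLE_RATE)) / SAMPLE_RATE
        left = np.sin(2 * np.pi * carrier * t)
        right = np.sin(2 * np.pi * (carrier + beat) * t)
        return np.stack([left, right], axis=1)           # stereo signal

    def isochronic_pulse(carrier=200.0, pulse_rate=10.0, duration=5.0):
        """A single tone switched on and off at the pulse rate (identical in both ears)."""
        t = np.arange(int(duration * SAMPLE_RATE)) / SAMPLE_RATE
        gate = (np.sin(2 * np.pi * pulse_rate * t) > 0).astype(float)
        return np.sin(2 * np.pi * carrier * t) * gate

    beat = binaural_beat()        # 200 Hz left, 210 Hz right -> perceived 10 Hz beat
    pulse = isochronic_pulse()    # 200 Hz tone pulsed on and off at 10 Hz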
Another possibility is to fold the colour functions into brainwave entrainment: the ordinary soundscapes would convey the black-and-white scene surrounding the person, while the entrainment soundscapes fill in the colours.
Current research has not delved much into this latter area, but it has also been reported that people with musical training perform better with auditory and visual stimulation.
To increase end-user interest in the vOICe, it is recommended that more platforms be supported and that further consideration be given to the recommendations above.