Reading the mind with machines
Researchers are developing brain-computer interfaces that would enable communication for people with locked-in syndrome and other conditions that render them unable to speak
In Alexandre Dumas’s classic novel The Count of Monte Cristo, a character named Monsieur Noirtier de Villefort suffers a terrible stroke that leaves him paralyzed. Though he remains awake and aware, he is no longer able to move or speak, relying on his granddaughter Valentine to recite the alphabet and flip through a dictionary to find the letters and words he requires. With this rudimentary form of communication, the determined old man manages to save Valentine from being poisoned by her stepmother and thwart his son’s attempts to marry her off against her will.
Dumas’s portrayal of this catastrophic condition — where, as he puts it, “the soul is trapped in a body that no longer obeys its commands” — is one of the earliest descriptions of locked-in syndrome. This form of profound paralysis occurs when the brain stem is damaged, usually because of a stroke but also as the result of tumors, traumatic brain injury, snakebite, substance abuse, infection or neurodegenerative diseases like amyotrophic lateral sclerosis (ALS).
The condition is thought to be rare, though just how rare is hard to say. Many locked-in patients can communicate through purposeful eye movements and blinking, but others can become completely immobile, losing their ability even to move their eyeballs or eyelids, rendering the command “blink twice if you understand me” moot. As a result, patients spend an average of 79 days imprisoned in a motionless body, conscious but unable to communicate, before they are properly diagnosed.
The advent of brain-machine interfaces has fostered hopes of restoring communication to people in this locked-in state, enabling them to reconnect with the outside world. These technologies typically use an implanted device to record the brain waves associated with speech and then use computer algorithms to translate the intended messages. The most exciting advances require no blinking, eye tracking or attempted vocalizations, but instead capture and convey the letters or words a person says silently in their head.
“I feel like this technology really has the potential to help the people who have lost the most, people who are really locked down and cannot communicate at all anymore,” says Sarah Wandelt, a graduate student in computation and neural systems at Caltech in Pasadena. Recent studies by Wandelt and others have provided the first evidence that brain-machine interfaces can decode internal speech. These approaches, while promising, are often invasive, laborious and expensive, and experts agree they will require considerably more development before they can give locked-in patients a voice.
Engaging the brain — but where?
The first step of building a brain-machine interface is deciding which part of the brain to tap. Back when Dumas was young, many believed the contours of a person’s skull provided an atlas for understanding the inner workings of the mind. Colorful phrenology charts — with tracts blocked off for human faculties like benevolence, appetite and language — can still be found in antiquated medical texts and the home decor sections of department stores. “We, of course, know that’s nonsense now,” says David Bjånes, a neuroscientist and postdoctoral researcher at Caltech. In fact, it’s now clear that our faculties and functions emerge from a web of interactions among various brain areas, with each area acting as a node in the neural network. This complexity presents both a challenge and an opportunity: With no one brain region yet found that’s responsible for internal language, a number of different regions could be viable targets.
For example, Wandelt, Bjånes and their colleagues found that a part of the parietal lobe called the supramarginal gyrus (SMG), which is typically associated with grasping objects, is also strongly activated during speech. They made the surprising discovery while observing a tetraplegic study participant who has had a microelectrode array — a device smaller than the head of a push pin covered in scads of scaled-down metal spikes — implanted in his SMG. The array can record the firing of individual neurons and transmit the data through a tangle of wires to a computer to process them.
Bjånes likens the setup of their brain-machine interface to a football game. Imagine that your brain is the football stadium, and each of the neurons is a person in that stadium. The electrodes are the microphones you lower into the stadium to listen in. “We hope that we place those near the coach, or maybe an announcer, or near some person in the audience that really knows what’s going on,” he explains. “And then we’re trying to understand what’s happening on the field. When we hear a roar of the crowd, is that a touchdown? Was that a pass play? Was that the quarterback getting sacked? We’re trying to understand the rules of the game, and the more information we can get, the better our device will be.”
In the brain, the implanted devices sit in the extracellular space between neurons, where they monitor the electrochemical signals that move across synapses every time a neuron fires. If the implant picks up on the relevant neurons, the signals that the electrodes record look like audio files, reflecting a different pattern of peaks and valleys for different actions or intentions.
The Caltech team trained their brain-machine interface to recognize the brain patterns produced when a tetraplegic study participant internally “spoke” six words (battlefield, cowboy, python, spoon, swimming, telephone) and two pseudowords (nifzig, bindip). They found that after only 15 minutes of training, and by using a relatively simple decoding algorithm, the device could identify the words with over 90 percent accuracy.
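To give a feel for what a "relatively simple decoding algorithm" can look like, here is a minimal sketch of a nearest-centroid classifier. It is not the Caltech team's method, which the article does not specify. The data are synthetic stand-ins for neural firing rates, and the channel count, noise level and function names are all assumptions chosen for illustration.

```python
import random

# The eight items from the study: six words and two pseudowords.
WORDS = ["battlefield", "cowboy", "python", "spoon",
         "swimming", "telephone", "nifzig", "bindip"]

def make_trial(template, noise, rng):
    """Simulate one trial: a word's typical firing pattern plus Gaussian noise."""
    return [rate + rng.gauss(0.0, noise) for rate in template]

def train_centroids(trials_by_word):
    """Average the training trials of each word, channel by channel."""
    centroids = {}
    for word, trials in trials_by_word.items():
        centroids[word] = [sum(col) / len(trials) for col in zip(*trials)]
    return centroids

def decode(trial, centroids):
    """Return the word whose average pattern is closest to this trial."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda word: sq_dist(trial, centroids[word]))

def run_demo(n_channels=32, n_train=8, n_test=20, noise=1.0, seed=0):
    """Train on synthetic data (all parameters hypothetical); return accuracy."""
    rng = random.Random(seed)
    # Assumption: each word evokes its own fixed pattern across channels.
    templates = {w: [rng.uniform(0.0, 10.0) for _ in range(n_channels)]
                 for w in WORDS}
    train = {w: [make_trial(t, noise, rng) for _ in range(n_train)]
             for w, t in templates.items()}
    centroids = train_centroids(train)
    correct = total = 0
    for word, template in templates.items():
        for _ in range(n_test):
            correct += decode(make_trial(template, noise, rng), centroids) == word
            total += 1
    return correct / total

if __name__ == "__main__":
    print(f"accuracy on synthetic trials: {run_demo():.2f}")
```

Real recordings are far noisier and drift over time, so accuracy on this toy data says nothing about how the actual device performs. The sketch only shows the shape of the problem: learn one pattern per word, then match new activity to the nearest pattern.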
Wandelt presented the study, which is not yet published in a peer-reviewed scientific journal, at the 2022 Society for Neuroscience conference in San Diego. She thinks the findings signify an important proof of concept, though the vocabulary would need to be expanded before a locked-in patient could foil an evil stepmother or procure a glass of water. “Obviously, the words we chose were not the most informative ones, but if you replace them with yes, no, certain words that are really informative, that would be helpful,” Wandelt said at the meeting.
Thoughts into letters into words
Another approach circumvents the need to build up a big vocabulary by designing a brain-machine interface that recognizes letters instead of words. By trying to mouth out the words that code for each letter of the Roman alphabet, a paralyzed patient could spell out any word that popped into their head, stringing those words together to communicate in full sentences.
“Spelling things out loud with speech is something that we do pretty commonly, like when you’re on the phone with a customer service rep,” says Sean Metzger, a graduate student in bioengineering at the University of California, San Francisco and the University of California, Berkeley. Just like static on a phone line, brain signals can be noisy. Using NATO code words — like Alpha for A, Bravo for B and Charlie for C — makes it easier to discern what someone is saying.
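The benefit of code words can be illustrated with a toy example that works on text rather than brain signals. Because the code words are long and distinct, a garbled one can often still be matched to the right letter, whereas a garbled single letter cannot. The sketch below uses Python's standard difflib module. The function name, the similarity cutoff and the simplified spellings (alpha, juliet, xray) are assumptions, and this is not the decoder used in the study.

```python
import difflib

# NATO-style code words, spelled as in the article ("Alpha" rather than "Alfa").
CODE_WORDS = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliet", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu",
]

def spell(tokens, cutoff=0.6):
    """Turn a sequence of possibly garbled code words into letters.

    Each token is matched to the most similar code word. Tokens that
    resemble no code word closely enough become "?".
    """
    letters = []
    for token in tokens:
        match = difflib.get_close_matches(token.lower(), CODE_WORDS,
                                          n=1, cutoff=cutoff)
        letters.append(match[0][0].upper() if match else "?")
    return "".join(letters)

if __name__ == "__main__":
    # "brvo" is a deliberately corrupted "bravo".
    print(spell(["brvo", "alpha", "tango"]))
```

A real system would compare patterns of neural activity, not strings, but the principle is the same: the more distinct the candidates, the more noise the decoder can tolerate.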
Metzger and his colleagues tested this idea in a participant who was unable to move or speak as the result of a stroke. The study participant had a larger array of electrodes — about the size of a credit card — implanted over a broad swath of his motor cortex. Rather than eavesdropping on individual neurons, this array records the synchronized activity of tens of thousands of neurons, like hearing an entire section in a football stadium groan or cheer at the same time.
Using this technology, the researchers recorded hours of data and fed it into sophisticated machine learning algorithms. They were able to decode 92 percent of the study subject’s silently spelled-out sentences — such as “That is all right” or “What time is it?” — on at least one of two tries. A next step, Metzger says, could be combining this spelling-based approach with a words-based approach they developed previously to enable users to communicate more quickly and with less effort.
‘Still in the early stage’
Today, close to 40 people worldwide have been implanted with microelectrode arrays, with more coming online. Many of these volunteers — people paralyzed by strokes, spinal cord injuries or ALS — spend hours hooked up to computers helping researchers develop new brain-machine interfaces to allow others, one day, to regain functions they have lost. Jun Wang, a computer and speech scientist at the University of Texas at Austin, says he is excited about recent progress in creating devices to restore speech, but cautions there is a long way to go before practical application. “At this moment, the whole field is still in the early stage.”
Wang and other experts would like to see upgrades to hardware and software that make the devices less cumbersome, more accurate and faster. For example, the device pioneered by the UCSF lab worked at a pace of about seven words per minute, whereas natural speech moves at about 150 words a minute. And even if the technology evolves to mimic human speech, it is unclear whether approaches developed in patients with some ability to move or speak will work in those who are completely locked in. “My intuition is it would scale, but I can’t say that for sure,” says Metzger. “We would have to verify that.”
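To put the speed gap described above in concrete terms, the short calculation below compares the two rates quoted in the article. The 30-word message length is an arbitrary assumption chosen for illustration.

```python
# Rates quoted in the article, in words per minute.
DEVICE_WPM = 7
NATURAL_WPM = 150

def minutes_to_convey(n_words, words_per_minute):
    """Time in minutes to get n_words across at a given rate."""
    return n_words / words_per_minute

# How many times faster natural speech is than the device.
slowdown = NATURAL_WPM / DEVICE_WPM

if __name__ == "__main__":
    message_words = 30  # hypothetical message length
    print(f"natural speech is about {slowdown:.0f} times faster")
    print(f"device: {minutes_to_convey(message_words, DEVICE_WPM):.1f} min")
    print(f"speech: {minutes_to_convey(message_words, NATURAL_WPM):.1f} min")
```

At these rates, a message that takes seconds to say aloud takes minutes to spell out, which is why speed is high on the list of desired upgrades.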
Another open question is whether it is possible to design brain-machine interfaces that do not require brain surgery. Attempts to create noninvasive approaches have faltered because such devices have tried to make sense of signals that have traveled through layers of tissue and bone, like trying to follow a football game from the parking lot.
Wang has made headway using an advanced imaging technique called magnetoencephalography (MEG), which records magnetic fields on the outside of the skull that are generated by the electric currents in the brain, and then translating those signals into text. Right now, he is trying to build a device that uses MEG to recognize the 44 phonemes, or speech sounds, in the English language — like ph or oo — which could be used to construct syllables, then words, then sentences.
Ultimately, the biggest challenge to restoring speech in locked-in patients may have more to do with biology than with technology. The way speech is encoded, particularly internal speech, could vary depending on the individual or the situation. One person might imagine scrawling a word on a sheet of paper in their mind’s eye; another might hear the word, still unspoken, echoing in their ears; yet another might associate a word with its meaning, evoking a particular feeling-state. Because different brain waves could be associated with different words in different people, different techniques might have to be adapted to each person’s individual nature.
“I think this multipronged approach by the different groups is our best way to cover all of our bases,” says Bjånes, “and have approaches that work in a bunch of different contexts.”
This article originally appeared in Knowable Magazine, an independent journalistic endeavor from Annual Reviews.