Emaan Bangash
News Editor
POSTED
2 months ago

Cancer patients suffering from loss of laryngeal function to benefit from adaptable hardware, machine learning

A UTD professor developed a portable speech processor that gives people who have lost their larynx the ability to speak again.

Assistant professor of bioengineering Jun Wang developed a speech processor designed to read the movements of a person’s tongue and lips and translate them into words. His research is aimed primarily at patients who have undergone laryngectomies (the surgical removal of the larynx) due to cancer.

These patients typically have unintelligible or hoarse voices and use devices such as an electrolarynx or a voice prosthesis. The devices sound vastly different from a human voice and often sound robotic. Wang said he hopes to help improve the patients’ communication skills and speech quality and reduce social difficulty with his technology.

“When we speak, your larynx vibrates to generate the frequency, the source of the sound, then you move your tongue and lips to make the shape of the sound into speech,” Wang said. “Some patients can still communicate in life, but some don’t so it impacts their social life. They cannot talk to their friends and family members, but they’re still healthy, they still walk, the brain is normal and the body is normal.”

Wang developed the software of the device and is currently looking to create hardware that could be adaptable and easily worn like a Bluetooth earpiece. The device would be placed in the patient’s ear and would have a speaker and sensors to track tongue and lip movements in real time. He partnered with Georgia Tech’s college of engineering to develop the hardware further and plans to patent the device in the future.

“Ideally, we’d like to have a device like a Bluetooth headphone that will have a sensor that can track the tongue movements in real time and can actually be embedded with a small computer chip with my software, and it’ll also have a speaker,” Wang said. “When the speech is converted into sound, it’ll play on the speaker.”

The lab tested 25 people in total, five of whom were laryngectomy patients from UT Southwestern. Each participant records between 100 and 1,000 phrases while the device senses the accompanying motions. To collect the tongue and lip movement data, two sensors are glued to the interior of the patient’s mouth, and an electromagnetic field generator is placed next to the patient’s head. When the patient speaks, the device records the motions, which are translated into speech in post-processing. The algorithm uses machine learning to map movements to different phrases.
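
For readers curious how recorded movements can be mapped to phrases, the idea can be illustrated with a minimal sketch. This is not the lab's algorithm, which the team describes as deep learning. It swaps in a much simpler nearest-centroid classifier, and the function names, the two-value frames (tongue height, lip opening) and the toy recordings are all hypothetical, made up for illustration.

```python
from statistics import mean

def resample(trajectory, n_frames=8):
    """Pick n_frames evenly spaced samples so every recording has the same length."""
    last = len(trajectory) - 1
    return [trajectory[round(i * last / (n_frames - 1))] for i in range(n_frames)]

def features(trajectory, n_frames=8):
    """Flatten a resampled trajectory of (tongue, lip) frames into one vector."""
    return [value for frame in resample(trajectory, n_frames) for value in frame]

def train(recordings):
    """recordings: list of (phrase, trajectory). Returns phrase -> average feature vector."""
    grouped = {}
    for phrase, trajectory in recordings:
        grouped.setdefault(phrase, []).append(features(trajectory))
    return {phrase: [mean(column) for column in zip(*vectors)]
            for phrase, vectors in grouped.items()}

def predict(centroids, trajectory):
    """Return the phrase whose average vector is closest to the new recording."""
    vector = features(trajectory)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vector, centroid))

    return min(centroids, key=lambda phrase: squared_distance(centroids[phrase]))

# Synthetic toy recordings (NOT real sensor data): one frame = (tongue, lip).
TOY_RECORDINGS = [
    ("hello", [(0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (0.5, 0.1), (0.0, 0.0)]),
    ("thank you", [(0.0, 0.0), (0.1, 0.5), (0.2, 1.0), (0.1, 0.5), (0.0, 0.0)]),
]

if __name__ == "__main__":
    model = train(TOY_RECORDINGS)
    new_recording = [(0.0, 0.0), (0.4, 0.1), (0.9, 0.2), (1.0, 0.2), (0.4, 0.1), (0.0, 0.0)]
    print(predict(model, new_recording))
```

In this sketch, more recordings per phrase yield steadier averages. That matches the team's point that the model needs a large amount of data, though a deep learning model learns far richer patterns than a simple average.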

Research assistant Beiming Cao joined Wang’s project in January 2016 and helped develop the algorithm for the speech processor. Cao, a doctoral candidate in electrical and computer engineering, said the team has had to collect a substantial amount of data from healthy participants to teach the machine to map tongue and lip movements accurately.

“Our main technology, which is not very popular, is the deep learning technology which enables the software to learn from the examples like the articulation, and they will map each voice,” Cao said. “In order to let the model learn, we need a large amount of data. That’s why we’re doing so much (data collection) from healthy people, laryngectomy patients and ALS patients as well.”

In addition to recording English phrases, Wang said his team is looking into mapping other languages. The lab has tested on patients who speak Korean, Spanish and Mandarin Chinese.

UTD alumna Kristin Teplanski, a doctoral candidate in communication sciences and disorders, joined the lab in August 2016 and has helped with data collection. She said her clinical background has given her first-hand experience of the patients’ struggles with communication.

“The patients, they’re very excited about it,” Teplanski said. “I went to a laryngectomy conference and they were all very interested in the research and they didn’t think this would ever be possible.”

Wang said patients typically prefer to rely on a single device rather than use other methods of communication, such as sign language or typing.

“I think it’s going to dramatically change their lives,” Teplanski said. “We’re going to give them a voice. Prior to this they had no idea that this was even a possibility.”