
Howard University & Google Release Data to Help AI Recognize Black American Dialects

News

Howard University and Google Research have released data to help AI developers improve the experience of African Americans using automatic speech recognition (ASR) technology.

The partnership, named Project Elevate Black Voices, has produced a dataset of over 600 hours of African American English (AAE) speech from 32 states, intended to improve AI’s recognition of diverse Black dialects.

Why Project Elevate Black Voices’ Data is Pertinent

African American English (AAE), also known as African American Vernacular, Black English, Black talk, or Ebonics, is a rich language deeply rooted in history and culture. Because of biases in how speech recognition systems are developed, AI-driven technology sometimes produces incorrect results when Black users speak commands to it.


Many Black users have had to alter their voice patterns, shifting away from their natural accents, to be understood by voice products. AI-driven technologies often overlook or misinterpret the linguistic nuances of AAE, creating barriers for many Black individuals who use them.

Through this initiative, researchers traveled across the United States to document the dialects and diction commonly used in Black communities. “African American English has been at the forefront of United States culture since almost the beginning of the country,” said Gloria Washington, Ph.D., Howard University researcher and co-principal investigator of Project Elevate Black Voices.


“Voice assistant technology should understand different dialects of all African American English to truly serve not just African Americans, but also other persons who speak these unique dialects. It’s about time that we provide the best experience for all users of these technologies.”  


How the Data Was Collected and Will Be Deployed

Researchers collected 600 hours of speech data from speakers of different AAE dialects in 32 states. They found that existing speech data contains little natural AAE speech because Black users have been implicitly conditioned to change their voices when using ASR-based technology. Even when such data is available, in-product AAE is challenging to leverage because of code-switching.


“Working with our outstanding partners at Howard University on Project Elevate Black Voices has been a tremendous and personal honor,” said Courtney Heldreth, Ph.D., co-principal investigator at Google Research. “It’s our mission at Google to make technology that’s useful and accessible, and I truly believe that our work here will allow more users to express themselves authentically when using smart devices.”


The Howard African American English Dataset 1.0 will initially be made available exclusively to researchers and institutions within historically Black colleges and universities. This ensures that the data is employed in ways that reflect the interests and needs of marginalized communities. Specifically, it will benefit African American communities whose linguistic practices have often been excluded or misrepresented in computational systems.

Howard University will retain ownership of the dataset and its licensing, serving as steward for its responsible use and ensuring the data benefits Black communities. Google can also use the dataset to enhance its products, making its tools more effective for a broader range of users. Google performs this type of model training work with various dialects, languages, and accents across the US and the world.


Image Credit: Howard University

Stephen Oluwadara
Stephen Oluwadara is a general news reporter for UrbanGeekz covering stories across the US and Africa.