Google’s voice search function has received an upgrade that the company says makes it faster and better.
A spokesperson said that Google voice search within the Google app is now 300 milliseconds faster, or as much as 500 milliseconds faster depending on the device the app is running on.
New computational models
The company announced that it has added new computational models that “are more accurate, robust to noise, and faster to respond to voice search queries.”
In a blog post, Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays and Johan Schalkwyk of the Google Speech Team went into detail about how the new upgrades came about and exactly what they entail.
“Back in 2012, we announced that Google voice search had taken a new turn by adopting Deep Neural Networks (DNNs) as the core technology used to model the sounds of a language. These replaced the 30-year old standard in the industry: the Gaussian Mixture Model (GMM). DNNs were better able to assess which sound a user is producing at every instant in time, and with this they delivered greatly increased speech recognition accuracy,” the post began.
“Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques. These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!”
Waveform frames
Google’s blog post goes deep into the specifics of how the latest improvements have been made.
A traditional speech recogniser breaks the waveform spoken by a user into small consecutive “frames” of 10 milliseconds of audio. Google’s new acoustic models rely on recurrent neural networks (RNNs), whose feedback loops let them carry information from one frame to the next; Google’s RNNs use memory cells and a sophisticated gating mechanism.
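As a rough illustration of the framing step described above, the Python sketch below splits a waveform into consecutive, non-overlapping 10-millisecond frames. The function name `split_into_frames`, the 16 kHz sample rate and the choice to drop a trailing partial frame are assumptions made for this example; they are not details taken from Google’s post, and this is not Google’s implementation.

```python
def split_into_frames(samples, sample_rate=16000, frame_ms=10):
    """Split a sequence of audio samples into consecutive fixed-length frames.

    Assumptions for illustration only: mono audio, non-overlapping frames,
    and any trailing partial frame is discarded.
    """
    frame_len = sample_rate * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len  # whole frames only
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# Hypothetical input: one second of silence sampled at 16 kHz.
waveform = [0.0] * 16000
frames = split_into_frames(waveform)
print(len(frames), len(frames[0]))  # prints "100 160"
```

At the assumed 16 kHz rate, each 10-millisecond frame holds 160 samples, so one second of audio yields 100 frames. An acoustic model would then score each frame in sequence.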
The post concludes: “We are happy to announce that our new acoustic models are now used for voice searches and commands in the Google app (on Android and iOS), and for dictation on Android devices. In addition to requiring much lower computational resources, the new models are more accurate, robust to noise, and faster to respond to voice search queries – so give it a try, and happy (voice) searching!”