TechBooky

IBM Surpasses Microsoft in Speech Recognition Accuracy in Just Five Months

by Paul Balo
March 19, 2017
in Artificial Intelligence

Just five months after Microsoft announced that its speech recognition technology had reached a 5.9 percent word error rate (WER), bringing the technology closer to matching human performance, IBM reported that it had pushed further. The company achieved a 5.5 percent WER, setting a new record for machine speech recognition that was previously held by the Microsoft system.

But why does WER matter? In speech recognition and translation systems, WER is a key measure of accuracy: it is the share of words a system gets wrong relative to a reference transcript, so a lower value means higher accuracy. By IBM's estimate, human performance stands at 5.1 percent.
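To make the metric concrete, here is a minimal Python sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the system's output, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not the scoring tools IBM or Microsoft used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"  # one substitution, one deletion
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

On this scale, a 5.5 percent WER corresponds to roughly 55 word errors for every 1,000 words in the reference transcript.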

IBM’s result came from combining two language models: one based on Long Short-Term Memory (LSTM) and one based on WaveNet, a technology from Google’s DeepMind. WaveNet was designed to generate speech that resembles the human voice as closely as possible, whereas LSTM is a recurrent neural network unit that can retain values over long or short periods. LSTM’s strength lies in its ability to learn from earlier inputs in a sequence and, as a result, make better predictions on time series.
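How an LSTM unit holds on to a value can be sketched with a toy, single-number cell in Python. This follows the standard textbook LSTM equations, not IBM's model; the function `lstm_step` and the hand-picked weights in `remember_weights` are hypothetical and chosen only to show the gates at work.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One step of a scalar LSTM cell.

    weights maps each gate name to (input weight, hidden weight, bias).
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(gate):
        w_x, w_h, bias = weights[gate]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))      # how much old memory to keep
    write = sigmoid(pre_activation("input"))        # how much new input to store
    read = sigmoid(pre_activation("output"))        # how much memory to expose
    candidate = math.tanh(pre_activation("candidate"))

    c = forget * c_prev + write * candidate         # updated memory
    h = read * math.tanh(c)                         # visible output
    return h, c


# Hypothetical weights: forget gate held open, input gate held shut,
# so the cell keeps its stored value and ignores new inputs.
remember_weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}

if __name__ == "__main__":
    h, c = 0.0, 0.8  # start with 0.8 stored in the cell
    for _ in range(50):
        h, c = lstm_step(1.0, h, c, remember_weights)
    print(f"cell state after 50 steps: {c:.4f}")
```

In a trained network the gate weights are learned from data rather than fixed by hand, which is what lets the unit decide when to keep, overwrite or expose what it has stored.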

IBM reports that combining these two technologies enabled it to achieve a lower WER than Microsoft’s system. However, the two companies differ on how these figures relate to human parity. Microsoft maintains that its 5.9 percent WER is on par with human performance on the same speech recognition task, whereas IBM asserts that 5.1 percent is a more fitting representation of human parity, and that is the figure it is aiming for.

In the end, the objective for all players in the field, according to IBM, is to achieve ‘human parity’, meaning an error rate on par with that of two humans speaking. Some in the industry have claimed that reaching the 5.9 percent WER mark is equivalent to human parity. However, IBM argues that this is not yet a cause for celebration: in the course of its own work, the company says, “we determined human parity is actually lower than what anyone has yet achieved”, at 5.1 percent. IBM says it will keep working toward that mark.



Tags: deepmind, IBM, microsoft, speech recognition, wavenet
Paul Balo

Paul Balo is the founder of TechBooky and a highly skilled wireless communications professional with a strong background in cloud computing, offering extensive experience in designing, implementing, and managing wireless communication systems.
