Tobias Münch

Is the web ready for voice user interfaces?

The Web Speech API is not ready for production. Its recognition accuracy is about as reliable as a coin flip, and it has critical privacy flaws.

#1 about 3 minutes

Why voice user interfaces are important for accessibility

Voice interfaces can significantly improve web accessibility for users with disabilities and provide hands-free convenience for mobile professionals.

#2 about 1 minute

Understanding the Web Speech API's core functions

The Web Speech API is a W3C specification with two parts: speech recognition, which converts voice to text, and speech synthesis, which converts text to voice.
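As a rough illustration of the two halves, the sketch below checks which of them a given global object exposes and speaks a sentence when synthesis is available. The `scope` parameter and the helper names `detectSpeechSupport` and `speak` are assumptions made for this example; in a browser, `scope` would be `window`.

```javascript
// Sketch: report which halves of the Web Speech API a global scope exposes.
// `scope` is a hypothetical parameter; in a browser you would pass `window`.
function detectSpeechSupport(scope) {
  // Chromium-based browsers expose recognition under a `webkit` prefix.
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  return {
    recognition: typeof Recognition === "function",
    synthesis:
      typeof scope.speechSynthesis === "object" &&
      scope.speechSynthesis !== null &&
      typeof scope.SpeechSynthesisUtterance === "function",
  };
}

// Text to voice: speak a sentence if synthesis is available.
// Returns false instead of throwing when the API is missing.
function speak(scope, text) {
  if (!detectSpeechSupport(scope).synthesis) return false;
  scope.speechSynthesis.speak(new scope.SpeechSynthesisUtterance(text));
  return true;
}
```

Passing the scope in explicitly, rather than reading `window` directly, keeps the helpers usable in environments where the API does not exist.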

#3 about 2 minutes

Reviewing VUI research and its current limitations

Research projects like the Conversational Web and a wheelchair VUI demonstrate potential but suffer from inconsistent accuracy, online-only functionality, and a lack of wake-word support.

#4 about 3 minutes

How to implement the Web Speech API in JavaScript

Learn the step-by-step process of implementing speech recognition, including loading the class, configuring grammar with JSGF, starting the listener, and processing the results.
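The four steps can be sketched roughly as follows. This is not the code shown in the talk: the helper names `buildJsgf` and `createRecognizer`, the `scope` and `onTranscript` parameters, the `command` rule name, and the `en-US` language setting are all assumptions for illustration. Grammar support also varies between browsers, so the grammar step is guarded.

```javascript
// Build a JSGF grammar string from a rule name and a list of phrases.
function buildJsgf(ruleName, phrases) {
  return `#JSGF V1.0; grammar ${ruleName}; public <${ruleName}> = ${phrases.join(" | ")} ;`;
}

// `scope` stands in for `window`; `onTranscript` is a hypothetical callback.
function createRecognizer(scope, phrases, onTranscript) {
  // Step 1: load the class (prefixed in Chromium-based browsers).
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Recognition) return null;
  const recognition = new Recognition();

  // Step 2: configure the grammar, if the browser exposes a grammar list.
  const GrammarList = scope.SpeechGrammarList || scope.webkitSpeechGrammarList;
  if (GrammarList) {
    const grammars = new GrammarList();
    grammars.addFromString(buildJsgf("command", phrases), 1);
    recognition.grammars = grammars;
  }
  recognition.lang = "en-US";
  recognition.interimResults = false;
  recognition.maxAlternatives = 1;

  // Step 4: process results; the last result holds the newest utterance.
  recognition.onresult = (event) => {
    const latest = event.results[event.results.length - 1];
    onTranscript(latest[0].transcript, latest[0].confidence);
  };
  return recognition;
}

// Step 3 (browser only): start listening, ideally from a user gesture.
// const recognizer = createRecognizer(window, ["red", "green"], console.log);
// if (recognizer) recognizer.start();
```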

#5 about 2 minutes

Navigating the Web Speech API's result data structure

The API returns a nested data structure containing a list of results, each with alternatives that include the text transcript and a confidence score.
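One way to walk that nesting is sketched below. It treats the result list and each result as array-like objects, which is how the browser exposes them. The helper names and the 0.5 confidence threshold are assumptions chosen for this example, not values from the talk.

```javascript
// Flatten the nested results -> alternatives structure into plain objects.
function flattenResults(results) {
  const flat = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i]; // one recognized utterance
    for (let j = 0; j < result.length; j++) {
      const alternative = result[j]; // one candidate transcription
      flat.push({
        resultIndex: i,
        transcript: alternative.transcript,
        confidence: alternative.confidence,
        isFinal: Boolean(result.isFinal),
      });
    }
  }
  return flat;
}

// Keep only final transcripts at or above a confidence threshold.
// The default of 0.5 is an arbitrary assumption; tune it per application.
function confidentTranscripts(results, minConfidence = 0.5) {
  return flattenResults(results)
    .filter((alt) => alt.isFinal && alt.confidence >= minConfidence)
    .map((alt) => alt.transcript);
}
```

In an `onresult` handler, `event.results` could be passed to either helper directly.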

#6 about 3 minutes

Key challenges limiting Web Speech API adoption

The API's adoption is hindered by significant issues including poor developer experience, privacy risks from cloud processing, no offline support, and inconsistent browser implementations.

#7 about 3 minutes

A look inside the browser's implementation of speech recognition

An analysis of the Chromium source code reveals how the Web Speech API is implemented through layers that manage and dispatch recognition tasks to either remote cloud services or local OS-dependent engines.

#8 about 5 minutes

The future of VUIs with Stanford's React Genie

Stanford's React Genie project offers a new paradigm by loosely coupling a voice agent with React state, allowing for complex voice commands that can manipulate off-screen content and application logic.

#9 about 1 minute

Final verdict on the web's readiness for voice UIs

The current Web Speech API is suitable for experimentation but not reliable enough for production use. Promising research, however, points to a more capable future for web-based voice interfaces.
