Intro to the HTML5 Speech Synthesis API

In this quick tutorial I will give you a short introduction to the HTML5 Speech Synthesis API. We will learn how it works by building a toy example: a simple form with a list for picking one of the available voices and a text field for the text to be spoken.

HTML5 Speech Synthesis Demo

Have a look at the demo below. Go ahead and pick a voice, type in some text and hit the “Speak!” button.

See the Pen sLizk by Creative Punch (@CreativePunch) on CodePen.

The Code

The full code for the HTML5 Speech Synthesis demo is viewable in the demo widget, but for those interested we will go over the relevant parts below.

  speechSynthesis.onvoiceschanged = function() {
    var $voicelist = $('#voicelist');
    speechSynthesis.getVoices().forEach(function(voice, index) {
      console.log(index, voice.name, voice.default ? '(default)' : '');
      // The option value is the index into the voice list.
      var $option = $('<option>')
        .val(index)
        .text(voice.name + (voice.default ? ' (default)' : ''));
      $voicelist.append($option);
    });
  };

This code simply fills my list with the available voices. It is important to note that the list of HTML5 Speech Synthesis voices (each entry is a SpeechSynthesisVoice object) may load asynchronously: in some browsers, Chrome for example, speechSynthesis.getVoices() returns an empty list until the voices have been retrieved. This is why the code runs inside the speechSynthesis.onvoiceschanged event handler, which ensures that the code for adding voices to the list and switching voices executes only after the list has been retrieved.
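Because not every browser loads the voices the same way, it can help to cover both cases: a list that is already there and a list that arrives later. Below is a minimal sketch of that idea. The helper name whenVoicesReady is my own and not part of the API, and the synth parameter (normally window.speechSynthesis) is only passed in so the helper does not depend on a global.

```javascript
// Hypothetical helper: calls `callback` with the voice list once it is non-empty.
// `synth` is normally window.speechSynthesis.
function whenVoicesReady(synth, callback) {
  var voices = synth.getVoices();
  if (voices.length > 0) {
    // The list is already available, so there is nothing to wait for.
    callback(voices);
    return;
  }
  // Otherwise wait until the browser announces that the list has changed.
  // Note: assigning the property replaces any handler that was set before.
  synth.onvoiceschanged = function () {
    var loaded = synth.getVoices();
    if (loaded.length > 0) {
      synth.onvoiceschanged = null; // run the callback only once
      callback(loaded);
    }
  };
}
```

In the browser you would call it as whenVoicesReady(window.speechSynthesis, function (voices) { ... }) and fill the list inside the callback.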

Also note how there is no declaration of speechSynthesis. It is readily available as a property of the global window object.
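That only holds in browsers that implement the API, so a quick check before using it is a reasonable precaution. Here is a small sketch. The function name supportsSpeechSynthesis is my own, and the root parameter stands in for window.

```javascript
// Hypothetical helper: returns true when `root` (normally window) exposes
// both the speechSynthesis object and the SpeechSynthesisUtterance constructor.
function supportsSpeechSynthesis(root) {
  return Boolean(
    root &&
    root.speechSynthesis &&
    typeof root.SpeechSynthesisUtterance === 'function'
  );
}
```

In the browser, supportsSpeechSynthesis(window) tells you whether it makes sense to show the form at all.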

Next, let’s get our HTML5 Speech Synthesis to speak!

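    var text = $('#speech-input').val();
    var msg = new SpeechSynthesisUtterance();
    var voices = window.speechSynthesis.getVoices();
    // The selected option value is an index into the voice list.
    msg.voice = voices[$('#voicelist').val()];
    msg.text = text;

    msg.onend = function(e) {
      console.log('Finished in ' + e.elapsedTime + ' seconds.');
    };

    // Hand the message to the synthesizer so it is spoken.
    window.speechSynthesis.speak(msg);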

The key points about this code, and the guidelines for using the API, are the following:

  • You need to construct a message by creating an instance of SpeechSynthesisUtterance.
  • You need to specify the options of the SpeechSynthesisUtterance instance, such as voice, text, and pitch.
  • A voice is set by retrieving the list of voices, and assigning the right voice object (not just index) to the voice property.
  • If no voice is assigned to the property, the default voice is used.
  • The text to speak must be set using the text property.
  • Events let you run code at certain moments (such as when the Speech Synthesis has finished).
  • Use the speak method and pass the message (with text, voice, … properties) to make it speak!
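The guidelines above can be folded into one small helper. This is only a sketch under a few assumptions: the name speakText and the shape of the options object are my own, and synth and Utterance (normally window.speechSynthesis and window.SpeechSynthesisUtterance) are passed in as parameters rather than read from globals.

```javascript
// Hypothetical helper: builds a message from plain options and speaks it.
// `synth` is normally window.speechSynthesis,
// `Utterance` is normally window.SpeechSynthesisUtterance.
function speakText(synth, Utterance, text, options) {
  options = options || {};
  var msg = new Utterance();
  msg.text = text;
  if (options.voice) {
    msg.voice = options.voice; // a voice object, not an index
  }
  if (typeof options.pitch === 'number') {
    msg.pitch = options.pitch;
  }
  if (typeof options.onend === 'function') {
    msg.onend = options.onend; // runs when speaking has finished
  }
  synth.speak(msg);
  return msg;
}
```

In the browser a call could look like speakText(window.speechSynthesis, window.SpeechSynthesisUtterance, 'Hello', { voice: voices[0] }). Leaving out the voice option falls back to the default voice.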

Where to go from here?

While this toy example probably doesn’t have much real-world use, the same methods could be applied for accessibility (reading text aloud for people with impaired vision). You could also use it to spice up an online quiz and have the computer read the questions out loud. These are just small ideas. If you make anything really interesting, feel free to leave a comment!
