# Web Speech API for a Node.js/Socket.io chat app

Source: https://tpiros.dev/blog/web-speech-api-for-a-node-jssocket-io-chat-app

The [Web Speech API](https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html) is a W3C Community Group specification that sits alongside HTML5 rather than inside it, and I thought it'd be fun to bolt speech recognition onto my chat application. Instead of typing messages, people could just talk.

I've pushed the latest changes to the [GitHub repository](https://github.com/tpiros/advanced-chat). Let's walk through them.

On the HTML side, I've added a 'Record' button (`#start_button`) whose `onclick` handler calls `startButton(event)`. Simple.

The interesting bits live in the JavaScript.

First thing: check whether the browser supports Web Speech. At the time of writing, Google Chrome is the only browser with full support. Mozilla's been making progress on bringing it to Firefox. For now, this check does the job:

```js
if ('webkitSpeechRecognition' in window) {
  console.log('webkitSpeechRecognition is available');
}
```
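
Chrome exposes the constructor under the `webkit` prefix, while the specification names it `SpeechRecognition`. Here's a minimal sketch of a check that covers both. `getSpeechRecognition` is a hypothetical helper, not something from the repository, and it takes the global object as a parameter so it can be tried outside a browser too.

```js
// Hypothetical helper: returns the recognition constructor, or null if unsupported.
function getSpeechRecognition(globalObject) {
  // Prefer the unprefixed name from the spec, fall back to Chrome's prefixed one.
  return globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null;
}

// In the browser the global object is `window`.
if (typeof window !== 'undefined') {
  var SpeechRecognitionCtor = getSpeechRecognition(window);
  if (SpeechRecognitionCtor) {
    console.log('Speech recognition is available');
  } else {
    console.log('Speech recognition is not supported in this browser');
  }
}
```

When the helper returns `null`, the sensible move is probably to hide the 'Record' button and leave the text input as the only option.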

If it's available, we initialise it and set `continuous` and `interimResults` to `true`. When `continuous` is on, the user agent keeps listening and returns zero or more final results instead of stopping after the first one (think dictation). When `interimResults` is on, we also get partial results while the user is still speaking.

```js
var recognition = new webkitSpeechRecognition();
recognition.continuous = true;
recognition.interimResults = true;
```

For the full list of configurable parameters and methods, run `console.log(recognition);` and poke around.

I've implemented two event handlers so far, `onstart` and `onresult`:

```js
// Shared state, also used by startButton().
var recognizing = false;
var final_transcript = '';

recognition.onstart = function() {
  recognizing = true;
};

recognition.onresult = function(event) {
  console.log(event); // debugging aid
  var interim_transcript = '';
  // Only walk the results that changed since the last event.
  for (var i = event.resultIndex; i < event.results.length; ++i) {
    if (event.results[i].isFinal) {
      final_transcript += event.results[i][0].transcript;
    } else {
      interim_transcript += event.results[i][0].transcript;
    }
  }
  // Show confirmed text followed by whatever is still being recognised.
  var hasInterim = interim_transcript !== '';
  $('#msg')
    .val(final_transcript + interim_transcript)
    .toggleClass('interim', hasInterim)
    .toggleClass('final', !hasInterim);
};
```
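
The loop inside `onresult` is easier to reason about, and to test without a microphone, when the transcript handling is separated from the DOM updates. What follows is a sketch under that assumption. `collectTranscripts` is a hypothetical name, and it relies only on the `isFinal` flag and the `[0].transcript` field already used in the handler.

```js
// Hypothetical helper: splits recognition results into final and interim text.
// `results` is array-like, shaped like event.results;
// `startIndex` plays the role of event.resultIndex.
function collectTranscripts(results, startIndex) {
  var finalText = '';
  var interimText = '';
  for (var i = startIndex; i < results.length; ++i) {
    // results[i][0] is the first (top-ranked) alternative for this result.
    var text = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Plain arrays standing in for a real SpeechRecognitionResultList.
var fakeResults = [
  Object.assign([{ transcript: 'hello ' }], { isFinal: true }),
  Object.assign([{ transcript: 'wor' }], { isFinal: false })
];
console.log(collectTranscripts(fakeResults, 0));
```

Inside `onresult`, the handler would append `finalText` to `final_transcript` and use `interimText` for the text that is still in flux.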

The 'Record' button's handler works as a toggle: if recognition is already running it calls `stop()`; otherwise it sets the language and calls `start()`:

```js
function startButton(event) {
  if (recognizing) {
    console.log('stopping');
    recognition.stop();
    recognizing = false;
    $('#start_button').prop('value', 'Record');
    return;
  }
  final_transcript = '';
  recognition.lang = 'en-GB';
  recognition.start();
  $('#start_button').prop('value', 'Recording ... Click to stop.');
  ignore_onend = false;
  $('#msg').val(''); // clear the message box for the new recording
}
```
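
`recognition.lang` takes a BCP 47 language tag, so offering more than `en-GB` mostly comes down to mapping a user's choice to a tag. Here's a rough sketch, assuming a hypothetical `LANGUAGES` table and `resolveLanguage` helper (neither is in the repository). Which tags actually work depends on the browser's speech service.

```js
// Hypothetical lookup table: display label -> BCP 47 tag.
var LANGUAGES = {
  'English (UK)': 'en-GB',
  'English (US)': 'en-US',
  'German': 'de-DE'
};

// Returns the tag for a known label, or the fallback for anything else.
function resolveLanguage(label, fallback) {
  return Object.prototype.hasOwnProperty.call(LANGUAGES, label)
    ? LANGUAGES[label]
    : fallback;
}

console.log(resolveLanguage('English (US)', 'en-GB'));

// Possible wiring inside startButton(), assuming a hypothetical
// <select id="language"> whose option values are the labels above:
// recognition.lang = resolveLanguage($('#language').val(), 'en-GB');
```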

When you test the application, Chrome will ask for microphone access before recording starts. Allow it.

There's still plenty to do here. I want to catch the event when someone denies microphone access (accidentally or on purpose). Adding more language options would be good too. Those improvements will come with time.
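
As a starting point for the first item: the recognition object fires an `error` event, and the event's `error` property is `'not-allowed'` when microphone access is blocked. Below is a sketch of how that could be turned into a message for the user. `describeSpeechError` and its wording are hypothetical, not part of the repository.

```js
// Hypothetical helper: maps a Web Speech error code to a user-facing message.
function describeSpeechError(code) {
  switch (code) {
    case 'not-allowed':
      return 'Microphone access was denied. Allow it in the browser to record.';
    case 'service-not-allowed':
      return 'The browser blocked the speech recognition service.';
    case 'audio-capture':
      return 'No microphone could be used for recording.';
    case 'no-speech':
      return 'No speech was detected. Try again.';
    default:
      return 'Speech recognition failed: ' + code;
  }
}

console.log(describeSpeechError('not-allowed'));

// Possible wiring, assuming `recognition` and `recognizing` from the earlier snippets:
// recognition.onerror = function(event) {
//   recognizing = false;
//   $('#start_button').prop('value', 'Record');
//   console.log(describeSpeechError(event.error));
// };
```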