HTML5 Rocks

Web apps that talk - Introduction to the Speech Synthesis API

By Eric Bidelman at

The Web Speech API adds voice recognition (speech to text) and speech synthesis (text to speech) to JavaScript. This post briefly covers the latter, as the API recently landed in Chrome 33 (mobile and desktop). If you're interested in speech recognition, Glen Shires had a great writeup a while back on the voice recognition feature, "Voice Driven Web Apps: Introduction to the Web Speech API".

Basics

The most basic use of the synthesis API is to pass speechSynthesis.speak() an utterance:

var msg = new SpeechSynthesisUtterance('Hello World');
window.speechSynthesis.speak(msg);

Try it!

However, you can also alter parameters to affect the volume, speech rate, pitch, voice, and language:

var msg = new SpeechSynthesisUtterance();
var voices = window.speechSynthesis.getVoices();
msg.voice = voices[10]; // Note: some voices don't support altering params
msg.voiceURI = 'native';
msg.volume = 1; // 0 to 1
msg.rate = 1; // 0.1 to 10
msg.pitch = 2; //0 to 2
msg.text = 'Hello World';
msg.lang = 'en-US';

msg.onend = function(e) {
  console.log('Finished in ' + e.elapsedTime + ' seconds.');
};

speechSynthesis.speak(msg);
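The ranges noted in the comments above can be enforced before the values reach the engine. This is only a minimal sketch: clampUtteranceOptions is a hypothetical helper, not part of the API, and the fallback value of 1 for each parameter is an assumption.

```javascript
// Hypothetical helper: keeps volume, rate and pitch inside the ranges
// noted in the example above (volume 0 to 1, rate 0.1 to 10, pitch 0 to 2).
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

function clampUtteranceOptions(options) {
  options = options || {};
  // Assumption: a missing value falls back to 1.
  return {
    volume: clamp(options.volume === undefined ? 1 : options.volume, 0, 1),
    rate: clamp(options.rate === undefined ? 1 : options.rate, 0.1, 10),
    pitch: clamp(options.pitch === undefined ? 1 : options.pitch, 0, 2)
  };
}

// Usage, only in a browser that supports speech synthesis:
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  var settings = clampUtteranceOptions({ rate: 50, pitch: -1 });
  var msg = new SpeechSynthesisUtterance('Hello World');
  msg.volume = settings.volume; // 1
  msg.rate = settings.rate;     // 10
  msg.pitch = settings.pitch;   // 0
  window.speechSynthesis.speak(msg);
}
```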

Setting a voice

The API also allows you to get a list of the voices the engine supports:

speechSynthesis.getVoices().forEach(function(voice) {
  console.log(voice.name, voice.default ? '(default)' :'');
});

Then set a different voice, by setting .voice on the utterance object:

var msg = new SpeechSynthesisUtterance('I see dead people!');
msg.voice = speechSynthesis.getVoices().filter(function(voice) { return voice.name == 'Whisper'; })[0];
speechSynthesis.speak(msg);
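Voice names depend on the platform, so the filter above yields undefined when no voice called 'Whisper' is installed, and the list may also be empty if the voices have not loaded yet. A small sketch of a more defensive lookup follows; pickVoice is a hypothetical helper and its fallback order is an assumption.

```javascript
// Hypothetical helper: find a voice by name, falling back to the default
// voice, then to the first voice, then to null if the list is empty.
function pickVoice(voices, name) {
  var byName = voices.filter(function(voice) {
    return voice.name === name;
  })[0];
  if (byName) return byName;
  var fallback = voices.filter(function(voice) {
    return voice.default;
  })[0];
  return fallback || voices[0] || null;
}

// Usage, only in a browser that supports speech synthesis:
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  var msg = new SpeechSynthesisUtterance('I see dead people!');
  var voice = pickVoice(window.speechSynthesis.getVoices(), 'Whisper');
  if (voice) msg.voice = voice; // otherwise leave the engine's choice alone
  window.speechSynthesis.speak(msg);
}
```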

Demo

In my Google I/O 2013 talk, "More Awesome Web: features you've always wanted" (www.moreawesomeweb.com), I showed a Google Now/Siri-like demo of using the Web Speech API's SpeechRecognition service with the Google Translate API to auto-translate microphone input into another language:

DEMO: http://www.moreawesomeweb.com/demos/speech_translate.html

Unfortunately, it used an undocumented (and unofficial) API to perform the speech synthesis. Well, now we have the full Web Speech API to speak back the translation! I've updated the demo to use the synthesis API.

Browser Support

Chrome 33 has full support for the Web Speech API, while Safari for iOS7 has partial support.

Feature detection

Since browsers may support each portion of the Web Speech API separately (e.g. the case with Chromium), you may want to feature detect each feature separately:

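if ('speechSynthesis' in window) {
  // Synthesis support. Make your web apps talk!
}

// Chrome exposes recognition under a webkit prefix, so check both names.
if ('SpeechRecognition' in window || 'webkitSpeechRecognition' in window) {
  // Speech recognition support. Talk to your apps!
}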

What's the CSS :scope pseudo-class for?

By Eric Bidelman at

:scope is defined in CSS Selectors 4 as:

A pseudo-class which represents any element that is in the contextual reference element set. This is a (potentially empty) explicitly-specified set of elements, such as that specified by the querySelector(), or the parent element of a <style scoped> element, which is used to "scope" a selector so that it only matches within a subtree.

An example of using this guy is within a <style scoped>:

<style>
  li {
    color: blue;
  }
</style>

<ul>
  <style scoped>
    li {
      color: red;
    } 
    :scope {
      border: 1px solid red;
    }
  </style>
  <li>abc</li>
  <li>def</li>
  <li>efg</li>
</ul>

<ul>
  <li>hij</li>
  <li>klm</li>
  <li>nop</li>
</ul>

Note: <style scoped> can be enabled in Chrome using the "Enable experimental WebKit features" flag in about:flags.

This colors the li elements in the first ul red and, because of the :scope rule, puts a border around the ul. That's because in the context of this <style scoped>, the ul matches :scope. It's the local context. If we were to add a :scope rule in the outer <style>, it would match the entire document, making it essentially equivalent to :root.

Contextual elements

You're probably aware of the Element version of querySelector() and querySelectorAll(). Instead of querying the entire document, you can restrict the result set to a contextual element:

<ul>
  <li id="scope"><a>abc</a></li>
  <li>def</li>
  <li><a>efg</a></li>
</ul>
<script>
  document.querySelectorAll('ul a').length; // 2

  var scope = document.querySelector('#scope');
  scope.querySelectorAll('a').length; // 1
</script>

When these are called, the browser returns a NodeList that's filtered to only include the set of nodes that (a) match the selector and (b) are also descendants of the context element. So in the second example, the browser finds all a elements, then filters out the ones not in the scope element. This works, but it can lead to some bizarre behavior if you're not careful. Read on.

When querySelector goes wrong

There's a really important point in the Selectors spec that people often overlook. Even when querySelector[All]() is invoked on an element, selectors still evaluate in the context of the entire document. This means unanticipated things can happen:

scope.querySelectorAll('ul a').length; // 1
scope.querySelectorAll('body ul a').length; // 1

WTF! In the first example, ul is an ancestor of my element rather than a descendant, yet I can still use it in the selector and it matches nodes. In the second, body isn't even a descendant of my element, but "body ul a" still matches. Both of these are confusing and not what you'd expect.
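To see why both selectors match, it can help to model the two steps described earlier: match against the whole document, then keep only descendants of the context element. The following is a toy sketch over plain objects, not a real DOM; node, matchesInDocument and querySketch are hypothetical names, and it only handles simple tag-name descendant selectors.

```javascript
// Toy model, not a real DOM: each node has a tag name and a parent.
function node(tag, parent) {
  return { tag: tag, parent: parent || null };
}

// True if `el` matches a descendant selector such as ['ul', 'a'],
// evaluated against the whole tree (all of el's ancestors).
function matchesInDocument(el, parts) {
  var i = parts.length - 1;
  if (el.tag !== parts[i]) return false;
  i--;
  for (var p = el.parent; p && i >= 0; p = p.parent) {
    if (p.tag === parts[i]) i--;
  }
  return i < 0;
}

function isDescendantOf(el, ancestor) {
  for (var p = el.parent; p; p = p.parent) {
    if (p === ancestor) return true;
  }
  return false;
}

// Sketch of element-rooted querySelectorAll(): match against the full
// document first, then keep only descendants of the context element.
function querySketch(context, allNodes, parts) {
  return allNodes.filter(function(el) {
    return matchesInDocument(el, parts) && isDescendantOf(el, context);
  });
}

// Same shape as the markup above: body > ul > (li#scope > a), li, (li > a)
var body = node('body');
var ul = node('ul', body);
var scope = node('li', ul);
var a1 = node('a', scope);
var li2 = node('li', ul);
var li3 = node('li', ul);
var a2 = node('a', li3);
var all = [body, ul, scope, a1, li2, li3, a2];

console.log(querySketch(scope, all, ['ul', 'a']).length);         // 1
console.log(querySketch(scope, all, ['body', 'ul', 'a']).length); // 1
console.log(querySketch(scope, all, ['a']).length);               // 1
```

The ancestors named in the selector are checked against the full tree, which is why ul and body can appear in a selector run on an li.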

It's worth making the comparison to jQuery here, which takes the right approach and does what you'd expect:

$(scope).find('ul a').length // 0
$(scope).find('body ul a').length // 0

...enter :scope to solve these semantic shenanigans.

Fixing querySelector with :scope

WebKit recently landed support for using the :scope pseudo-class in querySelector[All](). You can test it in Chrome Canary 27.

You can use it to restrict selectors to a context element. Let's see an example. In the following, :scope is used to "scope" the selector to the scope element's subtree. That's right, I said scope three times!

scope.querySelectorAll(':scope ul a').length; // 0
scope.querySelectorAll(':scope body ul a').length; // 0
scope.querySelectorAll(':scope a').length; // 1

Using :scope makes the semantics of the querySelector() methods a little more predictable and in line with what others like jQuery are already doing.
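If you want this behavior for every query, one option is to add the prefix in a wrapper. A rough sketch, assuming a hypothetical scopeSelector helper and a deliberately naive split on commas:

```javascript
// Hypothetical helper: prefix every comma-separated selector in a list
// with ":scope " so that each one is rooted at the context element.
// Assumption: the selector has no commas inside :not(...), attribute
// values, or strings; a robust version would need a real selector parser.
function scopeSelector(selector) {
  return selector.split(',').map(function(part) {
    return ':scope ' + part.trim();
  }).join(', ');
}

// Usage, only in a browser that supports :scope in querySelectorAll():
if (typeof document !== 'undefined') {
  var el = document.querySelector('#scope');
  if (el) {
    console.log(el.querySelectorAll(scopeSelector('ul a, a')).length);
  }
}
```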

Performance win?

Not yet :(

I was curious if using :scope in qS/qSA gives a performance boost. So...like a good engineer I threw together a test. My rationale: less surface area for the browser to do selector matching means speedier lookups.

In my experiment, using :scope in WebKit currently takes ~1.5-2x longer than not using it. Drats! When crbug.com/222028 gets fixed, using it should theoretically give you a slight performance boost over not using it.

Detect DOM changes with Mutation Observers

By Ernest Delgado at

Back in 2000, the Mutation Events API was specified to make it easy for developers to react to changes in a DOM (e.g. DOMNodeRemoved, DOMAttrModified, etc).

This feature wasn’t widely used by web developers but it presented a very convenient and popular use case for Chrome Extensions if they wanted to perform some action when something in the page changed.

Mutation Events are useful, but at the same time they create some performance issues. The Events are slow and they are fired too frequently in a synchronous way, which causes some undesired browser bugs.

Introduced in the DOM4 specification, DOM Mutation Observers will replace Mutation Events. Whereas Mutation Events fire a slow, synchronous event for every single change, Mutation Observers are faster: they use a callback that can be delivered after multiple changes in the DOM.

You can manually handle the list of changes the API offers, or use a library such as Mutation Summary which makes this task easier and adds a layer of reliability about the changes that took place in the DOM.

You can start using Mutation Observers in Chrome Beta to detect changes in the DOM and be ready to use them when they come to stable (Chrome 18). If you are currently using the deprecated Mutation Events, just migrate to Mutation Observers.

Here’s an example of listing inserted nodes with Mutation Events:

var insertedNodes = [];
document.addEventListener("DOMNodeInserted", function(e) {
  insertedNodes.push(e.target);
}, false);
console.log(insertedNodes);

And here’s how it looks with Mutation Observers:

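var insertedNodes = [];
// Use the unprefixed constructor where available, else the WebKit-prefixed one.
var Observer = window.MutationObserver || window.WebKitMutationObserver;
var observer = new Observer(function(mutations) {
  mutations.forEach(function(mutation) {
    for (var i = 0; i < mutation.addedNodes.length; i++) {
      insertedNodes.push(mutation.addedNodes[i]);
    }
  });
});
// subtree: true reports insertions anywhere in the document, like DOMNodeInserted.
observer.observe(document, { childList: true, subtree: true });
console.log(insertedNodes);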
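Because the callback receives a batch of records rather than one event per change, it is often convenient to flatten the batch first. A minimal sketch, assuming a hypothetical summarizeMutations helper that only reads the addedNodes and removedNodes lists of each record:

```javascript
// Hypothetical helper: flatten a batch of mutation records into arrays of
// added and removed nodes. It only reads the addedNodes and removedNodes
// lists of each record, so it can be exercised with plain objects.
function summarizeMutations(mutations) {
  var summary = { added: [], removed: [] };
  mutations.forEach(function(mutation) {
    var i;
    for (i = 0; i < mutation.addedNodes.length; i++) {
      summary.added.push(mutation.addedNodes[i]);
    }
    for (i = 0; i < mutation.removedNodes.length; i++) {
      summary.removed.push(mutation.removedNodes[i]);
    }
  });
  return summary;
}

// Usage, only in a browser; the constructor may still be prefixed.
if (typeof window !== 'undefined') {
  var Observer = window.MutationObserver || window.WebKitMutationObserver;
  if (Observer) {
    new Observer(function(mutations) {
      var summary = summarizeMutations(mutations);
      console.log(summary.added.length + ' added, ' +
                  summary.removed.length + ' removed');
    }).observe(document, { childList: true, subtree: true });
  }
}
```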

Getting Gmail to handle all mailto: links with registerProtocolHandler

By Paul Irish at

If you use Gmail, you may get frustrated when you accidentally click a mailto: link and your desktop mail client (Outlook or Mail) starts up. Thanks to navigator.registerProtocolHandler() (which we've covered here before) you can wire up Gmail as your default mail client for all mailto: links in Chrome and Firefox.

Here's how:
  1. First, open up a Gmail tab. You must do this from the Gmail tab, not your html5rocks one. :)
  2. Open your JavaScript console (cmd-opt-j on Mac, ctrl-shift-j on Windows) and enter:

     navigator.registerProtocolHandler("mailto",
                                       "https://mail.google.com/mail/?extsrc=mailto&url=%s",
                                       "Gmail");

  3. Accept the confirmation from the browser.
  4. Click this mailto: link to test out your new Gmail mailto hookup!

Boom. Enjoy.

If you ever need to remove this setting, you can do that at chrome://settings/handlers in Chrome and Preferences->Applications->mailto in Firefox.


Pointer Lock API Brings FPS Games to the Browser

By Ilmari Heikkinen at

The Pointer Lock API recently landed in Chrome Canary and the Dev channel, all rejoice! Wait, what? You haven't heard of the Pointer Lock API? Well, in a nutshell, the Pointer Lock API makes it possible to write proper first-person shooters for the web.

The Chrome implementation lets a full-screen webpage ask your permission to capture the mouse pointer so that you can't move it outside the page. This lets web developers write 3D games and applications without having to worry about the mouse cursor moving outside of the page. When the pointer is locked, the pointer move events have movementX and movementY attributes defined that tell how much the mouse moved since the last move event. As usual with bleeding edge APIs, these attributes are vendor-prefixed, so you need to use webkitMovementX and suchlike.

To enable the Pointer Lock API in current Chrome builds, the easiest way is to go to about:flags and turn on the "Enable Pointer Lock"-flag. You can also turn it on by starting Chrome using the --enable-pointer-lock command line flag.

There are already a couple of cool demos out that take advantage of this feature. Check out the Quake 3 WebGL demo by Brandon Jones to see how the Pointer Lock API makes WebGL FPS games a viable prospect. Another cool demo is the WebGL Street Viewer.

To get started with the Pointer Lock API, here's a small snippet cribbed from MDN:

<button onclick="document.body.webkitRequestFullScreen();">No, you lock it up!</button>
<script>
navigator.pointer = navigator.pointer || navigator.webkitPointer;

var onError = function() {
  console.log("Mouse lock was not successful.");
};

document.addEventListener('webkitfullscreenchange', function(e) {
  if (document.webkitIsFullScreen) {
    navigator.pointer.lock(document.body, function() {
      // Locked and ready to play.
    }, onError);
  }
}, false);

document.body.addEventListener('webkitpointerlocklost', function(e) {
  console.log('Pointer lock lost!');
}, false);

document.body.addEventListener('mousemove', function(e) {
  if (navigator.pointer.isLocked) { // got a locked pointer
    var movementX = e.movementX || e.webkitMovementX;
    var movementY = e.movementY || e.webkitMovementY;
  }
}, false);
</script>
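The movement attributes are deltas, so a game typically accumulates them into a view direction. A rough sketch of that step follows. The camera object, the sensitivity factor, and the sign convention for pitch are all assumptions made for illustration; none of them come from the Pointer Lock API.

```javascript
// Hypothetical camera state for a first-person view, in radians.
function createCamera() {
  return { yaw: 0, pitch: 0 };
}

// Apply one mousemove event to the camera. Reads the unprefixed movement
// attributes first and falls back to the webkit-prefixed ones.
function applyMouseLook(camera, e, sensitivity) {
  var dx = e.movementX || e.webkitMovementX || 0;
  var dy = e.movementY || e.webkitMovementY || 0;
  var limit = Math.PI / 2; // keep the view from flipping over the top
  camera.yaw += dx * sensitivity;
  camera.pitch = Math.max(-limit,
      Math.min(limit, camera.pitch + dy * sensitivity));
  return camera;
}

// In the mousemove listener above, once the pointer is locked:
//   applyMouseLook(camera, e, 0.002);
```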

You can see a fuller example at html5-demos.com.

Registering a custom protocol handler

By Eric Bidelman at

Chrome 13 finally includes navigator.registerProtocolHandler. This API allows web apps to register themselves as possible handlers for particular protocols. For example, users could select your application to handle "mailto" links.

Register a protocol scheme like:

navigator.registerProtocolHandler(
    'web+mystuff', 'http://example.com/rph?q=%s', 'My App');

The first parameter is the protocol. The second is the URL pattern of the application that should handle this scheme. The pattern should include a '%s' as a placeholder for data and it must be on the same origin as the app attempting to register the protocol. The third is a title for the handler. Once the user approves access, you can use this link through your app, other sites, etc.:

<a href="web+mystuff:some+data">Open in "My App"</a>

Clicking that link makes a GET request to http://example.com/rph?q=web%2Bmystuff%3Asome%2Bdata. Thus, you have to parse the q parameter and manually strip out the data from the protocol.
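Parsing the q parameter could look something like the sketch below. extractProtocolData is a hypothetical helper, and it assumes the whole link arrives percent-encoded in q and that the standard URL object is available to do the decoding.

```javascript
// Hypothetical helper. Given the full URL of the request to the handler
// page, return the data that followed "scheme:" in the original link, or
// null if q is missing or uses a different scheme.
function extractProtocolData(requestUrl, scheme) {
  var q = new URL(requestUrl).searchParams.get('q'); // already decoded
  var prefix = scheme + ':';
  if (q === null || q.indexOf(prefix) !== 0) return null;
  return q.slice(prefix.length);
}

console.log(extractProtocolData(
    'http://example.com/rph?q=web%2Bmystuff%3Asome%2Bdata', 'web+mystuff'));
// logs "some+data"
```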

It's worth noting that Firefox has had navigator.registerProtocolHandler implemented since FF3. One difference in Chrome's implementation is around custom protocols. Those need to be prefixed with "web+", as seen in the example above. The following protocols do not need a "web+" prefix: "mailto", "mms", "nntp", "rtsp", "webcal".

More information on this API can be found on the MDN article.

Page Visibility API: Have I got your attention?

By Michael Mahemoff at

Multi-tab browsing is now the norm, so you can’t assume the user is watching your app just because it’s running. Fortunately, the new Page Visibility API (http://code.google.com/chrome/whitepapers/pagevisibility.html) lets your app discover if it’s visible or not. You could use the API to cut down on unnecessary network activity and computation.

document.webkitHidden is a boolean value indicating if the current page is hidden (you can try it now in the console if you’re using a recent build of Chromium). document.webkitVisibilityState will return a string indicating the current state, one of “visible”, “hidden”, and “prerender”. And a new webkitvisibilitychange event will fire when any of these changes, e.g. when the user opens your app’s tab, or moves away from it.

If you’re interested in giving this a whirl, check out visibility.js which adds a little bit of sugar on the API to make watching these interactions a bit more fun.
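If you would rather not pull in a library, the prefix handling can be sketched by hand. getVisibilityNames is a hypothetical helper. The unprefixed names it checks first (hidden, visibilityState, visibilitychange) are the standardized ones rather than anything from the Chromium build described above, and the polling messages are placeholders.

```javascript
// Hypothetical helper: work out which property and event names a document
// object exposes, preferring the unprefixed names.
function getVisibilityNames(doc) {
  if ('hidden' in doc) {
    return {
      hidden: 'hidden',
      state: 'visibilityState',
      event: 'visibilitychange'
    };
  }
  if ('webkitHidden' in doc) {
    return {
      hidden: 'webkitHidden',
      state: 'webkitVisibilityState',
      event: 'webkitvisibilitychange'
    };
  }
  return null; // no Page Visibility support
}

// Usage, only in a browser: cut down on work while the page is hidden.
if (typeof document !== 'undefined') {
  var names = getVisibilityNames(document);
  if (names) {
    document.addEventListener(names.event, function() {
      console.log(document[names.hidden] ? 'Pause polling' : 'Resume polling');
    }, false);
  }
}
```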

Contra in HTML5 + Web Audio API

By Eric Bidelman at

Thanks to the power of GWT, HTML5, and the Web Audio API we can build the originator of everyone's favorite cheat code, Contra: http://nes-sound.brad-rydzewski.appspot.com/

Check out the open source NES Emulator in HTML5, gwt-nes-port.

navigator.onLine in Chrome Dev channel

By Eric Bidelman at

With the offline APIs in HTML5, there's no excuse not to provide a flawless offline experience for users. One thing that can help this story is the navigator.onLine property, a feature that recently landed in the Chrome dev channel. This property returns true or false depending on whether or not the app has network connectivity:

if (navigator.onLine) {
  console.log('ONLINE!');
} else {
  console.log('Connection flaky');
}

A web app can also listen for online and offline events to determine when the connection is available again or when an app goes offline:

window.addEventListener('online', function(e) {
  // Re-sync data with server.
}, false);

window.addEventListener('offline', function(e) {
  // Queue up events for server.
}, false);
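The two comments above hint at a queue-and-resync pattern. One possible shape for it is sketched here; createOfflineQueue, send and isOnline are hypothetical names, and a real app would also need to handle a send that fails while the browser believes it is online.

```javascript
// Hypothetical queue: `send` is whatever function delivers one item to the
// server, and `isOnline` reports connectivity (navigator.onLine in a browser).
function createOfflineQueue(send, isOnline) {
  var pending = [];
  return {
    // Deliver immediately when online, otherwise hold the item.
    push: function(item) {
      if (isOnline()) {
        send(item);
      } else {
        pending.push(item);
      }
    },
    // Deliver everything that was held, oldest first.
    flush: function() {
      while (pending.length && isOnline()) {
        send(pending.shift());
      }
    },
    size: function() {
      return pending.length;
    }
  };
}

// Usage, only in a browser:
if (typeof window !== 'undefined') {
  var queue = createOfflineQueue(
      function(item) { console.log('Sending', item); },
      function() { return navigator.onLine; });
  window.addEventListener('online', function() { queue.flush(); }, false);
}
```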

I've posted a working demo at http://html5-demos.appspot.com/static/navigator.onLine.html and more information on offline events can be found on MDN.