No visible cause for "Unexpected token ILLEGAL"

javascript syntax-error illegal-characters

bfavaretto · Oct 4, 2012 · Viewed 301.5k times · Source

I'm getting this JavaScript error on my console:

Uncaught SyntaxError: Unexpected token ILLEGAL

This is my code:

It's super simple, as you can see. How could it be causing a syntax error?

Answer

The error

When code is parsed by the JavaScript interpreter, it gets broken into pieces called "tokens". When a token cannot be classified into one of the four basic token types, it gets labelled "ILLEGAL" on most implementations, and this error is thrown.

The same error is raised if, for example, you try to run a js file with a rogue @ character, a misplaced curly brace, bracket, "smart quotes", single quotes not enclosed properly (e.g. this.run('dev1)) and so on.

A lot of different situations can cause this error. But if you don't have any obvious syntax error or illegal character, it may be caused by an invisible illegal character. That's what this answer is about.

But I can't see anything illegal!

There is an invisible character in the code, right after the semicolon. It's the Unicode U+200B Zero-width space character (a.k.a. ZWSP, HTML entity ). That character is known to cause the Unexpected token ILLEGAL JavaScript syntax error.

And where did it come from?

I can't tell for sure, but my bet is on jsfiddle. If you paste code from there, it's very likely to include one or more U+200B characters. It seems the tool uses that character to control word-wrapping on long strings.

UPDATE 2013-01-07

After the latest jsfiddle update, it's now showing the character as a red dot like codepen does. Apparently, it's also not inserting U+200B characters on its own anymore, so this problem should be less frequent from now on.

UPDATE 2015-03-17

Vagrant appears to sometimes cause this issue as well, due to a bug in VirtualBox. The solution, as per this blog post is to set sendfile off; in your nginx config, or EnableSendfile Off if you use Apache.

It's also been reported that code pasted from the Chrome developer tools may include that character, but I was unable to reproduce that with the current version (22.0.1229.79 on OSX).

How can I spot it?

The character is invisible, do how do we know it's there? You can ask your editor to show invisible characters. Most text editors have this feature. Vim, for example, displays them by default, and the ZWSP shows as <u200b>. You can also debug it online: jsbin displays the character as a red dot on its code panes (but seems to remove it after saving and reloading the page). CodePen.io also displays it as a dot, and keeps it even after saving.

The ECMAScript Specification

I couldn't find any mention to that specific character on the ECMAScript Specification (versions 3 and 5.1). The current version mentions similar characters (U+200C and U+200D) on Section 7.1, which says they should be treated as IdentifierParts when "outside of comments, string literals, and regular expression literals". Those characters may, for example, be part of a variable name (and var x\u200c; indeed works).

Section 7.2 lists the valid White space characters (such as tab, space, no-break space, etc.), and vaguely mentions that any other Unicode “space separator” (category “Zs”) should be treated as white space. I'm probably not the best person to discuss the specs in this regard, but it seems to me that U+200B should be considered white space according to that, when in fact the implementations (at least Chrome and Firefox) appear to treat them as an unexpected token (or part of one), causing the syntax error.