Consider this string:
var s = "A\0Z";
Its length is 3, as given by s.length. Using console.log you can see the string isn't cut, that s[1] prints as "" (it is actually the one-character string "\0"), and that s.charCodeAt(1) is 0.
When you alert it in Firefox, you see AZ. When you alert it in Chrome/Linux using alert(s), the \0 terminates the string and you see A.
My question is: what should browsers and JavaScript engines do? Is Chrome buggy here? Is there a document defining what should happen?
As this is a question about the standard, a reference is needed.
What the browser should do is keep track of the string and its length separately, since the standard does not define any null terminator for strings. (A string is just a sequence of characters with a length.)
What Chrome seems to do (I am taking your word for this) is use the standard C string functions, which terminate at a \0. To answer one of your questions: yes, to me this constitutes a bug in Chrome's handling of the alert() function.
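The difference between the two approaches can be sketched in JavaScript itself. This is only an illustration: readAsCString and readWithLength are hypothetical helpers made up for this example, and they simulate the two strategies rather than show how any browser actually implements alert().

```javascript
// Hypothetical helper: mimics a C-style routine that stops at the first NUL.
function readAsCString(str) {
  const nul = str.indexOf("\0");
  return nul === -1 ? str : str.slice(0, nul);
}

// Hypothetical helper: reads exactly str.length code units,
// treating NUL like any other character.
function readWithLength(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    out += str[i];
  }
  return out;
}

const s = "A\0Z";
console.log(readAsCString(s).length);  // 1 (only "A" survives)
console.log(readWithLength(s).length); // 3 (the whole string survives)
```

The first helper reproduces the truncation described for Chrome; the second is what tracking the length separately amounts to.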
Formally the spec says:
A string literal is zero or more characters enclosed in single or double quotes. Each character may be represented by an escape sequence. All characters may appear literally in a string literal except for the closing quote character, backslash, carriage return, line separator, paragraph separator, and line feed. Any character may appear in the form of an escape sequence.
Also:
A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of character values (CV) contributed by the various parts of the string literal.
And regarding the NUL byte:
The CV [Character Value] of EscapeSequence :: 0 [lookahead ∉ DecimalDigit] is a <NUL> character (Unicode value 0000).
Therefore, a NUL character should simply be "yet another character value" and have no special meaning, as opposed to other languages where it might end an SV (String value).
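A quick way to check this in any engine is to treat the NUL like any other character and see that the ordinary string operations still work. A minimal sketch (the variable names sample and codes are just for illustration; the array output is shown as Node.js prints it):

```javascript
const sample = "A\0Z";

// Collect the code unit of every character, including the NUL.
const codes = Array.from(sample, (ch) => ch.charCodeAt(0));

console.log(sample.length);       // 3
console.log(codes);               // [ 65, 0, 90 ]
console.log(sample.indexOf("Z")); // 2
console.log(sample.split("\0"));  // [ 'A', 'Z' ]
console.log(sample === "A" + String.fromCharCode(0) + "Z"); // true
```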
For reference on the valid "String Single Character Escape Sequences", have a look at section 7.8.4 of the ECMAScript Language spec. There is a table at the end of that section listing the aforementioned escape sequences.
What someone aiming to write a JavaScript engine could probably learn from this: Don't use C/C++ string functions. :)