How can I decode an HTML entity such as
&#39;
back to its original character?
In browsers we can create a DOM element to do the trick (see here), or we can use a library such as he.
In Node.js we can use a third-party library such as html-entities.
What if we want to use plain JavaScript to do the job?
There are many similar questions and useful answers on Stack Overflow, but I couldn't find an approach that works in both browsers and Node.js, so I'd like to share mine.
I have posted my approach as an answer below. I hope it helps someone. :)
This handles HTML entities such as
&lt;
&gt;
&#39;
and even numerically encoded Chinese characters.
I suggest using this function (inspired by some other answers):
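function decodeEntities(encodedString) {
    // Named entities handled here; anything else is left untouched.
    var translate = {
        "nbsp": " ",
        "amp": "&",
        "quot": "\"",
        "lt": "<",
        "gt": ">"
    };
    // A single pass over named and decimal entities, so "&amp;#39;"
    // becomes "&#39;" and is not decoded a second time.
    return encodedString.replace(/&(?:(nbsp|amp|quot|lt|gt)|#(\d+));/g, function(match, entity, numStr) {
        if (entity) {
            return translate[entity];
        }
        var num = parseInt(numStr, 10);
        // fromCodePoint also covers characters above U+FFFF;
        // out-of-range numbers are returned as written.
        return num <= 0x10FFFF ? String.fromCodePoint(num) : match;
    });
}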
This implementation also works in Node.js:
decodeEntities("&#21704;&#21704;&nbsp;&#39;&#36825;&#20010;&#39;&amp;&quot;&#37027;&#20010;&quot;&#22909;&#29609;&lt;&gt;") // 哈哈 '这个'&"那个"好玩<>
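The function above only recognizes decimal references. If the input may also contain hexadecimal references such as &#x27;, a variant along these lines might help. This is only a sketch: the name decodeEntitiesWithHex is hypothetical, it covers the same five named entities as above and nothing more, and it relies on String.fromCodePoint, which needs an ES2015-capable browser or Node.js.

```javascript
// Hypothetical helper, not part of the original answer.
function decodeEntitiesWithHex(encodedString) {
  // Same small table of named entities as decodeEntities.
  var named = { nbsp: " ", amp: "&", quot: "\"", lt: "<", gt: ">" };
  // One alternation: named entity | hex reference | decimal reference.
  var pattern = /&(?:(nbsp|amp|quot|lt|gt)|#[xX]([0-9a-fA-F]+)|#(\d+));/g;
  return encodedString.replace(pattern, function (match, name, hex, dec) {
    if (name) {
      return named[name];
    }
    var codePoint = hex ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references untouched instead of throwing a RangeError.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });
}

console.log(decodeEntitiesWithHex("&#x27;&#39;&lt;&#x1F600;&gt;")); // ''<😀>
```

Because everything is matched in one pass, the text produced by decoding &amp; is never scanned again, so "&amp;#39;" comes out as the literal text "&#39;" rather than an apostrophe.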
As a new user, I only have 1 reputation :(
I can't comment on or answer existing posts, so posting this is the only thing I can do for now.
Edit 1
I think this answer works even better than mine, although no one has upvoted it yet.