I am working in a CMS which allows users to enter content. The problem is that when they add symbols ®
, it may not display well in all browsers. I would like to set up a list of symbols that must be searched for, and then converted to the corresponding html entity. For example
® => ®
& => &
© => ©
™ => ™
After the conversion, it needs to be wrapped in a <sup>
tag, resulting in this:
®
=> <sup>®</sup>
Because a particular font size and padding style is necessary:
sup { font-size: 0.6em; padding-top: 0.2em; }
Would the JavaScript be something like this?
var regs = document.querySelectorAll('®');
for ( var i = 0, l = imgs.length; i < l; ++i ) {
var [?] = regs[i];
var [?] = document.createElement('sup');
img.parentNode.insertBefore([?]);
div.appendChild([?]);
}
Where "[?]" means that there is something that I am not sure about.
Additional Details:
You can use regex to replace any character in a given unicode range with its html entity equivalent. The code would look something like this:
var encodedStr = rawStr.replace(/[\u00A0-\u9999<>\&]/g, function(i) {
return '&#'+i.charCodeAt(0)+';';
});
This code will replace all characters in the given range (unicode 00A0 - 9999, as well as ampersand, greater & less than) with their html entity equivalents, which is simply &#nnn;
where nnn
is the unicode value we get from charCodeAt
.
See it in action here: http://jsfiddle.net/E3EqX/13/ (this example uses jQuery for element selectors used in the example. The base code itself, above, does not use jQuery)
Making these conversions does not solve all the problems -- make sure you're using UTF8 character encoding, make sure your database is storing the strings in UTF8. You still may see instances where the characters do not display correctly, depending on system font configuration and other issues out of your control.
Documentation
String.charCodeAt
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt