I have strings in Spanish and other languages that may contain generic special characters such as (, ), *, etc., which I need to remove. The problem is that they may also contain language-specific characters such as ñ, á, ó, í, etc., and those need to remain. I am trying to do it with a regexp the following way:
var desired = stringToReplace.replace(/[^\w\s]/gi, '');
Unfortunately, it removes all special characters, including the language-specific ones. I'm not sure how to avoid that. Could someone suggest a solution?
I would suggest using Steven Levithan's excellent XRegExp library and its Unicode plug-in.
Here's an example that strips every character that is neither whitespace nor a Latin-script character from a string: http://jsfiddle.net/b3awZ/1/
// Match runs of characters that are neither whitespace nor Latin-script characters.
var regex = XRegExp("[^\\s\\p{Latin}]+", "g");
var str = "¿Me puedes decir la contraseña de la Wi-Fi?";
var replaced = XRegExp.replace(str, regex, "");
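If pulling in a library is not an option, a similar result can probably be had with native Unicode property escapes, assuming your runtime supports the `u` flag with `\p{...}` (ES2018 and later). This is only a sketch of that alternative, not the XRegExp approach above; the helper name `stripNonLatin` is made up for illustration, and the NFC normalization step is an added assumption so that decomposed accents (a base letter followed by a combining mark) are not stripped.

```javascript
// Hypothetical helper (the name is ours, not part of XRegExp or any library):
// keep whitespace and Latin-script characters, drop everything else.
function stripNonLatin(input) {
  // NFC folds sequences such as "n" + combining tilde into a single "ñ",
  // so the accent is not removed as a separate non-Latin code point.
  var normalized = input.normalize("NFC");
  // \p{Script=Latin} requires the "u" flag; "g" replaces every match.
  return normalized.replace(/[^\s\p{Script=Latin}]+/gu, "");
}

var str = "¿Me puedes decir la contraseña de la Wi-Fi?";
console.log(stripNonLatin(str)); // "Me puedes decir la contraseña de la WiFi"
```

Note that digits and the underscore belong to the Common script, so unlike `\w` this pattern removes them too; adding `\p{N}` (and `_`) inside the character class should keep them if you need that.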
See also this answer by Steven Levithan himself: