The following will replace ASCII control characters (\p{Cntrl} is shorthand for [\x00-\x1F\x7F]):
my_string.replaceAll("\\p{Cntrl}", "?");
The following will replace every character that is not printable ASCII (\p{Print} is shorthand for [\p{Graph}\x20]), which includes accented characters:
my_string.replaceAll("[^\\p{Print}]", "?");
However, neither works for Unicode strings. Does anyone have a good way to remove non-printable characters from a Unicode string?
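To make the problem concrete, here is a minimal sketch of how the two ASCII-only patterns behave on non-ASCII input. The class name AsciiClassDemo, its helper methods, and the sample strings are hypothetical and chosen only for illustration.

```java
class AsciiClassDemo {
    // Replaces everything outside printable ASCII, so accented letters are lost too.
    static String maskNonPrintAscii(String s) {
        return s.replaceAll("[^\\p{Print}]", "?");
    }

    // Replaces only ASCII control characters; Unicode format characters slip through.
    static String maskAsciiControl(String s) {
        return s.replaceAll("\\p{Cntrl}", "?");
    }

    public static void main(String[] args) {
        // U+00E9 is a printable accented letter, but it is not ASCII.
        System.out.println(maskNonPrintAscii("caf\u00e9")); // caf?

        // U+200B (zero width space) is invisible but is not an ASCII control character.
        String zeroWidth = "a\u200Bb";
        System.out.println(maskAsciiControl(zeroWidth).equals(zeroWidth)); // true
    }
}
```

The first pattern is too aggressive for Unicode text and the second is too narrow, which is the gap the question is about.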
my_string.replaceAll("\\p{C}", "?");
See more about Unicode regex. java.util.regex.Pattern/String.replaceAll support Unicode categories such as \p{C}.
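As a rough illustration of the \p{C} replacement, the sketch below runs it on a string that mixes an accented letter with invisible characters. The class name UnicodeOtherDemo, the helper maskOther, and the sample input are hypothetical.

```java
class UnicodeOtherDemo {
    // \p{C} is the Unicode general category "Other": control, format,
    // surrogate, private-use, and unassigned code points.
    static String maskOther(String s) {
        return s.replaceAll("\\p{C}", "?");
    }

    public static void main(String[] args) {
        // U+00E9 (accented letter), U+200B (zero width space), U+0007 (bell)
        String input = "caf\u00e9\u200B\u0007!";
        System.out.println(maskOther(input)); // café??!
    }
}
```

The accented letter is kept, while the zero width space and the bell character are both replaced. Note that tab and newline are control characters as well, so \p{C} matches them too; if they should be kept, a pattern such as [\p{C}&&[^\t\n\r]] is one possible variation.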