Replace non-ASCII characters in a string

rahulsri · Dec 15, 2011 · Viewed 130.6k times

I have strings such as "A função" and "Ãugent" in which I need to replace characters like ç, ã, and Ã with empty strings.

How can I match only those non-ASCII characters?

I am using this function:

public static String matchAndReplaceNonEnglishChar(String tmpsrcdta) {
    String newsrcdta = null;
    char array[] = Arrays.stringToCharArray(tmpsrcdta);
    if (array == null)
        return newsrcdta;

    for (int i = 0; i < array.length; i++) {
        int nVal = (int) array[i];
        boolean bISO =
                // Is character ISO control
                Character.isISOControl(array[i]);
        boolean bIgnorable =
                // Is Ignorable identifier
                Character.isIdentifierIgnorable(array[i]);
        // Remove tab and other unwanted characters..
        if (nVal == 9 || bISO || bIgnorable)
            array[i] = ' ';
        else if (nVal > 255)
            array[i] = ' ';
    }
    newsrcdta = Arrays.charArrayToString(array);

    return newsrcdta;
}

But it is not working properly. What improvement does it need? I also have another problem: the unwanted characters are replaced with a space character, which leaves extra spaces in the final string.

Answer

FailedDev · Dec 15, 2011

This will find and remove all non-ASCII characters, that is, every character outside the range 0x00 to 0x7F:

String resultString = subjectString.replaceAll("[^\\x00-\\x7F]", "");
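As for why the loop in the question misses these characters: ç, ã, and Ã have the values 231, 227, and 195, so the check nVal > 255 never catches them; the ASCII range ends at 127. The following is a minimal sketch, not a definitive implementation, that wraps the regex above in a method and also shows a loop that skips unwanted characters instead of writing a space, so no extra spaces appear. The class name AsciiFilter and both method names are made up for illustration, and the sample strings are written with Unicode escapes so the file compiles regardless of source encoding.

```java
final class AsciiFilter {

    // Regex approach: delete every character outside 0x00-0x7F.
    static String stripNonAscii(String input) {
        if (input == null) {
            return null;
        }
        return input.replaceAll("[^\\x00-\\x7F]", "");
    }

    // Loop approach: copy only ASCII characters into a new buffer.
    // Skipping (rather than overwriting with ' ') avoids leftover spaces.
    static String stripNonAsciiLoop(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c <= 0x7F) { // ASCII ends at 127, not 255
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "A fun\u00e7\u00e3o" is "A função"; "\u00c3ugent" is "Ãugent"
        System.out.println(stripNonAscii("A fun\u00e7\u00e3o"));     // prints "A funo"
        System.out.println(stripNonAsciiLoop("\u00c3ugent"));         // prints "ugent"
    }
}
```

Both methods return the same result for these inputs. The regex version is shorter, while the loop avoids compiling a pattern on every call.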