Use Java and RegEx to convert casing in a string

Andreas · May 5, 2010

Problem: Turn

"My Testtext TARGETSTRING My Testtext" 

into

"My Testtext targetstring My Testtext"

Perl supports the \L operation, which can be used in the replacement string to lowercase what follows.

Java's Pattern class does not support this operation:

Perl constructs not supported by this class: [...] The preprocessing operations \l \u, \L, and \U. https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html

Answer

polygenelubricants · May 5, 2010

You can't do this in Java regex. You'd have to manually post-process using String.toUpperCase() and toLowerCase() instead.

Here's an example of how to use regex to find the words of at least three characters in a sentence and convert them to uppercase:

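    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String text = "no way oh my god it cannot be";
    // \b\w{3,}\b matches whole words of three or more word characters
    Matcher m = Pattern.compile("\\b\\w{3,}\\b").matcher(text);

    StringBuilder sb = new StringBuilder();
    int last = 0; // index just past the previous match
    while (m.find()) {
        // copy the unmatched text between the previous match and this one
        sb.append(text.substring(last, m.start()));
        // copy the match itself, converted to uppercase
        sb.append(m.group(0).toUpperCase());
        last = m.end();
    }
    // copy whatever follows the last match
    sb.append(text.substring(last));

    System.out.println(sb.toString());
    // prints "no WAY oh my GOD it CANNOT be"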
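
Applied to the string from the question, the same loop might look like the sketch below. The class name CaseFix, the method name lowercaseAllCapsWords, and the pattern \b[A-Z]{2,}\b (any word made of two or more ASCII capitals) are assumptions for illustration, since the question does not say how the target is identified. Locale.ROOT is passed so the result does not depend on the default locale.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseFix {
    // Assumption: the "target" is any word consisting only of 2+ ASCII capitals.
    private static final Pattern ALL_CAPS = Pattern.compile("\\b[A-Z]{2,}\\b");

    // Hypothetical helper: lowercases every all-caps word, leaves the rest untouched.
    static String lowercaseAllCapsWords(String text) {
        Matcher m = ALL_CAPS.matcher(text);
        StringBuilder sb = new StringBuilder();
        int last = 0; // index just past the previous match
        while (m.find()) {
            sb.append(text, last, m.start());                  // text before the match
            sb.append(m.group().toLowerCase(Locale.ROOT));     // the match, lowercased
            last = m.end();
        }
        sb.append(text, last, text.length());                  // text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowercaseAllCapsWords("My Testtext TARGETSTRING My Testtext"));
        // prints "My Testtext targetstring My Testtext"
    }
}
```

"My" is left alone because the pattern requires at least two consecutive capitals bounded by word boundaries.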

Note on appendReplacement and appendTail

Note that the solution above calls substring and tracks the end of the previous match by hand. You can do without both by using Matcher.appendReplacement and appendTail:

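    Matcher m = Pattern.compile("\\b\\w{3,}\\b").matcher(text);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        // appends the text before the match, then the replacement
        m.appendReplacement(sb, m.group().toUpperCase());
    }
    // appends whatever follows the last match
    m.appendTail(sb);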
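
One caveat with appendReplacement: it interprets $ and \ in the replacement string as group references and escapes. That cannot bite with \w matches, but with other patterns it can throw or produce the wrong text. A minimal sketch of a safer variant, wrapping the replacement in Matcher.quoteReplacement, follows; the class name QuotedReplacement and the method name upperCaseMatches are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedReplacement {
    // Hypothetical helper: uppercases every match of the given regex in text.
    static String upperCaseMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String upper = m.group().toUpperCase(Locale.ROOT);
            // quoteReplacement escapes '$' and '\' so the text is inserted literally
            m.appendReplacement(sb, Matcher.quoteReplacement(upper));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The match "$usd" contains '$', which would be an illegal group
        // reference if passed to appendReplacement unquoted.
        System.out.println(upperCaseMatches("\\$\\w+", "price is $usd today"));
        // prints "price is $USD today"
    }
}
```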

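Note that sb is now a StringBuffer instead of a StringBuilder. Before Java 9, Matcher had no StringBuilder overloads of these methods, so on older versions you're stuck with the synchronized, and typically slower, StringBuffer. Java 9 added appendReplacement(StringBuilder, String) and appendTail(StringBuilder).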

Whether the loss in efficiency is worth the gain in readability is up to you.
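
If Java 9 or later is available (an assumption; the answer predates it), Matcher.replaceAll also accepts a function from MatchResult to the replacement string, which removes the explicit loop altogether. The sketch below applies it to the string from the question. The class name FunctionalReplace, the method name, and the all-caps pattern are hypothetical choices, not something the question specifies.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FunctionalReplace {
    // Assumption: the "target" is any word consisting only of 2+ ASCII capitals.
    private static final Pattern ALL_CAPS = Pattern.compile("\\b[A-Z]{2,}\\b");

    // Requires Java 9+: uses Matcher.replaceAll(Function<MatchResult, String>).
    static String lowercaseAllCapsWords(String text) {
        return ALL_CAPS.matcher(text).replaceAll(
                // the function's result is still parsed as a replacement string,
                // so it is quoted to keep '$' and '\' literal
                mr -> Matcher.quoteReplacement(mr.group().toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(lowercaseAllCapsWords("My Testtext TARGETSTRING My Testtext"));
        // prints "My Testtext targetstring My Testtext"
    }
}
```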
