Java: scanning a string for a pattern

Markos Fragkakis · Mar 5, 2010 · Viewed 22.4k times

This is probably a quick one. Why does this code not print anything?

import java.util.Scanner;

public class MainClass {

    public static void main(String[] args) {
        try {
            Scanner sc = new Scanner("asda ASA adad");
            String pattern = "[A-Z]+";

            while (sc.hasNext(pattern)) {
                System.out.println(sc.next(pattern));
            }
            sc.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Answer

polygenelubricants · Mar 5, 2010

hasNext(String pattern) returns true only if the next token matches the pattern. In your case, "asda" is the next token, and that does NOT match "[A-Z]+", so the loop exits before anything is printed. The documentation is clear that "[the] scanner does not advance past any input", so hasNext never looks beyond that first token.

If you change the pattern to "[A-Za-z]+", then you'd get three tokens, which may be what you intended.
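To make the difference concrete, here is a minimal sketch that runs the loop from the question with both patterns. The class name BroadPatternDemo and the helper leadingMatches are made up for illustration; only the Scanner calls come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class, not part of the original question.
class BroadPatternDemo {

    // Same loop as in the question: it stops at the first token
    // that does not match the pattern as a whole.
    static List<String> leadingMatches(String input, String pattern) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext(pattern)) {
                tokens.add(sc.next(pattern));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "asda" fails "[A-Z]+", so the loop body never runs.
        System.out.println(leadingMatches("asda ASA adad", "[A-Z]+"));    // []
        // Every token matches "[A-Za-z]+".
        System.out.println(leadingMatches("asda ASA adad", "[A-Za-z]+")); // [asda, ASA, adad]
    }
}
```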

If in fact you only want to get tokens that match "[A-Z]+", then you can do any of the following:

  • simply discard non-matching tokens
  • useDelimiter("[^A-Z]+"), then simply invoke next()
  • use skip("[^A-Z]+")
  • use findInLine("[A-Z]+")
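The four options above could be sketched roughly as follows. The class and method names are hypothetical, and the skip variant uses "[^A-Z]*" instead of "[^A-Z]+" as an assumption on my part: skip throws NoSuchElementException when the pattern is not found at the current position, and a pattern that can match nothing avoids that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the four methods mirror the four bullets above.
class UppercaseTokenOptions {

    // 1. Walk every token and keep only those that match.
    static List<String> discardNonMatching(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext()) {
                if (sc.hasNext("[A-Z]+")) {
                    out.add(sc.next());
                } else {
                    sc.next(); // consume and drop the non-matching token
                }
            }
        }
        return out;
    }

    // 2. Treat every run of non-uppercase characters as a delimiter.
    static List<String> viaDelimiter(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input).useDelimiter("[^A-Z]+")) {
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    // 3. Skip ahead, then read. "[^A-Z]*" (star, not plus) is an assumption
    //    made here so skip() cannot throw when there is nothing to skip.
    static List<String> viaSkip(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            sc.skip("[^A-Z]*");
            while (sc.hasNext("[A-Z]+")) {
                out.add(sc.next());
                sc.skip("[^A-Z]*");
            }
        }
        return out;
    }

    // 4. Search the current line for the pattern, ignoring delimiters.
    static List<String> viaFindInLine(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            String match;
            while ((match = sc.findInLine("[A-Z]+")) != null) {
                out.add(match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "asda ASA adad";
        System.out.println(discardNonMatching(input)); // [ASA]
        System.out.println(viaDelimiter(input));       // [ASA]
        System.out.println(viaSkip(input));            // [ASA]
        System.out.println(viaFindInLine(input));      // [ASA]
    }
}
```

All four agree on the input from the question, but they are not interchangeable: the first works on whole tokens, while the others look at characters and can behave differently on tokens that mix upper and lower case. findInLine also only searches up to the next line separator.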

Tip: if performance is critical, you'd want to use the precompiled Pattern overloads of these methods.
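As a rough illustration of the Pattern overloads, the sketch below compiles the regex once and passes the Pattern object to hasNext(Pattern) and next(Pattern). The class name, the constant UPPER, and the helper upperTokens are hypothetical; whether this makes a measurable difference depends on your workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class for the precompiled-Pattern overloads.
class PrecompiledPatternDemo {

    // Compiled once and reused for every call.
    private static final Pattern UPPER = Pattern.compile("[A-Z]+");

    // Keeps the tokens that match UPPER and drops the rest.
    static List<String> upperTokens(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext()) {
                if (sc.hasNext(UPPER)) {
                    out.add(sc.next(UPPER));
                } else {
                    sc.next(); // discard the non-matching token
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upperTokens("asda ASA adad")); // [ASA]
    }
}
```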

Tip: do keep in mind that "Xooo ABC" has two "[A-Z]+" matches ("X" and "ABC"). If this is not what you want, then the regex will have to be a bit more complicated. Or you can always simply discard non-matching tokens.
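One possible reading of "a bit more complicated" is to add word boundaries, so that an uppercase run only counts when it is a whole word. That choice of regex is an assumption for illustration, not the only option; the class and method names below are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class contrasting a plain and a word-bounded pattern.
class WholeWordMatchDemo {

    // Collects every findInLine match of the given regex on a single line.
    static List<String> findAllInLine(String input, String regex) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            String match;
            while ((match = sc.findInLine(regex)) != null) {
                out.add(match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The plain pattern also picks up the "X" at the start of "Xooo".
        System.out.println(findAllInLine("Xooo ABC", "[A-Z]+"));       // [X, ABC]
        // With \b on both sides, "X" is rejected because "o" follows it.
        System.out.println(findAllInLine("Xooo ABC", "\\b[A-Z]+\\b")); // [ABC]
    }
}
```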