How to match Cyrillic characters with a regular expression

Greg Finzer · Nov 11, 2009

How do I match French and Russian Cyrillic alphabet characters with a regular expression? I only want to match alphabetic characters, not digits or special characters. Right now I have:

[A-Za-z]

Answer

Pedro Lobito · Jun 14, 2011

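If your regex flavor supports Unicode property escapes, you can match Cyrillic characters with a block or script class. The exact name depends on the flavor: \p{IsCyrillic} is the .NET block syntax, while \p{Cyrillic} is the script name used by PCRE and Perl.

[\p{IsCyrillic}] or [\p{Cyrillic}]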

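Otherwise, spell out the code point range of the Cyrillic block (U+0400–U+04FF). In flavors that accept \uXXXX escapes, such as JavaScript, Java, .NET, and Python, that is:

[\u0400-\u04FF]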

For PHP (PCRE, with the /u modifier so the pattern and subject are treated as UTF-8) use:

[\x{0400}-\x{04FF}]
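
To illustrate the explicit-range approach, here is a minimal sketch in Python. It assumes the standard re module, which (in the versions I have used) does not accept \p{...} but does understand \uXXXX escapes inside a pattern. The names CYRILLIC_RUN, is_cyrillic_only and find_cyrillic_runs are made up for this example.

```python
import re

# Explicit code point range for the Unicode "Cyrillic" block (U+0400-U+04FF).
# The "+" makes the pattern match runs of one or more such characters.
CYRILLIC_RUN = re.compile(r"[\u0400-\u04FF]+")


def is_cyrillic_only(text):
    """True if text is non-empty and every character lies in U+0400-U+04FF."""
    return CYRILLIC_RUN.fullmatch(text) is not None


def find_cyrillic_runs(text):
    """Return every maximal run of Cyrillic-block characters found in text."""
    return CYRILLIC_RUN.findall(text)


if __name__ == "__main__":
    print(is_cyrillic_only("Привет"))
    print(is_cyrillic_only("Privet"))
    print(find_cyrillic_runs("Hello Привет, мир!"))
```

Using fullmatch rather than match or search is what rejects mixed strings such as "Привет1", since the whole input has to fit the class.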

Explanation:

[\p{IsCyrillic}]

Matches a single character from the Unicode block “Cyrillic” (U+0400–U+04FF).
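
One caveat if you really want letters only: the block is not made up purely of letters. It also holds a symbol (U+0482, the Cyrillic thousands sign) and some combining marks just after it. The sketch below is one possible way to check for that in Python, using the standard unicodedata module. The function names are hypothetical, and the exact categories reported depend on the Unicode version bundled with your Python.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x0400, 0x04FF  # Unicode "Cyrillic" block


def cyrillic_block_non_letters():
    """List (code point, category) pairs in the block whose category is not L*."""
    return [
        (f"U+{cp:04X}", unicodedata.category(chr(cp)))
        for cp in range(BLOCK_START, BLOCK_END + 1)
        if not unicodedata.category(chr(cp)).startswith("L")
    ]


def is_cyrillic_letters_only(text):
    """True if text is non-empty and every character is a letter in the block."""
    return bool(text) and all(
        BLOCK_START <= ord(ch) <= BLOCK_END
        and unicodedata.category(ch).startswith("L")
        for ch in text
    )


if __name__ == "__main__":
    for code_point, category in cyrillic_block_non_letters():
        print(code_point, category)
```

If those few code points never show up in your input, the plain block range is probably good enough. This stricter check only matters when the "no special characters" requirement has to hold exactly.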

Note:

Unicode character list and numeric HTML entities for U+0400–U+04FF.