Java: Basic regular expression summary

These are the main regular expression characters that you should know.

Character classes

Character classes provide a way to specify a set of characters, any one of which may match. The set can be listed explicitly inside []. It can also be defined by what must not be in it, by beginning the set with a caret, "^". There are a number of predefined sets (e.g., \d, \s). The hyphen, "-", can be used to indicate a range of character values. Although a character class matches only one character, a quantifier following it can be used to match multiple characters.
[abc] a, b, or c (simple class)
[^abc] Any character except a, b, or c (negation)
[a-zA-Z] a through z or A through Z, inclusive (range)
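
As a rough illustration of the three forms above, here is a minimal sketch. The class name CharClassDemo and the helper matchesWhole are made up for this example; Pattern.matches is the standard library call, and it succeeds only if the whole input matches.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true only if the ENTIRE input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("[abc]", "b"));        // true: b is in the set
        System.out.println(matchesWhole("[^abc]", "b"));       // false: b is excluded
        System.out.println(matchesWhole("[a-zA-Z]", "Q"));     // true: Q is in the range
        System.out.println(matchesWhole("[a-zA-Z]", "Java"));  // false: a class matches one character
        System.out.println(matchesWhole("[a-zA-Z]+", "Java")); // true: the quantifier repeats the class
    }
}
```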
 
Predefined character classes
. Any character (matches line terminators only if DOTALL mode is enabled)
\d A digit: [0-9]
\D A non-digit: [^0-9]
\s A whitespace character: [ \t\n\x0B\f\r]
\S A non-whitespace character: [^\s]
\w A word character: [a-zA-Z_0-9]
\W A non-word character: [^\w]
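
One thing the table does not show is that inside a Java string literal each backslash must be doubled, so the regex \d is written "\\d". The sketch below assumes nothing beyond the standard String methods; the class and method names are invented for illustration.

```java
public class PredefinedClassDemo {
    // Remove every non-digit (\D), keeping only the digits.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Split on runs of whitespace (\s+) after trimming the ends.
    static String[] splitOnWhitespace(String s) {
        return s.trim().split("\\s+");
    }

    // True if the whole string consists of word characters (\w).
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("Tel: 555-0199"));                         // 5550199
        System.out.println(String.join(",", splitOnWhitespace("  a \t b\nc "))); // a,b,c
        System.out.println(isWord("user_42"));                                   // true
        System.out.println(isWord("user 42"));                                   // false
    }
}
```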

Quantifiers (repeating the previous element)

 
Greedy quantifiers - Expand as much as possible
X? X, once or not at all
X* X, zero or more times
X+ X, one or more times
X{n} X, exactly n times
X{n,} X, at least n times
X{n,m} X, at least n but not more than m times
 
Reluctant quantifiers - Expand only if forced by later failure to match
X?? X, once or not at all
X*? X, zero or more times
X+? X, one or more times
X{n}? X, exactly n times
X{n,}? X, at least n times
X{n,m}? X, at least n but not more than m times
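
The difference between greedy and reluctant quantifiers is easiest to see on the same input. This is only a sketch: QuantifierDemo and firstMatch are hypothetical names, and the input string is an arbitrary sample.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: returns the first substring matched by regex, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>one</b><b>two</b>";
        // Greedy: .+ grabs as much as it can, then backs off to the LAST '>'.
        System.out.println(firstMatch("<.+>", input));      // <b>one</b><b>two</b>
        // Reluctant: .+? grows one character at a time and stops at the FIRST '>'.
        System.out.println(firstMatch("<.+?>", input));     // <b>
        // The same idea with a counted range.
        System.out.println(firstMatch("a{2,3}", "aaaa"));   // aaa
        System.out.println(firstMatch("a{2,3}?", "aaaa"));  // aa
    }
}
```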

Boundary matchers - Zero-width matches.

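^ The beginning of a line (in MULTILINE mode; by default, only the beginning of the input). Very useful.
$ The end of a line (in MULTILINE mode; by default, only the end of the input). Very useful. In MULTILINE mode, ^$ matches all empty lines.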
\b A word boundary
\B A non-word boundary
\A The beginning of the input
\G The end of the previous match
\Z The end of the input but for the final terminator, if any
\z The end of the input
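
In Java, ^ and $ match at line boundaries only when the pattern is compiled with Pattern.MULTILINE; without it they anchor to the whole input. A small sketch of that behaviour, plus \b, follows. The class name and the count helper are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: counts matches of regex (compiled with flags) in input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "one two\nthree";
        // Default mode: ^ matches only at the start of the input.
        System.out.println(count("^\\w+", 0, text));                    // 1
        // MULTILINE: ^ also matches just after each line terminator.
        System.out.println(count("^\\w+", Pattern.MULTILINE, text));    // 2
        // ^$ finds the empty line between "a" and "b" only in MULTILINE mode.
        System.out.println(count("^$", Pattern.MULTILINE, "a\n\nb"));   // 1
        System.out.println(count("^$", 0, "a\n\nb"));                   // 0
        // \b keeps "cat" from matching inside "concat" or "cats".
        System.out.println(count("\\bcat\\b", 0, "cat concat cats"));   // 1
    }
}
```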

Other

 
Logical operators
XY X followed by Y
X|Y Either X or Y
 
Grouping - Parentheses both group and create a numbered element that can be used later.
(X) X. This capturing group is remembered so it can be referenced later. Numbered starting at 1.
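
Alternation and numbered groups might be used as in the sketch below. Groups are read back with Matcher.group(n), referenced in a replacement string as $n, and referenced inside the pattern itself as \n. All class and method names here are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // X|Y: the whole string must be either "cat" or "dog".
    static boolean isPet(String s) {
        return s.matches("cat|dog");
    }

    // Group 1 captures the four-digit year of a yyyy-mm-dd date.
    static String year(String isoDate) {
        Matcher m = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})").matcher(isoDate);
        return m.matches() ? m.group(1) : null;
    }

    // $1 and $2 in the replacement refer to the captured groups.
    static String swapNames(String s) {
        return s.replaceAll("(\\w+), (\\w+)", "$2 $1");
    }

    // \1 inside the pattern must match the same text group 1 captured.
    static String firstDoubledWord(String s) {
        Matcher m = Pattern.compile("\\b(\\w+) \\1\\b").matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(isPet("dog"));                            // true
        System.out.println(year("2024-03-15"));                      // 2024
        System.out.println(swapNames("Doe, John"));                  // John Doe
        System.out.println(firstDoubledWord("it was the the best")); // the
    }
}
```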
 
Quotation
\ Nothing, but quotes the following character.
 
Characters
x The character x
\\ The backslash character
\t The tab character ('\u0009')
\n The newline (line feed) character ('\u000A')
\r The carriage-return character ('\u000D')
\f The form-feed character ('\u000C')
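
Quoting matters most when a metacharacter such as "." should be taken literally. The sketch below shows a common pitfall with String.split, the use of Pattern.quote to quote a whole literal, and the four backslashes needed in Java source to write the regex \\. The class and method names are assumptions for this example.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Unescaped "." matches every character, so nothing is left between delimiters.
    static int partsUnescaped(String s) {
        return s.split(".").length;
    }

    // "\\." in Java source is the regex \. which matches a literal dot.
    static int partsEscaped(String s) {
        return s.split("\\.").length;
    }

    // Pattern.quote wraps the text so every character in it is taken literally.
    static boolean literalMatch(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    // The regex \\ (one literal backslash) is written "\\\\" in Java source.
    static String[] splitOnBackslash(String path) {
        return path.split("\\\\");
    }

    public static void main(String[] args) {
        System.out.println(partsUnescaped("a.b"));                       // 0
        System.out.println(partsEscaped("a.b"));                         // 2
        System.out.println(Pattern.matches("a.b", "axb"));               // true
        System.out.println(literalMatch("a.b", "axb"));                  // false
        System.out.println(splitOnBackslash("C:\\temp\\log.txt").length); // 3
    }
}
```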