Regular expressions

A regular expression provides a compact description of a set of strings without having to list every element. There are several flavors of regexp: basic (BRE), extended (ERE), Perl-compatible (PCRE), and other, less important variants tied to particular programming languages and applications. In BRE the metacharacters `?`, `+`, `{`, `|`, `(`, `)` lose their special meaning; the backslashed versions are used instead: `\?`, `\+`, `\{`, `\|`, `\(`, and `\)` (of these, `\?`, `\+` and `\|` are GNU extensions). See the reference table below.

Anchors

`^`	beginning of line;
`$`	end of line;
`\<`	left word boundary;
`\>`	right word boundary;
`\b`	word boundary;
`\B`	not a word boundary;
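
The anchors can be tried out with a short sketch. It uses Python's `re` module, which is an assumption of this example (the tools covered here are `grep`, `sed`, `awk`, `perl` and `vi`); `re` follows the Perl-style syntax, so `^`, `$`, `\b` and `\B` behave as listed, while `\<` and `\>` are not supported there. The sample string is made up.

```python
import re

text = "cat concat cats"

# ^ and $ anchor to the start and end of the string
# (or of each line when re.MULTILINE is given).
starts = bool(re.search(r"^cat", text))   # the string starts with "cat"
ends = bool(re.search(r"cat$", text))     # the string ends with "cats", not "cat"

# \b matches at a word boundary, \B at any position that is not one.
whole_words = re.findall(r"\bcat\b", text)
inside_words = [m.start() for m in re.finditer(r"\Bcat", text)]

print(starts, ends, whole_words, inside_words)
```

Only the first word is matched by `\bcat\b`; the `cat` inside `concat` is the only one found by `\Bcat`.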

Quantifiers

`?`	0 or 1;
`*`	0 or more;
`+`	1 or more;
`{n}`	exactly `n`;
`{n,}`	`n` or more;
`{,m}`	at most `m`;
`{n,m}`	at least `n`, but no more than `m`;

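By default `?`, `*`, `+` are greedy quantifiers (i.e., they match as much as possible). To make them lazy (matching as little as possible), append `?` (`??`, `*?`, `+?`). Lazy quantifiers are available in Perl-compatible flavors, not in POSIX BRE or ERE.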
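
The difference between greedy and lazy matching is easiest to see on a concrete string. The sketch below assumes Python's `re` module (a Perl-style flavor, chosen here only for illustration); the input strings are made up.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs as far as it can, up to the last ">" in the string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", html)

# Bounded repetition: runs of exactly two or three digits.
digits = re.findall(r"\b\d{2,3}\b", "7 42 123 4567")

print(greedy)
print(lazy)
print(digits)
```

The greedy pattern returns the whole string as a single match, while the lazy one returns the four tags separately.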

Alternation

`|`	separates alternative patterns to be matched (`\|` in GNU BRE);

Most chars are treated as literals (they match only themselves). Any metachar with special meaning may be quoted by preceding it with a backslash.
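
A minimal sketch of quoting, again assuming Python's `re` module as the engine. `re.escape` is a helper of that module (not something the tools above provide) that backslash-quotes every metacharacter in a literal string.

```python
import re

line = "total: 3.50 (approx.)"

# An unquoted "." matches any character, so "3.50" also matches "3x50".
loose = bool(re.search(r"3.50", "3x50"))
# A quoted "\." matches only a literal dot.
strict = bool(re.search(r"3\.50", "3x50"))

# Build a pattern that matches the literal text "(approx.)".
pattern = re.escape("(approx.)")
found = re.search(pattern, line).group(0)

print(loose, strict, pattern, found)
```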

Matches (Unix-style)

`.`	any single character;
`[...]`	any single character contained within `[ ]`;
`[^...]`	any single character not contained within `[ ]`;
`\d`	any single digit;
`\D`	any single non-digit;
`\s`	any single whitespace (space, `\t`, `\v`, `\n`, `\r`, `\f`);
`\S`	any single non-whitespace;
`\w`	any single word character (alphanumeric or `_`);
`\W`	any single non-word character;
`\c`	control character (example: `\c[` matches Esc);
`\n`	newline;
`\r`	carriage return;
`\t`	Tab (horizontal Tab);
`\v`	vertical Tab;
`\( \)`	define a marked subexpression (`( )` in ERE and Perl);
`\n`	where `n` is a digit (1..9); matches what the `n`th marked subexpression matched;
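
Marked subexpressions and back-references can be combined to find a repeated word. This is a sketch in Python's `re` syntax (an assumption of the example), where groups are written `( )` rather than `\( \)`; the sample sentence is made up.

```python
import re

# (\w+) marks a word as subexpression 1; \1 must then match the same text again.
doubled = re.compile(r"\b(\w+)\s+\1\b")

text = "this is is a test"
m = doubled.search(text)

whole = m.group(0)   # everything the pattern matched
word = m.group(1)    # what the first marked subexpression matched

# In the replacement string, \1 refers to the same subexpression.
fixed = doubled.sub(r"\1", text)

print(whole, "|", word, "|", fixed)
```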

Matches (POSIX-style)

`[:alnum:]`	any alphanumeric character (`[0-9A-Za-z]`);
`[:alpha:]`	any alpha character (`[A-Za-z]`);
`[:blank:]`	space or Tab;
`[:cntrl:]`	any control character;
`[:digit:]`	any digit (`[0-9]`);
`[:graph:]`	any visible character (printable, excluding space);
`[:lower:]`	any lowercase character;
`[:print:]`	any printable character;
`[:punct:]`	any punctuation character;
`[:space:]`	any whitespace (space, `\t`, `\n`, `\r`, `\f`, `\v`);
`[:upper:]`	any uppercase character;
`[:xdigit:]`	any hexadecimal digit;

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these rules.
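
The precedence rules can be checked with a few whole-string matches. The sketch assumes Python's `re` module, whose precedence follows the same order; `re.fullmatch` succeeds only if the entire string matches.

```python
import re

def matches(pattern, s):
    """True if the whole string s matches the pattern."""
    return re.fullmatch(pattern, s) is not None

# Repetition binds tighter than concatenation: "ab*" means a(b*), not (ab)*.
r1 = matches(r"ab*", "abbb")
r2 = matches(r"ab*", "abab")
r3 = matches(r"(ab)*", "abab")

# Concatenation binds tighter than alternation: "a|bc" means a|(bc), not (a|b)c.
r4 = matches(r"a|bc", "ac")
r5 = matches(r"a|bc", "bc")
r6 = matches(r"(a|b)c", "ac")

print(r1, r2, r3, r4, r5, r6)
```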

	`.`	`[ ]`	`^`	`$`	`\( \)`	`\{ \}`	`?`	`+`	`|`	`( )`
`awk`	`x`	`x`	`x`	`x`			`x`	`x`	`x`	`x`
`grep`	`x`	`x`	`x`	`x`	`x`	`x`
`egrep`	`x`	`x`	`x`	`x`	`x`		`x`	`x`	`x`	`x`
`fgrep`	`x`	`x`	`x`	`x`	`x`
`sed`	`x`	`x`	`x`	`x`	`x`	`x`
`perl`	`x`	`x`	`x`	`x`	`x`		`x`	`x`	`x`	`x`
`vi`	`x`	`x`	`x`	`x`	`x`

Examples

`/^$/`	an empty line;
`/./`	a line with at least one char;
`/^/`	all lines;
`/thing/`	`thing` somewhere in the line;
`/^thing/`	`thing` at the beginning of the line;
`/thing$/`	`thing` at the end of the line;
`/^thing$/`	a line consisting of `thing` only;
`/thing.$/`	`thing` followed by exactly one character at the end of the line;
`/thing\.$/`	`thing.` at the end of the line;
`/\/thing\//`	`/thing/` somewhere in the line;
`/[tT]hing/`	`thing` or `Thing`;
`/thing[0-9]/`	`thing` followed by one digit;
`/thing[^0-9]/`	`thing` followed by a non-digit;
`/tele(f|ph)one/`	`telefone` or `telephone` (ERE syntax);
`/thing[0-9][^0-9]/`	`thing` followed by a digit and then a non-digit;
`/thing1.*thing2/`	`thing1` followed by any chars (possibly none), then `thing2`;
`/^thing1.*thing2$/`	`thing1` at the beginning, `thing2` at the end;
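
Several of the example patterns can be checked against a handful of sample lines. This is a sketch that assumes Python's `re` module; the sample lines and the helper `matching` are made up for illustration. Python has no `/…/` delimiters, so the slashes are dropped and a literal `/` needs no backslash.

```python
import re

# Sample lines are made up for illustration.
lines = ["", "thing", "Thing one", "thing.", "thing7x",
         "a /thing/ here", "thing1 and thing2"]

def matching(pattern):
    """Return the sample lines in which the pattern matches somewhere."""
    return [line for line in lines if re.search(pattern, line)]

empty = matching(r"^$")
only_thing = matching(r"^thing$")
trailing_dot = matching(r"thing\.$")
slashes = matching(r"/thing/")
digit_then_other = matching(r"thing[0-9][^0-9]")
both_ends = matching(r"^thing1.*thing2$")

for name, result in [("empty", empty), ("only_thing", only_thing),
                     ("trailing_dot", trailing_dot), ("slashes", slashes),
                     ("digit_then_other", digit_then_other),
                     ("both_ends", both_ends)]:
    print(name, result)
```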
