Talk:Regular expression examples/sandbox


Trying to come up with a language-independent version of this article...


Metacharacter(s)	Description	Example string	Regex	Match	Notes
`.`	Normally matches any character except a newline. Within square brackets the dot is literal.	`Hello World`	`......`	`Hello `	The first six characters; the sixth is the space.
`( )`	Groups a series of pattern elements to a single element. ^[1]	`Hello World`	`(H..).(o..)`	`Hello W` Group 1: `Hel` Group 2: `o W`
`?`	Matches the preceding pattern element zero or one times.	`Hello World`	`H(.?)e`	`He` Group 1: `«empty»`	The optional character between 'H' and 'e' is absent here, so group 1 is empty.
`+`	Matches the preceding pattern element one or more times.	`Hello World`	`l+`	`ll`	There are one or more consecutive 'l' characters in "Hello World".
`*`	Matches the preceding pattern element zero or more times.	`Hello World`	`el*o`	`ello`	There is an 'e' followed by zero or more 'l' followed by 'o' (eo, elo, ello, elllo).
`{M,N}`	Denotes the minimum M and the maximum N match count.	`Hello World`	`l{1,2}`	`ll`	There exists a substring with at least 1 and at most 2 l's
`?`	Modifies the preceding `*`, `+`, or `{M,N}` quantifier to match as few times as possible.	`Hello World`	`l+?`	`l`	This is the non-greedy match; compare it with the greedy, unmodified `+` above.
`[...]`	Denotes a set of possible character matches.	`Hello World`	`[aeiou]+`	`e`	Matches the first occurrence of a succession of vowels (one or more).
`[^...]`	Matches every character except the ones inside brackets.	`Hello World`	`[^aeiou]+`	`H`	Matches the first occurrence of a succession of 'not-vowels'
`|`	Separates alternate possibilities.	`Hello World`	`(Hello|Hi|Pogo)`	`Hello`	At least one of Hello, Hi, or Pogo is contained in the string.
`\b`	Matches a word boundary	`Hello World`	`ell\b`	matches nothing	There is no substring matching 'ell' at the end of a word
`\w`	Matches a 'word' character (defined as the group of alphanumeric characters, including the underscore "_"; same as `[A-Za-z0-9_]`)	`Hello World`	`\w`	`H`	There is at least one alphanumeric character in string (A-Z, a-z, 0-9, _)
`\W`	Matches a non-alphanumeric character, excluding "_"; same as [^A-Za-z0-9_]	`Hello World`	`\W`	«space»	The space between Hello and World is not alphanumeric
`\s`	Matches a whitespace character (space, tab, newline, form feed)	`Hello World`	`\s.*`	` World`	A whitespace character followed by any characters (0 or more); the match includes the leading space.
`\S`	Matches anything BUT a whitespace.	`Hello World`	`\S.*\S`	`Hello World`	There are TWO non-whitespace characters, which may be separated by other characters
`\d`	Matches a digit; same as [0-9].	`99 bottles of beer on the wall`	`(\d+)`	Group 1: `99`	Group 1 is the first number in the string
`\D`	Matches a non-digit; same as [^0-9].	`99 bottles of beer on the wall`	`\D`	«space»	The first non-digit character is the space after 99
`^`	Matches the beginning of a line or string.	`Hello World`	`^He`	`He`	The string starts with the characters 'He'
`$`	Matches the end of a line or string.	`Hello World`	`rld$`	`rld`	The string (or line) ends with 'rld'
`\A`	Matches the beginning of a string (but not an internal line).	`Hello\nWorld`	`\AH`	`H`	The matched string starts with 'H'
`\z`	Matches the end of a string (but not an internal line). ^[2]	`Hello\nWorld\n`	`d\n\z`	`d\n`	The string ends with 'd\n'
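
The rows of the table can be checked mechanically. The following is a minimal sketch, assuming Python's `re` module as the engine (the table itself follows Perl conventions). The helper name `first_match` and the `rows` list are illustrative only. One deliberate substitution: the sketch writes `\Z` where the table has `\z`, because older Python versions do not accept `\z`, and Python's `\Z` matches only at the very end of the string.

```python
import re

# (string, pattern, expected first match); None means "no match".
rows = [
    ("Hello World", r"......", "Hello "),
    ("Hello World", r"(H..).(o..)", "Hello W"),
    ("Hello World", r"H(.?)e", "He"),
    ("Hello World", r"l+", "ll"),             # greedy
    ("Hello World", r"l+?", "l"),             # non-greedy
    ("Hello World", r"el*o", "ello"),
    ("Hello World", r"l{1,2}", "ll"),
    ("Hello World", r"[aeiou]+", "e"),
    ("Hello World", r"[^aeiou]+", "H"),
    ("Hello World", r"(Hello|Hi|Pogo)", "Hello"),
    ("Hello World", r"ell\b", None),
    ("Hello World", r"\s.*", " World"),
    ("Hello World", r"\S.*\S", "Hello World"),
    ("99 bottles of beer on the wall", r"(\d+)", "99"),
    ("Hello World", r"^He", "He"),
    ("Hello World", r"rld$", "rld"),
    ("Hello\nWorld", r"\AH", "H"),
    # Python spelling of end-of-string: \Z (assumption, see lead-in).
    ("Hello\nWorld\n", r"d\n\Z", "d\n"),
]


def first_match(string, pattern):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, string)
    return m.group(0) if m else None


for string, pattern, expected in rows:
    got = first_match(string, pattern)
    status = "ok" if got == expected else "MISMATCH"
    print(f"{status:8} {pattern!r:22} -> {got!r}")
```

Engines differ in details such as `\z`, Unicode handling of `\w`, and multiline behaviour of `^` and `$`, so a language-independent version of the table would need the same check repeated per engine.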

Notes

[1] When you match a pattern within parentheses, you can use any of $1, $2, ... later to refer to the previously matched pattern.
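
As an illustration of note [1], here is a short sketch in Python, which is an assumption of this example. The `$1`, `$2` spelling is Perl's; Python exposes the same captured groups as `m.group(1)`, `m.group(2)`, and uses `\1`, `\2` both inside the pattern and in replacement strings.

```python
import re

# Captured groups after a match (Perl: $1, $2).
m = re.search(r"(H..).(o..)", "Hello World")
groups = (m.group(1), m.group(2))

# Backreference inside the pattern: \1 repeats whatever group 1 matched.
doubled = re.search(r"(l)\1", "Hello World").group(0)

# Backreferences in a replacement string: swap the two words.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "Hello World")

print(groups)   # ('Hel', 'o W')
print(doubled)  # ll
print(swapped)  # World Hello
```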

[2] See Perl Best Practices, page 240.