Talk:Regular expression examples/sandbox

From Wikipedia, the free encyclopedia
Trying to come up with a language-independent version of this article...

Metacharacter(s); Description; Example string; Regex; Match; Notes
. Normally matches any character except a newline. Within square brackets the dot is literal. Hello World ...... Hello«space» The six dots match the first six characters, including the space after 'Hello'.
( ) Groups a series of pattern elements to a single element. [1] Hello World (H..).(o..) Hello W Group 1: Hel; Group 2: o W
? Matches the preceding pattern element zero or one times. Hello World H(.?)e He Group 1: «empty» At most one character may appear between 'H' and 'e'; here there is none, so Group 1 is empty.
+ Matches the preceding pattern element one or more times. Hello World l+ ll There are one or more consecutive 'l' characters in "Hello World".
* Matches the preceding pattern element zero or more times. Hello World el*o ello There is an 'e' followed by zero or more 'l' characters followed by 'o' (the pattern would match eo, elo, ello, elllo, and so on).
{M,N} Denotes the minimum M and the maximum N match count. Hello World l{1,2} ll There exists a substring with at least 1 and at most 2 l's
? Modifies the preceding *, + or {M,N} quantifier to match as few times as possible. Hello World l+? l Compare this non-greedy match with the greedy version above, where the unmodified '+' matches 'll'.
[...] Denotes a set of possible character matches. Hello World [aeiou]+ e Matches the first occurrence of a succession of vowels (one or more).
[^...] Matches every character except the ones inside brackets. Hello World [^aeiou]+ H Matches the first occurrence of a succession of 'not-vowels'
| Separates alternate possibilities. Hello World (Hello|Hi|Pogo) Hello At least one of Hello, Hi, or Pogo is contained in the string.
\b Matches a word boundary Hello World ell\b matches nothing There is no substring matching 'ell' at the end of a word
\w Matches a 'word' character (an alphanumeric character or the underscore "_"; same as [A-Za-z0-9_]). Hello World \w H There is at least one word character (A-Z, a-z, 0-9, _) in the string.
\W Matches a non-alphanumeric character, excluding "_"; same as [^A-Za-z0-9_] Hello World \W «space» The space between Hello and World is not alphanumeric
\s Matches a whitespace character (space, tab, newline, form feed). Hello World \s.* «space»World A whitespace character followed by any characters (0 or more); the match includes the leading space.
\S Matches any character except whitespace. Hello World \S.*\S Hello World There are two non-whitespace characters, which may be separated by other characters.
\d Matches a digit; same as [0-9]. 99 bottles of beer on the wall (\d+) Group 1: 99 Group 1 is the first number in the string
\D Matches a non-digit; same as [^0-9]. 99 bottles of beer on the wall \D «space» The first non-digit character is the space after 99
^ Matches the beginning of a line or string. Hello World ^He He The string starts with the characters 'He'
$ Matches the end of a line or string. Hello World rld$ rld The given string is a line or string that ends with 'rld'
\A Matches the beginning of a string (but not an internal line). Hello\nWorld \AH H The matched string starts with 'H'
\z Matches the end of a string (but not an internal line). [2] Hello\nWorld\n d\n\z d\n The string ends with 'd\n'.
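The rows of the table can be spot-checked mechanically. The following is only a sketch, assuming Python's `re` module as the test engine (the page itself is aiming to be language-independent, and the helper name `first_match` is made up for this example). The `\z` row is written as `\Z` here, which is the spelling Python's `re` has traditionally used for "end of string only".

```python
import re

text = "Hello World"

# (pattern, expected first match) pairs taken from the table rows.
# None means the pattern is expected not to match at all.
cases = [
    (r"......", "Hello "),
    (r"(H..).(o..)", "Hello W"),
    (r"H(.?)e", "He"),
    (r"l+", "ll"),
    (r"el*o", "ello"),
    (r"l{1,2}", "ll"),
    (r"l+?", "l"),
    (r"[aeiou]+", "e"),
    (r"[^aeiou]+", "H"),
    (r"(Hello|Hi|Pogo)", "Hello"),
    (r"ell\b", None),
    (r"\w", "H"),
    (r"\W", " "),
    (r"\s.*", " World"),
    (r"\S.*\S", "Hello World"),
    (r"^He", "He"),
    (r"rld$", "rld"),
]


def first_match(pattern, string):
    """Return the text of the first match, or None if there is no match."""
    m = re.search(pattern, string)
    return m.group(0) if m else None


results = {pattern: first_match(pattern, text) for pattern, _ in cases}
for pattern, expected in cases:
    status = "ok" if results[pattern] == expected else "MISMATCH"
    print(f"{pattern!r:22} -> {results[pattern]!r:16} {status}")

# String anchors on a multi-line string (\Z is an assumption standing in for \z).
multiline = "Hello\nWorld\n"
starts_with_h = re.search(r"\AH", multiline) is not None
end_match = first_match(r"d\n\Z", multiline)
```

Other engines may disagree on details such as what `\w` covers for non-ASCII text, so the expected values should be read as holding for this engine and these ASCII inputs only.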

Notes

  1. When you match a pattern within parentheses, you can use any of $1, $2, ... later to refer to the previously matched pattern.
  2. See Perl Best Practices, page 240.
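
The `$1`, `$2` notation in note 1 is Perl's. As a rough illustration of the same idea in another engine, the sketch below assumes Python's `re` module, where captured groups are read with `group(n)` on the match object and referred to as `\1`, `\2` in a replacement string. The variable names are invented for this example.

```python
import re

# Read the captured groups from the "( )" row of the table.
m = re.search(r"(H..).(o..)", "Hello World")
group1, group2 = m.group(1), m.group(2)

# Refer back to the groups in a replacement: swap the two words.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "Hello World")

# The "\d" row: Group 1 is the first number in the string.
number = re.search(r"(\d+)", "99 bottles of beer on the wall").group(1)

print(group1, "|", group2, "|", swapped, "|", number)
```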