Go to the first, previous, next, last section, table of contents.

Description of MySQL regular expression syntax.

Regular expressions are a powerful way of specifying complex searches.

MySQL uses regular Henry Spencers inplementation of regular expressions. And that is aimed to conform to POSIX 1003.2. MySQL uses the extended version.

To get more exact information see Henry Spencers regex.7 manual that is included in the source distribution. See section Who has helped to make MySQL..

This is a simplistic reference that skips the details. From here on a regualr expressions is called a regexp.

A regular expression describes a set of strings. The simplest case is one that has no special characters in it. For example the regexp hello matches hello and nothing else.

Nontrivial regular expressions use certain special constructs so that they can match more than one string. For example, the regexp hello|word matches either the string hello or the string word.

And a more comples example regexp B[an]*s matches any of the strings Bananas, Baaaaas, Bs and all other string starting with a B and continuing with any number of a n and ending with a s.

The following special characters/constructs are known.

^

Start of whole string.

mysql> select "fo\nfo" regexp "^fo$";           -> 0
mysql> select "fofo" regexp "^fo";              -> 1

$

End of whole string.

mysql> select "fo\no" regexp "^fo\no$";         -> 1
mysql> select "fo\no" regexp "^fo$";            -> 0

.

Any character (including newline).

mysql> select "fofo" regexp "^f.*";             -> 1
mysql> select "fo\nfo" regexp "^f.*";           -> 1

a*

Any sequence of zero or more a's.

mysql> select "Ban" regexp "^Ba*n";             -> 1
mysql> select "Baaan" regexp "^Ba*n";           -> 1
mysql> select "Bn" regexp "^Ba*n";              -> 1

a+

Any sequence of one or more a's.

mysql> select "Ban" regexp "^Ba+n";             -> 1
mysql> select "Bn" regexp "^Ba+n";              -> 0

a?

Either zero or one a.

mysql> select "Bn" regexp "^Ba?n";              -> 1
mysql> select "Ban" regexp "^Ba?n";             -> 1
mysql> select "Baan" regexp "^Ba?n";            -> 0

de|abc

Either the sequence de or abc.

mysql> select "pi" regexp "pi|apa";             -> 1
mysql> select "axe" regexp "pi|apa";            -> 0
mysql> select "apa" regexp "pi|apa";            -> 1
mysql> select "apa" regexp "^(pi|apa)$";        -> 1
mysql> select "pi" regexp "^(pi|apa)$";         -> 1
mysql> select "pix" regexp "^(pi|apa)$";        -> 0

(abc)*

Zero or more times the sequence abc.

mysql> select "pi" regexp "^(pi)+$";            -> 1
mysql> select "pip" regexp "^(pi)+$";           -> 0
mysql> select "pipi" regexp "^(pi)+$";          -> 1

{1}

{2,3}

The is a more general way of writing regexps that match many occurences.

a*: Can be written as a{0,}.
+: Can be written as a{1,}.
?: Can be written as a{0,1}.

To be more precice an atom followed by a bound containing one integer i and no comma matches a sequence of exactly i matches of the atom. An atom followed by a bound containing one integer i and a comma matches a sequence of i or more matches of the atom. An atom followed by a bound containing two integers i and j matches a sequence of i through j (inclusive) matches of the atom. Both arguments must 0 >= value <= RE_DUP_MAX (default 255), and if there are two of them, the second must be bigger or equal to the first.

[a-dX]

[^a-dX]

Any character which is (not if ^ is used) either a, b, c, d or X. To include ] it has to be written first. To include - it has to be written first or last. So [0-9] matches any decimal digit. All character that does not have a defined mening inside a [] pair has no special meaning and matches only itself.

mysql> select "aXbc" regexp "[a-dXYZ]";         -> 1
mysql> select "aXbc" regexp "^[a-dXYZ]$";       -> 0
mysql> select "aXbc" regexp "^[a-dXYZ]+$";      -> 1
mysql> select "aXbc" regexp "^[^a-dXYZ]+$";     -> 0
mysql> select "gheis" regexp "^[^a-dXYZ]+$";    -> 1
mysql> select "gheisa" regexp "^[^a-dXYZ]+$";   -> 0

[[.characters.]]

The sequence of characters of that collating element. The sequence is a single element of the bracket expression's list. A bracket expression containing a multi-character collating element can thus match more than one character, e.g. if the collating sequence includes a ch collating element, then the RE [[.ch.]]*c matches the first five characters of chchcc.

[=character-class=]

An equivalence class, standing for the sequences of characters of all collating elements equivalent to that one, including itself. For example, if o and (+) are the members of an equivalence class, then [[=o=]], [[=(+)=]], and [o(+)] are all synonymous. An equivalence class may not be an endpoint of a range.

[:character_class:]

Within a bracket expression, the name of a character class enclosed in [: and :] stands for the list of all characters belonging to that class. Standard character class names are:

alnum	digit	punct
alpha	graph	space
blank	lower	upper
cntrl	print	xdigit

These stand for the character classes defined in ctype(3). A locale may provide others. A character class may not be used as an endpoint of a range.

mysql> select "justalnums" regexp "[[:alnum:]]+";       -> 1
mysql> select "!!" regexp "[[:alnum:]]+";               -> 0

[[:<:]]

[[:>:]] These match the null string at the beginning and end of a word respectively. A word is defined as a sequence of word characters which is neither preceded nor followed by word characters. A word character is an alnum character (as defined by ctype(3)) or an underscore.

mysql> select "a word a" regexp "[[:<:]]word[[:>:]]";      -> 1
mysql> select "a xword a" regexp "[[:<:]]word[[:>:]]";     -> 0

mysql> select "weeknights" regexp "^(wee|week)(knights|nights)$"; -> 1

Go to the first, previous, next, last section, table of contents.

Casa de Bender