Regular Expressions (Regex) Cheat Sheet

Regular Expressions, or Regex, are not unique to the Business Intelligence (BI) world, but they are something I frequently encounter and often find puzzling. I wish I had a handy cheat sheet whenever I need to use Regex in my formulas, but I can never find a concise and helpful one through Google searches. So, I decided to create one for people like me who don’t necessarily struggle with Regex but are often annoyed by its complexity.

Characters

| Character | Legend | Example | Sample Match |
| --- | --- | --- | --- |
| `.` | Any character except a line break | `a.c` | `abc` |
| `\d` | Matches any digit (Arabic numeral). Equivalent to `[0-9]` | `Order_\d` | `Order_1` |
| `\D` | Matches any character that is not a digit (Arabic numeral). Equivalent to `[^0-9]` | `\D\D\D` | `ABC` |
| `\w` | Matches any alphanumeric character, including the underscore. Equivalent to `[A-Za-z0-9_]` | `\w-\w\w\w` | `A-b_1` |
| `\W` | Matches any character that is not a word character from the basic Latin alphabet. Equivalent to `[^A-Za-z0-9_]` | `\W\W\W\W\W` | `*-+=)` |
| `\s` | Matches a single white space character, including space, tab, form feed, line feed, and other Unicode spaces | `\w-\sa\s\W` | `c- a *` |
| `\S` | Matches a single character other than white space | `\S\S\S` | `yes` |
| `\t` | Matches a horizontal tab | `T\t\w{2}` | `T`, a tab, then `ab` |
| `\r` | Carriage return character | | |
| `\n` | Line feed character | | |
| `\r\n` | Line separator on Windows | `AB\r\nCD` | `AB` and `CD` on separate lines |
| `\` | Escapes a special character | `\.\\\~` | `.\~` |
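
To try the character shorthands quickly, here is a minimal Python sketch, assuming Python's built-in `re` module (other Regex flavors can differ in details such as Unicode handling). The list name `examples` is only illustrative; the pattern and sample pairs are based on the table above.

```python
import re

# (pattern, sample text) pairs based on the Characters table.
examples = [
    (r"a.c", "abc"),             # . is any character except a line break
    (r"Order_\d", "Order_1"),    # \d is a digit
    (r"\D\D\D", "ABC"),          # \D is a non-digit
    (r"\w-\w\w\w", "A-b_1"),     # \w is a letter, digit, or underscore
    (r"\w-\sa\s\W", "c- a *"),   # \s is white space, \W is a non-word character
    (r"\S\S\S", "yes"),          # \S is anything but white space
    (r"AB\r\nCD", "AB\r\nCD"),   # Windows line separator
    (r"\.\\\~", ".\\~"),         # escaped ., \ and ~ are matched literally
]

# fullmatch() succeeds only if the whole sample matches the pattern.
results = [bool(re.fullmatch(pattern, sample)) for pattern, sample in examples]

for (pattern, sample), ok in zip(examples, results):
    print(f"{pattern!r:16} {sample!r:14} {ok}")
```

Using `re.fullmatch` rather than `re.search` keeps the check strict: a pattern that only matches part of the sample would be reported as `False`.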

Anchors

| Anchor | Legend | Example | Sample Match |
| --- | --- | --- | --- |
| `^` | Matches the beginning of input, or of each line in a multi-line pattern. (Inside brackets, as in `[^…]`, it means “not”.) | `^abc .*` | abc (line start) |
| `\A` | Start of string | `\Aabc` | abc (string start) |
| `$` | End of string, or end of line in a multi-line pattern | `.*? the end$` | this is the end |
| `\z` | The end of the input | `the end\z` | this is…\n…the end |
| `\Z` | The end of the input but for the final terminator, if any | `the end\Z` | this is…\n…the end\n |
| `\b` | Matches a word boundary | `Bob.*\bcat\b` | Bob ate the cat |
| `\B` | Not a word boundary | `c.*\Bcat\B.*` | copycats |
| `\G` | The end of the previous match | | |
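
Anchors are easiest to understand by watching them succeed and fail. The sketch below assumes Python's `re` module, which differs from the table in two ways: it has no `\G`, and its `\Z` means the very end of the string (what the table lists as `\z`). The sample `text` is made up for illustration.

```python
import re

text = "abc first\nabc second\nthis is the end\n"

# ^ matches only at the start of the string by default...
starts_default = re.findall(r"^abc \w+", text)
# ...and at the start of every line with re.MULTILINE.
starts_multiline = re.findall(r"^abc \w+", text, flags=re.MULTILINE)
# \A always means the start of the string, even with re.MULTILINE.
starts_string = re.findall(r"\Aabc \w+", text, flags=re.MULTILINE)

# $ matches at the very end, or just before a final line break.
dollar_matches = re.search(r"the end$", text) is not None
# Python's \Z is strict: the trailing "\n" in text prevents a match.
strict_end_matches = re.search(r"the end\Z", text) is not None

# \b requires a word boundary, \B requires that there is none.
whole_word = re.search(r"Bob.*\bcat\b", "Bob ate the cat") is not None
inside_word = re.search(r"\Bcat\B", "copycats") is not None

print(starts_default, starts_multiline, starts_string)
print(dollar_matches, strict_end_matches, whole_word, inside_word)
```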

Quantifiers

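| Quantifier | Legend | Example | Sample Match |
| --- | --- | --- | --- |
| `+` | One or more times | `x+` | `xxxxx` |
| `*` | Zero or more times | `A*B*C*` | `AACCCC` |
| `?` | Once or none. (Placed after another quantifier, it makes that quantifier “lazy”.) | `abc?` | `abc` |
| `{3}` | Exactly three times | `\D{3}` | `ABC` |
| `{2,4}` | Two to four times | `\d{2,4}` | `156` |
| `{x,}` | x or more times, where x is a number | `\w{3,}` | `a_bdcs` |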
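
The difference between a greedy and a lazy quantifier is the part that trips people up most often, so here is a small Python sketch (assuming the built-in `re` module). The HTML-like sample string and the variable names are only illustrative.

```python
import re

html = "<b>bold</b>"

# Greedy: .+ grabs as much as it can, so the match runs to the last ">".
greedy = re.search(r"<.+>", html).group()
# Lazy: .+? stops at the first ">" it can.
lazy = re.search(r"<.+?>", html).group()

# ? on its own makes the preceding item optional.
optional = re.findall(r"abc?", "abc ab")

# Counted repetition.
exact = re.fullmatch(r"\D{3}", "ABC") is not None
ranged = re.findall(r"\d{2,4}", "156 and 1234567")
at_least = re.findall(r"\w{3,}", "a_bdcs is ok")

print(greedy)    # <b>bold</b>
print(lazy)      # <b>
print(optional)  # ['abc', 'ab']
print(ranged)    # ['156', '1234', '567']
print(at_least)  # ['a_bdcs']
```

Note how `\d{2,4}` splits the seven-digit run into `1234` and `567`: it takes up to four digits, then starts a new match on what is left.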

Assertions

| Assertion | Legend | Example | Sample Match |
| --- | --- | --- | --- |
| `(?=…)` | Positive lookahead. Matches “x” only if “x” is followed by “y”. | `\d+(?= dollars)` | 100 in 100 dollars |
| `(?<=…)` | Positive lookbehind. Matches “x” only if “x” is preceded by “y”. | `(?<=\$)\d+` | 100 in $100 |
| `(?!…)` | Negative lookahead. Matches “x” only if “x” is not followed by “y”. | `\d+(?!\d\| dollars)` | 100 in 100 Yen |
| `(?<!…)` | Negative lookbehind. Matches “x” only if “x” is not preceded by “y”. | `(?<![$\d])\d+` | 100 in 100 Yen |
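
Lookarounds check the text around a match without consuming it. The following Python sketch (assuming the built-in `re` module, which requires lookbehinds to have a fixed width) runs all four kinds against one made-up string, `prices`, so the results can be compared side by side.

```python
import re

prices = "100 dollars, 200 Yen, $300"

# Positive lookahead: digits followed by " dollars".
followed_by_dollars = re.findall(r"\d+(?= dollars)", prices)

# Positive lookbehind: digits preceded by a dollar sign.
preceded_by_sign = re.findall(r"(?<=\$)\d+", prices)

# Negative lookahead: digits NOT followed by " dollars".
# The \d alternative stops the engine from backtracking to "10".
not_followed_by_dollars = re.findall(r"\d+(?!\d| dollars)", prices)

# Negative lookbehind: digits NOT preceded by a dollar sign.
# The \d in the class stops the match from starting mid-number at "00".
not_preceded_by_sign = re.findall(r"(?<![$\d])\d+", prices)

print(followed_by_dollars)      # ['100']
print(preceded_by_sign)         # ['300']
print(not_followed_by_dollars)  # ['200', '300']
print(not_preceded_by_sign)     # ['100', '200']
```

In the negative forms, the extra `\d` guard matters: without it, the engine would happily match part of a number that the assertion was meant to exclude.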

Character Classes and Groups

| Character | Legend | Example | Sample Match |
| --- | --- | --- | --- |
| `[ … ]` | One of the characters in the brackets | `be[ea]r` | beer or bear |
| `[x-y]` | One of the characters in the range from x to y | `[A-Z]+` | `GREAT` |
| `[^x]` | One character that is not x | `[^a-z]{3}` | `A1!` |
| `[^x-y]` | One of the characters not in the range from x to y | `[^a-h]+` | `zzz` |
| `[\d\D]` | One character that is a digit or a non-digit | `[\d\D]+` | Any characters, including new lines |
| `x\|y` | Alternation / OR operator | `M\|L` | M in [size M] |
| `( … )` | Capturing group | `A(nt\|pple)` | Apple (captures “pple”) |
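
As a final quick check, this Python sketch (assuming the built-in `re` module) exercises the character classes, alternation, and a capturing group from the table above. The sample strings and variable names are only illustrative.

```python
import re

# A bracket class matches exactly one of the listed characters.
bears = re.findall(r"be[ea]r", "beer or bear, not beor")

# Ranges and negated classes.
upper = re.fullmatch(r"[A-Z]+", "GREAT") is not None
not_lower = re.fullmatch(r"[^a-z]{3}", "A1!") is not None

# [\d\D] matches any character at all, including line breaks,
# whereas . stops at a line break by default.
two_lines = "line one\nline two"
any_char = re.fullmatch(r"[\d\D]+", two_lines) is not None
dot_only = re.fullmatch(r".+", two_lines) is not None

# Alternation picks whichever side matches first in the text.
size = re.search(r"M|L", "[size M]").group()

# group(0) is the whole match, group(1) is the first capturing group.
match = re.fullmatch(r"A(nt|pple)", "Apple")
whole, captured = match.group(0), match.group(1)

print(bears)            # ['beer', 'bear']
print(any_char, dot_only)
print(size, whole, captured)
```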