Basic Syntax
Literal Characters
a - Matches the character ‘a’
abc - Matches the exact string “abc”
123 - Matches the exact string “123”
Metacharacters
. - Matches any single character except newline
^ - Matches start of string/line
$ - Matches end of string/line
* - Matches 0 or more of the preceding element
+ - Matches 1 or more of the preceding element
? - Matches 0 or 1 of the preceding element
| - OR operator (alternation)
() - Grouping
[] - Character class
{} - Quantifiers
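The basic operators above can be tried out directly. This is a minimal sketch using Python's standard `re` module; the sample words are made up for illustration.

```python
import re

# ? makes the preceding element optional, so both spellings match.
color_ok = [bool(re.fullmatch(r"colou?r", w)) for w in ("color", "colour", "colouur")]

# Grouping limits the alternation to one part of the pattern.
gray_ok = [bool(re.fullmatch(r"gr(a|e)y", w)) for w in ("gray", "grey", "groy")]

# . matches any single character except a newline.
dot_hits = re.findall(r"b.t", "bat bit b\nt b_t")

print(color_ok, gray_ok, dot_hits)
```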
Escaping Special Characters
\. - Literal dot
\* - Literal asterisk
\+ - Literal plus
\? - Literal question mark
\\ - Literal backslash
\( \) - Literal parentheses
\[ \] - Literal square brackets
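When the text to match comes from a variable or user input, escaping each metacharacter by hand is error-prone. A minimal Python sketch, assuming the standard `re` module; the sample strings are hypothetical.

```python
import re

# Hypothetical literal text that happens to contain metacharacters.
literal = "1+1=2 (approx.)"

# re.escape adds the backslashes needed to match the text literally.
pattern = re.escape(literal)
found = re.search(pattern, "note: 1+1=2 (approx.) is fine")

# An unescaped dot matches any character; an escaped dot matches only ".".
loose = re.fullmatch(r"a.c", "abc")
strict = re.fullmatch(r"a\.c", "abc")

print(found.group(0), loose is not None, strict is not None)
```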
Character Classes
Basic Character Classes
[abc] - Matches ‘a’, ‘b’, or ‘c’
[a-z] - Matches any lowercase letter
[A-Z] - Matches any uppercase letter
[0-9] - Matches any digit
[a-zA-Z] - Matches any letter
[a-zA-Z0-9] - Matches any alphanumeric character
Negated Character Classes
[^abc] - Matches any character except ‘a’, ‘b’, or ‘c’
[^0-9] - Matches any non-digit character
[^a-zA-Z] - Matches any non-letter character
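A short Python sketch of bracketed classes in use, assuming the standard `re` module and made-up inputs. Inside a class, `^` negates only when it comes first, and `-` is literal when it comes first or last.

```python
import re

# A class matches exactly one character from the set.
vowels = re.findall(r"[aeiou]", "regular expression")

# A negated class is handy for stripping everything except digits.
digits_only = re.sub(r"[^0-9]", "", "(555) 123-4567")

# Ranges can be combined in one class.
hex_runs = re.findall(r"[0-9a-fA-F]+", "ff 7G 1a2B")

# Here "-" is first and "^" is not, so both are literal characters.
literal_marks = re.findall(r"[-^]", "a-b^c")

print(vowels, digits_only, hex_runs, literal_marks)
```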
Predefined Character Classes
\d - Matches any digit (equivalent to [0-9])
\D - Matches any non-digit (equivalent to [^0-9])
\w - Matches any word character (equivalent to [a-zA-Z0-9_])
\W - Matches any non-word character (equivalent to [^a-zA-Z0-9_])
\s - Matches any whitespace character (space, tab, newline, etc.)
\S - Matches any non-whitespace character
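The shorthand classes can be sketched in Python as follows, with an invented sample string. One caveat: for `str` patterns in Python 3, `\d`, `\w`, and `\s` also match non-ASCII Unicode characters unless the `re.ASCII` flag is passed, so the bracketed equivalents listed above hold exactly only in ASCII mode.

```python
import re

text = "Order 42 shipped to user_7 on 2024-01-15"

digits = re.findall(r"\d+", text)      # runs of digits
words = re.findall(r"\w+", text)       # runs of letters, digits, underscore
parts = re.split(r"\s+", text)         # split on any whitespace run
non_word = re.findall(r"\W", "a-b c")  # characters that are not word characters

print(digits)
print(words)
print(parts)
print(non_word)
```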
Quantifiers
Basic Quantifiers
* - 0 or more (greedy)
+ - 1 or more (greedy)
? - 0 or 1 (greedy)
{n} - Exactly n times
{n,} - n or more times
{n,m} - Between n and m times (inclusive)
Non-Greedy (Lazy) Quantifiers
*? - 0 or more (non-greedy)
+? - 1 or more (non-greedy)
?? - 0 or 1 (non-greedy)
{n,}? - n or more times (non-greedy)
{n,m}? - Between n and m times (non-greedy)
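The difference between greedy and lazy matching is easiest to see side by side. A minimal Python sketch with made-up input:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last possible </b>, swallowing both tags.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first </b> that lets the match succeed.
lazy = re.findall(r"<b>.*?</b>", html)

# Counted quantifiers follow the same rule.
counted_greedy = re.search(r"\d{2,4}", "123456").group(0)
counted_lazy = re.search(r"\d{2,4}?", "123456").group(0)

print(greedy, lazy, counted_greedy, counted_lazy)
```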
Anchors and Boundaries
Position Anchors
^ - Start of string/line
$ - End of string/line
\A - Start of string (absolute)
\Z - End of string, or just before a final newline
\z - End of string (absolute)
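Anchor behavior varies between engines, so it is worth checking in the one you use. The sketch below uses Python's `re` module, where `\Z` always means the absolute end of the string, which differs from some other engines. The sample text is invented.

```python
import re

text = "first line\nsecond line\n"

# Without MULTILINE, ^ matches only at the start of the string.
start_default = re.findall(r"^\w+", text)

# With MULTILINE, ^ also matches right after each newline.
start_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# \A ignores MULTILINE and still matches only at the very start.
start_absolute = re.findall(r"\A\w+", text, re.MULTILINE)

# $ can match just before a trailing newline; Python's \Z cannot.
dollar_hit = re.search(r"line$", text)
abs_end_hit = re.search(r"line\Z", text)

print(start_default, start_multiline, start_absolute)
print(dollar_hit is not None, abs_end_hit is not None)
```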
Word Boundaries
\b - Word boundary
\B - Non-word boundary
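A word boundary is a zero-width position between a word character and a non-word character (or the start or end of the string). A small Python sketch with a made-up sentence:

```python
import re

text = "cat concatenate cat's bobcat"

# \bcat\b matches "cat" only as a whole word (the apostrophe ends a word).
whole_words = re.findall(r"\bcat\b", text)

# \Bcat\B matches "cat" only when it is buried inside a longer word.
inner = re.findall(r"\Bcat\B", text)

print(whole_words, inner)
```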
Groups and Capturing
Grouping
(abc) - Capturing group
(?:abc) - Non-capturing group
(?<name>abc) - Named capturing group
Backreferences
\1 - References first capturing group
\2 - References second capturing group
\k<name> - References named capturing group
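Group syntax is one of the places where engines differ. The sketch below uses Python's `re` module, which spells named groups `(?P<name>...)` and named backreferences `(?P=name)` rather than the forms listed above. The sample strings are hypothetical.

```python
import re

# Numbered backreference: \1 must repeat exactly what group 1 matched.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test case")

# Named group plus named backreference: the closing quote must equal the opening one.
quoted = re.search(r"(?P<quote>['\"])(?P<body>.*?)(?P=quote)", "say 'hello' now")

# A non-capturing group groups without creating a numbered capture.
m = re.search(r"(?:Mr|Ms)\. (\w+)", "Dear Ms. Lovelace,")

print(doubled, quoted.group("body"), m.group(1))
```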
Lookahead and Lookbehind
Lookahead
(?=abc) - Positive lookahead
(?!abc) - Negative lookahead
Lookbehind
(?<=abc) - Positive lookbehind
(?<!abc) - Negative lookbehind
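Lookarounds check the surrounding text without consuming it. A minimal Python sketch with invented data; note that Python's `re` requires lookbehind patterns to have a fixed width.

```python
import re

prices = "cost: $30, weight: 30kg, discount: $5"

# Positive lookahead: digits that are followed by "kg".
weights = re.findall(r"\d+(?=kg)", prices)

# Positive lookbehind: digits that are preceded by "$".
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: numbers that do not start right after "$".
# The \b stops the match from starting in the middle of "30".
not_dollars = re.findall(r"\b(?<!\$)\d+", prices)

# Negative lookahead: words starting with "foo" not followed by "bar".
not_foobar = re.findall(r"\bfoo(?!bar)\w*", "foobar foobaz food")

print(weights, dollars, not_dollars, not_foobar)
```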
Common Patterns
Email Validation (Basic)
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
Phone Number (US)
\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
URL/Website
https?://(?:[-\w.])+(?:\:[0-9]+)?(?:/(?:[\w/_.])*(?:\?(?:[\w&=%.])*)?(?:\#(?:[\w.])*)?)?
IP Address
\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
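As a quick check, the IP address pattern above can be run in Python. This is a sketch only; the names `IPV4` and `is_ipv4` and the sample addresses are made up for illustration.

```python
import re

# The IPv4 pattern from this section, compiled once.
IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)

def is_ipv4(candidate: str) -> bool:
    """Return True if the whole string is a dotted-quad with octets 0-255."""
    return IPV4.fullmatch(candidate) is not None

# Searching inside free text: only in-range addresses are picked up.
hosts = IPV4.findall("hosts 10.0.0.1 and 999.1.1.1")

print(is_ipv4("192.168.1.1"), is_ipv4("256.1.1.1"), hosts)
```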
Date (MM/DD/YYYY)
(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/\d{4}
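The date pattern checks only the shape of the string, so it accepts impossible dates such as 02/30/2024. One possible approach, sketched in Python, is to use the regex as a format filter and let `datetime.strptime` check the calendar. The names `DATE` and `is_real_date` are hypothetical.

```python
import re
from datetime import datetime

# The MM/DD/YYYY pattern from this section.
DATE = re.compile(r"(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/\d{4}")

def is_real_date(candidate: str) -> bool:
    """Check the format with the regex, then the calendar with datetime."""
    if DATE.fullmatch(candidate) is None:
        return False
    try:
        datetime.strptime(candidate, "%m/%d/%Y")
    except ValueError:
        return False
    return True

print(DATE.fullmatch("02/30/2024") is not None)  # format is fine
print(is_real_date("02/30/2024"))                # but the date does not exist
```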
Time (12-Hour Format)
(1[0-2]|0?[1-9]):[0-5][0-9]\s?(AM|PM|am|pm)
Credit Card Number
\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}
Hex Color Code
#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})
Password Strength (8+ chars, uppercase, lowercase, digit)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d@$!%*?&]{8,}$
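The password pattern stacks three lookaheads, each of which scans from the start of the string for one required character type. The final character class then restricts the allowed characters and enforces the length. A Python sketch follows; `PASSWORD` and `is_strong` are illustrative names and the sample passwords are invented.

```python
import re

# The password pattern from this section.
PASSWORD = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d@$!%*?&]{8,}$")

def is_strong(candidate: str) -> bool:
    """Return True if the candidate satisfies every rule in the pattern."""
    return PASSWORD.search(candidate) is not None

for sample in ("Passw0rd", "password1", "Sh0rtA", "Passw0rd#"):
    print(sample, is_strong(sample))
```

The last sample fails only because `#` is not in the allowed character class, so the pattern is stricter than its heading suggests.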
HTML Tag
<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)
Flags/Modifiers
Common Flags
i - Case insensitive
g - Global (find all matches)
m - Multiline (^ and $ match at the start and end of each line)
s - Dot matches newline
x - Extended (ignore whitespace and comments)
u - Unicode
Usage Examples
/pattern/i - Case insensitive match
/pattern/g - Global match (find all)
/pattern/gi - Global and case insensitive
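The `/pattern/flags` notation is used by languages such as JavaScript and Perl. In Python, flags are passed as an argument instead, and there is no `g` flag: `re.findall`, `re.finditer`, and `re.sub` already process every match, while `re.search` returns only the first. A minimal sketch with made-up text:

```python
import re

text = "Apple\napple pie\nAPPLE"

# i: case-insensitive matching.
any_case = re.findall(r"apple", text, re.IGNORECASE)

# i + m: ^ now matches at the start of every line.
line_starts = re.findall(r"^apple", text, re.IGNORECASE | re.MULTILINE)

# s: without DOTALL the dot cannot cross the newline.
no_dotall = re.search(r"Apple.apple", text)
with_dotall = re.search(r"Apple.apple", text, re.DOTALL)

# x: VERBOSE allows whitespace and comments inside the pattern.
iso_date = re.compile(r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
""", re.VERBOSE)
parts = iso_date.search("released 2024-01-15").groups()

print(any_case, line_starts, no_dotall, with_dotall is not None, parts)
```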
Examples by Use Case
Text Validation
- Only letters:
^[a-zA-Z]+$
- Only numbers:
^[0-9]+$
- Alphanumeric:
^[a-zA-Z0-9]+$
- No special characters:
^[a-zA-Z0-9\s]+$
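The validation patterns above can be sketched in Python as follows. One edge case to be aware of: `$` also matches just before a trailing newline, so `re.fullmatch` without anchors is the stricter check. The sample inputs are made up.

```python
import re

ONLY_LETTERS = r"^[a-zA-Z]+$"

# Anchored search: the whole string must consist of letters.
ok = re.search(ONLY_LETTERS, "Hello") is not None
has_digit = re.search(ONLY_LETTERS, "Hello1") is not None

# $ tolerates one trailing newline, which may not be what you want.
trailing_newline = re.search(ONLY_LETTERS, "Hello\n") is not None

# fullmatch must consume the entire string, newline included.
strict = re.fullmatch(r"[a-zA-Z]+", "Hello\n") is not None

print(ok, has_digit, trailing_newline, strict)
```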
Text Extraction
- Extract words:
\b\w+\b
- Extract numbers:
\d+
- Extract quoted text:
"([^"]*)"
- Extract between parentheses:
\(([^)]*)\)
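A short Python sketch of the extraction patterns, using an invented sentence. When a pattern contains one capturing group, `re.findall` returns the group contents rather than the whole match.

```python
import re

text = 'She said "hello" and "goodbye" (twice) at 10 or (maybe) 11.'

quoted = re.findall(r'"([^"]*)"', text)          # text between double quotes
in_parens = re.findall(r"\(([^)]*)\)", text)     # text between parentheses
numbers = re.findall(r"\d+", text)               # runs of digits
words = re.findall(r"\b\w+\b", "Hi, there!")     # whole words

print(quoted, in_parens, numbers, words)
```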
Text Replacement
- Remove extra spaces:
\s+ → " " (a single space)
- Remove leading/trailing spaces:
^\s+|\s+$ → "" (empty string)
- Convert to camelCase:
[-_\s]+(\w) → the captured character $1, uppercased in a replacement function
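The three replacements can be sketched with Python's `re.sub`. The camelCase conversion needs a function as the replacement, because a plain replacement string cannot call a method on the captured text. The helper name `to_camel` and the inputs are hypothetical.

```python
import re

# Collapse any run of whitespace into a single space.
collapsed = re.sub(r"\s+", " ", "too    many   spaces")

# Strip leading and trailing whitespace.
trimmed = re.sub(r"^\s+|\s+$", "", "  padded  ")

def to_camel(name: str) -> str:
    """Drop separators and uppercase the character that follows each one."""
    return re.sub(r"[-_\s]+(\w)", lambda m: m.group(1).upper(), name)

print(collapsed, trimmed, to_camel("hello_world-foo bar"))
```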
Tips and Best Practices
- Use non-capturing groups (?:) when you don’t need to capture
- Be specific with character classes instead of using .
- Use anchors ^ and $ when matching entire strings
- Use lazy quantifiers *?, +? when appropriate
Common Mistakes to Avoid
- Forgetting to escape special characters in literals
- Using greedy quantifiers when lazy ones are needed
- Not accounting for edge cases in validation patterns
- Overcomplicating patterns when simpler ones work
Tools and Resources
- RegExr (regexr.com)
- Regex101 (regex101.com)
- RegexPal (regexpal.com)
- Built-in language tools (e.g., Python’s re module)