Basic Syntax

Literal Characters

  • a - Matches the character ‘a’
  • abc - Matches the exact string “abc”
  • 123 - Matches the exact string “123”

Special Characters (Meta Characters)

  • . - Matches any single character except newline
  • ^ - Matches start of string (or start of each line in multiline mode)
  • $ - Matches end of string (or end of each line in multiline mode)
  • * - Matches 0 or more of the preceding element
  • + - Matches 1 or more of the preceding element
  • ? - Matches 0 or 1 of the preceding element
  • | - OR operator (alternation)
  • () - Grouping
  • [] - Character class
  • {} - Quantifiers
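
As a quick illustration of the metacharacters above, here is a minimal Python sketch using the standard re module. The sample strings are made up for the example.

```python
import re

# "?" makes the preceding "u" optional.
color_us = re.fullmatch(r"colou?r", "color") is not None
color_uk = re.fullmatch(r"colou?r", "colour") is not None

# "|" inside "()" chooses between alternatives.
grey = re.fullmatch(r"gr(a|e)y", "grey") is not None
groy = re.fullmatch(r"gr(a|e)y", "groy") is not None

# "." matches any single character except a newline.
dot_hits = re.findall(r"b.t", "bat bit b\nt but")

# "*" allows zero occurrences, "+" requires at least one.
star_zero = re.fullmatch(r"ab*c", "ac") is not None
plus_zero = re.fullmatch(r"ab+c", "ac") is not None
```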

Escaping Special Characters

  • \. - Literal dot
  • \* - Literal asterisk
  • \+ - Literal plus
  • \? - Literal question mark
  • \\ - Literal backslash
  • \( \) - Literal parentheses
  • \[ \] - Literal square brackets
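
When the text to match comes from a variable, escaping by hand is error-prone. In Python, re.escape does it for you. A minimal sketch, with a made-up input string:

```python
import re

# Hypothetical literal text that happens to contain metacharacters.
literal = "3.14 (approx)"

# re.escape backslash-escapes the characters that are special in a pattern.
pattern = re.escape(literal)

# The escaped pattern matches the literal text...
matches_literal = re.search(pattern, "pi is 3.14 (approx)") is not None

# ...but not text where "." would have had to act as a wildcard.
matches_other = re.search(pattern, "pi is 3x14 (approx)") is not None
```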

Character Classes

Basic Character Classes

  • [abc] - Matches ‘a’, ‘b’, or ‘c’
  • [a-z] - Matches any lowercase letter
  • [A-Z] - Matches any uppercase letter
  • [0-9] - Matches any digit
  • [a-zA-Z] - Matches any letter
  • [a-zA-Z0-9] - Matches any alphanumeric character

Negated Character Classes

  • [^abc] - Matches any character except ‘a’, ‘b’, or ‘c’
  • [^0-9] - Matches any non-digit character
  • [^a-zA-Z] - Matches any non-letter character

Predefined Character Classes

  • \d - Matches any digit (equivalent to [0-9] in ASCII mode)
  • \D - Matches any non-digit (equivalent to [^0-9] in ASCII mode)
  • \w - Matches any word character (equivalent to [a-zA-Z0-9_] in ASCII mode)
  • \W - Matches any non-word character (equivalent to [^a-zA-Z0-9_] in ASCII mode)
  • \s - Matches any whitespace character (space, tab, newline, etc.)
  • \S - Matches any non-whitespace character
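
The character classes above can be tried out with re.findall. This is a small Python sketch with invented sample text. The last two lines show that, for Python str patterns, \d is Unicode-aware unless re.ASCII is set.

```python
import re

text = "Order 66 shipped to unit_7 on day 3"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters (note the "_")

# Negated class: runs of characters that are neither vowels nor whitespace.
non_vowels = re.findall(r"[^aeiou\s]+", "a big cat")

# "\u0663" is ARABIC-INDIC DIGIT THREE.
unicode_digit = re.findall(r"\d", "\u0663")
ascii_digit = re.findall(r"\d", "\u0663", re.ASCII)
```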

Quantifiers

Basic Quantifiers

  • * - 0 or more (greedy)
  • + - 1 or more (greedy)
  • ? - 0 or 1 (greedy)
  • {n} - Exactly n times
  • {n,} - n or more times
  • {n,m} - Between n and m times (inclusive)

Non-Greedy (Lazy) Quantifiers

  • *? - 0 or more (non-greedy)
  • +? - 1 or more (non-greedy)
  • ?? - 0 or 1 (non-greedy)
  • {n,}? - n or more times (non-greedy)
  • {n,m}? - Between n and m times (non-greedy)
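
The difference between greedy and lazy quantifiers is easiest to see side by side. A minimal Python sketch with a made-up input:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: ".*" grabs as much as possible, so one match spans both tags.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: ".*?" stops at the first place the rest of the pattern can match.
lazy = re.findall(r"<b>.*?</b>", html)

# The same applies to counted quantifiers.
greedy_count = re.match(r"a{2,3}", "aaaa").group()
lazy_count = re.match(r"a{2,3}?", "aaaa").group()
```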

Anchors and Boundaries

Position Anchors

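  • ^ - Start of string (or of each line in multiline mode)
  • $ - End of string (or of each line in multiline mode)
  • \A - Start of string (absolute, unaffected by multiline mode)
  • \Z - End of string, or just before a final newline (PCRE, Java, .NET; in Python, \Z matches only at the very end)
  • \z - End of string (absolute; not supported in every flavor)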

Word Boundaries

  • \b - Word boundary
  • \B - Non-word boundary
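
A short Python sketch of anchors and word boundaries, using an invented sample string. Note that this shows Python's behavior specifically: $ also matches just before a trailing newline, while Python's \Z does not.

```python
import re

text = "cat concat cat\ncatalog"

# \b keeps "cat" from matching inside "concat" or "catalog".
whole_words = re.findall(r"\bcat\b", text)

# With re.MULTILINE, ^ matches at the start of every line.
line_starts = re.findall(r"^cat", text, re.MULTILINE)

# \A matches only at the very start, even with re.MULTILINE.
string_start = re.findall(r"\Acat", text, re.MULTILINE)

# $ tolerates a single trailing newline; Python's \Z does not.
dollar_before_newline = re.search(r"cat$", "cat\n") is not None
big_z_before_newline = re.search(r"cat\Z", "cat\n") is not None
```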

Groups and Capturing

Grouping

  • (abc) - Capturing group
  • (?:abc) - Non-capturing group
  • (?<name>abc) - Named capturing group (Python: (?P<name>abc))

Backreferences

  • \1 - References first capturing group
  • \2 - References second capturing group
  • \k<name> - References named capturing group (Python: (?P=name))
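
Groups and backreferences can be sketched in Python as follows. Python's re module spells named groups (?P<name>...) and named backreferences (?P=name). The sample strings are invented for the example.

```python
import re

# Numbered backreference: find a word that is immediately repeated.
doubled = re.search(r"\b(\w+)\s+\1\b", "this is is a test")
repeated_word = doubled.group(1)

# Named group plus named backreference: the closing quote must equal the opening one.
quoted = re.search(r"(?P<quote>['\"])(?P<body>.*?)(?P=quote)", "say 'hi' now")
body = quoted.group("body")

# findall returns the group's content when the pattern has a capturing group,
# and the whole match when the group is non-capturing.
capturing = re.findall(r"(ab)+", "ababab ab")
non_capturing = re.findall(r"(?:ab)+", "ababab ab")
```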

Lookahead and Lookbehind

Lookahead

  • (?=abc) - Positive lookahead
  • (?!abc) - Negative lookahead

Lookbehind

  • (?<=abc) - Positive lookbehind
  • (?<!abc) - Negative lookbehind
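
Lookarounds check the surrounding text without including it in the match. A minimal Python sketch with a made-up string. The negative lookahead also excludes a following digit, since otherwise the engine would backtrack and match a shorter run of digits.

```python
import re

text = "cost: $40, weight: 12kg, discount: $5"

# Positive lookbehind: digits preceded by "$".
dollars = re.findall(r"(?<=\$)\d+", text)

# Positive lookahead: digits followed by "kg".
kilos = re.findall(r"\d+(?=kg)", text)

# Negative lookahead: digit runs not followed by "kg" (or by another digit).
not_kilos = re.findall(r"\d+(?!\d|kg)", text)

# Negative lookbehind: digit runs not preceded by "$" (or by another digit).
not_dollars = re.findall(r"(?<![$\d])\d+", text)
```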

Common Patterns

Email Validation (Basic)

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
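
A minimal sketch of using the email pattern above for validation in Python. The helper name looks_like_email is hypothetical, and re.fullmatch is used so the whole string must match. This is a basic shape check only, not full address validation.

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def looks_like_email(candidate: str) -> bool:
    """Return True if the whole string has the basic user@domain.tld shape."""
    return EMAIL.fullmatch(candidate) is not None
```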

Phone Number (US Format)

\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

URL/Website

https?://(?:[-\w.])+(?:\:[0-9]+)?(?:/(?:[\w/_.])*(?:\?(?:[\w&=%.])*)?(?:\#(?:[\w.])*)?)?

IP Address

\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
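
The IPv4 pattern above can be used to pull addresses out of free text. A small Python sketch, where the log line is invented for the example:

```python
import re

IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)

log_line = "client 192.168.1.10 failed, retry from 10.0.0.256 and 8.8.8.8"

# All groups are non-capturing, so findall returns the full matches.
# 10.0.0.256 is skipped because 256 is not a valid octet.
addresses = IPV4.findall(log_line)
```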

Date (MM/DD/YYYY)

(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/\d{4}
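
The date pattern above checks the format only, so it accepts impossible dates such as 02/31/2024. One way to close that gap, sketched in Python with a hypothetical helper is_valid_date, is to follow the regex with a calendar check from the standard datetime module:

```python
import re
from datetime import datetime

DATE = re.compile(r"(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])/\d{4}")

def is_valid_date(candidate: str) -> bool:
    """Check MM/DD/YYYY format first, then whether the date exists."""
    if DATE.fullmatch(candidate) is None:
        return False
    try:
        datetime.strptime(candidate, "%m/%d/%Y")
    except ValueError:
        # Format was fine, but the day does not exist in that month.
        return False
    return True
```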

Time (12-hour format)

(1[0-2]|0?[1-9]):[0-5][0-9]\s?(AM|PM|am|pm)

Credit Card Number

\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}

Hex Color Code

#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})

Password Strength (8+ chars, uppercase, lowercase, digit; only letters, digits, and @$!%*?& allowed)

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d@$!%*?&]{8,}$
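
A minimal Python sketch of the password pattern above. The helper name is_strong_password and the sample passwords are made up. Each lookahead enforces one requirement, and the final character class limits which characters are allowed at all.

```python
import re

PASSWORD = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d@$!%*?&]{8,}$")

def is_strong_password(candidate: str) -> bool:
    """Return True if the candidate satisfies every rule in the pattern."""
    # fullmatch also rejects a trailing newline, which "$" alone would tolerate.
    return PASSWORD.fullmatch(candidate) is not None
```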

HTML Tag

<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)

Flags/Modifiers

Common Flags

  • i - Case insensitive
  • g - Global (find all matches)
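  • m - Multiline (^ and $ match at the start and end of each line)
  • s - Dot matches newline
  • x - Extended (whitespace in the pattern is ignored and comments are allowed)
  • u - Unicode (pattern and input are treated as Unicode text)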

Usage Examples

  • /pattern/i - Case insensitive match
  • /pattern/g - Global match (find all)
  • /pattern/gi - Global and case insensitive
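
The /pattern/flags notation above is the literal syntax used by languages such as JavaScript and Perl. Python has no regex literals. Flags are passed as constants from the re module, and "global" matching is done with re.findall or re.finditer. A minimal sketch with invented log text:

```python
import re

text = "Error: disk full\nerror: retry\nWARNING: ok"

# i: case-insensitive. re.search returns only the first match.
first = re.search(r"error", text, re.IGNORECASE).group()

# g + i: findall returns every match.
all_hits = re.findall(r"error", text, re.IGNORECASE)

# m: ^ matches at the start of each line.
line_heads = re.findall(r"^\w+", text, re.MULTILINE)

# s: "." also matches the newline between "full" and "error".
without_dotall = re.search(r"full.error", text) is not None
with_dotall = re.search(r"full.error", text, re.DOTALL) is not None

# x: whitespace in the pattern is ignored and "#" starts a comment.
entry = re.compile(r"""
    ^(\w+)   # log level
    :\s*     # separator
    (.*)$    # message
""", re.VERBOSE | re.MULTILINE)
entries = entry.findall(text)
```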

Examples by Use Case

Text Validation

  • Only letters: ^[a-zA-Z]+$
  • Only numbers: ^[0-9]+$
  • Alphanumeric: ^[a-zA-Z0-9]+$
  • No special characters: ^[a-zA-Z0-9\s]+$

Text Extraction

  • Extract words: \b\w+\b
  • Extract numbers: \d+
  • Extract quoted text: "([^"]*)"
  • Extract between parentheses: \(([^)]*)\)
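
The extraction patterns above can be sketched in Python with re.findall, which returns the captured group when the pattern contains one. The sample sentence is invented.

```python
import re

text = 'She said "hello" and "bye" (twice) in 2024, room 12.'

words = re.findall(r"\b\w+\b", text)
numbers = re.findall(r"\d+", text)
quoted = re.findall(r'"([^"]*)"', text)       # text between double quotes
in_parens = re.findall(r"\(([^)]*)\)", text)  # text between parentheses
```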

Text Replacement

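  • Remove extra spaces: \s+ → " " (replace each run of whitespace with a single space)
  • Remove leading/trailing spaces: ^\s+|\s+$ → "" (replace with the empty string)
  • Convert to camelCase: [-_\s]+(\w) → $1.toUpperCase() (replace each separator run and the character after it with that character in uppercase)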
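
The replacements in this list can be sketched in Python with re.sub. The function names are hypothetical. For the camelCase case, re.sub accepts a function as the replacement, which plays the role of the toUpperCase() call.

```python
import re

def collapse_spaces(s: str) -> str:
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", s)

def trim(s: str) -> str:
    """Remove leading and trailing whitespace."""
    return re.sub(r"^\s+|\s+$", "", s)

def to_camel_case(s: str) -> str:
    """Drop separators and uppercase the character that follows each one."""
    return re.sub(r"[-_\s]+(\w)", lambda m: m.group(1).upper(), s)
```

For plain trimming, Python's built-in str.strip() does the same job without a regex.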

Tips and Best Practices

Performance Tips

  • Use non-capturing groups (?:) when you don’t need to capture
  • Be specific with character classes instead of using .
  • Use anchors ^ and $ when matching entire strings
  • Use lazy quantifiers *?, +? when you want the shortest match; they change what is matched and are not automatically faster

Common Mistakes to Avoid

  • Forgetting to escape special characters in literals
  • Using greedy quantifiers when lazy ones are needed
  • Not accounting for edge cases in validation patterns
  • Overcomplicating patterns when simpler ones work

Testing Tools

  • RegExr (regexr.com)
  • Regex101 (regex101.com)
  • RegexPal (regexpal.com)
  • Built-in language tools (e.g., Python’s re module)