Regular Expressions

Regex Fundamentals

Delimiters: Characters used to mark the beginning and end of a pattern. Ex: the forward slashes in /pattern/flags
Literal Characters: Characters without special meaning. Ex: a to z, 0 to 9
Meta-Characters: Characters with special meaning. Ex: . ^ $ * + ? {} [] \ | ()
`.` is a metacharacter that matches any single character except line terminators (\n, \r, \u2028, \u2029).
`\` is a metacharacter that escapes special characters or introduces special sequences (\n, \t, etc.).
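
As a minimal sketch of the two metacharacters above (the sample strings and variable names are illustrative only), the following compares an unescaped dot with an escaped one:

```javascript
// The dot matches any single character except line terminators.
const dotMatches = /a.c/.test("abc");        // true
const dotNewline = /a.c/.test("a\nc");       // false: "\n" is a line terminator

// Escaping the dot with a backslash makes it a literal period.
const literalDot = /a\.c/.test("a.c");       // true
const literalDotMiss = /a\.c/.test("abc");   // false: "b" is not a period

console.log(dotMatches, dotNewline, literalDot, literalDotMiss);
```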
Flags/Modifiers
- `g` (global): Searches for all occurrences of the pattern within the text, rather than just the first one.
- `i` (case-insensitive): Ignores case differences when matching letters.
- `m` (multi-line): Changes the behavior of ^ and $ anchors to match the beginning and end of each line within the text, rather than just the beginning and end of the entire text.
- `s` (dotAll): Allows the dot (.) to match newline characters (\n), which it doesn't do by default.
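- `u` (unicode): Treats the pattern and the input as sequences of Unicode code points rather than UTF-16 code units, and enables escapes such as `\u{...}` and `\p{...}`.
- `y` (sticky): Attempts a match only at the position stored in the regex's `lastIndex` property, without searching further ahead.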
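
The flags listed above can be tried one at a time. This is a small sketch with an invented sample string, not an exhaustive demonstration:

```javascript
const text = "Cat\ncat\nCAT";

// g + i: find every occurrence, ignoring case.
const allCats = text.match(/cat/gi);          // ["Cat", "cat", "CAT"]

// m: ^ and $ also match at the start and end of each line.
const exactLines = text.match(/^cat$/gm);     // ["cat"]

// s: the dot also matches "\n".
const withDotAll = /Cat.cat/s.test(text);     // true
const withoutDotAll = /Cat.cat/.test(text);   // false

// u: the dot consumes a whole code point instead of one UTF-16 code unit.
const emoji = "\u{1F600}";
const withUnicode = /^.$/u.test(emoji);       // true
const withoutUnicode = /^.$/.test(emoji);     // false

// y: match only at lastIndex, never further ahead.
const sticky = /cat/y;
sticky.lastIndex = 4;
const stickyHit = sticky.test(text);          // true: "cat" starts at index 4
sticky.lastIndex = 5;
const stickyMiss = sticky.test(text);         // false: no "cat" at index 5

console.log(allCats, exactLines, withDotAll, withUnicode, stickyHit, stickyMiss);
```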
Character Classes
- Defining sets of characters to match [abc].
- Ranges within character classes [a-z] or [0-9].
- Combined ranges within character classes [a-z0-9].
- Negated character classes [^abc].
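
A short sketch of the four character-class forms above, using made-up input strings:

```javascript
// Set: any one of a, b, or c.
const setMatch = "cab".match(/[abc]/g);         // ["c", "a", "b"]

// Range: lowercase letters only.
const rangeMatch = "a1B2".match(/[a-z]/g);      // ["a"]

// Combined ranges: lowercase letters or digits.
const combined = "a1B2".match(/[a-z0-9]/g);     // ["a", "1", "2"]

// Negated class: anything except a, b, or c.
const negated = "abcd".match(/[^abc]/g);        // ["d"]

console.log(setMatch, rangeMatch, combined, negated);
```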
Quantifiers
- `*` matches 0 or more occurrences.
- `+` matches 1 or more occurrences.
- `?` matches 0 or 1 occurrence.
- `{n}` matches exactly `n` occurrences.
- `{n,}` matches `n` or more occurrences.
- `{n,m}` matches between `n` and `m` occurrences.
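
One way to see the quantifiers above in action is to anchor each pattern to the whole string so that the repetition count is the only thing being tested. The inputs are illustrative:

```javascript
const star = /^ab*$/.test("a");              // true: zero "b"s is allowed
const plus = /^ab+$/.test("a");              // false: at least one "b" is required
const optional = /^colou?r$/.test("color");  // true: the "u" may be absent
const exact = /^\d{3}$/.test("123");         // true: exactly three digits
const atLeast = /^\d{2,}$/.test("1");        // false: fewer than two digits
const between = /^\d{2,4}$/.test("12345");   // false: more than four digits

console.log(star, plus, optional, exact, atLeast, between);
```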
Anchors
- `^` asserts the position at the start of the string.
- `$` asserts the position at the end of the string.
- The `m` (multi-line) flag is not an anchor itself, but it changes `^` and `$` so that they also match at the start and end of each line rather than only at the start and end of the entire text.
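
The difference the `m` flag makes to the anchors can be sketched with a three-line string (the sample text is an assumption for illustration):

```javascript
const lines = "one\ntwo\nthree";

// Without m, ^ matches only at the very start and $ only at the very end.
const startOnly = lines.match(/^\w+/g);        // ["one"]
const endOnly = lines.match(/\w+$/g);          // ["three"]

// With m, ^ and $ match at every line boundary.
const eachLineStart = lines.match(/^\w+/gm);   // ["one", "two", "three"]
const eachLineEnd = lines.match(/\w+$/gm);     // ["one", "two", "three"]

console.log(startOnly, endOnly, eachLineStart, eachLineEnd);
```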
Predefined/Shorthand Character Classes
- `\d` matches any digit (equivalent to [0-9]).
- `\D` matches any non-digit.
- `\w` matches any word character (equivalent to [A-Za-z0-9_]).
- `\W` matches any non-word character.
- `\s` matches any whitespace character (spaces, tabs, line breaks).
- `\S` matches any non-whitespace character.
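
To make the shorthand classes above concrete, here is a small sketch that runs each one against the same invented sample string:

```javascript
const sample = "Room 42, floor_3!";

const digits = sample.match(/\d+/g);       // ["42", "3"]
const nonDigits = sample.match(/\D+/g);    // ["Room ", ", floor_", "!"]
const words = sample.match(/\w+/g);        // ["Room", "42", "floor_3"]
const nonWords = sample.match(/\W+/g);     // [" ", ", ", "!"]
const spaceCount = sample.match(/\s/g).length;  // 2
const nonSpaces = sample.match(/\S+/g);    // ["Room", "42,", "floor_3!"]

console.log(digits, nonDigits, words, nonWords, spaceCount, nonSpaces);
```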
Alternation
- Using the vertical bar `|` to specify alternatives
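
A brief sketch of alternation follows. It also illustrates that `|` has the lowest precedence in a pattern, so anchors bind to individual alternatives unless the alternatives are grouped. The test strings are illustrative:

```javascript
// Either alternative may match anywhere in the string.
const anyPet = /cat|dog/.test("hotdog");            // true

// Without grouping, this reads as (^cat) OR (dog$).
const loose = /^cat|dog$/.test("catalog");          // true: it starts with "cat"

// Grouping applies both anchors to the whole alternation.
const strict = /^(?:cat|dog)$/.test("catalog");     // false
const strictHit = /^(?:cat|dog)$/.test("dog");      // true

console.log(anyPet, loose, strict, strictHit);
```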
Groups and Capturing
- Grouping with parentheses `()`.
- Capturing groups `(...)` and numbered backreferences `\1`, `\2`.
- Non-capturing groups `(?:...)`.
- Named capture groups `(?<name>...)`.
- Backreferences to named capture groups `\k<name>`.
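
The grouping constructs above might be exercised as follows. This is a sketch only; the date string and the group names `year`, `month`, and `q` are hypothetical choices for illustration:

```javascript
// Capturing groups are available by index on the match result.
const dateMatch = "2024-05-17".match(/(\d{4})-(\d{2})-(\d{2})/);
const year = dateMatch[1];                        // "2024"
const day = dateMatch[3];                         // "17"

// A numbered backreference matches the same text the group captured.
const doubledLetter = /(\w)\1/.test("hello");     // true: "ll"

// A non-capturing group groups without adding an entry to the result.
const repeated = "abab".match(/(?:ab)+/);
const repeatedLength = repeated.length;           // 1: only the full match

// Named groups are exposed on the groups property.
const named = "2024-05-17".match(/(?<year>\d{4})-(?<month>\d{2})/);
const namedMonth = named.groups.month;            // "05"

// A named backreference requires the same quote character to close.
const quotePattern = /^(?<q>['"]).*\k<q>$/;
const balanced = quotePattern.test("\"hi\"");     // true
const unbalanced = quotePattern.test("\"hi'");    // false

console.log(year, day, doubledLetter, repeatedLength, namedMonth, balanced, unbalanced);
```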
Word Boundaries
- Word boundaries `\b`
- Non-word boundaries `\B`
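
As a quick sketch of the two boundary assertions above (inputs chosen for illustration), `\b` finds "cat" as a whole word while `\B` finds it only inside a longer word:

```javascript
const wholeWord = /\bcat\b/.test("a cat sat");        // true
const insideWord = /\bcat\b/.test("concatenate");     // false

const embedded = /\Bcat\B/.test("concatenate");       // true: letters on both sides
const notEmbedded = /\Bcat\B/.test("a cat sat");      // false

console.log(wholeWord, insideWord, embedded, notEmbedded);
```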
Lookaheads and Lookbehinds (Lookarounds)
- Positive lookahead `pattern1(?=pattern2)`
- Negative lookahead `pattern1(?!pattern2)`
- Positive lookbehind `(?<=pattern1)pattern2`
- Negative lookbehind `(?<!pattern1)pattern2`
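
The four lookaround forms above can be sketched like this. The strings are invented, and the lookbehind examples assume an engine that supports lookbehind, as current Node.js and modern browsers do:

```javascript
// Positive lookahead: digits only when followed by "px".
const pxValues = "10px 20em 30px".match(/\d+(?=px)/g);       // ["10", "30"]

// Negative lookahead: "q" only when not followed by "u".
const qWithoutU = /q(?!u)/.test("Iraq");                     // true
const qWithU = /q(?!u)/.test("quit");                        // false

// Positive lookbehind: digits only when preceded by "$".
const dollarAmounts = "cost $42 or 17".match(/(?<=\$)\d+/g); // ["42"]

// Negative lookbehind: whole numbers not preceded by "$".
const plainNumbers = "cost $42 or 17".match(/(?<!\$)\b\d+/g); // ["17"]

console.log(pxValues, qWithoutU, qWithU, dollarAmounts, plainNumbers);
```

In each case the lookaround text is checked but not included in the match.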
Debugging Regular Expressions
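- The current input position: the index in the text at which the engine is attempting a match. Lookarounds test the text around this position without consuming any characters.
- A lookahead placed before the main pattern, `(?=pattern2)pattern1`: both patterns must match starting from the same position.
- A lookbehind placed after the main pattern, `pattern1(?<=pattern2)`: the text ending where `pattern1` ends must also match `pattern2`.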
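
Because lookarounds check the current position without consuming input, a lookahead placed before the main pattern and a lookbehind placed after it both act as extra conditions on the same stretch of text. A minimal sketch, with hypothetical patterns and inputs:

```javascript
// Lookahead first: from position 0, require a digit somewhere ahead,
// then match six or more lowercase letters or digits from that same position.
const needsDigit = /^(?=.*\d)[a-z\d]{6,}$/;
const withDigit = needsDigit.test("abc123");      // true
const withoutDigit = needsDigit.test("abcdef");   // false: the lookahead fails

// Lookbehind last: after \w+ has matched a word, require that the
// text ending at the current position is "ing".
const ingWords = "running fast sing".match(/\b\w+(?<=ing)\b/g);  // ["running", "sing"]

console.log(withDigit, withoutDigit, ingWords);
```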
Substitutions or Replacements
- `$&` Inserts the matched substring.
- `` $` `` Inserts the portion of the string that precedes the matched substring.
- `$'` Inserts the portion of the string that follows the matched substring.
- `$n` Inserts the nth (1-indexed) capturing group, where `n` is a positive integer less than 100.
- `$<name>` Inserts the named capturing group, where `name` is the group name.
- `$$` Inserts a literal `$`.
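
Each replacement pattern above can be tried with `String.prototype.replace`. The following is a sketch with illustrative inputs; the group names `first` and `last` are hypothetical:

```javascript
// $& is the matched substring itself.
const wrapped = "abc".replace(/b/, "[$&]");            // "a[b]c"

// $` is the text before the match; $' is the text after it.
const withBefore = "abc".replace(/b/, "$`");           // "aac"
const withAfter = "abc".replace(/b/, "$'");            // "acc"

// $n refers to the nth capturing group.
const swapped = "John Smith".replace(/(\w+) (\w+)/, "$2, $1");   // "Smith, John"

// $<name> refers to a named capturing group.
const namedSwap = "John Smith".replace(
  /(?<first>\w+) (?<last>\w+)/,
  "$<last>, $<first>"
);                                                     // "Smith, John"

// $$ produces a single literal dollar sign.
const priced = "cost: 5".replace(/\d+/, "$$$&");       // "cost: $5"

console.log(wrapped, withBefore, withAfter, swapped, namedSwap, priced);
```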
Regex in JavaScript