MySQL Regular Expressions
Overview
MySQL provides regular expression support through the REGEXP and RLIKE operators for pattern matching in string data. Regular expressions are powerful tools for complex pattern matching.
Regular Expression Basics
- Pattern Matching: Match specific patterns in text
- Wildcards: Special characters for flexible matching
- Anchors: Position of match in string
- Character Classes: Sets of characters to match
REGEXP Operator
Basic Syntax
-- REGEXP (case-sensitive by default)
SELECT 'MySQL' REGEXP 'My'; -- TRUE
SELECT 'mysql' REGEXP 'My'; -- FALSE (case-sensitive)
-- RLIKE is synonym for REGEXP
SELECT 'Hello' RLIKE 'Hell'; -- TRUE
-- NOT REGEXP for negative match
SELECT 'Hello' NOT REGEXP 'World'; -- TRUEREGEXP vs LIKE
-- LIKE uses simple wildcards
SELECT name FROM products WHERE name LIKE '%phone%';
-- Matches: iPhone, smartphone, telephone
-- REGEXP uses powerful patterns
SELECT name FROM products WHERE name REGEXP 'phone|phone';
-- Matches: iPhone, smartphone, telephone, phone
-- Multiple patterns with REGEXP
SELECT name FROM products WHERE name REGEXP 'phone|laptop|tablet';
-- Matches: iPhone, laptop, tablet
-- Equivalent with LIKE would require OR
SELECT name FROM products
WHERE name LIKE '%phone%'
OR name LIKE '%laptop%'
OR name LIKE '%tablet%';Regular Expression Metacharacters
Common Metacharacters
| Pattern |-----------------|------------------|----------------| | ^ | Start of string | $ | End of string | . | Any single character | | | OR operator | * | Zero or more occurrences | + | One or more occurrences | ? | Zero or one occurrence | [ ] | Character class | [^ ] | Negated character class | \ | Escape character
Examples
-- Start of string
SELECT 'Hello World' REGEXP '^Hello'; -- TRUE
SELECT 'Hello World' REGEXP '^World'; -- FALSE
-- End of string
SELECT 'Hello World' REGEXP 'World$'; -- TRUE
SELECT 'Hello World' REGEXP 'Hello$'; -- FALSE
-- Any character
SELECT 'cat' REGEXP 'c.t'; -- TRUE
SELECT 'cut' REGEXP 'c.t'; -- TRUE
SELECT 'ct' REGEXP 'c.t'; -- FALSE (needs one character)
-- OR operator
SELECT 'cat' REGEXP 'cat|dog'; -- TRUE
SELECT 'dog' REGEXP 'cat|dog'; -- TRUE
SELECT 'bird' REGEXP 'cat|dog'; -- FALSE
-- Repetition
SELECT 'ca' REGEXP 'ca*'; -- TRUE (zero a's)
SELECT 'caa' REGEXP 'ca*'; -- TRUE (two a's)
SELECT 'caaa' REGEXP 'ca+'; -- TRUE (one or more a's)
SELECT 'cb' REGEXP 'ca+'; -- FALSE (no a)
-- Optional
SELECT 'color' REGEXP 'colou?r'; -- TRUE
SELECT 'colour' REGEXP 'colou?r'; -- TRUECharacter Classes
Predefined Classes
| Class |------------|------------------| | [[:alnum:]] | Alphanumeric | [[:alpha:]] | Alphabetic | [[:digit:]] | Digits | [[:lower:]] | Lowercase | [::upper:]] | Uppercase | [[:space:]] | Whitespace | [::punct:]] | Punctuation
Examples
-- Alphanumeric
SELECT 'user123' REGEXP '[[:alnum:]]+'; -- TRUE
SELECT 'user_123' REGEXP '[[:alnum:]]+'; -- TRUE (matches part)
-- Digits
SELECT '123' REGEXP '[[:digit:]]+'; -- TRUE
SELECT 'abc' REGEXP '[[:digit:]]+'; -- FALSE
-- Alphabetic
SELECT 'hello' REGEXP '[[:alpha:]]+'; -- TRUE
SELECT 'hello123' REGEXP '[[:alpha:]]+'; -- TRUE (matches hello)
-- Custom character class
SELECT 'cat' REGEXP '[abc]at'; -- TRUE
SELECT 'dog' REGEXP '[abc]at'; -- FALSE
-- Range in character class
SELECT 'cat' REGEXP '[a-z]at'; -- TRUE
SELECT 'dog' REGEXP '[a-z]og'; -- TRUE
SELECT '123' REGEXP '[0-9]+'; -- TRUE
-- Negated character class
SELECT 'cat' REGEXP '[^x]at'; -- TRUE
SELECT 'xat' REGEXP '[^x]at'; -- FALSEPractical Examples
Email Validation
-- Simple email pattern
SELECT * FROM users
WHERE email REGEXP '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$';
-- More specific pattern
SELECT * FROM users
WHERE email REGEXP '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$';Phone Number Matching
-- Match US phone numbers
SELECT * FROM contacts
WHERE phone REGEXP '^\\([0-9]{3}\\) [0-9]{3}-[0-9]{4}$';
-- Match various phone formats
SELECT * FROM contacts
WHERE phone REGEXP '^[0-9]{3}-[0-9]{3}-[0-9]{4}$|^[0-9]{10}$';Product Codes
-- Match product codes like PROD-123
SELECT * FROM products
WHERE code REGEXP '^[A-Z]{4}-[0-9]{3}$';
-- Match SKU patterns
SELECT * FROM products
WHERE sku REGEXP '[A-Z]{2}[0-9]{4}[A-Z]';Text Search
-- Find words starting with specific letters
SELECT * FROM articles
WHERE content REGEXP '\\b[A-M][a-z]+\\b';
-- Find URLs in text
SELECT * FROM messages
WHERE content REGEXP 'https?://[a-zA-Z0-9./?=&_-]+';
-- Find mentions (@username)
SELECT * FROM tweets
WHERE content REGEXP '@[a-zA-Z0-9_]+';REGEXP_REPLACE Function
MySQL 8.0+ supports REGEXP_REPLACE for substitution.
-- Replace digits with X
SELECT REGEXP_REPLACE('Product123', '[0-9]', 'X');
-- Returns: ProductXXX
-- Remove non-alphanumeric characters
SELECT REGEXP_REPLACE('Item #123!', '[^a-zA-Z0-9]', '');
-- Returns: Item123
-- Format phone numbers
SELECT REGEXP_REPLACE('5551234567', '([0-9]{3})([0-9]{3})([0-9]{4})', '(\\1) \\2-\\3');
-- Returns: (555) 123-4567
-- Replace multiple spaces
SELECT REGEXP_REPLACE('Too many spaces', '[[:space:]]+', ' ');
-- Returns: Too many spacesREGEXP_LIKE / REGEXP_LIKE
Simple pattern matching without full regex support.
-- REGEXP_LIKE for simple patterns
SELECT name FROM products WHERE name REGEXP_LIKE 'phone%';
-- Similar to LIKE but uses REGEXP engine
-- Case-insensitive matching
SELECT name FROM products WHERE name REGEXP_LIKE 'phone%';Performance Considerations
REGEXP Performance
-- REGEXP can be slow on large datasets
-- Consider using indexed columns first
-- Efficient pattern
SELECT * FROM users
WHERE email REGEXP '^[a-z]+@example\\.com$'
AND email LIKE '%@example.com%';
-- LIKE uses index if possible, REGEXP refines
-- Avoid REGEXP with leading wildcards
-- Poor: (can't use index)
SELECT * FROM products WHERE name REGEXP '.*phone';
-- Better: (can use index)
SELECT * FROM products WHERE name LIKE 'phone%' OR name LIKE '%phone';Alternative Approaches
-- Use SUBSTRING for fixed patterns
SELECT * FROM products
WHERE code LIKE 'PROD-%';
-- Use LEFT for prefix matching
SELECT * FROM products
WHERE LEFT(code, 4) = 'PROD';
-- Use FULLTEXT for text search
SELECT * FROM articles
WHERE MATCH(title, content) AGAINST('database' IN NATURAL LANGUAGE MODE);Advanced Patterns
Capturing Groups
-- Extract parts using REGEXP_SUBSTR (MySQL 8.0+)
SELECT
code,
REGEXP_SUBSTR(code, '[A-Z]+') AS letters,
REGEXP_SUBSTR(code, '[0-9]+') AS numbers
FROM products
WHERE code = 'PROD123';
-- Returns: letters=PROD, numbers=123Word Boundaries
-- Match whole words
SELECT * FROM articles
WHERE content REGEXP '\\bdatabase\\b';
-- Match word at start of string
SELECT * FROM articles
WHERE title REGEXP '^\\bMySQL\\b';Repetition Quantifiers
-- Exact match {n}
SELECT 'aaa' REGEXP 'a{3}'; -- TRUE
SELECT 'aa' REGEXP 'a{3}'; -- FALSE
-- Range match {n,m}
SELECT 'aaa' REGEXP 'a{2,4}'; -- TRUE
SELECT 'aaaaa' REGEXP 'a{2,4}'; -- FALSE (too many)
-- Minimum match {n,}
SELECT 'aaa' REGEXP 'a{2,}'; -- TRUE
SELECT 'a' REGEXP 'a{2,}'; -- FALSEREGEXP Case Sensitivity
-- Case-sensitive (default)
SELECT 'MySQL' REGEXP 'mysql'; -- FALSE
-- Case-insensitive (using BINARY)
SELECT BINARY 'MySQL' REGEXP BINARY 'mysql'; -- FALSE
-- Case-insensitive pattern
SELECT 'MySQL' REGEXP '[Mm][Yy][Ss][Qq][Ll]'; -- TRUE
-- Convert to lowercase
SELECT LOWER('MySQL') REGEXP 'mysql'; -- TRUECommon Use Cases
Data Validation
-- Validate postal codes
SELECT * FROM addresses
WHERE postal_code REGEXP '^[0-9]{5}(-[0-9]{4})?$';
-- Validate credit card format
SELECT * FROM payments
WHERE card_number REGEXP '^[0-9]{13,19}$';Data Cleaning
-- Remove special characters
UPDATE products
SET name_clean = REGEXP_REPLACE(name, '[^a-zA-Z0-9 ]', '');
-- Standardize phone formats
UPDATE contacts
SET phone = REGEXP_REPLACE(phone, '[^0-9]', '');Pattern Extraction
-- Extract domain from email
SELECT
email,
REGEXP_SUBSTR(email, '@([a-zA-Z0-9.-]+)\\.[a-z]{2,}$') AS domain
FROM users;Summary
Regular expressions in MySQL provide:
- Powerful Pattern Matching: Complex patterns beyond LIKE
- Flexible Matching: Wildcards, character classes, anchors
- String Functions: REGEXP_REPLACE, REGEXP_SUBSTR (MySQL 8.0+)
- Search Capabilities: Email, phone, URL validation
Consider performance and use appropriate indexing when working with large datasets.
Previous: NULL Values
Next: Transactions