Java Regular Expressions
Regular expressions (Regex) are a powerful string processing tool. They are special character sequences used to define search patterns. In Java, the core functionality for handling regular expressions is located in the java.util.regex package.
What are Regular Expressions?
Regular expressions can be used to:
- Validate: Check if a string conforms to a certain format (such as email, phone number).
- Search: Find all substrings in a text that match a specific pattern.
- Replace: Find matching substrings and replace them with other content.
- Split: Split a string based on a pattern.
Core Classes in the java.util.regex Package
- Pattern class: Represents a compiled regular expression. A Pattern object has no public constructor and must be created through the static method Pattern.compile().
- Matcher class: A regex matching engine. It performs matching operations on an input string by interpreting a Pattern. A Matcher object is obtained by calling pattern.matcher(inputString).
Basic Matching Process
Using regular expressions typically follows these three steps:
- Create a Pattern object using Pattern.compile(regex).
- Create a Matcher object using pattern.matcher(input).
- Use methods of the Matcher object (such as find(), matches()) to perform matching.
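The three steps can be sketched as follows. This is a minimal illustration rather than a canonical implementation; the class name BasicMatchDemo, the helper firstNumber, and the sample input are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BasicMatchDemo {
    // Hypothetical helper: returns the first run of digits in the input,
    // or null if the input contains no digits.
    static String firstNumber(String input) {
        Pattern pattern = Pattern.compile("[0-9]+");    // step 1: compile the regex
        Matcher matcher = pattern.matcher(input);       // step 2: bind it to the input
        return matcher.find() ? matcher.group() : null; // step 3: run the match
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("Order 42 shipped")); // prints 42
        System.out.println(firstNumber("no digits here"));   // prints null
    }
}
```

A compiled Pattern can be reused for many inputs, so when the same expression is applied repeatedly it is usually worth compiling it once and creating a new Matcher per input.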
Note: In Java string literals, the backslash \ is an escape character, so to use a \ in a regular expression, you need to write \\ in the string.
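As a small illustration of the escaping rule, the sketch below writes the regex \d+ as "\\d+" and the regex for a single literal backslash as "\\\\". The class and method names (BackslashDemo, isAllDigits, containsBackslash) are invented for this example.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // The regex \d+ (one or more digits) is written "\\d+" in Java source.
    static boolean isAllDigits(String s) {
        return Pattern.matches("\\d+", s);
    }

    // The regex \\ matches one literal backslash. Each of its two
    // backslashes must be doubled in Java source, giving "\\\\".
    static boolean containsBackslash(String s) {
        return Pattern.compile("\\\\").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2024"));           // prints true
        System.out.println(isAllDigits("20a4"));           // prints false
        System.out.println(containsBackslash("C:\\temp")); // prints true
    }
}
```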
Common Matcher Methods
- matches(): Attempts to match the entire input string against the pattern. Returns true only if the entire string matches completely.
- find(): Attempts to find the next subsequence of the input string that matches the pattern. Each call continues searching from where the last match ended.
- lookingAt(): Attempts to match the pattern from the beginning of the input string. Returns true if the beginning matches, without requiring the entire string to match.
- group(): Returns the substring matched by the last matching operation (such as find()).
- start() / end(): Return the start index and end index (exclusive) of the last matched substring.
- replaceAll(replacement): Replaces all matching substrings.
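The following sketch exercises each of these methods on one sample string so their differences are visible side by side. The class name MatcherMethodsDemo, the helper describeMatches, and the input text are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // Compiled once and reused; matches one or more digits.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: records every match as "text@start-end"
    // using find(), group(), start() and end().
    static List<String> describeMatches(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) { // each call resumes after the previous match
            found.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "12 apples and 345 pears";
        // matches() requires the whole input to match, so this is false.
        System.out.println(DIGITS.matcher(input).matches());
        // lookingAt() only requires a match at the beginning, so this is true.
        System.out.println(DIGITS.matcher(input).lookingAt());
        System.out.println(describeMatches(input));                // [12@0-2, 345@14-17]
        System.out.println(DIGITS.matcher(input).replaceAll("#")); // # apples and # pears
    }
}
```

Note that group(), start() and end() throw IllegalStateException if no match has been attempted or the last attempt failed, which is why they are only called inside the find() loop here.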
Regex Methods in the String Class
For convenience, the String class also has some built-in methods that directly support regular expressions.
- boolean matches(String regex): Determines if the entire string matches the given regular expression. Equivalent to Pattern.matches(regex, this).
- String[] split(String regex): Splits the string based on a regular expression.
- String replaceAll(String regex, String replacement): Replaces all substrings matching the regular expression with the specified string.
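A short sketch of the three String methods is shown below. The class name StringRegexDemo, the helper methods, and the chosen patterns are illustrative assumptions, not part of the String API itself.

```java
import java.util.Arrays;

class StringRegexDemo {
    // matches(): true only if the whole string is exactly five digits.
    static boolean isFiveDigitCode(String s) {
        return s.matches("\\d{5}");
    }

    // split(): breaks the string at commas, ignoring whitespace around them.
    static String[] splitOnCommas(String s) {
        return s.split("\\s*,\\s*");
    }

    // replaceAll(): collapses each run of whitespace into a single space.
    static String collapseSpaces(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(isFiveDigitCode("90210"));                   // prints true
        System.out.println(isFiveDigitCode("9021"));                    // prints false
        System.out.println(Arrays.toString(splitOnCommas("a, b ,c")));  // prints [a, b, c]
        System.out.println(collapseSpaces("too   many    spaces"));     // prints too many spaces
    }
}
```

These convenience methods compile the expression on each call, so for an expression used many times a precompiled Pattern may be the better choice.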