Introduction
Regular expressions are sequences of characters that define a search pattern. They are generally used to match or locate other strings or sets of strings. Regular expressions have their own syntax, which differs from ordinary Java code. To use them, import the following package in your code:
import java.util.regex.*;
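As a minimal sketch of how the package is typically used, the example below compiles a pattern with Pattern and applies it to a string through a Matcher. The class name RegexIntro and the helper containsDigits are hypothetical names chosen for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexIntro {
    // Hypothetical helper: true if the input contains at least one run of digits.
    static boolean containsDigits(String input) {
        Pattern pattern = Pattern.compile("\\d+"); // backslash is doubled inside a Java string literal
        Matcher matcher = pattern.matcher(input);
        return matcher.find(); // find() scans for the next subsequence that matches
    }

    public static void main(String[] args) {
        System.out.println(containsDigits("order 42"));   // prints true
        System.out.println(containsDigits("no numbers")); // prints false
    }
}
```

Importing the two classes by name, as done here, is equivalent to the wildcard import shown above for this purpose.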
Capturing Groups
Capturing groups treat several characters as a single unit. They are commonly used in regular expression programs. A group is created by placing characters inside a pair of parentheses. For example, (cat) is a single group containing the characters ‘c’, ‘a’, and ‘t’. Groups can be nested or placed side by side. They are numbered by counting opening parentheses from left to right. Group 0 is special: it always refers to the entire match and is not included in this count. Take the expression ((A)(B(C))). It has four groups, numbered 1 to 4 in this order:
- ((A)(B(C)))
- (A)
- (B(C))
- (C)
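The numbering above can be checked with a short sketch. It matches the expression ((A)(B(C))) against the input "ABC" (an input chosen here for illustration) and reads each group back with Matcher.group(int). The class name GroupDemo and the helper groupsOf are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: returns group 0 (the whole match) followed by
    // every capturing group, or an empty array if the input does not match.
    static String[] groupsOf(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] groups = new String[m.groupCount() + 1]; // groupCount() excludes group 0
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] groups = groupsOf("((A)(B(C)))", "ABC");
        for (int i = 0; i < groups.length; i++) {
            System.out.println("group " + i + ": " + groups[i]);
        }
        // prints:
        // group 0: ABC
        // group 1: ABC
        // group 2: A
        // group 3: BC
        // group 4: C
    }
}
```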
Regular Expression syntax
Subexpression | Matches |
^ | Start of the line (start of the input unless Pattern.MULTILINE is set) |
$ | End of the line (end of the input unless Pattern.MULTILINE is set) |
[…] | Any single character listed in the brackets |
[^…] | Any single character not listed in the brackets |
\A | Beginning of the input |
\Z | End of the input, before a final line terminator if there is one |
a|b | Either a or b |
\w | Word characters (letters, digits, and underscore) |
\W | Nonword characters |
\s | Whitespace characters |
\d | Digits |
\D | Nondigits |
\G | The position where the previous match ended |
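A few of the subexpressions from the table can be tried out with the sketch below. The class name SyntaxDemo, the helper found, and the sample inputs are assumptions made for illustration. Note that each backslash has to be written twice inside a Java string literal.

```java
import java.util.regex.Pattern;

class SyntaxDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // ^ and $ anchor the pattern, so the whole input must be digits.
        System.out.println(found("^\\d+$", "2024"));     // prints true
        System.out.println(found("^\\d+$", "20a24"));    // prints false

        // a|b matches either alternative.
        System.out.println(found("cat|dog", "hotdog"));  // prints true

        // [^...] matches a character that is NOT in the brackets.
        System.out.println(found("[^0-9]", "12345"));    // prints false

        // \s matches whitespace; an underscore is not whitespace.
        System.out.println(found("\\s", "no_spaces"));   // prints false

        // \D matches nondigits; removing them leaves only the digits.
        System.out.println("a1b22c".replaceAll("\\D", "")); // prints 122
    }
}
```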
Commonly used methods of the Matcher class
- public int start() – Returns the start index of the previous match.
- public int start(int group) – Returns the start index of the subsequence captured by the given group during the previous match.
- public int end() – Returns the offset just after the last character of the previous match.
- public int end(int group) – Returns the offset just after the last character of the subsequence captured by the given group during the previous match.
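The four methods above can be illustrated with a small sketch. The pattern (c)(at), the sample sentence, and the names OffsetDemo and offsets are hypothetical choices for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OffsetDemo {
    // Hypothetical helper: returns { start(), end(), start(group), end(group) }
    // for the first match, or null if the regex is not found in the input.
    static int[] offsets(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(), m.end(), m.start(group), m.end(group) };
    }

    public static void main(String[] args) {
        // "cat" begins at index 9 of the sentence; group 2 is "at".
        int[] o = offsets("(c)(at)", "I have a cat and a dog", 2);
        System.out.println("start()  = " + o[0]); // prints start()  = 9
        System.out.println("end()    = " + o[1]); // prints end()    = 12
        System.out.println("start(2) = " + o[2]); // prints start(2) = 10
        System.out.println("end(2)   = " + o[3]); // prints end(2)   = 12
    }
}
```

The end offsets are exclusive, so input.substring(m.start(), m.end()) yields the matched text. These methods throw an IllegalStateException if called before a successful match, which is why the sketch checks find() first.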