Introduction
Metacharacters are part of Java regex (Java regular expressions). The metacharacters covered here are ordinary letters that, when preceded by a backslash, take on a special meaning inside a regular expression.
List of Metacharacters
Metacharacter | Description |
\d | Any digit, short for [0-9] |
\D | Any non-digit, short for [^0-9] |
\s | Whitespace character, short for [ \t\n\x0B\f\r] |
\S | Non-whitespace character, short for [^\s] |
\w | A word character, short for [a-zA-Z_0-9] |
\W | A non-word character, short for [^\w] |
\S+ | One or more non-whitespace characters |
\b | Matches a word boundary |
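The table above can be tried out with a short sketch like the one below. The class name `MetacharacterDemo`, the helper `findAll`, and the sample strings are made up for illustration; only `Pattern` and `Matcher` come from `java.util.regex`. Note that inside a Java string literal each backslash has to be doubled, so the regex `\d` is written as `"\\d"`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharacterDemo {

    // Hypothetical helper: collects every substring of input matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Order 66 shipped on day 7";

        // \d matches one digit at a time.
        System.out.println(findAll("\\d", text));   // [6, 6, 7]

        // \d+ matches runs of one or more digits.
        System.out.println(findAll("\\d+", text));  // [66, 7]

        // \S+ matches runs of non-whitespace characters, i.e. the tokens.
        System.out.println(findAll("\\S+", text));  // [Order, 66, shipped, on, day, 7]

        // \b anchors the match to word boundaries, so only the standalone word matches.
        System.out.println(findAll("\\bday\\b", "day today days")); // [day]
    }
}
```

Compiling the pattern once and looping with `find()` is just one way to do this; `String.split("\\s+")` or `String.replaceAll` accept the same metacharacters.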
Character Classes
When using metacharacters, we often enclose characters within square brackets ‘[ ]’. This creates a character class inside a regular expression, which matches any single one of the enclosed characters.
For example: [ABC] will match any one of the characters A, B, or C. Similarly, [XYZ] will match X, Y, or Z. The regular expression “li[ed]d” will match both “lied” and “lidd” (the latter is not a real word; it is used here only as an example).
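The “li[ed]d” example can be checked with a minimal sketch such as the following. The class name `CharacterClassBasics` and the helper `matchesWhole` are hypothetical; `Pattern.matches` is the standard library call that tests whether the entire input matches the regex.

```java
import java.util.regex.Pattern;

class CharacterClassBasics {

    // Hypothetical helper: true only when the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // [ed] stands for exactly one character, either 'e' or 'd'.
        System.out.println(matchesWhole("li[ed]d", "lied")); // true
        System.out.println(matchesWhole("li[ed]d", "lidd")); // true
        System.out.println(matchesWhole("li[ed]d", "lixd")); // false: 'x' is not in the class
        System.out.println(matchesWhole("li[ed]d", "lid"));  // false: the class needs one character

        // [ABC] on its own matches a single character, not a sequence.
        System.out.println(matchesWhole("[ABC]", "B"));  // true
        System.out.println(matchesWhole("[ABC]", "AB")); // false
    }
}
```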
[^ABC] denotes any character other than A, B, or C. The position of the caret matters: it negates the class only when it comes first, so [ABC^] matches A, B, C, or a literal ^. The following table lists some character classes and their meanings:
Character Class | Description |
[abc] | Matches a, b, or c. |
[^xyz] | Any character other than x, y, or z. |
[a-z] | Any lowercase letter from a to z. ‘-’ denotes a range of acceptable characters. |
[a-cx-z] | Characters from a to c, or from x to z. It therefore matches only a, b, c, x, y, and z. |
[0-9&&[4-8]] | Intersection: the character must lie in 0-9 as well as in 4-8, so the effective range is 4-8. |
[a-z&&[^aeiou]] | Any lowercase letter except a, e, i, o, and u. This can be read as: (lowercase letters) minus {a, e, i, o, u}. |
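One way to see what each class in the table accepts is to filter a string down to the characters the class matches. The sketch below does that; the class name `CharacterClassOperations`, the helper `keepMatching`, and the sample inputs are assumptions made for illustration, not part of any library.

```java
import java.util.regex.Pattern;

class CharacterClassOperations {

    // Hypothetical helper: keeps only the characters of input that the
    // given character class matches, preserving their order.
    static String keepMatching(String charClass, String input) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder kept = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (p.matcher(String.valueOf(c)).matches()) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // A leading caret negates the class; a trailing caret is a literal '^'.
        System.out.println(keepMatching("[^ABC]", "ABCD^")); // D^
        System.out.println(keepMatching("[ABC^]", "ABCD^")); // ABC^

        // Two ranges side by side form a union.
        System.out.println(keepMatching("[a-cx-z]", "abcdwxyz")); // abcxyz

        // && intersects two classes.
        System.out.println(keepMatching("[0-9&&[4-8]]", "0123456789")); // 45678

        // Intersecting with a negated class subtracts it.
        System.out.println(keepMatching("[a-z&&[^aeiou]]", "education")); // dctn
    }
}
```

Testing one character at a time keeps the behaviour of each class easy to read; in real code `input.replaceAll("[^...]", "")` with the negated class is a more compact way to get a similar result.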