Using Metacharacters

Introduction

Metacharacters are part of Java regular expressions (Java regex). The ones covered here are letters of the English alphabet which, when preceded by a backslash, take on a special meaning inside a regular expression instead of matching themselves.

List of Metacharacters

Metacharacter    Description
\d               Any digit; short for [0-9]
\D               Any non-digit; short for [^0-9]
\s               Any whitespace character; short for [ \t\n\x0B\f\r]
\S               Any non-whitespace character; short for [^\s]
\w               A word character; short for [a-zA-Z_0-9]
\W               A non-word character; short for [^\w]
\S+              One or more non-whitespace characters
\b               Matches a word boundary
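
To see a few of these metacharacters at work, here is a minimal sketch using java.util.regex. The class name MetacharacterDemo, the helper findAll, and the sample text are assumptions made for illustration only. Note that inside a Java string literal each backslash must itself be escaped, so the regex \d+ is written as "\\d+".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetacharacterDemo {

    // Hypothetical helper: returns every substring of input matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Room 42, floor 7";

        // \d+ : runs of digits
        System.out.println(findAll("\\d+", text));          // [42, 7]
        // \S+ : runs of non-whitespace characters (the comma stays attached to 42)
        System.out.println(findAll("\\S+", text));          // [Room, 42,, floor, 7]
        // \w+ : runs of word characters (the comma is excluded)
        System.out.println(findAll("\\w+", text));          // [Room, 42, floor, 7]
        // \b : "cat" only where it stands as a whole word
        System.out.println(findAll("\\bcat\\b", "cat concat cats")); // [cat]
    }
}
```

The last call shows the word boundary: "cat" inside "concat" and "cats" is skipped because a word character sits directly next to it.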

 

Character Classes

When using metacharacters, we often use ‘[  ]’ and enclose characters within the brackets. This creates a character class inside a regular expression.

For example, [ABC] matches any one of the characters A, B, or C. Similarly, [XYZ] matches X, Y, or Z. The regular expression “li[ed]d” matches both “lied” and “lidd” (the latter is not a real word; it is used here only as an example).
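
The “li[ed]d” example can be checked with a short sketch like the one below. The class name CharacterClassDemo and the helper matchesWhole are assumptions for illustration; Pattern.matches tests whether the whole input matches the expression.

```java
import java.util.regex.Pattern;

public class CharacterClassDemo {

    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String regex = "li[ed]d";
        String[] words = {"lied", "lidd", "lid", "lieed", "liad"};

        for (String word : words) {
            // The class [ed] stands for exactly one character: either e or d.
            System.out.println(word + " -> " + matchesWhole(regex, word));
        }
        // lied -> true
        // lidd -> true
        // lid -> false
        // lieed -> false
        // liad -> false
    }
}
```

A character class always consumes exactly one character, which is why “lid” (too short) and “lieed” (too long) are both rejected.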

[^ABC] denotes any character other than A, B, or C. The position of the caret matters: when it is moved to the end, the meaning changes completely, and [ABC^] matches A, B, C, or a literal ^. The following table illustrates some character classes and their meanings:

Character Class     Description
[abc]               Matches a, b, or c.
[^xyz]              Matches any character other than x, y, or z.
[a-z]               Matches any character from a to z; ‘-’ denotes a range of acceptable characters.
[a-cx-z]            Matches characters from a to c or from x to z, that is, only a, b, c, x, y, and z.
[0-9&&[4-8]]        Intersection: the character must lie in 0-9 as well as in 4-8, so the effective range is 4-8.
[a-z&&[^aeiou]]     Matches any lowercase letter except a, e, i, o, and u; in set notation, (lowercase alphabet) – {a, e, i, o, u}.
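
The behaviour of negation, ranges, and intersections can be explored with a small sketch that filters a string one character at a time. The class name ClassFilterDemo, the helper keepMatching, and the sample inputs are assumptions for illustration, not part of any library.

```java
import java.util.regex.Pattern;

public class ClassFilterDemo {

    // Hypothetical helper: keeps only those characters of input that the
    // given character class accepts, preserving their original order.
    static String keepMatching(String charClass, String input) {
        Pattern pattern = Pattern.compile(charClass);
        StringBuilder kept = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (pattern.matcher(String.valueOf(c)).matches()) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // Leading caret negates the class; a caret elsewhere is a literal ^.
        System.out.println(keepMatching("[^ABC]", "ABCD^"));            // D^
        System.out.println(keepMatching("[ABC^]", "ABCD^"));            // ABC^

        // Two ranges side by side form a union.
        System.out.println(keepMatching("[a-cx-z]", "abcdwxyz"));       // abcxyz

        // && intersects two classes.
        System.out.println(keepMatching("[0-9&&[4-8]]", "0123456789")); // 45678
        System.out.println(keepMatching("[a-z&&[^aeiou]]", "education")); // dctn
    }
}
```

Filtering character by character keeps the sketch simple; for real text processing, a single Matcher over the whole input would usually be the more natural choice.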
