The Java RegEx – Matching Groups example shows how to match groups in a Java regular expression. Groups are captured in a regex using parentheses.
How to match groups in a regex pattern in Java?
Capturing groups are an important regex feature and are very useful for extracting data from string content. A group in a regex is a sub-pattern enclosed in opening and closing parentheses, “(” and “)”. The text matched by each group is stored so that it can be referenced later.
Here is a simple example of matching a group of digits in a string.
package com.javacodeexamples.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExMatchGroupsExample {

    public static void main(String[] args) {

        String strSource = "Square root of 144 is 12";

        // each (\\d+) is a capturing group that matches one or more digits
        Pattern p = Pattern.compile("Square root of (\\d+) is (\\d+)");
        Matcher m = p.matcher(strSource);

        // matches() succeeds only if the entire string matches the pattern
        if (m.matches()) {
            System.out.println("Group 1: " + m.group(1));
            System.out.println("Group 2: " + m.group(2));
        }
    }
}
Output
Group 1: 144
Group 2: 12
As you can see from the output, we have captured two numbers from the string using regex groups. The group method of the Matcher class returns the captured group for the specified group number.
public String group(int group)
Group numbers start from 1, while group 0 is a special group that represents the entire match.
We can also nest groups inside other groups, like (a(b(c))). There are 3 groups in this expression. Groups are numbered from left to right in the order of their opening parentheses. So in this expression, group 1 is (a(b(c))), group 2 is (b(c)), and group 3 is (c).
String strSource = "Processing file 12 of 100";

Pattern p = Pattern.compile("Processing (file (\\d+) of (\\d+))");
Matcher m = p.matcher(strSource);

if (m.matches()) {
    System.out.println("Group 0: " + m.group(0));
    System.out.println("Group 1: " + m.group(1));
    System.out.println("Group 2: " + m.group(2));
    System.out.println("Group 3: " + m.group(3));
}
Output
Group 0: Processing file 12 of 100
Group 1: file 12 of 100
Group 2: 12
Group 3: 100
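When the number of groups is not known in advance, the groupCount method of the Matcher class can drive a loop instead of hard-coded group numbers. The following is only a minimal sketch: the class name GroupLister and the helper listGroups are made up for illustration. Note that groupCount does not include group 0, so the loop runs from 0 up to and including groupCount.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupLister {

    // Hypothetical helper: returns group 0 followed by every capturing group,
    // or an empty list when the whole input does not match the pattern.
    static List<String> listGroups(String regex, String input) {
        List<String> groups = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.matches()) {
            // groupCount() excludes group 0, hence the inclusive upper bound
            for (int i = 0; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups = listGroups(
                "Processing (file (\\d+) of (\\d+))",
                "Processing file 12 of 100");
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("Group " + i + ": " + groups.get(i));
        }
    }
}
```

For the input used above, this should print the same four lines as the previous example.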
How to get the start and end index of captured groups?
The start and end methods of the Matcher class return the start and end index of the matched text respectively.
public int start()
This method returns the start index of the most recent match.
public int end()
This method returns the end index of the most recent match, that is, the index just past the last matched character.
Example:
String strSource = "Processing file 12 of 100";

Pattern p = Pattern.compile("(\\d+)");
Matcher m = p.matcher(strSource);

// find() moves to the next match on each call
while (m.find()) {
    System.out.println("Matched group: " + m.group());
    System.out.println("Start index: " + m.start());
    System.out.println("End index: " + m.end());
}
Output
Matched group: 12
Start index: 16
End index: 18
Matched group: 100
Start index: 22
End index: 25
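The no-argument start and end methods report the indexes of the whole match. When a pattern contains several groups, the overloads start(int group) and end(int group) report the indexes of one particular group. Here is a minimal sketch of that idea; the class name GroupIndexExample and the helper describeGroup are assumptions introduced only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexExample {

    // Hypothetical helper: describes where a given capturing group matched,
    // or returns null if the pattern is not found or the group matched nothing.
    static String describeGroup(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        int start = m.start(group); // index of the group's first character
        int end = m.end(group);     // index just past the group's last character
        if (start < 0) {
            // start(int) returns -1 when the group did not take part in the match
            return null;
        }
        // substring(start, end) reproduces the text returned by m.group(group)
        return input.substring(start, end) + " [" + start + ", " + end + ")";
    }

    public static void main(String[] args) {
        String source = "Processing file 12 of 100";
        String regex = "file (\\d+) of (\\d+)";
        System.out.println("Group 1: " + describeGroup(regex, source, 1));
        System.out.println("Group 2: " + describeGroup(regex, source, 2));
    }
}
```

Because the end index is exclusive, the start and end values can be passed directly to String.substring to recover the group text.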
Captured groups can be referenced within the pattern itself using a backslash, “\”, followed by the group number. The example below shows how to reference a group.
String strSource = "cat & kitkat";

// \\1 refers back to whatever group 1, (at), matched
Pattern p = Pattern.compile("c(at) & kitk(\\1)");
Matcher m = p.matcher(strSource);

if (m.matches()) {
    System.out.println(m.group());
}
Output
cat & kitkat
In this example, I have referenced the group (at) using the group reference “\1”, which is written as “\\1” inside a Java string literal. The pattern “c(at) & kitk(\\1)” matches the same strings as the pattern “c(at) & kitk(at)”.
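Group references become more useful when the group can match different text each time, because the reference then requires whatever was actually captured to appear again. As a rough sketch, the code below looks for immediately repeated words; the class name RepeatedWordFinder, the helper findRepeatedWords, and the sample sentence are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedWordFinder {

    // (\\w+) captures a word as group 1; \\1 then requires the same text again.
    // The \\b anchors keep the match from starting or ending inside a longer word.
    private static final Pattern REPEATED =
            Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    // Hypothetical helper: returns each word that is immediately repeated
    static List<String> findRepeatedWords(String input) {
        List<String> words = new ArrayList<>();
        Matcher m = REPEATED.matcher(input);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [is, the]
        System.out.println(findRepeatedWords("This is is a test of the the parser"));
    }
}
```

Unlike the “kitkat” pattern, this one cannot be rewritten without the group reference, since the repeated text is not known when the pattern is written.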
If you want to learn more about regex, visit the Java RegEx tutorial.