Using the Captured Text of a Group within a Pattern
The text captured by a group can be reused later within the same pattern.
For example, suppose you're trying to extract the text between some
XML tags and you don't know what the possible sets of tags are.
However, you do know that the tag name appears in both the start and
end tags. How do you take the dynamically matched tag name from
the start tag and use it in the end tag?
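Back references can be used in this scenario. A back reference
refers to a capture group (see Capturing Text in a Group in a Regular Expression)
within the pattern and matches the exact text that group captured,
not the group's subpattern. It has the form \n where n is a group number
starting from 1; in a Java string literal the backslash must be
doubled, as in "\\1". The back reference should not be contained in or
precede the group it refers to.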
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compile regular expression with a back reference to group 1
String patternStr = "<(\\S+?).*?>(.*?)</\\1>";
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher("");
// Set the input
matcher.reset("xx <tag a=b> yy </tag> zz");
// Get tagname and contents of tag
boolean matchFound = matcher.find(); // true
String tagname = matcher.group(1); // tag
String contents = matcher.group(2); // " yy " (surrounding spaces are included)
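// The end tag name (tag0) differs from the start tag name (tag),
// so the back reference cannot match
matcher.reset("xx <tag> yy </tag0>");
matchFound = matcher.find(); // false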
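
The same pattern can also be applied repeatedly to pull out every element in a string. The following is only a minimal sketch: the class name BackReferenceDemo and the helper findElements are hypothetical names introduced for illustration, and the contents are trimmed for readability. It assumes simple, non-nested elements on a single line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackReferenceDemo {
    // Same pattern as above: \1 must repeat the text captured by group 1
    private static final Pattern ELEMENT =
        Pattern.compile("<(\\S+?).*?>(.*?)</\\1>");

    // Hypothetical helper: returns "name=contents" for each element
    // whose end tag name repeats its start tag name
    public static List<String> findElements(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(input);
        while (matcher.find()) {
            // group(1) is the tag name, group(2) is the text between the tags
            result.add(matcher.group(1) + "=" + matcher.group(2).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // <c>...</d> is skipped because the end tag name does not match
        System.out.println(findElements("<a>1</a> <b x=y> 2 </b> <c>3</d>"));
        // prints [a=1, b=2]
    }
}
```

Because the back reference compares captured text rather than re-running the group's subpattern, the mismatched <c>...</d> pair is skipped without any extra checks in the loop.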