Using the Captured Text of a Group within a Pattern

It is possible to use the value of a group within the same pattern. For example, suppose you're trying to extract the text between some XML tags and you don't know what the possible sets of tags are. However, you do know that the tag name appears in both the start and end tags. How do you take the dynamically matched tag name from the start tag and use it in the end tag?

Back references can be used in this scenario. A back reference refers to a capture group (see Capturing Text in a Group in a Regular Expression) within the same pattern. It has the form \n, where n is the group number, starting from 1. A back reference should not appear inside the group it refers to, nor before that group in the pattern.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compile regular expression with a back reference to group 1
String patternStr = "<(\\S+?).*?>(.*?)</\\1>";
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher("");

// Set the input
matcher.reset("xx <tag a=b> yy </tag> zz");

// Get tagname and contents of tag
boolean matchFound = matcher.find();   // true
String tagname = matcher.group(1);     // "tag"
String contents = matcher.group(2);    // " yy " (including the surrounding spaces)

// The end tag name differs from the start tag name, so \1 cannot match
matcher.reset("xx <tag> yy </tag0>");
matchFound = matcher.find();           // false
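
The snippet above handles a single tag pair. As a minimal sketch, the same pattern can be placed in a small runnable class that loops over every matching pair in the input. The class name BackReferenceDemo, the method name extract, and the sample input are illustrative assumptions and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class BackReferenceDemo {
    // Same pattern as above: group 1 is the tag name, group 2 the contents,
    // and \1 requires the end tag to repeat the name captured by group 1.
    static final Pattern TAG = Pattern.compile("<(\\S+?).*?>(.*?)</\\1>");

    // Returns one "name=contents" string per matched tag pair,
    // with the contents trimmed of surrounding whitespace.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TAG.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1) + "=" + matcher.group(2).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // <c>...</d> is skipped because the end tag name does not match.
        System.out.println(extract("<a>one</a> <b x=1> two </b> <c>three</d>"));
        // prints [a=one, b=two]
    }
}
```

Because the dot does not match line terminators by default, this sketch only finds tag pairs that sit on a single line; compiling with Pattern.DOTALL would lift that restriction.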
