Regular Expressions
Regular expressions provide a concise and flexible way to match strings of text, such as particular characters, words, or patterns of characters.
For example, the regular expression "\scar" matches every occurrence of the string "car" that is preceded by a white-space character, such as a space, a line feed, or a tab. So in the string "In this cartoon, the car runs on bicarbonate", there are two matches: the " car" at the start of "cartoon" and the " car" of the word "car". The "car" inside "bicarbonate" is not matched, because it is not preceded by white space.
Regular expressions can perform very complex searches, using classes of characters, groupings, back-referencing, zero-width assertions and many different types of conditions and options.
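As a rough illustration of the "\scar" example and of two of the features listed above (back-referencing and zero-width assertions), here is a minimal Java sketch using java.util.regex. The class name RegexBasics, the helper methods, and the sample strings other than the "cartoon" sentence are assumptions made for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RegexBasics {

    // Collects the start offset of every match of the regex in the input.
    static List<Integer> matchStarts(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    // Collects the text of every match of the regex in the input.
    static List<String> matchTexts(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> texts = new ArrayList<>();
        while (matcher.find()) {
            texts.add(matcher.group());
        }
        return texts;
    }

    public static void main(String[] args) {
        // In Java source, the regex \scar is written "\\scar".
        String text = "In this cartoon, the car runs on bicarbonate";
        System.out.println(matchStarts("\\scar", text)); // prints [7, 20]

        // Back-reference: \1 must repeat whatever group 1 captured.
        System.out.println(
            matchTexts("\\b(\\w+)\\s+\\1\\b", "This is is a test")); // prints [is is]

        // Zero-width assertion: (?=px) checks for "px" without consuming it.
        System.out.println(
            matchTexts("\\d+(?=px)", "width 120px, height 80em, margin 5px")); // prints [120, 5]
    }
}
```

Note that each match of "\scar" starts at the white-space character itself, which is why the offsets point to the space before "car".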
Java Regular Expressions
For details on regular expressions in Java, see: http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html.
Examples
The text matched by the expression is highlighted in yellow. Named groups and their corresponding matches are sometimes highlighted in other colors. All the examples assume that no options are set, unless stated otherwise.
Expression: tag1|tag2
Options: None
Matches: Before <tag1> and <tag2> after
Expression: tag\b
Options: None
Matches: Before tag tagtag after
Expression: <.*>
Options: None
Matches: Before <tag1> and <tag2> after
Expression: <.*?>
Options: None
Matches: Before <tag1> and <tag2> after
Expression: colou?r
Options: None
Matches: Color, colour, color
Expression: (C|c)olou?r
Options: None
Matches: Color, colour, color
Expression: %(([-0+ #]?)[-0+ #]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn]
Options: Ignore case: on
Matches: %d files not found, including %s (%3.2d%% done)
Matches: %1$d files not found, including %2$s (%3$*.*d%% done)
Expression: </?([A-Z0-9a-z]*)\b[^>]*>
Options: Ignore case: on
Matches: Text in <b>bold</b> <a href='link.html'>Link</a> <img href="im.png"/>
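Since the highlighting may not be visible in every rendering of this page, the following Java sketch lists what several of the expressions above actually match. It is only an illustration: the class name ExampleMatches and the helper findAll are hypothetical, and mapping the "Ignore case" option to Pattern.CASE_INSENSITIVE is an assumption about how that option would be expressed in plain Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class ExampleMatches {

    // Returns the text of every match; flags is 0 when no options are set.
    static List<String> findAll(String regex, int flags, String input) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(input);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String tags = "Before <tag1> and <tag2> after";
        // Greedy: .* runs to the last '>' on the line.
        System.out.println(findAll("<.*>", 0, tags));  // prints [<tag1> and <tag2>]
        // Lazy: .*? stops at the first '>' it can.
        System.out.println(findAll("<.*?>", 0, tags)); // prints [<tag1>, <tag2>]

        // \b requires a word boundary after "tag", so the first half
        // of "tagtag" is skipped.
        System.out.println(findAll("tag\\b", 0, "Before tag tagtag after")); // prints [tag, tag]

        String colors = "Color, colour, color";
        // Without options, the capitalized "Color" is not matched.
        System.out.println(findAll("colou?r", 0, colors));     // prints [colour, color]
        System.out.println(findAll("(C|c)olou?r", 0, colors)); // prints [Color, colour, color]
        // Assumed Java equivalent of "Ignore case: on".
        System.out.println(
            findAll("colou?r", Pattern.CASE_INSENSITIVE, colors)); // prints [Color, colour, color]
    }
}
```

The greedy and lazy results differ only because of the "?" after "*": the greedy form returns one long match spanning both tags, while the lazy form returns each tag separately.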
SRX Regular Expressions
SRX, the standard format to define segmentation rules, also uses regular expressions.
See the "SRX and Java" page for details on limitations.