SRX and Java

The SRX 2.0 standard is based on the ICU regular expression notation.

Many Java applications use Java's regular expressions to implement SRX because ICU4J (ICU for Java) does not provide support for ICU regular expressions.

As of version 1.6, Java does not support some of the Unicode-enabled features described in ICU. For example, in Java "\w" means "[a-zA-Z_0-9]", not "[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]" as in ICU. Some ICU features can be replaced by an equivalent expression in Java, but others simply cannot be implemented in Java.
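The "\w" difference can be illustrated with a minimal sketch. It assumes Java's default regular expression flags and uses the ICU definition of "\w" exactly as quoted above. The class name WordClassDemo and the helper matchesAll are hypothetical names chosen for this example; they are not part of Okapi or of any SRX implementation.

```java
import java.util.regex.Pattern;

public class WordClassDemo {
    // The ICU meaning of \w as quoted in the text, written out as
    // explicit Unicode general categories that Java also understands.
    public static final String ICU_WORD = "[\\p{Ll}\\p{Lu}\\p{Lt}\\p{Lo}\\p{Nd}]";

    // Hypothetical helper: true if the whole text matches the expression.
    public static boolean matchesAll(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        // A word with accented letters (U+00E9), written with escapes
        // so the example does not depend on the source file encoding.
        String word = "\u00e9t\u00e9";

        // With default flags, Java's \w covers only [a-zA-Z_0-9].
        System.out.println(matchesAll("\\w+", word));           // prints false

        // The explicit category list accepts the accented letters.
        System.out.println(matchesAll(ICU_WORD + "+", word));   // prints true
    }
}
```

Note that the two forms also differ in the other direction: Java's "\w" accepts the underscore, while the category list quoted above does not, so a mapping has to decide explicitly how to treat it.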

The following table lists the differences between ICU and Java. Yellow entries denote cases where the ICU expression needs to be mapped to a Java equivalent (sometimes a complex one), and red entries indicate cases where the ICU expression cannot be mapped in Java.

Note: Starti