@Beta @GwtCompatible(emulated=true) public abstract class CharMatcher extends Object implements Predicate<Character>
Determines a true or false value for any Java char value, just as Predicate does for any Object. Also offers basic text processing methods based on this function. Implementations are strongly encouraged to be side-effect-free and immutable.
Throughout the documentation of this class, the phrase "matching character" is used to mean "any character c for which this.matches(c) returns true".
Note: This class deals only with char values; it does not understand supplementary Unicode code points in the range 0x10000 to 0x10FFFF. Such logical characters are encoded into a String using surrogate pairs, and a CharMatcher treats these just as two separate characters.
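To make the note above concrete, here is a minimal sketch using only java.lang. It shows that one supplementary code point occupies two char values in a String, so any per-char test sees two separate surrogates. The class name SurrogatePairDemo and its helper methods are illustrative assumptions, not part of Guava.

```java
// Standalone illustration; SurrogatePairDemo is a hypothetical name, not a Guava class.
class SurrogatePairDemo {
    /** Builds a String holding the single supplementary code point U+1F600. */
    static String supplementary() {
        return new String(Character.toChars(0x1F600));
    }

    /** Counts the char values that are surrogates, as a per-char test would see them. */
    static int countSurrogateChars(CharSequence sequence) {
        int count = 0;
        for (int i = 0; i < sequence.length(); i++) {
            if (Character.isSurrogate(sequence.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String s = supplementary();
        System.out.println(s.length());                      // 2 (char values)
        System.out.println(s.codePointCount(0, s.length())); // 1 (code point)
        System.out.println(countSurrogateChars(s));          // 2
    }
}
```

A matcher that works char by char therefore cannot treat such a pair as one logical character.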
Example usages:
String trimmed = WHITESPACE.trimFrom(userInput);
if (ASCII.matchesAllOf(s)) { ... }
See the Guava User Guide article on CharMatcher.
| Modifier and Type | Field | Description |
|---|---|---|
| static CharMatcher | ANY | Matches any character. |
| static CharMatcher | ASCII | Determines whether a character is ASCII, meaning that its code point is less than 128. |
| static CharMatcher | BREAKING_WHITESPACE | Determines whether a character is a breaking whitespace (that is, a whitespace which can be interpreted as a break between words for formatting purposes). |
| static CharMatcher | DIGIT | Determines whether a character is a digit according to Unicode. |
| static CharMatcher | INVISIBLE | Determines whether a character is invisible; that is, if its Unicode category is any of SPACE_SEPARATOR, LINE_SEPARATOR, PARAGRAPH_SEPARATOR, CONTROL, FORMAT, SURROGATE, and PRIVATE_USE according to ICU4J. |
| static CharMatcher | JAVA_DIGIT | Determines whether a character is a digit according to Java's definition. |
| static CharMatcher | JAVA_ISO_CONTROL | Determines whether a character is an ISO control character as specified by Character.isISOControl(char). |
| static CharMatcher | JAVA_LETTER | Determines whether a character is a letter according to Java's definition. |
| static CharMatcher | JAVA_LETTER_OR_DIGIT | Determines whether a character is a letter or digit according to Java's definition. |
| static CharMatcher | JAVA_LOWER_CASE | Determines whether a character is lower case according to Java's definition. |
| static CharMatcher | JAVA_UPPER_CASE | Determines whether a character is upper case according to Java's definition. |
| static CharMatcher | NONE | Matches no characters. |
| static CharMatcher | SINGLE_WIDTH | Determines whether a character is single-width (not double-width). |
| static CharMatcher | WHITESPACE | Determines whether a character is whitespace according to the latest Unicode standard, as illustrated here. |
| Modifier | Constructor | Description |
|---|---|---|
| protected | CharMatcher() | Constructor for use by subclasses. |
| Modifier and Type | Method | Description |
|---|---|---|
| CharMatcher | and(CharMatcher other) | Returns a matcher that matches any character matched by both this matcher and other. |
| static CharMatcher | anyOf(CharSequence sequence) | Returns a char matcher that matches any character present in the given character sequence. |
| boolean | apply(Character character) | Deprecated. Provided only to satisfy the Predicate interface; use matches(char) instead. |
| String | collapseFrom(CharSequence sequence, char replacement) | Returns a string copy of the input character sequence, with each group of consecutive characters that match this matcher replaced by a single replacement character. |
| int | countIn(CharSequence sequence) | Returns the number of matching characters found in a character sequence. |
| static CharMatcher | forPredicate(Predicate<? super Character> predicate) | Returns a matcher with identical behavior to the given Character-based predicate, but which operates on primitive char instances instead. |
| int | indexIn(CharSequence sequence) | Returns the index of the first matching character in a character sequence, or -1 if no matching character is present. |
| int | indexIn(CharSequence sequence, int start) | Returns the index of the first matching character in a character sequence, starting from a given position, or -1 if no character matches after that position. |
| static CharMatcher | inRange(char startInclusive, char endInclusive) | Returns a char matcher that matches any character in a given range (both endpoints are inclusive). |
| static CharMatcher | is(char match) | Returns a char matcher that matches only one specified character. |
| static CharMatcher | isNot(char match) | Returns a char matcher that matches any character except the one specified. |
| int | lastIndexIn(CharSequence sequence) | Returns the index of the last matching character in a character sequence, or -1 if no matching character is present. |
| abstract boolean | matches(char c) | Determines a true or false value for the given character. |
| boolean | matchesAllOf(CharSequence sequence) | Returns true if a character sequence contains only matching characters. |
| boolean | matchesAnyOf(CharSequence sequence) | Returns true if a character sequence contains at least one matching character. |
| boolean | matchesNoneOf(CharSequence sequence) | Returns true if a character sequence contains no matching characters. |
| CharMatcher | negate() | Returns a matcher that matches any character not matched by this matcher. |
| static CharMatcher | noneOf(CharSequence sequence) | Returns a char matcher that matches any character not present in the given character sequence. |
| CharMatcher | or(CharMatcher other) | Returns a matcher that matches any character matched by either this matcher or other. |
| CharMatcher | precomputed() | Returns a char matcher functionally equivalent to this one, but which may be faster to query than the original; your mileage may vary. |
| String | removeFrom(CharSequence sequence) | Returns a string containing all non-matching characters of a character sequence, in order. |
| String | replaceFrom(CharSequence sequence, char replacement) | Returns a string copy of the input character sequence, with each character that matches this matcher replaced by a given replacement character. |
| String | replaceFrom(CharSequence sequence, CharSequence replacement) | Returns a string copy of the input character sequence, with each character that matches this matcher replaced by a given replacement sequence. |
| String | retainFrom(CharSequence sequence) | Returns a string containing all matching characters of a character sequence, in order. |
| String | toString() | Returns a string representation of this CharMatcher, such as CharMatcher.or(WHITESPACE, JAVA_DIGIT). |
| String | trimAndCollapseFrom(CharSequence sequence, char replacement) | Collapses groups of matching characters exactly as collapseFrom(java.lang.CharSequence, char) does, except that groups of matching characters at the start or end of the sequence are removed without replacement. |
| String | trimFrom(CharSequence sequence) | Returns a substring of the input character sequence that omits all characters this matcher matches from the beginning and from the end of the string. |
| String | trimLeadingFrom(CharSequence sequence) | Returns a substring of the input character sequence that omits all characters this matcher matches from the beginning of the string. |
| String | trimTrailingFrom(CharSequence sequence) | Returns a substring of the input character sequence that omits all characters this matcher matches from the end of the string. |
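public static final CharMatcher BREAKING_WHITESPACE
Determines whether a character is a breaking whitespace (that is, a whitespace which can be interpreted as a break between words for formatting purposes). See WHITESPACE for a discussion of that term.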
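public static final CharMatcher ASCII
Determines whether a character is ASCII, meaning that its code point is less than 128.
public static final CharMatcher DIGIT
Determines whether a character is a digit according to Unicode. If you only care to match ASCII digits, you can use inRange('0', '9').
public static final CharMatcher JAVA_DIGIT
Determines whether a character is a digit according to Java's definition. If you only care to match ASCII digits, you can use inRange('0', '9').
public static final CharMatcher JAVA_LETTER
Determines whether a character is a letter according to Java's definition. If you only care to match letters of the Latin alphabet, you can use inRange('a', 'z').or(inRange('A', 'Z')).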
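The difference between Java's definitions and the plain ASCII ranges mentioned above can be seen with java.lang.Character alone. The sketch below is standalone and does not use Guava; the class AsciiVersusJavaSketch and its helper methods are hypothetical stand-ins for inRange('0', '9') and inRange('a', 'z').or(inRange('A', 'Z')).

```java
// Standalone illustration; the helper names are assumptions, not Guava API.
class AsciiVersusJavaSketch {
    /** Stand-in for inRange('0', '9'): accepts ASCII digits only. */
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Stand-in for inRange('a', 'z').or(inRange('A', 'Z')). */
    static boolean isBasicLatinLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
        char greekAlpha = '\u03B1';       // GREEK SMALL LETTER ALPHA

        System.out.println(Character.isDigit(arabicIndicThree)); // true
        System.out.println(isAsciiDigit(arabicIndicThree));      // false
        System.out.println(Character.isLetter(greekAlpha));      // true
        System.out.println(isBasicLatinLetter(greekAlpha));      // false
    }
}
```

When input validation should accept only 0 through 9 or unaccented English letters, the range-based form is the narrower and usually the safer choice.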
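public static final CharMatcher JAVA_LETTER_OR_DIGIT
Determines whether a character is a letter or digit according to Java's definition.
public static final CharMatcher JAVA_UPPER_CASE
Determines whether a character is upper case according to Java's definition.
public static final CharMatcher JAVA_LOWER_CASE
Determines whether a character is lower case according to Java's definition.
public static final CharMatcher JAVA_ISO_CONTROL
Determines whether a character is an ISO control character as specified by Character.isISOControl(char).
public static final CharMatcher INVISIBLE
Determines whether a character is invisible; that is, if its Unicode category is any of SPACE_SEPARATOR, LINE_SEPARATOR, PARAGRAPH_SEPARATOR, CONTROL, FORMAT, SURROGATE, and PRIVATE_USE according to ICU4J.
public static final CharMatcher SINGLE_WIDTH
Determines whether a character is single-width (not double-width). When in doubt, this matcher errs on the side of returning false (that is, it tends to assume a character is double-width).
Note: as the reference file evolves, we will modify this constant to keep it up to date.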
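public static final CharMatcher ANY
Matches any character.
public static final CharMatcher NONE
Matches no characters.
public static final CharMatcher WHITESPACE
Determines whether a character is whitespace according to the latest Unicode standard, as illustrated here.
Note: as the Unicode definition evolves, we will modify this constant to keep it up to date.
protected CharMatcher()
Constructor for use by subclasses. When subclassing, you should override toString() to provide a useful description.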
public static CharMatcher is(char match)
Returns a char matcher that matches only one specified character.
public static CharMatcher isNot(char match)
Returns a char matcher that matches any character except the one specified.
To negate another CharMatcher, use negate().
public static CharMatcher anyOf(CharSequence sequence)
Returns a char matcher that matches any character present in the given character sequence.
public static CharMatcher noneOf(CharSequence sequence)
Returns a char matcher that matches any character not present in the given character sequence.
public static CharMatcher inRange(char startInclusive, char endInclusive)
Returns a char matcher that matches any character in a given range (both endpoints are inclusive). For example, to match any lowercase letter of the English alphabet, use CharMatcher.inRange('a', 'z').
Throws: IllegalArgumentException - if endInclusive < startInclusive
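The inclusive-range contract and its argument check can be sketched without Guava as follows. CharTest is a hypothetical one-method interface standing in for CharMatcher, and the exception message is an assumption; only the behavior (both endpoints inclusive, IllegalArgumentException for a reversed range) comes from the description above.

```java
// Standalone sketch; CharTest and RangeSketch are hypothetical, not Guava types.
class RangeSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Returns a test accepting every char from startInclusive to endInclusive. */
    static CharTest inRange(char startInclusive, char endInclusive) {
        if (endInclusive < startInclusive) {
            throw new IllegalArgumentException("endInclusive < startInclusive");
        }
        return c -> startInclusive <= c && c <= endInclusive;
    }

    public static void main(String[] args) {
        CharTest lower = inRange('a', 'z');
        System.out.println(lower.matches('a')); // true
        System.out.println(lower.matches('z')); // true
        System.out.println(lower.matches('A')); // false
    }
}
```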
public static CharMatcher forPredicate(Predicate<? super Character> predicate)
Returns a matcher with identical behavior to the given Character-based predicate, but which operates on primitive char instances instead.
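public abstract boolean matches(char c)
Determines a true or false value for the given character.
public CharMatcher negate()
Returns a matcher that matches any character not matched by this matcher.
public CharMatcher and(CharMatcher other)
Returns a matcher that matches any character matched by both this matcher and other.
public CharMatcher or(CharMatcher other)
Returns a matcher that matches any character matched by either this matcher or other.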
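The way negate(), and(), and or() compose matchers can be illustrated with a small standalone sketch. CharTest below is a hypothetical interface that mirrors only the documented semantics; it makes no claim about how Guava implements these combinators.

```java
// Standalone sketch; CharTest, LOWER and VOWEL are illustrative assumptions.
class CombinatorSketch {
    interface CharTest {
        boolean matches(char c);

        /** Matches exactly the characters this test rejects. */
        default CharTest negate() {
            return c -> !matches(c);
        }

        /** Matches characters accepted by both this test and other. */
        default CharTest and(CharTest other) {
            return c -> matches(c) && other.matches(c);
        }

        /** Matches characters accepted by this test, by other, or by both. */
        default CharTest or(CharTest other) {
            return c -> matches(c) || other.matches(c);
        }
    }

    static final CharTest LOWER = c -> c >= 'a' && c <= 'z';
    static final CharTest VOWEL = c -> "aeiou".indexOf(c) >= 0;

    public static void main(String[] args) {
        CharTest lowerConsonant = LOWER.and(VOWEL.negate());
        System.out.println(lowerConsonant.matches('b')); // true
        System.out.println(lowerConsonant.matches('e')); // false
    }
}
```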
public CharMatcher precomputed()
Returns a char matcher functionally equivalent to this one, but which may be faster to query than the original; your mileage may vary. Precomputation takes time and is likely to be worthwhile only if the precomputed matcher is queried many thousands of times.
This method has no effect (returns this) when called in GWT: it's unclear whether a precomputed matcher is faster, but it certainly consumes more memory, which doesn't seem like a worthwhile tradeoff in a browser.
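One way to picture the time-versus-memory tradeoff described above is a lookup table that evaluates the original test once for every possible char value. This is only a sketch under that assumption: the text does not say which strategy precomputed() actually uses, and CharTest and precompute are hypothetical names.

```java
// Standalone sketch of table-based precomputation; not Guava's implementation.
class PrecomputeSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Evaluates original for all 65536 char values up front and answers later queries from the table. */
    static CharTest precompute(CharTest original) {
        boolean[] table = new boolean[Character.MAX_VALUE + 1];
        // Loop over int to avoid char overflow at Character.MAX_VALUE.
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            table[c] = original.matches((char) c);
        }
        return c -> table[c];
    }
}
```

The up-front loop is the cost that pays off only after many queries, and the table is the extra memory the GWT remark refers to in spirit.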
public boolean matchesAnyOf(CharSequence sequence)
Returns true if a character sequence contains at least one matching character. Equivalent to !matchesNoneOf(sequence).
The default implementation iterates over the sequence, invoking matches(char) for each character, until this returns true or the end is reached.
Parameters: sequence - the character sequence to examine, possibly empty
Returns: true if this matcher matches at least one character in the sequence
public boolean matchesAllOf(CharSequence sequence)
Returns true if a character sequence contains only matching characters.
The default implementation iterates over the sequence, invoking matches(char) for each character, until this returns false or the end is reached.
Parameters: sequence - the character sequence to examine, possibly empty
Returns: true if this matcher matches every character in the sequence, including when the sequence is empty
public boolean matchesNoneOf(CharSequence sequence)
Returns true if a character sequence contains no matching characters. Equivalent to !matchesAnyOf(sequence).
The default implementation iterates over the sequence, invoking matches(char) for each character, until this returns true or the end is reached.
Parameters: sequence - the character sequence to examine, possibly empty
Returns: true if this matcher matches no characters in the sequence, including when the sequence is empty
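The three query methods above differ mainly in when the scan stops and in what they return for an empty sequence. The following standalone sketch mirrors that behavior with a hypothetical CharTest interface; it illustrates the documented contract and is not Guava's source.

```java
// Standalone sketch; CharTest and MatchQueriesSketch are hypothetical names.
class MatchQueriesSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Stops at the first matching character. */
    static boolean matchesAnyOf(CharTest matcher, CharSequence sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            if (matcher.matches(sequence.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    /** Stops at the first non-matching character; an empty sequence yields true. */
    static boolean matchesAllOf(CharTest matcher, CharSequence sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            if (!matcher.matches(sequence.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Logical negation of matchesAnyOf; an empty sequence yields true. */
    static boolean matchesNoneOf(CharTest matcher, CharSequence sequence) {
        return !matchesAnyOf(matcher, sequence);
    }
}
```

Note that for an empty sequence both matchesAllOf and matchesNoneOf return true, while matchesAnyOf returns false.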
public int indexIn(CharSequence sequence)
Returns the index of the first matching character in a character sequence, or -1 if no matching character is present.
The default implementation iterates over the sequence in forward order calling matches(char) for each character.
Parameters: sequence - the character sequence to examine from the beginning
Returns: an index, or -1 if no character matches
public int indexIn(CharSequence sequence, int start)
Returns the index of the first matching character in a character sequence, starting from a given position, or -1 if no character matches after that position.
The default implementation iterates over the sequence in forward order, beginning at start, calling matches(char) for each character.
Parameters: sequence - the character sequence to examine
start - the first index to examine; must be nonnegative and no greater than sequence.length()
Returns: the index of the first matching character, guaranteed to be no less than start, or -1 if no character matches
Throws: IndexOutOfBoundsException - if start is negative or greater than sequence.length()
public int lastIndexIn(CharSequence sequence)
Returns the index of the last matching character in a character sequence, or -1 if no matching character is present.
The default implementation iterates over the sequence in reverse order calling matches(char) for each character.
Parameters: sequence - the character sequence to examine from the end
Returns: an index, or -1 if no character matches
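The forward and reverse scans described for indexIn and lastIndexIn can be sketched as below. The code is standalone and uses a hypothetical CharTest interface; the bounds check follows the documented rule that start may equal sequence.length() but may not exceed it.

```java
// Standalone sketch; CharTest and IndexSketch are hypothetical names.
class IndexSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Forward scan from start; returns -1 when nothing matches at or after start. */
    static int indexIn(CharTest matcher, CharSequence sequence, int start) {
        int length = sequence.length();
        if (start < 0 || start > length) {
            throw new IndexOutOfBoundsException("start: " + start + ", length: " + length);
        }
        for (int i = start; i < length; i++) {
            if (matcher.matches(sequence.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** Forward scan from the beginning of the sequence. */
    static int indexIn(CharTest matcher, CharSequence sequence) {
        return indexIn(matcher, sequence, 0);
    }

    /** Reverse scan from the last character toward the first. */
    static int lastIndexIn(CharTest matcher, CharSequence sequence) {
        for (int i = sequence.length() - 1; i >= 0; i--) {
            if (matcher.matches(sequence.charAt(i))) {
                return i;
            }
        }
        return -1;
    }
}
```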
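public int countIn(CharSequence sequence)
Returns the number of matching characters found in a character sequence.
public String removeFrom(CharSequence sequence)
Returns a string containing all non-matching characters of a character sequence, in order. For example:
CharMatcher.is('a').removeFrom("bazaar")
... returns "bzr".
public String retainFrom(CharSequence sequence)
Returns a string containing all matching characters of a character sequence, in order. For example:
CharMatcher.is('a').retainFrom("bazaar")
... returns "aaa".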
public String replaceFrom(CharSequence sequence, char replacement)
Returns a string copy of the input character sequence, with each character that matches this matcher replaced by a given replacement character. For example:
CharMatcher.is('a').replaceFrom("radar", 'o')
... returns "rodor".
The default implementation uses indexIn(CharSequence) to find the first matching character, then iterates the remainder of the sequence calling matches(char) for each character.
Parameters: sequence - the character sequence to replace matching characters in
replacement - the character to append to the result string in place of each matching character in sequence
public String replaceFrom(CharSequence sequence, CharSequence replacement)
Returns a string copy of the input character sequence, with each character that matches this matcher replaced by a given replacement sequence. For example:
CharMatcher.is('a').replaceFrom("yaha", "oo")
... returns "yoohoo".
Note: If the replacement is a fixed string with only one character, you are better off calling replaceFrom(CharSequence, char) directly.
Parameters: sequence - the character sequence to replace matching characters in
replacement - the characters to append to the result string in place of each matching character in sequence
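The counting, filtering, and replacement methods above (countIn, removeFrom, retainFrom, and replaceFrom) all come down to one pass over the sequence. The standalone sketch below reproduces the documented examples with a hypothetical CharTest interface. It is a simplified illustration and does not use the indexIn-based fast path that the default implementation is described as using.

```java
// Standalone sketch; CharTest and FilterAndReplaceSketch are hypothetical names.
class FilterAndReplaceSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Counts the characters the matcher accepts. */
    static int countIn(CharTest matcher, CharSequence sequence) {
        int count = 0;
        for (int i = 0; i < sequence.length(); i++) {
            if (matcher.matches(sequence.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /** Keeps matching chars when keepMatching is true, otherwise keeps the non-matching ones. */
    private static String filter(CharTest matcher, CharSequence sequence, boolean keepMatching) {
        StringBuilder out = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (matcher.matches(c) == keepMatching) {
                out.append(c);
            }
        }
        return out.toString();
    }

    static String removeFrom(CharTest matcher, CharSequence sequence) {
        return filter(matcher, sequence, false);
    }

    static String retainFrom(CharTest matcher, CharSequence sequence) {
        return filter(matcher, sequence, true);
    }

    /** Replaces every matching character with the whole replacement sequence. */
    static String replaceFrom(CharTest matcher, CharSequence sequence, CharSequence replacement) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (matcher.matches(c)) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```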
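public String trimFrom(CharSequence sequence)
Returns a substring of the input character sequence that omits all characters this matcher matches from the beginning and from the end of the string. For example:
CharMatcher.anyOf("ab").trimFrom("abacatbab")
... returns "cat".
Note that:
CharMatcher.inRange('\0', ' ').trimFrom(str)
... is equivalent to String.trim().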
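The trimming behavior described above, together with the leading-only and trailing-only variants documented next, can be sketched by moving two indices inward until a non-matching character is found. The code is standalone; CharTest is a hypothetical stand-in for CharMatcher, and the lambda c -> c <= ' ' plays the role of inRange('\0', ' ').

```java
// Standalone sketch; CharTest and TrimSketch are hypothetical names.
class TrimSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Index of the first non-matching char, or sequence.length() if all match. */
    private static int firstKept(CharTest matcher, CharSequence sequence) {
        int first = 0;
        while (first < sequence.length() && matcher.matches(sequence.charAt(first))) {
            first++;
        }
        return first;
    }

    /** Index just past the last non-matching char, never less than floor. */
    private static int endKept(CharTest matcher, CharSequence sequence, int floor) {
        int end = sequence.length();
        while (end > floor && matcher.matches(sequence.charAt(end - 1))) {
            end--;
        }
        return end;
    }

    static String trimFrom(CharTest matcher, CharSequence sequence) {
        int first = firstKept(matcher, sequence);
        return sequence.subSequence(first, endKept(matcher, sequence, first)).toString();
    }

    static String trimLeadingFrom(CharTest matcher, CharSequence sequence) {
        return sequence.subSequence(firstKept(matcher, sequence), sequence.length()).toString();
    }

    static String trimTrailingFrom(CharTest matcher, CharSequence sequence) {
        return sequence.subSequence(0, endKept(matcher, sequence, 0)).toString();
    }
}
```

Matching characters in the middle of the sequence, such as the 'a' in "cat", are left untouched because each scan stops at the first non-matching character.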
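public String trimLeadingFrom(CharSequence sequence)
Returns a substring of the input character sequence that omits all characters this matcher matches from the beginning of the string. For example:
CharMatcher.anyOf("ab").trimLeadingFrom("abacatbab")
... returns "catbab".
public String trimTrailingFrom(CharSequence sequence)
Returns a substring of the input character sequence that omits all characters this matcher matches from the end of the string. For example:
CharMatcher.anyOf("ab").trimTrailingFrom("abacatbab")
... returns "abacat".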
public String collapseFrom(CharSequence sequence, char replacement)
Returns a string copy of the input character sequence, with each group of consecutive characters that match this matcher replaced by a single replacement character. For example:
CharMatcher.anyOf("eko").collapseFrom("bookkeeper", '-')
... returns "b-p-r".
The default implementation uses indexIn(CharSequence) to find the first matching character, then iterates the remainder of the sequence calling matches(char) for each character.
Parameters: sequence - the character sequence to replace matching groups of characters in
replacement - the character to append to the result string in place of each group of matching characters in sequence
public String trimAndCollapseFrom(CharSequence sequence, char replacement)
Collapses groups of matching characters exactly as collapseFrom(java.lang.CharSequence, char) does, except that groups of matching characters at the start or end of the sequence are removed without replacement.
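The difference between collapsing and trim-and-collapsing shows up only at the edges of the sequence. The standalone sketch below illustrates both with a hypothetical CharTest interface. It is a straightforward single-pass version written for clarity and does not use the indexIn-based approach attributed to the default implementation.

```java
// Standalone sketch; CharTest and CollapseSketch are hypothetical names.
class CollapseSketch {
    interface CharTest {
        boolean matches(char c);
    }

    /** Replaces each run of matching characters with one replacement char. */
    static String collapseFrom(CharTest matcher, CharSequence sequence, char replacement) {
        StringBuilder out = new StringBuilder(sequence.length());
        boolean inGroup = false;
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (matcher.matches(c)) {
                if (!inGroup) {
                    out.append(replacement);
                    inGroup = true;
                }
            } else {
                out.append(c);
                inGroup = false;
            }
        }
        return out.toString();
    }

    /** Like collapseFrom, but runs at the start or end are dropped entirely. */
    static String trimAndCollapseFrom(CharTest matcher, CharSequence sequence, char replacement) {
        StringBuilder out = new StringBuilder(sequence.length());
        boolean pendingGroup = false;
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (matcher.matches(c)) {
                pendingGroup = true;
            } else {
                // Emit the replacement only for a run that sits between two kept characters.
                if (pendingGroup && out.length() > 0) {
                    out.append(replacement);
                }
                out.append(c);
                pendingGroup = false;
            }
        }
        return out.toString();
    }
}
```

A common use of the second form is normalizing whitespace, for example turning "  a  b c  " into "a b c" with a whitespace test and ' ' as the replacement.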
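@Deprecated public boolean apply(Character character)
Deprecated. Provided only to satisfy the Predicate interface; use matches(char) instead.
Specified by: apply in interface Predicate<Character>
Returns the result of applying this predicate to input. This method is generally expected, but not absolutely required, to have the following properties:
- Its execution does not cause any observable side effects.
- The computation is consistent with equals; that is, Objects.equal(a, b) implies that predicate.apply(a) == predicate.apply(b).
public String toString()
Returns a string representation of this CharMatcher, such as CharMatcher.or(WHITESPACE, JAVA_DIGIT).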