The format of the set1 and set2 arguments resembles the format of regular expressions; however, they are not regular expressions, only lists of characters. Most characters simply represent themselves in these strings, but the strings can contain the shorthands listed below, for convenience. Some of them can be used only in set1 or set2, as noted below.
While a backslash followed by a character not listed above is
interpreted as that character, the backslash also effectively
removes any special significance, so it is useful to escape
‘[’, ‘]’, ‘*’, and ‘-’.
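As a brief illustration of the escaping described above, the following sketch maps each of the four characters to an arbitrary replacement. The sample input and the replacement characters are chosen for this example only.

```shell
# Each backslash strips the special meaning of the character after it,
# so '[', ']', '*', and '-' in set1 are all taken literally.
escaped=$(printf '[a-b]*\n' | tr '\[\]\*\-' '()+_')
printf '%s\n' "$escaped"   # (a_b)+
```

The single quotes keep the shell from interpreting the backslashes before tr sees them.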
GNU tr does not support the System V syntax that uses square brackets to enclose ranges. Translations specified in that format sometimes work as expected, since the brackets are often transliterated to themselves. However, they should be avoided because they sometimes behave unexpectedly. For example, ‘tr -d '[0-9]'’ deletes brackets as well as digits.
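The difference can be seen by running both forms over the same input; the sample string here is made up for illustration.

```shell
# System V style: the brackets are ordinary members of set1,
# so they are deleted along with the digits.
sysv=$(printf 'a[1]b2\n' | tr -d '[0-9]')

# Plain range: only the digits are deleted.
plain=$(printf 'a[1]b2\n' | tr -d '0-9')

printf '%s\n%s\n' "$sysv" "$plain"
# ab
# a[]b
```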
Many historically common and even accepted uses of ranges are not
portable. For example, on EBCDIC hosts, using the ‘A-Z’
range will not do what most would expect, because ‘A’ through ‘Z’
are not contiguous there as they are in ASCII.
If you can rely on a POSIX-compliant version of tr, then
the best way to work around this is to use character classes (see below).
Otherwise, it is most portable (and most ugly) to enumerate the members
of the ranges.
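The three approaches can be compared side by side. This is only a sketch: it assumes an ASCII host and forces the C locale so that the results are reproducible, and on such a host all three commands give the same output.

```shell
# Assumption: ASCII host, C locale.
LC_ALL=C
export LC_ALL

input='Hello, World'

# Range: concise, but depends on the letters being contiguous.
with_range=$(printf '%s\n' "$input" | tr 'a-z' 'A-Z')

# Character classes: the portable choice with a POSIX-compliant tr.
with_class=$(printf '%s\n' "$input" | tr '[:lower:]' '[:upper:]')

# Enumerated members: most portable, most ugly.
enumerated=$(printf '%s\n' "$input" |
  tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')

printf '%s\n' "$with_range" "$with_class" "$enumerated"
```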
The notation ‘[:class:]’ expands to all of the characters in the
predefined class of that name. The characters expand in no particular
order, except for the upper and lower classes, which expand in
ascending order. When the --delete (-d) and --squeeze-repeats (-s)
options are both given, any character class can be used in set2.
Otherwise, only the character classes lower and upper are accepted in
set2, and then only if the corresponding character class (upper and
lower, respectively) is specified in the same relative position in
set1. Doing this specifies case conversion.
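The rules for classes in set2 can be sketched with three short commands. The sample inputs and variable names are illustrative only.

```shell
# With both -d and -s, set1 is deleted and repeats of set2 are squeezed,
# so any class is allowed in set2: digits go, runs of blanks collapse.
squeezed=$(printf 'a1  b22   c\n' | tr -ds '[:digit:]' '[:blank:]')

# Case conversion: [:lower:] and [:upper:] occupy the same relative
# position (first) in set1 and set2; '.' is mapped to '_' after them.
converted=$(printf 'abc.xyz\n' | tr '[:lower:].' '[:upper:]_')

# When translating, any other class in set2 is rejected with an error.
if printf 'abc\n' | tr 'abc' '[:digit:]' >/dev/null 2>&1; then
  other_class=accepted
else
  other_class=rejected
fi

printf '%s\n' "$squeezed" "$converted" "$other_class"
# a b c
# ABC_XYZ
# rejected
```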
The class names are given below; an error results when an invalid class
name is given.
alnum
alpha
blank
cntrl
digit
graph
lower
print
punct
space
upper
xdigit
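As a short sketch of how the class names are used, the following deletes every punctuation character from a made-up input and then shows that a name outside the list is rejected. The name ‘vowel’ is deliberately invalid and exists only for this example.

```shell
# A valid class: delete all punctuation characters.
no_punct=$(printf 'Hello, World! 123\n' | tr -d '[:punct:]')
printf '%s\n' "$no_punct"   # Hello World 123

# An invalid class name makes tr fail with a nonzero exit status.
if tr -d '[:vowel:]' </dev/null >/dev/null 2>&1; then
  bad_class=accepted
else
  bad_class=rejected
fi
printf '%s\n' "$bad_class"  # rejected
```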