Friday, July 17, 2009

Linux script (Escapes in Regular Expressions)

GNU Extensions for Escapes in Regular Expressions

The list of these escapes is:

\a
Produces or matches a BEL character, that is, an “alert” (ascii 7).
\f
Produces or matches a form feed (ascii 12).
\n
Produces or matches a newline (ascii 10).
\r
Produces or matches a carriage return (ascii 13).
\t
Produces or matches a horizontal tab (ascii 9).
\v
Produces or matches a so-called “vertical tab” (ascii 11).
\cx
Produces or matches Control-x, where x is any character. The precise effect of `\cx' is as follows: if x is a lower case letter, it is converted to upper case. Then bit 6 of the character (hex 40) is inverted. Thus `\cz' becomes hex 1A, but `\c{' becomes hex 3B, while `\c;' becomes hex 7B.
\dxxx
Produces or matches a character whose decimal ascii value is xxx.
\oxxx
Produces or matches a character whose octal ascii value is xxx.
\xHH
Produces or matches a character whose hexadecimal ascii value is HH.
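
The arithmetic behind `\cx' and the numeric escapes can be checked with a short Python sketch. The helper names control_escape and numeric_escape are made up for this illustration and belong to no tool; the code only mirrors the rule as described above (upper-case a lower case letter, then invert bit 6) and plain base conversion for the decimal, octal and hexadecimal forms.

```python
def control_escape(x: str) -> int:
    """Return the character code that \\cx stands for (hypothetical helper)."""
    if len(x) != 1:
        raise ValueError("expected a single character")
    # Step 1: a lower case letter is converted to upper case.
    if "a" <= x <= "z":
        x = x.upper()
    # Step 2: bit 6 of the character (hex 40) is inverted.
    return ord(x) ^ 0x40


def numeric_escape(kind: str, digits: str) -> int:
    """Return the code for a decimal (d), octal (o) or hexadecimal (x) escape.

    Hypothetical helper: it only performs the base conversion.
    """
    base = {"d": 10, "o": 8, "x": 16}[kind]
    return int(digits, base)


print(hex(control_escape("z")))  # 0x1a
print(hex(control_escape("{")))  # 0x3b
print(hex(control_escape(";")))  # 0x7b

# Three spellings of the same character, "A" (ascii 65).
print(numeric_escape("d", "065"), numeric_escape("o", "101"), numeric_escape("x", "41"))
```

The three control_escape calls reproduce the hex 1A, 3B and 7B values quoted in the `\cx' entry.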

`\b' (backspace) was omitted because of the conflict with the existing “word boundary” meaning.

Other escapes match a particular character class and are valid only in regular expressions:

\w
Matches any “word” character. A “word” character is any letter or digit or the underscore character.
\W
Matches any “non-word” character.
\b
Matches a word boundary; that is, it matches if the character to the left is a “word” character and the character to the right is a “non-word” character, or vice versa.
\B
Matches everywhere but on a word boundary; that is, it matches if the character to the left and the character to the right are either both “word” characters or both “non-word” characters.
\`
Matches only at the start of pattern space. This is different from ^ in multi-line mode.
\'
Matches only at the end of pattern space. This is different from $ in multi-line mode.
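
As a rough illustration of these classes and anchors, here is a sketch using Python's re module. This is an assumption-laden analogy, not the GNU implementation: Python is a different regex engine, and its \A and \Z are used here only as stand-ins for \` and \', with a plain string playing the role of the pattern space. The \w, \W, \b and \B escapes have the same spelling in Python.

```python
import re

# Stand-in for a pattern space that holds two lines.
text = "foo bar\nbaz_1 qux"

# \w+ picks out runs of "word" characters: letters, digits, underscore.
words = re.findall(r"\w+", text, flags=re.ASCII)

# \b matches between a "word" and a "non-word" character (or at the
# edges of the text next to a word character); \B matches everywhere else.
boundaries = [m.start() for m in re.finditer(r"\b", "ab cd")]
non_boundaries = [m.start() for m in re.finditer(r"\B", "ab cd")]

# In multi-line mode ^ and $ match at every line; \A and \Z (Python's
# counterparts of \` and \') still match only at the very start and end.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
buffer_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)
buffer_end = re.findall(r"\w+\Z", text, flags=re.MULTILINE)

print(words)           # ['foo', 'bar', 'baz_1', 'qux']
print(boundaries)      # [0, 2, 3, 5]
print(non_boundaries)  # [1, 4]
print(line_starts, buffer_start)  # ['foo', 'baz_1'] ['foo']
print(line_ends, buffer_end)      # ['bar', 'qux'] ['qux']
```

The last two lines show the difference noted above: the line anchors fire once per line, while the buffer anchors fire once for the whole text.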
