Continuing on; today I'm here:
[doc]/guide/regexp.html, or rather here:
Basic Assertions, the guide calls them; you tell me how that should be translated. Not least because these are probably just the same old rules as always.
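The assertions ^ and $ identify the beginning and the end of the text string, respectively. They ensure that their adjoining regexps match at one or other end of the text string.
In (regexp-match-positions #rx"^contact" "first contact") the regexp fails to match because contact does not occur at the beginning of the text string. In (regexp-match-positions #rx"laugh$" "laugh laugh laugh laugh") the regexp matches the last laugh.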
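To see the two anchors side by side, here is a minimal sketch. The sample string and the variable names are my own, not the guide's; only regexp-match-positions comes from Racket itself.

```racket
#lang racket/base

;; Sample text invented for this sketch (not from the guide).
(define text "tick tock tick")

;; ^ pins the match to the start of the string: only the first tick qualifies.
(define at-start (regexp-match-positions #rx"^tick" text))

;; $ pins the match to the end of the string: only the last tick qualifies.
(define at-end (regexp-match-positions #rx"tick$" text))

;; tock occurs in the string, but not at the start, so this fails with #f.
(define no-match (regexp-match-positions #rx"^tock" text))

(printf "~s ~s ~s\n" at-start at-end no-match)
```

The positions are half-open ranges, so the last tick shows up as (10 . 14) in a 14-character string.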
The metasequence \b asserts that a word boundary exists, but this metasequence works only with #px syntax. In (regexp-match-positions #px"yack\\b" "yackety yack") the yack in yackety doesn’t end at a word boundary so it isn’t matched. The second yack does and is.
Well, this one is new :wink: never trust the Racketeers, they're too nerdy :wink:
The metasequence \B (also #px only) has the opposite effect to \b; it asserts that a word boundary does not exist. In (regexp-match-positions #px"an\\B" "an analysis") the an that doesn’t end in a word boundary is matched.
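A small sketch of \b against \B on the same input may help. The sample string and names are my own assumptions; I use regexp-match-positions* so that every match is listed, not just the first.

```racket
#lang racket/base

;; Sample text invented for this sketch (not from the guide).
(define text "cat concat cats")

;; \b needs a word boundary right after cat. Inside a Racket string the
;; backslash must itself be escaped, hence "\\b".
(define bounded (regexp-match-positions* #px"cat\\b" text))

;; \B needs the absence of a word boundary right after cat.
(define unbounded (regexp-match-positions* #px"cat\\B" text))

(printf "~s ~s\n" bounded unbounded)
```

The standalone cat and the tail of concat are both followed by a space, so \b accepts them; only the cat inside cats is followed by another word character, so that is the one \B accepts.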
OK, ready to move on to the next page, this one:
Characters and character classes
Typically, a character in the regexp matches the same character in the text string. Sometimes it is necessary or convenient to use a regexp metasequence to refer to a single character. For example, the metasequence \. matches the period character.
The metacharacter . matches any character (other than newline in multi-line mode): the pattern #rx"p.t" matches pet. It also matches pat, pit, pot, put, and p8t, but not peat or pfffft.
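A minimal sketch of the dot next to its escaped form. The 3.14 strings and the variable names are made up by me for illustration.

```racket
#lang racket/base

;; The unescaped dot stands for exactly one character, whatever it is.
(define any-letter (regexp-match #rx"p.t" "pet"))
(define any-digit  (regexp-match #rx"p.t" "p8t"))

;; Two characters between p and t are one too many, so this is #f.
(define too-long (regexp-match #rx"p.t" "peat"))

;; The escaped dot matches only a literal period. The backslash is
;; doubled because the pattern lives inside a Racket string.
(define literal-dot (regexp-match #rx"3\\.14" "3.14"))
(define not-a-dot   (regexp-match #rx"3\\.14" "3x14"))

(printf "~s ~s ~s ~s ~s\n" any-letter any-digit too-long literal-dot not-a-dot)
```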
A character class matches any one character from a set of characters. A typical format for this is the bracketed character class [...], which matches any one character from the non-empty sequence of characters enclosed within the brackets. Thus, #rx"p[aeiou]t" matches pat, pet, pit, pot, put, and nothing else.
Inside the brackets, a - between two characters specifies the Unicode range between the characters. For example, #rx"ta[b-dgn-p]" matches tab, tac, tad, tag, tan, tao, and tap.
An initial ^ after the left bracket inverts the set specified by the rest of the contents; i.e., it specifies the set of characters other than those identified in the brackets. For example, #rx"do[^g]" matches all three-character sequences starting with do except dog.
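The three flavours of bracketed class (plain set, range, inverted set) can be tried together. This is only a sketch: the word lists and names are mine, and I anchored the first two patterns with ^ and $ so that each whole word is tested.

```racket
#lang racket/base

;; Plain set: one vowel between p and t.
(define vowel-hits
  (filter (lambda (w) (regexp-match? #rx"^p[aeiou]t$" w))
          '("pat" "pet" "p8t" "pyt")))

;; Ranges: b through d, a lone g, and n through p.
(define range-hits
  (filter (lambda (w) (regexp-match? #rx"^ta[b-dgn-p]$" w))
          '("tab" "tag" "tao" "tax")))

;; Inverted set: do followed by anything except g, so dog is skipped
;; and the first match is dot.
(define negated (regexp-match #rx"do[^g]" "dog dot"))

(printf "~s ~s ~s\n" vowel-hits range-hits negated)
```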
Note that the metacharacter ^ inside brackets means something quite different from what it means outside. Most other metacharacters (., *, +, ?, etc.) cease to be metacharacters when inside brackets, although you may still escape them for peace of mind. A - is a metacharacter only when it’s inside brackets, and when it is neither the first nor the last character between the brackets.
Bracketed character classes cannot contain other bracketed character classes (although they contain certain other types of character classes; see below). Thus, a [ inside a bracketed character class doesn’t have to be a metacharacter; it can stand for itself. For example, #rx"[a[b]" matches a, [, and b.
Furthermore, since empty bracketed character classes are disallowed, a ] immediately occurring after the opening left bracket also doesn’t need to be a metacharacter. For example, #rx"[]ab]" matches ], a, and b.
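Brackets inside brackets are easy to misread, so here is a quick sketch that runs both classes over the same string. The test string and names are my own; regexp-match* returns every match in order.

```racket
#lang racket/base

;; Test string invented for this sketch.
(define text "x[ab]y")

;; [a[b] is the class {a, [, b}: the inner [ is just a literal.
(define with-open (regexp-match* #rx"[a[b]" text))

;; []ab] is the class {], a, b}: a ] right after the opening bracket is
;; a literal, because an empty class is not allowed.
(define with-close (regexp-match* #rx"[]ab]" text))

(printf "~s ~s\n" with-open with-close)
```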
In short, no simpler than usual :roll:
Some frequently used character classes
In #px syntax, some standard character classes can be conveniently represented as metasequences instead of as explicit bracketed expressions: \d matches a digit (the same as [0-9]); \s matches an ASCII whitespace character; and \w matches a character that could be part of a “word”.
Following regexp custom, we identify “word” characters as [A-Za-z0-9_], although these are too restrictive for what a Racketeer might consider a “word.”
The upper-case versions of these metasequences stand for the inversions of the corresponding character classes: \D matches a non-digit, \S a non-whitespace character, and \W a non-“word” character.
Remember to include a double backslash when putting these metasequences in a Racket string, as in #px"\\d\\d".
These character classes can be used inside a bracketed expression. For example, #px"[a-z\\d]" matches a lower-case letter or a digit.
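A short sketch of the metasequences at work, doubled backslashes included. The input strings and names are invented by me for illustration.

```racket
#lang racket/base

;; \d\d wants two adjacent digits: the lone 7 is skipped, 12 is found.
(define two-digits (regexp-match #px"\\d\\d" "room 7, floor 12"))

;; \w+ collects runs of word characters (letters, digits, underscore).
(define words (regexp-match* #px"\\w+" "hello, big_world 42!"))

;; \D+ collects runs of non-digits.
(define non-digits (regexp-match* #px"\\D+" "a1bc22d"))

;; A metasequence used inside a bracketed class.
(define lower-or-digit (regexp-match* #px"[a-z\\d]" "A1-b2"))

(printf "~s ~s ~s ~s\n" two-digits words non-digits lower-or-digit)
```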
POSIX character classes
A POSIX character class is a special metasequence of the form [:...:] that can be used only inside a bracketed expression in #px syntax. The POSIX classes supported are:
[:alnum:] — ASCII letters and digits
[:alpha:] — ASCII letters
[:ascii:] — ASCII characters
[:blank:] — ASCII widthful whitespace: space and tab
[:cntrl:] — “control” characters: ASCII 0 to 32
[:digit:] — ASCII digits, same as \d
[:graph:] — ASCII characters that use ink
[:lower:] — ASCII lower-case letters
[:print:] — ASCII ink-users plus widthful whitespace
[:space:] — ASCII whitespace, same as \s
[:upper:] — ASCII upper-case letters
[:word:] — ASCII letters, digits, and _, same as \w
[:xdigit:] — ASCII hex digits
For example, #px"[[:alpha:]_]" matches a letter or underscore.
The POSIX class notation is valid only inside a bracketed expression. For instance, [:alpha:], when not inside a bracketed expression, will not be read as the letter class. Rather, it is (from previous principles) the character class containing the characters :, a, l, p, h.
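The difference between the doubled and the single brackets is easiest to see by running both. A minimal sketch; the test strings and names are my own choice.

```racket
#lang racket/base

;; Inside an outer bracket, [:alpha:] is the POSIX letter class,
;; here combined with a literal underscore.
(define letter   (regexp-match #px"[[:alpha:]_]" "--x--"))
(define under    (regexp-match #px"[[:alpha:]_]" "--_--"))
(define no-colon (regexp-match #px"[[:alpha:]_]" "--:--"))

;; Without the outer bracket it is an ordinary class holding
;; the characters : a l p h, so x fails while a and : succeed.
(define plain-a     (regexp-match #px"[:alpha:]" "--a--"))
(define plain-x     (regexp-match #px"[:alpha:]" "--x--"))
(define plain-colon (regexp-match #px"[:alpha:]" "--:--"))

(printf "~s ~s ~s ~s ~s ~s\n" letter under no-colon plain-a plain-x plain-colon)
```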
I'm sure I'm not the only one who gets muddled here :roll: