New chapter, on a classic and panic-inducing topic: regular expressions (RE, or regexps).
The guide reminds us that this chapter is a modified version of a classic: Dorai Sitaram, “pregexp: Portable Regular Expressions for Scheme and Common Lisp.” 2002.
A regexp value encapsulates a pattern that is described by a string or byte string. The regexp matcher tries to match this pattern against (a portion of) another string or byte string, which we will call the text string, when you call functions like regexp-match. The text string is treated as raw text, and not as a pattern.
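To make the pattern/text-string distinction concrete, here is a minimal sketch. The names found and not-found are mine, chosen only for illustration; the second call shows that a . in the text string is plain text and does not act as a wildcard.

```racket
#lang racket/base

;; First argument: the pattern. Second argument: the text string.
(define found (regexp-match #rx"abc" "xxabcxx"))
found      ; a list holding the matched portion of the text

;; The "." below sits in the text string, so it is raw text:
;; it cannot stand in for the "b" that the pattern requires.
(define not-found (regexp-match #rx"abc" "a.c"))
not-found  ; #f, since no portion of the text matches
```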
The guide covers the subject at length, but in case that is not enough: Regular Expressions in The Racket Reference provides more on regexps.
OK, moving on to [doc]/guide/regexp-intro.html, where I find…
A couple of things: I will say regexp too, and as for the “patterns” of the original title, as usual it is hard to translate into Italian, uhmmm: schemi, modelli, ___________.
A string or byte string can be used directly as a regexp pattern, or it can be prefixed with #rx to form a literal regexp value. For example, #rx"abc" is a string-based regexp value, and #rx#"abc" is a byte string-based regexp value. Alternately, a string or byte string can be prefixed with #px, as in #px"abc", for a slightly extended syntax of patterns within the string.
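A small sketch of the forms just listed, using the standard predicates regexp?, byte-regexp? and pregexp? to check what each literal produces. The variable names are hypothetical, picked just for this example.

```racket
#lang racket/base

;; A plain string used directly as the pattern.
(define m-plain (regexp-match "abc" "xxabcxx"))

;; #rx with a string gives a string-based regexp value.
(define string-based? (regexp? #rx"abc"))

;; #rx with a byte string gives a byte string-based regexp value.
(define bytes-based? (byte-regexp? #rx#"abc"))

;; #px selects the slightly extended pattern syntax.
(define extended? (pregexp? #px"abc"))

(list m-plain string-based? bytes-based? extended?)
```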
Most of the characters in a regexp pattern are meant to match occurrences of themselves in the text string. Thus, the pattern #rx"abc" matches a string that contains the characters a, b, and c in succession. Other characters act as metacharacters, and some character sequences act as metasequences. That is, they specify something other than their literal selves. For example, in the pattern #rx"a.c", the characters a and c stand for themselves, but the metacharacter . can match any character. Therefore, the pattern #rx"a.c" matches an a, any character, and c in succession.
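A quick sketch of the . metacharacter at work. The text strings and the names m-abc, m-dash and m-none are made up for the example.

```racket
#lang racket/base

;; "." matches the "b" between "a" and "c".
(define m-abc (regexp-match #rx"a.c" "abc"))

;; "." matches any character, here a dash; the match is a portion of the text.
(define m-dash (regexp-match #rx"a.c" "xa-cx"))

;; "." must match exactly one character, so "ac" alone is not enough.
(define m-none (regexp-match #rx"a.c" "ac"))

(list m-abc m-dash m-none)
```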
If we need to match the character . itself, we can escape it by preceding it with a \. The character sequence \. is thus a metasequence, since it doesn’t match itself but rather just .. So, to match a, ., and c in succession, we use the regexp pattern #rx"a\\.c"; the double \ is an artifact of Racket strings, not the regexp pattern itself.
Note: When we want a literal \ inside a Racket string or regexp literal, we must escape it so that it shows up in the string at all. Racket strings use \ as the escape character, so we end up with two \s: one Racket-string \ to escape the regexp \, which then escapes the .. Another character that would need escaping inside a Racket string is ".
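The double backslash is easier to believe once you count characters. In this sketch (names are mine), the Racket string "a\\.c" contains only four characters, one of which is a single backslash; that is what the regexp matcher actually sees.

```racket
#lang racket/base

;; Written with two backslashes in source, but the string holds just one:
;; a, \, ., c
(define pattern-length (string-length "a\\.c"))

;; The escaped dot matches only a literal "." in the text string.
(define m-dot (regexp-match #rx"a\\.c" "a.c"))

;; It no longer matches an arbitrary character such as "b".
(define m-b (regexp-match #rx"a\\.c" "abc"))

(list pattern-length m-dot m-b)
```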
The regexp function takes a string or byte string and produces a regexp value. Use regexp when you construct a pattern to be matched against multiple strings, since a pattern is compiled to a regexp value before it can be used in a match. The pregexp function is like regexp, but using the extended syntax. Regexp values as literals with #rx or #px are compiled once and for all when they are read.
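A sketch of compiling once and matching many times. The list texts and the other names are hypothetical; the \d used with pregexp (a digit) belongs to the extended syntax and is not described in the passage above, so take it as my own choice of example.

```racket
#lang racket/base

;; Compile each pattern once, outside the loop over the text strings.
(define rx (regexp "a.c"))
(define px (pregexp "a\\d+c")) ; \d+ means one or more digits in #px syntax

(define texts (list "abc" "a12c" "xyz"))

;; regexp-match? returns just #t or #f instead of the matched portions.
(define rx-results (map (lambda (s) (regexp-match? rx s)) texts))
(define px-results (map (lambda (s) (regexp-match? px s)) texts))

(list rx-results px-results)
```

Passing the string "a.c" directly to regexp-match? would also work, but then the pattern has to be compiled on each call, which is the cost the text suggests avoiding.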
The regexp-quote function takes an arbitrary string and returns a string for a pattern that matches exactly the original string. In particular, characters in the input string that could serve as regexp metacharacters are escaped with a backslash, so that they safely match only themselves.
The regexp-quote function is useful when building a composite regexp from a mix of regexp strings and verbatim strings.
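A sketch of both uses of regexp-quote: quoting on its own, and building a composite regexp. The label "total (EUR): " and the names label, price-rx and naive-rx are invented for the example; the parentheses in the label would otherwise be read as grouping metacharacters.

```racket
#lang racket/base

;; No metacharacters: the string comes back unchanged.
(define q-plain (regexp-quote "cons"))

;; "?" is a metacharacter, so it gets a backslash in front.
(define q-meta (regexp-quote "list?"))

;; Composite regexp: a verbatim label followed by a regexp for digits.
(define label "total (EUR): ")
(define price-rx (regexp (string-append (regexp-quote label) "[0-9]+")))
(define m-quoted (regexp-match price-rx "total (EUR): 42"))

;; Without quoting, the parentheses form a group and the text does not match.
(define naive-rx (regexp (string-append label "[0-9]+")))
(define m-naive (regexp-match naive-rx "total (EUR): 42"))

(list q-plain q-meta m-quoted m-naive)
```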