\pcode{(re)} & groups regular expressions and remembers
matched text
\end{tabular}
\end{center}
|
\noindent
The syntax is pretty universal and can be found in many regular
expression libraries. If you need a quick recap of regular
expressions and how they match strings, here is a short video:
\url{https://youtu.be/bgBWp9EIlMM}.

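As a small illustration of the last table entry, the following sketch
shows how grouping with \pcode{(re)} remembers matched text, using the
class \pcode{scala.util.matching.Regex} from the Scala standard library.
The pattern, the input string and the names \pcode{GroupDemo} and
\pcode{extract} are made up for this example; treat it as one possible
way to experiment with groups, not as part of the matcher developed in
this module.

```scala
import scala.util.matching.Regex

object GroupDemo {
  // Hypothetical pattern: lowercase letters, an '@', more lowercase letters.
  // Each pair of parentheses forms a group whose matched text is remembered.
  val pattern: Regex = "([a-z]+)@([a-z]+)".r

  // Returns the text remembered by group 1 and group 2 for the first
  // match found anywhere in s, or None if the pattern does not occur.
  def extract(s: String): Option[(String, String)] =
    pattern.findFirstMatchIn(s).map(m => (m.group(1), m.group(2)))
}
```

For instance, \pcode{GroupDemo.extract("contact: alice@example")} should
return the two remembered pieces \pcode{alice} and \pcode{example}, while
an input without an \pcode{@} yields \pcode{None}.
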
still beat them hands down with our regex matcher.

\subsection*{Basic Regular Expressions}

In this module we will call the regular expressions shown
earlier for Scala \emph{extended regular expressions}. The ones we
will mainly study are \emph{basic regular
expressions}, which by convention we will just call
\emph{regular expressions}, if it is clear what we mean. The
attraction of (basic) regular expressions is that many
features of the extended ones are just syntactic sugar.
(Basic) regular expressions are defined by the following
\noindent Because we overload our notation, there are some
subtleties you should be aware of. When regular expressions
are referred to, then $\ZERO$ (in bold font) does not stand for
the number zero: rather it is a particular pattern that does
not match any string. Similarly, in the context of regular
expressions, $\ONE$ does not stand for the number one, but for
a regular expression that matches the empty string. The letter
$c$ stands for any character from the alphabet at hand. Again,
in the context of regular expressions, it is a particular
pattern that can match the specified character. You should
also be careful with our overloading of the star: assuming you