%This part is about regular expressions, Brzozowski derivatives,
%and a bit-coded lexing algorithm with proven correctness and time bounds.

%TODO: look up snort rules to use here--give readers idea of what regexes look like

\marginpar{rephrasing using ``imprecise words''}
Regular expressions, since their inception in the 1940s,
have been subject to extensive study and implementation.
Their primary application lies in text processing: finding
matches and identifying patterns in a string.
%It is often used to match strings that comprises of numerous fields,
%where certain fields may recur or be omitted.
For example, a simple regular expression that tries
to recognise email addresses is
\marginpar{rephrased from ``the regex for recognising'' to ``a simple regex that tries to match email''}
\begin{center}
\verb|[a-z0-9._]^+@[a-z0-9.-]^+\.[a-z]{2,6}|
%$[a-z0-9._]^+@[a-z0-9.-]^+\.[a-z]{2,6}$.
\end{center}
\marginpar{Simplified example, but the distinction between . and escaped . is correct
and therefore left unchanged. Also, verbatim text does not straightforwardly support superscripts, so the + signs are kept as they are.}
%Using this, regular expression matchers and lexers are able to extract
%the domain names by the use of \verb|[a-zA-Z0-9.-]+|.
\marginpar{Rewrote explanation for the expression.}
The bracketed sub-expressions are used to extract specific parts of an email address.
The local part is recognised by the expression enclosed in