15 Knuth-Morris-Pratt algorithm, which is currently the most |
15 Knuth-Morris-Pratt algorithm, which is currently the most |
16 efficient general string search algorithm. But often we do |
16 efficient general string search algorithm. But often we do |
17 \emph{not} just look for a particular string, but for string |
17 \emph{not} just look for a particular string, but for string |
18 patterns. For example in programming code we need to identify |
18 patterns. For example in programming code we need to identify |
19 what are the keywords, what are the identifiers etc. A pattern |
19 what are the keywords, what are the identifiers etc. A pattern |
20 for identifiers could be that they start with a letter, |
20 for identifiers could be stated as: they start with a letter, |
21 followed by zero or more letters, numbers and the underscore. |
21 followed by zero or more letters, numbers and the underscore. |
22 Also often we face the problem that we are given a string (for |
22 Also often we face the problem that we are given a string (for |
23 example some user input) and want to know whether it matches a |
23 example some user input) and want to know whether it matches a |
24 particular pattern. In this way we can exclude user input that |
24 particular pattern. In this way we can exclude user input that |
25 would otherwise have nasty effects on our program (crashing it |
25 would otherwise have nasty effects on our program (crashing it |
26 or going into an infinite loop, if not worse). \defn{Regular |
26 or going into an infinite loop, if not worse). \defn{Regular |
27 expressions} help with conveniently specifying such patterns. |
27 expressions} help with conveniently specifying such patterns. |
28 The idea behind regular expressions is that they are a simple |
28 The idea behind regular expressions is that they are a simple |
29 method for describing languages (or sets of strings)\ldots at |
29 method for describing languages (or sets of strings)\ldots at |
30 least languages we are interested in in computer science. For |
30 least languages we are interested in in computer science. For |
31 example there is no convenient regular expression for |
31 example there is no convenient regular expression for |
32 describing the English language short of enumerating all |
32 describing the English language short of enumerating all |