8 \usepackage{tikz} |
9 \usepackage{tikz} |
9 \usepackage{pgf} |
10 \usepackage{pgf} |
10 \usetikzlibrary{positioning} |
11 \usetikzlibrary{positioning} |
11 %\usepackage{pdfsetup} |
12 %\usepackage{pdfsetup} |
12 \usepackage{stmaryrd} |
13 \usepackage{stmaryrd} |
13 %\usepackage{url} |
14 \usepackage{url} |
14 %\usepackage{color} |
15 %\usepackage{color} |
15 %\usepackage[safe]{tipa} |
16 %\usepackage[safe]{tipa} |
16 %\usepackage[sc]{mathpazo} |
17 %\usepackage[sc]{mathpazo} |
17 %\usepackage{fontspec} |
18 %\usepackage{fontspec} |
18 %\setmainfont[Ligatures=TeX]{Palatino Linotype} |
19 %\setmainfont[Ligatures=TeX]{Palatino Linotype} |
56 \def\Brz{Brzozowski} |
57 \def\Brz{Brzozowski} |
57 \def\der{\backslash} |
58 \def\der{\backslash} |
58 \newtheorem{falsehood}{Falsehood} |
59 \newtheorem{falsehood}{Falsehood} |
59 \newtheorem{conject}{Conjecture} |
60 \newtheorem{conject}{Conjecture} |
60 |
61 |
61 \bibliographystyle{plainurl} |
62 %!\bibliographystyle{plainurl} |
|
63 \bibliographystyle{plain} |
62 |
64 |
63 \title{{POSIX} {L}exing with {B}itcoded {D}erivatives} |
65 \title{{POSIX} {L}exing with {B}itcoded {D}erivatives} |
64 \titlerunning{POSIX Lexing with Bitcoded Derivatives} |
66 \titlerunning{POSIX Lexing with Bitcoded Derivatives} |
65 \author{Chengsong Tan}{King's College London}{chengsong.tan@kcl.ac.uk}{}{} |
67 \author{Chengsong Tan\inst{1,2} \and Christian Urban\inst{2}} |
66 \author{Christian Urban}{King's College London}{christian.urban@kcl.ac.uk}{}{} |
68 \institute{Imperial College London \and King's College London\\ |
67 \authorrunning{C.~Tan and C.~Urban} |
69 \email{\{chengsong.tan,christian.urban\}@kcl.ac.uk}} |
68 \keywords{POSIX matching and lexing, derivatives of regular expressions, Isabelle/HOL} |
70 %!\author{Chengsong Tan}{King's College London}{chengsong.tan@kcl.ac.uk}{}{} |
69 \category{} |
71 %!\author{Christian Urban}{King's College London}{christian.urban@kcl.ac.uk}{}{} |
70 \ccsdesc[100]{Design and analysis of algorithms} |
72 %!\authorrunning{C.~Tan and C.~Urban} |
71 \ccsdesc[100]{Formal languages and automata theory} |
73 %!\keywords{POSIX matching and lexing, derivatives of regular expressions, Isabelle/HOL} |
72 \Copyright{\mbox{}} |
74 %!\category{} |
73 \renewcommand{\DOIPrefix}{} |
75 %!\ccsdesc[100]{Design and analysis of algorithms} |
74 \nolinenumbers |
76 %!\ccsdesc[100]{Formal languages and automata theory} |
|
77 %!\Copyright{\mbox{}} |
|
78 %!\renewcommand{\DOIPrefix}{} |
|
79 %!\nolinenumbers |
75 |
80 |
76 |
81 |
77 \begin{document} |
82 \begin{document} |
78 \maketitle |
83 \maketitle |
79 |
84 |
91 derivatives finitely bounded. Without simplification the size of some derivatives can grow |
96 derivatives finitely bounded. Without simplification the size of some derivatives can grow |
92 arbitrarily big, resulting in an extremely slow lexing algorithm. In this |
97 arbitrarily big, resulting in an extremely slow lexing algorithm. In this |
93 paper we describe a variant of Sulzmann and Lu's algorithm: Our |
98 paper we describe a variant of Sulzmann and Lu's algorithm: Our |
94 variant is a recursive functional program, whereas Sulzmann |
99 variant is a recursive functional program, whereas Sulzmann |
95 and Lu's version involves a fixpoint construction. We \textit{(i)} |
100 and Lu's version involves a fixpoint construction. We \textit{(i)} |
96 prove in Isabelle/HOL that our algorithm is correct and generates |
101 prove in Isabelle/HOL that our variant is correct and generates |
97 unique POSIX values; we also \textit{(ii)} establish finite |
102 unique POSIX values (no such proof has been given for the |
98 bounds for the size of the derivatives. |
103 original algorithm by Sulzmann and Lu); we also \textit{(ii)} establish finite |
|
104 bounds for the size of our derivatives. |
99 |
105 |
100 %The size can be seen as a |
106 %The size can be seen as a |
101 %proxy measure for the efficiency of the lexing algorithm: because of |
107 %proxy measure for the efficiency of the lexing algorithm: because of |
102 %the polynomial bound our algorithm does not suffer from |
108 %the polynomial bound our algorithm does not suffer from |
103 %the exponential blowup in earlier works. |
109 %the exponential blowup in earlier works. |