lex/index.html
author Christian Urban <urbanc@in.tum.de>
Sat, 28 Jan 2017 07:17:00 +0000
changeset 465 4dac76eb27d9
parent 442 e36d33525295
permissions -rw-r--r--
updated
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" 
"http://www.w3.org/TR/REC-html40/loose.dtd"> 
<HEAD>
<TITLE>ADU</TITLE>
<BASE HREF="http://nms.kcl.ac.uk/christian.urban/">
</HEAD>
<BODY TEXT="#000000" 
      BGCOLOR="#4169E1" 
      LINK="#0000EF" 
      VLINK="#51188E" 
      ALINK="#FF0000">
<TABLE WIDTH="100%" 
       BGCOLOR="#4169E1" 
       BORDER="0"   
       FRAME="border"  
       CELLPADDING="10"     
       CELLSPACING="2"
       RULES="all">
<!-- right column -->
<TR>
<TD BGCOLOR="#FFFFFF" WIDTH="75%">
<H2>POSIX Lexing with Derivatives of Regular Expressions (Proof Pearl)</H2>
 
Fahad Ausaf, Roy Dyckhoff, Christian Urban
<p>
Brzozowski introduced the notion of derivatives for regular
expressions. They can be used for a very simple regular expression
matching algorithm.  Sulzmann and Lu cleverly extended this algorithm
in order to deal with POSIX matching, which is the underlying
disambiguation strategy for regular expressions needed in lexers.
Sulzmann and Lu have made available on-line what they call a
"rigorous proof" of the correctness of their algorithm w.r.t. their
specification; regrettably, it appears to us to have unfillable gaps.
In the first part of this paper we give our inductive definition of
what a POSIX value is and show (i) that such a value is unique (for
a given regular expression and string being matched) and (ii) that
Sulzmann and Lu's algorithm always generates such a value (provided
that the regular expression matches the string).  We also prove the
correctness of an optimised version of the POSIX matching
algorithm. Our definitions and proof are much simpler than those by
Sulzmann and Lu and can be easily formalised in Isabelle/HOL. In the
second part we analyse the correctness argument by Sulzmann and Lu and
explain why the gaps in this argument cannot be filled easily.
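<p>
As an illustration of the "very simple regular expression matching
algorithm" mentioned above, the following is a minimal sketch of a
Brzozowski-style derivative matcher. It is not taken from the paper or the
theory files: the class and function names (Zero, One, Chr, Alt, Seq, Star,
nullable, der, matches) are assumptions chosen for this example, and the
sketch only answers yes or no. It does not compute POSIX values, which is
what the extension by Sulzmann and Lu adds.

```python
from dataclasses import dataclass


class Rexp:
    """Base class for regular expressions (names are illustrative)."""


@dataclass(frozen=True)
class Zero(Rexp):      # matches no string at all
    pass


@dataclass(frozen=True)
class One(Rexp):       # matches only the empty string
    pass


@dataclass(frozen=True)
class Chr(Rexp):       # matches the single character c
    c: str


@dataclass(frozen=True)
class Alt(Rexp):       # r1 | r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Seq(Rexp):       # r1 followed by r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Star(Rexp):      # zero or more repetitions of r
    r: Rexp


def nullable(r: Rexp) -> bool:
    """Return True if r can match the empty string."""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, (Zero, Chr)):
        return False
    if isinstance(r, Alt):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Seq):
        return nullable(r.r1) and nullable(r.r2)
    raise TypeError(f"unknown regular expression: {r!r}")


def der(c: str, r: Rexp) -> Rexp:
    """Derivative of r with respect to c: matches s iff r matches c + s."""
    if isinstance(r, (Zero, One)):
        return Zero()
    if isinstance(r, Chr):
        return One() if r.c == c else Zero()
    if isinstance(r, Alt):
        return Alt(der(c, r.r1), der(c, r.r2))
    if isinstance(r, Seq):
        left = Seq(der(c, r.r1), r.r2)
        # If r1 can be empty, c may also be consumed by r2.
        return Alt(left, der(c, r.r2)) if nullable(r.r1) else left
    if isinstance(r, Star):
        return Seq(der(c, r.r), r)
    raise TypeError(f"unknown regular expression: {r!r}")


def matches(r: Rexp, s: str) -> bool:
    """Take the derivative for each character, then test nullability."""
    for c in s:
        r = der(c, r)
    return nullable(r)
```

<p>
For example, with r = Seq(Star(Alt(Chr("a"), Chr("b"))), Chr("c")), that is
(a|b)*c, the call matches(r, "abac") succeeds while matches(r, "ab") does
not. The sketch performs no simplification, so the derivatives can grow
quickly on longer inputs.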
  
<H3>Theory Files for Isabelle 2016</H3>
  
<ul>
<li> <A HREF="http://talisker.inf.kcl.ac.uk/%7Eurbanc/cgi-bin/repos.cgi/lexing/raw-file/tip/thys/Lexer.thy">Lexer.thy</A>
<li> <A HREF="http://talisker.inf.kcl.ac.uk/%7Eurbanc/cgi-bin/repos.cgi/lexing/raw-file/tip/thys/Simplifying.thy">Simplifying.thy</A>
</ul>
<H3>Links</H3>
<ul>
<li> <A HREF="http://talisker.inf.kcl.ac.uk/%7Eurbanc/cgi-bin/repos.cgi/lexing/raw-file/tip/thys/paper.pdf">our paper</A>
<li> <A HREF="http://talisker.inf.kcl.ac.uk/%7Eurbanc/cgi-bin/repos.cgi/lexing/raw-file/tip/Literature/sulzmann14-new.pdf">the paper</A> by Sulzmann and Lu
</ul>
  
</TABLE>
<P>
<!-- hhmts start --> Last modified: Wed May 18 15:01:19 BST 2016 <!-- hhmts end -->
<a href="http://validator.w3.org/check/referer">[Validate this page.]</a>
</BODY>
</HTML>