handouts/pep-ho.tex
changeset 143 11396c17cd8b
parent 125 dcaab8068baa
child 152 114a89518aea
equal deleted inserted replaced
142:6f4d8b5e6d80 143:11396c17cd8b
   586 that a function does not have a result, but potentially causes
   586 that a function does not have a result, but potentially causes
   587 some side-effect. Typical examples are the printing functions, 
   587 some side-effect. Typical examples are the printing functions, 
   588 like \code{print}.
   588 like \code{print}.
   589 
   589 
   590 
   590 
   591 \subsection*{Cool Stuff}
   591 % \subsection*{Cool Stuff}
   592 
   592 
   593 The first wow-moment I had with Scala was when I came across
   593 % The first wow-moment I had with Scala was when I came across
   594 the following code-snippet for reading a web-page. 
   594 % the following code-snippet for reading a web-page. 
   595 
   595 
   596 
   596 
   597 \begin{lstlisting}[ numbers=none]
   597 % \begin{lstlisting}[ numbers=none]
   598 import io.Source
   598 % import io.Source
   599 val url = """http://www.inf.kcl.ac.uk/staff/urbanc/"""
   599 % val url = """http://www.inf.kcl.ac.uk/staff/urbanc/"""
   600 Source.fromURL(url)("ISO-8859-1").take(10000).mkString
   600 % Source.fromURL(url)("ISO-8859-1").take(10000).mkString
   601 \end{lstlisting}
   601 % \end{lstlisting}
   602 
   602 
   603 
   603 
   604 \noindent These three lines return a string containing the
   604 % \noindent These three lines return a string containing the
   605 HTML-code of my webpage. It actually already does something
   605 % HTML-code of my webpage. It actually already does something
   606 more sophisticated, namely only returns the first 10000
   606 % more sophisticated, namely only returns the first 10000
   607 characters of a webpage in case it is too large. Why is that
   607 % characters of a webpage in case it is too large. Why is that
   608 code-snippet of any interest? Well, try implementing
   608 % code-snippet of any interest? Well, try implementing
   609 reading-from-a-webpage in Java. I also like the possibility of
   609 % reading-from-a-webpage in Java. I also like the possibility of
   610 triple-quoting strings, which I have only seen in Scala so
   610 % triple-quoting strings, which I have only seen in Scala so
   611 far. The idea behind this is that in such a string all
   611 % far. The idea behind this is that in such a string all
   612 characters are interpreted literally---there are no escaped
   612 % characters are interpreted literally---there are no escaped
   613 characters, like \verb|\n| for newlines.
   613 % characters, like \verb|\n| for newlines.
   614 
   614 
   615 My second wow-moment I had with a feature of Scala that other
   615 % My second wow-moment I had with a feature of Scala that other
   616 functional programming languages do not have. This feature is
   616 % functional programming languages do not have. This feature is
   617 about implicit type conversions. If you have regular
   617 % about implicit type conversions. If you have regular
   618 expressions and want to use them for language processing you
   618 % expressions and want to use them for language processing you
   619 often want to recognise keywords in a language, for example
   619 % often want to recognise keywords in a language, for example
   620 \code{for},{} \code{if},{} \code{yield} and so on. But the
   620 % \code{for},{} \code{if},{} \code{yield} and so on. But the
   621 basic regular expression \code{CHAR} can only recognise a
   621 % basic regular expression \code{CHAR} can only recognise a
   622 single character. In order to recognise a whole string, like
   622 % single character. In order to recognise a whole string, like
   623 \code{for}, you have to put many of those together using
   623 % \code{for}, you have to put many of those together using
   624 \code{SEQ}:
   624 % \code{SEQ}:
   625 
   625 
   626 
   626 
   627 \begin{lstlisting}[numbers=none]
   627 % \begin{lstlisting}[numbers=none]
   628 SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r')))
   628 % SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r')))
   629 \end{lstlisting}
   629 % \end{lstlisting}
   630 
   630 
   631 \noindent This gets quickly unreadable when the strings and
   631 % \noindent This gets quickly unreadable when the strings and
   632 regular expressions get more complicated. In other functional
   632 % regular expressions get more complicated. In other functional
   633 programming languages, you can explicitly write a conversion
   633 % programming languages, you can explicitly write a conversion
   634 function that takes a string, say \dq{\pcode{for}}, and
   634 % function that takes a string, say \dq{\pcode{for}}, and
   635 generates the regular expression above. But then your code is
   635 % generates the regular expression above. But then your code is
   636 littered with such conversion functions.
   636 % littered with such conversion functions.
   637 
   637 
   638 In Scala you can do better by ``hiding'' the conversion
   638 % In Scala you can do better by ``hiding'' the conversion
   639 functions. The keyword for doing this is \code{implicit} and
   639 % functions. The keyword for doing this is \code{implicit} and
   640 it needs a built-in library called 
   640 % it needs a built-in library called 
   641 
   641 
   642 \begin{lstlisting}[numbers=none]
   642 % \begin{lstlisting}[numbers=none]
   643 scala.language.implicitConversions
   643 % scala.language.implicitConversions
   644 \end{lstlisting}
   644 % \end{lstlisting}
   645 
   645 
   646 \noindent
   646 % \noindent
   647 Consider the code
   647 % Consider the code
   648 
   648 
   649 
   649 
   650 \begin{lstlisting}[language=Scala]
   650 % \begin{lstlisting}[language=Scala]
   651 import scala.language.implicitConversions
   651 % import scala.language.implicitConversions
   652 
   652 
   653 def charlist2rexp(s: List[Char]) : Rexp = s match {
   653 % def charlist2rexp(s: List[Char]) : Rexp = s match {
   654   case Nil => EMPTY
   654 %   case Nil => EMPTY
   655   case c::Nil => CHAR(c)
   655 %   case c::Nil => CHAR(c)
   656   case c::s => SEQ(CHAR(c), charlist2rexp(s))
   656 %   case c::s => SEQ(CHAR(c), charlist2rexp(s))
   657 }
   657 % }
   658 
   658 
   659 implicit def string2rexp(s: String) : Rexp = 
   659 % implicit def string2rexp(s: String) : Rexp = 
   660   charlist2rexp(s.toList)
   660 %   charlist2rexp(s.toList)
   661 \end{lstlisting}
   661 % \end{lstlisting}
   662 
   662 
   663 
   663 
   664 \noindent where the first seven lines implement a function
   664 % \noindent where the first seven lines implement a function
   665 that given a list of characters generates the corresponding
   665 % that given a list of characters generates the corresponding
   666 regular expression. In Lines 9 and 10, this function is used
   666 % regular expression. In Lines 9 and 10, this function is used
   667 for transforming a string into a regular expression. Since the
   667 % for transforming a string into a regular expression. Since the
   668 \code{string2rexp}-function is declared as \code{implicit},
   668 % \code{string2rexp}-function is declared as \code{implicit},
   669 the effect will be that whenever Scala expects a regular
   669 % the effect will be that whenever Scala expects a regular
   670 expression, but I only give it a string, it will automatically
   670 % expression, but I only give it a string, it will automatically
   671 insert a call to the \code{string2rexp}-function. I can now
   671 % insert a call to the \code{string2rexp}-function. I can now
   672 write for example
   672 % write for example
   673 
   673 
   674 \begin{lstlisting}[numbers=none]
   674 % \begin{lstlisting}[numbers=none]
   675 scala> ALT("ab", "ac")
   675 % scala> ALT("ab", "ac")
   676 res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
   676 % res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
   677 \end{lstlisting}
   677 % \end{lstlisting}
   678 
   678 
   679 \noindent Recall that \code{ALT} expects two regular
   679 % \noindent Recall that \code{ALT} expects two regular
   680 expressions as arguments, but I only supply two strings. The
   680 % expressions as arguments, but I only supply two strings. The
   681 implicit conversion function will transform the string into a
   681 % implicit conversion function will transform the string into a
   682 regular expression.
   682 % regular expression.
   683 
   683 
   684 Using implicit definitions, Scala allows me to introduce
   684 % Using implicit definitions, Scala allows me to introduce
   685 some further syntactic sugar for regular expressions:
   685 % some further syntactic sugar for regular expressions:
   686 
   686 
   687 
   687 
   688 \begin{lstlisting}[ numbers=none]
   688 % \begin{lstlisting}[ numbers=none]
   689 implicit def RexpOps(r: Rexp) = new {
   689 % implicit def RexpOps(r: Rexp) = new {
   690   def | (s: Rexp) = ALT(r, s)
   690 %   def | (s: Rexp) = ALT(r, s)
   691   def ~ (s: Rexp) = SEQ(r, s)
   691 %   def ~ (s: Rexp) = SEQ(r, s)
   692   def % = STAR(r)
   692 %   def % = STAR(r)
   693 }
   693 % }
   694 
   694 
   695 implicit def stringOps(s: String) = new {
   695 % implicit def stringOps(s: String) = new {
   696   def | (r: Rexp) = ALT(s, r)
   696 %   def | (r: Rexp) = ALT(s, r)
   697   def | (r: String) = ALT(s, r)
   697 %   def | (r: String) = ALT(s, r)
   698   def ~ (r: Rexp) = SEQ(s, r)
   698 %   def ~ (r: Rexp) = SEQ(s, r)
   699   def ~ (r: String) = SEQ(s, r)
   699 %   def ~ (r: String) = SEQ(s, r)
   700   def % = STAR(s)
   700 %   def % = STAR(s)
   701 }
   701 % }
   702 \end{lstlisting}
   702 % \end{lstlisting}
   703 
   703 
   704  
   704  
   705 \noindent This might seem a bit overly complicated, but its effect is
   705 % \noindent This might seem a bit overly complicated, but its effect is
   706 that I can now write regular expressions such as $ab + ac$ 
   706 % that I can now write regular expressions such as $ab + ac$ 
   707 simply as
   707 % simply as
   708 
   708 
   709 
   709 
   710 \begin{lstlisting}[numbers=none]
   710 % \begin{lstlisting}[numbers=none]
   711 scala> "ab" | "ac"
   711 % scala> "ab" | "ac"
   712 res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
   712 % res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
   713 \end{lstlisting}
   713 % \end{lstlisting}
   714 
   714 
   715  
   715  
   716 \noindent I leave you to figure out what the other
   716 % \noindent I leave you to figure out what the other
   717 syntactic sugar in the code above stands for.
   717 % syntactic sugar in the code above stands for.
   718  
   718  
   719 One more useful feature of Scala is the ability to define
   719 % One more useful feature of Scala is the ability to define
   720 functions with varying argument lists. This is a feature that
   720 % functions with varying argument lists. This is a feature that
   721 is already present in old languages, like C, but seems to have
   721 % is already present in old languages, like C, but seems to have
   722 been forgotten in the meantime---Java does not have it. In the
   722 % been forgotten in the meantime---Java does not have it. In the
   723 context of regular expressions this feature comes in handy:
   723 % context of regular expressions this feature comes in handy:
   724 Say you are fed up with writing many alternatives as
   724 % Say you are fed up with writing many alternatives as
   725 
   725 
   726 
   726 
   727 \begin{lstlisting}[numbers=none]
   727 % \begin{lstlisting}[numbers=none]
   728 ALT(..., ALT(..., ALT(..., ...)))
   728 % ALT(..., ALT(..., ALT(..., ...)))
   729 \end{lstlisting}
   729 % \end{lstlisting}
   730 
   730 
   731 
   731 
   732 \noindent To make it difficult, you do not know how deep such
   732 % \noindent To make it difficult, you do not know how deep such
   733 alternatives are nested. So you need something flexible that
   733 % alternatives are nested. So you need something flexible that
   734 can take as many alternatives as needed. In Scala one can
   734 % can take as many alternatives as needed. In Scala one can
   735 achieve this by adding a \code{*} to the type of an argument.
   735 % achieve this by adding a \code{*} to the type of an argument.
   736 Consider the code
   736 % Consider the code
   737 
   737 
   738 
   738 
   739 \begin{lstlisting}[language=Scala]
   739 % \begin{lstlisting}[language=Scala]
   740 def Alts(rs: List[Rexp]) : Rexp = rs match {
   740 % def Alts(rs: List[Rexp]) : Rexp = rs match {
   741   case Nil => NULL
   741 %   case Nil => NULL
   742   case r::Nil => r
   742 %   case r::Nil => r
   743   case r::rs => ALT(r, Alts(rs))
   743 %   case r::rs => ALT(r, Alts(rs))
   744 }
   744 % }
   745 
   745 
   746 def ALTS(rs: Rexp*) = Alts(rs.toList)
   746 % def ALTS(rs: Rexp*) = Alts(rs.toList)
   747 \end{lstlisting}
   747 % \end{lstlisting}
   748 
   748 
   749 
   749 
   750 \noindent The function in Lines 1 to 5 takes a list of regular
   750 % \noindent The function in Lines 1 to 5 takes a list of regular
   751 expressions and converts it into an appropriate alternative
   751 % expressions and converts it into an appropriate alternative
   752 regular expression. In Line 7 there is a wrapper for this
   752 % regular expression. In Line 7 there is a wrapper for this
   753 function which uses the feature of varying argument lists. The
   753 % function which uses the feature of varying argument lists. The
   754 effect of this code  is that I can write the regular
   754 % effect of this code  is that I can write the regular
   755 expression for keywords as
   755 % expression for keywords as
   756 
   756 
   757 
   757 
   758 \begin{lstlisting}[numbers=none]
   758 % \begin{lstlisting}[numbers=none]
   759 ALTS("for", "def", "yield", "implicit", "if", "match", "case")
   759 % ALTS("for", "def", "yield", "implicit", "if", "match", "case")
   760 \end{lstlisting}
   760 % \end{lstlisting}
   761 
   761 
   762 
   762 
   763 \noindent Again I leave it to you to find out how much this
   763 % \noindent Again I leave it to you to find out how much this
   764 simplifies the regular expression in comparison with if I had
   764 % simplifies the regular expression in comparison with if I had
   765 to write this by hand using only the ``plain'' regular
   765 % to write this by hand using only the ``plain'' regular
   766 expressions from the inductive datatype.
   766 % expressions from the inductive datatype.
       
   767 
       
   768 \bigskip\noindent
       
   769 \textit{More TBD.}
   767 
   770 
   768 \subsection*{More Info}
   771 \subsection*{More Info}
   769 
   772 
   770 There is much more to Scala than I can possibly describe in
   773 There is much more to Scala than I can possibly describe in
   771 this document. Fortunately there are a number of free books
   774 this document. Fortunately there are a number of free books