586 that a function does not have a result, but potentially causes |
586 that a function does not have a result, but potentially causes |
587 some side-effect. Typical examples are the printing functions, |
587 some side-effect. Typical examples are the printing functions, |
588 like \code{print}. |
588 like \code{print}. |
589 |
589 |
590 |
590 |
591 \subsection*{Cool Stuff} |
591 % \subsection*{Cool Stuff} |
592 |
592 |
593 The first wow-moment I had with Scala was when I came across |
593 % The first wow-moment I had with Scala was when I came across |
594 the following code-snippet for reading a web-page. |
594 % the following code-snippet for reading a web-page. |
595 |
595 |
596 |
596 |
597 \begin{lstlisting}[ numbers=none] |
597 % \begin{lstlisting}[ numbers=none] |
598 import io.Source |
598 % import io.Source |
599 val url = """http://www.inf.kcl.ac.uk/staff/urbanc/""" |
599 % val url = """http://www.inf.kcl.ac.uk/staff/urbanc/""" |
600 Source.fromURL(url)("ISO-8859-1").take(10000).mkString |
600 % Source.fromURL(url)("ISO-8859-1").take(10000).mkString |
601 \end{lstlisting} |
601 % \end{lstlisting} |
602 |
602 |
603 |
603 |
604 \noindent These three lines return a string containing the |
604 % \noindent These three lines return a string containing the |
605 HTML-code of my webpage. It actually already does something |
605 % HTML-code of my webpage. It actually already does something |
606 more sophisticated, namely only returns the first 10000 |
606 % more sophisticated, namely only returns the first 10000 |
607 characters of a webpage in case it is too large. Why is that |
607 % characters of a webpage in case it is too large. Why is that |
608 code-snippet of any interest? Well, try implementing |
608 % code-snippet of any interest? Well, try implementing |
609 reading-from-a-webpage in Java. I also like the possibility of |
609 % reading-from-a-webpage in Java. I also like the possibility of |
610 triple-quoting strings, which I have only seen in Scala so |
610 % triple-quoting strings, which I have only seen in Scala so |
611 far. The idea behind this is that in such a string all |
611 % far. The idea behind this is that in such a string all |
612 characters are interpreted literally---there are no escaped |
612 % characters are interpreted literally---there are no escaped |
613 characters, like \verb|\n| for newlines. |
613 % characters, like \verb|\n| for newlines. |
614 |
614 |
615 My second wow-moment I had with a feature of Scala that other |
615 % My second wow-moment I had with a feature of Scala that other |
616 functional programming languages do not have. This feature is |
616 % functional programming languages do not have. This feature is |
617 about implicit type conversions. If you have regular |
617 % about implicit type conversions. If you have regular |
618 expressions and want to use them for language processing you |
618 % expressions and want to use them for language processing you |
619 often want to recognise keywords in a language, for example |
619 % often want to recognise keywords in a language, for example |
620 \code{for},{} \code{if},{} \code{yield} and so on. But the |
620 % \code{for},{} \code{if},{} \code{yield} and so on. But the |
621 basic regular expression \code{CHAR} can only recognise a |
621 % basic regular expression \code{CHAR} can only recognise a |
622 single character. In order to recognise a whole string, like |
622 % single character. In order to recognise a whole string, like |
623 \code{for}, you have to put many of those together using |
623 % \code{for}, you have to put many of those together using |
624 \code{SEQ}: |
624 % \code{SEQ}: |
625 |
625 |
626 |
626 |
627 \begin{lstlisting}[numbers=none] |
627 % \begin{lstlisting}[numbers=none] |
628 SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r'))) |
628 % SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r'))) |
629 \end{lstlisting} |
629 % \end{lstlisting} |
630 |
630 |
631 \noindent This gets quickly unreadable when the strings and |
631 % \noindent This gets quickly unreadable when the strings and |
632 regular expressions get more complicated. In other functional |
632 % regular expressions get more complicated. In other functional |
633 programming languages, you can explicitly write a conversion |
633 % programming languages, you can explicitly write a conversion |
634 function that takes a string, say \dq{\pcode{for}}, and |
634 % function that takes a string, say \dq{\pcode{for}}, and |
635 generates the regular expression above. But then your code is |
635 % generates the regular expression above. But then your code is |
636 littered with such conversion functions. |
636 % littered with such conversion functions. |
637 |
637 |
638 In Scala you can do better by ``hiding'' the conversion |
638 % In Scala you can do better by ``hiding'' the conversion |
639 functions. The keyword for doing this is \code{implicit} and |
639 % functions. The keyword for doing this is \code{implicit} and |
640 it needs a built-in library called |
640 % it needs a built-in library called |
641 |
641 |
642 \begin{lstlisting}[numbers=none] |
642 % \begin{lstlisting}[numbers=none] |
643 scala.language.implicitConversions |
643 % scala.language.implicitConversions |
644 \end{lstlisting} |
644 % \end{lstlisting} |
645 |
645 |
646 \noindent |
646 % \noindent |
647 Consider the code |
647 % Consider the code |
648 |
648 |
649 |
649 |
650 \begin{lstlisting}[language=Scala] |
650 % \begin{lstlisting}[language=Scala] |
651 import scala.language.implicitConversions |
651 % import scala.language.implicitConversions |
652 |
652 |
653 def charlist2rexp(s: List[Char]) : Rexp = s match { |
653 % def charlist2rexp(s: List[Char]) : Rexp = s match { |
654 case Nil => EMPTY |
654 % case Nil => EMPTY |
655 case c::Nil => CHAR(c) |
655 % case c::Nil => CHAR(c) |
656 case c::s => SEQ(CHAR(c), charlist2rexp(s)) |
656 % case c::s => SEQ(CHAR(c), charlist2rexp(s)) |
657 } |
657 % } |
658 |
658 |
659 implicit def string2rexp(s: String) : Rexp = |
659 % implicit def string2rexp(s: String) : Rexp = |
660 charlist2rexp(s.toList) |
660 % charlist2rexp(s.toList) |
661 \end{lstlisting} |
661 % \end{lstlisting} |
662 |
662 |
663 |
663 |
664 \noindent where the first seven lines implement a function |
664 % \noindent where the first seven lines implement a function |
665 that given a list of characters generates the corresponding |
665 % that given a list of characters generates the corresponding |
666 regular expression. In Lines 9 and 10, this function is used |
666 % regular expression. In Lines 9 and 10, this function is used |
667 for transforming a string into a regular expression. Since the |
667 % for transforming a string into a regular expression. Since the |
668 \code{string2rexp}-function is declared as \code{implicit}, |
668 % \code{string2rexp}-function is declared as \code{implicit}, |
669 the effect will be that whenever Scala expects a regular |
669 % the effect will be that whenever Scala expects a regular |
670 expression, but I only give it a string, it will automatically |
670 % expression, but I only give it a string, it will automatically |
671 insert a call to the \code{string2rexp}-function. I can now |
671 % insert a call to the \code{string2rexp}-function. I can now |
672 write for example |
672 % write for example |
673 |
673 |
674 \begin{lstlisting}[numbers=none] |
674 % \begin{lstlisting}[numbers=none] |
675 scala> ALT("ab", "ac") |
675 % scala> ALT("ab", "ac") |
676 res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
676 % res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
677 \end{lstlisting} |
677 % \end{lstlisting} |
678 |
678 |
679 \noindent Recall that \code{ALT} expects two regular |
679 % \noindent Recall that \code{ALT} expects two regular |
680 expressions as arguments, but I only supply two strings. The |
680 % expressions as arguments, but I only supply two strings. The |
681 implicit conversion function will transform the string into a |
681 % implicit conversion function will transform the string into a |
682 regular expression. |
682 % regular expression. |
683 |
683 |
684 Using implicit definitions, Scala allows me to introduce |
684 % Using implicit definitions, Scala allows me to introduce |
685 some further syntactic sugar for regular expressions: |
685 % some further syntactic sugar for regular expressions: |
686 |
686 |
687 |
687 |
688 \begin{lstlisting}[ numbers=none] |
688 % \begin{lstlisting}[ numbers=none] |
689 implicit def RexpOps(r: Rexp) = new { |
689 % implicit def RexpOps(r: Rexp) = new { |
690 def | (s: Rexp) = ALT(r, s) |
690 % def | (s: Rexp) = ALT(r, s) |
691 def ~ (s: Rexp) = SEQ(r, s) |
691 % def ~ (s: Rexp) = SEQ(r, s) |
692 def % = STAR(r) |
692 % def % = STAR(r) |
693 } |
693 % } |
694 |
694 |
695 implicit def stringOps(s: String) = new { |
695 % implicit def stringOps(s: String) = new { |
696 def | (r: Rexp) = ALT(s, r) |
696 % def | (r: Rexp) = ALT(s, r) |
697 def | (r: String) = ALT(s, r) |
697 % def | (r: String) = ALT(s, r) |
698 def ~ (r: Rexp) = SEQ(s, r) |
698 % def ~ (r: Rexp) = SEQ(s, r) |
699 def ~ (r: String) = SEQ(s, r) |
699 % def ~ (r: String) = SEQ(s, r) |
700 def % = STAR(s) |
700 % def % = STAR(s) |
701 } |
701 % } |
702 \end{lstlisting} |
702 % \end{lstlisting} |
703 |
703 |
704 |
704 |
705 \noindent This might seem a bit overly complicated, but its effect is |
705 % \noindent This might seem a bit overly complicated, but its effect is |
706 that I can now write regular expressions such as $ab + ac$ |
706 % that I can now write regular expressions such as $ab + ac$ |
707 simply as |
707 % simply as |
708 |
708 |
709 |
709 |
710 \begin{lstlisting}[numbers=none] |
710 % \begin{lstlisting}[numbers=none] |
711 scala> "ab" | "ac" |
711 % scala> "ab" | "ac" |
712 res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
712 % res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
713 \end{lstlisting} |
713 % \end{lstlisting} |
714 |
714 |
715 |
715 |
716 \noindent I leave you to figure out what the other |
716 % \noindent I leave you to figure out what the other |
717 syntactic sugar in the code above stands for. |
717 % syntactic sugar in the code above stands for. |
718 |
718 |
719 One more useful feature of Scala is the ability to define |
719 % One more useful feature of Scala is the ability to define |
720 functions with varying argument lists. This is a feature that |
720 % functions with varying argument lists. This is a feature that |
721 is already present in old languages, like C, but seems to have |
721 % is already present in old languages, like C, but seems to have |
722 been forgotten in the meantime---Java does not have it. In the |
722 % been forgotten in the meantime---Java does not have it. In the |
723 context of regular expressions this feature comes in handy: |
723 % context of regular expressions this feature comes in handy: |
724 Say you are fed up with writing many alternatives as |
724 % Say you are fed up with writing many alternatives as |
725 |
725 |
726 |
726 |
727 \begin{lstlisting}[numbers=none] |
727 % \begin{lstlisting}[numbers=none] |
728 ALT(..., ALT(..., ALT(..., ...))) |
728 % ALT(..., ALT(..., ALT(..., ...))) |
729 \end{lstlisting} |
729 % \end{lstlisting} |
730 |
730 |
731 |
731 |
732 \noindent To make it difficult, you do not know how deep such |
732 % \noindent To make it difficult, you do not know how deep such |
733 alternatives are nested. So you need something flexible that |
733 % alternatives are nested. So you need something flexible that |
734 can take as many alternatives as needed. In Scala one can |
734 % can take as many alternatives as needed. In Scala one can |
735 achieve this by adding a \code{*} to the type of an argument. |
735 % achieve this by adding a \code{*} to the type of an argument. |
736 Consider the code |
736 % Consider the code |
737 |
737 |
738 |
738 |
739 \begin{lstlisting}[language=Scala] |
739 % \begin{lstlisting}[language=Scala] |
740 def Alts(rs: List[Rexp]) : Rexp = rs match { |
740 % def Alts(rs: List[Rexp]) : Rexp = rs match { |
741 case Nil => NULL |
741 % case Nil => NULL |
742 case r::Nil => r |
742 % case r::Nil => r |
743 case r::rs => ALT(r, Alts(rs)) |
743 % case r::rs => ALT(r, Alts(rs)) |
744 } |
744 % } |
745 |
745 |
746 def ALTS(rs: Rexp*) = Alts(rs.toList) |
746 % def ALTS(rs: Rexp*) = Alts(rs.toList) |
747 \end{lstlisting} |
747 % \end{lstlisting} |
748 |
748 |
749 |
749 |
750 \noindent The function in Lines 1 to 5 takes a list of regular |
750 % \noindent The function in Lines 1 to 5 takes a list of regular |
751 expressions and converts it into an appropriate alternative |
751 % expressions and converts it into an appropriate alternative |
752 regular expression. In Line 7 there is a wrapper for this |
752 % regular expression. In Line 7 there is a wrapper for this |
753 function which uses the feature of varying argument lists. The |
753 % function which uses the feature of varying argument lists. The |
754 effect of this code is that I can write the regular |
754 % effect of this code is that I can write the regular |
755 expression for keywords as |
755 % expression for keywords as |
756 |
756 |
757 |
757 |
758 \begin{lstlisting}[numbers=none] |
758 % \begin{lstlisting}[numbers=none] |
759 ALTS("for", "def", "yield", "implicit", "if", "match", "case") |
759 % ALTS("for", "def", "yield", "implicit", "if", "match", "case") |
760 \end{lstlisting} |
760 % \end{lstlisting} |
761 |
761 |
762 |
762 |
763 \noindent Again I leave it to you to find out how much this |
763 % \noindent Again I leave it to you to find out how much this |
764 simplifies the regular expression in comparison with if I had |
764 % simplifies the regular expression in comparison with if I had |
765 to write this by hand using only the ``plain'' regular |
765 % to write this by hand using only the ``plain'' regular |
766 expressions from the inductive datatype. |
766 % expressions from the inductive datatype. |
|
767 |
|
768 \bigskip\noindent |
|
769 \textit{More TBD.} |
767 |
770 |
768 \subsection*{More Info} |
771 \subsection*{More Info} |
769 |
772 |
770 There is much more to Scala than I can possibly describe in |
773 There is much more to Scala than I can possibly describe in |
771 this document. Fortunately there are a number of free books |
774 this document. Fortunately there are a number of free books |