cws/core_cw02.tex
changeset 498 0f1b97538ad4
parent 471 31b81f20fd9a
child 501 3717785f2c37
497:ef37fb04a343 498:0f1b97538ad4
     7 \begin{document}
     7 \begin{document}
     8 
     8 
     9 
     9 
    10 %% should ask to lower case the words.
    10 %% should ask to lower case the words.
    11 
    11 
    12 \section*{Core Part 2 (Scala, 3 Marks)}
    12 \section*{Core Part 2 (Scala, 1.5 Marks)}
    13 
    13 
    14 \mbox{}\hfill\textit{``What one programmer can do in one month,}\\
    14 \mbox{}\hfill\textit{``What one programmer can do in one month,}\\
    15 \mbox{}\hfill\textit{two programmers can do in two months.''}\smallskip\\
    15 \mbox{}\hfill\textit{two programmers can do in two months.''}\smallskip\\
    16 \mbox{}\hfill\textit{ --- Frederick P.~Brooks (author of The Mythical Man-Month)}\bigskip\medskip
    16 \mbox{}\hfill\textit{ --- Frederick P.~Brooks (author of The Mythical Man-Month)}\bigskip\medskip
    17 
    17 
    64 integers, \texttt{.max} calculates the maximum of a list.\bigskip
    64 integers, \texttt{.max} calculates the maximum of a list.\bigskip
    65 
    65 
    66 
    66 
    67 
    67 
    68 \newpage
    68 \newpage
    69 \subsection*{Core Part 2 (3 Marks, file docdiff.scala)}
    69 \subsection*{Core Part 2 (1.5  Marks, file docdiff.scala)}
    70 
    70 
    71 It seems plagiarism---stealing and submitting someone
    71 It seems plagiarism---stealing and submitting someone
    72 else's code---is a serious problem at other
    72 else's code---is a serious problem at other
    73 universities.\footnote{Surely, King's students, after all their
    73 universities.\footnote{Surely, King's students, after all their
    74   instructions and warnings, would never commit such an offence. Yes?}
    74   instructions and warnings, would never commit such an offence. Yes?}
    85 \item[(1)] Implement a function that `cleans' a string by finding all
    85 \item[(1)] Implement a function that `cleans' a string by finding all
    86   (proper) words in the string. For this use the regular expression
    86   (proper) words in the string. For this use the regular expression
    87   \texttt{\textbackslash{}w+} for recognising words and the library function
    87   \texttt{\textbackslash{}w+} for recognising words and the library function
    88   \texttt{findAllIn}. The function should return a document (a list of
    88   \texttt{findAllIn}. The function should return a document (a list of
    89   strings).
    89   strings).
    90   \mbox{}\hfill\mbox{[0.5 Marks]}
    90   \mbox{}\hfill\mbox{[0.25 Marks]}
    91 
    91 
    92 \item[(2)] In order to compute the overlap between two documents, we
    92 \item[(2)] In order to compute the overlap between two documents, we
    93   associate each document with a \texttt{Map}. This Map represents the
    93   associate each document with a \texttt{Map}. This Map represents the
    94   strings in a document and how many times these strings occur in the
    94   strings in a document and how many times these strings occur in the
    95   document. A simple (though slightly inefficient) method for counting
    95   document. A simple (though slightly inefficient) method for counting
   106 
   106 
   107   \begin{center}
   107   \begin{center}
   108   \pcode{occurrences(List("d", "b", "d", "b", "d"))}
   108   \pcode{occurrences(List("d", "b", "d", "b", "d"))}
   109   \end{center}
   109   \end{center}
   110 
   110 
   111   produces \pcode{Map(d -> 3, b -> 2)}.\hfill[1 Mark]
   111   produces \pcode{Map(d -> 3, b -> 2)}.\hfill[0.5 Marks]
   112 
   112 
   113 \item[(3)] You can think of the Maps calculated under (2) as memory-efficient
   113 \item[(3)] You can think of the Maps calculated under (2) as memory-efficient
   114   representations of sparse ``vectors''. In this subtask you need to
   114   representations of sparse ``vectors''. In this subtask you need to
   115   implement the \emph{product} of two such vectors, sometimes also called
   115   implement the \emph{product} of two such vectors, sometimes also called
   116   \emph{dot product} of two vectors.\footnote{\url{https://en.wikipedia.org/wiki/Dot_product}}
   116   \emph{dot product} of two vectors.\footnote{\url{https://en.wikipedia.org/wiki/Dot_product}}
   128     \underbrace{2 * 2}_{"b"} \;\;+\;\;
   128     \underbrace{2 * 2}_{"b"} \;\;+\;\;
   129     \underbrace{1 * 0}_{"c"} \;\;+\;\;
   129     \underbrace{1 * 0}_{"c"} \;\;+\;\;
   130     \underbrace{1 * 3}_{"d"} \qquad = 7
   130     \underbrace{1 * 3}_{"d"} \qquad = 7
   131   \]  
   131   \]  
   132   
   132   
   133   \hfill\mbox{[1 Mark]}
   133   \hfill\mbox{[0.5 Marks]}
   134 
   134 
   135 \item[(4)] Implement first a function that calculates the overlap
   135 \item[(4)] Implement first a function that calculates the overlap
   136   between two documents, say $d_1$ and $d_2$, according to the formula
   136   between two documents, say $d_1$ and $d_2$, according to the formula
   137 
   137 
   138   \[
   138   \[
   145 
   145 
   146   Second, implement a function that calculates the similarity of
   146   Second, implement a function that calculates the similarity of
   147   two strings, by first extracting the substrings using the clean
   147   two strings, by first extracting the substrings using the clean
   148   function from (1)
   148   function from (1)
   149   and then calculating the overlap of the resulting documents.\\
   149   and then calculating the overlap of the resulting documents.\\
   150   \mbox{}\hfill\mbox{[0.5 Marks]}
   150   \mbox{}\hfill\mbox{[0.25 Marks]}
   151 \end{itemize}
   151 \end{itemize}
   152 
   152 
   153 
   153 
   154 \end{document} 
   154 \end{document} 
   155 
   155
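
Subtasks (1) to (3) above describe cleaning a string with \w+ and findAllIn, counting occurrences into a Map, and taking the dot product of two such Maps. The following is a minimal Scala sketch of those three steps, not the reference solution. Only the names `findAllIn` and `occurrences` come from the sheet. The object name `DocDiffSketch`, the function name `dot`, all signatures, and the first example vector are assumptions made for illustration. Counting via `groupBy` is one possible choice and may differ from the method the sheet suggests. The overlap formula from (4) is not shown in this changeset view, so it is left out.

```scala
// Illustrative sketch only; names other than `occurrences` and
// `findAllIn` are hypothetical and not taken from the coursework sheet.
object DocDiffSketch {

  // (1) Find all words matching \w+ and return them as a document,
  //     i.e. a list of strings.
  def clean(s: String): List[String] =
    """\w+""".r.findAllIn(s).toList

  // (2) Map every distinct string in a document to the number of
  //     times it occurs (here via groupBy, an assumed choice).
  def occurrences(xs: List[String]): Map[String, Int] =
    xs.groupBy(identity).map { case (w, ws) => (w, ws.length) }

  // (3) Dot product of two sparse vectors represented as Maps.
  //     A key that is missing from m2 contributes 0 to the sum.
  def dot(m1: Map[String, Int], m2: Map[String, Int]): Int =
    m1.keys.toList.map(k => m1(k) * m2.getOrElse(k, 0)).sum
}
```

For instance, with the assumed vector Map("a" -> 1, "b" -> 2, "c" -> 1, "d" -> 1) and the Map produced by occurrences(List("d", "b", "d", "b", "d")), the terms are 1 * 0 for "a", 2 * 2 for "b", 1 * 0 for "c" and 1 * 3 for "d", which matches the terms visible in the worked sum in (3).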