cws/core_cw02.tex
changeset 474 b528d1d3d3c3
parent 471 135bf034ac30
473:ac79c2e534bd 474:b528d1d3d3c3
    32 apply an automated marking script to them.\medskip
    32 apply an automated marking script to them.\medskip
    33 
    33 
    34 \noindent
    34 \noindent
    35 In addition, the Scala part comes with reference
    35 In addition, the Scala part comes with reference
    36 implementations in form of \texttt{jar}-files. This allows you to run
    36 implementations in form of \texttt{jar}-files. This allows you to run
    37 any test cases on your own computer. For example you can call scala-cli on
    37 any test cases on your own computer. For example you can call \texttt{scala-cli} on
    38 the command line with the option \texttt{--extra-jars docdiff.jar} and then
    38 the command line with the option \texttt{--extra-jars docdiff.jar} and then
    39 query any function from the template file. Say you want to find out
    39 query any function from the template file. Say you want to find out
    40 what the function \texttt{occurrences} produces: for this you just need
    40 what the function \texttt{occurrences} produces: for this you just need
    41 to prefix it with the object name \texttt{C2}.  If you want to find out what
    41 to prefix it with the object name \texttt{C2}.  If you want to find out what
    42 these functions produce for the list \texttt{List("a", "b", "b")},
    42 these functions produce for the list \texttt{List("a", "b", "b")},
   139   \texttt{overlap}(d_1, d_2) = \frac{d_1 \cdot d_2}{\max(d_1^2, d_2^2)}
   139   \texttt{overlap}(d_1, d_2) = \frac{d_1 \cdot d_2}{\max(d_1^2, d_2^2)}
   140   \]
   140   \]
   141 
   141 
   142   where $d_1^2$ means $d_1 \cdot d_1$ and so on.
   142   where $d_1^2$ means $d_1 \cdot d_1$ and so on.
   143   You can expect this function to return a \texttt{Double} between 0 and 1. The
   143   You can expect this function to return a \texttt{Double} between 0 and 1. The
   144   overlap between the lists in (2) is $0.5384615384615384$.
   144   overlap between the lists in Task (2) is $0.5384615384615384$.
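  As a small illustration of the formula, assume that documents are
  given by their occurrence counts and that $d_1 \cdot d_2$ sums, over
  the strings common to both documents, the products of their counts
  (an assumption made for this sketch). Take $d_1$ from
  \texttt{List("a", "b", "b")} and $d_2$ from the made-up list
  \texttt{List("b", "c")}. Then
  \[
  d_1 \cdot d_2 = 2 \cdot 1 = 2, \qquad
  d_1^2 = 1^2 + 2^2 = 5, \qquad
  d_2^2 = 1^2 + 1^2 = 2
  \]
  and therefore $\texttt{overlap}(d_1, d_2) = \frac{2}{\max(5, 2)} = 0.4$.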
   145 
   145 
   146   Second, implement a function that calculates the similarity of
   146   Second, implement a function that calculates the similarity of
   147   two strings, by first extracting the substrings using the clean
   147   two strings, by first extracting the substrings using the clean
   148   function from (1)
   148   function from (1)
   149   and then calculating the overlap of the resulting documents.\\
   149   and then calculating the overlap of the resulting documents.\\