pep-material: cws/main_cw02.tex@d076cb2e0b75



% !TEX program = xelatex
\documentclass{article}
\usepackage{../styles/style}
\usepackage{disclaimer}
\usepackage{../styles/langs}
\usepackage{graphicx}

\begin{document}


%% should ask to lower case the words.

\section*{Evil Wordle Game (Scala, 7 Marks)}

\mbox{}\hfill\textit{``C makes it easy to shoot yourself in the foot; C++ makes it harder,}\\
\mbox{}\hfill\textit{ but when you do, it blows your whole leg off.''}\smallskip\\
\mbox{}\hfill\textit{ --- Bjarne Stroustrup (creator of the C++ language)}\bigskip\bigskip




\noindent
You are asked to implement a Scala program for making the popular Wordle game as difficult
as possible.\bigskip

\IMPORTANTNONE{}

\noindent
Also note that the running time of each part will be restricted to a
maximum of 30 seconds on my laptop.

\DISCLAIMER{}

\subsection*{Reference Implementation}

Like the C++ part, the Scala part works like this: you push your files
to GitHub and receive (after sometimes a long delay) some automated
feedback. In the end we will take a snapshot of the submitted files
and apply an automated marking script to them.\medskip

\noindent
In addition, the Scala part comes with reference
implementations in form of \texttt{jar}-files. This allows you to run
any test cases on your own computer. For example you can call Scala on
the command line with the option \texttt{-cp danube.jar} and then
query any function from the template file. Say you want to find out
what the function \texttt{} produces: for this you just need
to prefix it with the object name \texttt{M2}.  If you want to find out what
these functions produce for the list \texttt{List("a", "b", "b")},
you would type something like:

\begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small]
$ scala -cp wordle.jar
scala> val secretsURL =
     | """https://nms.kcl.ac.uk/christian.urban/wordle.txt"""

scala> M2.get_wordle_list(secretsURL)
val res0: List[String] = List(aahed, aalii, ...)
\end{lstlisting}%$

\subsection*{Hints}

\noindent
Useful data functions: \texttt{Source.fromURL},
\texttt{Source.fromFile} for obtaining a webpage and reading a file,
\texttt{.getOrElse(..,..)} allows to query a Map, but also gives a
default value if the Map is not defined, a Map can be `updated' by
using \texttt{+}, \texttt{.contains} and \texttt{.filter} can test whether
an element is included in a list, and respectively filter out elements in a list,
\texttt{.sortBy(\_.\_2)} sorts a list of pairs according to the second
elements in the pairs---the sorting is done from smallest to highest,
\texttt{.groupBy} orders lists according to same elements
.


\newpage


\subsection*{Main Part 2 (6 Marks, file wordle.scala)}

You probably know the game of Wordle\footnote{\url{https://en.wikipedia.org/wiki/Wordle}}
where you are supposed to guess a five-letter word. The feedback for guesses can help
with the next guess (green letters are correct, orange letters are present, but in the
wrong place). For example:

\begin{center}
\includegraphics[scale=0.2]{../pics/w.jpeg}
\end{center}  

\noindent
The idea of the program to be implemented here is to make the Wordle game as evil as possible
by finding words that are the most difficult to guess. A word list of five-letter words is available
from 

\begin{center}
\begin{tabular}{ll}  
  \url{https://nms.kcl.ac.uk/christian.urban/wordle.txt} & (78 KByte)\\
\end{tabular}
\end{center}

\noindent
In your program you need to download this list and implement some
functions that in the end select the most difficult words (given an
input from the user).  If bandwidth is an issue for you, download the
file locally, but in the submitted version use \texttt{Source.fromURL}
instead of \texttt{Source.fromFile}.

\subsection*{Tasks}

\begin{itemize}
\item[(1)] Implement the function \pcode{get_wordle_list} which takes an
  URL-string as argument and requests the corresponding file. The function should
  return the word list appropriately broken up into lines.
  The result should be a list of strings (the lines in the file). In case
  the url does not produce a file, return the empty list.
  \mbox{}\hfill [0.5 Marks]

\item[(2)] Implement a polymorphic function \pcode{removeN}, which removes $n$ occurrences of an
  element from a list (if this element is less than $n$ times present, then remove all occurrences).
  For example

\begin{lstlisting}[numbers=none]
removeN(List(1,2,3,2,1), 3, 2)  => List(1, 2, 2, 1)
removeN(List(1,2,3,2,1), 2, 1)  => List(1, 3, 2, 1)
removeN(List(1,2,3,2,1), 2, 2)  => List(1, 3, 1)
removeN(List(1,2,3,2,1), 1, 1)  => List(2, 3, 2, 1)
removeN(List(1,2,3,2,1), 1, 3)  => List(2, 3, 2)
removeN(List(1,2,3,2,1), 0, 2)  => List(1, 2, 3, 2, 1)
\end{lstlisting}

Make sure you only remove at most $n$ occurrences of the element from the list.
This function should work for lists of integers but also lists of chars, strings etc.\\
  \mbox{}\hfill [0.5 Marks]

\item[(3)] Implement a function \pcode{score} that calculates the
  feedback for a word against a secret word using the rules of the
  Wordle game. The output of \pcode{score} should be a list of 5
  elements of type \pcode{Tip} representing three outcomes: a letter
  in the correct position, a letter that is present, but not in the
  correct position and a letter that is absent. For example given the
  secret word "chess" the score for the word "caves" is

\begin{lstlisting}[numbers=none]
List(Correct, Absent, Absent, Present, Correct)
\end{lstlisting}

  You have to be careful with multiple occurrences of letters. For example
  the secret "chess" with the guess "swiss" should produce

\begin{lstlisting}[numbers=none]
List(Absent, Absent, Absent, Correct, Correct)
\end{lstlisting}

even though the first 's' in "swiss" is present in the secret word, the 's' are already
`used up' by the two letters that are correct. To implement this you need to
implement first a function \pcode{pool} which calculates all the letters in
a secret that are not correct in a word. For example

\begin{lstlisting}[numbers=none]
  pool("chess", "caves")  => List(h, e, s)
  pool("chess", "swiss")  => List(c, h, e)
\end{lstlisting}

  Now the helper function \pcode{aux} can analyse the arguments secret and word recursively letter-by-letter and
  decide: if the letters are the same, then return \pcode{Correct} for the corresponding position.
  If they are not the same, but the letter is in the pool, then return \pcode{Present} and also remove
  this letter from the pool in the next recursive call of \pcode{aux}. Otherwise return \pcode{Absent} for the
  corresponding position. The function \pcode{score} is a wrapper for the function \pcode{aux}
  calling \pcode{aux} with the appropriate arguments (recall what is calculated with \pcode{pool}).\mbox{}\hfill [2 Marks]

\item[(4)] Implement a function \pcode{eval} that gives an integer value to each of the
  \pcode{Tip}s such that

  \begin{center}
  \begin{tabular}{lcl}  
    \textit{eval (Correct)} & $\dn$ & $10$\\
    \textit{eval (Present)} & $\dn$ & $1$\\
    \textit{eval (Absent)} & $\dn$ & $0$
  \end{tabular}                                   
  \end{center}  

  The function \pcode{iscore} then takes an output of \pcode{score} and sums
  up all corresponding values. For example for 

\begin{lstlisting}[numbers=none]
  iscore("chess", "caves")  => 21
  iscore("chess", "swiss")  => 20
\end{lstlisting}
  \mbox{}\hfill [0.5 Marks]

\item[(5)] The function \pcode{evil} takes a list of secrets (the list from Task 1)
  and a word as arguments, and calculates the list of words with the lowest
  score (remember we want to make the Wordle game as difficult as possible---therefore
  when the user gives us a word, we want to find the secrets that produce the lowest
  score). For this implement a helper function \pcode{lowest} that goes through
  the secrets one-by-one and calculates the score. The argument \pcode{current} is
  the score of the ``currently'' found secrets. When the function \pcode{lowest}
  is called for the first time then this will be set to the maximum integer value
  \pcode{Int.MaxValue}. The accumulator will be first empty. If a secret is found
  with the same score as \pcode{current} then this word is added to the accumulator.
  If the secret has a lower score, then the accumulator will be discarded and this
  secret will be the new accumulator. If the secret has a higher score, then it can be
  ignored. For example \pcode{evil} (the wrapper for \pcode{lowest}) generates

\begin{lstlisting}[numbers=none]
evil(secrets, "stent").length => 1907
evil(secrets, "hexes").length => 2966
evil(secrets, "horse").length => 1203
evil(secrets, "hoise").length => 971
evil(secrets, "house").length => 1228
\end{lstlisting}

where \pcode{secrets} is the list generated under Task 1.
In all cases above the iscore of the resulting secrets is 0, but this does not need to be the case
in general.

\color{red}
  Note that the template gives as type for \texttt{evil}:

  \begin{center}
  \texttt{def evil(secrets: List[String], word: String) = ???}  
  \end{center}

  where the return type is left unspecified. This return type is not needed when
  functions are not recursive---\texttt{evil} is meant to be just a wrapper that
  calls \texttt{lowest} with appropriate default arguments and returns whatever
  \texttt{lowest} returns. Therefore a return type is not needed. But a slightly
  more accurate template definition for \texttt{evil} is:\medskip\medskip

  \begin{minipage}{1.05\textwidth}
  \begin{center}
  \texttt{def evil(secrets: List[String], word: String) : List[String] = ???}  
  \end{center}
  \end{minipage}\medskip\medskip

  where also the return type is explicitly given.\\\color{black}
  \mbox{}\hfill [1.5 Marks]

\item[(6)] The secrets generated in Task 5 are the ones with the lowest score
  with respect to the word. You can think of these as the secrets that are furthest ``away'' from the
  given word. This is already quite evil for a secret word---remember we can choose
  a secret \emph{after} a user has given a first word. Now we want to make it
  even more evil by choosing words that have the most obscure letters. For this we
  calculate the frequency of how many times certain letters occur in our secrets
  list (see Task 1). The \emph{frequency} of the letter $c$, say, is given by the formula

  \begin{center}
  $\textit{freq(c)} \dn 1 - \frac{\textit{number of occurrences of c}}{\textit{number of all letters}}$  
  \end{center}  

  That means that letters that occur fewer times in our secrets have a higher frequency.
  For example the letter 'y' has the frequency 0.9680234350909651 while the much more
  often occurring letter 'e' has only 0.897286463151403 (all calculations should be done
  with Doubles).

  The function \pcode{frequencies} should calculate the frequencies for all lower-case letters
  by generating a Map from letters (\pcode{Char}) to Doubles (frequencies).\\ 
  \mbox{}\hfill [1 Mark]

\item[(7)] In this task we want to use the output of \pcode{evil}, rank each string in the
  generated set and then filter out the strings that are ranked highest (the ones with the most obscure letters).
  This list of strings often contains only a single word, but in general there might be more (see below).
  First implement a function \pcode{rank} that takes a frequency map (from 6) and a string as arguments and
  generates a rank by summing up all frequencies of the letters in the string. For example

\begin{lstlisting}[numbers=none]
rank(frequencies(secrets), "adobe") => 4.673604687018193
rank(frequencies(secrets), "gaffe") => 4.745205057045945
rank(frequencies(secrets), "fuzzy") => 4.898735738513722
\end{lstlisting}

  \color{red}
  The return type for \texttt{rank} is \texttt{Double}:

  \begin{center}
  \texttt{def rank(frqs: Map[Char, Double], s: String) : Double = ???}  
  \end{center}
  \color{black}

  Finally, implement a function \pcode{ranked_evil} that selects from the output of \pcode{evil}
  the string(s) which are highest ranked in evilness.

  
\begin{lstlisting}[numbers=none]
ranked_evil(secrets, "abbey") => List(whizz)
ranked_evil(secrets, "afear") => List(buzzy)
ranked_evil(secrets, "zincy") => List(jugum)
ranked_evil(secrets, "zippy") => List(chuff)
\end{lstlisting}

This means if the user types in "abbey" then the most evil word to choose as secret is ``whizz'' (according to
our calculations). This word has a zero \pcode{iscore} and the most obscure letters.

\color{red}
  The return type for \texttt{ranked\_evil} is \texttt{List[String]}:

  \begin{center}
  \texttt{def ranked\_evil(secrets: List[String], word: String) : List[String] = ???}  
  \end{center}
  \color{black}

%
%\color{red}
%\section*{Correction with \texttt{ranked\_evil}}
%
%The testcases above are actually not the maximum, but the minimum! I will make sure
%that the task will count as solved when either the minimum (as above) or the maximum (as intended)
%is used. The correct solutions for the above testcases using the maximum are:
%\color{black}
%
%\begin{lstlisting}[numbers=none]
%ranked_evil(secrets, "beats") => List(fuzzy)
%ranked_evil(secrets, "vitae") => List(fuzzy)
%ranked_evil(secrets, "bento") => List(fuzzy)
%ranked_evil(secrets, "belts") => List(fuzzy)
%\end{lstlisting}
%
%\noindent \textcolor{red}{Some further testcases for the maximum are:}
%
%\begin{lstlisting}[numbers=none]
%ranked_evil(secrets, "abbey") => List(whizz)
%ranked_evil(secrets, "afear") => List(buzzy)
%ranked_evil(secrets, "zincy") => List(jugum)
%ranked_evil(secrets, "zippy") => List(chuff)
%\end{lstlisting}
% 
%

\mbox{}\hfill [1 Mark]  
\end{itemize}

\end{document} 

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End:
author	Christian Urban <christian.urban@kcl.ac.uk>
	Fri, 23 Dec 2022 16:52:34 +0000
changeset 456	d076cb2e0b75
parent 445	b73e7ce91c10
child 458	d9f8245d0861
permissions	-rw-r--r--