cws/pre_cw01.tex
changeset 344 a2ac7e3fa330
parent 343 c8fcc0e0a57f
child 345 40657f9a4e4a
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/cws/pre_cw01.tex	Sat Oct 31 16:23:12 2020 +0000
@@ -0,0 +1,298 @@
+% !TEX program = xelatex
+\documentclass{article}
+\usepackage{../style}
+\usepackage{disclaimer}
+\usepackage{../langs}
+
+
+
+\begin{document}
+
+\section*{Preliminary Part 6 (Scala, 3 Marks)}
+
+\mbox{}\hfill\textit{``The most effective debugging tool is still careful thought,}\\
+\mbox{}\hfill\textit{coupled with judiciously placed print statements.''}\smallskip\\
+\mbox{}\hfill\textit{ --- Brian W. Kernighan, in Unix for Beginners (1979)}\bigskip
+
+ 
+\IMPORTANT{This part is about Scala. It is due on \cwSIX{} at 5pm and worth 3\%.}
+
+\noindent
+Also note that the running time of each part will be restricted to a
+maximum of 30 seconds on my laptop.
+
+\DISCLAIMER{}
+
+\subsection*{Reference Implementation}
+
+Like the C++ assignments, the Scala assignments will work like this: you
+push your files to GitHub and receive (after sometimes a long delay) some
+automated feedback. In the end we take a snapshot of the submitted files and
+apply an automated marking script to them.\medskip
+
+\noindent
+In addition, the Scala coursework comes with a reference implementation
+in form of \texttt{jar}-files. This allows you to run any test cases on
+your own computer. For example you can call Scala on the command line
+with the option \texttt{-cp collatz.jar} and then query any function
+from the template file. Say you want to find out what the functions
+\texttt{collatz} and \texttt{collatz\_max} produce: for this you just
+need to prefix them with the object name \texttt{CW6a}. If you want to
+find out what these functions produce for the argument \texttt{6}, you
+would type something like:
+
+\begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small]
+$ scala -cp collatz.jar
+  
+scala> CW6a.collatz(6)
+...
+scala> CW6a.collatz_max(6)
+...
+\end{lstlisting}%$
+
+\subsection*{Hints}
+
+\noindent
+\textbf{For Preliminary Part:} useful math operators: \texttt{\%} for modulo, \texttt{\&} for bit-wise and; useful
+functions: \mbox{\texttt{(1\,to\,10)}} for ranges, \texttt{.toInt},
+\texttt{.toList} for conversions, you can use \texttt{List(...).max} for the
+maximum of a list, \texttt{List(...).indexOf(...)} for the first index of
+a value in a list.\bigskip
+
+
+
+\newpage
+\subsection*{Preliminary Part (3 Marks, file collatz.scala)}
+
+This part is about function definitions and recursion. You are asked
+to implement a Scala program that tests examples of the
+\emph{$3n + 1$-conjecture}, also called \emph{Collatz
+  conjecture}.\video{https://www.youtube.com./watch?v=LqKpkdRRLZw}
+This conjecture can be described as follows: Start with any positive
+number $n$ greater than $0$:
+
+\begin{itemize}
+\item If $n$ is even, divide it by $2$ to obtain $n / 2$.
+\item If $n$ is odd, multiply it by $3$ and add $1$ to obtain $3n +
+  1$.
+\item Repeat this process and you will always end up with $1$.
+\end{itemize}
+
+\noindent
+For example if you start with, say, $6$ and $9$, you obtain the
+two \emph{Collatz series}
+%
+\[
+\begin{array}{@{}l@{\hspace{5mm}}l@{}}
+6, 3, 10, 5, 16, 8, 4, 2, 1 & \text{(= 8 steps)}\\
+9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1  & \text{(= 19 steps)}\\
+\end{array}
+\]
+
+\noindent
+As you can see, the numbers go up and down like a roller-coaster, but
+curiously they seem to always terminate in $1$. Nobody knows why. The
+conjecture is that this will \emph{always} happen for every number
+greater than 0.\footnote{While it is relatively easy to test this
+conjecture with particular numbers, it is an interesting open problem to
+\emph{prove} that the conjecture is true for \emph{all} numbers ($> 0$).
+Paul Erd\"o{}s, a famous mathematician you might have heard about, said
+about this conjecture: ``Mathematics may not [yet] be ready for such
+problems.'' and also offered a \$500 cash prize for its solution.
+Jeffrey Lagarias, another mathematician, claimed that based only on
+known information about this problem, ``this is an extraordinarily
+difficult problem, completely out of reach of present day mathematics.''
+There is also a \href{https://xkcd.com/710/}{xkcd} cartoon about this
+conjecture\here{https://xkcd.com/710/}). If you are able to solve this
+conjecture, you will definitely get famous.}\bigskip
+
+\noindent
+\textbf{Tasks}
+
+\begin{itemize}
+\item[(1)] You are asked to implement a recursive function that
+  calculates the number of steps needed until a series ends
+  with $1$. In case of starting with $6$, it takes $8$ steps and in
+  case of starting with $9$, it takes $19$ (see above). We assume it 
+  takes $0$ steps, if we start with $1$. In order to
+  try out this function with large numbers, you should use
+  \texttt{Long} as argument type, instead of \texttt{Int}.  You can
+  assume this function will be called with numbers between $1$ and
+  $1$ Million. \hfill[1 Mark]
+
+\item[(2)] Write a second function that takes an upper bound as
+  an argument and calculates the steps for all numbers in the range from
+  1 up to this bound (the bound including). It returns the maximum number of
+  steps and the corresponding number that needs that many steps.  More
+  precisely it returns a pair where the first component is the number
+  of steps and the second is the corresponding number. \hfill\mbox{[1
+    Mark]}
+
+\item[(3)] Write a function that calculates \emph{hard
+    numbers} \here{https://medium.com/cantors-paradise/the-collatz-conjecture-some-shocking-results-from-180-000-iterations-7fea130d0377}
+  in the Collatz series---these are the last odd numbers just before a
+  power of two is reached.  For this, implement an
+  \textit{is-power-of-two} function which tests whether a number is a
+  power of two. The easiest way to implement this is by using the
+  bit-operator $\&$ of Scala. For a power of two, say $n$ with $n > 0$, it
+  holds that $n \;\&\; (n - 1)$ is equal to zero. I let you think why
+  this is the case.
+
+  The function \textit{is-hard} calculates whether
+  $3n + 1$ is a power of two.  Finally the \textit{last-odd} function
+  calculates the last odd number before a power of 2 in the Collatz
+  series. This means for example when starting with 9,
+  we receive 5 as the last odd number.  Surprisingly a lot of numbers
+  have 5 as last-odd number. But for example for 113 we obtain 85,
+  because of the series
+  %
+  \[113, 340, 170, \,\fbox{85}\,, 256, 128, 64, 32, 16, 8, 4, 2, 1\]
+
+  The \textit{last-odd} function will only be called with numbers that are not
+  powers of 2 themselves.
+\end{itemize}
+
+\noindent
+\textbf{Test Data:} Some test ranges and cases are:
+
+\begin{itemize}
+\item 1 to 10 where $9$ takes 19 steps 
+\item 1 to 100 where $97$ takes 118 steps,
+\item 1 to 1,000 where $871$ takes 178 steps,
+\item 1 to 10,000 where $6,171$ takes 261 steps,
+\item 1 to 100,000 where $77,031$ takes 350 steps, 
+\item 1 to 1 Million where $837,799$ takes 524 steps
+  %% runs out of stack space
+  %% \item[$\bullet$] $1 - 10$ million where $8,400,511$ takes 685 steps
+\item 21 is the last odd number for 84
+\item 341 is the last odd number for 201, 604, 605 and 8600
+  
+\end{itemize}
+  
+
+
+
+
+
+\end{document}
+
+
+%%%%%%% Historical Stuff
+\newpage
+
+This part is about web-scraping and list-processing in Scala. It uses
+online data about the per-capita alcohol consumption for each country
+(per year?), and a file containing the data about the population size of
+each country.  From this data you are supposed to estimate how many
+litres of pure alcohol are consumed worldwide.\bigskip
+
+\noindent
+\textbf{Tasks (file alcohol.scala):}
+
+\begin{itemize}
+\item[(1)] Write a function that given an URL requests a
+  comma-separated value (CSV) list.  We are interested in the list
+  from the following URL
+
+\begin{center}
+  \url{https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv}
+\end{center}
+
+\noindent Your function should take a string (the URL) as input, and
+produce a list of strings as output, where each string is one line in
+the corresponding CSV-list.  This list from the URL above should
+contain 194 lines.\medskip
+
+\noindent
+Write another function that can read the file \texttt{population.csv}
+from disk (the file is distributed with the assignment). This
+function should take a string as argument, the file name, and again
+return a list of strings corresponding to each entry in the
+CSV-list. For \texttt{population.csv}, this list should contain 216
+lines.\hfill[1 Mark]
+
+
+\item[(2)] Unfortunately, the CSV-lists contain a lot of ``junk'' and we
+  need to extract the data that interests us.  From the header of the
+  alcohol list, you can see there are 5 columns
+  
+  \begin{center}
+    \begin{tabular}{l}
+      \texttt{country (name),}\\
+      \texttt{beer\_servings,}\\
+      \texttt{spirit\_servings,}\\
+      \texttt{wine\_servings,}\\
+      \texttt{total\_litres\_of\_pure\_alcohol}
+    \end{tabular}  
+  \end{center}
+
+  \noindent
+  Write a function that extracts the data from the first column,
+  the country name, and the data from the fifth column (converted into
+  a \texttt{Double}). For this go through each line of the CSV-list
+  (except the first line), use the \texttt{split(",")} function to
+  divide each line into an array of 5 elements. Keep the data from the
+  first and fifth element in these arrays.\medskip
+
+  \noindent
+  Write another function that processes the population size list. This
+  is already of the form country name and population size.\footnote{Your
+    friendly lecturer already did the messy processing for you from the
+  Worldbank database, see \url{https://github.com/datasets/population/tree/master/data} for the original.} Again, split the
+  strings according to the commas. However, this time generate a
+  \texttt{Map} from country names to population sizes.\hfill[1 Mark]
+
+\item[(3)] In (2) you generated the data about the alcohol consumption
+  per-capita for each country, and also the population size for each
+  country. From this generate next a sorted(!) list of the overall
+  alcohol consumption for each country. The list should be sorted from
+  highest alcohol consumption to lowest. The difficulty is that the
+  data is scraped off from ``random'' sources on the Internet and
+  annoyingly the spelling of some country names does not always agree in both
+  lists. For example the alcohol list contains
+  \texttt{Bosnia-Herzegovina}, while the population writes this country as
+  \texttt{Bosnia and Herzegovina}. In your sorted
+  overall list include only countries from the alcohol list, whose
+  exact country name is also in the population size list. This means
+  you can ignore countries like Bosnia-Herzegovina from the overall
+  alcohol consumption. There are 177 countries where the names
+  agree. The UK is ranked 10th on this list by
+  consuming 671,976,864 Litres of pure alcohol each year.\medskip
+  
+  \noindent
+  Finally, write another function that takes an integer, say
+  \texttt{n}, as argument. You can assume this integer is between 0
+  and 177 (the number of countries in the sorted list above).  The
+  function should return a triple, where the first component is the
+  sum of the alcohol consumption in all countries (on the list); the
+  second component is the sum of the \texttt{n}-highest alcohol
+  consumers on the list; and the third component is the percentage the
+  \texttt{n}-highest alcohol consumers drink with respect to the
+  the world consumption. You will see that according to our data, 164
+  countries (out of 177) gobble up 100\% of the World alcohol
+  consumption.\hfill\mbox{[1 Mark]}
+\end{itemize}
+
+\noindent
+\textbf{Hints:} useful list functions: \texttt{.drop(n)},
+\texttt{.take(n)} for dropping or taking some elements in a list,
+\texttt{.getLines} for separating lines in a string;
+\texttt{.sortBy(\_.\_2)} sorts a list of pairs according to the second
+elements in the pairs---the sorting is done from smallest to highest;
+useful \texttt{Map} functions: \texttt{.toMap} converts a list of
+pairs into a \texttt{Map}, \texttt{.isDefinedAt(k)} tests whether the
+map is defined at that key, that is would produce a result when
+called with this key; useful data functions: \texttt{Source.fromURL},
+\texttt{Source.fromFile} for obtaining a webpage and reading a file.
+
+\newpage
+
+
+
+
+
+
+%%% Local Variables: 
+%%% mode: latex
+%%% TeX-master: t
+%%% End: