cws/core_cw01.tex
author Christian Urban <christian.urban@kcl.ac.uk>
Fri, 26 Apr 2024 17:37:56 +0100
changeset 488 5deaf53c2faa
parent 474 b528d1d3d3c3
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
276
52faee6d0be2 updated
Christian Urban <urbanc@in.tum.de>
parents: 266
diff changeset
     1
% !TEX program = xelatex
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     2
\documentclass{article}
426
b51467741af2 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 411
diff changeset
     3
\usepackage{../styles/style}
195
fc3ac7b70a06 updated
Christian Urban <urbanc@in.tum.de>
parents: 192
diff changeset
     4
\usepackage{disclaimer}
426
b51467741af2 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 411
diff changeset
     5
\usepackage{../styles/langs}
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     6
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
     7
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
     8
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     9
\begin{document}
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    10
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
    11
\section*{Core Part 1 (Scala, 3 Marks)}
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    12
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    13
\mbox{}\hfill\textit{``The most effective debugging tool is still careful thought,}\\
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    14
\mbox{}\hfill\textit{coupled with judiciously placed print statements.''}\smallskip\\
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    15
\mbox{}\hfill\textit{ --- Brian W. Kernighan, in Unix for Beginners (1979)}\bigskip
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    16
279
14bf4e478534 updated
Christian Urban <urbanc@in.tum.de>
parents: 276
diff changeset
    17
 
411
f55742af79ef updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 396
diff changeset
    18
\IMPORTANTNONE{}
195
fc3ac7b70a06 updated
Christian Urban <urbanc@in.tum.de>
parents: 192
diff changeset
    19
127
b4def82f3f9f updated
Christian Urban <urbanc@in.tum.de>
parents: 125
diff changeset
    20
\noindent
195
fc3ac7b70a06 updated
Christian Urban <urbanc@in.tum.de>
parents: 192
diff changeset
    21
Also note that the running time of each part will be restricted to a
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    22
maximum of 30 seconds on my laptop.
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
    23
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
    24
\DISCLAIMER{}
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    25
201
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    26
\subsection*{Reference Implementation}
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    27
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    28
Like the C++ assignments, the Scala assignments will work like this: you
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    29
push your files to GitHub and receive (after sometimes a long delay) some
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    30
automated feedback. In the end we take a snapshot of the submitted files and
266
ca48ac1d3c3e updated to 2.13
Christian Urban <urbanc@in.tum.de>
parents: 252
diff changeset
    31
apply an automated marking script to them.\medskip
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    32
266
ca48ac1d3c3e updated to 2.13
Christian Urban <urbanc@in.tum.de>
parents: 252
diff changeset
    33
\noindent
306
Christian Urban <urbanc@in.tum.de>
parents: 284
diff changeset
    34
In addition, the Scala coursework comes with a reference implementation
474
b528d1d3d3c3 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 471
diff changeset
    35
in form of a \texttt{jar}-file. This allows you to run any test cases on
b528d1d3d3c3 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 471
diff changeset
    36
your own computer. For example you can call \texttt{scala-cli} on the command line
471
135bf034ac30 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 429
diff changeset
    37
with the option \texttt{--extra-jars collatz.jar} and then query any function
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    38
from the template file. Say you want to find out what the functions
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    39
\texttt{collatz} and \texttt{collatz\_max} produce: for this you just
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
    40
need to prefix them with the object name \texttt{C1}. If you want to
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    41
find out what these functions produce for the argument \texttt{6}, you
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    42
would type something like:
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    43
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    44
\begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small]
471
135bf034ac30 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 429
diff changeset
    45
$ scala-cli --extra-jars collatz.jar
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    46
  
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
    47
scala> C1.collatz(6)
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    48
...
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
    49
scala> C1.collatz_max(6)
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    50
...
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    51
\end{lstlisting}%$
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    52
201
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    53
\subsection*{Hints}
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    54
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    55
\noindent
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
    56
\textbf{For the Core Part 1:} useful math operators: \texttt{\%} for modulo, \texttt{\&} for bit-wise and; useful
201
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    57
functions: \mbox{\texttt{(1\,to\,10)}} for ranges, \texttt{.toInt},
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    58
\texttt{.toList} for conversions, you can use \texttt{List(...).max} for the
201
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    59
maximum of a list, \texttt{List(...).indexOf(...)} for the first index of
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    60
a value in a list.\bigskip
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    61
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    62
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    63
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
    64
\newpage
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
    65
\subsection*{Core Part 1 (3 Marks, file collatz.scala)}
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    66
336
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
    67
This part is about function definitions and recursion. You are asked
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
    68
to implement a Scala program that tests examples of the
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
    69
\emph{$3n + 1$-conjecture}, also called \emph{Collatz
474
b528d1d3d3c3 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 471
diff changeset
    70
  conjecture}.\video{https://www.youtube.com/watch?v=LqKpkdRRLZw}
336
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
    71
This conjecture can be described as follows: Start with any positive
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
    72
number $n$ greater than $0$:
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    73
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    74
\begin{itemize}
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    75
\item If $n$ is even, divide it by $2$ to obtain $n / 2$.
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    76
\item If $n$ is odd, multiply it by $3$ and add $1$ to obtain $3n +
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    77
  1$.
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    78
\item Repeat this process and you will always end up with $1$.
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    79
\end{itemize}
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    80
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
    81
\noindent
266
ca48ac1d3c3e updated to 2.13
Christian Urban <urbanc@in.tum.de>
parents: 252
diff changeset
    82
For example if you start with, say, $6$ and $9$, you obtain the
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    83
two \emph{Collatz series}
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    84
%
11
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    85
\[
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    86
\begin{array}{@{}l@{\hspace{5mm}}l@{}}
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    87
6, 3, 10, 5, 16, 8, 4, 2, 1 & \text{(= 8 steps)}\\
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
    88
9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1  & \text{(= 19 steps)}\\
11
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    89
\end{array}
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    90
\]
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    91
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    92
\noindent
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
    93
As you can see, the numbers go up and down like a roller-coaster, but
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    94
curiously they seem to always terminate in $1$. Nobody knows why. The
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    95
conjecture is that this will \emph{always} happen for every number
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    96
greater than 0.\footnote{While it is relatively easy to test this
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    97
conjecture with particular numbers, it is an interesting open problem to
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    98
\emph{prove} that the conjecture is true for \emph{all} numbers ($> 0$).
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
    99
Paul Erd\"o{}s, a famous mathematician you might have heard about, said
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   100
about this conjecture: ``Mathematics may not [yet] be ready for such
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   101
problems.'' and also offered a \$500 cash prize for its solution.
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   102
Jeffrey Lagarias, another mathematician, claimed that based only on
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   103
known information about this problem, ``this is an extraordinarily
429
126d0e47ac85 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 426
diff changeset
   104
difficult problem, completely out of reach of present-day mathematics.''
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   105
There is also a \href{https://xkcd.com/710/}{xkcd} cartoon about this
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   106
conjecture\here{https://xkcd.com/710/}). If you are able to solve this
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   107
conjecture, you will definitely get famous.}\bigskip
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   108
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   109
\noindent
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   110
\textbf{Tasks}
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   111
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   112
\begin{itemize}
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   113
\item[(1)] You are asked to implement a recursive function that
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   114
  calculates the number of steps needed until a series ends
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   115
  with $1$. In case of starting with $6$, it takes $8$ steps and in
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   116
  case of starting with $9$, it takes $19$ (see above). We assume it 
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   117
  takes $0$ steps, if we start with $1$. In order to
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   118
  try out this function with large numbers, you should use
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   119
  \texttt{Long} as argument type, instead of \texttt{Int}.  You can
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   120
  assume this function will be called with numbers between $1$ and
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   121
  $1$ Million. \hfill[1 Mark]
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   122
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   123
\item[(2)] Write a second function that takes an upper bound as
282
ec9773fe1dc0 updated
Christian Urban <urbanc@in.tum.de>
parents: 279
diff changeset
   124
  an argument and calculates the steps for all numbers in the range from
210
63a1376cbebd updated
Christian Urban <urbanc@in.tum.de>
parents: 202
diff changeset
   125
  1 up to this bound (the bound including). It returns the maximum number of
63a1376cbebd updated
Christian Urban <urbanc@in.tum.de>
parents: 202
diff changeset
   126
  steps and the corresponding number that needs that many steps.  More
63a1376cbebd updated
Christian Urban <urbanc@in.tum.de>
parents: 202
diff changeset
   127
  precisely it returns a pair where the first component is the number
63a1376cbebd updated
Christian Urban <urbanc@in.tum.de>
parents: 202
diff changeset
   128
  of steps and the second is the corresponding number. \hfill\mbox{[1
63a1376cbebd updated
Christian Urban <urbanc@in.tum.de>
parents: 202
diff changeset
   129
    Mark]}
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   130
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   131
\item[(3)] Write a function that calculates \emph{hard
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   132
    numbers} \here{https://medium.com/cantors-paradise/the-collatz-conjecture-some-shocking-results-from-180-000-iterations-7fea130d0377}
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   133
  in the Collatz series---these are the last odd numbers just before a
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   134
  power of two is reached.  For this, implement an
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   135
  \textit{is-power-of-two} function which tests whether a number is a
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   136
  power of two. The easiest way to implement this is by using the
336
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
   137
  bit-operator $\&$ of Scala. For a power of two, say $n$ with $n > 0$, it
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   138
  holds that $n \;\&\; (n - 1)$ is equal to zero. I let you think why
344
a2ac7e3fa330 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 343
diff changeset
   139
  this is the case.
a2ac7e3fa330 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 343
diff changeset
   140
a2ac7e3fa330 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 343
diff changeset
   141
  The function \textit{is-hard} calculates whether
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   142
  $3n + 1$ is a power of two.  Finally the \textit{last-odd} function
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   143
  calculates the last odd number before a power of 2 in the Collatz
344
a2ac7e3fa330 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 343
diff changeset
   144
  series. This means for example when starting with 9,
335
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   145
  we receive 5 as the last odd number.  Surprisingly a lot of numbers
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   146
  have 5 as last-odd number. But for example for 113 we obtain 85,
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   147
  because of the series
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   148
  %
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   149
  \[113, 340, 170, \,\fbox{85}\,, 256, 128, 64, 32, 16, 8, 4, 2, 1\]
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   150
7e00d2b13b04 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 315
diff changeset
   151
  The \textit{last-odd} function will only be called with numbers that are not
396
3ffe978a5664 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 356
diff changeset
   152
  powers of 2 themselves.\hfill\mbox{[1 Mark]}
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   153
\end{itemize}
11
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
   154
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
   155
\noindent
429
126d0e47ac85 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 426
diff changeset
   156
\textbf{Test Data:} Some test cases are:
11
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
   157
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
   158
\begin{itemize}
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   159
\item 1 to 10 where $9$ takes 19 steps 
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   160
\item 1 to 100 where $97$ takes 118 steps,
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   161
\item 1 to 1,000 where $871$ takes 178 steps,
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   162
\item 1 to 10,000 where $6,171$ takes 261 steps,
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   163
\item 1 to 100,000 where $77,031$ takes 350 steps, 
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   164
\item 1 to 1 Million where $837,799$ takes 524 steps
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   165
  %% runs out of stack space
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   166
  %% \item[$\bullet$] $1 - 10$ million where $8,400,511$ takes 685 steps
336
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
   167
\item 21 is the last odd number for 84
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
   168
\item 341 is the last odd number for 201, 604, 605 and 8600
25d9c3b2bc99 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 335
diff changeset
   169
  
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   170
\end{itemize}
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   171
  
201
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
   172
127
b4def82f3f9f updated
Christian Urban <urbanc@in.tum.de>
parents: 125
diff changeset
   173
b4def82f3f9f updated
Christian Urban <urbanc@in.tum.de>
parents: 125
diff changeset
   174
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   175
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   176
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   177
\end{document}
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   178
276
52faee6d0be2 updated
Christian Urban <urbanc@in.tum.de>
parents: 266
diff changeset
   179
52faee6d0be2 updated
Christian Urban <urbanc@in.tum.de>
parents: 266
diff changeset
   180
%%%%%%% Historical Stuff
199
54befaf23648 updated
Christian Urban <urbanc@in.tum.de>
parents: 197
diff changeset
   181
\newpage
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   182
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   183
This part is about web-scraping and list-processing in Scala. It uses
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   184
online data about the per-capita alcohol consumption for each country
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   185
(per year?), and a file containing the data about the population size of
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   186
each country.  From this data you are supposed to estimate how many
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   187
litres of pure alcohol are consumed worldwide.\bigskip
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   188
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   189
\noindent
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   190
\textbf{Tasks (file alcohol.scala):}
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   191
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   192
\begin{itemize}
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   193
\item[(1)] Write a function that given an URL requests a
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   194
  comma-separated value (CSV) list.  We are interested in the list
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   195
  from the following URL
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   196
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   197
\begin{center}
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   198
  \url{https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv}
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   199
\end{center}
127
b4def82f3f9f updated
Christian Urban <urbanc@in.tum.de>
parents: 125
diff changeset
   200
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   201
\noindent Your function should take a string (the URL) as input, and
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   202
produce a list of strings as output, where each string is one line in
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   203
the corresponding CSV-list.  This list from the URL above should
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   204
contain 194 lines.\medskip
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   205
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   206
\noindent
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   207
Write another function that can read the file \texttt{population.csv}
201
018b9c12ee1f updated
Christian Urban <urbanc@in.tum.de>
parents: 199
diff changeset
   208
from disk (the file is distributed with the assignment). This
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   209
function should take a string as argument, the file name, and again
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   210
return a list of strings corresponding to each entry in the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   211
CSV-list. For \texttt{population.csv}, this list should contain 216
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   212
lines.\hfill[1 Mark]
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   213
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   214
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   215
\item[(2)] Unfortunately, the CSV-lists contain a lot of ``junk'' and we
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   216
  need to extract the data that interests us.  From the header of the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   217
  alcohol list, you can see there are 5 columns
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   218
  
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   219
  \begin{center}
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   220
    \begin{tabular}{l}
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   221
      \texttt{country (name),}\\
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   222
      \texttt{beer\_servings,}\\
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   223
      \texttt{spirit\_servings,}\\
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   224
      \texttt{wine\_servings,}\\
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   225
      \texttt{total\_litres\_of\_pure\_alcohol}
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   226
    \end{tabular}  
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   227
  \end{center}
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   228
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   229
  \noindent
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   230
  Write a function that extracts the data from the first column,
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   231
  the country name, and the data from the fifth column (converted into
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   232
  a \texttt{Double}). For this go through each line of the CSV-list
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   233
  (except the first line), use the \texttt{split(",")} function to
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   234
  divide each line into an array of 5 elements. Keep the data from the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   235
  first and fifth element in these arrays.\medskip
192
a112e0e2325c updated
Christian Urban <urbanc@in.tum.de>
parents: 167
diff changeset
   236
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   237
  \noindent
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   238
  Write another function that processes the population size list. This
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   239
  is already of the form country name and population size.\footnote{Your
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   240
    friendly lecturer already did the messy processing for you from the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   241
  Worldbank database, see \url{https://github.com/datasets/population/tree/master/data} for the original.} Again, split the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   242
  strings according to the commas. However, this time generate a
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   243
  \texttt{Map} from country names to population sizes.\hfill[1 Mark]
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   244
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   245
\item[(3)] In (2) you generated the data about the alcohol consumption
343
c8fcc0e0a57f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 336
diff changeset
   246
  per-capita for each country, and also the population size for each
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   247
  country. From this generate next a sorted(!) list of the overall
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   248
  alcohol consumption for each country. The list should be sorted from
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   249
  highest alcohol consumption to lowest. The difficulty is that the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   250
  data is scraped off from ``random'' sources on the Internet and
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   251
  annoyingly the spelling of some country names does not always agree in both
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   252
  lists. For example the alcohol list contains
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   253
  \texttt{Bosnia-Herzegovina}, while the population writes this country as
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   254
  \texttt{Bosnia and Herzegovina}. In your sorted
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   255
  overall list include only countries from the alcohol list, whose
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   256
  exact country name is also in the population size list. This means
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   257
  you can ignore countries like Bosnia-Herzegovina from the overall
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   258
  alcohol consumption. There are 177 countries where the names
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   259
  agree. The UK is ranked 10th on this list by
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   260
  consuming 671,976,864 Litres of pure alcohol each year.\medskip
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   261
  
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   262
  \noindent
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   263
  Finally, write another function that takes an integer, say
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   264
  \texttt{n}, as argument. You can assume this integer is between 0
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   265
  and 177 (the number of countries in the sorted list above).  The
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   266
  function should return a triple, where the first component is the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   267
  sum of the alcohol consumption in all countries (on the list); the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   268
  second component is the sum of the \texttt{n}-highest alcohol
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   269
  consumers on the list; and the third component is the percentage the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   270
  \texttt{n}-highest alcohol consumers drink with respect to the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   271
  the world consumption. You will see that according to our data, 164
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   272
  countries (out of 177) gobble up 100\% of the World alcohol
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   273
  consumption.\hfill\mbox{[1 Mark]}
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   274
\end{itemize}
11
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
   275
417869f65585 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 9
diff changeset
   276
\noindent
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   277
\textbf{Hints:} useful list functions: \texttt{.drop(n)},
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   278
\texttt{.take(n)} for dropping or taking some elements in a list,
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   279
\texttt{.getLines} for separating lines in a string;
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   280
\texttt{.sortBy(\_.\_2)} sorts a list of pairs according to the second
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   281
elements in the pairs---the sorting is done from smallest to highest;
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   282
useful \texttt{Map} functions: \texttt{.toMap} converts a list of
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   283
pairs into a \texttt{Map}, \texttt{.isDefinedAt(k)} tests whether the
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   284
map is defined at that key, that is would produce a result when
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   285
called with this key; useful data functions: \texttt{Source.fromURL},
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   286
\texttt{Source.fromFile} for obtaining a webpage and reading a file.
127
b4def82f3f9f updated
Christian Urban <urbanc@in.tum.de>
parents: 125
diff changeset
   287
196
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   288
\newpage
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   289
c50b074b3047 updated
Christian Urban <urbanc@in.tum.de>
parents: 195
diff changeset
   290
18
87e55eb309ed updated
Christian Urban <urbanc@in.tum.de>
parents: 11
diff changeset
   291
129
b1a51285de7e updated
Christian Urban <urbanc@in.tum.de>
parents: 127
diff changeset
   292
135
077e63e96287 updated
Christian Urban <urbanc@in.tum.de>
parents: 129
diff changeset
   293
6
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   294
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   295
%%% Local Variables: 
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   296
%%% mode: latex
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   297
%%% TeX-master: t
aae256985251 updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   298
%%% End: