%% Author: Christian Urban <urbanc@in.tum.de>
%% Date: Mon, 05 Nov 2018 16:05:27 +0000 (changeset 196, 4973c2fb3c66)
\documentclass{article}
\usepackage{../style}
\usepackage{disclaimer}
%%\usepackage{../langs}

\begin{document}

\section*{Coursework 6 (Scala)}

This coursework is about Scala and is worth 10\%. The first and second
parts are due on 16 November at 11pm, and the third part on 21 December
at 11pm. You are asked to implement three programs about list
processing and recursion. The third part is more advanced and might
include material you have not yet seen in the first lecture.
\bigskip

\IMPORTANT{}

\noindent
Also note that the running time of each part will be restricted to a
maximum of 360 seconds on my laptop.

\DISCLAIMER{}

\subsection*{Part 1 (3 Marks)}

This part is about recursion. You are asked to implement a Scala
program that tests examples of the \emph{$3n + 1$-conjecture}, also
called \emph{Collatz conjecture}. This conjecture can be described as
follows: Start with any integer $n$ greater than $0$:

\begin{itemize}
\item If $n$ is even, divide it by $2$ to obtain $n / 2$.
\item If $n$ is odd, multiply it by $3$ and add $1$ to obtain $3n +
  1$.
\item Repeat this process and you will always end up with $1$.
\end{itemize}

\noindent
For example if you start with $6$, respectively $9$, you obtain the
series

\[
\begin{array}{@{}l@{\hspace{5mm}}l@{}}
6, 3, 10, 5, 16, 8, 4, 2, 1 & \text{(= 9 steps)}\\
9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1  & \text{(= 20 steps)}\\
\end{array}
\]

\noindent
As you can see, the numbers go up and down like a roller-coaster, but
curiously they seem to always terminate in $1$. The conjecture is that
this will \emph{always} happen for every number greater than
0.\footnote{While it is relatively easy to test this conjecture with
  particular numbers, it is an interesting open problem to
  \emph{prove} that the conjecture is true for \emph{all} numbers ($>
  0$). Paul Erd\H{o}s, a famous mathematician you might have heard
  about, said about this conjecture: ``Mathematics may not be ready
  for such problems'', and he also offered a \$500 cash prize for its
  solution. Jeffrey Lagarias, another mathematician, claimed that
  based only on known information about this problem, ``this is an
  extraordinarily difficult problem, completely out of reach of
  present day mathematics.'' There is also a
  \href{https://xkcd.com/710/}{xkcd} cartoon about this conjecture
  (click \href{https://xkcd.com/710/}{here}). If you are able to solve
  this conjecture, you will definitely become famous.}\bigskip

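
\noindent
The process described above can be sketched in a few lines of Scala.
This is only an illustration under one assumption: following the two
example series, a ``step'' is taken to be every number in the series,
including the start value and the final $1$. The names \texttt{next}
and \texttt{series} are made up for this sketch and are not part of the
coursework files.

```scala
// One Collatz step: halve even numbers, map an odd n to 3n + 1.
def next(n: Long): Long =
  if (n % 2 == 0) n / 2 else 3 * n + 1

// The series starting at n, up to and including the first 1.
// (Assumes n > 0; plain recursion is fine for the short series here.)
def series(n: Long): List[Long] =
  if (n == 1) List(1L) else n :: series(next(n))
```

\noindent
With these definitions, \texttt{series(6)} evaluates to
\texttt{List(6, 3, 10, 5, 16, 8, 4, 2, 1)}, which has $9$ elements, and
\texttt{series(9).length} is $20$, matching the step counts given above.
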

\noindent
\textbf{Tasks (file collatz.scala):}

\begin{itemize}
\item[(1)] You are asked to implement a recursive function that
  calculates the number of steps needed until a series ends
  with $1$. In case of starting with $6$, it takes $9$ steps and in
  case of starting with $9$, it takes $20$ (see above). In order to
  try out this function with large numbers, you should use
  \texttt{Long} as argument type, instead of \texttt{Int}.  You can
  assume this function will be called with numbers between $1$ and
  $1$ million. \hfill[2 Marks]

\item[(2)] Write a second function that takes an upper bound as
  argument and calculates the steps for all numbers in the range from
  1 up to this bound. It returns the maximum number of steps and the
  corresponding number that needs that many steps. More precisely,
  it returns a pair where the first
  component is the number of steps and the second is the
  corresponding number. \hfill\mbox{[1 Mark]}
\end{itemize}

\noindent
\textbf{Test Data:} Some test ranges are:

\begin{itemize}
\item 1 to 10 where $9$ takes 20 steps,
\item 1 to 100 where $97$ takes 119 steps,
\item 1 to 1,000 where $871$ takes 179 steps,
\item 1 to 10,000 where $6{,}171$ takes 262 steps,
\item 1 to 100,000 where $77{,}031$ takes 351 steps,
\item 1 to 1 million where $837{,}799$ takes 525 steps.
%%\item[$\bullet$] $1 - 10$ million where $8,400,511$ takes 686 steps
\end{itemize}

\noindent
\textbf{Hints:} useful math operators: \texttt{\%} for modulo; useful
functions: \mbox{\texttt{(1\,to\,10)}} for ranges, \texttt{.toInt},
\texttt{.toList} for conversions, \texttt{List(...).max} for the
maximum of a list, \texttt{List(...).indexOf(...)} for the first index of
a value in a list.
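
\noindent
As a rough illustration of how the hinted functions fit together, the
following sketch computes the pair for a given bound. It is one
possible approach, not the reference solution; the names
\texttt{steps} and \texttt{maxSteps} are hypothetical, and a ``step''
is again assumed to count every number in the series, including the
final $1$.

```scala
// Number of elements in the Collatz series of n (hypothetical helper;
// counts the start value and the final 1, so steps(6) == 9).
def steps(n: Long): Long =
  if (n == 1) 1
  else if (n % 2 == 0) 1 + steps(n / 2)
  else 1 + steps(3 * n + 1)

// Pair of (maximum number of steps, number needing them) for 1..bound.
def maxSteps(bound: Long): (Long, Long) = {
  val counts = (1L to bound).toList.map(steps)
  val best = counts.max
  // indexOf finds the first position of the maximum; position 0 is the number 1
  (best, counts.indexOf(best) + 1L)
}
```

\noindent
For the first two test ranges, \texttt{maxSteps(10)} gives
\texttt{(20, 9)} and \texttt{maxSteps(100)} gives \texttt{(119, 97)}.
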


\subsection*{Part 2 (3 Marks)}

This part is about web-scraping and list-processing in Scala. It uses
online data about the per-capita alcohol consumption for each country
(presumably per year), and a file containing the data about the
population size of each country. From this data you are supposed to
estimate how many litres of pure alcohol are consumed
worldwide.\bigskip

\noindent
\textbf{Tasks (file alcohol.scala):}

\begin{itemize}
\item[(1)] Write a function that, given a URL, requests a
  comma-separated value (CSV) list. We are interested in the list
  from the following URL:

  \begin{center}
    \url{https://raw.githubusercontent.com/fivethirtyeight/data/master/alcohol-consumption/drinks.csv}
  \end{center}

  \noindent Your function should take a string (the URL) as input, and
  produce a list of strings as output, where each string is one line in
  the corresponding CSV-list. The list from the URL above should
  contain 194 lines.\medskip

  \noindent
  Write another function that can read the file \texttt{population.csv}
  from disk (the file is distributed with the coursework). This
  function should take a string as argument, the file name, and again
  return a list of strings corresponding to each entry in the
  CSV-list. For \texttt{population.csv}, this list should contain 216
  lines.\hfill[1 Mark]

\item[(2)] Unfortunately, the CSV-lists contain a lot of ``junk'' and we
  need to extract the data that interests us. From the header of the
  alcohol list, you can see there are 5 columns:

  \begin{center}
    \begin{tabular}{l}
      \texttt{country (name),}\\
      \texttt{beer\_servings,}\\
      \texttt{spirit\_servings,}\\
      \texttt{wine\_servings,}\\
      \texttt{total\_litres\_of\_pure\_alcohol}
    \end{tabular}
  \end{center}

  \noindent
  Write a function that extracts the data from the first column,
  the country name, and the data from the fifth column (converted into
  a \texttt{Double}). For this, go through each line of the CSV-list
  (except the first line) and use the \texttt{split(",")} function to
  divide each line into an array of 5 elements. Keep the data from the
  first and fifth elements in these arrays.\medskip

  \noindent
  Write another function that processes the population size list. This
  is already of the form country name and population size.\footnote{Your
    friendly lecturer already did the messy processing for you from the
    World Bank database, see \url{https://github.com/datasets/population/tree/master/data} for the original.} Again, split the
  strings according to the commas. However, this time generate a
  \texttt{Map} from country names to population sizes.\hfill[1 Mark]

\item[(3)] In (2) you generated the data about the alcohol consumption
  per capita for each country, and also the population size for each
  country. From this, generate next a sorted(!) list of the overall
  alcohol consumption for each country. The list should be sorted from
  highest alcohol consumption to lowest. The difficulty is that the
  data is scraped from ``random'' sources on the Internet and,
  annoyingly, the spelling of some country names does not always agree
  between the two lists. For example the alcohol list contains
  \texttt{Bosnia-Herzegovina}, while the population list writes this
  country as \texttt{Bosnia and Herzegovina}. In your sorted
  overall list include only countries from the alcohol list whose
  exact country name is also in the population size list. This means
  you can ignore countries like Bosnia-Herzegovina in the overall
  alcohol consumption. There are 177 countries where the names
  agree. The UK is ranked 10th on this list by
  consuming 671,976,864 litres of pure alcohol each year.\medskip

  \noindent
  Finally, write another function that takes an integer, say
  \texttt{n}, as argument. You can assume this integer is between 0
  and 177 (the number of countries in the sorted list above). The
  function should return a triple, where the first component is the
  sum of the alcohol consumption in all countries (on the list); the
  second component is the sum of the \texttt{n}-highest alcohol
  consumers on the list; and the third component is the percentage the
  \texttt{n}-highest alcohol consumers drink with respect to
  the world consumption. You will see that according to our data, 164
  countries (out of 177) gobble up 100\% of the world's alcohol
  consumption.\hfill\mbox{[1 Mark]}
\end{itemize}
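
\noindent
The list processing in tasks (2) and (3) might look roughly like the
sketch below. It runs on a few invented sample lines rather than the
real data, and it makes several assumptions that are not stated in the
tasks: both lists start with a header line, population sizes are whole
numbers, and overall consumption is truncated to a \texttt{Long}. All
function names are hypothetical.

```scala
// Invented sample lines in the shape of the two CSV-lists (not real data).
val alcoholLines = List(
  "country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol",
  "Atlantis,10,20,30,2.0",
  "Bosnia-Herzegovina,76,173,8,4.6",
  "Utopia,5,5,5,1.5")

val populationLines = List(
  "country,population_size",
  "Atlantis,1000",
  "Bosnia and Herzegovina,3500000",
  "Utopia,4000")

// Country name and litres per capita (first and fifth column), header dropped.
def processAlcohol(lines: List[String]): List[(String, Double)] =
  lines.drop(1).map { line =>
    val cols = line.split(",")
    (cols(0), cols(4).toDouble)
  }

// Map from country name to population size, header dropped (assumption).
def processPopulation(lines: List[String]): Map[String, Long] =
  lines.drop(1).map { line =>
    val cols = line.split(",")
    (cols(0), cols(1).toLong)
  }.toMap

// Overall consumption per country, highest first; countries whose exact
// name is missing from the population map are skipped.
def sortedOverall(alc: List[(String, Double)],
                  pop: Map[String, Long]): List[(String, Long)] =
  alc.filter { case (name, _) => pop.isDefinedAt(name) }
     .map { case (name, litres) => (name, (litres * pop(name)).toLong) }
     .sortBy(_._2)
     .reverse

// Triple of (total, sum of the n highest, percentage of the n highest).
def percentage(n: Int, sorted: List[(String, Long)]): (Long, Long, Double) = {
  val total = sorted.map(_._2).sum
  val top = sorted.take(n).map(_._2).sum
  (total, top, top * 100.0 / total)
}
```

\noindent
On the sample lines, the mismatching spelling of Bosnia-Herzegovina is
dropped, the sorted list is
\texttt{List((Utopia,6000), (Atlantis,2000))}, and
\texttt{percentage(1, ...)} on that list gives
\texttt{(8000, 6000, 75.0)}.
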

\noindent
\textbf{Hints:} useful list functions: \texttt{.drop(n)},
\texttt{.take(n)} for dropping or taking some elements in a list,
\texttt{.getLines} for separating the lines of a source;
\texttt{.sortBy(\_.\_2)} sorts a list of pairs according to the second
elements in the pairs (the sorting is done from smallest to highest);
useful \texttt{Map} functions: \texttt{.toMap} converts a list of
pairs into a \texttt{Map}, \texttt{.isDefinedAt(k)} tests whether the
map is defined at that key, that is, whether it would produce a result when
called with this key; useful data functions: \texttt{Source.fromURL},
\texttt{Source.fromFile} for obtaining a webpage and reading a file.
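
\noindent
A small sketch of how \texttt{.getLines} turns a source into a list of
strings. To keep it runnable without network or file access, it uses
\texttt{Source.fromString} on an invented CSV string; the helper name
\texttt{linesOf} is hypothetical. A source obtained from
\texttt{Source.fromURL} or \texttt{Source.fromFile} could be passed to
the same helper.

```scala
import scala.io.Source

// Read all lines of a source into a list and close the source afterwards.
def linesOf(src: Source): List[String] =
  try src.getLines().toList finally src.close()

// Invented sample data standing in for a downloaded CSV-list.
val csv = "country,litres\nAtlantis,2.0\nUtopia,1.5"
val lines = linesOf(Source.fromString(csv))
```

\noindent
Here \texttt{lines} contains three strings, one per line of the sample
data, with the header as first element.
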


\newpage

\subsection*{Advanced Part 3 (4 Marks)}

A purely fictional character named Mr T.~Drumb inherited in 1978
approximately 200 million dollars from his father. Mr Drumb prides
himself on being a brilliant businessman because nowadays it is
estimated he is worth 3 billion dollars (one is not sure, of course,
because Mr Drumb refuses to make his tax records public).

Since the question about Mr Drumb's business acumen remains open,
let's do a quick back-of-the-envelope calculation in Scala to check
whether his claim has any merit. Let's suppose we are given \$100 in
1978 and we follow a really dumb investment strategy, namely:

\begin{itemize}
\item We blindly choose a portfolio of stocks, say some Blue-Chip stocks
  or some Real Estate stocks.
\item If some of the stocks in our portfolio are traded in January of
  a year, we invest our money in equal amounts in each of these
  stocks. For example if we have \$100 and there are four stocks that
  are traded in our portfolio, we buy \$25 worth of stocks
  from each.
\item Next year in January, we look at how our stocks did, liquidate
  everything, and re-invest our (hopefully) increased money again in
  the stocks from our portfolio (there might be more stocks available,
  if companies from our portfolio got listed in that year, or fewer if
  some companies went bust or were de-listed).
\item We do this for 39 years until January 2017 and check what would
  have become of our \$100.
\end{itemize}

\noindent
Until Yahoo was bought by Altaba this summer, historical stock market
data for such back-of-the-envelope calculations was freely available
online. Unfortunately, nowadays this kind of data is difficult to
obtain, unless you are prepared to pay extortionate prices or be
severely rate-limited. Therefore this coursework comes with a number
of files containing CSV-lists with the historical stock prices for the
companies in our portfolios. Use these files for the following
tasks.\bigskip

\noindent
\textbf{Tasks (file drumb.scala):}

\begin{itemize}
\item[(1.a)] Write a function \texttt{get\_january\_data} that takes a
  stock symbol and a year as arguments. The function reads the
  corresponding CSV-file and returns the list of strings that start
  with the given year (each line in the CSV-list is of the form
  \texttt{year-01-someday,someprice}).

\item[(1.b)] Write a function \texttt{get\_first\_price} that again
  takes a stock symbol and a year as arguments. It should return the
  first January price for the stock symbol in the given year. For this
  it uses the list of strings generated by
  \texttt{get\_january\_data}.  A problem is that normally a stock
  exchange is not open on 1st of January, but depending on the day of
  the week on a later day (maybe 3rd or 4th). The easiest way to solve
  this problem is to obtain the whole January data for a stock symbol
  and then select the earliest, or first, entry in this list. The
  stock price of this entry should be converted into a \texttt{Double}. Such a
  price might not exist, in case the company does not exist in the given
  year. For example, if you query for Google in January of 1980, then
  clearly Google did not exist yet. Therefore you are asked to
  return a trade price as \texttt{Option[Double]}; \texttt{None}
  will be the value for when no price exists.

\item[(1.c)] Write a function \texttt{get\_prices} that takes a
  portfolio (a list of stock symbols) and a range of years, and gets all the
  first trading prices for each year in the range. You should organise
  this as a list of lists of \texttt{Option[Double]}'s. The inner
  lists are for all stock symbols from the portfolio and the outer
  list for the years. For example for Google and Apple in years 2010
  (first line), 2011 (second line) and 2012 (third line) you obtain:

\begin{verbatim}
  List(List(Some(311.349976), Some(27.505054)),
       List(Some(300.222351), Some(42.357094)),
       List(Some(330.555054), Some(52.852215)))
\end{verbatim}\hfill[2 Marks]

\item[(2.a)] Write a function that calculates the \emph{change factor} (delta)
  for how a stock price has changed from one year to the next. This is
  only well-defined if the corresponding company has been traded in both
  years. In this case you can calculate

  \[
  \frac{\mathit{price}_{\mathit{new}} - \mathit{price}_{\mathit{old}}}{\mathit{price}_{\mathit{old}}}
  \]

  If the change factor is defined, you should return it
  as \texttt{Some(change factor)}; if not, you should return
  \texttt{None}.

\item[(2.b)] Write a function that calculates all change factors
  (deltas) for the prices we obtained under Task 1. For the running
  example of Google and Apple for the years 2010 to 2012 you should
  obtain 4 change factors:

\begin{verbatim}
  List(List(Some(-0.03573992567129673), Some(0.5399749442411563)),
       List(Some(0.10103412653643493), Some(0.2477771728154912)))
\end{verbatim}

  That means Google did a bit badly in 2010, while Apple did very well.
  Both did OK in 2011. Make sure you handle the cases where a company is
  not listed in a year. In such cases the change factor should be \texttt{None}
  (see 2.a).\\
  \mbox{}\hfill\mbox{[1 Mark]}

\item[(3.a)] Write a function that calculates the ``yield'', or
  balance, for one year for our portfolio. This function takes the
  change factors, the starting balance and the year as arguments. If
  no company from our portfolio existed in that year, the balance is
  unchanged. Otherwise we invest in each existing company an equal
  amount of our balance. Using the change factors computed under Task
  2, calculate the new balance. Say we had \$100 in 2010; then we would have
  received in our running example involving Google and Apple:

\begin{verbatim}
  $50 * -0.03573992567129673 + $50 * 0.5399749442411563
    = $25.21175092849298
\end{verbatim}

  as profit for that year, and our new balance for 2011 is \$125 when
  converted to a \texttt{Long}.

\item[(3.b)] Write a function that calculates the overall balance
  for a range of years where each year the yearly profit is added to
  the balance and then re-invested into our portfolio.\\
  \mbox{}\hfill\mbox{[1 Mark]}
\end{itemize}\medskip
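
\noindent
The change factor and the yearly balance could be sketched as follows.
This is a simplified illustration, not the reference solution: the
function names are hypothetical, and unlike the function asked for in
(3.a), \texttt{yearlyYield} here receives only the change factors of a
single year instead of all change factors plus a year. Profits are
truncated to a \texttt{Long}, which is an assumption about the
intended rounding.

```scala
// Change factor between two yearly prices; defined only if both exist.
def delta(priceOld: Option[Double], priceNew: Option[Double]): Option[Double] =
  for (o <- priceOld; n <- priceNew) yield (n - o) / o

// Change factors for consecutive years, one inner list per pair of years.
def deltas(prices: List[List[Option[Double]]]): List[List[Option[Double]]] =
  prices.zip(prices.drop(1)).map { case (olds, news) =>
    olds.zip(news).map { case (o, n) => delta(o, n) }
  }

// New balance after one year: the balance is split equally among the
// companies with a defined change factor; unchanged if there are none.
def yearlyYield(factors: List[Option[Double]], balance: Long): Long = {
  val defined = factors.flatten
  if (defined.isEmpty) balance
  else {
    val share = balance.toDouble / defined.length
    balance + defined.map(_ * share).sum.toLong
  }
}

// Compound the yearly yields over all years, starting from `start`.
def compound(allFactors: List[List[Option[Double]]], start: Long): Long =
  allFactors.foldLeft(start)((bal, fs) => yearlyYield(fs, bal))
```

\noindent
Applied to the first list of change factors from the running example
with a balance of \$100, \texttt{yearlyYield} returns \$125, in line
with the calculation in (3.a).
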

\noindent
\textbf{Test Data:} File \texttt{drumb.scala} contains two portfolios
collected from the S\&P 500, one for blue-chip companies, including
Facebook, Amazon and Baidu; and another for listed real-estate
companies, whose names I have never heard of. Following the dumb
investment strategy from 1978 until 2017 would have turned a starting
balance of \$100 into roughly \$30,895 for real estate and a whopping
\$349,597 for blue chips. When comparing these results with your
own calculations, note that there might be some small rounding errors
which, when compounded, lead to moderately different values.\bigskip

\noindent
\textbf{Hints:} useful string functions: \texttt{.startsWith(...)} for
checking whether a string has a given prefix, \texttt{\_ ++ \_} for
concatenating two strings; useful option functions: \texttt{.flatten}
flattens a list of options such that it filters away all
\texttt{None}'s, \texttt{Try(...) getOrElse ...} runs some code that
might raise an exception (if it does, a default value can be given);
useful list functions: \texttt{.head} for obtaining the first element
in a non-empty list, \texttt{.length} for the length of a
list.\bigskip
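
\noindent
The hinted functions can be tried out on a few in-memory lines, as in
the sketch below. The sample lines are illustrative only (the dates
and the second price are made up), and \texttt{linesFor} and
\texttt{firstPrice} are hypothetical helpers that work on a list of
strings instead of reading a CSV-file.

```scala
import scala.util.Try

// Illustrative lines of the form year-01-someday,someprice (dates made up).
val january = List(
  "2010-01-04,311.349976",
  "2010-01-05,309.978882",
  "2011-01-03,300.222351")

// Lines that belong to the given year.
def linesFor(lines: List[String], year: Int): List[String] =
  lines.filter(_.startsWith(year.toString ++ "-"))

// Price of the first entry; None if the list is empty or the entry is malformed.
def firstPrice(lines: List[String]): Option[Double] =
  Try(Some(lines.head.split(",")(1).toDouble)).getOrElse(None)

// flatten drops every None and unwraps the remaining values.
val defined = List(Some(1.0), None, Some(2.0)).flatten
```

\noindent
With the sample lines, \texttt{firstPrice(linesFor(january, 2010))} is
\texttt{Some(311.349976)}, while for 1980 the filtered list is empty,
\texttt{.head} raises an exception, and the result is \texttt{None}.
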


\noindent
\textbf{Moral:} Reflecting on our assumptions, we are over-estimating
our yield in many ways: first, who could have known in 1978 which
companies would turn out to be blue chips? Also, since the portfolios are
chosen from the current S\&P 500, they do not include the myriad
of companies that went bust or were de-listed over the years.
So where does this leave our fictional character Mr T.~Drumb? Well, given
his inheritance, a really dumb investment strategy would have done
equally well, if not much better.\medskip

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: