\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\begin{document}

\section*{August Exam (Scala): Chat Log Mining}
This coursework is worth 50\%. It is about mining a log of an online
chat between 85 participants. The log is given as a csv-list in the file
\texttt{log.csv}. The log is an unordered list recording which
message was sent, by whom, when, and in response to which other
message. Each message also has a number and a unique hash code.\bigskip

\noindent
\textbf{Important:} Make sure the file you submit can be processed
by just calling
\begin{center}
\texttt{scala <<filename.scala>>}
\end{center}

\noindent
Do not use any mutable data structures in your
submission! They are not needed. This means you cannot use, for
example, \texttt{ListBuffer}s or \texttt{Array}s. Do not use
\texttt{return} in your code! It has a different meaning in Scala
than in Java. Do not use \texttt{var}! This declares a mutable
variable.
\subsection*{Disclaimer}

It should be understood that the work you submit represents your own
effort! You have not copied from anyone or anywhere else. An exception
is the Scala code I showed during the lectures or uploaded to KEATS,
which you can freely use.\bigskip
\subsection*{Background}

\noindent
The fields in the file \texttt{log.csv} are organised
as follows:

\begin{center}
\texttt{counter, id, time\_date, name, country, parent\_id, msg}
\end{center}
\noindent
Each line in this file contains the data for a single message. The field
\texttt{counter} is an integer number given to each message; \texttt{id} is a
unique hash string for a message; \texttt{time\_date} is the time when the message
was sent; \texttt{name} and \texttt{country} are data about the author
of the message, whereby sometimes the authors left the country information
empty; \texttt{parent\_id} is a hash specifying which other message the
message answers (this can also be empty). The field \texttt{msg} is the actual
message text. \textbf{Be careful} in the tasks below: this text can contain
commas and needs to be treated specially when the line is split up
by using \texttt{line.split(",").toList}. Tasks (2) and (3) are about
processing this data and storing it into the \texttt{Rec}-data-structure, which
is pre-defined in the file \texttt{resit.scala}:
\begin{center}
\begin{verbatim}
Rec(num: Int,
    msg_id: String,
    date: String,
    msg: String,
    author: String,
    country: Option[String],
    reply_id: Option[String],
    parent: Option[Int] = None,
    children: List[Int] = Nil)
\end{verbatim}
\end{center}
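
\noindent
The default values for \texttt{parent} and \texttt{children} mean that a
record can be created before its relatives are known. The following is
only a sketch: it assumes that \texttt{Rec} is declared as a case class
(the listing above shows just its fields), and all hashes, dates, names
and texts are invented for illustration.

\begin{verbatim}
// Assumption: Rec is a case class with the fields listed above.
case class Rec(num: Int,
               msg_id: String,
               date: String,
               msg: String,
               author: String,
               country: Option[String],
               reply_id: Option[String],
               parent: Option[Int] = None,
               children: List[Int] = Nil)

// Records built with the defaults parent = None, children = Nil.
// All values are made up.
val fresh = List(
  Rec(0, "aaa", "2020-05-15", "Hello all", "Alice", Some("UK"), None),
  Rec(1, "bbb", "2020-05-15", "Hi Alice", "Bob", None, Some("aaa")),
  Rec(2, "ccc", "2020-05-16", "Welcome", "Carol", Some("DE"), Some("aaa")))

// The links are found by comparing hashes; copy returns a new
// record and leaves the original one untouched.
val linked = fresh.map { r =>
  r.copy(
    parent   = fresh.find(p => r.reply_id.contains(p.msg_id)).map(_.num),
    children = fresh.filter(_.reply_id.contains(r.msg_id)).map(_.num))
}
\end{verbatim}

\noindent
Under these assumptions, \texttt{linked(0).children} is
\texttt{List(1, 2)} and \texttt{linked(1).parent} is \texttt{Some(0)},
while the records in \texttt{fresh} remain unchanged.
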
\noindent
The transformation into a \texttt{Rec}-data-structure is a two-step process:
first the fields for parents and children are given default
values; this information is then filled in during a second step.

The main information that will be computed in the tasks below is
which countries the authors are from and how many authors are from each
country. The last task will also rank which messages have been the most
popular in terms of how many replies they received (this will be computed
according to the number of children, grand-children and so on of a
message).
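
\noindent
The ranking by replies mentioned above amounts to counting descendants
in a tree. A minimal sketch, assuming a hypothetical map
\texttt{replies} from message numbers to the numbers of their direct
replies (the tasks below work on \texttt{Rec}s instead); it counts only
the replies, whereas the function \texttt{search} in Task (5) also
includes the message itself:

\begin{verbatim}
// Hypothetical reply structure:
// message number -> numbers of its direct replies.
val replies: Map[Int, List[Int]] =
  Map(0 -> List(1, 2), 1 -> List(3), 2 -> Nil, 3 -> Nil, 4 -> Nil)

// All messages below n: children, grand-children and so on.
// Terminates because the structure is assumed to have no cycles.
def descendants(n: Int): List[Int] = {
  val cs = replies.getOrElse(n, Nil)
  cs ::: cs.flatMap(descendants)
}

// Pairs (message number, number of descendants), most replies first.
val ranking: List[(Int, Int)] =
  replies.keys.toList.sorted
    .map(n => (n, descendants(n).length))
    .sortBy { case (_, size) => -size }
\end{verbatim}

\noindent
In this made-up example message 0 has three descendants (1, 2 and 3)
and therefore comes first in \texttt{ranking}.
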
\subsection*{Tasks}

\begin{itemize}
\item[(1)] The function \texttt{get\_csv} takes a file name as
argument. It should read the corresponding file and return its
content. The content should be returned as a list of strings, namely a
string for each line in the file. Since the file is a csv-file, the
first line (the header) should be dropped from the result. Lines are
separated by \verb!"\n"!. For the file \texttt{log.csv} there should
be a list of 680 separate strings.

\mbox{}\hfill[5\% Marks]
\item[(2)] The function \texttt{process\_line} takes a single line
from the csv-file (as generated by \texttt{get\_csv}) and creates a
Rec(ord) data structure. This data structure is pre-defined in the
Scala file.

For processing a line, you should use the function

\begin{center}
\verb!<<some_line>>.split(",").toList!
\end{center}

\noindent
in order to separate the fields. HOWEVER BE CAREFUL that the message
text in the last field of \texttt{log.csv} can contain commas and
therefore the split will not always result in a list of only 7
elements. You need to concatenate the 7th element and anything beyond
it into a single string before assigning the field \texttt{msg}.

\mbox{}\hfill[10\% Marks]
\item[(3)] Each record in the log contains a unique hash code
identifying its message. For example

\begin{center}
\verb!"5ebeb459ac278d01301f1497"!
\end{center}

\noindent
Some messages also contain a hash code identifying the parent
message (that is, the question to which they reply). The function
\texttt{post\_process} fills in the information about potential
children and a potential parent message.

The auxiliary function \texttt{get\_children} takes a record
\texttt{e} and a record list \texttt{rs} as arguments, and returns
the list of all direct children (children have the hash code of
\texttt{e} as \texttt{reply\_id}). The list of children is returned
as a list of \texttt{num}s. The \texttt{num}s can be used later
as indexes in a Rec-list.

The auxiliary function \texttt{get\_parent} returns the number of
the record corresponding to the \texttt{reply\_id} (encoded as
\texttt{Some} if there exists one, otherwise it returns \texttt{None}).

In order to update a record, say \texttt{r}, with some additional
information, you can use the Scala code
\begin{verbatim}
  r.copy(parent = ....,
         children = ....)
\end{verbatim}

\mbox{}\hfill[10\% Marks]
\item[(4)] The functions \texttt{get\_countries} and
\texttt{get\_countries\_numbers} calculate the countries the
message authors come from and how many authors come from each
country (returned as a \texttt{Map} from countries to Integers). In
case an author did not specify a country, the empty string should
be returned as the country.

\mbox{}\hfill[10\% Marks]
\item[(5)] This task identifies the most popular questions in the log,
whereby popularity is measured in terms of how many follow-up
questions were asked. We say that such questions belong to a
\emph{thread}. It can be assumed that in \texttt{log.csv} there are
no circular references, that is, no question refers to a
follow-up question as parent.

The function \texttt{ordered\_thread\_sizes} orders the
message threads according to how many answers were given for one
message (that is, how many children, grand-children and so on one
message has).

The auxiliary function \texttt{search} enumerates all children,
grand-children and so on for a given record \texttt{r} (including
the record \texttt{r} itself). The function \texttt{search} returns
these children as a list of \texttt{Rec}s.

The function \texttt{thread\_size} generates for a record, say
\texttt{r}, a pair consisting of the number of \texttt{r} and the
number of all children as produced by \texttt{search}. The numbers are the
integers given for each message; for \texttt{log.csv} a number
is between 0 and 679.

The function \texttt{ordered\_thread\_sizes} orders the list of
pairs according to which thread in the chat is the longest (the
longest should be first).

\mbox{}\hfill[15\% Marks]
\end{itemize}
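
\noindent
The warning about commas in Tasks (1) and (2) can be illustrated with a
small sketch. It is not the reference solution: the helper names
\texttt{read\_lines}, \texttt{split\_fields} and \texttt{opt} are
hypothetical, the sample line is invented (only the hash is the one
quoted in Task (3)), and putting the commas back when rejoining the
message text is an assumption.

\begin{verbatim}
import scala.io.Source

// Hypothetical helper: all lines of a file except the csv header.
// (Sketch only: the file handle is not closed explicitly.)
def read_lines(name: String): List[String] =
  Source.fromFile(name).getLines().toList.drop(1)

// Hypothetical helper: cuts a line into 7 fields, assuming that
// only the last field (the message text) can contain commas and
// that the message text is not empty.
def split_fields(line: String): List[String] = {
  val parts = line.split(",").toList
  parts.take(6) :+ parts.drop(6).mkString(",")
}

// Hypothetical helper: an empty field becomes None.
def opt(s: String): Option[String] =
  if (s.isEmpty) None else Some(s)

// Invented sample line: no parent_id, two commas in the text.
val sample =
  "7,5ebeb459ac278d01301f1497,2020-05-15,Alice,UK,,Hello, world, again"

val fields = split_fields(sample)   // 7 fields
val text   = fields(6)              // "Hello, world, again"
val reply  = opt(fields(5))         // None
\end{verbatim}

\noindent
A plain \texttt{sample.split(",").toList} would give nine pieces for
this line; taking the first six and rejoining the rest restores the
seven fields.
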
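
\noindent
For Task (4), grouping is one way to obtain a \texttt{Map} from
countries to numbers. The sketch below works on hypothetical
(author, country) pairs rather than on \texttt{Rec}s, and it assumes
that an author is counted once regardless of how many messages he or
she sent; the task text does not settle this point.

\begin{verbatim}
// Hypothetical (author, country) pairs, one pair per message,
// so authors can occur several times. All values are made up.
val posts: List[(String, Option[String])] =
  List(("Alice", Some("UK")), ("Bob", None), ("Alice", Some("UK")),
       ("Carol", Some("DE")), ("Dave", Some("UK")))

// A missing country is represented by the empty string.
def country_of(c: Option[String]): String = c.getOrElse("")

// The set of countries that occur.
val countries: Set[String] = posts.map(p => country_of(p._2)).toSet

// Number of distinct authors per country (assumption: every
// author counts once).
val numbers: Map[String, Int] =
  posts.distinct
    .groupBy(p => country_of(p._2))
    .map { case (c, ps) => (c, ps.length) }
\end{verbatim}

\noindent
With the made-up data, \texttt{numbers} maps \texttt{"UK"} to 2 and
both \texttt{"DE"} and the empty string to 1.
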
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: