handouts/ho01.tex
changeset 178 13c6bd6e3477
parent 177 46e581d66f3a
child 179 1cacbe5c67cf
--- a/handouts/ho01.tex	Thu Sep 25 05:48:08 2014 +0100
+++ b/handouts/ho01.tex	Thu Sep 25 06:43:43 2014 +0100
@@ -172,112 +172,119 @@
 
 \subsection*{Of Cookies and Salts}
 
-Lets look at another example which helps us to understand how
-passwords should be verified and stored. Imagine you need to
-develop a web-application that has the feature of recording
-how many times a customer visits a page. For example to 
-give a discount whenever the customer visited a webpage some 
-$x$ number of times (say $x$ equal $5$). For a number of years
-the webpage of the New York Times operated in this way: it 
-allowed you to read ten articles per months for free; if
-you wanted to read more you had to pay. There is one more
-constraint: we want to store the information about the number
-of times a customer has visited inside a cookie. 
+Let's look at another example which should help with
+understanding how passwords should be verified and stored.
+Imagine you need to develop a web-application that has the
+feature of recording how many times a customer visits a page,
+for example in order to give a discount whenever the customer
+has visited a webpage some $x$ number of times (say $x$ equals
+$5$). There is one more constraint: we want to store the
+information about the number of times a customer has visited
+inside a cookie. I think that for a number of years the webpage
+of the New York Times operated in this way: it allowed you to
+read ten articles per month for free; if you wanted to read
+more, you had to pay. My guess is that it used cookies for
+recording how many times its pages were visited, because if
+you switched browsers you could easily circumvent the
+restriction of ten articles.
 
-A typical web-application works as follows: The browser sends
-a GET request for a particular page to a server. The server 
-answers is request. A simple JavaScript program that realises
-a ``hello world'' webpage is as follows:
+To implement our web-application it is good to look under the
+hood at what happens when a webpage is requested. A typical
+web-application works as follows: The browser sends a GET
+request for a particular page to a server. The server answers
+this request. A simple JavaScript program that realises a
+``hello world'' webpage is as follows:
 
 \begin{center}
 \lstinputlisting{../progs/ap0.js}
 \end{center}
 
-\noindent The interesting lines are 4 to 7 where the answer
-to the GET request is generated\ldots in this case it is just
-a simple string. This program is run on the server and will
-be run whenever a browser initiates such a GET request.
+\noindent The interesting lines are 4 to 7 where the answer to
+the GET request is generated\ldots in this case it is just a
+simple string. This program is run on the server and will be
+executed whenever a browser initiates such a GET request.
 
 Of interest for our web-application is the feature that the
 server, when answering the request, can store some information
-on the client. This information is called a \emph{cookie}.
-The next time the browser makes another GET request to the 
-same webpage, this cookie can be read by the browser. 
-Therefore we can use a cookie in order to store a counter
-recording the number of times a webpage has been visited. 
-This can be realised with the following small program
+at the client's side. This information is called a
+\emph{cookie}. The next time the browser makes another GET
+request to the same webpage, this cookie can be read by the
+server. We can use cookies in order to store a counter that
+records the number of times our webpage has been visited. This
+can be realised with the following small program:
 
 \begin{center}
 \lstinputlisting{../progs/ap2.js}
 \end{center}
 
-\noindent The overall structure of this code is the same as
-the earlier program: Lines 7 to 17 generate the answer to a
+\noindent The overall structure of this program is the same as
+the earlier one: Lines 7 to 17 generate the answer to a
 GET-request. The new part is in Line 8 where we read the
 cookie called \pcode{counter}. If present, this cookie will be
 sent together with the GET-request from the client. The value
 of this counter will come in the form of a string, therefore we
 use the function \pcode{parseInt} in order to transform it
-into a string. In case the cookie is not present, or has been
-deleted, we default the counter to zero. The odd looking
-construction \code{...|| 0} is realising this in JavaScript.
-In Line 9 we increase the counter by one and store it back
-to the client (under the name \pcode{counter}, since potentially 
-more than one value could be stored). In Lines 10 to 15 we
-test whether this counter is greater or equal than 5 and
-send accordingly a message back to the client.
+into an integer. In case the cookie is not present, we default
+the counter to zero. The odd looking construction \code{...|| 0}
+realises this defaulting in JavaScript. In Line 9 we
+increase the counter by one and store it back to the client
+(under the name \pcode{counter}, since potentially more than
+one value could be stored). In Lines 10 to 15 we test whether
+this counter is greater than or equal to 5 and accordingly send a
+specially crafted message back to the client.
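
The defaulting step can be looked at in isolation. The following is only a minimal sketch and not the code from \pcode{ap2.js}: the helper name \pcode{nextCounter} is hypothetical and merely illustrates how \pcode{parseInt} together with \pcode{|| 0} turns a missing or malformed cookie value into zero before the counter is increased.

```javascript
// Hypothetical helper (not part of ap2.js): compute the next counter
// value from the raw cookie string sent by the browser.
function nextCounter(rawCookieValue) {
  // parseInt yields NaN for undefined or non-numeric input;
  // NaN is falsy, so `|| 0` falls back to zero.
  const counter = parseInt(rawCookieValue, 10) || 0;
  return counter + 1;
}

console.log(nextCounter(undefined)); // cookie not present
console.log(nextCounter("4"));       // cookie holds the string "4"
console.log(nextCounter("abc"));     // malformed cookie
```

Passing the radix \pcode{10} explicitly is a choice made in this sketch, so that the cookie string is always read as a decimal number.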
 
 Let us step back and analyse this program from a security
-perspective. We store a counter in plain text on the client's
-browser (which is not under our control at all). Depending on
-this value we want to unlock a resource (like a discount) when
-it reaches a threshold. If the client deletes the cookie, then
-the counter will just be reset to zero. This does not bother
-us, because the purported discount will just be granted later.
-This does not lose us any (hypothetical) money. What we need
-to be concerned about is when a client artificially increases
-this counter without having visited our web-page. This is
-actually a trivial task for a knowledgeable person, since
-there are convenient tools that allow us to set a cookie to an
-arbitrary value, for example above our threshold for the
-discount. 
+point of view. We store a counter in plain text on the
+client's browser (which is not under our control). Depending
+on this value we want to unlock a resource (like a discount)
+when it reaches a threshold. If the client deletes the cookie,
+then the counter will just be reset to zero. This does not
+bother us, because the purported discount will just not be
+granted. In this way we do not lose any (hypothetical)
+money. What we need to be concerned about, however, is a
+client artificially increasing this counter without having
+visited our webpage. This is actually a trivial task for a
+knowledgeable person, since there are convenient tools that
+allow one to set a cookie to an arbitrary value, for example
+above our threshold for the discount. 
 
-There is no real way to prevent this kind of tampering with
-cookies, because the whole purpose of cookies is that they are
-stored on the client's side, which from the the server's
-perspective is in a potentially hostile environment. What we
-need to ensure is the integrity of this counter in this
-hostile environment. We could think of encrypting the counter.
-But this has two drawbacks to do with the key for encryption.
-If you use a `global' key for all our client's that visit our
-site, then we risk that our whole ``business'' might colapse
-when this key gets known to the outside world. Suddenly all
-cookies we might have set in the past, can now be manipulated.
-If on the other hand, we use a ``private'' key for every
-client, then we have to solve the problem of having to
-securely store this key on our server side (obviously we
-cannot store the key with the client because then the client
-again has all data to tamper with the counter; and obviously
-we also cannot encrypt the key, lest we can solve a
-chicken-and-egg problem). So encryption seems to not solve the
-problem we face with the integrity of our counter.
+There seems to be no real way to prevent this kind of
+tampering with cookies, because the whole purpose of cookies
+is that they are stored on the client's side, which from the
+server's perspective is a potentially hostile environment.
+What we need to ensure is the integrity of this counter in
+this hostile environment. We could think of encrypting the
+counter. But this has two drawbacks to do with the key for
+encryption. If we use a single, global key for all the
+clients that visit our site, then we risk that our whole
+``business'' might collapse in the event this key gets known
+to the outside world. Then all cookies we might have set in
+the past could be decrypted and manipulated. If, on the
+other hand, we use many ``private'' keys, one for each client,
+then we have to solve the problem of having to securely store
+these keys on our server side (obviously we cannot store a key
+with the client because then the client again has all the data
+to tamper with the counter; and obviously we also cannot
+encrypt the keys, unless we can solve a chicken-and-egg
+problem). So encryption does not seem to solve the problem we
+face with the integrity of our counter.
 
-Fortunately, \emph{hash function} seem to be more suitable for
-our purpose. Like encryption, hash functions scrambles data
-but in such a way that it is easy to calculate the output of a
-has function from the input. But it is hard (i.e.~practically
+Fortunately, \emph{hash functions} seem to be more suitable
+for our purpose. Like encryption, hash functions scramble data
+in such a way that it is easy to calculate the output of a hash
+function from the input. But it is hard (i.e.~practically
+impossible) to calculate the input from knowing the output.
+Therefore hash functions are often called \emph{one-way
+functions}. There are several such hashing functions. For
+example SHA-1 would hash the string \pcode{"hello world"} to
+produce
 
 \begin{center}
 \pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
 \end{center}
 
 \noindent Another handy feature of hash functions is that if
-the input changes a little bit, the output changes 
-drastically. For example \pcode{"iello world"} produces under 
+the input changes only a little, the output changes
+drastically. For example \pcode{"iello world"} produces under
 SHA-1 the output
 
 \begin{center}
@@ -285,51 +292,57 @@
 \end{center}
 
 \noindent That means it is not predictable what the output
-will be from input that is ``close by''. 
+will be from just looking at input that is ``close by''. 
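
Both observations can be checked with a few lines of JavaScript. The sketch below assumes it is run under Node.js and uses its built-in \pcode{crypto} module; the helper name \pcode{sha1} is chosen here for illustration only.

```javascript
const crypto = require("crypto");

// Return the SHA-1 digest of a string as lowercase hexadecimal.
function sha1(input) {
  return crypto.createHash("sha1").update(input, "utf8").digest("hex");
}

const h1 = sha1("hello world");
const h2 = sha1("iello world");

console.log(h1); // 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
console.log(h2);

// Count the positions at which the two hex digests differ.
let differing = 0;
for (let i = 0; i < h1.length; i++) {
  if (h1[i] !== h2[i]) differing++;
}
console.log(differing + " of " + h1.length + " hex digits differ");
```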
 
-We can use hashes and store in the cookie the value of the
-counter together with its hash. We need to store both pieces
-of data such we can extract both components (below I will just
-separate them using a \pcode{"-"}). If we now read back the
-cookie when the client visits our webpage, we can extract the
-counter, hash it again and compare the result to the stored
-hash value inside the cookie. If these hashes disagree, then
-we can deduce that cookie has been tampered with.
-Unfortunately if they agree, we can still not be entirely sure
-that not a clever hacker has tampered with the cookie. The
-reason is that the hacker can see the clear text part of the
-cookie, say \pcode{3}, and its hash. It does not take much
-trial and error to find out that we used the SHA-1 hashing
-functions and then graft a cookie accordingly. This is eased
-by the fact that for SHA-1 many strings and corresponding
-hashvalues are precalculated. Type into Google for example the
-hash value for \pcode{"hello wolrd"} and you will actually
-pretty quickly find that it was generated by \pcode{"hello
+We can use hashes in our web-application and store in the
+cookie the value of the counter in plain text together
+with its hash. We need to store both pieces of data in such a
+way that we can extract both components (below I will just
+separate them using a \pcode{"-"}). If we now read back the
+cookie when the client visits our webpage, we can extract the
+counter, hash it again and compare the result to the stored
+hash value inside the cookie. If these hashes disagree, then
+we can deduce that the cookie has been tampered with.
+Unfortunately, if they agree, we still cannot be entirely sure
+that a clever hacker has not tampered with the cookie. The
+reason is that the hacker can see the clear text part of the
+cookie, say \pcode{3}, and also its hash. It does not take
+much trial and error to find out that we used the SHA-1
+hashing function and then craft a cookie accordingly. This is
+eased by the fact that for SHA-1 many strings and
+corresponding hash values are precalculated. Type, for
+example, the hash value for \pcode{"hello wolrd"} into Google
+and you will actually pretty quickly find that it was
+generated by the input string \pcode{"hello
 wolrd"}. This defeats the purpose of a hashing function and
-would not help us for our web-applications. The corresponding
-attack is called \emph{dictionary attack}\ldots hashes are not
-reversed by brute force calculations, that is trying out all
-possible combinations.
+thus would not help us for our web-applications. 
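
To see how little effort such a forgery takes, consider the following sketch. It assumes, as described above, that the cookie consists of the counter, a \pcode{"-"} and the SHA-1 hash of the counter; the names \pcode{sha1}, \pcode{isUntampered} and \pcode{forged} are made up for illustration, and the code is meant to be run under Node.js.

```javascript
const crypto = require("crypto");

// Return the SHA-1 digest of a string as lowercase hexadecimal.
function sha1(input) {
  return crypto.createHash("sha1").update(input, "utf8").digest("hex");
}

// Server-side check for the unsalted scheme: the cookie is
// "<counter>-<sha1(counter)>".
function isUntampered(cookie) {
  const [counter, hash] = cookie.split("-");
  return sha1(counter) === hash;
}

// An attacker who guessed the scheme never visits the page, but
// simply computes a matching hash for the value of their choice.
const forged = "5-" + sha1("5");

console.log(isUntampered(forged));       // the forgery passes the check
console.log(isUntampered("5-deadbeef")); // a wrong hash is detected
```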
+
 
 
 There is one ingredient missing, which happens to be called
-\emph{salt}. The salt is a random key, which is added to the
+\emph{salt}. A salt is a random key, which is added to the
 counter before the hash is calculated. In our case we need to
-keep the salt secret. As can be see from 
-Figure~\ref{hashsalt},
-we now need to extract the cookie data (Line 20). When we 
-set the new increased cookie, we will add the salt before 
-hashing (this is done in Line 13). 
+keep the salt secret. As can be seen in Figure~\ref{hashsalt},
+we now need to extract from the cookie the counter value and
+the hash (Lines 19 and 20). But before we hash the counter
+again (Line 22) we need to add the secret salt. Similarly, when
+we set the new increased counter, we will need to add the salt
+before hashing (this is done in Line 15). Our web-application
+will now store cookies like 
 
 \begin{figure}[p]
-\lstinputlisting{../progs/App3.js}
+\lstinputlisting{../progs/App4.js}
 \caption{\label{hashsalt}}
 \end{figure}
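
The idea of the salted scheme can also be sketched in a few self-contained lines. This is not the code from \pcode{App4.js}: the salt value and the names \pcode{makeCookie} and \pcode{readCookie} are assumptions chosen for illustration, and in a real application the salt would have to be a long random value that never leaves the server.

```javascript
const crypto = require("crypto");

// Assumption: a placeholder salt; a real one must be random and secret.
const SALT = "replace-with-a-long-random-secret";

// Return the SHA-1 digest of a string as lowercase hexadecimal.
function sha1(input) {
  return crypto.createHash("sha1").update(input, "utf8").digest("hex");
}

// Build the cookie value "<counter>-<sha1(counter + salt)>".
function makeCookie(counter) {
  return counter + "-" + sha1(counter + SALT);
}

// Return the counter if the salted hash matches, otherwise 0.
function readCookie(cookie) {
  const [counterText, hash] = String(cookie || "").split("-");
  const counter = parseInt(counterText, 10) || 0;
  return sha1(counter + SALT) === hash ? counter : 0;
}

console.log(readCookie(makeCookie(3)));    // genuine cookie: counter is accepted
console.log(readCookie("5-" + sha1("5"))); // forged without the salt: reset to 0
```

Resetting a tampered counter to zero is just one possible design choice in this sketch; a server could equally reject the request altogether.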
 
+%The corresponding attack is called \emph{dictionary
+%attack}\ldots hashes are not reversed by brute force
+%calculations, that is trying out all possible combinations.
 
 
+%We have to make sure the salt does not get known.
 
-Note ....NYT 
+
+%Note ....NYT 
 \end{document}
 
 %%% Local Variables: