handouts/ho01.tex
changeset 178 13c6bd6e3477
parent 177 46e581d66f3a
child 179 1cacbe5c67cf
equal deleted inserted replaced
177:46e581d66f3a 178:13c6bd6e3477
   170 secure. Getting the incentives right in favour of security is
   170 secure. Getting the incentives right in favour of security is
   171 often a tricky business.
   171 often a tricky business.
   172 
   172 
   173 \subsection*{Of Cookies and Salts}
   173 \subsection*{Of Cookies and Salts}
   174 
   174 
   175 Lets look at another example which helps us to understand how
   175 Lets look at another example which should helps with
   176 passwords should be verified and stored. Imagine you need to
   176 understanding how passwords should be verified and stored.
   177 develop a web-application that has the feature of recording
   177 Imagine you need to develop a web-application that has the
   178 how many times a customer visits a page. For example to 
   178 feature of recording how many times a customer visits a page.
   179 give a discount whenever the customer visited a webpage some 
   179 For example to give a discount whenever the customer visited a
   180 $x$ number of times (say $x$ equal $5$). For a number of years
   180 webpage some $x$ number of times (say $x$ equal $5$). There is
   181 the webpage of the New York Times operated in this way: it 
   181 one more constraint: we want to store the information about
   182 allowed you to read ten articles per months for free; if
   182 the number of times a customer has visited inside a cookie. I
   183 you wanted to read more you had to pay. There is one more
   183 think, for a number of years the webpage of the New York Times
   184 constraint: we want to store the information about the number
   184 operated in this way: it allowed you to read ten articles per
   185 of times a customer has visited inside a cookie. 
   185 months for free; if you wanted to read more, you had to pay.
   186 
   186 My guess is it used cookies for recording how many times their
   187 A typical web-application works as follows: The browser sends
   187 pages was visited, because if you switched browsers you could
   188 a GET request for a particular page to a server. The server 
   188 easily circumvent the restriction about ten articles.
   189 answers is request. A simple JavaScript program that realises
   189 
   190 a ``hello world'' webpage is as follows:
   190 To implement our web-application it is good to look under the
       
   191 hood what happens when a webpage is requested. A typical
       
   192 web-application works as follows: The browser sends a GET
       
   193 request for a particular page to a server. The server answers
       
   194 this request. A simple JavaScript program that realises a
       
   195 ``hello world'' webpage is as follows:
   191 
   196 
   192 \begin{center}
   197 \begin{center}
   193 \lstinputlisting{../progs/ap0.js}
   198 \lstinputlisting{../progs/ap0.js}
   194 \end{center}
   199 \end{center}
   195 
   200 
   196 \noindent The interesting lines are 4 to 7 where the answer
   201 \noindent The interesting lines are 4 to 7 where the answer to
   197 to the GET request is generated\ldots in this case it is just
   202 the GET request is generated\ldots in this case it is just a
   198 a simple string. This program is run on the server and will
   203 simple string. This program is run on the server and will be
   199 be run whenever a browser initiates such a GET request.
   204 executed whenever a browser initiates such a GET request.
   200 
   205 
   201 For our web-application of interest is the feature that the
   206 For our web-application of interest is the feature that the
   202 server when answering the request can store some information
   207 server when answering the request can store some information
   203 on the client. This information is called a \emph{cookie}.
   208 at the client's side. This information is called a
   204 The next time the browser makes another GET request to the 
   209 \emph{cookie}. The next time the browser makes another GET
   205 same webpage, this cookie can be read by the browser. 
   210 request to the same webpage, this cookie can be read by the
   206 Therefore we can use a cookie in order to store a counter
   211 server. We can use cookies in order to store a counter that
   207 recording the number of times a webpage has been visited. 
   212 records the number of times our webpage has been visited. This
   208 This can be realised with the following small program
   213 can be realised with the following small program
   209 
   214 
   210 \begin{center}
   215 \begin{center}
   211 \lstinputlisting{../progs/ap2.js}
   216 \lstinputlisting{../progs/ap2.js}
   212 \end{center}
   217 \end{center}
   213 
   218 
   214 \noindent The overall structure of this code is the same as
   219 \noindent The overall structure of this program is the same as
   215 the earlier program: Lines 7 to 17 generate the answer to a
   220 the earlier one: Lines 7 to 17 generate the answer to a
   216 GET-request. The new part is in Line 8 where we read the
   221 GET-request. The new part is in Line 8 where we read the
   217 cookie called \pcode{counter}. If present, this cookie will be
   222 cookie called \pcode{counter}. If present, this cookie will be
   218 send together with the GET-request from the client. The value
   223 send together with the GET-request from the client. The value
   219 of this counter will come in form of a string, therefore we
   224 of this counter will come in form of a string, therefore we
   220 use the function \pcode{parseInt} in order to transform it
   225 use the function \pcode{parseInt} in order to transform it
   221 into a string. In case the cookie is not present, or has been
   226 into an integer. In case the cookie is not present, we default
   222 deleted, we default the counter to zero. The odd looking
   227 the counter to zero. The odd looking construction \code{...||
   223 construction \code{...|| 0} is realising this in JavaScript.
   228 0} is realising this defaulting in JavaScript. In Line 9 we
   224 In Line 9 we increase the counter by one and store it back
   229 increase the counter by one and store it back to the client
   225 to the client (under the name \pcode{counter}, since potentially 
   230 (under the name \pcode{counter}, since potentially more than
   226 more than one value could be stored). In Lines 10 to 15 we
   231 one value could be stored). In Lines 10 to 15 we test whether
   227 test whether this counter is greater or equal than 5 and
   232 this counter is greater or equal than 5 and send accordingly a
   228 send accordingly a message back to the client.
   233 specially grafted message back to the client.
   229 
   234 
   230 Let us step back and analyse this program from a security
   235 Let us step back and analyse this program from a security
   231 perspective. We store a counter in plain text on the client's
   236 point of view. We store a counter in plain text on the
   232 browser (which is not under our control at all). Depending on
   237 client's browser (which is not under our control). Depending
   233 this value we want to unlock a resource (like a discount) when
   238 on this value we want to unlock a resource (like a discount)
   234 it reaches a threshold. If the client deletes the cookie, then
   239 when it reaches a threshold. If the client deletes the cookie,
   235 the counter will just be reset to zero. This does not bother
   240 then the counter will just be reset to zero. This does not
   236 us, because the purported discount will just be granted later.
   241 bother us, because the purported discount will just not be
   237 This does not lose us any (hypothetical) money. What we need
   242 granted. In this way we do not lose us any (hypothetical)
   238 to be concerned about is when a client artificially increases
   243 money. What we need to be concerned about is, however, when a
   239 this counter without having visited our web-page. This is
   244 client artificially increases this counter without having
   240 actually a trivial task for a knowledgeable person, since
   245 visited our web-page. This is actually a trivial task for a
   241 there are convenient tools that allow us to set a cookie to an
   246 knowledgeable person, since there are convenient tools that
   242 arbitrary value, for example above our threshold for the
   247 allow one to set a cookie to an arbitrary value, for example
   243 discount. 
   248 above our threshold for the discount. 
   244 
   249 
   245 There is no real way to prevent this kind of tampering with
   250 There seems to be no real way to prevent this kind of
   246 cookies, because the whole purpose of cookies is that they are
   251 tampering with cookies, because the whole purpose of cookies
   247 stored on the client's side, which from the the server's
   252 is that they are stored on the client's side, which from the
   248 perspective is in a potentially hostile environment. What we
   253 the server's perspective is a potentially hostile environment.
   249 need to ensure is the integrity of this counter in this
   254 What we need to ensure is the integrity of this counter in
   250 hostile environment. We could think of encrypting the counter.
   255 this hostile environment. We could think of encrypting the
   251 But this has two drawbacks to do with the key for encryption.
   256 counter. But this has two drawbacks to do with the key for
   252 If you use a `global' key for all our client's that visit our
   257 encryption. If you use a single, global key for all the
   253 site, then we risk that our whole ``business'' might colapse
   258 clients that visit our site, then we risk that our whole
   254 when this key gets known to the outside world. Suddenly all
   259 ``business'' might collapse in the event this key gets known
   255 cookies we might have set in the past, can now be manipulated.
   260 to the outside world. Then all cookies we might have set in
   256 If on the other hand, we use a ``private'' key for every
   261 the past, can now be decrypted and manipulated. If, on the
   257 client, then we have to solve the problem of having to
   262 other hand, we use many ``private'' keys for the clients, then
   258 securely store this key on our server side (obviously we
   263 we have to solve the problem of having to securely store this
   259 cannot store the key with the client because then the client
   264 key on our server side (obviously we cannot store the key with
   260 again has all data to tamper with the counter; and obviously
   265 the client because then the client again has all data to
   261 we also cannot encrypt the key, lest we can solve a
   266 tamper with the counter; and obviously we also cannot encrypt
   262 chicken-and-egg problem). So encryption seems to not solve the
   267 the key, lest we can solve a chicken-and-egg problem). So
   263 problem we face with the integrity of our counter.
   268 encryption seems to not solve the problem we face with the
   264 
   269 integrity of our counter.
   265 Fortunately, \emph{hash function} seem to be more suitable for
   270 
   266 our purpose. Like encryption, hash functions scrambles data
   271 Fortunately, \emph{hash functions} seem to be more suitable
   267 but in such a way that it is easy to calculate the output of a
   272 for our purpose. Like encryption, hash functions scramble data
   268 has function from the input. But it is hard (i.e.~practically
   273 in such a way that it is easy to calculate the output of a has
       
   274 function from the input. But it is hard (i.e.~practically
   269 impossible) to calculate the input from knowing the output.
   275 impossible) to calculate the input from knowing the output.
   270 Therefore has functions are often called one-way functions.
   276 Therefore hash functions are often called \emph{one-way
   271 There are several such hashing function. For example SHA-1
   277 functions}. There are several such hashing function. For
   272 would has the string \pcode{"hello world"} to
   278 example SHA-1 would hash the string \pcode{"hello world"} to
       
   279 produce
   273 
   280 
   274 \begin{center}
   281 \begin{center}
   275 \pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
   282 \pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
   276 \end{center}
   283 \end{center}
   277 
   284 
   278 \noindent Another handy feature of hash functions is that if
   285 \noindent Another handy feature of hash functions is that if
   279 the input changes a little bit, the output changes 
   286 the input changes only a little, the output changes
   280 drastically. For example \pcode{"iello world"} produces under 
   287 drastically. For example \pcode{"iello world"} produces under
   281 SHA-1 the output
   288 SHA-1 the output
   282 
   289 
   283 \begin{center}
   290 \begin{center}
   284 \pcode{d2b1402d84e8bcef5ae18f828e43e7065b841ff1}
   291 \pcode{d2b1402d84e8bcef5ae18f828e43e7065b841ff1}
   285 \end{center}
   292 \end{center}
   286 
   293 
   287 \noindent That means it is not predictable what the output
   294 \noindent That means it is not predictable what the output
   288 will be from input that is ``close by''. 
   295 will be from just looking at input that is ``close by''. 
   289 
   296 
   290 We can use hashes and store in the cookie the value of the
   297 We can use hashes in our web-application and store in the
   291 counter together with its hash. We need to store both pieces
   298 cookie the value of the counter in plain text but together
   292 of data such we can extract both components (below I will just
   299 with its hash. We need to store both pieces of data such we
   293 separate them using a \pcode{"-"}). If we now read back the
   300 can extract both components (below I will just separate them
   294 cookie when the client visits our webpage, we can extract the
   301 using a \pcode{"-"}). If we now read back the cookie when the
   295 counter, hash it again and compare the result to the stored
   302 client visits our webpage, we can extract the counter, hash it
   296 hash value inside the cookie. If these hashes disagree, then
   303 again and compare the result to the stored hash value inside
   297 we can deduce that cookie has been tampered with.
   304 the cookie. If these hashes disagree, then we can deduce that
   298 Unfortunately if they agree, we can still not be entirely sure
   305 the cookie has been tampered with. Unfortunately, if they
   299 that not a clever hacker has tampered with the cookie. The
   306 agree, we can still not be entirely sure that not a clever
   300 reason is that the hacker can see the clear text part of the
   307 hacker has tampered with the cookie. The reason is that the
   301 cookie, say \pcode{3}, and its hash. It does not take much
   308 hacker can see the clear text part of the cookie, say
   302 trial and error to find out that we used the SHA-1 hashing
   309 \pcode{3}, and also its hash. It does not take much trial and
   303 functions and then graft a cookie accordingly. This is eased
   310 error to find out that we used the SHA-1 hashing functions and
   304 by the fact that for SHA-1 many strings and corresponding
   311 then graft a cookie accordingly. This is eased by the fact
   305 hashvalues are precalculated. Type into Google for example the
   312 that for SHA-1 many strings and corresponding hashvalues are
   306 hash value for \pcode{"hello wolrd"} and you will actually
   313 precalculated. Type, for example, into Google the hash value
   307 pretty quickly find that it was generated by \pcode{"hello
   314 for \pcode{"hello wolrd"} and you will actually pretty quickly
       
   315 find that it was generated by input string \pcode{"hello
   308 wolrd"}. This defeats the purpose of a hashing functions and
   316 wolrd"}. This defeats the purpose of a hashing functions and
   309 would not help us for our web-applications. The corresponding
   317 thus would not help us for our web-applications. 
   310 attack is called \emph{dictionary attack}\ldots hashes are not
   318 
   311 reversed by brute force calculations, that is trying out all
       
   312 possible combinations.
       
   313 
   319 
   314 
   320 
   315 There is one ingredient missing, which happens to be called
   321 There is one ingredient missing, which happens to be called
   316 \emph{salt}. The salt is a random key, which is added to the
   322 \emph{salts}. Salts are random keys, which are added to the
   317 counter before the hash is calculated. In our case we need to
   323 counter before the hash is calculated. In our case we need to
   318 keep the salt secret. As can be see from 
   324 keep the salt secret. As can be see in Figure~\ref{hashsalt},
   319 Figure~\ref{hashsalt},
   325 we now need to extract from the cookie the counter value and
   320 we now need to extract the cookie data (Line 20). When we 
   326 the hash (Lines 19 and 20). But before has the counter again
   321 set the new increased cookie, we will add the salt before 
   327 (Line 22) we need to add the secret salt. Similarly, when we
   322 hashing (this is done in Line 13). 
   328 set the new increased counter, we will need to add the salt
       
   329 before hashing (this is done in Line 15). Our web-application
       
   330 will now store cookies like 
   323 
   331 
   324 \begin{figure}[p]
   332 \begin{figure}[p]
   325 \lstinputlisting{../progs/App3.js}
   333 \lstinputlisting{../progs/App4.js}
   326 \caption{\label{hashsalt}}
   334 \caption{\label{hashsalt}}
   327 \end{figure}
   335 \end{figure}
   328 
   336 
   329 
   337 %The corresponding attack is called \emph{dictionary
   330 
   338 %attack}\ldots hashes are not reversed by brute force
   331 
   339 %calculations, that is trying out all possible combinations.
   332 Note ....NYT 
   340 
       
   341 
       
   342 %We have to make sure the salt does not get known.
       
   343 
       
   344 
       
   345 %Note ....NYT 
   333 \end{document}
   346 \end{document}
   334 
   347 
   335 %%% Local Variables: 
   348 %%% Local Variables: 
   336 %%% mode: latex
   349 %%% mode: latex
   337 %%% TeX-master: t
   350 %%% TeX-master: t