updated
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Fri, 26 Sep 2014 12:14:41 +0100
changeset 184 55968b3205cc
parent 183 6ed7c9b8b291
child 185 f10d905e947f
handouts/ho01.pdf
handouts/ho01.tex
Binary file handouts/ho01.pdf has changed
--- a/handouts/ho01.tex	Fri Sep 26 10:01:46 2014 +0100
+++ b/handouts/ho01.tex	Fri Sep 26 12:14:41 2014 +0100
@@ -49,8 +49,9 @@
 can be used to compromise security and privacy in systems.
 This will many times result in insights where well-intended
 security mechanisms made a system actually less
-secure.\smallskip
+secure.\medskip
 
+\noindent 
 {\Large\bf Warning!} However, don’t be evil! Using those
 techniques in the real world may violate the law or King’s
 rules, and it may be unethical. Under some circumstances, even
@@ -64,15 +65,15 @@
 so that you cannot cause any harm, not even accidentally.
 Don't be evil. Be an ethical hacker.\medskip
 
-\noindent
-In this lecture I want to make you familiar with the security mindset
-and dispel the myth that encryption is the answer to all security
-problems (it is certainly often part of an answer, but almost always
-never a sufficient one). This is actually an important thread going
-through the whole course: We will assume that encryption works
-perfectly, but still attack ``things''. By ``works perfectly'' we mean
-that we will assume encryption is a black box and, for example, will
-not look at the underlying mathematics and break the 
+\noindent In this lecture I want to make you familiar with the
+security mindset and dispel the myth that encryption is the
+answer to all security problems (it is certainly often a part
+of an answer, but almost never a sufficient one). This
+is actually an important thread going through the whole
+course: We will assume that encryption works perfectly, but
+still attack ``things''. By ``works perfectly'' we mean that
+we will assume encryption is a black box and, for example,
+will not look at the underlying mathematics and break the
 algorithms.\footnote{Though fascinating this might be.}
  
 For a secure system, it seems, four requirements need to come
@@ -108,27 +109,30 @@
 system as being absolutely secure and indeed fraud rates
 initially went down, security researchers were not convinced
 (especially the group around Ross Anderson). To begin with,
-the Chip-and-PIN system introduced a ``new player'' that
-needed to be trusted: the PIN terminals and their
+the Chip-and-PIN system introduced a ``new player'' into the
+system that needed to be trusted: the PIN terminals and their
 manufacturers. It was claimed that these terminals were
 tamper-resistant, but needless to say this was a weak link in
 the system, which criminals successfully attacked. Some
 terminals were even so skilfully manipulated that they
 transmitted skimmed PIN numbers via built-in mobile phone
 connections. To mitigate this flaw in the security of
-Chip-and-PIN, you need to vet quite closely the supply chain
-of such terminals.
+Chip-and-PIN, you need to be able to vet quite closely the
+supply chain of such terminals. This is something that is
+mostly beyond the control of customers who need to use these
+terminals.
 
-Later on Ross Anderson and his group were able to perform
-man-in-the-middle attacks against Chip-and-PIN. Essentially
-they made the terminal think the correct PIN was entered and
-the card think that a signature was used. This is a kind of
-\emph{protocol failure}. After discovery, the flaw was
-mitigated by requiring that a link between the card and the
-bank is established at every time the card is used. Even later
-this group found another problem with Chip-and-PIN and ATMs
-which did not generate random enough numbers (nonces) on which
-the security of the underlying protocols relies. 
+To make matters worse for Chip-and-PIN, around 2009 Ross
+Anderson and his group were able to perform man-in-the-middle
+attacks against it. Essentially they made the
+terminal think the correct PIN was entered and the card think
+that a signature was used. This is a kind of \emph{protocol
+failure}. After discovery, the flaw was mitigated by requiring
+that a link between the card and the bank is established
+every time the card is used. Even later this group found
+another problem with Chip-and-PIN and ATMs which did not
+generate random enough numbers (nonces) on which the security
+of the underlying protocols relies. 
 
 The problem with all this is that the banks who introduced
 Chip-and-PIN managed with the new system to shift the
@@ -144,36 +148,38 @@
 Since banks managed to successfully claim that their
 Chip-and-PIN system is secure, they were under the new system
 able to point the finger at the customer when fraud occurred:
-customers must have been negligent loosing their PIN and they
+customers must have been negligent losing their PIN and they
 had almost no way of defending themselves in such situations.
 That is why the work of \emph{ethical} hackers like Ross
 Anderson's group was so important, because they and others
 established that the bank's claim that their system is secure
 and it must have been the customer's fault, was bogus. In 2009
-for example the law changed and the burden of proof went back
-to the banks. They need to prove whether it was really the
-customer who used a card or not.
+the law changed and the burden of proof went back to the
+banks. They need to prove whether it was really the customer
+who used a card or not.
 
 This is a classic example where a security design principle
 was violated: Namely, the one who is in the position to
 improve security, also needs to bear the financial losses if
 things go wrong. Otherwise, you end up with an insecure
 system. In case of the Chip-and-PIN system, no good security
-engineer would dare claim that it is secure beyond reproach:
-the specification of the EMV protocol (underlying
+engineer would dare to claim that it is secure beyond
+reproach: the specification of the EMV protocol (underlying
 Chip-and-PIN) is some 700 pages long, but still leaves out
 many things (like how to implement a good random number
 generator). No human being is able to scrutinise such a
 specification and ensure it contains no flaws. Moreover, banks
 can add their own sub-protocols to EMV. With all the
 experience we already have, it is as clear as day that
-criminals were eventually able to poke holes into it and
-measures need to be taken to address them. However, with how
-the system was set up, the banks had no real incentive to come
-up with a system that is really secure. Getting the incentives
-right in favour of security is often a tricky business. From a
-customer point of view the system was much less secure than
-the old signature-based method.
+criminals were bound to eventually poke holes into
+it and that measures would need to be taken to address them. However,
+with how the system was set up, the banks had no real
+incentive to come up with a system that is really secure.
+Getting the incentives right in favour of security is often a
+tricky business. From a customer's point of view, the
+Chip-and-PIN system was much less secure than the old
+signature-based method. The customer could now lose
+significant amounts of money.
 
 \subsection*{Of Cookies and Salts}
 
@@ -310,25 +316,32 @@
 We can use hashes in our web-application and store in the
 cookie the value of the counter in plain text but together
 with its hash. We need to store both pieces of data in such a
-way that we can extract them again later on (in the code below
-I will just separate them using a \pcode{"-"}). If we now read
-back the cookie when the client visits our webpage, we can
-extract the counter, hash it again and compare the result to
-the stored hash value inside the cookie. If these hashes
-disagree, then we can deduce that the cookie has been tampered
-with. Unfortunately, if they agree, we can still not be
-entirely sure that not a clever hacker has tampered with the
-cookie. The reason is that the hacker can see the clear text
-part of the cookie, say \pcode{3}, and also its hash. It does
-not take much trial and error to find out that we used the
-SHA-1 hashing function and then the hacker can graft a cookie
+way that we can extract them again later on. In the code below
+I will just separate them using a \pcode{"-"}, for example
+
+\begin{center}
+\pcode{1-356a192b7913b04c54574d18c28d46e6395428ab}
+\end{center}
+
+\noindent for the counter \pcode{1}. If we now read back the
+cookie when the client visits our webpage, we can extract the
+counter, hash it again and compare the result to the stored
+hash value inside the cookie. If these hashes disagree, then
+we can deduce that the cookie has been tampered with.
+Unfortunately, if they agree, we still cannot be entirely
+sure that a clever hacker has not tampered with the cookie.
+The reason is that the hacker can see the clear text part of
+the cookie, say \pcode{3}, and also its hash. It does not take
+much trial and error to find out that we used the SHA-1
+hashing function and then the hacker can craft a cookie
 accordingly. This is eased by the fact that for SHA-1 many
 strings and corresponding hash-values are precalculated. Type,
 for example, into Google the hash value for \pcode{"hello
 world"} and you will actually pretty quickly find that it was
-generated by input string \pcode{"hello world"}. This defeats
-the purpose of a hashing function and thus would not help us
-with our web-applications and later also not with how to store
+generated by input string \pcode{"hello world"}. Similarly for
+the hash-value for \pcode{1}. This defeats the purpose of a
+hashing function and thus would not help us with our
+web-applications and later also not with how to store
 passwords properly. 
 
 
@@ -336,7 +349,7 @@
 \emph{salts}. Salts are random keys, which are added to the
 counter before the hash is calculated. In our case we must
 keep the salt secret. As can be seen in Figure~\ref{hashsalt},
-we need to extract from the cookie the counter value and the
+we need to extract from the cookie the counter value and its
 hash (Lines 19 and 20). But before hashing the counter again
 (Line 22) we need to add the secret salt. Similarly, when we
 set the new increased counter, we will need to add the salt
@@ -396,10 +409,11 @@
 passwords is of course that if the user typed in
 \pcode{foobar} as password, we need to verify whether it
 matches with the password that is already stored for this user
-in the system. But doing this verification in plain text is
-really a bad idea. Unfortunately, evidence suggests, however,
-it is still a widespread practice. I leave you to think about
-why verifying passwords in plain text is a bad idea.
+in the system. Why not do this with plain-text passwords?
+Doing this verification in plain text is really a bad
+idea. Unfortunately, evidence suggests it is still a
+widespread practice. I leave you to think about why verifying
+passwords in plain text is a bad idea.
 
 Using hash functions, like in our web-application, we can do
 better. They allow us to avoid having to store passwords in
@@ -410,8 +424,12 @@
 in the web-application before.
 
 Let's analyse what happens when a hacker gets hold of such a
-hashed password database. The hacker has then a list of user
-names and associated hash-values, like 
+hashed password database. That is the scenario we want to
+defend against.\footnote{If we could assume our servers can
+never be broken into, then storing passwords in plain text
+would be no problem. The point, however, is that servers are
+never absolutely secure.} The hacker then has a list of user names and
+associated hash-values, like 
 
 \begin{center}
 \pcode{urbanc:2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
@@ -438,11 +456,12 @@
 \noindent and so on, hash them and check whether they match
 with the hash-values in the database. Such brute force attacks
 are surprisingly effective. With modern technology (usually
-GPU graphic cards), passwords of moderate length only needs
-seconds or hours to be cracked. Well the only defence we have
-is to make passwords longer and force users to use the whole
-spectrum of letters and keys for passwords in order to make
-the search space to big for an effective brute force attack.
+GPU graphics cards), passwords of moderate length only need
+seconds or hours to be cracked. Well, the only defence we have
+against such brute force attacks is to make passwords longer
+and force users to use the whole spectrum of letters and keys
+for passwords. The hope is that this makes the search space
+too big for an effective brute force attack.
 
 Unfortunately, clever hackers have another ace up their
 sleeves. These are called \emph{dictionary attacks}. The idea
@@ -458,10 +477,15 @@
 \pcode{...}
 \end{center}
 
-\noindent So an attacker just needs to compile a list
-as large as possible of such likely candidates of passwords
-and also compute their hash-values. Now if the attacker
-knows the hash-value of a password is
+\noindent So an attacker just needs to compile a list as large
+as possible of such likely candidates of passwords and also
+compute their hash-values. In contrast to a brute
+force attack, where maybe $2^{80}$ strings need to be
+considered, a dictionary attack might get away with checking
+only 10 million (remember the English language ``only''
+contains 600,000 words). This is a drastic simplification for
+attackers. Now if the attacker knows the hash-value of a
+password is
 
 \begin{center}
 \pcode{5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8}
@@ -475,31 +499,34 @@
 space, which nowadays is pretty cheap. A hacker might in this
 way not be able to crack all passwords in our database, but
 even being able to crack 50\% can be serious damage for a
-large company (because then you have to think how to make
-users to change their old passwords). And hackers are very
-industrious in compiling these dictionaries: for example they
-definitely include variations like \pcode{passw0rd} and also
-includes rules that cover cases like \pcode{passwordpassword}
-or \pcode{drowssap} (password reversed). Historically,
-compiling a list for a dictionary attack is not as simple as
-it might seem. At the beginning only ``real'' dictionaries
-were available (like the Oxford English Dictionary), but such
-dictionaries are not ``optimised'' for the purpose of passwords.
-The first real hard date was obtained when a company called
-RockYou ``lost'' 32 Million plain-text password. With this
-data of real-life passwords, dictionary attacks took off.
+large company (because then you have to think about how to
+make users change their old passwords, a major hassle).
+And hackers are very industrious in compiling these
+dictionaries: for example they definitely include variations
+like \pcode{passw0rd} and also include rules that cover cases
+like \pcode{passwordpassword} or \pcode{drowssap} (password
+reversed). Historically, compiling a list for a dictionary
+attack was not as simple as it might seem. At the beginning
+only ``real'' dictionaries were available (like the Oxford
+English Dictionary), but such dictionaries are not
+``optimised'' for the purpose of passwords. The first real
+hard data about actually used passwords was obtained when a
+company called RockYou ``lost'' 32 million plain-text
+passwords. With this data of real-life passwords, dictionary
+attacks took off. Compiling such dictionaries is nowadays very
+easy with the help of off-the-shelf tools.
 
 These dictionary attacks can be prevented by using salts.
 Remember a hacker needs to use the most likely candidates 
-of passwords and calculate their has-value. If we add before
-hashing a password with a random salt, like \pcode{mPX2aq},
+of passwords and calculate their hash-value. If we add a random
+salt, like \pcode{mPX2aq}, to a password before hashing it,
 then the string \pcode{passwordmPX2aq} will almost certainly 
 not be in the dictionary. Like in the web-application in the
-previous section a salt does not prevent us from verifying a 
+previous section, a salt does not prevent us from verifying a 
 password. We just need to add the salt whenever the password 
 is typed in again. 
 
-There is a question whether we should us a single random salt
+There is a question whether we should use a single random salt
 for every password in our database. A single salt would
 already make dictionary attacks considerably more difficult.
 It turns out, however, that in case of password databases
@@ -511,12 +538,12 @@
 \pcode{urbanc:$6$3WWbKfr1$4vblknvGr6FcDeF92R5xFn3mskfdnEn...:...}
 \end{center}
 
-\noindent where the first part is the login-name, followed
-by a field \pcode{$6$} which specifies which hash-function
-is used. After that follows the salt \pcode{3WWbKfr1} and 
-after that the hash-value that is stored for the password plus 
-salt. I leave it to you to figure out how the password 
-verification would need to work based on this data.
+\noindent where the first part is the login-name, followed by
+a field \pcode{$6$} which specifies which hash-function is
+used. After that follows the salt \pcode{3WWbKfr1} and after
+that the hash-value that is stored for the password (which
+includes the salt). I leave it to you to figure out how the
+password verification would need to work based on this data.
 
 There is a non-obvious benefit of using a separate salt for
 each password. Recall that \pcode{123456} is a popular
@@ -528,16 +555,17 @@
 to concentrate on those very popular passwords. This is not
 possible if each password gets its own salt: since we assume
 the salt is generated randomly, each version of \pcode{123456}
-will be associated with a different hash-value.  
+will be associated with a different hash-value. This will
+make life harder for an attacker.
 
 Note another interesting point. The web-application from the
 previous section was only secure when the salt was secret. In
 the password case, this is not needed. The salt can be public
-as shown above and is actually stored as part of the password
-entry. Knowing the salt does not give the attacker any
-advantage, but prevents that dictionaries can be precompiled.
-The moral is that you should never store passwords in plain 
-text. Never ever.
+as shown above in the Unix password file, where it is actually
+stored as part of the password entry. Knowing the salt does
+not give the attacker any advantage, but it prevents
+dictionaries from being precompiled. The moral is that you should
+never store passwords in plain text. Never ever.
 
 \end{document}
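
The two uses of salts described in the handout text above (a secret salt for the counter cookie, and a public per-user salt for stored passwords) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the handout's own code: Python is chosen purely for the example, all function names are hypothetical, the salt is assumed to be appended after the value being hashed (as in \pcode{passwordmPX2aq}), and the \pcode{user:salt:hash} entry layout is a simplification of the Unix format shown in the diff. SHA-1 is kept only because the handout's example hash-values use it.

```python
import hashlib
import hmac
import secrets


def sha1_hex(text: str) -> str:
    """Hex-encoded SHA-1 of a string (matches the hash-values in the handout)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# --- Cookie case: the salt must stay secret on the server -----------------

def make_cookie(counter: int, secret_salt: str) -> str:
    """Build 'counter-hash', hashing the counter together with the secret salt."""
    return f"{counter}-{sha1_hex(str(counter) + secret_salt)}"


def read_cookie(cookie: str, secret_salt: str):
    """Return the counter if the hash matches, or None if the cookie was tampered with."""
    counter_text, _, stored_hash = cookie.partition("-")
    if not counter_text.isdigit():
        return None
    expected = sha1_hex(counter_text + secret_salt)
    if hmac.compare_digest(expected, stored_hash):
        return int(counter_text)
    return None


# --- Password case: each user gets a random salt, stored in the clear -----

def new_entry(user: str, password: str) -> str:
    """Create a 'user:salt:hash' entry (hypothetical layout; user must not contain ':')."""
    salt = secrets.token_hex(8)
    return f"{user}:{salt}:{sha1_hex(password + salt)}"


def verify(entry: str, password: str) -> bool:
    """Re-hash the typed password with the stored salt and compare to the stored hash."""
    _user, salt, stored_hash = entry.split(":")
    return hmac.compare_digest(sha1_hex(password + salt), stored_hash)


if __name__ == "__main__":
    # Without a salt, the cookie hash for counter 1 is the well-known SHA-1 of "1".
    print("1-" + sha1_hex("1"))
    # With per-user salts, two users with the same password get different entries.
    print(new_entry("alice", "123456"))
    print(new_entry("bob", "123456"))
```

Without the secret salt, anybody can recompute the hash for a chosen counter and forge a cookie; with per-user salts, identical passwords no longer share a hash-value, so a precompiled dictionary is of no use. A real system would most likely replace SHA-1 with a deliberately slow password-hashing function, which is outside what the handout text covers here.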