sen-material: comparison handouts/ho01.tex


-:6ed7c9b8b291
+:55968b3205cc
 you need to have this kind of mindset and be able to think like
 an attacker. This will include understanding techniques that
 can be used to compromise security and privacy in systems.
 This will often result in insights where well-intended
 security mechanisms made a system actually less
-secure.\smallskip
+secure.\medskip
+\noindent
 {\Large\bf Warning!} However, don’t be evil! Using those
 techniques in the real world may violate the law or King’s
 rules, and it may be unethical. Under some circumstances, even
 probing for weaknesses of a system may result in severe
 penalties, up to and including expulsion, fines and
 tamper with any of King's systems. If you try out a technique,
 always make doubly sure you are working in a safe environment
 so that you cannot cause any harm, not even accidentally.
 Don't be evil. Be an ethical hacker.\medskip
-\noindent
+\noindent In this lecture I want to make you familiar with the
-In this lecture I want to make you familiar with the security mindset
+security mindset and dispel the myth that encryption is the
-and dispel the myth that encryption is the answer to all security
+answer to all security problems (it is certainly often a part
-problems (it is certainly often part of an answer, but almost always
+of an answer, but almost never a sufficient one). This
-never a sufficient one). This is actually an important thread going
+is actually an important thread going through the whole
-through the whole course: We will assume that encryption works
+course: We will assume that encryption works perfectly, but
-perfectly, but still attack ``things''. By ``works perfectly'' we mean
+still attack ``things''. By ``works perfectly'' we mean that
-that we will assume encryption is a black box and, for example, will
+we will assume encryption is a black box and, for example,
-not look at the underlying mathematics and break the
+will not look at the underlying mathematics and break the
 algorithms.\footnote{Fascinating though this might be.}
 For a secure system, it seems, four requirements need to come
 together: First a security policy (what is supposed to be
 achieved?); second a mechanism (cipher, access controls,
 stored on a chip on the card and a PIN number for
 authorisation. Even though the banks involved trumpeted their
 system as being absolutely secure and indeed fraud rates
 initially went down, security researchers were not convinced
 (especially the group around Ross Anderson). To begin with,
-the Chip-and-PIN system introduced a ``new player'' that
+the Chip-and-PIN system introduced a ``new player'' into the
-needed to be trusted: the PIN terminals and their
+system that needed to be trusted: the PIN terminals and their
 manufacturers. It was claimed that these terminals were
 tamper-resistant, but needless to say this was a weak link in
 the system, which criminals successfully attacked. Some
 terminals were even so skilfully manipulated that they
 transmitted skimmed PIN numbers via built-in mobile phone
 connections. To mitigate this flaw in the security of
-Chip-and-PIN, you need to vet quite closely the supply chain
+Chip-and-PIN, you need to be able to vet quite closely the
-of such terminals.
+supply chain of such terminals. This is something that is
+mostly beyond the control of customers who need to use these
-Later on Ross Anderson and his group were able to perform
+terminals.
-man-in-the-middle attacks against Chip-and-PIN. Essentially
-they made the terminal think the correct PIN was entered and
+To make matters worse for Chip-and-PIN, in around 2009 Ross
-the card think that a signature was used. This is a kind of
+Anderson and his group were able to perform man-in-the-middle
-\emph{protocol failure}. After discovery, the flaw was
+attacks against Chip-and-PIN. Essentially they made the
-mitigated by requiring that a link between the card and the
+terminal think the correct PIN was entered and the card think
-bank is established at every time the card is used. Even later
+that a signature was used. This is a kind of \emph{protocol
-this group found another problem with Chip-and-PIN and ATMs
+failure}. After discovery, the flaw was mitigated by requiring
-which did not generate random enough numbers (nonces) on which
+that a link between the card and the bank is established
-the security of the underlying protocols relies.
+every time the card is used. Even later this group found
+another problem with Chip-and-PIN and ATMs which did not
+generate sufficiently random numbers (nonces) on which the security
+of the underlying protocols relies.
 The problem with all this is that the banks who introduced
 Chip-and-PIN managed with the new system to shift the
 liability for any fraud and the burden of proof onto the
 customer. In the old system, the banks had to prove that the
 profits too much.
 Since banks managed to successfully claim that their
 Chip-and-PIN system is secure, they were under the new system
 able to point the finger at the customer when fraud occurred:
-customers must have been negligent loosing their PIN and they
+customers must have been negligent losing their PIN and they
 had almost no way of defending themselves in such situations.
 That is why the work of \emph{ethical} hackers like Ross
 Anderson's group was so important, because they and others
 established that the banks' claim that their system is secure
 and it must have been the customer's fault, was bogus. In 2009
-for example the law changed and the burden of proof went back
+the law changed and the burden of proof went back to the
-to the banks. They need to prove whether it was really the
+banks. They need to prove whether it was really the customer
-customer who used a card or not.
+who used a card or not.
 This is a classic example where a security design principle
 was violated: Namely, the one who is in the position to
 improve security, also needs to bear the financial losses if
 things go wrong. Otherwise, you end up with an insecure
 system. In case of the Chip-and-PIN system, no good security
-engineer would dare claim that it is secure beyond reproach:
+engineer would dare to claim that it is secure beyond
-the specification of the EMV protocol (underlying
+reproach: the specification of the EMV protocol (underlying
 Chip-and-PIN) is some 700 pages long, but still leaves out
 many things (like how to implement a good random number
 generator). No human being is able to scrutinise such a
 specification and ensure it contains no flaws. Moreover, banks
 can add their own sub-protocols to EMV. With all the
 experience we already have, it is as clear as day that
-criminals were eventually able to poke holes into it and
+criminals were bound to eventually be able to poke holes into
-measures need to be taken to address them. However, with how
+it and measures need to be taken to address them. However,
-the system was set up, the banks had no real incentive to come
+with how the system was set up, the banks had no real
-up with a system that is really secure. Getting the incentives
+incentive to come up with a system that is really secure.
-right in favour of security is often a tricky business. From a
+Getting the incentives right in favour of security is often a
-customer point of view the system was much less secure than
+tricky business. From a customer point of view, the
-the old signature-based method.
+Chip-and-PIN system was much less secure than the old
+signature-based method. The customer could now lose
+significant amounts of money.
 \subsection*{Of Cookies and Salts}
 Let's look at another example which will help with
 understanding how passwords should be verified and stored.
 will be from just looking at input that is ``close by''.
 We can use hashes in our web-application and store in the
 cookie the value of the counter in plain text but together
 with its hash. We need to store both pieces of data in such a
-way that we can extract them again later on (in the code below
+way that we can extract them again later on. In the code below
-I will just separate them using a \pcode{"-"}). If we now read
+I will just separate them using a \pcode{"-"}, for example
-back the cookie when the client visits our webpage, we can
-extract the counter, hash it again and compare the result to
+\begin{center}
-the stored hash value inside the cookie. If these hashes
+\pcode{1-356a192b7913b04c54574d18c28d46e6395428ab}
-disagree, then we can deduce that the cookie has been tampered
+\end{center}
-with. Unfortunately, if they agree, we can still not be
-entirely sure that not a clever hacker has tampered with the
+\noindent for the counter \pcode{1}. If we now read back the
-cookie. The reason is that the hacker can see the clear text
+cookie when the client visits our webpage, we can extract the
-part of the cookie, say \pcode{3}, and also its hash. It does
+counter, hash it again and compare the result to the stored
-not take much trial and error to find out that we used the
+hash value inside the cookie. If these hashes disagree, then
-SHA-1 hashing function and then the hacker can graft a cookie
+we can deduce that the cookie has been tampered with.
+Unfortunately, if they agree, we still cannot be entirely
+sure that a clever hacker has not tampered with the cookie.
+The reason is that the hacker can see the clear text part of
+the cookie, say \pcode{3}, and also its hash. It does not take
+much trial and error to find out that we used the SHA-1
+hashing function and then the hacker can craft a cookie
 accordingly. This is eased by the fact that for SHA-1 many
 strings and corresponding hash-values are precalculated. Type,
 for example, into Google the hash value for \pcode{"hello
 world"} and you will actually pretty quickly find that it was
-generated by input string \pcode{"hello world"}. This defeats
+generated by input string \pcode{"hello world"}. Similarly for
-the purpose of a hashing function and thus would not help us
+the hash-value for \pcode{1}. This defeats the purpose of a
-with our web-applications and later also not with how to store
+hashing function and thus would not help us with our
+web-applications and later also not with how to store
 passwords properly.
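As a minimal sketch of the unsalted scheme just described (this is not the
handout's own web-application code; Python is used only for illustration and
the names \pcode{make_cookie} and \pcode{check_cookie} are hypothetical), the
following builds a cookie of the form counter, \pcode{"-"}, hash and checks it
again. It also shows why the check is too weak: anybody who guesses that SHA-1
is used can produce a cookie that passes.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hash of a string as a hexadecimal string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def make_cookie(counter: int) -> str:
    """Store the counter in plain text together with its hash."""
    return f"{counter}-{sha1_hex(str(counter))}"


def check_cookie(cookie: str) -> bool:
    """Re-hash the plain-text counter and compare with the stored hash."""
    counter, _, stored_hash = cookie.partition("-")
    return sha1_hex(counter) == stored_hash


# A cookie produced by the server for the counter 1.
honest = make_cookie(1)

# Naive tampering: change the counter but keep the old hash.
tampered = "999-" + honest.split("-")[1]

# Informed tampering: the attacker recomputes the hash as well.
forged = "999-" + sha1_hex("999")

print(honest)
print("honest accepted:  ", check_cookie(honest))
print("tampered accepted:", check_cookie(tampered))
print("forged accepted:  ", check_cookie(forged))
```

The naive change is detected, but the forged cookie is accepted, because
nothing in the hash depends on information that only the server knows.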
 There is one ingredient missing, which happens to be called
 \emph{salts}. Salts are random keys, which are added to the
 counter before the hash is calculated. In our case we must
 keep the salt secret. As can be seen in Figure~\ref{hashsalt},
-we need to extract from the cookie the counter value and the
+we need to extract from the cookie the counter value and its
 hash (Lines 19 and 20). But before hashing the counter again
 (Line 22) we need to add the secret salt. Similarly, when we
 set the new increased counter, we will need to add the salt
 before hashing (this is done in Line 15). Our web-application
 will now store cookies like
 unbelievable that nowadays systems still do this with
 passwords in plain text. The idea behind such plain-text
 passwords is of course that if the user typed in
 \pcode{foobar} as password, we need to verify whether it
 matches with the password that is already stored for this user
-in the system. But doing this verification in plain text is
+in the system. Why not do this with plain-text passwords?
-really a bad idea. Unfortunately, evidence suggests, however,
+But doing this verification in plain text is really a bad
-it is still a widespread practice. I leave you to think about
+idea. Unfortunately, evidence suggests it is still a
-why verifying passwords in plain text is a bad idea.
+widespread practice. I leave you to think about why verifying
+passwords in plain text is a bad idea.
 Using hash functions, like in our web-application, we can do
 better. They mean we do not have to store passwords in
 plain text in order to verify whether a password matches or not.
 We can just hash the password and store the hash-value. And
 whenever the user types in a new password, well then we hash
 it again and check whether the hash-values agree. Just like
 in the web-application before.
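The idea can be sketched as follows. This is only an illustration under
stated assumptions: the in-memory dictionary \pcode{password_db}, the user
name \pcode{alice} and the function names are hypothetical, and plain SHA-1 is
used solely to match the examples in this handout.

```python
import hashlib

# Hypothetical in-memory "database": user name -> hash of the password.
password_db = {}


def hash_password(password: str) -> str:
    """Hash a password with SHA-1 (chosen only to match the text)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def register(user: str, password: str) -> None:
    """Store only the hash-value, never the plain-text password."""
    password_db[user] = hash_password(password)


def verify(user: str, attempt: str) -> bool:
    """Hash the typed-in password and compare it with the stored hash."""
    stored = password_db.get(user)
    return stored is not None and hash_password(attempt) == stored


register("alice", "foobar")
print(verify("alice", "foobar"))   # correct password
print(verify("alice", "foobaz"))   # wrong password
```

Note that the database never contains the string \pcode{foobar} itself, only
its hash-value.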
 Let's analyse what happens when a hacker gets hold of such a
-hashed password database. The hacker has then a list of user
+hashed password database. That is the scenario we want to
-names and associated hash-values, like
+defend against.\footnote{If we could assume our servers can
+never be broken into, then storing passwords in plain text
+would be no problem. The point, however, is that servers are
+never absolutely secure.} The hacker then has a list of user names and
+associated hash-values, like
 \begin{center}
 \pcode{urbanc:2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
 \end{center}
 \end{center}
 \noindent and so on, hash them and check whether they match
 with the hash-values in the database. Such brute force attacks
 are surprisingly effective. With modern technology (usually
-GPU graphic cards), passwords of moderate length only needs
+GPU graphics cards), passwords of moderate length only need
-seconds or hours to be cracked. Well the only defence we have
+seconds or hours to be cracked. Well, the only defence we have
-is to make passwords longer and force users to use the whole
+against such brute force attacks is to make passwords longer
-spectrum of letters and keys for passwords in order to make
+and force users to use the whole spectrum of letters and keys
-the search space to big for an effective brute force attack.
+for passwords. The hope is that this makes the search space
+too big for an effective brute force attack.
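To get a feeling for how the search space grows, here is a rough
back-of-the-envelope sketch. The guessing rate of $10^9$ hashes per second is
purely an assumption for illustration, as are the alphabet sizes (26 lower-case
letters versus the 94 printable ASCII characters without the space) and the
function name \pcode{search_space}.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of strings of exactly `length` characters over the alphabet."""
    return alphabet_size ** length


# Assumed attacker speed; real rates depend on hardware and hash function.
GUESSES_PER_SECOND = 10**9

for alphabet_size, length in [(26, 6), (26, 10), (94, 10)]:
    candidates = search_space(alphabet_size, length)
    seconds = candidates / GUESSES_PER_SECOND
    print(f"alphabet {alphabet_size:2d}, length {length:2d}: "
          f"{candidates:.3e} candidates, about {seconds:.3e} seconds")
```

Both a longer password and a larger alphabet increase the number of
candidates, but the length appears in the exponent.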
 Unfortunately, clever hackers have another ace up their
 sleeves. These are called \emph{dictionary attacks}. The idea
 behind dictionary attacks is the observation that only few
 people are competent enough to use sufficiently strong
 \pcode{qwerty},
 \pcode{letmein},
 \pcode{...}
 \end{center}
-\noindent So an attacker just needs to compile a list
+\noindent So an attacker just needs to compile a list as large
-as large as possible of such likely candidates of passwords
+as possible of such likely candidates of passwords and also
-and also compute their hash-values. Now if the attacker
+compute their hash-values. In contrast to a brute
-knows the hash-value of a password is
+force attack, where maybe $2^{80}$ many strings need to be
+considered, a dictionary attack might get away with checking
+only 10 Million (remember the English language ``only''
+contains 600,000 words). This is a drastic simplification for
+attackers. Now if the attacker knows the hash-value of a
+password is
 \begin{center}
 \pcode{5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8}
 \end{center}
 precompiled in the ``comfort of the hacker's home'' before an
 actual attack is launched. It just needs sufficient storage
 space, which nowadays is pretty cheap. A hacker might in this
 way not be able to crack all passwords in our database, but
 even being able to crack 50\% can be serious damage for a
-large company (because then you have to think how to make
+large company (because then you have to think about how to
-users to change their old passwords). And hackers are very
+make users change their old passwords, a major hassle).
-industrious in compiling these dictionaries: for example they
+And hackers are very industrious in compiling these
-definitely include variations like \pcode{passw0rd} and also
+dictionaries: for example they definitely include variations
-includes rules that cover cases like \pcode{passwordpassword}
+like \pcode{passw0rd} and also include rules that cover cases
-or \pcode{drowssap} (password reversed). Historically,
+like \pcode{passwordpassword} or \pcode{drowssap} (password
-compiling a list for a dictionary attack is not as simple as
+reversed). Historically, compiling a list for a dictionary
-it might seem. At the beginning only ``real'' dictionaries
+attack was not as simple as it might seem. At the beginning
-were available (like the Oxford English Dictionary), but such
+only ``real'' dictionaries were available (like the Oxford
-dictionaries are not ``optimised'' for the purpose of passwords.
+English Dictionary), but such dictionaries are not
-The first real hard date was obtained when a company called
+``optimised'' for the purpose of passwords. The first real
-RockYou ``lost'' 32 Million plain-text password. With this
+hard data about actually used passwords was obtained when a
-data of real-life passwords, dictionary attacks took off.
+company called RockYou ``lost'' 32 Million plain-text
+passwords. With this data of real-life passwords, dictionary
+attacks took off. Compiling such dictionaries is nowadays very
+easy with the help of off-the-shelf tools.
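A dictionary attack of this kind can be sketched in a few lines. The word
list, the variation rules and the function names below are hypothetical and
far smaller than anything a real attacker would use. The point is only that
the table is computed once, in advance, and that cracking a stolen hash-value
then amounts to a single lookup.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hash of a string as a hexadecimal string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def variations(word: str):
    """Yield the word plus a few simple rule-based variations of it."""
    yield word
    yield word.replace("o", "0")   # password -> passw0rd
    yield word + word              # password -> passwordpassword
    yield word[::-1]               # password -> drowssap


def build_dictionary(words):
    """Precompute a table from hash-value back to candidate password."""
    table = {}
    for word in words:
        for candidate in variations(word):
            table[sha1_hex(candidate)] = candidate
    return table


# Precompiled before the attack, from a tiny list of likely passwords.
table = build_dictionary(["password", "qwerty", "letmein", "123456"])

# A hash-value taken from a stolen (unsalted) password database.
stolen = "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8"
cracked = table.get(stolen)
print(cracked)
```

If the hash-value is not in the table, the lookup simply returns nothing and
the attacker moves on to the next entry of the database.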
 These dictionary attacks can be prevented by using salts.
 Remember a hacker needs to use the most likely candidates
-of passwords and calculate their has-value. If we add before
+of passwords and calculate their hash-value. If we add before
-hashing a password with a random salt, like \pcode{mPX2aq},
+hashing a password a random salt, like \pcode{mPX2aq},
 then the string \pcode{passwordmPX2aq} will almost certainly
 not be in the dictionary. Like in the web-application in the
-previous section a salt does not prevent us from verifying a
+previous section, a salt does not prevent us from verifying a
 password. We just need to add the salt whenever the password
 is typed in again.
-There is a question whether we should us a single random salt
+There is a question whether we should use a single random salt
 for every password in our database. A single salt would
 already make dictionary attacks considerably more difficult.
 It turns out, however, that in case of password databases
 every password should get its own salt. This salt is
 generated at the time when the password is first set.
 \begin{center}
 \pcode{urbanc:$6$3WWbKfr1$4vblknvGr6FcDeF92R5xFn3mskfdnEn...:...}
 \end{center}
-\noindent where the first part is the login-name, followed
+\noindent where the first part is the login-name, followed by
-by a field \pcode{$6$} which specifies which hash-function
+a field \pcode{$6$} which specifies which hash-function is
-is used. After that follows the salt \pcode{3WWbKfr1} and
+used. After that follows the salt \pcode{3WWbKfr1} and after
-after that the hash-value that is stored for the password plus
+that the hash-value that is stored for the password (which
-salt. I leave it to you to figure out how the password
+includes the salt). I leave it to you to figure out how the
-verification would need to work based on this data.
+password verification would need to work based on this data.
 There is a non-obvious benefit of using a separate salt for
 each password. Recall that \pcode{123456} is a popular
 password that is most likely used by several of your users
 (especially if the database contains millions of entries). If
 same for this password. So if a hacker is in the business of
 cracking as many passwords as possible, then it is a good idea
 to concentrate on those very popular passwords. This is not
 possible if each password gets its own salt: since we assume
 the salt is generated randomly, each version of \pcode{123456}
-will be associated with a different hash-value.
+will be associated with a different hash-value. This will
+make life harder for an attacker.
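The effect of per-password salts can be illustrated with the following
sketch. It is not the scheme used by the Unix password file: the salt is
simply appended to the password and hashed with SHA-1 to match the examples in
this handout, and the function names and the salt length are assumptions.

```python
import hashlib
import secrets


def hash_with_salt(password: str, salt: str) -> str:
    """Append the salt to the password and hash the result with SHA-1."""
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()


def new_entry(password: str):
    """Create a (salt, hash) pair with a fresh random salt."""
    salt = secrets.token_hex(8)    # 16 random hexadecimal characters
    return salt, hash_with_salt(password, salt)


def verify(entry, attempt: str) -> bool:
    """Re-hash the attempt with the stored salt and compare the hashes."""
    salt, stored_hash = entry
    return hash_with_salt(attempt, salt) == stored_hash


# Two users who both chose the same popular password.
entry_a = new_entry("123456")
entry_b = new_entry("123456")

print(entry_a)
print(entry_b)
print(verify(entry_a, "123456"), verify(entry_b, "123456"))
```

Although both users chose the same password, the two stored hash-values
differ, so cracking one entry does not reveal the other.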
 Note another interesting point. The web-application from the
 previous section was only secure when the salt was secret. In
 the password case, this is not needed. The salt can be public
-as shown above and is actually stored as part of the password
+as shown above in the Unix password file where it is actually
-entry. Knowing the salt does not give the attacker any
+stored as part of the password entry. Knowing the salt does
-advantage, but prevents that dictionaries can be precompiled.
+not give the attacker any advantage, but it prevents
-The moral is that you should never store passwords in plain
+dictionaries from being precompiled. The moral is that you should
-text. Never ever.
+never store passwords in plain text. Never ever.
 \end{document}
 %%% Local Variables:
 %%% mode: latex

changeset 184	55968b3205cc
parent 183	6ed7c9b8b291
child 185	f10d905e947f