diff -r 3391a4fc3533 -r d58f8e3e78a5 handouts/ho01.tex --- a/handouts/ho01.tex Fri Jun 01 15:46:34 2018 +0100 +++ b/handouts/ho01.tex Sat Jun 09 21:01:46 2018 +0100 @@ -6,21 +6,19 @@ \begin{document} -\fnote{\copyright{} Christian Urban, -King's College London, 2014, 2015, 2016} +\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016} % passwords at dropbox -%%https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/ +% %https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/ -%Ross anderson -%https://youtu.be/FY2YKxBxOkg +%Ross anderson https://youtu.be/FY2YKxBxOkg %http://www.scmagazineuk.com/amazon-launches-open-source-tls-implementation-s2n/article/424360/ %Singapurs Behörden gehen offline % how to store passwords -%https://nakedsecurity.sophos.com/2013/11/20/serious-security-how-to-store-your-users-passwords-safely/ +% https://nakedsecurity.sophos.com/2013/11/20/serious-security-how-to-store-your-users-passwords-safely/ %hashes %http://web.archive.org/web/20071226014140/http://www.cits.rub.de/MD5Collisions/ @@ -35,416 +33,372 @@ % https://nakedsecurity.sophos.com/2015/10/26/the-internet-of-things-stop-the-things-i-want-to-get-off/ % cloning creditc cards and passports -%https://www.youtube.com/watch?v=-4_on9zj-zs +% https://www.youtube.com/watch?v=-4_on9zj-zs \section*{Handout 1 (Security Engineering)} - -Much of the material and inspiration in this module is taken -from the works of Bruce Schneier, Ross Anderson and Alex -Halderman. I think they are the world experts in the area of -security engineering. I especially like that they argue that a -security engineer requires a certain \emph{security mindset}. -Bruce Schneier for example writes: +Much of the material and inspiration in this module is taken from the +works of Bruce Schneier, Ross Anderson and Alex Halderman. I think they +are the world experts in the area of security engineering. I especially +like that they argue that a security engineer requires a certain +\emph{security mindset}. Bruce Schneier for example writes: \begin{quote} -\it ``Security engineers --- at least the good ones --- see -the world differently. They can't walk into a store without -noticing how they might shoplift. They can't use a computer -without wondering about the security vulnerabilities. They -can't vote without trying to figure out how to vote twice. -They just can't help it.'' +\it ``Security engineers --- at least the good ones --- see the world +differently. They can't walk into a store without noticing how they +might shoplift. They can't use a computer without wondering about the +security vulnerabilities. They can't vote without trying to figure out +how to vote twice. They just can't help it.'' \end{quote} \noindent and \begin{quote} -\it ``Security engineering\ldots requires you to think -differently. You need to figure out not how something works, -but how something can be made to not work. You have to imagine -an intelligent and malicious adversary inside your system -\ldots, constantly trying new ways to -subvert it. You have to consider all the ways your system can -fail, most of them having nothing to do with the design -itself. You have to look at everything backwards, upside down, -and sideways. You have to think like an alien.'' +\it ``Security engineering\ldots requires you to think differently. You +need to figure out not how something works, but how something can be +made to not work. You have to imagine an intelligent and malicious +adversary inside your system \ldots, constantly trying new ways to +subvert it. You have to consider all the ways your system can fail, most +of them having nothing to do with the design itself. You have to look at +everything backwards, upside down, and sideways. You have to think like +an alien.'' \end{quote} -\noindent In this module I like to teach you this security -mindset. This might be a mindset that you think is very -foreign to you---after all we are all good citizens and do not -hack into things. However, I beg to differ: You have this -mindset already when in school you were thinking, at least -hypothetically, about ways in which you can cheat in an exam -(whether it is by hiding notes or by looking over the -shoulders of your fellow pupils). Right? To defend a system, -you need to have this kind of mindset and be able to think -like an attacker. This will include understanding techniques -that can be used to compromise security and privacy in -systems. This will many times result in insights where +\noindent In this module I like to teach you this security mindset. This +might be a mindset that you think is very foreign to you---after all we +are all good citizens and do not hack into things. However, I beg to +differ: You have this mindset already when in school you were thinking, +at least hypothetically, about ways in which you can cheat in an exam +(whether it is by hiding notes or by looking over the shoulders of your +fellow pupils). Right? To defend a system, you need to have this kind of +mindset and be able to think like an attacker. This will include +understanding techniques that can be used to compromise security and +privacy in systems. This will many times result in insights where well-intended security mechanisms made a system actually less secure.\medskip \noindent -{\Large\bf Warning!} However, don’t be evil! Using those -techniques in the real world may violate the law or King’s -rules, and it may be unethical. Under some circumstances, even -probing for weaknesses of a system may result in severe -penalties, up to and including expulsion, fines and -jail time. Acting lawfully and ethically is your -responsibility. Ethics requires you to refrain from doing -harm. Always respect privacy and rights of others. Do not -tamper with any of King's systems. If you try out a technique, -always make doubly sure you are working in a safe environment -so that you cannot cause any harm, not even accidentally. -Don't be evil. Be an ethical hacker.\medskip +{\Large\bf Warning!} However, don’t be evil! Using those techniques in +the real world may violate the law or King’s rules, and it may be +unethical. Under some circumstances, even probing for weaknesses of a +system may result in severe penalties, up to and including expulsion, +fines and jail time. Acting lawfully and ethically is your +responsibility. Ethics requires you to refrain from doing harm. Always +respect privacy and rights of others. Do not tamper with any of King's +systems. If you try out a technique, always make doubly sure you are +working in a safe environment so that you cannot cause any harm, not +even accidentally. Don't be evil. Be an ethical hacker.\medskip -\noindent In this lecture I want to make you familiar with the -security mindset and dispel the myth that encryption is the -answer to all security problems (it is certainly often a part -of an answer, but almost always never a sufficient one). This -is actually an important thread going through the whole -course: We will assume that encryption works perfectly, but -still attack ``things''. By ``works perfectly'' we mean that -we will assume encryption is a black box and, for example, -will not look at the underlying mathematics and break the +\noindent In this lecture I want to make you familiar with the security +mindset and dispel the myth that encryption is the answer to all +security problems (it is certainly often a part of an answer, but almost +always never a sufficient one). This is actually an important thread +going through the whole course: We will assume that encryption works +perfectly, but still attack ``things''. By ``works perfectly'' we mean +that we will assume encryption is a black box and, for example, will not +look at the underlying mathematics and break the algorithms.\footnote{Though fascinating this might be.} -For a secure system, it seems, four requirements need to come -together: First a security policy (what is supposed to be -achieved?); second a mechanism (cipher, access controls, -tamper resistance etc); third the assurance we obtain from the -mechanism (the amount of reliance we can put on the mechanism) -and finally the incentives (the motive that the people -guarding and maintaining the system have to do their job -properly, and also the motive that the attackers have to try -to defeat your policy). The last point is often overlooked, -but plays an important role. To illustrate this let's look at -an example. +For a secure system, it seems, four requirements need to come together: +First a security policy (what is supposed to be achieved?); second a +mechanism (cipher, access controls, tamper resistance etc); third the +assurance we obtain from the mechanism (the amount of reliance we can +put on the mechanism) and finally the incentives (the motive that the +people guarding and maintaining the system have to do their job +properly, and also the motive that the attackers have to try to defeat +your policy). The last point is often overlooked, but plays an important +role. To illustrate this let's look at an example. \subsubsection*{Chip-and-PIN is Surely More Secure, No?} -The questions is whether the Chip-and-PIN system used with -modern credit cards is more secure than the older method of -signing receipts at the till? On first glance the answer seems -obvious: Chip-and-PIN must be more secure and indeed improved -security was the central plank in the ``marketing speak'' of -the banks behind Chip-and-PIN. The earlier system was based on -a magnetic stripe or a mechanical imprint on the cards and -required customers to sign receipts at the till whenever they -bought something. This signature authorised the transactions. -Although in use for a long time, this system had some crucial -security flaws, including making clones of credit cards and -forging signatures. +The questions is whether the Chip-and-PIN system used with modern credit +cards is more secure than the older method of signing receipts at the +till? On first glance the answer seems obvious: Chip-and-PIN must be +more secure and indeed improved security was the central plank in the +``marketing speak'' of the banks behind Chip-and-PIN. The earlier system +was based on a magnetic stripe or a mechanical imprint on the cards and +required customers to sign receipts at the till whenever they bought +something. This signature authorised the transactions. Although in use +for a long time, this system had some crucial security flaws, including +making clones of credit cards and forging signatures. -Chip-and-PIN, as the name suggests, relies on data being -stored on a chip on the card and a PIN number for -authorisation. Even though the banks involved trumpeted their -system as being absolutely secure and indeed fraud rates -initially went down, security researchers were not convinced -(especially not the group around Ross -Anderson).\footnote{Actually, historical data about fraud -showed that first fraud rates went up (while early problems to -do with the introduction of Chip-and-PIN we exploited), then -down, but recently up again (because criminals getting more -familiar with the technology and how it can be exploited).} To begin with, the -Chip-and-PIN system introduced a ``new player'' into the -system that needed to be trusted: the PIN terminals and their -manufacturers. It was claimed that these terminals were -tamper-resistant, but needless to say this was a weak link in -the system, which criminals successfully attacked. Some -terminals were even so skilfully manipulated that they -transmitted skimmed PIN numbers via built-in mobile phone -connections. To mitigate this flaw in the security of -Chip-and-PIN, you need to be able to vet quite closely the -supply chain of such terminals. This is something that is -mostly beyond the control of customers who need to use these +Chip-and-PIN, as the name suggests, relies on data being stored on a +chip on the card and a PIN number for authorisation. Even though the +banks involved trumpeted their system as being absolutely secure and +indeed fraud rates initially went down, security researchers were not +convinced (especially not the group around Ross +Anderson).\footnote{Actually, historical data about fraud showed that +first fraud rates went up (while early problems to do with the +introduction of Chip-and-PIN we exploited), then down, but recently up +again (because criminals getting more familiar with the technology and +how it can be exploited).} To begin with, the Chip-and-PIN system +introduced a ``new player'' into the system that needed to be trusted: +the PIN terminals and their manufacturers. It was claimed that these +terminals were tamper-resistant, but needless to say this was a weak +link in the system, which criminals successfully attacked. Some +terminals were even so skilfully manipulated that they transmitted +skimmed PIN numbers via built-in mobile phone connections. To mitigate +this flaw in the security of Chip-and-PIN, you need to be able to vet +quite closely the supply chain of such terminals. This is something that +is mostly beyond the control of customers who need to use these terminals. -To make matters worse for Chip-and-PIN, around 2009 Ross -Anderson and his group were able to perform man-in-the-middle -attacks against Chip-and-PIN. Essentially they made the -terminal think the correct PIN was entered and the card think -that a signature was used. This is a kind of \emph{protocol -failure}. After discovery, the flaw was mitigated by requiring -that a link between the card and the bank is established at -every time the card is used. Even later this group found -another problem with Chip-and-PIN and ATMs which did not -generate random enough numbers (cryptographic nonces) on which -the security of the underlying protocols relies. +To make matters worse for Chip-and-PIN, around 2009 Ross Anderson and +his group were able to perform man-in-the-middle attacks against +Chip-and-PIN. Essentially they made the terminal think the correct PIN +was entered and the card think that a signature was used. This is a kind +of \emph{protocol failure}. After discovery, the flaw was mitigated by +requiring that a link between the card and the bank is established at +every time the card is used. Even later this group found another problem +with Chip-and-PIN and ATMs which did not generate random enough numbers +(cryptographic nonces) on which the security of the underlying protocols +relies. -The overarching problem with all this is that the banks who -introduced Chip-and-PIN managed with the new system to shift -the liability for any fraud and the burden of proof onto the -customer. In the old system, the banks had to prove that the -customer used the card, which they often did not bother with. -In effect, if fraud occurred the customers were either +The overarching problem with all this is that the banks who introduced +Chip-and-PIN managed with the new system to shift the liability for any +fraud and the burden of proof onto the customer. In the old system, the +banks had to prove that the customer used the card, which they often did +not bother with. In effect, if fraud occurred the customers were either refunded fully or lost only a small amount of money. This -taking-responsibility-of-potential-fraud was part of the -``business plan'' of the banks and did not reduce their -profits too much. +taking-responsibility-of-potential-fraud was part of the ``business +plan'' of the banks and did not reduce their profits too much. + -Since banks managed to successfully claim that their -Chip-and-PIN system is secure, they were under the new system -able to point the finger at the customer when fraud occurred: -customers must have been negligent losing their PIN and -customers had almost no way of defending themselves in such -situations. That is why the work of \emph{ethical} hackers -like Ross Anderson's group is so important, because they and -others established that the banks' claim that their system is -secure and it must have been the customer's fault, was bogus. -In 2009 the law changed and the burden of proof went back to -the banks. They need to prove whether it was really the -customer who used a card or not. The current state of affairs, -however, is that standing up for your right requires you to be -knowledgeable, potentially having to go to court\ldots{}if +Since banks managed to successfully claim that their Chip-and-PIN system +is secure, they were under the new system able to point the finger at +the customer when fraud occurred: customers must have been negligent +losing their PIN and customers had almost no way of defending themselves +in such situations. That is why the work of \emph{ethical} hackers like +Ross Anderson's group is so important, because they and others +established that the banks' claim that their system is secure and it +must have been the customer's fault, was bogus. In 2009 the law changed +and the burden of proof went back to the banks. They need to prove +whether it was really the customer who used a card or not. The current +state of affairs, however, is that standing up for your right requires +you to be knowledgeable, potentially having to go to court\ldots{}if not, the banks are happy to take advantage of you. -This is a classic example where a security design principle -was violated: Namely, the one who is in the position to -improve security, also needs to bear the financial losses if -things go wrong. Otherwise, you end up with an insecure -system. In case of the Chip-and-PIN system, no good security -engineer would dare to claim that it is secure beyond -reproach: the specification of the EMV protocol (underlying -Chip-and-PIN) is some 700 pages long, but still leaves out -many things (like how to implement a good random number -generator). No human being is able to scrutinise such a -specification and ensure it contains no flaws. Moreover, banks -can add their own sub-protocols to EMV. With all the -experience we already have, it is as clear as day that -criminals were bound to eventually be able to poke holes into -it and measures need to be taken to address them. However, -with how the system was set up, the banks had no real -incentive to come up with a system that is really secure. -Getting the incentives right in favour of security is often a -tricky business. From a customer point of view, the -Chip-and-PIN system was much less secure than the old -signature-based method. The customer could now lose -significant amounts of money. +This is a classic example where a fundamental security design principle +was violated: Namely, the one who is in the position to improve +security, also needs to bear the financial losses if things go wrong. +Otherwise, you end up with an insecure system. In case of the +Chip-and-PIN system, no good security engineer would dare to claim that +it is secure beyond reproach: the specification of the EMV protocol +(underlying Chip-and-PIN) is some 700 pages long, but still leaves out +many things (like how to implement a good random number generator). No +human being is able to scrutinise such a specification and ensure it +contains no flaws. Moreover, banks can add their own sub-protocols to +EMV. With all the experience we already have, it is as clear as day that +criminals were bound to eventually be able to poke holes into it and +measures need to be taken to address them. However, with how the system +was set up, the banks had no real incentive to come up with a system +that is really secure. Getting the incentives right in favour of +security is often a tricky business. From a customer point of view, the +Chip-and-PIN system was much less secure than the old signature-based +method. The customer could now lose significant amounts of money. -If you want to watch an entertaining talk about attacking -Chip-and-PIN cards, then this talk from the 2014 Chaos -Computer Club conference is for you: +If you want to watch an entertaining talk about attacking Chip-and-PIN +cards, then this talk from the 2014 Chaos Computer Club conference is +for you: \begin{center} \url{https://goo.gl/zuwVHb} \end{center} -\noindent They claim that they are able to clone Chip-and-PINs -cards such that they get all data that was on the Magstripe, -except for three digits (the CVV number). Remember, -Chip-and-PIN cards were introduced exactly for preventing -this. Ross Anderson also talked about his research at the -BlackHat Conference in 2014: +\noindent They claim that they are able to clone Chip-and-PINs cards +such that they get all data that was on the Magstripe, except for three +digits (the CVV number). Remember, Chip-and-PIN cards were introduced +exactly for preventing this. Ross Anderson also talked about his +research at the BlackHat Conference in 2014: \begin{center} \url{https://www.youtube.com/watch?v=ET0MFkRorbo} \end{center} -\noindent An article about reverse-engineering a PIN-number skimmer -is at +\noindent An article about reverse-engineering a PIN-number skimmer is +at \begin{center}\small \url{https://trustfoundry.net/reverse-engineering-a-discovered-atm-skimmer/} \end{center} \noindent -including a scary video of how a PIN-pad overlay is -installed by some crooks. +including a scary video of how a PIN-pad overlay is installed by some +crooks. \subsection*{Of Cookies and Salts} Let us look at another example which will help with understanding how -passwords should be verified and stored. Imagine you need to develop -a web-application that has the feature of recording how many times a +passwords should be verified and stored. Imagine you need to develop a +web-application that has the feature of recording how many times a customer visits a page. For example in order to give a discount whenever the customer has visited a webpage some $x$ number of times -(say $x$ equals $5$). There is one more constraint: we want to store -the information about the number of visits as a cookie on the -browser. I think, for a number of years the webpage of the New York -Times operated in this way: it allowed you to read ten articles per -month for free; if you wanted to read more, you had to pay. My best -guess is that it used cookies for recording how many times their pages -was visited, because if I switched browsers I could easily circumvent -the restriction about ten articles.\footnote{Another online media that - works in this way is the Times Higher Education - \url{http://www.timeshighereducation.co.uk}. It also seems to - use cookies to restrict the number of free articles to five.} +(say $x$ equals $5$). There is one more constraint: we want to store the +information about the number of visits as a cookie on the browser. I +think, for a number of years the webpage of the New York Times operated +in this way: it allowed you to read ten articles per month for free; if +you wanted to read more, you had to pay. My best guess is that it used +cookies for recording how many times their pages was visited, because if +I switched browsers I could easily circumvent the restriction about ten +articles.\footnote{Another online media that works in this way is the +Times Higher Education \url{http://www.timeshighereducation.co.uk}. It +also seems to use cookies to restrict the number of free articles to +five.} -To implement our web-application it is good to look under the -hood what happens when a webpage is displayed in a browser. A -typical web-application works as follows: The browser sends a -GET request for a particular page to a server. The server -answers this request with a webpage in HTML (for our purposes -we can ignore the details about HTML). A simple JavaScript -program that realises a server answering with a ``Hello -World'' webpage is as follows: +To implement our web-application it is good to look under the hood what +happens when a webpage is displayed in a browser. A typical +web-application works as follows: The browser sends a GET request for a +particular page to a server. The server answers this request with a +webpage in HTML (for our purposes we can ignore the details about HTML). +A simple JavaScript program that realises a server answering with a +``Hello World'' webpage is as follows: \begin{center} \lstinputlisting{../progs/ap0.js} \end{center} -\noindent The interesting lines are 4 to 7 where the answer to -the GET request is generated\ldots in this case it is just a -simple string. This program is run on the server and will be -executed whenever a browser initiates such a GET request. You -can run this program on your computer and then direct a -browser to the address \pcode{localhost:8000} in order to -simulate a request over the internet. You are encouraged -to try this out\ldots{}theory is always good, but practice is -better. +\noindent The interesting lines are 4 to 7 where the answer to the GET +request is generated\ldots in this case it is just a simple string. This +program is run on the server and will be executed whenever a browser +initiates such a GET request. You can run this program on your computer +and then direct a browser to the address \pcode{localhost:8000} in order +to simulate a request over the internet. You are encouraged to try this +out\ldots{}theory is always good, but practice is better. -For our web-application of interest is the feature that the -server when answering the request can store some information -on the client's side. This information is called a -\emph{cookie}. The next time the browser makes another GET -request to the same webpage, this cookie can be read again by -the server. We can use cookies in order to store a counter -that records the number of times our webpage has been visited. -This can be realised with the following small program +For our web-application of interest is the feature that the server when +answering the request can store some information on the client's side. +This information is called a \emph{cookie}. The next time the browser +makes another GET request to the same webpage, this cookie can be read +again by the server. We can use cookies in order to store a counter that +records the number of times our webpage has been visited. This can be +realised with the following small program \begin{center} \lstinputlisting{../progs/ap2.js} \end{center} -\noindent The overall structure of this program is the same as -the earlier one: Lines 7 to 17 generate the answer to a -GET-request. The new part is in Line 8 where we read the -cookie called \pcode{counter}. If present, this cookie will be -send together with the GET-request from the client. The value -of this counter will come in form of a string, therefore we -use the function \pcode{parseInt} in order to transform it -into an integer. In case the cookie is not present, we default -the counter to zero. The odd looking construction \code{...|| -0} is realising this defaulting in JavaScript. In Line 9 we -increase the counter by one and store it back to the client -(under the name \pcode{counter}, since potentially more than -one value could be stored). In Lines 10 to 15 we test whether -this counter is greater or equal than 5 and send accordingly a -specially grafted message back to the client. +\noindent The overall structure of this program is the same as the +earlier one: Lines 7 to 17 generate the answer to a GET-request. The new +part is in Line 8 where we read the cookie called \pcode{counter}. If +present, this cookie will be send together with the GET-request from the +client. The value of this counter will come in form of a string, +therefore we use the function \pcode{parseInt} in order to transform it +into an integer. In case the cookie is not present, we default the +counter to zero. The odd looking construction \code{...|| 0} is +realising this defaulting in JavaScript. In Line 9 we increase the +counter by one and store it back to the client (under the name +\pcode{counter}, since potentially more than one value could be stored). +In Lines 10 to 15 we test whether this counter is greater or equal than +5 and send accordingly a specially grafted message back to the client. -Let us step back and analyse this program from a security -point of view. We store a counter in plain text on the -client's browser (which is not under our control). Depending -on this value we want to unlock a resource (like a discount) -when it reaches a threshold. If the client deletes the cookie, -then the counter will just be reset to zero. This does not -bother us, because the purported discount will just not be -granted. In this way we do not lose any (hypothetical) money. -What we need to be concerned about is, however, when a client -artificially increases this counter without having visited our -web-page. This is actually a trivial task for a knowledgeable -person, since there are convenient tools that allow one to set -a cookie to an arbitrary value, for example above our +Let us step back and analyse this program from a security point of view. +We store a counter in plain text on the client's browser (which is not +under our control). Depending on this value we want to unlock a resource +(like a discount) when it reaches a threshold. If the client deletes the +cookie, then the counter will just be reset to zero. This does not +bother us, because the purported discount will just not be granted. In +this way we do not lose any (hypothetical) money. What we need to be +concerned about is, however, when a client artificially increases this +counter without having visited our web-page. This is actually a trivial +task for a knowledgeable person, since there are convenient tools that +allow one to set a cookie to an arbitrary value, for example above our threshold for the discount. -There seems to be no simple way to prevent this kind of -tampering with cookies, because the whole purpose of cookies -is that they are stored on the client's side, which from the -the server's perspective is a potentially hostile environment. -What we need to ensure is the integrity of this counter in -this hostile environment. We could think of encrypting the -counter. But this has two drawbacks to do with the keys for -encryption. If you use a single, global key for all the -clients that visit our site, then we risk that our whole -``business'' might collapse in the event this key gets known -to the outside world. Then all cookies we might have set in -the past, can now be decrypted and manipulated. If, on the -other hand, we use many ``private'' keys for the clients, then -we have to solve the problem of having to securely store this -key on our server side (obviously we cannot store the key with -the client because then the client again has all data to -tamper with the counter; and obviously we also cannot encrypt -the key, lest we can solve an impossible chicken-and-egg -problem). So encryption seems to not solve the problem we face -with the integrity of our counter. +There seems to be no simple way to prevent this kind of tampering with +cookies, because the whole purpose of cookies is that they are stored on +the client's side, which from the the server's perspective is a +potentially hostile environment. What we need to ensure is the integrity +of this counter in this hostile environment. We could think of +encrypting the counter. But this has two drawbacks to do with the keys +for encryption. If you use a single, global key for all the clients that +visit our site, then we risk that our whole ``business'' might collapse +in the event this key gets known to the outside world. Then all cookies +we might have set in the past, can now be decrypted and manipulated. If, +on the other hand, we use many ``private'' keys for the clients, then we +have to solve the problem of having to securely store this key on our +server side (obviously we cannot store the key with the client because +then the client again has all data to tamper with the counter; and +obviously we also cannot encrypt the key, lest we can solve an +impossible chicken-and-egg problem). So encryption seems to not solve +the problem we face with the integrity of our counter. -Fortunately, \emph{cryptographic hash functions} seem to be -more suitable for our purpose. Like encryption, hash functions -scramble data in such a way that it is easy to calculate the -output of a hash function from the input. But it is hard -(i.e.~practically impossible) to calculate the input from -knowing the output. This is often called \emph{preimage -resistance}. Cryptographic hash functions also ensure that -given a message and a hash, it is computationally infeasible to -find another message with the same hash. This is called -\emph{collusion resistance}. Because of these properties, hash -functions are often called \emph{one-way functions}: you -cannot go back from the output to the input (without some -tricks, see below). +Fortunately, \emph{cryptographic hash functions} seem to be more +suitable for our purpose. Like encryption, hash functions scramble data +in such a way that it is easy to calculate the output of a hash function +from the input. But it is hard (i.e.~practically impossible) to +calculate the input from knowing the output. This is often called +\emph{preimage resistance}. Cryptographic hash functions also ensure +that given a message and a hash, it is computationally infeasible to +find another message with the same hash. This is called \emph{collusion +resistance}. Because of these properties, hash functions are often +called \emph{one-way functions}: you cannot go back from the output to +the input (without some tricks, see below). -There are several such hashing function. For example SHA-1 -would hash the string \pcode{"hello world"} to produce the -hash-value +There are several such hashing function. For example SHA-1 would hash +the string \pcode{"hello world"} to produce the hash-value \begin{center} \pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed} \end{center} -\noindent Another handy feature of hash functions is that if -the input changes only a little, the output changes -drastically. For example \pcode{"iello world"} produces under -SHA-1 the output +\noindent Another handy feature of hash functions is that if the input +changes only a little, the output changes drastically. For example +\pcode{"iello world"} produces under SHA-1 the output \begin{center} \pcode{d2b1402d84e8bcef5ae18f828e43e7065b841ff1} \end{center} -\noindent That means it is not predictable what the output -will be from just looking at input that is ``close by''. +\noindent That means it is not predictable what the output will be from +just looking at input that is ``close by''. -We can use hashes in our web-application and store in the -cookie the value of the counter in plain text but together -with its hash. We need to store both pieces of data in such a -way that we can extract them again later on. In the code below -I will just separate them using a \pcode{"-"}. For the -counter \pcode{1} for example +We can use hashes in our web-application and store in the cookie the +value of the counter in plain text but together with its hash. We need +to store both pieces of data in such a way that we can extract them +again later on. In the code below I will just separate them using a +\pcode{"-"}. For the counter \pcode{1} for example \begin{center} \pcode{1-356a192b7913b04c54574d18c28d46e6395428ab} \end{center} -\noindent If we now read back the cookie when the client -visits our webpage, we can extract the counter, hash it again -and compare the result to the stored hash value inside the -cookie. If these hashes disagree, then we can deduce that the -cookie has been tampered with. Unfortunately, if they agree, -we can still not be entirely sure that not a clever hacker has -tampered with the cookie. The reason is that the hacker can -see the clear text part of the cookie, say \pcode{3}, and also -its hash. It does not take much trial and error to find out -that we used the SHA-1 hashing function and then the hacker -can graft a cookie accordingly. This is eased by the fact that -for SHA-1 many strings and corresponding hash-values are -precalculated. Type, for example, into Google the hash value -for \pcode{"hello world"} and you will actually pretty quickly -find that it was generated by input string \pcode{"hello -world"}. Similarly for the hash-value for \pcode{1}. This -defeats the purpose of a hashing function and thus would not -help us with our web-applications and later also not with how -to store passwords properly. +\noindent If we now read back the cookie when the client visits our +webpage, we can extract the counter, hash it again and compare the +result to the stored hash value inside the cookie. If these hashes +disagree, then we can deduce that the cookie has been tampered with. +Unfortunately, if they agree, we can still not be entirely sure that not +a clever hacker has tampered with the cookie. The reason is that the +hacker can see the clear text part of the cookie, say \pcode{3}, and +also its hash. It does not take much trial and error to find out that we +used the SHA-1 hashing function and then the hacker can graft a cookie +accordingly. This is eased by the fact that for SHA-1 many strings and +corresponding hash-values are precalculated. Type, for example, into +Google the hash value for \pcode{"hello world"} and you will actually +pretty quickly find that it was generated by input string \pcode{"hello +world"}. Similarly for the hash-value for \pcode{1}. This defeats the +purpose of a hashing function and thus would not help us with our +web-applications and later also not with how to store passwords +properly. There is one ingredient missing, which happens to be called -\emph{salts}. Salts are random keys, which are added to the -counter before the hash is calculated. In our case we must -keep the salt secret. As can be see in Figure~\ref{hashsalt}, -we need to extract from the cookie the counter value and its -hash (Lines 19 and 20). But before hashing the counter again -(Line 22) we need to add the secret salt. Similarly, when we -set the new increased counter, we will need to add the salt -before hashing (this is done in Line 15). Our web-application +\emph{salts}. Salts are random keys, which are added to the counter +before the hash is calculated. In our case we must keep the salt secret. +As can be see in Figure~\ref{hashsalt}, we need to extract from the +cookie the counter value and its hash (Lines 19 and 20). But before +hashing the counter again (Line 22) we need to add the secret salt. +Similarly, when we set the new increased counter, we will need to add +the salt before hashing (this is done in Line 15). Our web-application will now store cookies like \begin{figure}[p] \lstinputlisting{../progs/App4.js} -\caption{A Node.js web-app that sets a cookie in the client's -browser for counting the number of visits to a page.\label{hashsalt}} +\caption{A Node.js web-app that sets a cookie in the client's browser +for counting the number of visits to a page.\label{hashsalt}} \end{figure} \begin{center}\tt @@ -457,218 +411,191 @@ \end{tabular} \end{center} -\noindent These hashes allow us to read and set the value of -the counter, and also give us confidence that the counter has -not been tampered with. This of course depends on being able -to keep the salt secret. Once the salt is public, we better -ignore all cookies and start setting them again with a new -salt. +\noindent These hashes allow us to read and set the value of the +counter, and also give us confidence that the counter has not been +tampered with. This of course depends on being able to keep the salt +secret. Once the salt is public, we better ignore all cookies and start +setting them again with a new salt. -There is an interesting and very subtle point to note with -respect to the 'New York Times' way of checking the number -visits. Essentially they have their `resource' unlocked at the -beginning and lock it only when the data in the cookie states -that the allowed free number of visits are up. As said before, -this can be easily circumvented by just deleting the cookie or -by switching the browser. This would mean the New York Times -will lose revenue whenever this kind of tampering occurs. The -`quick fix' to require that a cookie must always be present -does not work, because then this newspaper will cut off any -new readers, or anyone who gets a new computer. In contrast, -our web-application has the resource (discount) locked at the -beginning and only unlocks it if the cookie data says so. If -the cookie is deleted, well then the resource just does not -get unlocked. No major harm will result to us. You can see: -the same security mechanism behaves rather differently -depending on whether the ``resource'' needs to be locked or -unlocked. Apart from thinking about the difference very -carefully, I do not know of any good ``theory'' that could -help with solving such security intricacies in any other way. +There is an interesting and very subtle point to note with respect to +the 'New York Times' way of checking the number visits. Essentially they +have their `resource' unlocked at the beginning and lock it only when +the data in the cookie states that the allowed free number of visits are +up. As said before, this can be easily circumvented by just deleting the +cookie or by switching the browser. This would mean the New York Times +will lose revenue whenever this kind of tampering occurs. The `quick +fix' to require that a cookie must always be present does not work, +because then this newspaper will cut off any new readers, or anyone who +gets a new computer. In contrast, our web-application has the resource +(discount) locked at the beginning and only unlocks it if the cookie +data says so. If the cookie is deleted, well then the resource just does +not get unlocked. No major harm will result to us. You can see: the same +security mechanism behaves rather differently depending on whether the +``resource'' needs to be locked or unlocked. Apart from thinking about +the difference very carefully, I do not know of any good ``theory'' that +could help with solving such security intricacies in any other way. \subsection*{How to Store Passwords Properly?} -While admittedly quite silly, the simple web-application in -the previous section should help with the more important -question of how passwords should be verified and stored. It is -unbelievable that nowadays systems still do this with -passwords in plain text. The idea behind such plain-text -passwords is of course that if the user typed in -\pcode{foobar} as password, we need to verify whether it -matches with the password that is already stored for this user -in the system. Why not doing this with plain-text passwords? -Unfortunately doing this verification in plain text is really -a bad idea. Alas, evidence suggests it is still a -widespread practice. I leave you to think about why verifying -passwords in plain text is a bad idea. +While admittedly quite silly, the simple web-application in the previous +section should help with the more important question of how passwords +should be verified and stored. It is unbelievable that nowadays systems +still do this with passwords in plain text. The idea behind such +plain-text passwords is of course that if the user typed in +\pcode{foobar} as password, we need to verify whether it matches with +the password that is already stored for this user in the system. Why not +doing this with plain-text passwords? Unfortunately doing this +verification in plain text is really a bad idea. Alas, evidence suggests +it is still a widespread practice. I leave you to think about why +verifying passwords in plain text is a bad idea. -Using hash functions, like in our web-application, we can do -better. They allow us to not having to store passwords in -plain text for verification whether a password matches or not. -We can just hash the password and store the hash-value. And -whenever the user types in a new password, well then we hash -it again and check whether the hash-values agree. Just like -in the web-application before. +Using hash functions, like in our web-application, we can do better. +They allow us to not having to store passwords in plain text for +verification whether a password matches or not. We can just hash the +password and store the hash-value. And whenever the user types in a new +password, well then we hash it again and check whether the hash-values +agree. Just like in the web-application before. -Lets analyse what happens when a hacker gets hold of such a -hashed password database. That is the scenario we want to -defend against.\footnote{If we could assume our servers can -never be broken into, then storing passwords in plain text -would be no problem. The point, however, is that servers are -never absolutely secure.} The hacker has then a list of user names and -associated hash-values, like +Lets analyse what happens when a hacker gets hold of such a hashed +password database. That is the scenario we want to defend +against.\footnote{If we could assume our servers can never be broken +into, then storing passwords in plain text would be no problem. The +point, however, is that servers are never absolutely secure.} The hacker +has then a list of user names and associated hash-values, like \begin{center} \pcode{urbanc:2aae6c35c94fcfb415dbe95f408b9ce91ee846ed} \end{center} -\noindent For a beginner-level hacker this information is of -no use. It would not work to type in the hash value instead of -the password, because it will go through the hashing function -again and then the resulting two hash-values will not match. -One attack a hacker can try, however, is called a \emph{brute -force attack}. Essentially this means trying out exhaustively -all strings +\noindent For a beginner-level hacker this information is of no use. It +would not work to type in the hash value instead of the password, +because it will go through the hashing function again and then the +resulting two hash-values will not match. One attack a hacker can try, +however, is called a \emph{brute force attack}. Essentially this means +trying out exhaustively all strings \begin{center} -\pcode{a}, -\pcode{aa}, -\pcode{...}, -\pcode{ba}, -\pcode{...}, +\pcode{a}, \pcode{aa}, \pcode{...}, \pcode{ba}, \pcode{...}, \pcode{zzz}, \pcode{...} \end{center} -\noindent and so on, hash them and check whether they match -with the hash-values in the database. Such brute force attacks -are surprisingly effective. With modern technology (usually -GPU graphic cards), passwords of moderate length only need -seconds or hours to be cracked. Well, the only defence we have -against such brute force attacks is to make passwords longer -and force users to use the whole spectrum of letters and keys -for passwords. The hope is that this makes the search space -too big for an effective brute force attack. +\noindent and so on, hash them and check whether they match with the +hash-values in the database. Such brute force attacks are surprisingly +effective. With modern technology (usually GPU graphic cards), passwords +of moderate length only need seconds or hours to be cracked. Well, the +only defence we have against such brute force attacks is to make +passwords longer and force users to use the whole spectrum of letters +and keys for passwords. The hope is that this makes the search space too +big for an effective brute force attack. -Unfortunately, clever hackers have another ace up their -sleeves. These are called \emph{dictionary attacks}. The idea -behind dictionary attack is the observation that only few -people are competent enough to use sufficiently strong -passwords. Most users (at least too many) use passwords like +Unfortunately, clever hackers have another ace up their sleeves. These +are called \emph{dictionary attacks}. The idea behind dictionary attack +is the observation that only few people are competent enough to use +sufficiently strong passwords. Most users (at least too many) use +passwords like \begin{center} -\pcode{123456}, -\pcode{password}, -\pcode{qwerty}, -\pcode{letmein}, +\pcode{123456}, \pcode{password}, \pcode{qwerty}, \pcode{letmein}, \pcode{...} \end{center} -\noindent So an attacker just needs to compile a list as large -as possible of such likely candidates of passwords and also -compute their hash-values. The difference between a brute -force attack, where maybe $2^{80}$ many strings need to be -considered, is that a dictionary attack might get away with -checking only 10 Million words (remember the language English -``only'' contains 600,000 words). This is a drastic -simplification for attackers. Now, if the attacker knows the -hash-value of a password is +\noindent So an attacker just needs to compile a list as large as +possible of such likely candidates of passwords and also compute their +hash-values. The difference between a brute force attack, where maybe +$2^{80}$ many strings need to be considered, is that a dictionary attack +might get away with checking only 10 Million words (remember the +language English ``only'' contains 600,000 words). This is a drastic +simplification for attackers. Now, if the attacker knows the hash-value +of a password is \begin{center} \pcode{5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8} \end{center} -\noindent then just a lookup in the dictionary will reveal -that the plain-text password was \pcode{password}. What is -good about this attack is that the dictionary can be -precompiled in the ``comfort of the hacker's home'' before an -actual attack is launched. It just needs sufficient storage -space, which nowadays is pretty cheap. A hacker might in this -way not be able to crack all passwords in our database, but -even being able to crack 50\% can be serious damage for a -large company (because then you have to think about how to -make users to change their old passwords---a major hassle). -And hackers are very industrious in compiling these -dictionaries: for example they definitely include variations -like \pcode{passw0rd} and also include rules that cover cases +\noindent then just a lookup in the dictionary will reveal that the +plain-text password was \pcode{password}. What is good about this attack +is that the dictionary can be precompiled in the ``comfort of the +hacker's home'' before an actual attack is launched. It just needs +sufficient storage space, which nowadays is pretty cheap. A hacker might +in this way not be able to crack all passwords in our database, but even +being able to crack 50\% can be serious damage for a large company +(because then you have to think about how to make users to change their +old passwords---a major hassle). And hackers are very industrious in +compiling these dictionaries: for example they definitely include +variations like \pcode{passw0rd} and also include rules that cover cases like \pcode{passwordpassword} or \pcode{drowssap} (password -reversed).\footnote{Some entertaining rules for creating -effective dictionaries are described in the book ``Applied -Cryptography'' by Bruce Schneier (in case you can find it in -the library), and also in the original research literature -which can be accessed for free from +reversed).\footnote{Some entertaining rules for creating effective +dictionaries are described in the book ``Applied Cryptography'' by Bruce +Schneier (in case you can find it in the library), and also in the +original research literature which can be accessed for free from \url{http://www.klein.com/dvk/publications/passwd.pdf}.} -Historically, compiling a list for a dictionary attack is not -as simple as it might seem. At the beginning only ``real'' -dictionaries were available (like the Oxford English -Dictionary), but such dictionaries are not optimised for the -purpose of cracking passwords. The first real hard data about -actually used passwords was obtained when a company called -RockYou ``lost'' at the end of 2009 32 Million plain-text -passwords. With this data of real-life passwords, dictionary -attacks took off. Compiling such dictionaries is nowadays very -easy with the help of off-the-shelf tools. +Historically, compiling a list for a dictionary attack is not as simple +as it might seem. At the beginning only ``real'' dictionaries were +available (like the Oxford English Dictionary), but such dictionaries +are not optimised for the purpose of cracking passwords. The first real +hard data about actually used passwords was obtained when a company +called RockYou ``lost'' at the end of 2009 32 Million plain-text +passwords. With this data of real-life passwords, dictionary attacks +took off. Compiling such dictionaries is nowadays very easy with the +help of off-the-shelf tools. -These dictionary attacks can be prevented by using salts. -Remember a hacker needs to use the most likely candidates -of passwords and calculate their hash-value. If we add before -hashing a password a random salt, like \pcode{mPX2aq}, -then the string \pcode{passwordmPX2aq} will almost certainly -not be in the dictionary. Like in the web-application in the -previous section, a salt does not prevent us from verifying a -password. We just need to add the salt whenever the password -is typed in again. +These dictionary attacks can be prevented by using salts. Remember a +hacker needs to use the most likely candidates of passwords and +calculate their hash-value. If we add before hashing a password a random +salt, like \pcode{mPX2aq}, then the string \pcode{passwordmPX2aq} will +almost certainly not be in the dictionary. Like in the web-application +in the previous section, a salt does not prevent us from verifying a +password. We just need to add the salt whenever the password is typed in +again. -There is a question whether we should use a single random salt -for every password in our database. A single salt would -already make dictionary attacks considerably more difficult. -It turns out, however, that in case of password databases -every password should get their own salt. This salt is -generated at the time when the password is first set. -If you look at a Unix password file you will find entries like +There is a question whether we should use a single random salt for every +password in our database. A single salt would already make dictionary +attacks considerably more difficult. It turns out, however, that in case +of password databases every password should get their own salt. This +salt is generated at the time when the password is first set. If you +look at a Unix password file you will find entries like \begin{center} \pcode{urbanc:$6$3WWbKfr1$4vblknvGr6FcDeF92R5xFn3mskfdnEn...$...} \end{center} -\noindent where the first part is the login-name, followed by -a field \pcode{$6$} which specifies which hash-function is -used. After that follows the salt \pcode{3WWbKfr1} and after -that the hash-value that is stored for the password (which -includes the salt). I leave it to you to figure out how the -password verification would need to work based on this data. +\noindent where the first part is the login-name, followed by a field +\pcode{$6$} which specifies which hash-function is used. After that +follows the salt \pcode{3WWbKfr1} and after that the hash-value that is +stored for the password (which includes the salt). I leave it to you to +figure out how the password verification would need to work based on +this data. -There is a non-obvious benefit of using a separate salt for -each password. Recall that \pcode{123456} is a popular -password that is most likely used by several of your users -(especially if the database contains millions of entries). If -we use no salt or one global salt, all hash-values will be the -same for this password. So if a hacker is in the business of -cracking as many passwords as possible, then it is a good idea -to concentrate on those very popular passwords. This is not -possible if each password gets its own salt: since we assume -the salt is generated randomly, each version of \pcode{123456} -will be associated with a different hash-value. This will -make the life harder for an attacker. +There is a non-obvious benefit of using a separate salt for each +password. Recall that \pcode{123456} is a popular password that is most +likely used by several of your users (especially if the database +contains millions of entries). If we use no salt or one global salt, all +hash-values will be the same for this password. So if a hacker is in the +business of cracking as many passwords as possible, then it is a good +idea to concentrate on those very popular passwords. This is not +possible if each password gets its own salt: since we assume the salt is +generated randomly, each version of \pcode{123456} will be associated +with a different hash-value. This will make the life harder for an +attacker. -Note another interesting point. The web-application from the -previous section was only secure when the salt was secret. In -the password case, this is not needed. The salt can be public -as shown above in the Unix password file where it is actually -stored as part of the password entry. Knowing the salt does -not give the attacker any advantage, but prevents that -dictionaries can be precompiled. While salts do not solve -every problem, they help with protecting against dictionary -attacks on password files. It protects people who have the -same passwords on multiple machines. But it does not protect -against a focused attack against a single password and also -does not make poorly chosen passwords any better. Still the -moral is that you should never store passwords in plain text. -Never ever. +Note another interesting point. The web-application from the previous +section was only secure when the salt was secret. In the password case, +this is not needed. The salt can be public as shown above in the Unix +password file where it is actually stored as part of the password entry. +Knowing the salt does not give the attacker any advantage, but prevents +that dictionaries can be precompiled. While salts do not solve every +problem, they help with protecting against dictionary attacks on +password files. It protects people who have the same passwords on +multiple machines. But it does not protect against a focused attack +against a single password and also does not make poorly chosen passwords +any better. Still the moral is that you should never store passwords in +plain text. Never ever. \subsubsection*{Further Reading} -A readable article by Bruce Schneier on ``How Security Companies Sucker Us with -Lemons'' +A readable article by Bruce Schneier on ``How Security Companies Sucker +Us with Lemons'' \begin{center} \url{http://archive.wired.com/politics/security/commentary/securitymatters/2007/04/securitymatters_0419} @@ -682,33 +609,32 @@ \end{center} \noindent -A slightly different point of view about the economies of -password cracking: +A slightly different point of view about the economies of password +cracking: \begin{center} \url{http://xkcd.com/538/} \end{center} -\noindent If you want to know more about passwords, the book -by Bruce Schneier about Applied Cryptography is recommendable, -though quite expensive. There is also another expensive book -about penetration testing, but the readable chapter about -password attacks (Chapter 9) is free: +\noindent If you want to know more about passwords, the book by Bruce +Schneier about Applied Cryptography is recommendable, though quite +expensive. There is also another expensive book about penetration +testing, but the readable chapter about password attacks (Chapter 9) is +free: \begin{center} \url{http://www.nostarch.com/pentesting} \end{center} -\noindent Even the government recently handed out some -advice about passwords +\noindent Even the government recently handed out some advice about +passwords \begin{center} \url{http://goo.gl/dIzqMg} \end{center} -\noindent Here is an interesting blog-post about how a group -``cracked'' efficiently millions of bcrypt passwords from the -Ashley Madison leak. +\noindent Here is an interesting blog-post about how a group ``cracked'' +efficiently millions of bcrypt passwords from the Ashley Madison leak. \begin{center} \url{http://goo.gl/83Ho0N} @@ -721,17 +647,16 @@ \end{center} \noindent The attack used dictionaries with up to 15 Billion -entries.\footnote{Compare this with the full brute-force space -of $62^8$} If eHarmony had properly salted their passwords, -the attack would have taken 31 years. +entries.\footnote{Compare this with the full brute-force space of +$62^8$} If eHarmony had properly salted their passwords, the attack +would have taken 31 years. -Clearly, passwords are a technology that comes to -the end of its usefulness, because brute force attacks become -more and more powerful and it is unlikely that humans get any -better in remembering (securely) longer and longer passwords. -The big question is which technology can replace -passwords\ldots +Clearly, passwords are a technology that comes to the end of its +usefulness, because brute force attacks become more and more powerful +and it is unlikely that humans get any better in remembering (securely) +longer and longer passwords. The big question is which technology can +replace passwords\ldots \medskip @@ -744,7 +669,4 @@ http://randomwalker.info/publications/cookie-surveillance-v2.pdf -%%% Local Variables: -%%% mode: latex -%%% TeX-master: t -%%% End: +%%% Local Variables: %% mode: latex %% TeX-master: t %% End: