handouts/ho01.tex
changeset 565 d58f8e3e78a5
parent 539 48e0c8b03ae5
564:3391a4fc3533 565:d58f8e3e78a5
     4 
     4 
     5 \lstset{language=JavaScript}
     5 \lstset{language=JavaScript}
     6 
     6 
     7 
     7 
     8 \begin{document}
     8 \begin{document}
     9 \fnote{\copyright{} Christian Urban, 
     9 \fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016}
    10 King's College London, 2014, 2015, 2016}
       
    11 
    10 
    12 % passwords at dropbox
    11 % passwords at dropbox
    13 %%https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/
    12 % %https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/
    14 
    13 
    15 
    14 
    16 %Ross anderson
    15 %Ross anderson https://youtu.be/FY2YKxBxOkg
    17 %https://youtu.be/FY2YKxBxOkg
       
    18 %http://www.scmagazineuk.com/amazon-launches-open-source-tls-implementation-s2n/article/424360/
    16 %http://www.scmagazineuk.com/amazon-launches-open-source-tls-implementation-s2n/article/424360/
    19 
    17 
    20 %Singapore's authorities go offline
    18 %Singapore's authorities go offline
    21 
    19 
    22 % how to store passwords
    20 % how to store passwords
    23 %https://nakedsecurity.sophos.com/2013/11/20/serious-security-how-to-store-your-users-passwords-safely/
    21 % https://nakedsecurity.sophos.com/2013/11/20/serious-security-how-to-store-your-users-passwords-safely/
    24 
    22 
    25 %hashes
    23 %hashes
    26 %http://web.archive.org/web/20071226014140/http://www.cits.rub.de/MD5Collisions/
    24 %http://web.archive.org/web/20071226014140/http://www.cits.rub.de/MD5Collisions/
    27 %https://blog.codinghorror.com/speed-hashing/
    25 %https://blog.codinghorror.com/speed-hashing/
    28 %https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/
    26 %https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/
    33 
    31 
    34 % IoT
    32 % IoT
    35 % https://nakedsecurity.sophos.com/2015/10/26/the-internet-of-things-stop-the-things-i-want-to-get-off/
    33 % https://nakedsecurity.sophos.com/2015/10/26/the-internet-of-things-stop-the-things-i-want-to-get-off/
    36 
    34 
    37 % cloning credit cards and passports
    35 % cloning credit cards and passports
    38 %https://www.youtube.com/watch?v=-4_on9zj-zs
    36 % https://www.youtube.com/watch?v=-4_on9zj-zs
    39 
    37 
    40 
    38 
    41 \section*{Handout 1 (Security Engineering)}
    39 \section*{Handout 1 (Security Engineering)}
    42 
    40 
    43 
    41 Much of the material and inspiration in this module is taken from the
    44 Much of the material and inspiration in this module is taken
    42 works of Bruce Schneier, Ross Anderson and Alex Halderman. I think they
    45 from the works of Bruce Schneier, Ross Anderson and Alex
    43 are the world experts in the area of security engineering. I especially
    46 Halderman. I think they are the world experts in the area of
    44 like that they argue that a security engineer requires a certain
    47 security engineering. I especially like that they argue that a
    45 \emph{security mindset}. Bruce Schneier for example writes:
    48 security engineer requires a certain \emph{security mindset}.
       
    49 Bruce Schneier for example writes:
       
    50 
    46 
    51 \begin{quote} 
    47 \begin{quote} 
    52 \it ``Security engineers --- at least the good ones --- see
    48 \it ``Security engineers --- at least the good ones --- see the world
    53 the world differently. They can't walk into a store without
    49 differently. They can't walk into a store without noticing how they
    54 noticing how they might shoplift. They can't use a computer
    50 might shoplift. They can't use a computer without wondering about the
    55 without wondering about the security vulnerabilities. They
    51 security vulnerabilities. They can't vote without trying to figure out
    56 can't vote without trying to figure out how to vote twice.
    52 how to vote twice. They just can't help it.''
    57 They just can't help it.''
       
    58 \end{quote}
    53 \end{quote}
    59 
    54 
    60 \noindent
    55 \noindent
    61 and
    56 and
    62 
    57 
    63 \begin{quote}
    58 \begin{quote}
    64 \it ``Security engineering\ldots requires you to think
    59 \it ``Security engineering\ldots requires you to think differently. You
    65 differently. You need to figure out not how something works,
    60 need to figure out not how something works, but how something can be
    66 but how something can be made to not work. You have to imagine
    61 made to not work. You have to imagine an intelligent and malicious
    67 an intelligent and malicious adversary inside your system
    62 adversary inside your system \ldots, constantly trying new ways to
    68 \ldots, constantly trying new ways to
    63 subvert it. You have to consider all the ways your system can fail, most
    69 subvert it. You have to consider all the ways your system can
    64 of them having nothing to do with the design itself. You have to look at
    70 fail, most of them having nothing to do with the design
    65 everything backwards, upside down, and sideways. You have to think like
    71 itself. You have to look at everything backwards, upside down,
    66 an alien.''
    72 and sideways. You have to think like an alien.''
       
    73 \end{quote}
    67 \end{quote}
    74 
    68 
    75 \noindent In this module I would like to teach you this security
    69 \noindent In this module I would like to teach you this security mindset. This
    76 mindset. This might be a mindset that you think is very
    70 might be a mindset that you think is very foreign to you---after all we
    77 foreign to you---after all we are all good citizens and do not
    71 are all good citizens and do not hack into things. However, I beg to
    78 hack into things. However, I beg to differ: You have this
    72 differ: You have this mindset already when in school you were thinking,
    79 mindset already when in school you were thinking, at least
    73 at least hypothetically, about ways in which you can cheat in an exam
    80 hypothetically, about ways in which you can cheat in an exam
    74 (whether it is by hiding notes or by looking over the shoulders of your
    81 (whether it is by hiding notes or by looking over the
    75 fellow pupils). Right? To defend a system, you need to have this kind of
    82 shoulders of your fellow pupils). Right? To defend a system,
    76 mindset and be able to think like an attacker. This will include
    83 you need to have this kind of mindset and be able to think
    77 understanding techniques that can be used to compromise security and
    84 like an attacker. This will include understanding techniques
    78 privacy in systems. This will often result in insights where
    85 that can be used to compromise security and privacy in
       
    86 systems. This will often result in insights where
       
    87 well-intended security mechanisms made a system actually less
    79 well-intended security mechanisms made a system actually less
    88 secure.\medskip
    80 secure.\medskip
    89 
    81 
    90 \noindent 
    82 \noindent 
    91 {\Large\bf Warning!} However, don’t be evil! Using those
    83 {\Large\bf Warning!} However, don’t be evil! Using those techniques in
    92 techniques in the real world may violate the law or King’s
    84 the real world may violate the law or King’s rules, and it may be
    93 rules, and it may be unethical. Under some circumstances, even
    85 unethical. Under some circumstances, even probing for weaknesses of a
    94 probing for weaknesses of a system may result in severe
    86 system may result in severe penalties, up to and including expulsion,
    95 penalties, up to and including expulsion, fines and
    87 fines and jail time. Acting lawfully and ethically is your
    96 jail time. Acting lawfully and ethically is your
    88 responsibility. Ethics requires you to refrain from doing harm. Always
    97 responsibility. Ethics requires you to refrain from doing
    89 respect privacy and rights of others. Do not tamper with any of King's
    98 harm. Always respect privacy and rights of others. Do not
    90 systems. If you try out a technique, always make doubly sure you are
    99 tamper with any of King's systems. If you try out a technique,
    91 working in a safe environment so that you cannot cause any harm, not
   100 always make doubly sure you are working in a safe environment
    92 even accidentally. Don't be evil. Be an ethical hacker.\medskip
   101 so that you cannot cause any harm, not even accidentally.
    93 
   102 Don't be evil. Be an ethical hacker.\medskip
    94 \noindent In this lecture I want to make you familiar with the security
   103 
    95 mindset and dispel the myth that encryption is the answer to all
   104 \noindent In this lecture I want to make you familiar with the
    96 security problems (it is certainly often a part of an answer, but almost
   105 security mindset and dispel the myth that encryption is the
    97 never a sufficient one). This is actually an important thread
   106 answer to all security problems (it is certainly often a part
    98 going through the whole course: We will assume that encryption works
   106 of an answer, but almost never a sufficient one). This
    99 perfectly, but still attack ``things''. By ``works perfectly'' we mean
   108 is actually an important thread going through the whole
   100 that we will assume encryption is a black box and, for example, will not
   109 course: We will assume that encryption works perfectly, but
   101 look at the underlying mathematics and break the
   110 still attack ``things''. By ``works perfectly'' we mean that
       
   111 we will assume encryption is a black box and, for example,
       
   112 will not look at the underlying mathematics and break the
       
   113 algorithms.\footnote{Fascinating though this might be.}
   102 algorithms.\footnote{Fascinating though this might be.}
   114  
   103  
   115 For a secure system, it seems, four requirements need to come
   104 For a secure system, it seems, four requirements need to come together:
   116 together: First a security policy (what is supposed to be
   105 First a security policy (what is supposed to be achieved?); second a
   117 achieved?); second a mechanism (cipher, access controls,
   106 mechanism (cipher, access controls, tamper resistance etc); third the
   118 tamper resistance etc); third the assurance we obtain from the
   107 assurance we obtain from the mechanism (the amount of reliance we can
   119 mechanism (the amount of reliance we can put on the mechanism)
   108 put on the mechanism) and finally the incentives (the motive that the
   120 and finally the incentives (the motive that the people
   109 people guarding and maintaining the system have to do their job
   121 guarding and maintaining the system have to do their job
   110 properly, and also the motive that the attackers have to try to defeat
   122 properly, and also the motive that the attackers have to try
   111 your policy). The last point is often overlooked, but plays an important
   123 to defeat your policy). The last point is often overlooked,
   112 role. To illustrate this let's look at an example. 
   124 but plays an important role. To illustrate this let's look at
       
   125 an example. 
       
   126 
   113 
   127 \subsubsection*{Chip-and-PIN is Surely More Secure, No?}
   114 \subsubsection*{Chip-and-PIN is Surely More Secure, No?}
   128 
   115 
   129 The question is whether the Chip-and-PIN system used with
   116 The question is whether the Chip-and-PIN system used with modern credit
   130 modern credit cards is more secure than the older method of
   117 cards is more secure than the older method of signing receipts at the
   131 signing receipts at the till. At first glance the answer seems
   118 till. At first glance the answer seems obvious: Chip-and-PIN must be
   132 obvious: Chip-and-PIN must be more secure and indeed improved
   119 more secure and indeed improved security was the central plank in the
   133 security was the central plank in the ``marketing speak'' of
   120 ``marketing speak'' of the banks behind Chip-and-PIN. The earlier system
   134 the banks behind Chip-and-PIN. The earlier system was based on
   121 was based on a magnetic stripe or a mechanical imprint on the cards and
   135 a magnetic stripe or a mechanical imprint on the cards and
   122 required customers to sign receipts at the till whenever they bought
   136 required customers to sign receipts at the till whenever they
   123 something. This signature authorised the transactions. Although in use
   137 bought something. This signature authorised the transactions.
   124 for a long time, this system had some crucial security flaws, including
   138 Although in use for a long time, this system had some crucial
   125 making clones of credit cards and forging signatures. 
   139 security flaws, including making clones of credit cards and
   126 
   140 forging signatures. 
   127 Chip-and-PIN, as the name suggests, relies on data being stored on a
   141 
   128 chip on the card and a PIN number for authorisation. Even though the
   142 Chip-and-PIN, as the name suggests, relies on data being
   129 banks involved trumpeted their system as being absolutely secure and
   143 stored on a chip on the card and a PIN number for
   130 indeed fraud rates initially went down, security researchers were not
   144 authorisation. Even though the banks involved trumpeted their
   131 convinced (especially not the group around Ross
   145 system as being absolutely secure and indeed fraud rates
   132 Anderson).\footnote{Actually, historical data about fraud showed that
   146 initially went down, security researchers were not convinced
   133 first fraud rates went up (while early problems to do with the
   147 (especially not the group around Ross
   134 introduction of Chip-and-PIN were exploited), then down, but recently up
   148 Anderson).\footnote{Actually, historical data about fraud
   135 again (because criminals are getting more familiar with the technology and
   149 showed that first fraud rates went up (while early problems to
   136 how it can be exploited).} To begin with, the Chip-and-PIN system
   150 do with the introduction of Chip-and-PIN were exploited), then
   137 introduced a ``new player'' into the system that needed to be trusted:
   151 down, but recently up again (because criminals are getting more
   138 the PIN terminals and their manufacturers. It was claimed that these
   152 familiar with the technology and how it can be exploited).} To begin with, the
   139 terminals were tamper-resistant, but needless to say this was a weak
   153 Chip-and-PIN system introduced a ``new player'' into the
   140 link in the system, which criminals successfully attacked. Some
   154 system that needed to be trusted: the PIN terminals and their
   141 terminals were even so skilfully manipulated that they transmitted
   155 manufacturers. It was claimed that these terminals were
   142 skimmed PIN numbers via built-in mobile phone connections. To mitigate
   156 tamper-resistant, but needless to say this was a weak link in
   143 this flaw in the security of Chip-and-PIN, you need to be able to vet
   157 the system, which criminals successfully attacked. Some
   144 quite closely the supply chain of such terminals. This is something that
   158 terminals were even so skilfully manipulated that they
   145 is mostly beyond the control of customers who need to use these
   159 transmitted skimmed PIN numbers via built-in mobile phone
       
   160 connections. To mitigate this flaw in the security of
       
   161 Chip-and-PIN, you need to be able to vet quite closely the
       
   162 supply chain of such terminals. This is something that is
       
   163 mostly beyond the control of customers who need to use these
       
   164 terminals. 
   146 terminals. 
   165 
   147 
   166 To make matters worse for Chip-and-PIN, around 2009 Ross
   148 To make matters worse for Chip-and-PIN, around 2009 Ross Anderson and
   167 Anderson and his group were able to perform man-in-the-middle
   149 his group were able to perform man-in-the-middle attacks against
   168 attacks against Chip-and-PIN. Essentially they made the
   150 Chip-and-PIN. Essentially they made the terminal think the correct PIN
   169 terminal think the correct PIN was entered and the card think
   151 was entered and the card think that a signature was used. This is a kind
   170 that a signature was used. This is a kind of \emph{protocol
   152 of \emph{protocol failure}. After discovery, the flaw was mitigated by
   171 failure}. After discovery, the flaw was mitigated by requiring
   153 requiring that a link between the card and the bank is established
   172 that a link between the card and the bank is established
   154 every time the card is used. Even later this group found another problem
   173 every time the card is used. Even later this group found
   155 with Chip-and-PIN and ATMs which did not generate random enough numbers
   174 another problem with Chip-and-PIN and ATMs which did not
   156 (cryptographic nonces) on which the security of the underlying protocols
   175 generate random enough numbers (cryptographic nonces) on which
   157 relies. 
   176 the security of the underlying protocols relies. 
   158 
   177 
   159 The overarching problem with all this is that the banks who introduced
   178 The overarching problem with all this is that the banks who
   160 Chip-and-PIN managed with the new system to shift the liability for any
   179 introduced Chip-and-PIN managed with the new system to shift
   161 fraud and the burden of proof onto the customer. In the old system, the
   180 the liability for any fraud and the burden of proof onto the
   162 banks had to prove that the customer used the card, which they often did
   181 customer. In the old system, the banks had to prove that the
   163 not bother with. In effect, if fraud occurred the customers were either
   182 customer used the card, which they often did not bother with.
       
   183 In effect, if fraud occurred the customers were either
       
   184 refunded fully or lost only a small amount of money. This
   164 refunded fully or lost only a small amount of money. This
   185 taking-responsibility-of-potential-fraud was part of the
   165 taking-responsibility-of-potential-fraud was part of the ``business
   186 ``business plan'' of the banks and did not reduce their
   166 plan'' of the banks and did not reduce their profits too much.
   187 profits too much. 
   167 
   188 
   168 
   189 Since banks managed to successfully claim that their
   169 Since banks managed to successfully claim that their Chip-and-PIN system
   190 Chip-and-PIN system is secure, they were under the new system
   170 is secure, they were under the new system able to point the finger at
   191 able to point the finger at the customer when fraud occurred:
   171 the customer when fraud occurred: customers must have been negligent
   192 customers must have been negligent losing their PIN and
   172 losing their PIN and customers had almost no way of defending themselves
   193 customers had almost no way of defending themselves in such
   173 in such situations. That is why the work of \emph{ethical} hackers like
   194 situations. That is why the work of \emph{ethical} hackers
   174 Ross Anderson's group is so important, because they and others
   195 like Ross Anderson's group is so important, because they and
   175 established that the banks' claim that their system is secure and it
   196 others established that the banks' claim that their system is
   176 must have been the customer's fault, was bogus. In 2009 the law changed
   197 secure and it must have been the customer's fault, was bogus.
   177 and the burden of proof went back to the banks. They need to prove
   198 In 2009 the law changed and the burden of proof went back to
   178 whether it was really the customer who used a card or not. The current
   199 the banks. They need to prove whether it was really the
   179 state of affairs, however, is that standing up for your rights requires
   200 customer who used a card or not. The current state of affairs,
   180 you to be knowledgeable, potentially having to go to court\ldots{}if
   201 however, is that standing up for your rights requires you to be
       
   202 knowledgeable, potentially having to go to court\ldots{}if
       
   203 not, the banks are happy to take advantage of you.
   181 not, the banks are happy to take advantage of you.
   204 
   182 
   205 This is a classic example where a security design principle
   183 This is a classic example where a fundamental security design principle
   206 was violated: Namely, the one who is in the position to
   184 was violated: Namely, the one who is in the position to improve
   207 improve security, also needs to bear the financial losses if
   185 security, also needs to bear the financial losses if things go wrong.
   208 things go wrong. Otherwise, you end up with an insecure
   186 Otherwise, you end up with an insecure system. In case of the
   209 system. In case of the Chip-and-PIN system, no good security
   187 Chip-and-PIN system, no good security engineer would dare to claim that
   210 engineer would dare to claim that it is secure beyond
   188 it is secure beyond reproach: the specification of the EMV protocol
   211 reproach: the specification of the EMV protocol (underlying
   189 (underlying Chip-and-PIN) is some 700 pages long, but still leaves out
   212 Chip-and-PIN) is some 700 pages long, but still leaves out
   190 many things (like how to implement a good random number generator). No
   213 many things (like how to implement a good random number
   191 human being is able to scrutinise such a specification and ensure it
   214 generator). No human being is able to scrutinise such a
   192 contains no flaws. Moreover, banks can add their own sub-protocols to
   215 specification and ensure it contains no flaws. Moreover, banks
   193 EMV. With all the experience we already have, it is as clear as day that
   216 can add their own sub-protocols to EMV. With all the
   194 criminals were bound to eventually be able to poke holes into it and
   217 experience we already have, it is as clear as day that
   195 measures need to be taken to address them. However, with how the system
   218 criminals were bound to eventually be able to poke holes into
   196 was set up, the banks had no real incentive to come up with a system
   219 it and measures need to be taken to address them. However,
   197 that is really secure. Getting the incentives right in favour of
   220 with how the system was set up, the banks had no real
   198 security is often a tricky business. From a customer point of view, the
   221 incentive to come up with a system that is really secure.
   199 Chip-and-PIN system was much less secure than the old signature-based
   222 Getting the incentives right in favour of security is often a
   200 method. The customer could now lose significant amounts of money.
   223 tricky business. From a customer point of view, the
   201 
   224 Chip-and-PIN system was much less secure than the old
   202 If you want to watch an entertaining talk about attacking Chip-and-PIN
   225 signature-based method. The customer could now lose
   203 cards, then this talk from the 2014 Chaos Computer Club conference is
   226 significant amounts of money.
   204 for you:
   227 
       
   228 If you want to watch an entertaining talk about attacking
       
   229 Chip-and-PIN cards, then this talk from the 2014 Chaos
       
   230 Computer Club conference is for you:
       
   231 
   205 
   232 \begin{center}
   206 \begin{center}
   233 \url{https://goo.gl/zuwVHb}
   207 \url{https://goo.gl/zuwVHb}
   234 \end{center}
   208 \end{center}
   235 
   209 
   236 \noindent They claim that they are able to clone Chip-and-PIN
   210 \noindent They claim that they are able to clone Chip-and-PIN cards
   237 cards such that they get all data that was on the Magstripe,
   211 such that they get all data that was on the Magstripe, except for three
   238 except for three digits (the CVV number). Remember,
   212 digits (the CVV number). Remember, Chip-and-PIN cards were introduced
   239 Chip-and-PIN cards were introduced exactly for preventing
   213 exactly for preventing this. Ross Anderson also talked about his
   240 this. Ross Anderson also talked about his research at the
   214 research at the BlackHat Conference in 2014:
   241 BlackHat Conference in 2014:
       
   242 
   215 
   243 \begin{center}
   216 \begin{center}
   244 \url{https://www.youtube.com/watch?v=ET0MFkRorbo}
   217 \url{https://www.youtube.com/watch?v=ET0MFkRorbo}
   245 \end{center}
   218 \end{center}
   246 
   219 
   247 \noindent An article about reverse-engineering a PIN-number skimmer
   220 \noindent An article about reverse-engineering a PIN-number skimmer is
   248 is at 
   221 at 
   249 
   222 
   250 \begin{center}\small
   223 \begin{center}\small
   251 \url{https://trustfoundry.net/reverse-engineering-a-discovered-atm-skimmer/}
   224 \url{https://trustfoundry.net/reverse-engineering-a-discovered-atm-skimmer/}
   252 \end{center}
   225 \end{center}
   253 
   226 
   254 \noindent
   227 \noindent
   255 including a scary video of how a PIN-pad overlay is
   228 including a scary video of how a PIN-pad overlay is installed by some
   256 installed by some crooks.
   229 crooks. 
   257 
   230 
   258 
   231 
   259 \subsection*{Of Cookies and Salts}
   232 \subsection*{Of Cookies and Salts}
   260 
   233 
   261 Let us look at another example which will help with understanding how
   234 Let us look at another example which will help with understanding how
   262 passwords should be verified and stored.  Imagine you need to develop
   235 passwords should be verified and stored.  Imagine you need to develop a
   263 a web-application that has the feature of recording how many times a
   236 web-application that has the feature of recording how many times a
   264 customer visits a page.  For example in order to give a discount
   237 customer visits a page.  For example in order to give a discount
   265 whenever the customer has visited a webpage some $x$ number of times
   238 whenever the customer has visited a webpage some $x$ number of times
   266 (say $x$ equals $5$). There is one more constraint: we want to store
   239 (say $x$ equals $5$). There is one more constraint: we want to store the
   267 the information about the number of visits as a cookie on the
   240 information about the number of visits as a cookie on the browser. I
   268 browser. I think, for a number of years the webpage of the New York
   241 think, for a number of years the webpage of the New York Times operated
   269 Times operated in this way: it allowed you to read ten articles per
   242 in this way: it allowed you to read ten articles per month for free; if
   270 month for free; if you wanted to read more, you had to pay. My best
   243 you wanted to read more, you had to pay. My best guess is that it used
   271 guess is that it used cookies for recording how many times their pages
   244 cookies for recording how many times their pages were visited, because if
   272 were visited, because if I switched browsers I could easily circumvent
   245 I switched browsers I could easily circumvent the restriction about ten
   273 the restriction about ten articles.\footnote{Another online medium that
   246 articles.\footnote{Another online medium that works in this way is the
   274   works in this way is the Times Higher Education
   247 Times Higher Education \url{http://www.timeshighereducation.co.uk}. It
   275   \url{http://www.timeshighereducation.co.uk}. It also seems to 
   248 also seems to use cookies to restrict the number of free articles to
   276   use cookies to restrict the number of free articles to five.}
   249 five.}
   277 
   250 
   278 To implement our web-application it is good to look under the
   251 To implement our web-application it is good to look under the hood at what
   279 hood at what happens when a webpage is displayed in a browser. A
   252 happens when a webpage is displayed in a browser. A typical
   280 typical web-application works as follows: The browser sends a
   253 web-application works as follows: The browser sends a GET request for a
   281 GET request for a particular page to a server. The server
   254 particular page to a server. The server answers this request with a
   282 answers this request with a webpage in HTML (for our purposes
   255 webpage in HTML (for our purposes we can ignore the details about HTML).
   283 we can ignore the details about HTML). A simple JavaScript
   256 A simple JavaScript program that realises a server answering with a
   284 program that realises a server answering with a ``Hello
   257 ``Hello World'' webpage is as follows:
   285 World'' webpage is as follows:
       
   286 
   258 
   287 \begin{center}
   259 \begin{center}
   288 \lstinputlisting{../progs/ap0.js}
   260 \lstinputlisting{../progs/ap0.js}
   289 \end{center}
   261 \end{center}
   290 
   262 
   291 \noindent The interesting lines are 4 to 7 where the answer to
   263 \noindent The interesting lines are 4 to 7 where the answer to the GET
   292 the GET request is generated\ldots in this case it is just a
   264 request is generated\ldots in this case it is just a simple string. This
   293 simple string. This program is run on the server and will be
   265 program is run on the server and will be executed whenever a browser
   294 executed whenever a browser initiates such a GET request. You
   266 initiates such a GET request. You can run this program on your computer
   295 can run this program on your computer and then direct a
   267 and then direct a browser to the address \pcode{localhost:8000} in order
   296 browser to the address \pcode{localhost:8000} in order to
   268 to simulate a request over the internet. You are encouraged to try this
   297 simulate a request over the internet. You are encouraged
   269 out\ldots{}theory is always good, but practice is better.
   298 to try this out\ldots{}theory is always good, but practice is 
   270 
   299 better.
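
\noindent Since the listing file itself is not reproduced in this text, the following is only a minimal sketch of such a ``Hello World'' server. It assumes Node.js and uses nothing beyond its built-in \pcode{http} module; the actual program in \pcode{ap0.js} may well be written differently, so the line numbers quoted above do not apply to this sketch. The function name \pcode{handler} and the \pcode{serve} command-line switch are made up for illustration.

```javascript
// Minimal sketch (assumption: plain Node.js, built-in http module only).
const http = require('http');

// Answer every request with the same simple string.
// The name `handler` is hypothetical, chosen for this sketch.
function handler(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.end('Hello World');
}

// Only start listening when explicitly asked to, e.g. `node hello.js serve`.
// The `serve` switch is an illustrative convention, not part of Node.js.
if (process.argv[2] === 'serve') {
  http.createServer(handler).listen(8000);
}
```

\noindent Saved under a file name such as \pcode{hello.js} (again only an example), the sketch can be started with \pcode{node hello.js serve} and then visited at \pcode{localhost:8000}.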
   271 
   300 
   272 For our web-application, the feature of interest is that the server when
   301 
   273 answering the request can store some information on the client's side.
   302 For our web-application, the feature of interest is that the
   274 This information is called a \emph{cookie}. The next time the browser
   303 server when answering the request can store some information
   275 makes another GET request to the same webpage, this cookie can be read
   304 on the client's side. This information is called a
   276 again by the server. We can use cookies in order to store a counter that
   305 \emph{cookie}. The next time the browser makes another GET
   277 records the number of times our webpage has been visited. This can be
   306 request to the same webpage, this cookie can be read again by
   278 realised with the following small program
   307 the server. We can use cookies in order to store a counter
       
   308 that records the number of times our webpage has been visited.
       
   309 This can be realised with the following small program
       
   310 
   279 
   311 \begin{center}
   280 \begin{center}
   312 \lstinputlisting{../progs/ap2.js}
   281 \lstinputlisting{../progs/ap2.js}
   313 \end{center}
   282 \end{center}
   314 
   283 
   315 \noindent The overall structure of this program is the same as
   284 \noindent The overall structure of this program is the same as the
   316 the earlier one: Lines 7 to 17 generate the answer to a
   285 earlier one: Lines 7 to 17 generate the answer to a GET-request. The new
   317 GET-request. The new part is in Line 8 where we read the
   286 part is in Line 8 where we read the cookie called \pcode{counter}. If
   318 cookie called \pcode{counter}. If present, this cookie will be
   287 present, this cookie will be sent together with the GET-request from the
   319 sent together with the GET-request from the client. The value
   288 client. The value of this counter will come in form of a string,
   320 of this counter will come in form of a string, therefore we
   289 therefore we use the function \pcode{parseInt} in order to transform it
   321 use the function \pcode{parseInt} in order to transform it
   290 into an integer. In case the cookie is not present, we default the
   322 into an integer. In case the cookie is not present, we default
   291 counter to zero. The odd looking construction \code{...|| 0} is
   323 the counter to zero. The odd looking construction \code{...||
   292 realising this defaulting in JavaScript. In Line 9 we increase the
   324 0} is realising this defaulting in JavaScript. In Line 9 we
   293 counter by one and store it back to the client (under the name
   325 increase the counter by one and store it back to the client
   294 \pcode{counter}, since potentially more than one value could be stored).
   326 (under the name \pcode{counter}, since potentially more than
   295 In Lines 10 to 15 we test whether this counter is greater than or equal to
   327 one value could be stored). In Lines 10 to 15 we test whether
   296 5 and accordingly send a specially crafted message back to the client.
   328 this counter is greater than or equal to 5 and accordingly send a
   297 
   329 specially crafted message back to the client.
   298 Let us step back and analyse this program from a security point of view.
   330 
   299 We store a counter in plain text on the client's browser (which is not
   331 Let us step back and analyse this program from a security
   300 under our control). Depending on this value we want to unlock a resource
   332 point of view. We store a counter in plain text on the
   301 (like a discount) when it reaches a threshold. If the client deletes the
   333 client's browser (which is not under our control). Depending
   302 cookie, then the counter will just be reset to zero. This does not
   334 on this value we want to unlock a resource (like a discount)
   303 bother us, because the purported discount will just not be granted. In
   335 when it reaches a threshold. If the client deletes the cookie,
   304 this way we do not lose any (hypothetical) money. What we need to be
   336 then the counter will just be reset to zero. This does not
   305 concerned about is, however, when a client artificially increases this
   337 bother us, because the purported discount will just not be
   306 counter without having visited our web-page. This is actually a trivial
   338 granted. In this way we do not lose any (hypothetical) money.
   307 task for a knowledgeable person, since there are convenient tools that
   339 What we need to be concerned about is, however, when a client
   308 allow one to set a cookie to an arbitrary value, for example above our
   340 artificially increases this counter without having visited our
       
   341 web-page. This is actually a trivial task for a knowledgeable
       
   342 person, since there are convenient tools that allow one to set
       
   343 a cookie to an arbitrary value, for example above our
       
   344 threshold for the discount. 
   309 threshold for the discount. 
   345 
   310 
   346 There seems to be no simple way to prevent this kind of
   311 There seems to be no simple way to prevent this kind of tampering with
   347 tampering with cookies, because the whole purpose of cookies
   312 cookies, because the whole purpose of cookies is that they are stored on
   348 is that they are stored on the client's side, which from the
   313 the client's side, which from the server's perspective is a
   349 server's perspective is a potentially hostile environment.
   314 potentially hostile environment. What we need to ensure is the integrity
   350 What we need to ensure is the integrity of this counter in
   315 of this counter in this hostile environment. We could think of
   351 this hostile environment. We could think of encrypting the
   316 encrypting the counter. But this has two drawbacks to do with the keys
   352 counter. But this has two drawbacks to do with the keys for
   317 for encryption. If we use a single, global key for all the clients that
   353 encryption. If we use a single, global key for all the
   318 visit our site, then we risk that our whole ``business'' might collapse
   354 clients that visit our site, then we risk that our whole
   319 in the event this key gets known to the outside world. Then all cookies
   355 ``business'' might collapse in the event this key gets known
   320 we might have set in the past, can now be decrypted and manipulated. If,
   356 to the outside world. Then all cookies we might have set in
   321 on the other hand, we use many ``private'' keys for the clients, then we
   357 the past, can now be decrypted and manipulated. If, on the
   322 have to solve the problem of having to securely store this key on our
   358 other hand, we use many ``private'' keys for the clients, then
   323 server side (obviously we cannot store the key with the client because
   359 we have to solve the problem of having to securely store this
   324 then the client again has all data to tamper with the counter; and
   360 key on our server side (obviously we cannot store the key with
   325 obviously we also cannot encrypt the key, unless we can solve an
   361 the client because then the client again has all data to
   326 impossible chicken-and-egg problem). So encryption seems to not solve
   362 tamper with the counter; and obviously we also cannot encrypt
   327 the problem we face with the integrity of our counter.
   363 the key, unless we can solve an impossible chicken-and-egg
   328 
   364 problem). So encryption seems to not solve the problem we face
   329 Fortunately, \emph{cryptographic hash functions} seem to be more
   365 with the integrity of our counter.
   330 suitable for our purpose. Like encryption, hash functions scramble data
   366 
   331 in such a way that it is easy to calculate the output of a hash function
   367 Fortunately, \emph{cryptographic hash functions} seem to be
   332 from the input. But it is hard (i.e.~practically impossible) to
   368 more suitable for our purpose. Like encryption, hash functions
   333 calculate the input from knowing the output. This is often called
   369 scramble data in such a way that it is easy to calculate the
   334 \emph{preimage resistance}. Cryptographic hash functions also ensure
   370 output of a hash function from the input. But it is hard
   335 that given a message and a hash, it is computationally infeasible to
   371 (i.e.~practically impossible) to calculate the input from
   336 find another message with the same hash. This is called \emph{collision
   372 knowing the output. This is often called \emph{preimage
   337 resistance}. Because of these properties, hash functions are often
   373 resistance}. Cryptographic hash functions also ensure that
   338 called \emph{one-way functions}: you cannot go back from the output to
   374 given a message and a hash, it is computationally infeasible to
   339 the input (without some tricks, see below). 
   375 find another message with the same hash. This is called
   340 
   376 \emph{collision resistance}. Because of these properties, hash
   341 There are several such hashing functions. For example, SHA-1 would hash
   377 functions are often called \emph{one-way functions}: you
   342 the string \pcode{"hello world"} to produce the hash-value
   378 cannot go back from the output to the input (without some
       
   379 tricks, see below). 
       
   380 
       
   381 There are several such hashing functions. For example, SHA-1
       
   382 would hash the string \pcode{"hello world"} to produce the
       
   383 hash-value
       
   384 
   343 
   385 \begin{center}
   344 \begin{center}
   386 \pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
   345 \pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
   387 \end{center}
   346 \end{center}
   388 
   347 
   389 \noindent Another handy feature of hash functions is that if
   348 \noindent Another handy feature of hash functions is that if the input
   390 the input changes only a little, the output changes
   349 changes only a little, the output changes drastically. For example
   391 drastically. For example \pcode{"iello world"} produces under
   350 \pcode{"iello world"} produces under SHA-1 the output
   392 SHA-1 the output
       
   393 
   351 
   394 \begin{center}
   352 \begin{center}
   395 \pcode{d2b1402d84e8bcef5ae18f828e43e7065b841ff1}
   353 \pcode{d2b1402d84e8bcef5ae18f828e43e7065b841ff1}
   396 \end{center}
   354 \end{center}
   397 
   355 
   398 \noindent That means it is not predictable what the output
   356 \noindent That means it is not predictable what the output will be from
   399 will be from just looking at input that is ``close by''. 
   357 just looking at input that is ``close by''. 
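
The two hash-values above can be recomputed with a few lines of Node.js. This is only a sketch: it assumes the built-in \pcode{crypto} module, and the helper name \pcode{sha1} is chosen here for illustration.

```javascript
// Sketch using the crypto module that ships with Node.js.
const crypto = require("crypto");

// Returns the SHA-1 hash of a string as 40 hexadecimal characters.
function sha1(input) {
  return crypto.createHash("sha1").update(input).digest("hex");
}

console.log(sha1("hello world")); // 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
console.log(sha1("iello world")); // a completely different 40-character string
```

Changing a single letter of the input is enough to change the whole output, which is the behaviour described above.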
   400 
   358 
   401 We can use hashes in our web-application and store in the
   359 We can use hashes in our web-application and store in the cookie the
   402 cookie the value of the counter in plain text but together
   360 value of the counter in plain text but together with its hash. We need
   403 with its hash. We need to store both pieces of data in such a
   361 to store both pieces of data in such a way that we can extract them
   404 way that we can extract them again later on. In the code below
   362 again later on. In the code below I will just separate them using a
   405 I will just separate them using a \pcode{"-"}. For the
   363 \pcode{"-"}. For the counter \pcode{1} for example
   406 counter \pcode{1} for example
       
   407 
   364 
   408 \begin{center}
   365 \begin{center}
   409 \pcode{1-356a192b7913b04c54574d18c28d46e6395428ab}
   366 \pcode{1-356a192b7913b04c54574d18c28d46e6395428ab}
   410 \end{center}
   367 \end{center}
   411 
   368 
   412 \noindent If we now read back the cookie when the client
   369 \noindent If we now read back the cookie when the client visits our
   413 visits our webpage, we can extract the counter, hash it again
   370 webpage, we can extract the counter, hash it again and compare the
   414 and compare the result to the stored hash value inside the
   371 result to the stored hash value inside the cookie. If these hashes
   415 cookie. If these hashes disagree, then we can deduce that the
   372 disagree, then we can deduce that the cookie has been tampered with.
   416 cookie has been tampered with. Unfortunately, if they agree,
   373 Unfortunately, if they agree, we still cannot be entirely sure that no
   417 we still cannot be entirely sure that no clever hacker has
   374 clever hacker has tampered with the cookie. The reason is that the
   418 tampered with the cookie. The reason is that the hacker can
   375 hacker can see the clear text part of the cookie, say \pcode{3}, and
   419 see the clear text part of the cookie, say \pcode{3}, and also
   376 also its hash. It does not take much trial and error to find out that we
   420 its hash. It does not take much trial and error to find out
   377 used the SHA-1 hashing function and then the hacker can craft a cookie
   421 that we used the SHA-1 hashing function and then the hacker
   378 accordingly. This is eased by the fact that for SHA-1 many strings and
   422 can craft a cookie accordingly. This is eased by the fact that
   379 corresponding hash-values are precalculated. Type, for example, into
   423 for SHA-1 many strings and corresponding hash-values are
   380 Google the hash value for \pcode{"hello world"} and you will actually
   424 precalculated. Type, for example, into Google the hash value
   381 pretty quickly find that it was generated by input string \pcode{"hello
   425 for \pcode{"hello world"} and you will actually pretty quickly
   382 world"}. Similarly for the hash-value for \pcode{1}. This defeats the
   426 find that it was generated by input string \pcode{"hello
   383 purpose of a hashing function and thus would not help us with our
   427 world"}. Similarly for the hash-value for \pcode{1}. This
   384 web-applications and later also not with how to store passwords
   428 defeats the purpose of a hashing function and thus would not
   385 properly. 
   429 help us with our web-applications and later also not with how
       
   430 to store passwords properly. 
       
   431 
   386 
   432 
   387 
   433 There is one ingredient missing, which happens to be called
   388 There is one ingredient missing, which happens to be called
   434 \emph{salts}. Salts are random keys, which are added to the
   389 \emph{salts}. Salts are random keys, which are added to the counter
   435 counter before the hash is calculated. In our case we must
   390 before the hash is calculated. In our case we must keep the salt secret.
   436 keep the salt secret. As can be seen in Figure~\ref{hashsalt},
   391 As can be seen in Figure~\ref{hashsalt}, we need to extract from the
   437 we need to extract from the cookie the counter value and its
   392 cookie the counter value and its hash (Lines 19 and 20). But before
   438 hash (Lines 19 and 20). But before hashing the counter again
   393 hashing the counter again (Line 22) we need to add the secret salt.
   439 (Line 22) we need to add the secret salt. Similarly, when we
   394 Similarly, when we set the new increased counter, we will need to add
   440 set the new increased counter, we will need to add the salt
   395 the salt before hashing (this is done in Line 15). Our web-application
   441 before hashing (this is done in Line 15). Our web-application
       
   442 will now store cookies like 
   396 will now store cookies like 
   443 
   397 
   444 \begin{figure}[p]
   398 \begin{figure}[p]
   445 \lstinputlisting{../progs/App4.js}
   399 \lstinputlisting{../progs/App4.js}
   446 \caption{A Node.js web-app that sets a cookie in the client's
   400 \caption{A Node.js web-app that sets a cookie in the client's browser
   447 browser for counting the number of visits to a page.\label{hashsalt}}
   401 for counting the number of visits to a page.\label{hashsalt}}
   448 \end{figure}
   402 \end{figure}
   449 
   403 
   450 \begin{center}\tt
   404 \begin{center}\tt
   451 \begin{tabular}{l}
   405 \begin{tabular}{l}
   452 1 + salt - 8189effef4d4f7411f4153b13ff72546dd682c69\\
   406 1 + salt - 8189effef4d4f7411f4153b13ff72546dd682c69\\
   455 4 + salt - 5b9e85269e4461de0238a6bf463ed3f25778cbba\\
   409 4 + salt - 5b9e85269e4461de0238a6bf463ed3f25778cbba\\
   456 ...\\
   410 ...\\
   457 \end{tabular}
   411 \end{tabular}
   458 \end{center}
   412 \end{center}
   459 
   413 
   460 \noindent These hashes allow us to read and set the value of
   414 \noindent These hashes allow us to read and set the value of the
   461 the counter, and also give us confidence that the counter has
   415 counter, and also give us confidence that the counter has not been
   462 not been tampered with. This of course depends on being able
   416 tampered with. This of course depends on being able to keep the salt
   463 to keep the salt secret. Once the salt is public, we had better
   417 secret. Once the salt is public, we had better ignore all cookies and start
   464 ignore all cookies and start setting them again with a new
   418 setting them again with a new salt.
   465 salt.
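
The check described above can be sketched in isolation as follows. This is not the program from Figure~\ref{hashsalt}: the names \pcode{SALT}, \pcode{makeCookie} and \pcode{readCookie} and the salt value are made up for illustration, and the cookie format is assumed to be the counter, a \pcode{"-"} and the hash.

```javascript
const crypto = require("crypto");

// Hypothetical secret salt; it must stay on the server and never reach the client.
const SALT = "my secret salt";

function sha1(input) {
  return crypto.createHash("sha1").update(input).digest("hex");
}

// Builds the cookie value "<counter>-<hash of counter followed by salt>".
function makeCookie(counter) {
  return counter + "-" + sha1(counter + SALT);
}

// Returns the counter if the stored hash matches the recomputed one,
// and null if the cookie is malformed or has been tampered with.
function readCookie(cookie) {
  const [text, hash] = String(cookie).split("-");
  const counter = parseInt(text, 10);
  if (Number.isNaN(counter) || hash !== sha1(counter + SALT)) return null;
  return counter;
}
```

A client who changes the counter to \pcode{10} and appends the plain SHA-1 hash of \pcode{10} is rejected, because without the salt the client cannot produce the matching hash.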
   419 
   466 
   420 There is an interesting and very subtle point to note with respect to
   467 There is an interesting and very subtle point to note with
   421 the `New York Times' way of checking the number of visits. Essentially they
   468 respect to the `New York Times' way of checking the number
   422 have their `resource' unlocked at the beginning and lock it only when
   469 of visits. Essentially they have their `resource' unlocked at the
   423 the data in the cookie states that the allowed number of free visits is
   470 beginning and lock it only when the data in the cookie states
   424 up. As said before, this can be easily circumvented by just deleting the
   471 that the allowed number of free visits is up. As said before,
   425 cookie or by switching the browser. This would mean the New York Times
   472 this can be easily circumvented by just deleting the cookie or
   426 will lose revenue whenever this kind of tampering occurs. The `quick
   473 by switching the browser. This would mean the New York Times
   427 fix' to require that a cookie must always be present does not work,
   474 will lose revenue whenever this kind of tampering occurs. The
   428 because then this newspaper will cut off any new readers, or anyone who
   475 `quick fix' to require that a cookie must always be present
   429 gets a new computer. In contrast, our web-application has the resource
   476 does not work, because then this newspaper will cut off any
   430 (discount) locked at the beginning and only unlocks it if the cookie
   477 new readers, or anyone who gets a new computer. In contrast,
   431 data says so. If the cookie is deleted, well then the resource just does
   478 our web-application has the resource (discount) locked at the
   432 not get unlocked. No major harm will result to us. You can see: the same
   479 beginning and only unlocks it if the cookie data says so. If
   433 security mechanism behaves rather differently depending on whether the
   480 the cookie is deleted, well then the resource just does not
   434 ``resource'' needs to be locked or unlocked. Apart from thinking about
   481 get unlocked. No major harm will result to us. You can see:
   435 the difference very carefully, I do not know of any good ``theory'' that
   482 the same security mechanism behaves rather differently
   436 could help with solving such security intricacies in any other way.  
   483 depending on whether the ``resource'' needs to be locked or
       
   484 unlocked. Apart from thinking about the difference very
       
   485 carefully, I do not know of any good ``theory'' that could
       
   486 help with solving such security intricacies in any other way.  
       
   487 
   437 
   488 \subsection*{How to Store Passwords Properly?}
   438 \subsection*{How to Store Passwords Properly?}
   489 
   439 
   490 While admittedly quite silly, the simple web-application in
   440 While admittedly quite silly, the simple web-application in the previous
   491 the previous section should help with the more important
   441 section should help with the more important question of how passwords
   492 question of how passwords should be verified and stored. It is
   442 should be verified and stored. It is unbelievable that nowadays systems
   493 unbelievable that nowadays systems still do this with
   443 still do this with passwords in plain text. The idea behind such
   494 passwords in plain text. The idea behind such plain-text
   444 plain-text passwords is of course that if the user typed in
   495 passwords is of course that if the user typed in
   445 \pcode{foobar} as password, we need to verify whether it matches with
   496 \pcode{foobar} as password, we need to verify whether it
   446 the password that is already stored for this user in the system. Why not
   497 matches with the password that is already stored for this user
   447 do this with plain-text passwords? Unfortunately, doing this
   498 in the system. Why not do this with plain-text passwords?
   448 verification in plain text is really a bad idea. Alas, evidence suggests
   499 Unfortunately doing this verification in plain text is really
   449 it is still a widespread practice. I leave you to think about why
   500 a bad idea. Alas, evidence suggests it is still a
   450 verifying passwords in plain text is a bad idea.
   501 widespread practice. I leave you to think about why verifying
   451 
   502 passwords in plain text is a bad idea.
   452 Using hash functions, like in our web-application, we can do better.
   503 
   453 They allow us to avoid having to store passwords in plain text for
   504 Using hash functions, like in our web-application, we can do
   454 verification whether a password matches or not. We can just hash the
   505 better. They allow us to avoid having to store passwords in
   455 password and store the hash-value. And whenever the user types in a new
   506 plain text for verification whether a password matches or not.
   456 password, well then we hash it again and check whether the hash-values
   507 We can just hash the password and store the hash-value. And
   457 agree. Just like in the web-application before.
   508 whenever the user types in a new password, well then we hash
   458 
   509 it again and check whether the hash-values agree. Just like
   459 Let us analyse what happens when a hacker gets hold of such a hashed
   510 in the web-application before.
   460 password database. That is the scenario we want to defend
   511 
   461 against.\footnote{If we could assume our servers can never be broken
   512 Let us analyse what happens when a hacker gets hold of such a
   462 into, then storing passwords in plain text would be no problem. The
   513 hashed password database. That is the scenario we want to
   463 point, however, is that servers are never absolutely secure.} The hacker
   514 defend against.\footnote{If we could assume our servers can
   464 has then a list of user names and associated hash-values, like 
   515 never be broken into, then storing passwords in plain text
       
   516 would be no problem. The point, however, is that servers are
       
   517 never absolutely secure.} The hacker has then a list of user names and
       
   518 associated hash-values, like 
       
   519 
   465 
   520 \begin{center}
   466 \begin{center}
   521 \pcode{urbanc:2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
   467 \pcode{urbanc:2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
   522 \end{center}
   468 \end{center}
   523 
   469 
   524 \noindent For a beginner-level hacker this information is of
   470 \noindent For a beginner-level hacker this information is of no use. It
   525 no use. It would not work to type in the hash value instead of
   471 would not work to type in the hash value instead of the password,
   526 the password, because it will go through the hashing function
   472 because it will go through the hashing function again and then the
   527 again and then the resulting two hash-values will not match.
   473 resulting two hash-values will not match. One attack a hacker can try,
   528 One attack a hacker can try, however, is called a \emph{brute
   474 however, is called a \emph{brute force attack}. Essentially this means
   529 force attack}. Essentially this means trying out exhaustively
   475 trying out exhaustively all strings
   530 all strings
   476 
   531 
   477 \begin{center}
   532 \begin{center}
   478 \pcode{a}, \pcode{aa}, \pcode{...}, \pcode{ba}, \pcode{...},
   533 \pcode{a},
       
   534 \pcode{aa},
       
   535 \pcode{...},
       
   536 \pcode{ba},
       
   537 \pcode{...},
       
   538 \pcode{zzz},
   479 \pcode{zzz},
   539 \pcode{...}
   480 \pcode{...}
   540 \end{center}   
   481 \end{center}   
   541 
   482 
   542 \noindent and so on, hash them and check whether they match
   483 \noindent and so on, hash them and check whether they match with the
   543 with the hash-values in the database. Such brute force attacks
   484 hash-values in the database. Such brute force attacks are surprisingly
   544 are surprisingly effective. With modern technology (usually
   485 effective. With modern technology (usually GPU graphic cards), passwords
   545 GPU graphic cards), passwords of moderate length only need
   486 of moderate length only need seconds or hours to be cracked. Well, the
   546 seconds or hours to be cracked. Well, the only defence we have
   487 only defence we have against such brute force attacks is to make
   547 against such brute force attacks is to make passwords longer
   488 passwords longer and force users to use the whole spectrum of letters
   548 and force users to use the whole spectrum of letters and keys
   489 and keys for passwords. The hope is that this makes the search space too
   549 for passwords. The hope is that this makes the search space
   490 big for an effective brute force attack.
   550 too big for an effective brute force attack.
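
To get a feeling for such an attack, here is a minimal sketch of a brute force search in Node.js. The function name \pcode{bruteForce} and its parameters are hypothetical, and a real attacker would use heavily optimised GPU code rather than anything like this.

```javascript
const crypto = require("crypto");

function sha1(input) {
  return crypto.createHash("sha1").update(input).digest("hex");
}

// Tries every string over `alphabet` of length 1 up to maxLength and
// returns the first one whose SHA-1 hash equals target, or null if none does.
function bruteForce(target, alphabet, maxLength) {
  let candidates = [""];
  for (let len = 1; len <= maxLength; len++) {
    const next = [];
    for (const prefix of candidates) {
      for (const ch of alphabet) {
        const candidate = prefix + ch;
        if (sha1(candidate) === target) return candidate;
        next.push(candidate);
      }
    }
    candidates = next; // all strings of length len, extended in the next round
  }
  return null;
}

const lowercase = "abcdefghijklmnopqrstuvwxyz";
console.log(bruteForce(sha1("zzz"), lowercase, 3)); // zzz
```

The number of candidates of length $n$ is the alphabet size to the power of $n$, which is why longer passwords over a larger alphabet enlarge the search space so quickly.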
   491 
   551 
   492 Unfortunately, clever hackers have another ace up their sleeves. These
   552 Unfortunately, clever hackers have another ace up their
   493 are called \emph{dictionary attacks}. The idea behind dictionary attacks
   553 sleeves. These are called \emph{dictionary attacks}. The idea
   494 is the observation that only few people are competent enough to use
   554 behind dictionary attacks is the observation that only few
   495 sufficiently strong passwords. Most users (at least too many) use
   555 people are competent enough to use sufficiently strong
   496 passwords like
   556 passwords. Most users (at least too many) use passwords like
   497 
   557 
   498 \begin{center}
   558 \begin{center}
   499 \pcode{123456}, \pcode{password}, \pcode{qwerty}, \pcode{letmein},
   559 \pcode{123456},
       
   560 \pcode{password},
       
   561 \pcode{qwerty},
       
   562 \pcode{letmein},
       
   563 \pcode{...}
   500 \pcode{...}
   564 \end{center}
   501 \end{center}
   565 
   502 
   566 \noindent So an attacker just needs to compile a list as large
   503 \noindent So an attacker just needs to compile a list as large as
   567 as possible of such likely candidates of passwords and also
   504 possible of such likely candidates of passwords and also compute their
   568 compute their hash-values. The difference from a brute
   505 hash-values. The difference from a brute force attack, where maybe
   569 force attack, where maybe $2^{80}$ many strings need to be
   506 $2^{80}$ many strings need to be considered, is that a dictionary attack
   570 considered, is that a dictionary attack might get away with
   507 might get away with checking only 10 Million words (remember the
   571 checking only 10 Million words (remember the English language
   508 English language ``only'' contains 600,000 words). This is a drastic
   572 ``only'' contains 600,000 words). This is a drastic
   509 simplification for attackers. Now, if the attacker knows the hash-value
   573 simplification for attackers. Now, if the attacker knows the
   510 of a password is
   574 hash-value of a password is
       
   575 
   511 
   576 \begin{center}
   512 \begin{center}
   577 \pcode{5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8}
   513 \pcode{5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8}
   578 \end{center}
   514 \end{center}
   579 
   515 
   580 \noindent then just a lookup in the dictionary will reveal
   516 \noindent then just a lookup in the dictionary will reveal that the
   581 that the plain-text password was \pcode{password}. What is
   517 plain-text password was \pcode{password}. What is good about this attack
   582 good about this attack is that the dictionary can be
   518 is that the dictionary can be precompiled in the ``comfort of the
   583 precompiled in the ``comfort of the hacker's home'' before an
   519 hacker's home'' before an actual attack is launched. It just needs
   584 actual attack is launched. It just needs sufficient storage
   520 sufficient storage space, which nowadays is pretty cheap. A hacker might
   585 space, which nowadays is pretty cheap. A hacker might in this
   521 in this way not be able to crack all passwords in our database, but even
   586 way not be able to crack all passwords in our database, but
   522 being able to crack 50\% can be serious damage for a large company
   587 even being able to crack 50\% can be serious damage for a
   523 (because then you have to think about how to make users change their
   588 large company (because then you have to think about how to
   524 old passwords---a major hassle). And hackers are very industrious in
   589 make users change their old passwords, a major hassle).
   525 compiling these dictionaries: for example they definitely include
   590 And hackers are very industrious in compiling these
   526 variations like \pcode{passw0rd} and also include rules that cover cases
   591 dictionaries: for example they definitely include variations
       
   592 like \pcode{passw0rd} and also include rules that cover cases
       
   593 like \pcode{passwordpassword} or \pcode{drowssap} (password
   527 like \pcode{passwordpassword} or \pcode{drowssap} (password
   594 reversed).\footnote{Some entertaining rules for creating
   528 reversed).\footnote{Some entertaining rules for creating effective
   595 effective dictionaries are described in the book ``Applied
   529 dictionaries are described in the book ``Applied Cryptography'' by Bruce
   596 Cryptography'' by Bruce Schneier (in case you can find it in
   530 Schneier (in case you can find it in the library), and also in the
   597 the library), and also in the original research literature
   531 original research literature which can be accessed for free from
   598 which can be accessed for free from
       
   599 \url{http://www.klein.com/dvk/publications/passwd.pdf}.}
   532 \url{http://www.klein.com/dvk/publications/passwd.pdf}.}
   600 Historically, compiling a list for a dictionary attack is not
   533 Historically, compiling a list for a dictionary attack is not as simple
   601 as simple as it might seem. At the beginning only ``real''
   534 as it might seem. At the beginning only ``real'' dictionaries were
   602 dictionaries were available (like the Oxford English
   535 available (like the Oxford English Dictionary), but such dictionaries
   603 Dictionary), but such dictionaries are not optimised for the
   536 are not optimised for the purpose of cracking passwords. The first real
   604 purpose of cracking passwords. The first real hard data about
   537 hard data about actually used passwords was obtained when a company
   605 actually used passwords was obtained when a company called
   538 called RockYou ``lost'' at the end of 2009 32 Million plain-text
   606 RockYou ``lost'' at the end of 2009 32 Million plain-text
   539 passwords. With this data of real-life passwords, dictionary attacks
   607 passwords. With this data of real-life passwords, dictionary
   540 took off. Compiling such dictionaries is nowadays very easy with the
   608 attacks took off. Compiling such dictionaries is nowadays very
   541 help of off-the-shelf tools.
   609 easy with the help of off-the-shelf tools.
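
A precompiled dictionary of this kind can be sketched as a lookup table from hash-values to candidate passwords. The word list and the two rules in \pcode{variants} (doubling and reversing) are only tiny illustrations of what real tools do, and all names are hypothetical.

```javascript
const crypto = require("crypto");

function sha1(input) {
  return crypto.createHash("sha1").update(input).digest("hex");
}

// Simple rules in the spirit of passwordpassword and drowssap.
function variants(word) {
  return [word, word + word, [...word].reverse().join("")];
}

// Precomputes the table once; afterwards every lookup is a single Map access.
function buildDictionary(words) {
  const table = new Map();
  for (const word of words) {
    for (const candidate of variants(word)) {
      table.set(sha1(candidate), candidate);
    }
  }
  return table;
}

const dictionary = buildDictionary(["123456", "password", "qwerty", "letmein"]);
console.log(dictionary.get("5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8")); // password
```

The expensive hashing happens once, before the attack; cracking a leaked hash is then only a table lookup.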
   542 
   610 
   543 These dictionary attacks can be prevented by using salts. Remember a
   611 These dictionary attacks can be prevented by using salts.
   544 hacker needs to use the most likely candidates of passwords and
   612 Remember a hacker needs to use the most likely candidates 
   545 calculate their hash-value. If we append a random salt, like
   613 of passwords and calculate their hash-value. If we append a random
   546 \pcode{mPX2aq}, to a password before hashing, then the string \pcode{passwordmPX2aq} will
   614 salt, like \pcode{mPX2aq}, to a password before hashing,
   547 almost certainly not be in the dictionary. Like in the web-application
   615 then the string \pcode{passwordmPX2aq} will almost certainly 
   548 in the previous section, a salt does not prevent us from verifying a
   616 not be in the dictionary. Like in the web-application in the
   549 password. We just need to add the salt whenever the password is typed in
   617 previous section, a salt does not prevent us from verifying a 
   550 again. 
   618 password. We just need to add the salt whenever the password 
   551 
   619 is typed in again. 
   552 There is a question whether we should use a single random salt for every
   620 
   553 password in our database. A single salt would already make dictionary
   621 There is a question whether we should use a single random salt
   554 attacks considerably more difficult. It turns out, however, that in the case
   622 for every password in our database. A single salt would
   555 of password databases every password should get its own salt. This
   623 already make dictionary attacks considerably more difficult.
   556 salt is generated at the time when the password is first set. If you
   624 It turns out, however, that in the case of password databases
   557 look at a Unix password file you will find entries like
   625 every password should get its own salt. This salt is
       
   626 generated at the time when the password is first set. 
       
   627 If you look at a Unix password file you will find entries like
       
   628 
   558 
   629 \begin{center}
   559 \begin{center}
   630 \pcode{urbanc:$6$3WWbKfr1$4vblknvGr6FcDeF92R5xFn3mskfdnEn...$...}
   560 \pcode{urbanc:$6$3WWbKfr1$4vblknvGr6FcDeF92R5xFn3mskfdnEn...$...}
   631 \end{center}
   561 \end{center}
   632 
   562 
   633 \noindent where the first part is the login-name, followed by
   563 \noindent where the first part is the login-name, followed by a field
   634 a field \pcode{$6$} which specifies which hash-function is
   564 \pcode{$6$} which specifies which hash-function is used. After that
   635 used. After that follows the salt \pcode{3WWbKfr1} and after
   565 follows the salt \pcode{3WWbKfr1} and after that the hash-value that is
   636 that the hash-value that is stored for the password (which
   566 stored for the password (which includes the salt). I leave it to you to
   637 includes the salt). I leave it to you to figure out how the
   567 figure out how the password verification would need to work based on
   638 password verification would need to work based on this data.
   568 this data.
   639 
   569 
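One possible way the verification could work is sketched below. This is only a minimal illustration under stated assumptions: it uses plain SHA-256 from Node's \pcode{crypto} module to keep the code short (the \pcode{$6$} scheme in the Unix entry above is a different, deliberately slower construction), and the names \pcode{hashPassword}, \pcode{makeEntry} and \pcode{verify} are hypothetical helpers, not part of any real system.

```javascript
const crypto = require('crypto');

// Hash a password together with its salt. SHA-256 is an assumption made
// for brevity; it is not the function used by the Unix $6$ scheme.
function hashPassword(password, salt) {
  return crypto.createHash('sha256').update(password + salt).digest('hex');
}

// Called when the password is first set: generate a fresh random salt
// and store it, in the clear, next to the hash-value.
function makeEntry(login, password) {
  const salt = crypto.randomBytes(6).toString('base64'); // 8 characters
  return { login: login, salt: salt, hash: hashPassword(password, salt) };
}

// Called at login: re-hash the typed password with the stored salt
// and compare the result with the stored hash-value.
function verify(entry, typedPassword) {
  const expected = Buffer.from(entry.hash, 'hex');
  const actual = Buffer.from(hashPassword(typedPassword, entry.salt), 'hex');
  return crypto.timingSafeEqual(expected, actual);
}

const entry = makeEntry('urbanc', 'password');
console.log(verify(entry, 'password')); // the correct password is accepted
console.log(verify(entry, '123456'));   // a wrong password is rejected
```

The point to notice is that the password itself is never stored: the entry holds only the salt and the hash-value, and that is all the verification needs.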
   640 There is a non-obvious benefit of using a separate salt for
   570 There is a non-obvious benefit of using a separate salt for each
   641 each password. Recall that \pcode{123456} is a popular
   571 password. Recall that \pcode{123456} is a popular password that is most
   642 password that is most likely used by several of your users
   572 likely used by several of your users (especially if the database
   643 (especially if the database contains millions of entries). If
   573 contains millions of entries). If we use no salt or one global salt, all
   644 we use no salt or one global salt, all hash-values will be the
   574 hash-values will be the same for this password. So if a hacker is in the
   645 same for this password. So if a hacker is in the business of
   575 business of cracking as many passwords as possible, then it is a good
   646 cracking as many passwords as possible, then it is a good idea
   576 idea to concentrate on those very popular passwords. This is not
   647 to concentrate on those very popular passwords. This is not
   577 possible if each password gets its own salt: since we assume the salt is
   648 possible if each password gets its own salt: since we assume
   578 generated randomly, each version of \pcode{123456} will be associated
   649 the salt is generated randomly, each version of \pcode{123456}
   579 with a different hash-value. This will make life harder for an
   650 will be associated with a different hash-value. This will
   580 attacker.
   651 make life harder for an attacker.
   581 
   652 
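The difference between one global salt and a separate salt per password can be seen in a small experiment. The following is a sketch only: the user names are made up, SHA-256 stands in for whatever hash-function a real system would use, and \pcode{hash} is a hypothetical helper.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash a password together with a salt.
function hash(password, salt) {
  return crypto.createHash('sha256').update(password + salt).digest('hex');
}

// Made-up users who all chose the popular password 123456.
const users = ['alice', 'bob', 'carol'];

// One global salt: every entry ends up with the same hash-value.
const globalSalt = 'mPX2aq';
const globalHashes = users.map(() => hash('123456', globalSalt));

// A fresh random salt per entry: the hash-values differ
// (with overwhelming probability).
const saltedHashes = users.map(() => {
  const salt = crypto.randomBytes(6).toString('base64');
  return hash('123456', salt);
});

// Count how many distinct hash-values each scheme produces.
console.log(new Set(globalHashes).size);
console.log(new Set(saltedHashes).size);
```

With the global salt an attacker who cracks one of these entries has cracked all of them; with separate salts each entry has to be attacked on its own.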
   582 Note another interesting point. The web-application from the previous
   653 Note another interesting point. The web-application from the
   583 section was only secure when the salt was secret. In the password case,
   654 previous section was only secure when the salt was secret. In
   584 this is not needed. The salt can be public as shown above in the Unix
   655 the password case, this is not needed. The salt can be public
   585 password file where it is actually stored as part of the password entry.
   656 as shown above in the Unix password file where it is actually
   586 Knowing the salt does not give the attacker any advantage, and the salt
   657 stored as part of the password entry. Knowing the salt does
   587 still prevents dictionaries from being precompiled. While salts do not solve every
   658 not give the attacker any advantage, and the salt still prevents
   588 problem, they help with protecting against dictionary attacks on
   659 dictionaries from being precompiled. While salts do not solve
   589 password files. They protect people who have the same passwords on
   660 every problem, they help with protecting against dictionary
   590 multiple machines. But they do not protect against a focused attack
   661 attacks on password files. They protect people who have the
   591 on a single password and also do not make poorly chosen passwords
   662 same passwords on multiple machines. But they do not protect
   592 any better. Still the moral is that you should never store passwords in
   663 against a focused attack on a single password and also
   593 plain text. Never ever.
   664 do not make poorly chosen passwords any better. Still the
       
   665 moral is that you should never store passwords in plain text.
       
   666 Never ever.
       
   667 
   594 
   668 \subsubsection*{Further Reading}
   595 \subsubsection*{Further Reading}
   669 
   596 
   670 A readable article by Bruce Schneier on ``How Security Companies Sucker Us with 
   597 A readable article by Bruce Schneier on ``How Security Companies Sucker
   671 Lemons''
   598 Us with Lemons''
   672 
   599 
   673 \begin{center}
   600 \begin{center}
   674 \url{http://archive.wired.com/politics/security/commentary/securitymatters/2007/04/securitymatters_0419}
   601 \url{http://archive.wired.com/politics/security/commentary/securitymatters/2007/04/securitymatters_0419}
   675 \end{center}
   602 \end{center}
   676 
   603 
   680 \begin{center}
   607 \begin{center}
   681 \url{http://randomwalker.info/publications/cookie-surveillance-v2.pdf}
   608 \url{http://randomwalker.info/publications/cookie-surveillance-v2.pdf}
   682 \end{center}
   609 \end{center}
   683 
   610 
   684 \noindent
   611 \noindent
   685 A slightly different point of view about the economics of 
   612 A slightly different point of view about the economics of password
   686 password cracking:
   613 cracking:
   687 
   614 
   688 \begin{center}
   615 \begin{center}
   689 \url{http://xkcd.com/538/}
   616 \url{http://xkcd.com/538/}
   690 \end{center}
   617 \end{center}
   691 
   618 
   692 \noindent If you want to know more about passwords, the book
   619 \noindent If you want to know more about passwords, the book by Bruce
   693 by Bruce Schneier about Applied Cryptography is recommended,
   620 Schneier about Applied Cryptography is recommended, though quite
   694 though quite expensive. There is also another expensive book
   621 expensive. There is also another expensive book about penetration
   695 about penetration testing, but the readable chapter about
   622 testing, but the readable chapter about password attacks (Chapter 9) is
   696 password attacks (Chapter 9) is free:
   623 free:
   697 
   624 
   698 \begin{center}
   625 \begin{center}
   699 \url{http://www.nostarch.com/pentesting}
   626 \url{http://www.nostarch.com/pentesting}
   700 \end{center}
   627 \end{center}
   701 
   628 
   702 \noindent Even the government recently handed out some 
   629 \noindent Even the government recently handed out some advice about
   703 advice about passwords
   630 passwords
   704 
   631 
   705 \begin{center}
   632 \begin{center}
   706 \url{http://goo.gl/dIzqMg}
   633 \url{http://goo.gl/dIzqMg}
   707 \end{center}
   634 \end{center}
   708 
   635 
   709 \noindent Here is an interesting blog-post about how a group
   636 \noindent Here is an interesting blog-post about how a group efficiently
   710 efficiently ``cracked'' millions of bcrypt passwords from the
   637 ``cracked'' millions of bcrypt passwords from the Ashley Madison leak.
   711 Ashley Madison leak.
       
   712 
   638 
   713 \begin{center}
   639 \begin{center}
   714 \url{http://goo.gl/83Ho0N}
   640 \url{http://goo.gl/83Ho0N}
   715 \end{center}
   641 \end{center}
   716 
   642 
   719 \begin{center}
   645 \begin{center}
   720 \url{https://goo.gl/W63Xhw}
   646 \url{https://goo.gl/W63Xhw}
   721 \end{center}
   647 \end{center}
   722 
   648 
   723 \noindent The attack used dictionaries with up to 15 Billion
   649 \noindent The attack used dictionaries with up to 15 Billion
   724 entries.\footnote{Compare this with the full brute-force space
   650 entries.\footnote{Compare this with the full brute-force space of
   725 of $62^8$} If eHarmony had properly salted their passwords,
   651 $62^8$} If eHarmony had properly salted their passwords, the attack
   726 the attack would have taken 31 years.
   652 would have taken 31 years.
   727 
   653 
   728 
   654 
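To put the comparison in the footnote into numbers: assuming that $62^8$ stands for all passwords of exactly eight characters drawn from the 62 upper-case letters, lower-case letters and digits (this reading is an assumption; only the figure itself is given), we get

\[
62^8 = 218\,340\,105\,584\,896 \approx 2.2 \times 10^{14}
\qquad\mbox{and}\qquad
\frac{62^8}{15 \times 10^{9}} \approx 14\,556
\]

\noindent so a dictionary with 15 Billion entries covers less than one ten-thousandth of that space.
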
   729 Clearly, passwords are a technology that is coming to
   655 Clearly, passwords are a technology that is coming to the end of its
   730 the end of its usefulness, because brute force attacks become
   656 usefulness, because brute force attacks become more and more powerful
   731 more and more powerful and it is unlikely that humans get any
   657 and it is unlikely that humans get any better at remembering (securely)
   732 better at remembering (securely) longer and longer passwords.
   658 longer and longer passwords. The big question is which technology can
   733 The big question is which technology can replace
   659 replace passwords\ldots 
   734 passwords\ldots 
       
   735 \medskip
   660 \medskip
   736 
   661 
   737 
   662 
   738 \end{document}
   663 \end{document}
   739 
   664 
   742 
   667 
   743 %%% cookies
   668 %%% cookies
   744 http://randomwalker.info/publications/cookie-surveillance-v2.pdf
   669 http://randomwalker.info/publications/cookie-surveillance-v2.pdf
   745 
   670 
   746 
   671 
   747 %%% Local Variables: 
   672 %%% Local Variables: %% mode: latex %% TeX-master: t %% End: 
   748 %%% mode: latex
       
   749 %%% TeX-master: t
       
   750 %%% End: