\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\lstset{language=JavaScript}

\begin{document}

\section*{Handout 1 (Security Engineering)}

Much of the material and inspiration in this module is taken
from the works of Bruce Schneier, Ross Anderson and Alex
Halderman. I think they are the world experts in the area of
security engineering. I especially like that they argue that a
security engineer requires a certain \emph{security mindset}.
Bruce Schneier for example writes:

\begin{quote} 
\it ``Security engineers --- at least the good ones --- see
the world differently. They can't walk into a store without
noticing how they might shoplift. They can't use a computer
without wondering about the security vulnerabilities. They
can't vote without trying to figure out how to vote twice.
They just can't help it.''
\end{quote}

\begin{quote}
\it ``Security engineering\ldots requires you to think
differently. You need to figure out not how something works,
but how something can be made to not work. You have to imagine
an intelligent and malicious adversary inside your system
\ldots, constantly trying new ways to
subvert it. You have to consider all the ways your system can
fail, most of them having nothing to do with the design
itself. You have to look at everything backwards, upside down,
and sideways. You have to think like an alien.''
\end{quote}

\noindent In this module I would like to teach you this security
mindset. This might be a mindset that you think is very
foreign to you---after all we are all good citizens and do not
hack into things. I beg to differ: you already have this
mindset from when, at school, you were thinking, at least
hypothetically, about ways in which you could cheat in an exam
(whether by hiding notes or by looking over the
shoulders of your fellow pupils). Right? To defend a system,
you need to have this kind of mindset and be able to think like
an attacker. This includes understanding techniques that
can be used to compromise security and privacy in systems.
This will often result in insights where well-intended
security mechanisms actually made a system less
secure.\smallskip

{\Large\bf Warning!} However, don't be evil! Using those
techniques in the real world may violate the law or King's
rules, and it may be unethical. Under some circumstances, even
probing for weaknesses of a system may result in severe
penalties, up to and including expulsion, fines and
jail time. Acting lawfully and ethically is your
responsibility. Ethics requires you to refrain from doing
harm. Always respect privacy and rights of others. Do not
tamper with any of King's systems. If you try out a technique,
always make doubly sure you are working in a safe environment
so that you cannot cause any harm, not even accidentally.
Don't be evil. Be an ethical hacker.\medskip

\noindent
In this lecture I want to make you familiar with the security mindset
and dispel the myth that encryption is the answer to all security
problems (it is certainly often part of an answer, but almost
never a sufficient one on its own). This is actually an important
thread going through the whole course: We will assume that encryption
works perfectly, but still attack ``things''. By ``works perfectly''
we mean that we will treat encryption as a black box and, for example,
will not look at the underlying mathematics and break the
algorithms.\footnote{However fascinating this might be.}
 
For a secure system, it seems, four requirements need to come
together: First, a security policy (what is supposed to be
achieved?); second, a mechanism (ciphers, access controls,
tamper resistance and so on); third, the assurance we obtain from the
mechanism (the amount of reliance we can put on the mechanism);
and finally, the incentives (the motive that the people
guarding and maintaining the system have to do their job
properly, and also the motive that the attackers have to try
to defeat your policy). The last point is often overlooked,
but plays an important role. To illustrate this, let us look at
an example.

The question is whether the Chip-and-PIN system used with credit
cards is more secure than the older method of signing receipts
at the till. At first glance Chip-and-PIN seems obviously more
secure, and improved security was also the central plank in the
``marketing speak'' of the banks behind Chip-and-PIN. The
earlier system was based on a magnetic stripe or a mechanical
imprint on the card and required customers to sign receipts at
the till whenever they bought something. This signature
authorised the transaction. Although in use for a long time,
this system had some crucial security flaws, including the
possibility of cloning credit cards and forging signatures.

Chip-and-PIN, as the name suggests, relies on data being
stored on a chip on the card and a PIN number for
authorisation. Even though the banks involved trumpeted their
system as being absolutely secure and indeed fraud rates
initially went down, security researchers were not convinced
(especially the group around Ross Anderson). To begin with,
the Chip-and-PIN system introduced a ``new player'' that
needed to be trusted: the PIN terminals and their
manufacturers. It was claimed that these terminals are
tamper-resistant, but needless to say this was a weak link in
the system, which criminals successfully attacked. Some
terminals were even so skilfully manipulated that they
transmitted skimmed PIN numbers via built-in mobile phone
connections. To mitigate this flaw in the security of
Chip-and-PIN, you need to vet quite closely the supply chain
of such terminals.

Later on, Ross Anderson and his group managed to launch a
man-in-the-middle attack against Chip-and-PIN. Essentially
they made the terminal think the correct PIN was entered and
the card think that a signature was used. This was a more
serious security problem. The flaw was mitigated by requiring
that a link between the card and the bank is established
every time the card is used. Even later this group found
another problem with Chip-and-PIN and ATMs which do not
generate sufficiently random numbers (nonces), on which the
security of the underlying protocols relies.

The problem with all this is that the banks who introduced
Chip-and-PIN managed with the new system to shift the
liability for any fraud and the burden of proof onto the
customer. In the old system, the banks had to prove that the
customer used the card, which they often did not bother to do.
In effect, if fraud occurred the customers were either refunded
fully or lost only a small amount of money. Taking
responsibility for potential fraud was part of the
``business plan'' of the banks and did not reduce their
profits too much.

Since banks managed to successfully claim that their
Chip-and-PIN system is secure, they were able under the new
system to point the finger at the customer when fraud occurred:
the customer must have been negligent in keeping their PIN
secret. The customer had almost no means to defend themselves
in such situations. That is why the work of \emph{ethical}
hackers like Ross Anderson's group was so important: they and
others established that the banks' claim, namely that their
system is secure and it must have been the customer's fault,
was bogus. In 2009, for example, the law changed and the burden
of proof went back to the banks: they now need to prove whether
it was really the customer who used the card or not.

This is a classic example where a security design principle
was violated: namely, the party that is in the position to
improve security also needs to bear the financial losses if
things go wrong. Otherwise, you end up with an insecure
system. In the case of the Chip-and-PIN system, no good security
engineer would claim that it is secure beyond reproach: the
specification of the EMV protocol (underlying Chip-and-PIN) is
some 700 pages long, but still leaves out many things (like
how to implement a good random number generator). No human
being is able to scrutinise such a specification and ensure it
contains no flaws. Moreover, banks can add their own
sub-protocols to EMV. With all the experience we already have,
it was as clear as day that criminals would eventually be able
to poke holes into it and that measures would need to be taken
to address them. However, with how the system was set up, the
banks had no real incentive to come up with a system that is
really secure. Getting the incentives right in favour of
security is often a tricky business.

\subsection*{Of Cookies and Salts}

Let us look at another example which helps us to understand how
passwords should be verified and stored. Imagine you need to
develop a web-application that has the feature of recording
how many times a customer visits a page, for example in order
to give a discount whenever the customer has visited the webpage
some $x$ number of times (say $x$ equals $5$). For a number of
years the webpage of the New York Times operated in this way: it
allowed you to read ten articles per month for free; if
you wanted to read more, you had to pay. There is one more
constraint: we want to store the information about the number
of times a customer has visited inside a cookie.

A typical web-application works as follows: The browser sends
a GET request for a particular page to a server, and the server
answers this request. A simple JavaScript program that realises
a ``hello world'' webpage is as follows:

\begin{center}
\lstinputlisting{../progs/ap0.js}
\end{center}

\noindent The interesting lines are 4 to 7 where the answer
to the GET request is generated\ldots in this case it is just
a simple string. This program runs on the server and is
executed whenever a browser initiates such a GET request.
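For readers without the accompanying \pcode{ap0.js} file at hand,
the following is a minimal sketch of what such a program might look
like, assuming Node.js and its built-in \pcode{http} module (the
actual file used in the lectures may differ in detail and line
numbering):

\begin{lstlisting}[numbers=none]
// minimal "hello world" web server (sketch, assuming Node.js)
var http = require('http');

http.createServer(function (req, res) {
  // generate the answer to the GET request: just a simple string
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World!\n');
}).listen(8000);
\end{lstlisting}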

The feature of interest for our web-application is that the
server, when answering a request, can store some information
on the client. This information is called a \emph{cookie}.
The next time the browser makes another GET request to the
same webpage, this cookie is sent along with the request and
can be read again on the server side. Therefore we can use a
cookie in order to store a counter recording the number of
times a webpage has been visited. This can be realised with
the following small program:

\begin{center}
\lstinputlisting{../progs/ap2.js}
\end{center}

\noindent The overall structure of this code is the same as
the earlier program: Lines 7 to 17 generate the answer to a
GET request. The new part is in Line 8 where we read the
cookie called \pcode{counter}. If present, this cookie will be
sent together with the GET request from the client. The value
of this counter will come in the form of a string, therefore we
use the function \pcode{parseInt} in order to transform it
into an integer. In case the cookie is not present, or has been
deleted, we default the counter to zero. The odd-looking
construction \code{...|| 0} realises this in JavaScript.
In Line 9 we increase the counter by one and store it back
on the client (under the name \pcode{counter}, since potentially
more than one value could be stored). In Lines 10 to 15 we
test whether this counter is greater than or equal to 5 and
send a corresponding message back to the client.
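Again as a hedged sketch only (the actual \pcode{ap2.js} and its
line numbers may differ), the counter cookie could be realised
along the following lines, assuming Node.js' \pcode{http} module:

\begin{lstlisting}[numbers=none]
// counter cookie (sketch, assuming Node.js)
var http = require('http');

http.createServer(function (req, res) {
  // read the cookie called "counter"; default to 0 if absent
  var cookies = req.headers.cookie || '';
  var match   = cookies.match(/counter=(\d+)/);
  var counter = parseInt(match && match[1]) || 0;

  // increase the counter and store it back on the client
  counter = counter + 1;
  res.writeHead(200, {'Set-Cookie': 'counter=' + counter,
                      'Content-Type': 'text/plain'});

  // test whether the counter reached the threshold of 5
  if (counter >= 5) {
    res.end('You qualify for a discount!');
  } else {
    res.end('Visit number ' + counter);
  }
}).listen(8000);
\end{lstlisting}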

Let us step back and analyse this program from a security
perspective. We store a counter in plain text on the client's
browser (which is not under our control at all). Depending on
this value we want to unlock a resource (like a discount) when
it reaches a threshold. If the client deletes the cookie, then
the counter will just be reset to zero. This does not bother
us, because the purported discount will just be granted later.
This does not lose us any (hypothetical) money. What we need
to be concerned about is when a client artificially increases
this counter without having visited our web-page. This is
actually a trivial task for a knowledgeable person, since
there are convenient tools that allow us to set a cookie to an
arbitrary value, for example above our threshold for the
discount. 

There is no real way to prevent this kind of tampering with
cookies, because the whole purpose of cookies is that they are
stored on the client's side, which from the server's
perspective is a potentially hostile environment. What we
need to ensure is the integrity of this counter in this
hostile environment. We could think of encrypting the counter.
But this has two drawbacks to do with the key for the encryption.
If we use a `global' key for all the clients that visit our
site, then we risk that our whole ``business'' might collapse
when this key becomes known to the outside world. Suddenly all
cookies we might have set in the past can be manipulated.
If, on the other hand, we use a ``private'' key for every
client, then we have to solve the problem of securely storing
this key on our server side (obviously we cannot store the key
with the client, because then the client again has all the data
needed to tamper with the counter; and obviously we also cannot
encrypt the key, unless we can solve a chicken-and-egg problem).
So encryption does not seem to solve the problem we face with
the integrity of our counter.

Fortunately, \emph{hash functions} seem to be more suitable for
our purpose. Like encryption, hash functions scramble data,
but in such a way that it is easy to calculate the output of a
hash function from the input. But it is hard (i.e.~practically
impossible) to calculate the input from knowing the output.
Therefore hash functions are often called one-way functions.
There are several such hash functions. For example SHA-1
hashes the string \pcode{"hello world"} to

\begin{center}
\pcode{2aae6c35c94fcfb415dbe95f408b9ce91ee846ed}
\end{center}

\noindent Another handy feature of hash functions is that if
the input changes a little bit, the output changes 
drastically. For example \pcode{"iello world"} produces under 
SHA-1 the output

\begin{center}
\pcode{d2b1402d84e8bcef5ae18f828e43e7065b841ff1}
\end{center}

\noindent That means it is not predictable what the output
will be from input that is ``close by''. 
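To experiment with this, one can compute such digests directly,
for example with Node.js' built-in \pcode{crypto} module (a small
sketch; any SHA-1 implementation will produce the same values):

\begin{lstlisting}[numbers=none]
// computing SHA-1 digests (sketch, assuming Node.js' crypto module)
var crypto = require('crypto');

function sha1(s) {
  return crypto.createHash('sha1').update(s).digest('hex');
}

console.log(sha1("hello world"));
// prints 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
console.log(sha1("iello world"));
// prints d2b1402d84e8bcef5ae18f828e43e7065b841ff1
\end{lstlisting}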

We can use hashes and store in the cookie the value of the
counter together with its hash. We need to store both pieces
of data in such a way that we can extract both components again
(below I will just separate them using a \pcode{"-"}). If we now
read back the cookie when the client visits our webpage, we can
extract the counter, hash it again and compare the result to the
hash value stored inside the cookie. If these hashes disagree,
then we can deduce that the cookie has been tampered with.
Unfortunately, if they agree, we still cannot be entirely sure
that a clever hacker has not tampered with the cookie. The
reason is that the hacker can see the clear-text part of the
cookie, say \pcode{3}, and its hash. It does not take much
trial and error to find out that we used the SHA-1 hash
function and then to craft a cookie accordingly. This is eased
by the fact that for SHA-1 many strings and their corresponding
hash values have been precalculated. Type into Google, for
example, the hash value of \pcode{"hello world"} and you will
pretty quickly find out that it was generated from the string
\pcode{"hello world"}. This defeats the purpose of a hash
function and would not help us with our web-application. The
corresponding attack is called a \emph{dictionary attack}:
hashes are not reversed by brute-force calculations, that is
by trying out all possible combinations, but by looking them
up in a table of precomputed hashes.
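To make the hash-only scheme concrete, here is a sketch of how the
counter and its hash could be combined and checked, reusing the
\pcode{sha1} helper from the sketch above; as just explained, anyone
who guesses the hash function can forge such a cookie:

\begin{lstlisting}[numbers=none]
// hash-only cookie: forgeable once the hash function is guessed
function makeCookie(counter) {
  return counter + "-" + sha1(String(counter));
}

function readCookie(cookie) {
  var parts = cookie.split("-");
  // accept the counter only if the hash matches; otherwise reset to 0
  return sha1(parts[0]) === parts[1] ? parseInt(parts[0]) : 0;
}
\end{lstlisting}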


There is one ingredient missing, which happens to be called a
\emph{salt}. The salt is a random key which is added to the
counter before the hash is calculated. In our case we need to
keep the salt secret. As can be seen in
Figure~\ref{hashsalt},
we now need to extract the cookie data (Line 20). When we
set the new, increased cookie, we add the salt before
hashing (this is done in Line 13).
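A minimal sketch of this salted variant, again reusing the
\pcode{sha1} helper from above (the salt value is made up for
illustration; in practice it would be a randomly generated secret
kept only on the server):

\begin{lstlisting}[numbers=none]
// salted cookie: the secret salt never leaves the server
var salt = "pepperoni";   // hypothetical secret value

function makeSaltedCookie(counter) {
  return counter + "-" + sha1(String(counter) + salt);
}

function readSaltedCookie(cookie) {
  var parts = cookie.split("-");
  // without the salt a client cannot produce a matching hash
  return sha1(parts[0] + salt) === parts[1] ? parseInt(parts[0]) : 0;
}
\end{lstlisting}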

\begin{figure}[p]
\lstinputlisting{../progs/App3.js}
\caption{A version of the web-application where the counter is
  stored in the cookie together with a salted hash.\label{hashsalt}}
\end{figure}




\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: