--- a/handouts/ho07.tex Tue Sep 26 12:03:24 2017 +0100
+++ b/handouts/ho07.tex Tue Sep 26 12:10:41 2017 +0100
@@ -1,9 +1,11 @@
\documentclass{article}
\usepackage{../style}
\usepackage{../graphics}
+\usepackage{../langs}
+\usepackage{../data}
-\begin{document}
-\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015}
+%https://crypto.stanford.edu/cs251/
+%https://programmingblockchain.gitbooks.io/programmingblockchain/content/
%% spying self defence
%%https://ssd.eff.org/en/module/communicating-others
@@ -18,532 +20,1001 @@
%https://fpf.org/wp-content/uploads/Differential-Privacy-as-a-Response-to-the-Reidentification-Threat-Klinefelter-and-Chin.pdf
%http://research.neustar.biz/2014/09/08/differential-privacy-the-basics/
-%=====
-%Tim Greene, Network World, 17 Dec 2015 (via ACM TechNews, 18 Dec 2015)
-%
-%Massachusetts Institute of Technology (MIT) researchers' experimental
-%Vuvuzela messaging system offers more privacy than The Onion Router (Tor) by
-%rendering text messages sent through it untraceable. MIT Ph.D. student
-%David Lazar says Vuvuzela resists traffic analysis attacks, while Tor
-%cannot. The researchers say the system functions no matter how many parties
-%are using it to communicate, and it employs encryption and a set of servers
-%to conceal whether or not parties are participating in text-based dialogues.
-%"Vuvuzela prevents an adversary from learning which pairs of users are
-%communicating, as long as just one out of [the] servers is not compromised,
-%even for users who continue to use Vuvuzela for years," they note. Vuvuzela
-%can support millions of users hosted on commodity servers deployed by a
-%single group of users. Instead of anonymizing users, Vuvuzela prevents
-%outside observers from differentiating between people sending messages,
-%receiving messages, or neither, according to Lazar. The system imposes
-%noise on the client-server traffic which cannot be distinguished from actual
-%messages, and all communications are triple-wrapped in encryption by three
-%servers. "Vuvuzela guarantees privacy as long as one of the servers is
-%uncompromised, so using more servers increases security at the cost of
-%increased message latency," Lazar notes.
-%http://orange.hosting.lsoft.com/trk/click?ref=znwrbbrs9_5-e70bx2d991x066779&
+% hard forks
+% https://github.com/bitcoin/bips/blob/master/bip-0050.mediawiki
+
+% only 25% needed to obtain larger shares of mining
+% http://www.cs.cornell.edu/~ie53/publications/btcProcFC.pdf
-%%%%
-%% canvas tracking
-%%https://freedom-to-tinker.com/blog/englehardt/the-princeton-web-census-a-1-million-site-measurement-and-analysis-of-web-privacy/
-
-%%%
-%% cupit re-identification attack
-%% https://nakedsecurity.sophos.com/2016/05/20/published-personal-data-on-70000-okcupid-users-taken-down-after-dmca-order/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+nakedsecurity+%28Naked+Security+-+Sophos%29
+% re-identification attacks
+% https://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf
-%Differential privacy
-%=====================
-%https://www.wired.com/2016/06/apples-differential-privacy-collecting-data/
-
-%Differential privacy, translated from Apple-speak, is the
-%statistical science of trying to learn as much as possible
-%about a group while learning as little as possible about any
-%individual in it.
+% bit-coin papers
+% https://crypto.stanford.edu/cs251/syllabus.html
-%As Roth notes when he refers to a “mathematical proof,”
-%differential privacy doesn’t merely try to obfuscate or
-%“anonymize” users’ data. That anonymization approach, he
-%argues, tends to fail. In 2007, for instance, Netflix released
-%a large collection of its viewers’ film ratings as part of a
-%competition to optimize its recommendations, removing people’s
-%names and other identifying details and publishing only their
-%Netflix ratings. But researchers soon cross-referenced the
-%Netflix data with public review data on IMDB to match up
-%similar patterns of recommendations between the sites and add
-%names back into Netflix’s supposedly anonymous database.
-
-%As an example of that last method, Microsoft’s Dwork points to
-%the technique in which a survey asks if the respondent has
-%ever, say, broken a law. But first, the survey asks them to
-%flip a coin. If the result is tails, they should answer
-%honestly. If the result is heads, they’re instructed to flip
-%the coin again and then answer “yes” for heads or “no” for
-%tails. The resulting random noise can be subtracted from the
-%results with a bit of algebra, and every respondent is
-%protected from punishment if they admitted to lawbreaking.
-
-%https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf
+% bit coin talk --- at 20:00 mins
+%https://www.usenix.org/conference/lisa16/conference-program/presentation/perlman
-% Windows 10 data send back to Microsoft (Cortana)
-%Here’s a non-exhaustive list of data sent back: location data, text
-%input, voice input, touch input, webpages you visit, and telemetry
-%data regarding your general usage of your computer, including which
-%programs you run and for how long.
-
-% Businesses are already using customised pricing online based on
-% information they can glean about you. It is hard to know how
-% widespread the practice is; companies keep their pricing strategies
-% closely guarded and are wary of the bad PR price discrimination
-% could pose. However, it is clear that a number of large retailers
-% are experimenting with it. Staples, for example, has offered
-% discounted prices based on whether rival stores are within 20 miles
-% of its customers’ location. Office Depot has admitted to using its
-% customers’ browsing history and location to vary its range of offers
-% and products. A 2014 study from Northeastern University found
-% evidence of “steering” or differential pricing at four out of 10
-% general merchandise websites and five out of five travel
-% websites. (Steering is when a company doesn’t give you a customised
-% price, but points you towards more expensive options if it thinks
-% you will pay more.) The online travel company Orbitz raised
-% headlines in 2012 when it emerged that the firm was pointing Mac
-% users towards higher-priced hotel rooms than PC users.
+% In fact, far from freeing people from the oppression of the state,
+% blockchains perversely promise the perfect tool for a fully
+% auditable, tax compliant, cashless society. Similarly, the belief it
+% is an anonymous digital cash has quickly vanished and we are now
+% seeing a large number of analytics companies, set-up specifically to
+% work with law enforcement agencies, to police this new parallel
+% financial system.
+%
+% But today blockchain is riddled with
+% contradictions and misunderstandings. Most of its problems are very
+% fixable, if you want to fix them
-%%% government will overwrite your wishes if it is annoymous
-%% https://www.lightbluetouchpaper.org/2016/12/05/government-u-turn-on-health-privacy/
-
-%% corporate surveilance / privacy - report and CC3C talk
-%% http://crackedlabs.org/en/networksofcontrol
-%% https://media.ccc.de/v/33c3-8414-corporate_surveillance_digital_tracking_big_data_privacy#video&t=2933
-
-\section*{Handout 6 (Privacy)}
+% history of bitcoins
+% https://futurism.com/images/this-week-in-tech-jan-15-22-2016/
-The first motor car was invented around 1886. For ten years,
-until 1896, the law in the UK (and elsewhere) required a
-person to walk in front of any moving car waving a red flag.
-Cars were such a novelty that most people did not know what to
-make of them. The person with the red flag was intended to
-warn the public, for example horse owners, about the impending
-novelty---a car. In my humble opinion, we are at the same
-stage of development with privacy. Nobody really knows what it
-is about or what it is good for. All seems very hazy. There
-are a few laws (e.g.~cookie law, right-to-be-forgotten law)
-which address problems with privacy, but even if they are well
-intentioned, they either back-fire or are already obsolete
-because of newer technologies. The result is that the world of
-``privacy'' looks a little bit like the old Wild
-West---lawless and mythical.
+\begin{document}
+\fnote{\copyright{} Christian Urban, 2014, 2015}
-We would have hoped that after Snowden, Western governments
-would be a bit more sensitive and enlightned about the topic
-of privacy, but this is far from the truth. Ross Anderson
-wrote the following in his blog\footnote{\url{https://www.lightbluetouchpaper.org/2016/02/11/report-on-the-ip-bill/}} about the approach taken in
-the US to lessons learned from the Snowden leaks and contrasts
-this with the new snooping bill that is considered in the UK
-parliament:
+\section*{Handout 7 (Bitcoins)}
-\begin{quote}\it
-``The comparison with the USA is stark. There, all three
-branches of government realised they'd gone too far after
-Snowden. President Obama set up the NSA review group, and
-implemented most of its recommendations by executive order;
-the judiciary made changes to the procedures of the FISA
-Court; and Congress failed to renew the data retention
-provisions in the Patriot Act (aided by the judiciary). Yet
-here in Britain the response is just to take Henry VIII powers
-to legalise all the illegal things that GCHQ had been up to,
-and hope that the European courts won't strike the law down
-yet again.''
-\end{quote}
+In my opinion Bitcoins are an elaborate Ponzi
+scheme\footnote{\url{http://en.wikipedia.org/wiki/Ponzi_scheme}}---still
+the ideas behind them are really beautiful and not too
+difficult to understand. Since many colourful claims about
+Bitcoins float around in the mainstream and not-so-mainstream
+media, it will be instructive to re-examine such claims from a
+more technically informed vantage point. For example, it is
+often claimed that Bitcoins are anonymous and free from any
+potential government meddling. It turns out that the first
+claim ignores a lot of research in de-anonymising social
+networks, and the second underestimates the persuasive means a
+government has at its disposal.
-\noindent Unfortunately, also big organisations besides
-governments seem to take an unenlightened approach to privacy.
-For example, UCAS, a charity set up to help students with
-applying to universities in the UK, has a commercial unit that
-happily sells your email addresses to anybody who forks out
-enough money for bombarding you with spam. Yes, you can opt
-out very often from such ``schemes'', but in case of UCAS any
-opt-out will limit also legit emails you might actually be
-interested in.\footnote{The main objectionable point, in my
-opinion, is that the \emph{charity} everybody has to use for
-HE applications has actually very honourable goals
-(e.g.~assist applicants in gaining access to universities),
-but the small print (or better the link ``About us'') reveals
-they set up their organisation so that they can also
-shamelessly sell the email addresses they ``harvest''.
-Everything is of course very legal\ldots{}ethical?\ldots{}well
-that is in the eye of the beholder. See:
+There are a lot of articles, blogposts, research papers
+etc.~available about Bitcoins. Below I will follow closely the
+very readable explanations from
+
+\begin{center}
+\url{http://www.michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/} \;\;and\smallskip\\
+\url{http://www.imponderablethings.com/2013/07/how-bitcoin-works-under-hood.html}
+\end{center}
-\url{http://www.ucas.com/about-us/inside-ucas/advertising-opportunities}
-or
-\url{http://www.theguardian.com/uk-news/2014/mar/12/ucas-sells-marketing-access-student-data-advertisers}}
+\noindent The latter also contains a link to a nice youtube
+video about the technical details behind Bitcoins. I will
+also use some of their pictures.
-Another example: Verizon, an ISP who is supposed to provide
-you just with connectivity, has found a ``nice'' side-business
-too: When you have enabled all privacy guards in your browser
-(the few you have at your disposal), Verizon happily adds a
-kind of cookie to your
-HTTP-requests.\footnote{\url{http://webpolicy.org/2014/10/24/how-verizons-advertising-header-works/}}
-As shown in the picture below, this cookie will be sent to
-every web-site you visit. The web-sites then can forward the
-cookie to advertisers who in turn pay Verizon to tell them
-everything they want to know about the person who just made
-this request, that is you.
-
+Let us start with the question of who invented Bitcoins. You
+could not make up the answer: we actually do not know who
+the inventor is. All we know is that the first paper
+
\begin{center}
-\includegraphics[scale=0.16]{../pics/verizon.png}
+\url{https://bitcoin.org/bitcoin.pdf}
\end{center}
-\noindent How disgusting! Even worse, Verizon is not known for
-being the cheapest ISP on the planet (completely the
-contrary), and also not known for providing the fastest
-possible speeds, but rather for being among the few ISPs in
-the US with a quasi-monopolistic ``market distribution''.
+\noindent is signed by Satoshi Nakamoto, which however is
+likely only a pen name. There is a lot of speculation about who
+could be the inventor, or inventors, but we simply do not
+know. This part of Bitcoins is definitely anonymous so far.
+The paper above is from the end of 2008; the first Bitcoin
+transaction was made in January 2009. The rules in Bitcoin are
+set up so that there will only ever be 21 Million Bitcoins
+with the maximum reached around the year 2140. Currently there
+are already 11 Million Bitcoins in `existence'. Contrast this
+with traditional fiat currencies where money can be printed
+almost at will. The smallest unit of a Bitcoin is called a
+Satoshi, which is $10^{-8}$ of a Bitcoin. For comparison,
+a Penny is $10^{-2}$ of a Pound.
+
+The two main cryptographic building blocks of Bitcoins are
+cryptographic hashing functions (SHA-256) and public-private
+keys using an elliptic-curve scheme for digital
+signatures (ECDSA). Hashes are used to generate `fingerprints' of data
+that ensure integrity (absence of tampering). Public-private
+keys are used for signatures. For example sending a message,
+say $msg$, together with its version encrypted with the private key
+
+\[
+msg, \{msg\}_{K^{priv}}
+\]
+
+\noindent allows everybody with access to the corresponding
+public key $K^{pub}$ to verify that the message came from the
+person who knew the private key. Signatures are used in
+Bitcoins for verifying the addresses where the Bitcoins are
+sent from. Addresses in Bitcoins are essentially (hashes of) the public
+keys. There are $2^{160}$ possible addresses, which is such a
+vast number that there is not even a check for duplicates, or
+already used addresses. If you start with a random number to
+generate a public-private key pair it is very unlikely that
+you step on somebody else's toes. Compare this with the
+email-addresses you wanted to register with, say
+Gmail, but which are always already taken.
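+
+\noindent To get a feeling for how unlikely such a clash is, here is a
+small back-of-the-envelope sketch in Python. It assumes that addresses
+are drawn uniformly at random from the $2^{160}$ possibilities and uses
+the simple union bound (number of pairs of addresses divided by the
+size of the address space). The function name
+\texttt{collision\_bound} and the figure of $10^{12}$ generated
+addresses are chosen purely for illustration.

```python
from fractions import Fraction

ADDRESS_BITS = 160  # size of the address space mentioned in the text


def collision_bound(n_addresses: int, bits: int = ADDRESS_BITS) -> Fraction:
    """Union bound on the chance that any two of n random addresses coincide.

    Each of the n*(n-1)/2 pairs collides with probability 1/2**bits,
    so the overall probability is at most the sum over all pairs.
    """
    pairs = n_addresses * (n_addresses - 1) // 2
    return Fraction(pairs, 2 ** bits)


if __name__ == "__main__":
    # Hypothetical scenario: a trillion addresses generated at random.
    bound = collision_bound(10 ** 12)
    print(f"collision probability is at most {float(bound):.2e}")
```

+\noindent Even under this generous assumption about the number of
+addresses, the bound stays far below anything worth checking for.
+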
-Well, we could go on and on\ldots{}and that has not even
-started us yet with all the naughty things NSA \& Friends are
-up to. Why does privacy actually matter? Nobody, I think, has
-a conclusive answer to this question yet. Maybe the following
-four notions help with clarifying the overall picture
-somewhat:
+One major difference between Bitcoins and traditional banking
+is that there is no single place, or small number of places, that records the
+balance on your account. Traditional banking involves a
+central ledger which specifies the current balance in each
+account, for example
+
+\begin{center}
+\begin{tabular}{l|r}
+account owner & balance\\\hline
+Alice & \pounds{10.01}\\
+Bob & \pounds{4.99}\\
+Charlie & -\pounds{1.23}\\
+Eve & \pounds{0.00}
+\end{tabular}
+\end{center}
+
+\noindent Bitcoins work differently in that there is no such
+central ledger, but instead a public record of all
+transactions ever made. This means spending money corresponds
+to sending messages of the (oversimplified) form
+
+\begin{equation}
+\{\text{I, Alice, am giving Bob one Bitcoin.}\}_{K^{priv}_{Alice}}
+\end{equation}
+
+\noindent These messages, called transactions, are the only
+data that is ever stored in the Bitcoin system (we will come
+to the precise details later on). The transactions are
+encrypted (that is, signed) with Alice's private key so that everybody,
+including Bob, can use Alice's public key $K^{pub}_{Alice}$ to
+verify that this message really came from Alice, or more
+precisely from the person who knows $K^{priv}_{Alice}$.
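+
+\noindent The idea of signing with $K^{priv}$ and checking with
+$K^{pub}$ can be tried out with a toy example. Note that the sketch
+below does \emph{not} use the elliptic-curve scheme of Bitcoin: it
+assumes textbook RSA with tiny example numbers, simply because that
+fits into a few lines. The names \texttt{fingerprint}, \texttt{sign}
+and \texttt{verify} are made up for this illustration, and keys of this
+size offer no security whatsoever.

```python
import hashlib

# Toy textbook-RSA parameters (an assumption for illustration only;
# Bitcoin itself uses elliptic-curve signatures and much larger keys).
P, Q = 61, 53
N = P * Q    # public modulus (3233)
E = 17       # public exponent, part of the public key
D = 2753     # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def fingerprint(msg: str) -> int:
    """SHA-256 fingerprint of msg, reduced modulo N so the toy key can sign it."""
    digest = hashlib.sha256(msg.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N


def sign(msg: str) -> int:
    """'Encrypt' the fingerprint with the private key."""
    return pow(fingerprint(msg), D, N)


def verify(msg: str, signature: int) -> bool:
    """Anybody who knows the public key (N, E) can run this check."""
    return pow(signature, E, N) == fingerprint(msg)


if __name__ == "__main__":
    txn = "I, Alice, am giving Bob one Bitcoin."
    sig = sign(txn)
    print("genuine signature accepted:", verify(txn, sig))
    print("altered signature accepted:", verify(txn, (sig + 1) % N))
```

+\noindent Only the holder of the private exponent can produce a value
+that passes the check, while the check itself needs just the public
+part of the key.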
+
+The problem with such messages in a distributed system is
+the following: what happens if Bob receives 10, say, of these transactions?
+Did Alice intend to send him 10 Bitcoins, or did the message
+get duplicated, for example by an attacker re-playing a sniffed
+message? What is needed is a kind of serial number for such
+transactions. This means transaction messages should look more like
+
+\begin{center}
+$\{\text{I, Alice, am giving Bob Bitcoin \#1234567.}\}_{K^{priv}_{Alice}}$
+\end{center}
+
+\noindent There are two difficulties, however, that need to be
+solved with serial numbers. One is who assigns serial
+numbers to Bitcoins; the other is how Bob can verify that Alice
+actually owns the Bitcoin she is paying him with. In a system with a bank
+as trusted third-party, Bob could do the following:
\begin{itemize}
-\item \textbf{Secrecy} is the mechanism used to limit the
- number of principals with access to information (e.g.,
- cryptography or access controls). For example I better
- keep my password secret, otherwise people from the wrong
- side of the law might impersonate me.
-
-\item \textbf{Confidentiality} is the obligation to protect
- the secrets of other people or organisations (secrecy
- for the benefit of an organisation). For example as a
- staff member at King's I have access to data, even
- private data, I am allowed to use in my work but not
- allowed to disclose to anyone else.
-
-\item \textbf{Anonymity} is the ability to leave no evidence of
- an activity (e.g., sharing a secret). This is not equal
- with privacy---anonymity is required in many
- circumstances, for example for whistle-blowers,
- voting, exam marking and so on.
-
-\item \textbf{Privacy} is the ability or right to protect your
- personal secrets (secrecy for the benefit of an
- individual). For example, in a job interview, I might
- not like to disclose that I am pregnant, if I were a
- woman, or that I am a father. Lest they might not hire
- me. Similarly, I might not like to disclose my location
- data, because thieves might break into my house if they
- know I am away at work. Privacy is essentially
- everything which ``shouldn't be anybody's business''.
-
+\item Bob asks the bank whether the Bitcoin with that serial
+ number belongs to Alice and Alice hasn't already spent
+ this Bitcoin.
+\item If yes, then Bob tells the bank he accepts this Bitcoin.
+ The bank updates the records to show that the Bitcoin
+ with that serial number is now in Bob's possession and
+ no longer belongs to Alice.
\end{itemize}
-\noindent While this might provide us with some rough
-definitions, the problem with privacy is that it is an
-extremely fine line what should stay private and what should
-not. For example, since I am working in academia, I am every
-so often very happy to be a digital exhibitionist: I am very
-happy to disclose all `trivia' related to my work on my
-personal web-page. This is a kind of bragging that is normal
-in academia (at least in the field of CS), even expected if
-you look for a job. I am even happy that Google maintains a
-profile about all my academic papers and their citations.
+\noindent But for this banks would need to be trusted and
+would also be an easy target for any government interference,
+for example. Think of the early days of music sharing where
+the company Napster was the trusted third-party but also the single point of ``failure'' which
+was taken offline by law enforcement. Bitcoin is more like a
+system such as BitTorrent without a single central entity that
+can be taken offline.\footnote{There is some Bitcoin
+infrastructure that is not so immune from being taken offline:
+for example Bitcoin exchanges, HQs of Bitcoin mining pools,
+Bitcoin developers and so on.}
+
+Bitcoins solve the problem of not being able to rely on a bank
+by making everybody the ``bank''. Everybody who cares can have
+the entire transaction history starting with the first
+transaction made in January 2009. This history of transactions
+is called the \emph{blockchain}. Bob, for example, can use his
+copy of the blockchain for determining whether Alice owned the
+Bitcoin he received, and if she did, he transmits the message
+that he owns it now to every other participant on the Bitcoin
+network. An illustration of a three-block segment of the
+blockchain is (simplified) as follows
+
+\begin{equation}
+\includegraphics[scale=0.4]{../pics/bitcoinblockchain0.png}
+\label{segment}
+\end{equation}
+
+\noindent The chain grows with time. Each block contains a
+list of individual transactions, written txn in the picture
+above, and also a reference to the previous block, written
+prev. The data in a block (txn's and prev) is hashed so that
+the reference and transactions in them cannot be tampered
+with. This hash is also the unique serial number of each
+block. Since this previous-block-reference is also part of the
+hash, the whole chain is robust against tampering. I let you
+think about why this is the case\ldots{}But does it actually
+eliminate all possibilities of fraud?
+
+We can check the consistency of the blockchain by checking
+whether all the references and hashes are correctly recorded.
+I have not tried it myself, but it is said that with the
+current amount of data (appr.~12GB) it takes roughly a day to
+check the consistency of the blockchain on a normal computer.
+Fortunately this ``extended'' consistency check usually only
+needs to be done once. Afterwards the blockchain only needs to
+be updated consistently.
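+
+\noindent The consistency check can be sketched in a few lines of
+Python. This is only a simplified model: the block layout (a dictionary
+with the fields \texttt{prev} and \texttt{txns}), the JSON encoding and
+the function names are assumptions made for this illustration and not
+the real Bitcoin data format. The checker also assumes that we already
+trust the hash of the newest block, called \texttt{tip} below.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder reference used by the very first block


def block_hash(block: dict) -> str:
    """Hash the transactions together with the reference to the previous block."""
    payload = json.dumps({"prev": block["prev"], "txns": block["txns"]},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_chain(txn_lists):
    """Build a chain in which every block stores the hash of its predecessor."""
    chain, prev = [], GENESIS_PREV
    for txns in txn_lists:
        block = {"prev": prev, "txns": list(txns)}
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_consistent(chain, tip: str) -> bool:
    """Recompute every hash and compare it with the recorded references."""
    expected = GENESIS_PREV
    for block in chain:
        if block["prev"] != expected:
            return False
        expected = block_hash(block)
    return expected == tip


if __name__ == "__main__":
    chain = make_chain([["txn a", "txn b"], ["txn c"], ["txn d", "txn e"]])
    tip = block_hash(chain[-1])
    print("untouched chain consistent:", is_consistent(chain, tip))
    chain[0]["txns"][0] = "forged txn"
    print("tampered chain consistent:", is_consistent(chain, tip))
```

+\noindent Changing a transaction in an old block changes that block's
+hash, so the reference stored in the following block no longer matches.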
+
+Recall I wrote earlier that Bitcoins do not maintain a ledger,
+which lists all the current balances in each account. Instead
+only transactions are recorded. While a current balance of an
+account is not immediately available, it is possible to
+extract from the blockchain a transaction graph that looks
+like the picture shown in Figure~\ref{txngraph}. Each
+rectangle represents a single transaction. Take for example
+the rightmost lower transaction from Charles to Emily. This
+transaction has as receiver the address of Emily and as the
+sender the address of Charles. In this way no Bitcoins can
+appear out of thin air (we will discuss later how Bitcoins are
+actually generated). If Charles did not have a transaction of
+at least the amount he wants to give Emily to his name
+(i.e.~sent to an address with his public-private key) then
+there is no way he can make a payment to Emily. Equally, if
+now Emily wants to pay for a coffee, say, with the Bitcoin she
+received from Charles she can essentially only forward the
+message she received. The only slight complication with this
+setup in Bitcoins is that ``incoming'' Bitcoins can be
+combined in a transaction and ``outgoing'' Bitcoins can be
+split. For example in the leftmost upper transactions in
+Figure~\ref{txngraph}, Fred makes a payment to Alice. But this
+payment (or transaction) combines the Bitcoins that were sent
+by Jane to Fred and also by Juan to Fred. This allows you to
+``consolidate'' your funds: if it were only possible to split
+transactions, then the amounts would get smaller and smaller.
+
+In Bitcoins you have the ability both to combine incoming
+transactions and to split outgoing transactions to
+potentially more than one receiver. The latter is also needed.
+Consider again the rightmost transactions in
+Figure~\ref{txngraph} and suppose Alice is a coffeeshop owner
+selling coffees for 1 Bitcoin. Charles received a transaction
+from Zack over 5 Bitcoins, say. How does Charles pay for the
+coffee? There is no explicit notion of \emph{change} in the
+Bitcoin system. What Charles has to do instead is to make one
+single transaction with 1 Bitcoin to Alice and with 4 Bitcoins
+going back to himself, which then Charles can use to give to
+Emily, for example.
+
+\begin{figure}[t]
+\begin{center}
+\includegraphics[scale=0.4]{../pics/blockchain.png}
+\end{center}
+\caption{Transaction graph that is implicitly recorded in the
+public blockchain.\label{txngraph}}
+\end{figure}
+
+Let us consider another example. Suppose Emily received 4
+Bitcoins from Charles and independently received another
+transaction (not shown in the picture) that sends 6 Bitcoins
+to her. If she now wants to buy a coffee from Alice for 1
+Bitcoin, she has two possibilities: She could just forward the
+transaction from Charles over 4 Bitcoins to Alice split in
+such a way that Alice receives 1 Bitcoin and Emily sends the
+remaining 3 Bitcoins back to herself. In this case she would
+now be in the possession of two unspent Bitcoin transactions,
+one over 3 Bitcoins and the independent one over 6 Bitcoins.
+Or, Emily could combine both transactions (one over 4 Bitcoins
+from Charles and the independent one over 6 Bitcoins) and then
+split this amount with 1 Bitcoin going to Alice and 9 Bitcoins
+going back to herself.
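+
+\noindent Combining and splitting can be modelled with a small Python
+sketch. It is a deliberately simplified picture: the class
+\texttt{Ledger} and its methods are hypothetical names, signatures are
+left out, and the sketch assumes that the incoming amounts must equal
+the outgoing amounts exactly (transaction fees are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    owner: str
    amount: int  # whole Bitcoins, to keep the example readable


class Ledger:
    """Keeps track of unspent transaction outputs only (no account balances)."""

    def __init__(self):
        self.unspent = {}
        self._next_id = 0

    def _add(self, owner: str, amount: int) -> int:
        output_id = self._next_id
        self._next_id += 1
        self.unspent[output_id] = Output(owner, amount)
        return output_id

    def receive(self, owner: str, amount: int) -> int:
        """Stand-in for an earlier transaction paying `owner` (not modelled)."""
        return self._add(owner, amount)

    def spend(self, sender: str, input_ids, payments):
        """Combine the given inputs and split them into the given payments."""
        if len(set(input_ids)) != len(input_ids):
            raise ValueError("the same input is listed twice")
        inputs = [self.unspent[i] for i in input_ids]  # KeyError if already spent
        if any(out.owner != sender for out in inputs):
            raise ValueError("sender does not own all inputs")
        if sum(out.amount for out in inputs) != sum(a for _, a in payments):
            raise ValueError("inputs and outputs do not add up")
        for i in input_ids:
            del self.unspent[i]
        return [self._add(owner, amount) for owner, amount in payments]

    def balance(self, owner: str) -> int:
        return sum(o.amount for o in self.unspent.values() if o.owner == owner)


if __name__ == "__main__":
    ledger = Ledger()
    from_charles = ledger.receive("Emily", 4)
    independent = ledger.receive("Emily", 6)
    # Second possibility from the text: combine both, pay 1, return 9 as change.
    ledger.spend("Emily", [from_charles, independent],
                 [("Alice", 1), ("Emily", 9)])
    print(ledger.balance("Emily"), ledger.balance("Alice"))
```

+\noindent The first possibility from the text corresponds to calling
+\texttt{spend} with only the 4-Bitcoin input and the payments 1 to
+Alice and 3 back to Emily.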
+
+I think this is a good time for you to pause and let this
+concept of transactions really sink in\ldots{}You should
+come to the conclusion that there is really no need for a central ledger and no
+need for an account balance as familiar from traditional
+banking. The closest thing Bitcoin has to offer to the notion
+of a balance in a bank account are the unspent transactions
+that a person (more precisely a public-private key address)
+received. That means transactions that can still be forwarded.
+
+After the pause also consider the fact that whatever
+transaction is recorded in the blockchain will be in the
+``historical record'' for the Bitcoin system. If a transaction
+says 1 Bitcoin goes from address $A$ to address $B$, then this
+is what will be---$B$ has then the possibility to spend the
+corresponding Bitcoins, whether the transaction was done
+fraudulently or not. There is no exception to this rule.
+Interestingly this is also how Bitcoins can get lost: One
+possibility is that you send Bitcoins to an address for which
+nobody has generated a private key, for example because of a
+typo in the address field---bad luck for fat
+fingers\footnote{\url{http://en.wikipedia.org/wiki/Typographical_error}}
+in the Bitcoin system. The reason is that nobody has a private
+key for this erroneous address and consequently cannot forward
+the transaction anymore. Another possibility is that you
+forget your private key and you had messages forwarded to the
+corresponding public key. Also in this case bad luck: you will
+never be able to forward this message again, because you will
+not be able to form a valid message that sends this to
+somebody else (we will see the details of this later). But
+this is also how you can get robbed of your Bitcoins. By
+old-fashioned hacking-into-a-computer crime, for example, an
+attacker might get hold of your private key and then quickly
+forward the Bitcoins that are in your name to an address the
+attacker controls. You will never again have access to these
+Bitcoins, because for the Bitcoin system they are assumed to
+be spent. And remember with Bitcoins you cannot appeal to any
+higher authority. Once the Bitcoins are gone, they are gone.
+This is very different in traditional banking where at least
+you can try to harass the bank to roll back the transaction.
+
+This brings us back to the problem of double spending. Suppose Bob
+is a merchant. How can he make sure that Alice does not cheat
+him? She could for example send a transaction to Bob, but also
+forward the ``same'' transaction to Charlie, or even herself.
+If Alice manages to get the second transaction into the
+blockchain, Bob will be cheated out of his money. The problem
+in such conflicting situations is how the network should
+update its blockchain. You might end up with a picture like
+this
+
+\begin{center}
+\includegraphics[scale=0.4]{../pics/bitcoindisagreement.png}
+\end{center}
-On the other hand I would be very irritated if anybody I do
-not know had a too close look on my private live---it
-shouldn't be anybody's business. The reason is that knowledge
-about my private life can often be used against me. As mentioned
-above, public location data might mean I get robbed. If
-supermarkets build a profile of my shopping habits, they will
-use it to \emph{their} advantage---surely not to \emph{my}
-advantage. Also whatever might be collected about my life will
-always be an incomplete, or even misleading, picture. For
-example I am pretty sure my creditworthiness score was
-temporarily(?) destroyed by not having a regular income in
-this country (before coming to King's I worked in Munich for
-five years). To correct such incomplete or flawed credit
-history data there is, since recently, a law that allows you
-to check what information is held about you for determining
-your creditworthiness. But this concerns only a very small
-part of the data that is held about me/you. Also
-what about cases where data is wrong or outdated (but do we
-need a right-to be forgotten).
+\noindent where Alice convinced some part of the ``world''
+that she is still the owner of the Bitcoin and some other part
+of the ``world'' thinks it's Bob's. How should such a
+disagreement be resolved? This is actually the main hurdle
+where Bitcoin really innovated. The answer is that Bob needs
+to convince ``enough'' people on the network that the
+transaction from Alice to him is legit.
+
+What does, however, ``enough'' mean in a distributed system?
+Suppose Alice sets up a network of a billion, say, puppet identities,
+and whenever Bob tries to convince others, or validate, that he is
+the rightful owner of the Bitcoin, the puppet identities
+agree. Bob would then have no reason to not give Alice her
+coffee. But behind his back she has convinced everybody else
+on the network that she is still the rightful owner of the
+Bitcoin. After being outvoted, Bob would be a tad peeved.
+
+The reflex reaction to such a situation would be to make the
+process of validating a transaction as cheap as possible. The
+intention is that Bob will easily get enough peers to agree with him
+that he is the rightful owner. But such a solution always has
+the limitation that Alice can set up an even bigger network of
+puppet identities. The really cool idea of Bitcoin is to go
+in the other direction and make the process of transaction
+validation (artificially) as expensive as possible, but reward
+people for helping with the validation. This is really a novel
+and counterintuitive idea that makes the whole system of
+Bitcoins work so beautifully.
+
+\subsubsection*{Proof-of-Work Puzzles}
+
+In order to make the process of transaction validation
+difficult, Bitcoin uses a kind of puzzle. Solving the puzzles
+is called \emph{Bitcoin mining}, where whoever solves a puzzle
+will be awarded some Bitcoins. At the beginning this was 50
+Bitcoins, but the rules of Bitcoin are set up such that this
+amount halves every 210,000 blocks (roughly every four years). Currently you
+will be awarded 25 Bitcoins for solving a puzzle. Because the
+amount will halve again and then later again and again, around
+the year 2140 it will go below the level of 1 Satoshi. In that
+event no new Bitcoins will ever be created again and the
+amount of Bitcoins stays fixed. There will be still an
+incentive to help with validating transactions, because there
+is the possibility in Bitcoins to offer a transaction fee to
+whoever solves a puzzle. At the moment this fee is usually set
+to 0, since the incentive for miners is the 25 Bitcoins that
+are currently awarded for solving puzzles.
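+
+\noindent The halving rule can be checked with a short calculation.
+The Python sketch below assumes that the reward is counted in whole
+Satoshis, that halving rounds down, and that the reward halves every
+210,000 blocks; the function names are made up for this illustration.

```python
SATOSHI_PER_BITCOIN = 10 ** 8
BLOCKS_PER_HALVING = 210_000
INITIAL_REWARD = 50 * SATOSHI_PER_BITCOIN  # reward in Satoshis at the start


def block_reward(height: int) -> int:
    """Reward in Satoshis for the block at the given height (rounded down)."""
    halvings = height // BLOCKS_PER_HALVING
    return INITIAL_REWARD >> halvings


def reward_eras() -> int:
    """Number of halving periods in which the reward is still at least 1 Satoshi."""
    eras = 0
    while INITIAL_REWARD >> eras > 0:
        eras += 1
    return eras


def total_supply() -> int:
    """Sum of all rewards ever paid out, in Satoshis."""
    return sum((INITIAL_REWARD >> era) * BLOCKS_PER_HALVING
               for era in range(reward_eras()))


if __name__ == "__main__":
    print("eras with a non-zero reward:", reward_eras())
    print("total supply in Bitcoins:", total_supply() / SATOSHI_PER_BITCOIN)
```

+\noindent Under these assumptions the total stays just below 21 Million
+Bitcoins, and with roughly four years per halving period the reward
+drops below 1 Satoshi about 132 years after January 2009.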
+
+What do the puzzles that miners have to solve look like? The
+puzzles can be illustrated roughly as follows: Given a string,
+say \code{"Hello, world!"}, what is the salt so that the hash
+starts with a long run of zeros? Let us look at a concrete
+example. Recall that Bitcoins use the hash-function SHA-256.
+Suppose we call this hash function \code{h}, then we could try
+the salt \code{0} as follows:
+
+\begin{quote}
+\code{h("Hello, world!0") =}\\
+\mbox{}\quad\footnotesize\pcode{1312af178c253f84028d480a6adc1e25e81caa44c749ec81976192e2ec934c64}
+\end{quote}
+
+\noindent OK, this does not have any leading zeros at all. We could
+next try the salt \code{1}:
+
+\begin{quote}
+\code{h("Hello, world!1") =}\\
+\mbox{}\quad\footnotesize\pcode{e9afc424b79e4f6ab42d99c81156d3a17228d6e1eef4139be78e948a9332a7d8}
+\end{quote}
+
+\noindent Again this hash value does not contain any leading
+zeros. We could now try out every salt until we reach
+
+\begin{quote}
+\code{h("Hello, world!4250") =}\\
+\mbox{}\quad\footnotesize\pcode{0000c3af42fc31103f1fdc0151fa747ff87349a4714df7cc52ea464e12dcd4e9}
+\end{quote}
+
+\noindent where we have four leading zeros. If four zeros are
+enough, then the puzzle would be solved with this salt. The
+point is that we can very quickly check whether a salt solves
+a puzzle, but it is hard to find one. Latest research suggests
+it is an NP-problem. If we want the output hash value to begin
+with 10 zeros, say, then we will, on average, need to try
+$16^{10} \approx 10^{12}$ different salts before we find a
+suitable one.
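+
+The search for a salt can be sketched in Python as follows. The
+sketch assumes, as in the examples above, that the salt is simply
+appended to the string as a decimal number; the helper names are
+made up for this illustration.
+
```python
import hashlib

def h(s):
    """SHA-256 of a string, as a hexadecimal string."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

def solves(message, salt, zeros):
    """Checking a candidate salt needs only a single hash."""
    return h(message + str(salt)).startswith("0" * zeros)

def find_salt(message, zeros):
    """Finding a salt means trying 0, 1, 2, ... until one works."""
    salt = 0
    while not solves(message, salt, zeros):
        salt += 1
    return salt

salt = find_salt("Hello, world!", 4)
print(salt, h("Hello, world!" + str(salt)))
```
+
+\noindent Checking a salt costs one hash, while finding one costs a
+loop whose expected length grows by a factor of 16 with every
+additional zero that is required.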
+
+In Bitcoins the puzzles are not solved according to how many
+leading zeros a hash-value has, but rather whether it is below
+a \emph{target}. The hardness of the puzzle can actually be
+controlled by changing the target according to the
+computational power available. The adjustment of the
+hardness of the problems is done every 2016 blocks
+(appr.~every two weeks). The aim of the adjustment is that on
+average the Bitcoin network will solve a puzzle
+within 10 minutes.
+
+\begin{center}
+\includegraphics[scale=0.37]{../pics/blockchainsolving.png}
+\end{center}
+
+\noindent It could be solved quicker, but equally it could
+take longer, but on average after 10 minutes somebody on the
+network will have found a solution.
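+
+How a target and its adjustment might look can be sketched in
+Python. This is a simplification and not the actual Bitcoin code: it
+assumes an adjustment every 2016 blocks, an intended block time of
+10 minutes and a limit of a factor of 4 per adjustment; the function
+names are made up.
+
```python
import hashlib

RETARGET_BLOCKS = 2016            # blocks between two adjustments (assumed)
BLOCK_TIME = 10 * 60              # intended seconds per block

def below_target(data, target):
    """A puzzle is solved if the hash, read as a number, is below the target."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") < target

def retarget(old_target, actual_seconds):
    """Scale the target by how long the last 2016 blocks really took."""
    expected = RETARGET_BLOCKS * BLOCK_TIME
    # assumption: limit a single adjustment to a factor of 4 either way
    actual = min(max(actual_seconds, expected // 4), expected * 4)
    return old_target * actual // expected

# Four leading hex zeros correspond to a target of 2^(256-16).
target = 2 ** (256 - 16)
print(below_target(b"Hello, world!4250", target))
# Blocks arrived twice as fast as intended, so the target is halved.
print(retarget(target, RETARGET_BLOCKS * BLOCK_TIME // 2) == target // 2)
```
+
+\noindent A smaller target means a harder puzzle: if the network
+solved the last blocks too quickly, the target shrinks, and if it
+was too slow, the target grows.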
+
+Remember that the puzzles are a kind of proof-of-work that
+makes the validation of transactions artificially expensive.
+Consider the following picture with a blockchain and some
+unconfirmed transactions.
+
+\begin{equation}
+\includegraphics[scale=0.38]{../pics/bitcoin_unconfirmed.png}
+\label{unconfirmed}
+\end{equation}
+
+\noindent The puzzle is stated as follows: There are some
+unconfirmed transactions. Choosing some of them, the miner
+(i.e.~the person/computer that tries to solve a puzzle) will
+form a putative block to be added to the blockchain. This
+putative block will contain the transactions and the reference
+to the previous block. The serial number of such a block is
+simply the hash of all the data. The puzzle can then be stated
+as follows: taking the data of the block as the ``string'',
+which salt is needed such that the hash value is below the
+target? Other miners will choose different transactions and
+therefore work on a slightly different putative block and
+puzzle.
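+
+A toy version of this puzzle can be sketched in Python. The sketch
+assumes a block is just a dictionary that is serialised with JSON and
+hashed once with SHA-256; real Bitcoin blocks have a different binary
+layout, and all names and transactions below are made up.
+
```python
import hashlib, json

def block_hash(block):
    """Hash of all the data in a block (its 'serial number')."""
    data = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def mine(previous_hash, transactions, target):
    """Try salts until the hash of the putative block is below the target."""
    block = {"previous": previous_hash, "transactions": transactions, "salt": 0}
    while int(block_hash(block), 16) >= target:
        block["salt"] += 1
    return block

easy_target = 2 ** (256 - 12)      # toy target: about 1 in 4096 hashes succeeds
first = mine("0" * 64, ["reward -> Alice"], easy_target)
block1 = mine(block_hash(first), ["Alice -> Bob: 1"], easy_target)
other = mine(block_hash(first), ["Carol -> Dave: 2"], easy_target)
print(block1["salt"], other["salt"])
```
+
+\noindent The last two blocks extend the same predecessor but
+contain different transactions, so the two miners work on different
+puzzles and will in general need different salts.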
+
+The intention of the proof-of-work puzzle is that the
+blockchain is at every given moment linearly ordered, see the
+picture shown in \eqref{unconfirmed}. If we do not have such a
+linear ordering at any given moment then it may not be clear
+who owns which Bitcoins. Assume a miner David is lucky and
+finds a suitable salt to confirm some transactions. Should he
+celebrate? Not yet. Typically the blockchain will look as
+follows
+
+\begin{center}
+\includegraphics[scale=0.65]{../pics/block_chain1.png}
+\end{center}
+
+\noindent But every so often there will be a fork
+
+\begin{center}
+\includegraphics[scale=0.65]{../pics/block_chain_fork.png}
+\end{center}
+
+\noindent What should be done in this case? Well, the tie is
+broken if another block is solved, like so:
+
+\begin{center}
+\includegraphics[scale=0.4]{../pics/bitcoin_blockchain_branches.png}
+\end{center}
+
+\noindent The rule in Bitcoins is: If a fork occurs, people on
+the network keep track of all forks (they can see). But at any
+given time, miners only work to extend whichever fork is
+longest in their copy of the block chain. Why should miners
+work on the longest fork? Well, their incentive is to mine
+Bitcoins. If somebody else already solved a puzzle, then it
+makes more sense to work on a new puzzle and obtain the
+Bitcoins for solving that puzzle, rather than waste efforts on
+a fork that is shorter and therefore less likely to be
+``accepted''. Note that whoever solved a puzzle on the
+``losing'' fork will actually not get any Bitcoins as reward.
+Tough luck.
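+
+The rule for choosing a fork can be written down in two lines of
+Python. In this sketch a fork is assumed to be just a list of block
+names, and length is measured by counting blocks; the names are made
+up.
+
```python
def longest_fork(forks):
    """Pick the fork a miner would extend: the one with the most blocks."""
    return max(forks, key=len)

fork_a = ["block1", "block2", "block3a"]
fork_b = ["block1", "block2", "block3b", "block4b"]
print(longest_fork([fork_a, fork_b])[-1])   # block4b
```
+
+\noindent If two forks have the same length, \pcode{max} returns the
+one that comes first in the list, so in this sketch a tie is only
+broken once one fork gets ahead.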
-To see how private matter can lead really to the wrong
-conclusions, take the example of Stephen Hawking: When he was
-diagnosed with his disease, he was given a life expectancy of
-two years. If employers would know about such problems, would
-they have employed Hawking? Now, he is enjoying his 70+
-birthday. Clearly personal medical data needs to stay private.
-
-To cut a long story short, I let you ponder about the two
-statements which are often voiced in discussions about privacy:
+\subsubsection*{Alice against the Rest of the World}
-\begin{itemize}
-\item \textit{``You have zero privacy anyway. Get over
-it.''}\\
-\mbox{}\hfill{}{\small{}(by Scott Mcnealy, former CEO of Sun)}
+Let us see how the blockchain and the proof-of-work puzzles
+avoid the problem of double spend. If Alice wants to cheat
+Bob, she would need to pull off the following ploy:
-\item \textit{``If you have nothing to hide, you have nothing
-to fear.''}
-\end{itemize}
-
-\noindent If you like to watch a movie which has this topic as
-its main focus I recommend \emph{Gattaca} from
-1997.\footnote{\url{http://www.imdb.com/title/tt0119177/}} If
-you want to read up on this topic, I can recommend the
-following article that appeared in 2011 in the Chronicle of
-Higher Education:
+\begin{center}
+\includegraphics[scale=0.4]{../pics/bitcoin_blockchain_double_spend.png}
+\end{center}
-\begin{center}
-\url{http://chronicle.com/article/Why-Privacy-Matters-Even-if/127461/}
-\end{center}
+\noindent Alice makes a transaction to Bob for paying, for
+example, for an online order. This transaction is confirmed,
+or validated, in block 2. Bob ships the goods around block 4.
+At this moment, Alice needs to get into action and try to
+validate the fraudulent transaction to herself instead. From
+then on she is in a race against all the computing power
+of the ``rest of the world'', because the incentive of the
+rest of the world is to work on the longest chain, that is the
+one with the transaction from Alice to Bob:
-\noindent Funnily, or maybe not so funnily, the author of this
-article carefully tries to construct an argument that does not
-only attack the nothing-to-hide statement in cases where
-governments \& co collect people's deepest secrets, or
-pictures of people's naked bodies, but an argument that
-applies also in cases where governments ``only'' collect data
-relevant to, say, preventing terrorism. The fun is of course
-that in 2011 we could just not imagine that respected
-governments would do such infantile things as intercepting
-people's nude photos. Well, since Snowden we know some people
-at the NSA did exactly that and then shared such photos among
-colleagues as ``fringe benefit''.
+\begin{center}
+\includegraphics[scale=0.4]{../pics/bitcoin_doublespend_blockchain_race.png}
+\end{center}
-
-\subsubsection*{Re-Identification Attacks}
+\noindent As shown in the picture she has to solve the puzzles
+2a to 5a one after the other, because every block contains a
+reference to (the hash of) the previous block, and therefore the
+hash of a block depends on all the data in the blocks before it.
+She might be very lucky to solve one puzzle for a block
+before the rest of the world, but to be lucky many times is
+very unlikely. This principle of having to race against the
+rest of the world avoids the ploy of double spend.
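+
+Why Alice cannot just swap a single block can be seen in a small
+Python sketch. It leaves out the puzzles and only models the links
+between blocks; the block layout and the function names are
+assumptions made for this illustration.
+
```python
import hashlib

def h(s):
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

def build_chain(payloads):
    """Link blocks: each block stores the hash of its predecessor."""
    chain, previous = [], "0" * 64
    for payload in payloads:
        block = {"previous": previous, "payload": payload}
        chain.append(block)
        previous = h(block["previous"] + block["payload"])
    return chain

def first_broken_link(chain):
    """Index of the first block whose back-reference no longer matches."""
    for i in range(1, len(chain)):
        prev = chain[i - 1]
        if chain[i]["previous"] != h(prev["previous"] + prev["payload"]):
            return i
    return None

chain = build_chain(["block 1", "Alice -> Bob", "block 3", "block 4", "block 5"])
print(first_broken_link(chain))          # None: all links match
chain[1]["payload"] = "Alice -> Alice"   # rewrite the transaction in block 2
print(first_broken_link(chain))          # 2: block 3 no longer points to block 2
```
+
+\noindent After changing block 2, all later blocks have to be
+rebuilt, and in the real system each of them comes with its own
+puzzle.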
-Apart from philosophical musings, there are fortunately also
-some real technical problems with privacy. The problem I want
-to focus on in this handout is how to safely disclose datasets
-containing potentially very private data, say health records.
-What can go wrong with such disclosures can be illustrated
-with four well-known examples:
+In order to raise the bar for Alice even further, merchants
+accepting Bitcoins use the following rule of thumb: A
+transaction is ``confirmed'' if
\begin{itemize}
-\item In 2006, a then young company called Netflix offered a 1
- Mio \$ prize to anybody who could improve their movie
- rating algorithm. For this they disclosed a dataset
- containing 10\% of all Netflix users at the time
- (appr.~500K). They removed names, but included numerical
- ratings of movies as well as times when ratings were
- uploaded. Though some information was perturbed (i.e.,
- slightly modified).
-
- Two researchers had a closer look at this anonymised
- data and compared it with public data available from the
- International Movie Database (IMDb). They found that
- 98\% of the entries could be re-identified in the
- Netflix dataset: either by their ratings or by the dates
- the ratings were uploaded. The result was a class-action
- suit against Netflix, which was only recently resolved
- involving a lot of money.
+\item[(1)] it is part of a block in the longest fork, and
+\item[(2)] at least 5 blocks follow it in the longest fork. In
+ this case we say that the transaction has 6
+ confirmations.
+\end{itemize}
+
+\noindent A simple calculation shows that this number of
+confirmations takes around 1 hour on average (6 blocks at
+roughly 10 minutes each). While this seems
+excessively long, from the merchant's point of view it is not
+that long at all. For this recall that ordinary credit cards
+can have their transactions rolled back for 6 months or
+so. The point however is that the odds for Alice being able to
+cheat are very low, unless she can muster more than 50\% of
+the world Bitcoin mining capacity. In this case she could
+out-race the rest of the world. But
+amassing such an amount of computing power is practically
+impossible for a single person or even a moderately large
+group.
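+
+How low the odds are can be estimated with the gambler's-ruin
+argument from the original Bitcoin paper: if Alice controls a
+fraction $q$ of the mining power and the rest of the world
+$p = 1 - q$, then her chance of ever making up a deficit of $z$
+blocks is $(q/p)^z$ as long as $q < p$. The Python sketch below just
+evaluates this formula; the function name is made up.
+
```python
def catch_up_probability(q, z):
    """Chance that an attacker with share q of the mining power ever
    makes up a deficit of z blocks (gambler's-ruin estimate)."""
    p = 1.0 - q
    if q >= p:
        return 1.0
    return (q / p) ** z

for q in (0.1, 0.3, 0.45, 0.5):
    print(q, catch_up_probability(q, 6))
```
+
+\noindent With 10\% of the mining power the chance of catching up
+from 6 blocks behind is about 2 in a million; from 50\% onwards
+Alice eventually catches up with certainty.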
-\item In the 1990ies, medical datasets were often made public
- for research purposes. This was done in anonymised form
- with names removed, but birth dates, gender and ZIP-code
- were retained. In one case where such data about
- hospital visits of state employees in Massachusetts was
- made public, the then governor assured the public that
- the released dataset protected patient privacy by
- deleting identifiers.
-
- A graduate student could not resist cross-referencing
- public voter data with the released data that still
- included birth dates, gender and ZIP-code. The result
- was that she could send the governor his own hospital
- record. It turns out that birth dates, gender and
- ZIP-code uniquely identify 87\% of people in the US.
- This work resulted in a number of laws prescribing which
- private data cannot be released in such datasets.
-
-\item In 2006, AOL published 20 million Web search queries
- collected from 650,000 users (names had been deleted).
- This was again done for research purposes. However,
- within days an old lady, Thelma Arnold, from Lilburn,
- Georgia, (11,596 inhabitants) was identified as user
- No.~4417749 in this dataset. It turned out that search
- engine queries are deep windows into people's private
- lives.
-
-\item Genome-Wide Association Studies (GWAS) was a public
- database of gene-frequency studies linked to diseases.
- It would essentially record that people who have a
- disease, say diabetes, have also certain genes. In order
- to maintain privacy, the dataset would only include
- aggregate information. In case of DNA data this
- aggregation was achieved by mixing the DNA of many
- individuals (having a disease) into a single solution.
- Then this mixture was sequenced and included in the
- dataset. The idea was that the aggregate information
- would still be helpful to researchers, but would protect
- the DNA data of individuals.
-
- In 2007 a forensic computer scientist showed that
- individuals can still be identified. For this he used
- the DNA data from a comparison group (people from the
- general public) and ``subtracted'' this data from the
- published data. He was left with data that included all
- ``special'' DNA-markers of the individuals present in
- the original mixture. He essentially deleted the
- ``background noise'' in the published data. The problem
- with DNA data is that it is of such a high resolution
- that even if the mixture contained maybe 100
- individuals, you can with current technology detect
- whether an individual was included in the mixture or
- not.
-
- This result changed completely how DNA data is nowadays
- published for research purposes. After the success of
- the human-genome project with a very open culture of
- exchanging data, it became much more difficult to
- anonymise data so that patient's privacy is preserved.
- The public GWAS database was taken offline in 2008.
-
+Connected with the 6-confirmation rule is an interesting
+phenomenon. On average, it would take several years for a
+typical computer to solve a proof-of-work puzzle, so an
+individual’s chance of ever solving one before the rest of the
+world, which typically takes only 10 minutes, is negligibly
+low. Therefore many people join groups called \emph{mining
+pools} that collectively work to solve blocks, and distribute
+rewards based on work contributed. These mining pools act
+somewhat like lottery pools among co-workers, except that some
+of these pools are quite large, and comprise more than 20\% of
+all the computers in the network. It is said that BTCC, a
+large mining pool, has limited its number of members in order
+to not solve more than 6 blocks in a row. Otherwise this would
+undermine the trust in Bitcoins, which is also not in the
+interest of BTCC, I guess. Some statistics on mining pools can
+be seen at
+
+\begin{center}
+\url{https://blockchain.info/pools}
+\end{center}
+
+\noindent Here is an interesting problem: You are part of a lottery
+pool, if you chip in some of the money to buy a lottery ticket. In
+this setting it is clear when you are in or outside of the pool. But
+how do you make sure people work hard in a mining pool in order to
+justify a fraction of any reward? If evil me had my way, I would just
+claim I do work and then sit back and relax. Or even if I do some work
+for a mining pool and I happen to find a correct salt, I would keep it
+secret and submit it to the Bitcoin network on the ``side''. Actually,
+the idea of mining pools has opened up a full can of interesting
+problems.
+
+
+
+\subsubsection*{Bitcoins for Real}
+
+Let us now turn to the nitty gritty details. As a participant
+in the Bitcoin network you need to generate and store a
+public-private key pair. The public key you need to advertise
+in order to receive payments (transactions). The private key
+needs to be securely stored. For this there seem to be three
+possibilities:
+
+\begin{itemize}
+\item an electronic wallet on your computer
+\item a cloud-based storage (offered by some Bitcoin services)
+\item paper-based
\end{itemize}
-\noindent There are many lessons that can be learned from
-these examples. One is that when making datasets public in
-anonymised form, you want to achieve \emph{forward privacy}.
-This means, no matter what other data that is also available
-or will be released later, the data in the original dataset
-does not compromise an individual's privacy. This principle
-was violated by the availability of ``outside data'' in the
-Netflix and governor of Massachusetts cases. The additional
-data permitted a re-identification of individuals in the
-dataset. In case of GWAS a new technique of re-identification
-compromised the privacy of people in the dataset. The case of
-the AOL dataset shows clearly how incomplete such data can be:
-Although the queries uniquely identified the older lady, she
-also looked up diseases that her friends had, which had
-nothing to do with her. Any rational analysis of her query
-data must therefore have concluded, the lady is on her
-death bed, while she was actually very much alive and kicking.
+\noindent The first two options of course offer convenience
+for making and receiving transactions. But given the nature of
+the private keys and how much security relies on them (recall
+if somebody gets hold of them, your Bitcoins are quickly lost
+forever) I would opt for the third option for anything except
+for trivial amounts of Bitcoins. As we have seen earlier in
+the course, securing a computer system so that it can withstand a
+targeted break-in is still very much an unsolved problem.
+
+An interesting fact with Bitcoin keys is that there is no
+check for duplicate addresses. This means when generating a
+public-private key pair, you should really start with a carefully
+chosen random number such that there is really no chance to
+step on somebody's feet in the $2^{160}$ space of
+possibilities. Again if you share an address with somebody
+else, he or she has access to all your unspent transactions.
+The absence of such a check is easily explained: How would one
+do this in a distributed system? The answer is you can't. It is
+possible to do some sanity check against addresses that are already
+used in the blockchain, but this is not a fool-proof method.
+One really has to rely on the enormity of the $2^{160}$
+space for addresses.
+
+Let us now look at the concrete data that is stored in a transaction
+message:
+
+\lstinputlisting[language=Scala]{../slides/msg}
+
+\noindent The hash in Line 1 is the hash of all the data that
+follows. It is a kind of serial number for the transaction.
+Line 2 contains a version number in case there are some
+incompatible changes to be made. Lines 3 and 4 specify how
+many incoming transactions are combined and how many outgoing
+transactions there are. In our example there is one of each.
+Line 5 specifies a lock time for when the transaction is
+supposed to become active; this is usually set to 0 to become
+active immediately. Line 6 specifies the size of the message;
+it has nothing to do with the Bitcoins that are transferred.
+Lines 7 to 11 specify where the Bitcoins in the transaction
+are coming from. The hash in Line 9 specifies the incoming
+transaction and the \pcode{n} in Line 10 specifies which
+output of the transaction is referred to. The signature in
+Line 11 specifies the address (public key $K^{pub}$) from
+where the Bitcoins are taken and the digital signature of the
+address, that is $\{K^{pub}\}_{K^{priv}}$. Lines 12 to 15
+specify the value of the first outgoing transaction, in this
+case 0.319 Bitcoins. The hash in Line 14 specifies the address
+to where the Bitcoins are transferred.
+
+As can be seen there is no need to issue serial numbers for
+transactions, the hash of the transaction data can do this
+job. The hash will contain the sender addresses and
+hash-references to the incoming transactions, as well as the
+public key of the incoming transaction. This uniquely
+identifies a transaction and the hash is the unique
+fingerprint of it. The in-field also contains the address to
+which an earlier transaction was made. The digital signature
+ensures everybody can check that the person who makes this
+transaction is in the possession of the private key. Otherwise
+the signature would not match up with the public-key address.
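+
+That a hash can play the role of a serial number can be illustrated
+with a small Python sketch. It assumes the transaction is a
+dictionary that is serialised with JSON and hashed once with SHA-256;
+the real system hashes a binary encoding, and the field names and
+values below are only placeholders loosely modelled on the message
+above.
+
```python
import hashlib, json

def transaction_hash(tx):
    """Fingerprint of a transaction: hash of all its (other) data."""
    data = json.dumps(tx, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

# hypothetical transaction; field names loosely follow the message above
tx = {
    "ver": 1,
    "lock_time": 0,
    "in": [{"prev_out": {"hash": "ab" * 32, "n": 0}, "scriptSig": "<signature>"}],
    "out": [{"value": "0.319", "scriptPubKey": "<address>"}],
}
print(transaction_hash(tx))

changed = dict(tx, out=[{"value": "3.19", "scriptPubKey": "<address>"}])
print(transaction_hash(changed) != transaction_hash(tx))   # True
```
+
+\noindent Any change to the data, here the transferred amount,
+results in a different fingerprint, so nobody needs to hand out
+serial numbers centrally.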
+
+When mining the blockchain it only needs to be ensured that
+the transactions are consistent (all hashes and signatures
+match up). Then we need to generate the correct previous-block
+link and solve the resulting puzzle. Once the block is
+accepted, everybody can check the integrity of the whole
+blockchain.
+
+A word of warning: The point of a lottery is that some people
+win. But equally, that most people lose. Mining Bitcoins has
+pretty much the same point. According to the article below, a
+very large machine (very, very large in terms of June 2014)
+could potentially mine \$40 worth of Bitcoins a day, but would
+require orders of magnitude more in electricity costs to do so.
-In 2016, Yahoo released the so far largest machine learning
-dataset to the research community. It includes approximately
-13.5 TByte of data representing around 100 Billion events from
-anonymized user-news items, collected by recording
-interactions of about 20M users from February 2015 to May
-2015. Yahoo's gracious goal is to promote independent research
-in the fields of large-scale machine learning and recommender
-systems. It remains to be seen whether this data will really
-only be used for that purpose.
+\begin{center}
+\url{http://bitcoinmagazine.com/13774/government-bans-professor-mining-bitcoin-supercomputer/}
+\end{center}
+
+\noindent Bitcoin mining nowadays is only competitive, or
+profitable, if you get the energy for free, or use special
+purpose computing devices.
+
+This point about ``free'' energy can actually hurt you very badly in
+unexpected ways. You probably have heard about, or even used,
+Amazon's Elastic Compute Cloud (EC2). Essentially, Amazon is
+selling computing power that you can use to run your web site,
+for example. It is \emph{elastic} in the sense that if you
+have a lot of visitors, you pay a lot, if you have only a few,
+then it is cheap. In order to bill you, you need to set
+up an account with Amazon and receive some secret keys in
+order to authenticate you. The clever (but also dangerous) bit
+is that you upload the code of your web site to GitHub and
+Amazon will pull it from there. You can probably already guess
+where this is going: in order to learn about Amazon's API, it
+gives out some limited computing power for free. Somebody used
+this offer in order to teach himself Ruby on Rails with a
+mildly practical website. Unfortunately, he also uploaded his
+secret keys to GitHub (this is really an easy mistake). Now,
+nasty people crawl GitHub for the purpose of stealing such
+secret keys. What can they do with this? Well, they quickly
+max out the limit of computing power with Amazon and mine
+Bitcoins (under somebody else's account). Fortunately for this
+guy, Amazon was aware of this scam and in a goodwill gesture
+refunded him the money the nasty guys incurred overnight
+with their Bitcoin mining. If you want to read the
+complete story, google for ``My \$2375 Amazon EC2 Mistake''.
+
+\subsubsection*{Multi-Signature Transactions}
+
+To be explained.
-\subsubsection*{Differential Privacy}
+\subsubsection*{Anonymity with Bitcoins}
+
+One question one often hears is how anonymous is it actually
+to pay with Bitcoins? Paying with paper money used to be a
+quite anonymous act (unlike paying with credit cards, for
+example). But this has changed nowadays: You cannot come to a
+bank anymore with a suitcase full of money and try to open a
+bank account. Strict money laundering and taxation laws mean
+that not even Swiss banks are prepared to take such money and
+open a bank account. That is why Bitcoins are touted as
+filling this niche again of anonymous payments.
+
+While Bitcoins are intended to be anonymous, the reality is
+slightly different. I fully agree with the statement by
+Nielsen from the blog article I referenced at the beginning:
+
+\begin{quote}\it{}``Many people claim that Bitcoin can be used
+anonymously. This claim has led to the formation of
+marketplaces such as Silk Road (and various successors), which
+specialize in illegal goods. However, the claim that Bitcoin
+is anonymous is a myth. The block chain is public, meaning
+that it’s possible for anyone to see every Bitcoin transaction
+ever. Although Bitcoin addresses aren't immediately associated
+to real-world identities, computer scientists have done a
+great deal of work figuring out how to de-anonymise
+`anonymous' social networks. The block chain is a marvellous
+target for these techniques. I will be extremely surprised if
+the great majority of Bitcoin users are not identified with
+relatively high confidence and ease in the near future.''
+\end{quote}
+
+\noindent The only thing I can add to this is that with the Bitcoin
+blockchain we will in the future have even more pleasure hearing
+confessions from reputable or not-so-reputable people, like the
+infamous ``I did not inhale'' from a US
+president.\footnote{\url{www.youtube.com/watch?v=Bktd_Pi4YJw}} The
+whole point of the blockchain is that it is public and will always be.
-Differential privacy is one of the few methods that tries to
-achieve forward privacy. The basic idea is to add appropriate
-noise, or errors, to any query of the dataset. The intention
-is to make the result of a query insensitive to individual
-entries in the database. That means the results are
-approximately the same no matter if a particular individual is
-in the dataset or not. The hope is that the added error does
-not eliminate the ``signal'' one is looking for in the
-dataset.
+There are some precautions one can take for boosting anonymity, for
+example to use a new public-private key pair for every new
+transaction, and to access Bitcoin only through the Tor network. But
+the transactions in Bitcoins are designed such that they allow one to
+combine incoming transactions. In such cases we know they must have
+been made by the single person who knew the corresponding private
+keys. So using different public-private keys for each transaction
+might not actually make the de-anonymisation task much harder. And the
+point about de-ano\-nymising `anonymous' social networks is that the
+information is embedded into the structure of the transaction
+graph. And this cannot be erased with Bitcoins.
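+
+The observation about combined incoming transactions can be turned
+into a simple clustering of addresses. The Python sketch below
+assumes that each transaction is given as the list of addresses it
+takes Bitcoins from, and that all of them belong to one owner; the
+addresses and the function name are made up.
+
```python
def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of one transaction."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for inputs in transactions:
        root = find(inputs[0])
        for address in inputs[1:]:
            parent[find(address)] = root    # same owner as the first input

    clusters = {}
    for address in parent:
        clusters.setdefault(find(address), set()).add(address)
    return sorted(sorted(c) for c in clusters.values())

# each entry lists the (made-up) input addresses of one transaction
txs = [["A1", "A2"], ["A2", "A3"], ["B1"], ["B1", "B2"]]
print(cluster_addresses(txs))   # [['A1', 'A2', 'A3'], ['B1', 'B2']]
```
+
+\noindent Although A1 and A3 never occur in the same transaction,
+they end up in the same group via A2. This is the kind of structural
+information that fresh keys alone do not hide.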
+
+One paper that has fun with spotting transactions made to Silk Road (2.0)
+and also to Wikileaks is
+
+\begin{center}
+\url{http://people.csail.mit.edu/spillai/data/papers/bitcoin-transaction-graph-analysis.pdf}
+\end{center}
+
+\noindent
+A paper that gathers some statistical data about the blockchain is
+
+\begin{center}
+\url{https://eprint.iacr.org/2012/584.pdf}
+\end{center}
+
+\subsubsection*{Government Meddling}
+
+Finally, what are the options for a typical Western government to
+meddle with Bitcoins? This is of course one feature the proponents of
+Bitcoins also tout: namely that there aren't any options. In my
+opinion this is far too naive and far from the truth. Let us assume
+some law enforcement agencies would not have been able to uncover the
+baddies from Silk Road 1.0 and 2.0 (they have done so by uncovering
+the Tor network, which is an incredible feat on its own). Would the
+government in question have stopped? I do not think so. The next
+target would have been Bitcoin. If I were the government, this is
+what I would consider:
-%\begin{center}
-%User\;\;\;\;
-%\begin{tabular}{c}
-%tell me $f(x)$ $\Rightarrow$\\
-%$\Leftarrow$ $f(x) + \text{noise}$
-%\end{tabular}
-%\;\;\;\;\begin{tabular}{@{}c}
-%Database\\
-%$x_1, \ldots, x_n$
-%\end{tabular}
-%\end{center}
-%
-%\begin{center}
-%\begin{tabular}{l|l}
-%Staff & Salary\\\hline
-%$PM$ & \pounds{107}\\
-%$PF$ & \pounds{102}\\
-%$LM_1$ & \pounds{101}\\
-%$LF_2$ & \pounds{97}\\
-%$LM_3$ & \pounds{100}\\
-%$LM_4$ & \pounds{99}\\
-%$LF_5$ & \pounds{98}
-%\end{tabular}
-%\end{center}
-%
-%
-%\begin{center}
-%\begin{tikzpicture}
-%\begin{axis}[symbolic y coords={salary},
-% ytick=data,
-% height=3cm]
%\addplot+[jump mark mid] coordinates
%{(0,salary) (0.1,salary)
% (0.4,salary) (0.5,salary)
% (0.8,salary) (0.9,salary)};
%\end{axis}
%\end{tikzpicture}
-%\end{center}
-%
-%\begin{tikzpicture}[outline/.style={draw=#1,fill=#1!20}]
% \node [outline=red] {red box};
% \node [outline=blue] at (0,-1) {blue box};
%\end{tikzpicture}
+\begin{itemize}
+\item The government could compel ``major players'' to blacklist
+ Bitcoins (for example at Bitcoin exchanges, which are usually
+ located somewhere in the vicinity of the government's reach). This
+ would impinge on what is called \emph{fungibility} of Bitcoins and
+ make them much less attractive to baddies. Suddenly their
+ ``hard-earned'' Bitcoin money cannot be spent anymore. The attraction
+ of this option is that this blacklisting can be easily done
+ ``whole-sale'' and therefore really be an attractive target for
+ governments \& Co.
+\item The government could attempt to coerce the developer
+ community of the Bitcoin tools. While this might be a
+ bit harder, we know certain governments are ready to
+ take such actions (we have seen this with Lavabit, just
+ that the developers there refused to play ball and shut
+ down their complete operation).
+\item The government could also put pressure on mining pools
+ in order to blacklist transactions from baddies. Or be a
+ big miner itself. Given the gigantic facilities that
+ are built for institutions like the NSA (pictures from
+ the Utah desert)
+
+ \begin{center}
+ \includegraphics[scale=0.04]{../pics/nsautah1.jpg}
+ \hspace{3mm}
+ \includegraphics[scale=0.031]{../pics/nsautah2.jpg}
+ \end{center}
+
+ this would not be such a high bar to jump over. Remember it
+ ``only'' takes being temporarily in control of 50\%-plus of the
+ mining capacity in order to undermine the trust in the
+ system. Given sophisticated stories like Stuxnet (where we still
+ do not know the precise details) maybe even such large
+ facilities are not really needed. What happens, for example, if
+ a government starts DoS attacks on existing miners? They have
+ complete control (unfortunately) of all major connectivity
+ providers, i.e.~ISPs.
+
+ There are estimates that the Bitcoin mining capacity
+ outperforms the top 500 supercomputers in the world,
+ combined(!):
+
+ \begin{center}\small
+ \url{http://www.forbes.com/sites/reuvencohen/2013/11/28/global-bitcoin-computing-power-now-256-times-faster-than-top-500-supercomputers-combined/}
+ \end{center}
+
+ But my gut feeling is that these are too simplistic
+ calculations. In security (and things like Bitcoins) the
+ world is never just black and white. The point is once
+ the trust is undermined, the Bitcoin system would need
+ to be evolved to Bitcoins 2.0. But who says that Bitcoin
+ 2.0 will honour the Bitcoins from Version 1.0?
+ \end{itemize}
-\ldots
+\noindent A government would potentially not really
+need to follow up on such threats. Just the rumour that it
+would, could be enough to get the Bitcoin-house-of-cards to
+tumble. Some governments already have such an ``impressive''
+track record in this area that such a threat would be entirely
+credible. Because of all this, I would not have too much hope
+that Bitcoins are free from interference by governments \& Co when
+they stand in their way, despite what everybody else is
+saying. To sum up, the technical details behind Bitcoins are
+simply cool. But still the entire Bitcoin ecosystem is in my
+humble opinion rather fragile.
+
+
+\subsubsection*{Isn't there anything good with Bitcoins?}
+
+As you can see, so far my argument was that yes the Bitcoin system is
+based on a lot of very cool technical ideas, but otherwise it is a big
+scam. You might wonder if there is not something good (in terms of
+valuable for civilisation) in the bitcoin system? I think there is
+actually: diamonds are quite valuable and because of this can be
+used as a form of `money'---just remember the song with the line
+`diamonds are forever'.
+
+The problem with diamonds is that in some places where they are found,
+they also fund some stupid wars. You would like to set up a usable system
+whereby you can check whether a diamond comes from a reputable source
+(not funding any wars) or from a dodgy source. For this you have to
+know that `clearing houses' for diamonds can engrave with lasers
+unique numbers inside the diamonds. These engravings are invisible to
+the naked eye and as far as I remember these numbers cannot be removed,
+except by destroying the diamond. Even if they can be removed, diamonds
+without the number cannot (hopefully) be sold.
+
+How do bitcoins come into the picture? The idea is called
+\emph{coloured coins}, where you attach some additional information to
+some `coins'. In the diamond example the bitcoin transactions are
+supposed to act as a certificate where diamonds are from (reputable
+sources or not). For this you have to know that you can attach a very
+short custom-made message with each bitcoin transaction. So you would
+record the diamond number inside the message.
+
+Now, you would set the system up so that a trusted entity (which
+exists in the diamond world) buys bitcoins (or smaller amounts)
+with their public key. These trusted entities are essentially the places
+that also cut the raw diamonds. The idea is that whenever you buy a
+diamond, you would also like to have the corresponding bitcoin
+transaction. If you want to sell the diamond, you make a transaction
+to the new owner. The new owner will ask for this message, because
+otherwise he/she cannot sell the diamond later on.
+
+The advantage is that for each diamond you can trace back whether the
+transaction originated from the trusted entity. If yes, your
+diamond will be sellable. If you do not have the message, the diamond
+comes from a dodgy source and will (hopefully) not be sellable later
+on. In this way you skew the incentives such that only legitimate
+diamonds are of value. The bitcoin system just helps with being able to
+check whether the message originates from the trusted entity or
+not. You do not have to consult anybody else and pay money for this
+consultation, or in any way reveal your identity by such a consultation
+(the police might just keep a particularly close eye on who contacts
+such a clearing house).
+
+Since we hopefully all agree that funding stupid wars is bad, any
+system that can starve such wars of funds must be good. Piggy-backing
+on the trust established by the bitcoin system on the public block chain
+makes such a system realisable.
\subsubsection*{Further Reading}
-Two cool articles about how somebody obtained via the Freedom
-of Information Law the taxicab dataset of New York and someone
-else showed how easy it is to mine for private information:
+Finally, finally, the article
\begin{center}\small
-\begin{tabular}{p{0.78\textwidth}}
-\url{http://chriswhong.com/open-data/foil_nyc_taxi/}\smallskip\\
-\url{http://research.neustar.biz/2014/09/15/riding-with-the-stars-passenger-privacy-in-the-nyc-taxicab-dataset}
-\end{tabular}
+\url{http://www.extremetech.com/extreme/155636-the-bitcoin-network-outperforms-the-top-500-supercomputers-combined}
\end{center}
-\noindent
-A readable article about how supermarkets mine your shopping
-habits (especially how they prey on new exhausted parents
-;o) appeared in 2012 in the New York Times:
+\noindent makes an interesting point: If people are willing to
+solve meaningless puzzles for hard, cold cash and with this
+achieve rather impressive results, what could we achieve if
+the UN, say, would find the money and incentivise people to,
+for example, solve protein folding
+puzzles?\footnote{\url{http://en.wikipedia.org/wiki/Protein_folding}}
+For this there are projects like
+Folding@home.\footnote{\url{http://folding.stanford.edu}}
+This might help with curing diseases such as Alzheimer's or
+diabetes. The same point is made in the article
+
+\begin{center}\small
+\url{http://gizmodo.com/the-worlds-most-powerful-computer-network-is-being-was-504503726}
+\end{center}
+
+A definitely interesting and worthy use of Bitcoins has been explored
+in the thesis
\begin{center}
-\url{http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html}
+\url{http://enetium.com/resources/Thesis.pdf}
\end{center}
-\noindent An article that analyses privacy and shopping habits
-from a more economic point of view is available from:
+\noindent where the author proposes ways of publishing
+censorship-resistant information as part of the blockchain. The idea is that
+if a government wants to use Bitcoins, it would also have to put up
+with plain-text data that can be included in a transaction.
-\begin{center}
-\url{http://www.dtc.umn.edu/~odlyzko/doc/privacy.economics.pdf}
-\end{center}
+Ken Shirriff in his blog at
-\noindent An attempt to untangle the web of current technology
-for spying on consumers is published in:
+\begin{center}\small
+\url{http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html}
+\end{center}
-\begin{center}
-\url{http://cyberlaw.stanford.edu/files/publication/files/trackingsurvey12.pdf}
-\end{center}
+\noindent writes that the electricity consumption of mining
+for bitcoins is roughly 15 megawatts, which he compares to the power
+consumption of a country like Cambodia. In his words:
-\noindent An article that sheds light on the paradox that
-people usually worry about privacy invasions of little
-significance, and overlook the privacy invasion that might
-cause significant damage:
-
-\begin{center}
-\url{http://www.heinz.cmu.edu/~acquisti/papers/Acquisti-Grossklags-Chapter-Etrics.pdf}
-\end{center}
+\begin{quote}
+ \it{}``The difficulty of mining a block is astounding. At the
+ current difficulty, the chance of a hash succeeding is a bit less
+ than one in $10^{19}$. Finding a successful hash is harder than
+ finding a particular grain of sand from all the grains of sand on
+ Earth. To find a hash every ten minutes, the Bitcoin hash rate needs
+ to be insanely large. Currently, the miners on the Bitcoin network
+ are doing about 25 million gigahashes per second. That is, every
+ second about 25,000,000,000,000,000 blocks gets hashed. I estimate
+ (very roughly) that the total hardware used for Bitcoin mining cost
+ tens of millions of dollars and uses as much power as the country of
+ Cambodia.''
+\end{quote}
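+
+\noindent As a rough sanity check of the figures in this quote (a
+back-of-the-envelope calculation that is not part of the blog post,
+and that treats the success chance $p = 10^{-19}$ per hash and the
+hash rate $r = 2.5 \times 10^{16}$ hashes per second as exact, although
+both are only approximate), the expected time to find one successful
+hash is
+
+\[
+\frac{1}{p \cdot r}
+\;=\; \frac{10^{19}}{2.5 \times 10^{16}\,\mathrm{s}^{-1}}
+\;=\; 400\,\mathrm{s}
+\;\approx\; 6.7\;\mbox{minutes}.
+\]
+
+\noindent Since the quoted chance is ``a bit less'' than one in
+$10^{19}$, the real expected time is somewhat longer, which is at least
+in the same ballpark as the ten minutes mentioned in the quote.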
\end{document}
-http://randomwalker.info/teaching/fall-2012-privacy-technologies/?
-http://chronicle.com/article/Why-Privacy-Matters-Even-if/127461/
-http://repository.cmu.edu/cgi/viewcontent.cgi?article=1077&context=hcii
-https://josephhall.org/papers/NYU-MCC-1303-S2012_privacy_syllabus.pdf
-http://www.jetlaw.org/wp-content/uploads/2014/06/Bambauer_Final.pdf
-http://www.cs.cmu.edu/~yuxiangw/docs/Differential%20Privacy.pdf
-https://www.youtube.com/watch?v=Gx13lgEudtU
-https://www.cs.purdue.edu/homes/ctask/pdfs/CERIAS_Presentation.pdf
-http://www.futureofprivacy.org/wp-content/uploads/Differential-Privacy-as-a-Response-to-the-Reidentification-Threat-Klinefelter-and-Chin.pdf
-http://www.cis.upenn.edu/~aaroth/courses/slides/Overview.pdf
-http://www.cl.cam.ac.uk/~sjm217/papers/tor14design.pdf
+bit coin
+
+A fistful of bitcoins
+http://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf
+
+Ross Anderson & Co (no dispute resolution; co-ercion)
+http://www.cl.cam.ac.uk/~sjm217/papers/fc14evidence.pdf
-%%% Local Variables:
-%%% mode: latex
-%%% TeX-master: t
-%%% End:
+http://www.michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/
+http://www.imponderablethings.com/2013/07/how-bitcoin-works-under-hood.html
+
+http://randomwalker.info/bitcoin/
+
+Jeffrey Robinson
+Bitcon: The Naked Truth about Bitcoin
+
+The Bitcoin Backbone Protocol: Analysis and Applications
+https://eprint.iacr.org/2014/765.pdf
+
+Bitcoin book
+http://chimera.labs.oreilly.com/books/1234000001802/ch04.html#public_key_derivation