diff -r 98ae49ffc262 -r 62985f147c85 handouts/ho07.tex --- a/handouts/ho07.tex Tue Sep 26 12:03:24 2017 +0100 +++ b/handouts/ho07.tex Tue Sep 26 12:10:41 2017 +0100 @@ -1,9 +1,11 @@ \documentclass{article} \usepackage{../style} \usepackage{../graphics} +\usepackage{../langs} +\usepackage{../data} -\begin{document} -\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015} +%https://crypto.stanford.edu/cs251/ +%https://programmingblockchain.gitbooks.io/programmingblockchain/content/ %% spying self defence %%https://ssd.eff.org/en/module/communicating-others @@ -18,532 +20,1001 @@ %https://fpf.org/wp-content/uploads/Differential-Privacy-as-a-Response-to-the-Reidentification-Threat-Klinefelter-and-Chin.pdf %http://research.neustar.biz/2014/09/08/differential-privacy-the-basics/ -%===== -%Tim Greene, Network World, 17 Dec 2015 (via ACM TechNews, 18 Dec 2015) -% -%Massachusetts Institute of Technology (MIT) researchers' experimental -%Vuvuzela messaging system offers more privacy than The Onion Router (Tor) by -%rendering text messages sent through it untraceable. MIT Ph.D. student -%David Lazar says Vuvuzela resists traffic analysis attacks, while Tor -%cannot. The researchers say the system functions no matter how many parties -%are using it to communicate, and it employs encryption and a set of servers -%to conceal whether or not parties are participating in text-based dialogues. -%"Vuvuzela prevents an adversary from learning which pairs of users are -%communicating, as long as just one out of [the] servers is not compromised, -%even for users who continue to use Vuvuzela for years," they note. Vuvuzela -%can support millions of users hosted on commodity servers deployed by a -%single group of users. Instead of anonymizing users, Vuvuzela prevents -%outside observers from differentiating between people sending messages, -%receiving messages, or neither, according to Lazar. The system imposes -%noise on the client-server traffic which cannot be distinguished from actual -%messages, and all communications are triple-wrapped in encryption by three -%servers. "Vuvuzela guarantees privacy as long as one of the servers is -%uncompromised, so using more servers increases security at the cost of -%increased message latency," Lazar notes. -%http://orange.hosting.lsoft.com/trk/click?ref=znwrbbrs9_5-e70bx2d991x066779& +% hard forks +% https://github.com/bitcoin/bips/blob/master/bip-0050.mediawiki + +% only 25% needed to obtain larger shares of mining +% http://www.cs.cornell.edu/~ie53/publications/btcProcFC.pdf -%%%% -%% canvas tracking -%%https://freedom-to-tinker.com/blog/englehardt/the-princeton-web-census-a-1-million-site-measurement-and-analysis-of-web-privacy/ - -%%% -%% cupit re-identification attack -%% https://nakedsecurity.sophos.com/2016/05/20/published-personal-data-on-70000-okcupid-users-taken-down-after-dmca-order/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+nakedsecurity+%28Naked+Security+-+Sophos%29 +% re-identification attacks +% https://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf -%Differential privacy -%===================== -%https://www.wired.com/2016/06/apples-differential-privacy-collecting-data/ - -%Differential privacy, translated from Apple-speak, is the -%statistical science of trying to learn as much as possible -%about a group while learning as little as possible about any -%individual in it. +% bit-coin papers +% https://crypto.stanford.edu/cs251/syllabus.html -%As Roth notes when he refers to a “mathematical proof,” -%differential privacy doesn’t merely try to obfuscate or -%“anonymize” users’ data. That anonymization approach, he -%argues, tends to fail. In 2007, for instance, Netflix released -%a large collection of its viewers’ film ratings as part of a -%competition to optimize its recommendations, removing people’s -%names and other identifying details and publishing only their -%Netflix ratings. But researchers soon cross-referenced the -%Netflix data with public review data on IMDB to match up -%similar patterns of recommendations between the sites and add -%names back into Netflix’s supposedly anonymous database. - -%As an example of that last method, Microsoft’s Dwork points to -%the technique in which a survey asks if the respondent has -%ever, say, broken a law. But first, the survey asks them to -%flip a coin. If the result is tails, they should answer -%honestly. If the result is heads, they’re instructed to flip -%the coin again and then answer “yes” for heads or “no” for -%tails. The resulting random noise can be subtracted from the -%results with a bit of algebra, and every respondent is -%protected from punishment if they admitted to lawbreaking. - -%https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf +% bit coin talk --- at 20:00 mins +%https://www.usenix.org/conference/lisa16/conference-program/presentation/perlman -% Windows 10 data send back to Microsoft (Cortana) -%Here’s a non-exhaustive list of data sent back: location data, text -%input, voice input, touch input, webpages you visit, and telemetry -%data regarding your general usage of your computer, including which -%programs you run and for how long. - -% Businesses are already using customised pricing online based on -% information they can glean about you. It is hard to know how -% widespread the practice is; companies keep their pricing strategies -% closely guarded and are wary of the bad PR price discrimination -% could pose. However, it is clear that a number of large retailers -% are experimenting with it. Staples, for example, has offered -% discounted prices based on whether rival stores are within 20 miles -% of its customers’ location. Office Depot has admitted to using its -% customers’ browsing history and location to vary its range of offers -% and products. A 2014 study from Northeastern University found -% evidence of “steering” or differential pricing at four out of 10 -% general merchandise websites and five out of five travel -% websites. (Steering is when a company doesn’t give you a customised -% price, but points you towards more expensive options if it thinks -% you will pay more.) The online travel company Orbitz raised -% headlines in 2012 when it emerged that the firm was pointing Mac -% users towards higher-priced hotel rooms than PC users. +% In fact, far from freeing people from the oppression of the state, +% blockchains perversely promise the perfect tool for a fully +% auditable, tax compliant, cashless society. Similarly, the belief it +% is an anonymous digital cash has quickly vanished and we are now +% seeing a large number of analytics companies, set-up specifically to +% work with law enforcement agencies, to police this new parallel +% financial system. +% +% But today blockchain is riddled with +% contradictions and misunderstandings. Most of its problems are very +% fixable, if you want to fix them -%%% government will overwrite your wishes if it is annoymous -%% https://www.lightbluetouchpaper.org/2016/12/05/government-u-turn-on-health-privacy/ - -%% corporate surveilance / privacy - report and CC3C talk -%% http://crackedlabs.org/en/networksofcontrol -%% https://media.ccc.de/v/33c3-8414-corporate_surveillance_digital_tracking_big_data_privacy#video&t=2933 - -\section*{Handout 6 (Privacy)} +% history of bitcoins +% https://futurism.com/images/this-week-in-tech-jan-15-22-2016/ -The first motor car was invented around 1886. For ten years, -until 1896, the law in the UK (and elsewhere) required a -person to walk in front of any moving car waving a red flag. -Cars were such a novelty that most people did not know what to -make of them. The person with the red flag was intended to -warn the public, for example horse owners, about the impending -novelty---a car. In my humble opinion, we are at the same -stage of development with privacy. Nobody really knows what it -is about or what it is good for. All seems very hazy. There -are a few laws (e.g.~cookie law, right-to-be-forgotten law) -which address problems with privacy, but even if they are well -intentioned, they either back-fire or are already obsolete -because of newer technologies. The result is that the world of -``privacy'' looks a little bit like the old Wild -West---lawless and mythical. +\begin{document} +\fnote{\copyright{} Christian Urban, 2014, 2015} -We would have hoped that after Snowden, Western governments -would be a bit more sensitive and enlightned about the topic -of privacy, but this is far from the truth. Ross Anderson -wrote the following in his blog\footnote{\url{https://www.lightbluetouchpaper.org/2016/02/11/report-on-the-ip-bill/}} about the approach taken in -the US to lessons learned from the Snowden leaks and contrasts -this with the new snooping bill that is considered in the UK -parliament: +\section*{Handout 7 (Bitcoins)} -\begin{quote}\it -``The comparison with the USA is stark. There, all three -branches of government realised they'd gone too far after -Snowden. President Obama set up the NSA review group, and -implemented most of its recommendations by executive order; -the judiciary made changes to the procedures of the FISA -Court; and Congress failed to renew the data retention -provisions in the Patriot Act (aided by the judiciary). Yet -here in Britain the response is just to take Henry VIII powers -to legalise all the illegal things that GCHQ had been up to, -and hope that the European courts won't strike the law down -yet again.'' -\end{quote} +In my opinion Bitcoins are an elaborate Ponzi +scheme\footnote{\url{http://en.wikipedia.org/wiki/Ponzi_scheme}}---still +the ideas behind them are really beautiful and not too +difficult to understand. Since many colourful claims about +Bitcoins float around in the mainstream and not-so-mainstream +media, it will be instructive to re-examine such claims from a +more technically informed vantage point. For example, it is +often claimed that Bitcoins are anonymous and free from any +potential government meddling. It turns out that the first +claim ignores a lot of research in de-anonymising social +networks, and the second underestimates the persuasive means a +government has at its disposal. -\noindent Unfortunately, also big organisations besides -governments seem to take an unenlightened approach to privacy. -For example, UCAS, a charity set up to help students with -applying to universities in the UK, has a commercial unit that -happily sells your email addresses to anybody who forks out -enough money for bombarding you with spam. Yes, you can opt -out very often from such ``schemes'', but in case of UCAS any -opt-out will limit also legit emails you might actually be -interested in.\footnote{The main objectionable point, in my -opinion, is that the \emph{charity} everybody has to use for -HE applications has actually very honourable goals -(e.g.~assist applicants in gaining access to universities), -but the small print (or better the link ``About us'') reveals -they set up their organisation so that they can also -shamelessly sell the email addresses they ``harvest''. -Everything is of course very legal\ldots{}ethical?\ldots{}well -that is in the eye of the beholder. See: +There are a lot of articles, blogposts, research papers +etc.~available about Bitcoins. Below I will follow closely the +very readable explanations from + +\begin{center} +\url{http://www.michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/} \;\;and\smallskip\\ +\url{http://www.imponderablethings.com/2013/07/how-bitcoin-works-under-hood.html} +\end{center} -\url{http://www.ucas.com/about-us/inside-ucas/advertising-opportunities} -or -\url{http://www.theguardian.com/uk-news/2014/mar/12/ucas-sells-marketing-access-student-data-advertisers}} +\noindent The latter also contains a link to a nice youtube +video about the technical details behind Bitcoins. I will +also use some of their pictures. -Another example: Verizon, an ISP who is supposed to provide -you just with connectivity, has found a ``nice'' side-business -too: When you have enabled all privacy guards in your browser -(the few you have at your disposal), Verizon happily adds a -kind of cookie to your -HTTP-requests.\footnote{\url{http://webpolicy.org/2014/10/24/how-verizons-advertising-header-works/}} -As shown in the picture below, this cookie will be sent to -every web-site you visit. The web-sites then can forward the -cookie to advertisers who in turn pay Verizon to tell them -everything they want to know about the person who just made -this request, that is you. - +Let us start with the question who invented Bitcoins? You +could not make up the answer, but we actually do not know who +the inventor is. All we know is that the first paper + \begin{center} -\includegraphics[scale=0.16]{../pics/verizon.png} +\url{https://bitcoin.org/bitcoin.pdf} \end{center} -\noindent How disgusting! Even worse, Verizon is not known for -being the cheapest ISP on the planet (completely the -contrary), and also not known for providing the fastest -possible speeds, but rather for being among the few ISPs in -the US with a quasi-monopolistic ``market distribution''. +\noindent is signed by Satoshi Nakamoto, which however is +likely only a pen name. There is a lot of speculation who +could be the inventor, or inventors, but we simply do not +know. This part of Bitcoins is definitely anonymous so far. +The paper above is from the end of 2008; the first Bitcoin +transaction was made in January 2009. The rules in Bitcoin are +set up so that there will only ever be 21 Million Bitcoins +with the maximum reached around the year 2140. Currently there +are already 11 Million Bitcoins in `existence'. Contrast this +with traditional fiat currencies where money can be printed +almost at will. The smallest unit of a Bitcoin is called a +Satoshi, which is the $10^{-8}$th part of a Bitcoin. Remember +a Penny is the $10^{-2}$th part of a Pound. +The two main cryptographic building blocks of Bitcoins are +cryptographic hashing functions (SHA-256) and public-private +keys using the elliptic-curve encryption scheme for digital +signatures. Hashes are used to generate `fingerprints' of data +that ensure integrity (absence of tampering). Public-private +keys are used for signatures. For example sending a message, +say $msg$, together with the encrypted version + +\[ +msg, \{msg\}_{K^{priv}} +\] + +\noindent allows everybody with access to the corresponding +public key $K^{pub}$ to verify that the message came from the +person who knew the private key. Signatures are used in +Bitcoins for verifying the addresses where the Bitcoins are +sent from. Addresses in Bitcoins are essentially the public +keys. There are $2^{160}$ possible addresses, which is such a +vast amount that there is not even a check for duplicates, or +already used addresses. If you start with a random number to +generate a public-private key pair it is very unlikely that +you step on somebody else's shoes. Compare this with the +email-addresses you wanted to register with, say +Gmail, but which are always already taken. -Well, we could go on and on\ldots{}and that has not even -started us yet with all the naughty things NSA \& Friends are -up to. Why does privacy actually matter? Nobody, I think, has -a conclusive answer to this question yet. Maybe the following -four notions help with clarifying the overall picture -somewhat: +One major difference between Bitcoins and traditional banking +is that you do not have a place, or few places, that record the +balance on your account. Traditional banking involves a +central ledger which specifies the current balance in each +account, for example + +\begin{center} +\begin{tabular}{l|r} +account owner & balance\\\hline +Alice & \pounds{10.01}\\ +Bob & \pounds{4.99}\\ +Charlie & -\pounds{1.23}\\ +Eve & \pounds{0.00} +\end{tabular} +\end{center} + +\noindent Bitcoins work differently in that there is no such +central ledger, but instead a public record of all +transactions ever made. This means spending money corresponds +to sending messages of the (oversimplified) form + +\begin{equation} +\{\text{I, Alice, am giving Bob one Bitcoin.}\}_{K^{priv}_{Alice}} +\end{equation} + +\noindent These messages, called transactions, are the only +data that is ever stored in the Bitcoin system (we will come +to the precise details later on). The transactions are +encrypted with Alice's private key so that everybody, +including Bob, can use Alice's public key $K^{pub}_{Alice}$ to +verify that this message came really from Alice, or more +precisely from the person who knows $K^{priv}_{Alice}$. + +The problem with such messages in a distributed system is that +what happens if Bob receives 10, say, of these transactions? +Did Alice intend to send him 10 Bitcoins, or did the message +get duplicated by for example an attacker re-playing a sniffed +message? What is needed is a kind of serial number for such +transactions. This means transaction messages shoul look more like + +\begin{center} +$\{\text{I, Alice, am giving Bob Bitcoin \#1234567.}\}_{K^{priv}_{Alice}}$ +\end{center} + +\noindent There are two difficulties, however, that need to be +solved with serial numbers. One is who is assigning serial +numbers to Bitcoins and also how can Bob verify that Alice +actually owns this Bitcoin to pay him? In a system with a bank +as trusted third-party, Bob could do the following: \begin{itemize} -\item \textbf{Secrecy} is the mechanism used to limit the - number of principals with access to information (e.g., - cryptography or access controls). For example I better - keep my password secret, otherwise people from the wrong - side of the law might impersonate me. - -\item \textbf{Confidentiality} is the obligation to protect - the secrets of other people or organisations (secrecy - for the benefit of an organisation). For example as a - staff member at King's I have access to data, even - private data, I am allowed to use in my work but not - allowed to disclose to anyone else. - -\item \textbf{Anonymity} is the ability to leave no evidence of - an activity (e.g., sharing a secret). This is not equal - with privacy---anonymity is required in many - circumstances, for example for whistle-blowers, - voting, exam marking and so on. - -\item \textbf{Privacy} is the ability or right to protect your - personal secrets (secrecy for the benefit of an - individual). For example, in a job interview, I might - not like to disclose that I am pregnant, if I were a - woman, or that I am a father. Lest they might not hire - me. Similarly, I might not like to disclose my location - data, because thieves might break into my house if they - know I am away at work. Privacy is essentially - everything which ``shouldn't be anybody's business''. - +\item Bob asks the bank whether the Bitcoin with that serial + number belongs to Alice and Alice hasn't already spent + this Bitcoin. +\item If yes, then Bob tells the bank he accepts this Bitcoin. + The bank updates the records to show that the Bitcoin + with that serial number is now in Bob’s possession and + no longer belongs to Alice. \end{itemize} -\noindent While this might provide us with some rough -definitions, the problem with privacy is that it is an -extremely fine line what should stay private and what should -not. For example, since I am working in academia, I am every -so often very happy to be a digital exhibitionist: I am very -happy to disclose all `trivia' related to my work on my -personal web-page. This is a kind of bragging that is normal -in academia (at least in the field of CS), even expected if -you look for a job. I am even happy that Google maintains a -profile about all my academic papers and their citations. +\noindent But for this banks would need to be trusted and +would also be an easy target for any government interference, +for example. Think of the early days of music sharing where +the company Napster was the trusted third-party but also the single point of ``failure'' which +was taken offline by law enforcement. Bitcoins is more like a +system such as BitTorrent without a single central entity that +can be taken offline.\footnote{There is some Bitcoin +infrastructure that is not so immune from being taken offline: +for example Bitcoin exchanges, HQs of Bitcoin mining pools, +Bitcoin developers and so on.} + +Bitcoins solve the problem of not being able to rely on a bank +by making everybody the ``bank''. Everybody who cares can have +the entire transaction history starting with the first +transaction made in January 2009. This history of transactions +is called the \emph{blockchain}. Bob, for example, can use his +copy of the blockchain for determining whether Alice owned the +Bitcoin he received, and if she did, he transmits the message +that he owns it now to every other participant on the Bitcoin +network. An illustration of a three-block segment of the +blockchain is (simplified) as follows + +\begin{equation} +\includegraphics[scale=0.4]{../pics/bitcoinblockchain0.png} +\label{segment} +\end{equation} + +\noindent The chain grows with time. Each block contains a +list of individual transactions, written txn in the picture +above, and also a reference to the previous block, written +prev. The data in a block (txn's and prev) is hashed so that +the reference and transactions in them cannot be tampered +with. This hash is also the unique serial number of each +block. Since this previous-block-reference is also part of the +hash, the whole chain is robust against tampering. I let you +think why this is the case?\ldots{}But does it actually +eliminate all possibilities of fraud? + +We can check the consistency of the blockchain by checking +whether all the references and hashes are correctly recorded. +I have not tried it myself, but it is said that with the +current amount of data (appr.~12GB) it takes roughly a day to +check the consistency of the blockchain on a normal computer. +Fortunately this ``extended'' consistency check usually only +needs to be done once. Afterwards the blockchain only needs to +be updated consistently. + +Recall I wrote earlier that Bitcoins do not maintain a ledger, +which lists all the current balances in each account. Instead +only transactions are recorded. While a current balance of an +account is not immediately available, it is possible to +extract from the blockchain a transaction graph that looks +like the picture shown in Figure~\ref{txngraph}. Each +rectangle represents a single transaction. Take for example +the rightmost lower transaction from Charles to Emily. This +transaction has as receiver the address of Emily and as the +sender the address of Charles. In this way no Bitcoins can +appear out of thin air (we will discuss later how Bitcoins are +actually generated). If Charles did not have a transaction of +at least the amount he wants to give Emily to his name +(i.e.~send to an address with his public-private key) then +there is no way he can make a payment to Emily. Equally, if +now Emily wants to pay for a coffee, say, with the Bitcoin she +received from Charles she can essentially only forward the +message she received. The only slight complication with this +setup in Bitcoins is that ``incoming'' Bitcoins can be +combined in a transaction and ``outgoing'' Bitcoins can be +split. For example in the leftmost upper transactions in +Figure~\ref{txngraph}, Fred makes a payment to Alice. But this +payment (or transaction) combines the Bitcoins that were send +by Jane to Fred and also by Juan to Fred. This allows you to +``consolidate'' your funds: if it were only possible to split +transactions, then the amounts would get smaller and smaller. + +In Bitcoins you have the ability to both combine incoming +transactions, but also to split outgoing transactions to +potentially more than one receiver. The latter is also needed. +Consider again the rightmost transactions in +Figure~\ref{txngraph} and suppose Alice is a coffeeshop owner +selling coffees for 1 Bitcoin. Charles received a transaction +from Zack over 5 Bitcoins, say. How does Charles pay for the +coffee? There is no explicit notion of \emph{change} in the +Bitcoin system. What Charles has to do instead is to make one +single transaction with 1 Bitcoin to Alice and with 4 Bitcoins +going back to himself, which then Charles can use to give to +Emily, for example. + +\begin{figure}[t] +\begin{center} +\includegraphics[scale=0.4]{../pics/blockchain.png} +\end{center} +\caption{Transaction graph that is implicitly recorded in the +public blockchain.\label{txngraph}} +\end{figure} + +Let us consider another example. Suppose Emily received 4 +Bitcoins from Charles and independently received another +transaction (not shown in the picture) that sends 6 Bitcoins +to her. If she now wants to buy a coffee from Alice for 1 +Bitcoin, she has two possibilities: She could just forward the +transaction from Charles over 4 Bitcoins to Alice split in +such a way that Alice receives 1 Bitcoin and Emily sends the +remaining 3 Bitcoins back to herself. In this case she would +now be in the possession of two unspend Bitcoin transactions, +one over 3 Bitcoins and the independent one over 6 Bitcoins. +Or, Emily could combine both transactions (one over 4 Bitcoins +from Charles and the independent one over 6 Bitcoins) and then +split this amount with 1 Bitcoin going to Alice and 9 Bitcoins +going back to herself. + +I think this is a good time for you to pause to let this +concept of transactions to really sink in\ldots{}You should +come to the conclusion that there is really no need for a central ledger and no +need for an account balance as familiar from traditional +banking. The closest what Bitcoin has to offer for the notion +of a balance in a bank account are the unspend transactions +that a person (more precisely a public-private key address) +received. That means transactions that can still be forwarded. + +After the pause also consider the fact that whatever +transaction is recorded in the blockchain will be in the +``historical record'' for the Bitcoin system. If a transaction +says 1 Bitcoin goes from address $A$ to address $B$, then this +is what will be---$B$ has then the possibility to spend the +corresponding Bitcoins, whether the transaction was done +fraudulently or not. There is no exception to this rule. +Interestingly this is also how Bitcoins can get lost: One +possibility is that you send Bitcoins to an address for which +nobody has generated a private key, for example because of a +typo in the address field---bad luck for fat +fingers\footnote{\url{http://en.wikipedia.org/wiki/Typographical_error}} +in the Bitcoin system. The reason is that nobody has a private +key for this erroneous address and consequently cannot forward +the transaction anymore. Another possibility is that you +forget your private key and you had messages forwarded to the +corresponding public key. Also in this case bad luck: you will +never be able to forward this message again, because you will +not be able to form a valid message that sends this to +somebody else (we will see the details of this later). But +this is also a way how you can get robbed of your Bitcoins. By +old-fashioned hacking-into-a-computer crime, for example, an +attacker might get hold of your private key and then quickly +forwards the Bitcoins that are in your name to an address the +attacker controls. You will never again have access to these +Bitcoins, because for the Bitcoin system they are assumed to +be spent. And remember with Bitcoins you cannot appeal to any +higher authority. Once the Bitcoins are gone, they are gone. +This is much different in traditional banking where at least +you can try to harass the bank to roll back the transaction. + +This brings us to back to problem of double spend. Suppose Bob +is a merchant. How can he make sure that Alice does not cheat +him? She could for example send a transaction to Bob. But also +forward the ``same'' transaction to Charlie, or even herself. +If Alice manages to get the second transaction into the +blockchain, Bob will be cheated out of his money. The problem +in such conflicting situations is how should the network +update their blockchain? You might end up with a picture like +this + +\begin{center} +\includegraphics[scale=0.4]{../pics/bitcoindisagreement.png} +\end{center} -On the other hand I would be very irritated if anybody I do -not know had a too close look on my private live---it -shouldn't be anybody's business. The reason is that knowledge -about my private life can often be used against me. As mentioned -above, public location data might mean I get robbed. If -supermarkets build a profile of my shopping habits, they will -use it to \emph{their} advantage---surely not to \emph{my} -advantage. Also whatever might be collected about my life will -always be an incomplete, or even misleading, picture. For -example I am pretty sure my creditworthiness score was -temporarily(?) destroyed by not having a regular income in -this country (before coming to King's I worked in Munich for -five years). To correct such incomplete or flawed credit -history data there is, since recently, a law that allows you -to check what information is held about you for determining -your creditworthiness. But this concerns only a very small -part of the data that is held about me/you. Also -what about cases where data is wrong or outdated (but do we -need a right-to be forgotten). +\noindent where Alice convinced some part of the ``world'' +that she is still the owner of the Bitcoin and some other part +of the ``world'' thinks it's Bob's. How should such a +disagreement be resolved? This is actually the main hurdle +where Bitcoin really innovated. The answer is that Bob needs +to convince ``enough'' people on the network that the +transaction from Alice to him is legit. + +What does, however, ``enough'' mean in a distributed system? +If Alice sets up a network of a billion, say, puppy identities +and whenever Bob tries to convince, or validate, that he is +the rightful owner of the Bitcoin, then the puppy identities +agree. Bob would then have no reason to not give Alice her +coffee. But behind his back she has convinced everybody else +on the network that she is still the rightful owner of the +Bitcoin. After being outvoted, Bob would be a tad peeved. + +The reflex reaction to such a situation would be to make the +process of validating a transaction as cheap as possible. The +intention is that Bob will easily get enough peers to agree with him +that he is the rightful owner. But such a solution has always +the limitation of Alice setting up an even bigger network of +puppy identities. The really cool idea of Bitcoin is to go +into the other direction of making the process of transaction +validation (artificially) as expensive as possible, but reward +people for helping with the validation. This is really a novel +and counterintuitive idea that makes the whole system of +Bitcoins work so beautifully. + +\subsubsection*{Proof-of-Work Puzzles} + +In order to make the process of transaction validation +difficult, Bitcoin uses a kind of puzzle. Solving the puzzles +is called \emph{Bitcoin mining}, where whoever solves a puzzle +will be awarded some Bitcoins. At the beginning this was 50 +Bitcoins, but the rules of Bitcoin are set up such that this +amount halves every 210,000 transactions or so. Currently you +will be awarded 25 Bitcoins for solving a puzzle. Because the +amount will halve again and then later again and again, around +the year 2140 it will go below the level of 1 Satoshi. In that +event no new Bitcoins will ever be created again and the +amount of Bitcoins stays fixed. There will be still an +incentive to help with validating transactions, because there +is the possibility in Bitcoins to offer a transaction fee to +whoever solves a puzzle. At the moment this fee is usually set +to 0, since the incentive for miners is the 25 Bitcoins that +are currently awarded for solving puzzles. + +What do the puzzles that miners have to solve look like? The +puzzles can be illustrated roughly as follows: Given a string, +say \code{"Hello, world!"}, what is the salt so that the hash +starts with a long run of zeros? Let us look at a concrete +example. Recall that Bitcoins use the hash-function SHA-256. +Suppose we call this hash function \code{h}, then we could try +the salt \code{0} as follows: + +\begin{quote} +\code{h("Hello, world!0") =}\\ +\mbox{}\quad\footnotesize\pcode{1312af178c253f84028d480a6adc1e25e81caa44c749ec81976192e2ec934c64} +\end{quote} + +\noindent OK this does not have any zeros at all. We could +next try the salt \code{1}: + +\begin{quote} +\code{h("Hello, world!1") =}\\ +\mbox{}\quad\footnotesize\pcode{e9afc424b79e4f6ab42d99c81156d3a17228d6e1eef4139be78e948a9332a7d8} +\end{quote} + +\noindent Again this hash value does not contain any leading +zeros. We could now try out every salt until we reach + +\begin{quote} +\code{h("Hello, world!4250") =}\\ +\mbox{}\quad\footnotesize\pcode{0000c3af42fc31103f1fdc0151fa747ff87349a4714df7cc52ea464e12dcd4e9} +\end{quote} + +\noindent where we have four leading zeros. If four zeros are +enough, then the puzzle would be solved with this salt. The +point is that we can very quickly check whether a salt solves +a puzzle, but it is hard to find one. Latest research suggest +it is an NP-problem. If we want the output hash value to begin +with 10 zeroes, say, then we will, on average, need to try +$16^{10} \approx 10^{12}$ different salts before we find a +suitable one. + +In Bitcoins the puzzles are not solved according to how many +leading zeros a hash-value has, but rather whether it is below +a \emph{target}. The hardness of the puzzle can actually be +controlled by changing the target according to the available +computational power available. I think the adjustment of the +hardness of the problems is done every 2060 blocks +(appr.~every two weeks). The aim of the adjustment is that on +average the Bitcoin network will most likely solve a puzzle +within 10 Minutes. + +\begin{center} +\includegraphics[scale=0.37]{../pics/blockchainsolving.png} +\end{center} + +\noindent It could be solved quicker, but equally it could +take longer, but on average after 10 Minutes somebody on the +network will have found a solution. + +Remember that the puzzles are a kind of proof-of-work that +make the validation of transactions artificially expensive. +Consider the following picture with a blockchain and some +unconfirmed transactions. + +\begin{equation} +\includegraphics[scale=0.38]{../pics/bitcoin_unconfirmed.png} +\label{unconfirmed} +\end{equation} + +\noindent The puzzle is stated as follows: There are some +unconfirmed transactions. Choosing some of them, the miner +(i.e.~the person/computer that tries to solve a puzzle) will +form a putative block to be added to the blockchain. This +putative block will contain the transactions and the reference +to the previous block. The serial number of such a block is +simply the hash of all the data. The puzzle can then be stated +as the ``string'' corresponding to the block and which salt is +needed in order to have the hashed value being below the +target. Other miners will choose different transactions and +therefore work on a slightly different putative block and +puzzle. + +The intention of the proof-of-work puzzle is that the +blockchain is at every given moment linearly ordered, see the +picture shown in \eqref{unconfirmed}. If we don’t have such a +linear ordering at any given moment then it may not be clear +who owns which Bitcoins. Assume a miner David is lucky and +finds a suitable salt to confirm some transactions. Should he +celebrate? Not yet. Typically the blockchain will look as +follows + +\begin{center} +\includegraphics[scale=0.65]{../pics/block_chain1.png} +\end{center} + +\noindent But every so often there will be a fork + +\begin{center} +\includegraphics[scale=0.65]{../pics/block_chain_fork.png} +\end{center} + +\noindent What should be done in this case? Well, the tie is +broken if another block is solved, like so: + +\begin{center} +\includegraphics[scale=0.4]{../pics/bitcoin_blockchain_branches.png} +\end{center} + +\noindent The rule in Bitcoins is: If a fork occurs, people on +the network keep track of all forks (they can see). But at any +given time, miners only work to extend whichever fork is +longest in their copy of the block chain. Why should miners +work on the longest fork? Well their incentive is to mine +Bitcoins. If somebody else already solved a puzzle, then it +makes more sense to work on a new puzzle and obtain the +Bitcoins for solving that puzzle, rather than waste efforts on +a fork that is shorter and therefore less likely to be +``accepted''. Note that whoever solved a puzzle on the +``loosing'' fork will actually not get any Bitcoins as reward. +Tough luck. -To see how private matter can lead really to the wrong -conclusions, take the example of Stephen Hawking: When he was -diagnosed with his disease, he was given a life expectancy of -two years. If employers would know about such problems, would -they have employed Hawking? Now, he is enjoying his 70+ -birthday. Clearly personal medical data needs to stay private. - -To cut a long story short, I let you ponder about the two -statements which are often voiced in discussions about privacy: +\subsubsection*{Alice against the Rest of the World} -\begin{itemize} -\item \textit{``You have zero privacy anyway. Get over -it.''}\\ -\mbox{}\hfill{}{\small{}(by Scott Mcnealy, former CEO of Sun)} +Let us see how the blockchain and the proof-of-work puzzles +avoid the problem of double spend. If Alice wants to cheat +Bob, she would need to pull off the following ploy: -\item \textit{``If you have nothing to hide, you have nothing -to fear.''} -\end{itemize} - -\noindent If you like to watch a movie which has this topic as -its main focus I recommend \emph{Gattaca} from -1997.\footnote{\url{http://www.imdb.com/title/tt0119177/}} If -you want to read up on this topic, I can recommend the -following article that appeared in 2011 in the Chronicle of -Higher Education: +\begin{center} +\includegraphics[scale=0.4]{../pics/bitcoin_blockchain_double_spend.png} +\end{center} -\begin{center} -\url{http://chronicle.com/article/Why-Privacy-Matters-Even-if/127461/} -\end{center} +\noindent Alice makes a transaction to Bob for paying, for +example, for an online order. This transaction is confirmed, +or validated, in block 2. Bob ships the goods around block 4. +In this moment, Alice needs to get into action and try to +validate the fraudulent transaction to herself instead. At +this moment she is in a race against all the computing power +of the ``rest of the world''. Because the incentive of the +rest of the world is to work on the longest chain, that is the +one with the transaction from Alice to Bob: -\noindent Funnily, or maybe not so funnily, the author of this -article carefully tries to construct an argument that does not -only attack the nothing-to-hide statement in cases where -governments \& co collect people's deepest secrets, or -pictures of people's naked bodies, but an argument that -applies also in cases where governments ``only'' collect data -relevant to, say, preventing terrorism. The fun is of course -that in 2011 we could just not imagine that respected -governments would do such infantile things as intercepting -people's nude photos. Well, since Snowden we know some people -at the NSA did exactly that and then shared such photos among -colleagues as ``fringe benefit''. +\begin{center} +\includegraphics[scale=0.4]{../pics/bitcoin_doublespend_blockchain_race.png} +\end{center} - -\subsubsection*{Re-Identification Attacks} +\noindent As shown in the picture she has to solve the puzzles +2a to 5a one after the other, because the hash of a block is +determined via the reference by all the data in the previous +block. She might be very lucky to solve one puzzle for a block +before the rest of the world, but to be lucky many times is +very unlikely. This principle of having to race against the +rest of the world avoids the ploy of double spend. -Apart from philosophical musings, there are fortunately also -some real technical problems with privacy. The problem I want -to focus on in this handout is how to safely disclose datasets -containing potentially very private data, say health records. -What can go wrong with such disclosures can be illustrated -with four well-known examples: +In order to raise the bar for Alice even further, merchants +accepting Bitcoins use the following rule of thumb: A +transaction is ``confirmed'' if \begin{itemize} -\item In 2006, a then young company called Netflix offered a 1 - Mio \$ prize to anybody who could improve their movie - rating algorithm. For this they disclosed a dataset - containing 10\% of all Netflix users at the time - (appr.~500K). They removed names, but included numerical - ratings of movies as well as times when ratings were - uploaded. Though some information was perturbed (i.e., - slightly modified). - - Two researchers had a closer look at this anonymised - data and compared it with public data available from the - International Movie Database (IMDb). They found that - 98\% of the entries could be re-identified in the - Netflix dataset: either by their ratings or by the dates - the ratings were uploaded. The result was a class-action - suit against Netflix, which was only recently resolved - involving a lot of money. +\item[(1)] it is part of a block in the longest fork, and +\item[(2)] at least 5 blocks follow it in the longest fork. In + this case we say that the transaction has 6 + confirmations. +\end{itemize} + +\noindent A simple calculation shows that this amount of +confirmations can take up to 1 hour and more. While this seems +excessively long, from the merchant's point of view it is not +that long at all. For this recall that ordinary creditcards +can have their transactions been rolled-back for 6 months or +so. The point however is that the odds for Alice being able to +cheat are very low, unless she can muster more than 50\% of +the world Bitcoin mining capacity. In this case she could +out-race the rest of the world. The point is however that +amassing such an amount of computing power is practically +impossible for a single person or even a moderately large +group. -\item In the 1990ies, medical datasets were often made public - for research purposes. This was done in anonymised form - with names removed, but birth dates, gender and ZIP-code - were retained. In one case where such data about - hospital visits of state employees in Massachusetts was - made public, the then governor assured the public that - the released dataset protected patient privacy by - deleting identifiers. - - A graduate student could not resist cross-referencing - public voter data with the released data that still - included birth dates, gender and ZIP-code. The result - was that she could send the governor his own hospital - record. It turns out that birth dates, gender and - ZIP-code uniquely identify 87\% of people in the US. - This work resulted in a number of laws prescribing which - private data cannot be released in such datasets. - -\item In 2006, AOL published 20 million Web search queries - collected from 650,000 users (names had been deleted). - This was again done for research purposes. However, - within days an old lady, Thelma Arnold, from Lilburn, - Georgia, (11,596 inhabitants) was identified as user - No.~4417749 in this dataset. It turned out that search - engine queries are deep windows into people's private - lives. - -\item Genome-Wide Association Studies (GWAS) was a public - database of gene-frequency studies linked to diseases. - It would essentially record that people who have a - disease, say diabetes, have also certain genes. In order - to maintain privacy, the dataset would only include - aggregate information. In case of DNA data this - aggregation was achieved by mixing the DNA of many - individuals (having a disease) into a single solution. - Then this mixture was sequenced and included in the - dataset. The idea was that the aggregate information - would still be helpful to researchers, but would protect - the DNA data of individuals. - - In 2007 a forensic computer scientist showed that - individuals can still be identified. For this he used - the DNA data from a comparison group (people from the - general public) and ``subtracted'' this data from the - published data. He was left with data that included all - ``special'' DNA-markers of the individuals present in - the original mixture. He essentially deleted the - ``background noise'' in the published data. The problem - with DNA data is that it is of such a high resolution - that even if the mixture contained maybe 100 - individuals, you can with current technology detect - whether an individual was included in the mixture or - not. - - This result changed completely how DNA data is nowadays - published for research purposes. After the success of - the human-genome project with a very open culture of - exchanging data, it became much more difficult to - anonymise data so that patient's privacy is preserved. - The public GWAS database was taken offline in 2008. - +Connected with the 6-confirmation rule is an interesting +phenomenon. On average, it would take several years for a +typical computer to solve a proof-of-work puzzle, so an +individual’s chance of ever solving one before the rest of the +world, which typically takes only 10 minutes, is negligibly +low. Therefore many people join groups called \emph{mining +pools} that collectively work to solve blocks, and distribute +rewards based on work contributed. These mining pools act +somewhat like lottery pools among co-workers, except that some +of these pools are quite large, and comprise more than 20\% of +all the computers in the network. It is said that BTCC, a +large mining pool, has limited its number of members in order +to not solve more than 6 blocks in a row. Otherwise this would +undermine the trust in Bitcoins, which is also not in the +interest of BTCC, I guess. Some statistics on mining pools can +be seen at + +\begin{center} +\url{https://blockchain.info/pools} +\end{center} + +\noindent Here is an interesting problem: You are part of a lottery +pool, if you chip in some of the money to buy a lottery ticket. In +this setting it is clear when you are in or outside of the pool. But +how do you make sure people work hard in a mining pool in order to +justify a fraction of any reward? If evil me had its way, I would just +claim I do work and then sit back and relax. Or even if I do some work +for a mining pool and I happen to find a correct salt, I would keep it +secret and submit it to the bitcoin network on the ``side''. Actually, +the idea of mining pools has opened up a full can of interesting +problems. + + + +\subsubsection*{Bitcoins for Real} + +Let us now turn to the nitty gritty details. As a participant +in the Bitcoin network you need to generate and store a +public-private key pair. The public key you need to advertise +in order to receive payments (transactions). The private key +needs to be securely stored. For this there seem to be three +possibilities + +\begin{itemize} +\item an electronic wallet on your computer +\item a cloud-based storage (offered by some Bitcoin services) +\item paper-based \end{itemize} -\noindent There are many lessons that can be learned from -these examples. One is that when making datasets public in -anonymised form, you want to achieve \emph{forward privacy}. -This means, no matter what other data that is also available -or will be released later, the data in the original dataset -does not compromise an individual's privacy. This principle -was violated by the availability of ``outside data'' in the -Netflix and governor of Massachusetts cases. The additional -data permitted a re-identification of individuals in the -dataset. In case of GWAS a new technique of re-identification -compromised the privacy of people in the dataset. The case of -the AOL dataset shows clearly how incomplete such data can be: -Although the queries uniquely identified the older lady, she -also looked up diseases that her friends had, which had -nothing to do with her. Any rational analysis of her query -data must therefore have concluded, the lady is on her -death bed, while she was actually very much alive and kicking. +\noindent The first two options of course offer convenience +for making and receiving transactions. But given the nature of +the private keys and how much security relies on them (recall +if somebody gets hold of it, your Bitcoins are quickly lost +forever) I would opt for the third option for anything except +for trivial amounts of Bitcoins. As we have seen earlier in +the course, securing a computer system that it can withstand a +targeted breakin is still very much an unsolved problem. + +An interesting fact with Bitcoin keys is that there is no +check for duplicate addresses. This means when generating a +public-private key, you should really start with a carefully +chosen random number such that there is really no chance to +step on somebody's feet in the $2^{160}$ space of +possibilities. Again if you share an address with somebody +else, he or she has access to all your unspend transactions. +The absence of such a check is easily explained: How would one +do this in a distributed system? The answer you can't. It is +possible to do some sanity check of addresses that are already +used in the blockchain, but this is not a fail-proof method. +One really has to trust on the enormity of the $2^{160}$ +space for addresses. + +Let us now look at the concrete data that is stored in an transaction +message: + +\lstinputlisting[language=Scala]{../slides/msg} + +\noindent The hash in Line 1 is the hash of all the data that +follows. It is a kind of serial number for the transaction. +Line 2 contains a version number in case there are some +incompatible changes to be made. Lines 3 and 4 specify how +many incoming transactions are combined and how many outgoing +transactions there are. In our example there are one for each. +Line 5 specifies a lock time for when the transaction is +supposed to become active---this is usually set to 0 to become +active immediately. Line 6 specifies the size of the message; +it has nothing to do with the Bitcoins that are transferred. +Lines 7 to 11 specify where the Bitcoins in the transaction +are coming from. The has in line 9 specifies the incoming +transaction and the \pcode{n} in Line 10 specifies which +output of the transaction is referred to. The signature in +line 11 specifies the address (public key $K^{pub}$) from +where the Bitcoins are taken and the digital signature of the +address, that is $\{K^{pub}\}_{K^{priv}}$. Lines 12 to 15 +specify the value of the first outgoing transaction. In this +case 0.319 Bitcoins. The hash in Line 14 specifies the address +to where the Bitcoins are transferred. + +As can be seen there is no need to issue serial numbers for +transactions, the hash of the transaction data can do this +job. The hash will contain the sender addresses and +hash-references to the incoming transactions, as well as the +public key of the incoming transaction. This uniquely +identifies a transaction and the hash is the unique +fingerprint of it. The in-field also contains the address to +which a earlier transaction is made. The digital signature +ensures everybody can check that the person who makes this +transaction is in the possession of the private key. Otherwise +the signature would not match up with the public-key address. + +When mining the blockchain it only needs to be ensured that +the transactions are consistent (all hashes and signatures +match up). Then we need to generate the correct previous-block +link and solve the resulting puzzle. Once the block is +accepted, everybody can check the integrity of the whole +blockchain. + +A word of warning: The point of a lottery is that some people +win. But equally, that most people lose. Mining Bitcoins has +pretty much the same point. According to the article below, a +very large machine (very, very large in terms of June 2014) +could potentially mine \$40 worth of Bitcoins a day, but would +require magnitudes more of electricity costs to do so. -In 2016, Yahoo released the so far largest machine learning -dataset to the research community. It includes approximately -13.5 TByte of data representing around 100 Billion events from -anonymized user-news items, collected by recording -interactions of about 20M users from February 2015 to May -2015. Yahoo's gracious goal is to promote independent research -in the fields of large-scale machine learning and recommender -systems. It remains to be seen whether this data will really -only be used for that purpose. +\begin{center} +\url{http://bitcoinmagazine.com/13774/government-bans-professor-mining-bitcoin-supercomputer/} +\end{center} + +\noindent Bitcoin mining nowadays is only competitive, or +profitable, if you get the energy for free, or use special +purpose computing devices. + +This about ``free'' energy can actually hurt you very badly in +unexpected ways. You probably have heard about, or even used, +Amazon's Elastic Compute Cloud (EC2). Essentially, Amazon is +selling computing power that you can use to run your web site, +for example. It is \emph{elastic} in the sense that if you +have a lot of visitors, you pay a lot, if you have only a few, +then it is cheap. In order to bill you, you need to set +up an account with Amazon and receive some secret keys in +order to authenticate you. The clever (but also dangerous) bit +is that you upload the code of your web site to GitHub and +Amazon will pull it from there. You can probably already guess +where this is going: in order to learn about Amazon's API, it +gives out some limited computing power for free. Somebody used +this offer in order to teach himself Ruby on Rails with a +mildly practical website. Unfortunately, he uploaded also his +secret keys to GitHub (this is really an easy mistake). Now, +nasty people crawl GitHub for the purpose of stealing such +secret keys. What can they do with this? Well, they quickly +max out the limit of computing power with Amazon and mine +Bitcoins (under somebody else's account). Fortunately for this +guy, Amazon was aware of this scam and in a goodwill gesture +refunded him the money the nasty guys incurred over +night with their Bitcoin mining. If you want to read the +complete story, google for ``My \$2375 Amazon EC2 Mistake''. + +\subsubsection*{Multi-Signature Transactions} + +To be explained. -\subsubsection*{Differential Privacy} +\subsubsection*{Anonymity with Bitcoins} + +One question one often hears is how anonymous is it actually +to pay with Bitcoins? Paying with paper money used to be a +quite anonymous act (unlike paying with credit cards, for +example). But this has changed nowadays: You cannot come to a +bank anymore with a suitcase full of money and try to open a +bank account. Strict money laundering and taxation laws mean +that not even Swiss banks are prepared to take such money and +open a bank account. That is why Bitcoins are touted as +filling this niche again of anonymous payments. + +While Bitcoins are intended to be anonymous, the reality is +slightly different. I fully agree with the statement by +Nielsen from the blog article I referenced at the beginning: + +\begin{quote}\it{}``Many people claim that Bitcoin can be used +anonymously. This claim has led to the formation of +marketplaces such as Silk Road (and various successors), which +specialize in illegal goods. However, the claim that Bitcoin +is anonymous is a myth. The block chain is public, meaning +that it’s possible for anyone to see every Bitcoin transaction +ever. Although Bitcoin addresses aren't immediately associated +to real-world identities, computer scientists have done a +great deal of work figuring out how to de-anonymise +`anonymous' social networks. The block chain is a marvellous +target for these techniques. I will be extremely surprised if +the great majority of Bitcoin users are not identified with +relatively high confidence and ease in the near future.'' +\end{quote} + +\noindent The only thing I can add to this is that with the Bitcoin +blockchain we will in the future have even more pleasure hearing +confessions from reputable or not-so-reputable people, like the +infamous ``I did not inhale'' from an US +president.\footnote{\url{www.youtube.com/watch?v=Bktd_Pi4YJw}} The +whole point of the blockchain is that it public and will always be. -Differential privacy is one of the few methods that tries to -achieve forward privacy. The basic idea is to add appropriate -noise, or errors, to any query of the dataset. The intention -is to make the result of a query insensitive to individual -entries in the database. That means the results are -approximately the same no matter if a particular individual is -in the dataset or not. The hope is that the added error does -not eliminate the ``signal'' one is looking for in the -dataset. +There are some precautions one can take for boosting anonymity, for +example to use a new public-private key pair for every new +transaction, and to access Bitcoin only through the Tor network. But +the transactions in Bitcoins are designed such that they allow one to +combine incoming transactions. In such cases we know they must have +been made by the single person who knew the corresponding private +keys. So using different public-private keys for each transaction +might not actually make the de-anonymisation task much harder. And the +point about de-ano\-nymising `anonymous' social networks is that the +information is embedded into the structure of the transition +graph. And this cannot be erased with Bitcoins. + +One paper that has fun with spotting transactions made to Silk Road (2.0) +and also to Wikileaks is + +\begin{center} +\url{http://people.csail.mit.edu/spillai/data/papers/bitcoin-transaction-graph-analysis.pdf} +\end{center} + +\noindent +A paper that gathers some statistical data about the blockchain is + +\begin{center} +\url{https://eprint.iacr.org/2012/584.pdf} +\end{center} + +\subsubsection*{Government Meddling} + +Finally, what are the options for a typical Western government to +meddle with Bitcoins? This is of course one feature the proponents of +Bitcoins also tout: namely that there aren't any options. In my +opinion this is far too naive and far from the truth. Let us assume +some law enforcement agencies would not have been able to uncover the +baddies from Silk Road 1.0 and 2.0 (they have done so by uncovering +the Tor network, which is an incredible feat on its own). Would the +government in question have stopped? I do not think so. The next +target would have been Bitcoin. If I were the government, this is +what I would consider: -%\begin{center} -%User\;\;\;\; -%\begin{tabular}{c} -%tell me $f(x)$ $\Rightarrow$\\ -%$\Leftarrow$ $f(x) + \text{noise}$ -%\end{tabular} -%\;\;\;\;\begin{tabular}{@{}c} -%Database\\ -%$x_1, \ldots, x_n$ -%\end{tabular} -%\end{center} -% -%\begin{center} -%\begin{tabular}{l|l} -%Staff & Salary\\\hline -%$PM$ & \pounds{107}\\ -%$PF$ & \pounds{102}\\ -%$LM_1$ & \pounds{101}\\ -%$LF_2$ & \pounds{97}\\ -%$LM_3$ & \pounds{100}\\ -%$LM_4$ & \pounds{99}\\ -%$LF_5$ & \pounds{98} -%\end{tabular} -%\end{center} -% -% -%\begin{center} -%\begin{tikzpicture} -%\begin{axis}[symbolic y coords={salary}, -% ytick=data, -% height=3cm] %\addplot+[jump mark mid] coordinates %{(0,salary) (0.1,salary) % (0.4,salary) (0.5,salary) % (0.8,salary) (0.9,salary)}; %\end{axis} %\end{tikzpicture} -%\end{center} -% -%\begin{tikzpicture}[outline/.style={draw=#1,fill=#1!20}] % \node [outline=red] {red box}; % \node [outline=blue] at (0,-1) {blue box}; %\end{tikzpicture} +\begin{itemize} +\item The government could compel ``mayor players'' to blacklist + Bitcoins (for example at Bitcoin exchanges, which are usually + located somewhere in the vicinity of the government's reach). This + would impinge on what is called \emph{fungibility} of Bitcoins and + make them much less attractive to baddies. Suddenly their + ``hard-earned'' Bitcoin money cannot be spent anymore. The attraction + of this option is that this blacklisting can be easily done + ``whole-sale'' and therefore be really be an attractive target for + governments \& Co. +\item The government could attempt to coerce the developer + community of the Bitcoin tools. While this might be a + bit harder, we know certain governments are ready to + take such actions (we have seen this with Lavabit, just + that the developers there refused to play ball and shut + down their complete operation). +\item The government could also put pressure on mining pools + in order to blacklist transactions from baddies. Or be a + big miner itself. Given the gigantic facilities that + are built for institutions like the NSA (pictures from + the Utah dessert) + + \begin{center} + \includegraphics[scale=0.04]{../pics/nsautah1.jpg} + \hspace{3mm} + \includegraphics[scale=0.031]{../pics/nsautah2.jpg} + \end{center} + + this would not be such a high bar to jump over. Remember it + ``only'' takes to be temporarily in control of 50\%-plus of the + mining capacity in order to undermine the trust in the + system. Given sophisticated stories like Stuxnet (where we still + do not know the precise details) maybe even such large + facilities are not really needed. What happens, for example, if + a government starts DoS attacks on existing miners? They have + complete control (unfortunately) of all mayor connectivity + providers, i.e.~ISPs. + + There are estimates that the Bitcoin mining capacity + outperforms the top 500 supercomputers in the world, + combined(!): + + \begin{center}\small + \url{http://www.forbes.com/sites/reuvencohen/2013/11/28/global-bitcoin-computing-power-now-256-times-faster-than-top-500-supercomputers-combined/} + \end{center} + + But my gut feeling is that these are too simplistic + calculations. In security (and things like Bitcoins) the + world is never just black and white. The point is once + the trust is undermined, the Bitcoin system would need + to be evolved to Bitcoins 2.0. But who says that Bitcoin + 2.0 will honour the Bitcoins from Version 1.0? + \end{itemize} -\ldots +\noindent A government would potentially not really +need to follow up with such threads. Just the rumour that it +would, could be enough to get the Bitcoin-house-of-cards to +tumble. Some governments have already such an ``impressive'' +trackrecord in this area, such a thread would be entirely +credible. Because of all this, I would not have too much hope +that Bitcoins are free from interference by governments \& Co when +it will stand in their way, despite what everybody else is +saying. To sum up, the technical details behind Bitcoins are +simply cool. But still the entire Bitcoin ecosystem is in my +humble opinion rather fragile. + + +\subsubsection*{Isn't there anything good with Bitcoins?} + +As you can see, so far my argument was that yes the Bitcoin system is +based on a lot of very cool technical ideas, but otherwise it is a big +scam. You might wonder if there is not something good (in terms of +valuable for civilisation) in the bitcoin system? I think there is +actually: diamonds are quite valuable and because of this can be +used as a form of `money'---just remember the song with the line +`diamonds are forever'. + +The problem with diamonds is that in some places where they are found, +they also fund some stupid wars. You like to set up a usable system +whereby you can check whether a diamond comes from a reputable source +(not funding any wars) or from a dodgy source. For this you have to +know that `clearing houses' for diamonds can engrave with lasers +unique numbers inside the diamonds. These engravings are invisible to +the naked eye and as far as I remember these numbers cannot be removed, +except by destroying the diamond. Even if it can be removes, diamonds +without the number cannot (hopefully) be sold. +How do bitcoins come into the picture? The idea is called +\emph{coloured coins}, where you attach some additional information to +some `coins'. In the diamond example the bitcoin transactions are +supposed to act as a certificate where diamonds are from (reputable +sources or not). For this you have to know that you can attach a very +short custom-made message with each bitcoin transaction. So you would +record the diamond number inside the message. + +Now, you would set the system up so that a trusted entity (which +exists in the diamond world) buys with their public key bitcoins (or +smaller amounts). These trusted entities are essentially the places +that also cut the raw diamonds. The idea is whenever you buy a +diamond, you like to have also the corresponding bitcoin +transaction. If you want to sell the diamond, you make a transaction +to the new owner. The new owner will ask for this message, because +otherwise he/she cannot sell it later on. + +The advantage is that for each diamond you can trace back that the +transaction must have originated from the trusted entity. If yes, your +diamond will be sellable. If you do not have the message, the diamond +comes from a dodgy source and will (hopefully) not be sellable later +on. In this way you skew the incentives such that only legitimate +diamond are of value. The bitcoin system just helps with being able to +check whether the message originates from the trusted entity or +not....you do not have to consult anybody else and pay money for this +consultation. Or in any way reveal your identity by such a consultation +(the police might just keep a particularly close an eye on who contacts +such a clearing house). + +Since we hopefully all agree that funding stupid wars is bad, any +system that can starve funds for such wars must be good. Piggy-bagging +on the trust established by the bitcoin system on the public block chain +makes such a system realisable. \subsubsection*{Further Reading} -Two cool articles about how somebody obtained via the Freedom -of Information Law the taxicab dataset of New York and someone -else showed how easy it is to mine for private information: +Finally, finally, the article \begin{center}\small -\begin{tabular}{p{0.78\textwidth}} -\url{http://chriswhong.com/open-data/foil_nyc_taxi/}\smallskip\\ -\url{http://research.neustar.biz/2014/09/15/riding-with-the-stars-passenger-privacy-in-the-nyc-taxicab-dataset} -\end{tabular} +\url{http://www.extremetech.com/extreme/155636-the-bitcoin-network-outperforms-the-top-500-supercomputers-combined} \end{center} -\noindent -A readable article about how supermarkets mine your shopping -habits (especially how they prey on new exhausted parents -;o) appeared in 2012 in the New York Times: +\noindent makes an interesting point: If people are willing to +solve meaningless puzzles for hard, cold cash and with this +achieve rather impressive results, what could we achieve if +the UN, say, would find the money and incentivise people to, +for example, solve protein folding +puzzles?\footnote{\url{http://en.wikipedia.org/wiki/Protein_folding}} +For this there are projects like +Folding@home.\footnote{\url{http://folding.stanford.edu}} +This might help with curing diseases such as Alzheimer or +diabetes. The same point is made in the article + +\begin{center}\small +\url{http://gizmodo.com/the-worlds-most-powerful-computer-network-is-being-was-504503726} +\end{center} + +A definitely interesting and worthy use of Bitcoins has been explored +in the thesis \begin{center} -\url{http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html} +\url{http://enetium.com/resources/Thesis.pdf} \end{center} -\noindent An article that analyses privacy and shopping habits -from a more economic point of view is available from: +\noindent where the author proposes ways of publishing information +that is censor-resistant as part of the blockchain. The idea is that +if a government wants to use Bitcoins, it would also have to put up +with plain-text data that can be included in a transaction. -\begin{center} -\url{http://www.dtc.umn.edu/~odlyzko/doc/privacy.economics.pdf} -\end{center} +Ken Shirrif in his blog at -\noindent An attempt to untangle the web of current technology -for spying on consumers is published in: +\begin{center}\small +\url{http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html} +\end{center} -\begin{center} -\url{http://cyberlaw.stanford.edu/files/publication/files/trackingsurvey12.pdf} -\end{center} +\noindent writes that every day the electricity consumption of mining +for bitcoins is roughly 15 Mega Watts---the energy consumption of a country +like Cambodia. He writes: -\noindent An article that sheds light on the paradox that -people usually worry about privacy invasions of little -significance, and overlook the privacy invasion that might -cause significant damage: - -\begin{center} -\url{http://www.heinz.cmu.edu/~acquisti/papers/Acquisti-Grossklags-Chapter-Etrics.pdf} -\end{center} +\begin{quote} + \it{}``The difficulty of mining a block is astounding. At the + current difficulty, the chance of a hash succeeding is a bit less + than one in $10^{19}$. Finding a successful hash is harder than + finding a particular grain of sand from all the grains of sand on + Earth. To find a hash every ten minutes, the Bitcoin hash rate needs + to be insanely large. Currently, the miners on the Bitcoin network + are doing about 25 million gigahashes per second. That is, every + second about 25,000,000,000,000,000 blocks gets hashed. I estimate + (very roughly) that the total hardware used for Bitcoin mining cost + tens of millions of dollars and uses as much power as the country of + Cambodia.'' +\end{quote} \end{document} -http://randomwalker.info/teaching/fall-2012-privacy-technologies/? -http://chronicle.com/article/Why-Privacy-Matters-Even-if/127461/ -http://repository.cmu.edu/cgi/viewcontent.cgi?article=1077&context=hcii -https://josephhall.org/papers/NYU-MCC-1303-S2012_privacy_syllabus.pdf -http://www.jetlaw.org/wp-content/uploads/2014/06/Bambauer_Final.pdf -http://www.cs.cmu.edu/~yuxiangw/docs/Differential%20Privacy.pdf -https://www.youtube.com/watch?v=Gx13lgEudtU -https://www.cs.purdue.edu/homes/ctask/pdfs/CERIAS_Presentation.pdf -http://www.futureofprivacy.org/wp-content/uploads/Differential-Privacy-as-a-Response-to-the-Reidentification-Threat-Klinefelter-and-Chin.pdf -http://www.cis.upenn.edu/~aaroth/courses/slides/Overview.pdf -http://www.cl.cam.ac.uk/~sjm217/papers/tor14design.pdf +bit coin + +A fistful of bitcoins +http://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf +http://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf + +Ross Anderson & Co (no dispute resolution; co-ercion) +http://www.cl.cam.ac.uk/~sjm217/papers/fc14evidence.pdf -%%% Local Variables: -%%% mode: latex -%%% TeX-master: t -%%% End: +http://www.michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/ +http://www.imponderablethings.com/2013/07/how-bitcoin-works-under-hood.html + +http://randomwalker.info/bitcoin/ + +Jeffrey Robinson +Bitcon: The Naked Truth about Bitcoin + +The Bitcoin Backbone Protocol: Analysis and Applications +https://eprint.iacr.org/2014/765.pdf + +Bitcoin book +http://chimera.labs.oreilly.com/books/1234000001802/ch04.html#public_key_derivation