diff -r 38ddbc59325a -r abe178b3197e handouts/ho06.tex --- a/handouts/ho06.tex Sun Oct 25 23:34:04 2015 +0000 +++ b/handouts/ho06.tex Thu Nov 05 02:11:13 2015 +0000 @@ -29,7 +29,8 @@ \end{quote} \noindent -Take the first non-article word in this definition, +Take the first non-small word\footnote{Let's say the, a, an, +or, and \ldots fall into the category of small words.} in this definition, in this case \textit{individual}, and look up the definition of this word, say @@ -38,42 +39,42 @@ \end{quote} \noindent -In this definition take the second non-article word, that +In this definition take the second non-small word, that is \textit{human}, and again look up the definition of this word. This will yield \begin{quote} -``relating to \textit{or} characteristic of humankind'' +``relating to or \textit{characteristic} of humankind'' \end{quote} \noindent You could go on looking up the definition of the third -non-article in this definition and so on. But let us assume +non-small word in this definition and so on. But let us assume you agreed with Bob to stop after three iterations with the third non-article word in the last definition, that is \textit{or}. Now, instead of sending to Bob the solution -\textit{folio}, you send to him \textit{or}. +\textit{folio}, you send to him \textit{characteristic}. How can Bob verify that you know the solution? Well, once he solved it himself, he can use the dictionary and follow the same ``trail'' as you did. If the final word agrees with what -you had sent him, he must infer you knew the solution earlier than -him. This protocol works like a one-way hash function because -\textit{or} does not give any hint as to what was the first -word was. I leave you to think why this protocol avoids -articles? +you had sent him, he must infer you knew the solution earlier +than him. This protocol works like a one-way hash function +because \textit{characteristic} does not give any hint as to +what was the first word was. I leave you to think why this +protocol avoids small words? After Bob found his solution and verified that according to -the protocol it ``maps'' also to \textit{or}, can he be -entirely sure no cheating is going on? Not with 100\% -certainty. It could have been possible that he was -given \textit{or} as the word, but it derived from a -different word. This might seem very unlikely, but at -least theoretical it is a possibility. Protocols based on +the protocol it ``maps'' also to \textit{characteristic}, can +he be entirely sure no cheating is going on? Not with 100\% +certainty. It could have been possible that he was given +\textit{characteristic} as the word, but it derived from a +different word. This might seem very unlikely, but at least +theoretical it is a possibility. Protocols based on zero-knowledge proofs will produce a similar result---they give an answer that might be erroneous in a very small number -of cases. The point is to iterate them long enough so that -the theoretical possibility of cheating is negligibly small. +of cases. The point is to iterate them long enough so that the +theoretical possibility of cheating is negligibly small. By the way, the authors of the paper ``Dismantling Megamos Crypto: Wirelessly Lockpicking a Vehicle Immobilizer'' who @@ -88,13 +89,11 @@ made public. Bob needs to know that \textit{folio} is the solution before he can verify the claim of Alice that she had the solution first. Similarly with the car-crypto paper: we -need to wait until the authors are finally allowed to publish -their findings in order to verify the hash. This might happen -at some point, but equally it might never happen (what for -example happens if the authors lose their copy of the paper -because of a disk failure?). Zero-knowledge proofs, in -contrast, can be immediately checked, even if the secret is -not public yet and perhaps never will be. +needed to wait until September 2015 when the authors were +finally able to publish their findings in order to verify the +hash. Zero-knowledge proofs, in contrast, can be immediately +checked, even if the secret is not public yet and perhaps +never will be. \begin{figure} \begin{center} @@ -157,12 +156,15 @@ should emerge from $A$ then Alice is in trouble: Bob will find out she does not actually know the secret. So in order to fool Bob she needs to anticipate his call, and already go into the -corresponding tunnel. This of course also does not work. -Consequently in order to find out whether Alice cheats, Bob -just needs to repeat this protocol many times. Each time Alice -has a chance of $\frac{1}{2}$ to be lucky or being found out. -Iterating this $n$ times means she must be right every time -and when cheating the probability for this is $\frac{1}{2}^n$. +corresponding tunnel. This of course also does not work, since +Bob can make any call he wants. Consequently in order to find +out whether Alice cheats, Bob just needs to repeat this +protocol many times. Each time Alice has a chance of +$\frac{1}{2}$ to be lucky or being found out. Iterating this +$n$ times means she must be right every time and when +cheating: the probability for this is $\frac{1}{2}^n$, number +that for already relatively small $n$, say 10, is incredibly +small. There are some interesting observations we can make about @@ -261,18 +263,22 @@ follow the following protocol: \begin{enumerate} -\item Alice generates an isomorphic graph $H$ which she - sends to Bob. -\item After receiving $H$, Bob asks Alice either for an +\item Alice generates an isomorphic graph $H$ which she sends + to Bob (in each iteration she needs to generate a + different $H$). +\item + After receiving $H$, Bob asks Alice either for an isomorphism between $G_1$ and $H$, or $G_2$ and $H$. \item Alice and Bob repeat this procedure $n$ times. \end{enumerate} \noindent In Step 1 it is important that Alice always -generates a fresh isomorphic graph. As said before, -this is relatively easy to generate by consistently -relabelling nodes. If she started from $G_1$, Alice will -have generated +generates a fresh isomorphic graph. I let you think what +would happen if Alice sends out twice the same graph $H$. + +As said before, this is relatively easy to generate by +consistently relabelling nodes. If she started from $G_1$, +Alice will have generated \begin{equation} H = \sigma'(G_1)\label{hiso} @@ -323,7 +329,8 @@ \noindent As can be seen the protocol runs for some agreed number of iterations. The $H_i$ Alice needs to -produce, need to be all distinct. I let you think why? +produce, need to be all distinct. I hope you now know +why? It is also crucial that in each iteration, Alice first sends $H_i$ and then Bob can decide which isomorphism he wants: @@ -354,7 +361,8 @@ $H_{1..n}$ (they need to be all distinct) \item she feeds the $H_{1..n}$ into a hashing function (for example encoded as as matrix) -\item she takes the first $n$ bits of the output: +\item she takes the first $n$ bits of the output of the hashing + function: whenever the output is $0$, she shows an isomorphism with $G_1$; for $1$ she shows an isomorphism with $G_2$ @@ -362,7 +370,7 @@ \noindent The reason why this works and achieves the same goal as the interactive variant is that Alice has no -control over the hashing functions. It would be +control over the hashing function. It would be computationally just too hard to assemble a set of $H_{1..n}$ such that she can force where $0$s and $1$s in the hash values are such that it would pass an external @@ -378,7 +386,7 @@ any NP problem can be used as part of a ZKP protocol. -\subsubsection*{Using Modular Arithmetic for ZKP Protocols} +\subsubsection*{Using Modular Logarithms for ZKP Protocols} While information can be encoded into graph isomorphisms, it is not the most convenient carrier of information. Clearly it @@ -393,17 +401,207 @@ \end{center} \noindent holds. The secret Alice tries to keep secret is $x$. -\bigskip +The point of the modular logarithm is that it is very hard +from the public data to calculate $x$ (for large primes). +Now the protocol proceeds in three stages: + +\begin{itemize} +\item {\bf Commitment Stage} + \begin{enumerate} + \item Alice generates $z$ random numbers $r_1, \ldots, r_z$, + all less than $p - 1$. Alice then sends Bob for all + $i = 1,\ldots, z$: + \[ h_i = A^{r_i}\; mod\; p\] + \item Bob generates $z$ random bits, say $b_1,\ldots, b_z$. He can do this + by flipping $z$ times a coin, for example. + \item For each bit $b_i$, Alice sends Bob an $s_i$ where + + \begin{center} + \begin{tabular}{ll} + if $b_i = 0$: & $s_i = r_i$\\ + if $b_i = 1$: & $s_i = (r_i - r_j) \;mod\; (p -1)$\\ + \end{tabular} + \end{center} + + where $r_j$ is the lowest $j$ where $b_j = 1$. + + \end{enumerate} + \end{itemize} + +\noindent For understanding the last step, let $z$ be just 4. +We have four random values $r_i$ chosen by Alice and four +random bits $b_i$ chosen subsequently by Bob, for example + + \begin{center} + \begin{tabular}{lcccc} + $r_i$:\; & 4 & 9 & 1 & 3\\ + $b_i$:\; & 0 & 1 & 0 & 1\\ + & & $\uparrow$ \\ + & & $j$ + \end{tabular} + \end{center} + + \noindent The highlighted column is the lowest where $b_i = + 1$ (counted from the left). That means $r_j = 9$. The reason + for letting Alice choose the random numbers $r_1, \ldots, r_z$ + will become clear shortly. Next is the confirmation + phase where Bob essentially checks whether Alice has sent + him ``correct'' $s_i$ and $h_i$. + +\begin{itemize} +\item {\bf Confirmation Stage} + \begin{enumerate} + \item For each $b_i$ Bob checks whether $s_i$ + conform to the protocol + + \begin{center} + \begin{tabular}{ll} + if $b_i = 0$: & $A^{s_i} \stackrel{?}{\equiv} h_i\;mod\;p$\\ + if $b_i = 1$: & $A^{s_i} \stackrel{?}{\equiv} h_i * h_j^{-1} \;mod\; p$\\ + \end{tabular} + \end{center} + \end{enumerate} +\end{itemize} + +\noindent To understand the case for $b_i = 1$, you have +to do the following calculation: + +\begin{center} +\begin{tabular}{r@{\hspace{1mm}}c@{\hspace{1mm}}l} +$A^{s_i}$ & $=$ & $A^{r_i - r_j}$\\ + & $=$ & $A^{r_i} * A^{-r_j}$\\ + & $=$ & $h_{r_i} * h_{r_j}^{-1}\;mod\;p$ +\end{tabular} +\end{center} + +\noindent What is interesting that so far nothing has been +sent about $x$, which is the secret Alice has. Also notice +that Bob does not know $r_j$. He received + +\begin{center} +$r_j - r_j$, $r_m - r_j$, \ldots, $r_p - r_j \;mod \;p - 1$ +\end{center} + +\noindent whenever his corresponding bits were $1$. So Bob +does not know $r_j$ and also does not know any $r_i$ where the +bit was $1$. Information about the $x$ is sent in the next +stage (obviously not revealing $x$). + +\begin{itemize} +\item {\bf Proving Stage} + +\begin{enumerate} +\item Alice proves she knows $x$, the discrete log of $B$, +by sending + +\begin{center} +$s_{z+1} = x - r_j\;mod\;p-1$ +\end{center} + +\item Bob confirms + +\begin{center} +$A^{s_{z+1}} \stackrel{?}{\equiv} B * h_j^{-1} \;mod \; p$ +\end{center} +\end{enumerate} +\end{itemize} + +\noindent To understand the last step, you have to do the trick +again that + +\[A^{s_{z+1}} = A^{x-r_j} = \ldots +\] \noindent -\ldots still to be completed (for example can be attacked by -MITM attacks) +which I leave to you. + +Now the question is how can Alice cheat? In order to cheat she +has to coordinate what she sends as $h_i$ in step 1 and $s_i$ +in step 3 of the commitment stage, and also what to send as +$s_{z+1}$ in the proving stage. For the latter of course +Alice does not know $x$, so she just chooses some random +number for $s_{z+1}$ and calculates + +\[A^{s_{z+1}}\] + +\noindent +and then solves the equation + +\[A^{s_{z+1}} \equiv B * y \;mod\;p\] + +\noindent for $y$. This is easy since no logarithm needs to be +computed. If Alice can guess the $j$ where the first 1 will +appear in Bob's bit vector, then she sends the inverse of $y$ +as $h_j$ and 0 as $s_j$. However, notice that when she +calculates a solution for $y$ she does not know $r_j$. For this she +would need to calculate the modular logarithm + + +\[ +y \equiv A^{r_j}\;mod\;p +\] + +\noindent which is hard (see step 1 in the commitment stage). + +Having settled on what $h_j$ should be, now what should Alice +send as the other $h_i$ and other $s_i$? If the $b_i$ happens +to be a 1, then the $h_i$ and other $s_i$ need to satisfy the +test + +\[A^{s_i} \stackrel{?}{\equiv} h_i * h_j^{-1} \;mod\; p\] + +\noindent where she has already settled on the value of +$h_j^{-1}$. Lets say she choses $s_i$ at random, then she just +needs to solve + +\[A^{s_i} \equiv z * h_j^{-1} \;mod\; p\] + +\noindent for $z$. Again that is easy, but it does not allow +us to know $r_i$, because then we would again need to solve +a modular logarithm problem. Let us call an $h_i$ which was +solved the easy way as \emph{bogus}. Alice has to produce +bogus $h_i$ for all bits that are going to be $1$ in advance! +This means she has to guess all the bits correctly. (Yes?) + +Let us see what happens if she guesses wrongly: Suppose the +bit $b_i = 1$ where she thought she will get a 0. Then she has +already sent an $h_i$ and $h_j$ and now must find an $s_i$ +such that + +\[A^{s_i} \equiv h_i * h_j^{-1} \;mod\; p\] + +\noindent holds. For this remember in calculating $h_i$, she +just chose a random $s_i$. Now she has to send a genuine one. +But this is of course too hard. If she knew the genuine $r_i$ +and $r_j$ for $h_i$ and $h_j$, it would be easy (in this case +$s_i = r_i - r_j$). But she does not. So she will be found +out. If $b_i = 0$, but she thought she will get a 1, then +she has to send a $s_i$ which satisfies + +\[A^{s_i} \equiv h_i\;mod\;p\] + +\noindent Again she does not know $r_i$. So it is a too hard +task and she will be found out again. + +To sum up, in order for Alice to successfully cheat Bob, she +needs to guess \emph{all} bits correctly. She has only a +$\frac{1}{2^z}$ chance of doing this. \subsubsection*{Further Reading} Make sure you understand what NP problems -are.\footnote{\url{http://en.wikipedia.org/wiki/NP_(complexity)}} They -are the building blocks for zero-knowledge proofs. +are.\footnote{\url{http://en.wikipedia.org/wiki/NP_(complexity)}} +They are the building blocks for zero-knowledge proofs. +Zero-Knowldege proofs are not yet widely used in production +systems, but it is slowly gaining ground. One application +where they pop up are crypto currencies. + +If you want to brush up on the modular logarithm problem, +the Khan Academy has a nice video: + +\begin{center} +\url{https://www.khanacademy.org/video/discrete-logarithm-problem} +\end{center} \end{document}