--- a/handouts/ho06.tex Sun Oct 25 23:34:04 2015 +0000
+++ b/handouts/ho06.tex Thu Nov 05 02:11:13 2015 +0000
@@ -29,7 +29,8 @@
\end{quote}
\noindent
-Take the first non-article word in this definition,
+Take the first non-small word\footnote{Let's say the, a, an,
+or, and \ldots fall into the category of small words.} in this definition,
in this case \textit{individual}, and look up the definition
of this word, say
@@ -38,42 +39,42 @@
\end{quote}
\noindent
-In this definition take the second non-article word, that
+In this definition take the second non-small word, that
is \textit{human}, and again look up the definition of this
word. This will yield
\begin{quote}
-``relating to \textit{or} characteristic of humankind''
+``relating to or \textit{characteristic} of humankind''
\end{quote}
\noindent
You could go on looking up the definition of the third
-non-article in this definition and so on. But let us assume
+non-small word in this definition and so on. But let us assume
you agreed with Bob to stop after three iterations with the
third non-article word in the last definition, that is
\textit{or}. Now, instead of sending to Bob the solution
-\textit{folio}, you send to him \textit{or}.
+\textit{folio}, you send to him \textit{characteristic}.
How can Bob verify that you know the solution? Well, once he
solved it himself, he can use the dictionary and follow the
same ``trail'' as you did. If the final word agrees with what
-you had sent him, he must infer you knew the solution earlier than
-him. This protocol works like a one-way hash function because
-\textit{or} does not give any hint as to what was the first
-word was. I leave you to think why this protocol avoids
-articles?
+you had sent him, he must infer you knew the solution earlier
+than him. This protocol works like a one-way hash function
+because \textit{characteristic} does not give any hint as to
+what was the first word was. I leave you to think why this
+protocol avoids small words?
After Bob found his solution and verified that according to
-the protocol it ``maps'' also to \textit{or}, can he be
-entirely sure no cheating is going on? Not with 100\%
-certainty. It could have been possible that he was
-given \textit{or} as the word, but it derived from a
-different word. This might seem very unlikely, but at
-least theoretical it is a possibility. Protocols based on
+the protocol it ``maps'' also to \textit{characteristic}, can
+he be entirely sure no cheating is going on? Not with 100\%
+certainty. It could have been possible that he was given
+\textit{characteristic} as the word, but it derived from a
+different word. This might seem very unlikely, but at least
+theoretical it is a possibility. Protocols based on
zero-knowledge proofs will produce a similar result---they
give an answer that might be erroneous in a very small number
-of cases. The point is to iterate them long enough so that
-the theoretical possibility of cheating is negligibly small.
+of cases. The point is to iterate them long enough so that the
+theoretical possibility of cheating is negligibly small.
By the way, the authors of the paper ``Dismantling Megamos
Crypto: Wirelessly Lockpicking a Vehicle Immobilizer'' who
@@ -88,13 +89,11 @@
made public. Bob needs to know that \textit{folio} is the
solution before he can verify the claim of Alice that she had
the solution first. Similarly with the car-crypto paper: we
-need to wait until the authors are finally allowed to publish
-their findings in order to verify the hash. This might happen
-at some point, but equally it might never happen (what for
-example happens if the authors lose their copy of the paper
-because of a disk failure?). Zero-knowledge proofs, in
-contrast, can be immediately checked, even if the secret is
-not public yet and perhaps never will be.
+needed to wait until September 2015 when the authors were
+finally able to publish their findings in order to verify the
+hash. Zero-knowledge proofs, in contrast, can be immediately
+checked, even if the secret is not public yet and perhaps
+never will be.
\begin{figure}
\begin{center}
@@ -157,12 +156,15 @@
should emerge from $A$ then Alice is in trouble: Bob will find
out she does not actually know the secret. So in order to fool
Bob she needs to anticipate his call, and already go into the
-corresponding tunnel. This of course also does not work.
-Consequently in order to find out whether Alice cheats, Bob
-just needs to repeat this protocol many times. Each time Alice
-has a chance of $\frac{1}{2}$ to be lucky or being found out.
-Iterating this $n$ times means she must be right every time
-and when cheating the probability for this is $\frac{1}{2}^n$.
+corresponding tunnel. This of course also does not work, since
+Bob can make any call he wants. Consequently in order to find
+out whether Alice cheats, Bob just needs to repeat this
+protocol many times. Each time Alice has a chance of
+$\frac{1}{2}$ to be lucky or being found out. Iterating this
+$n$ times means she must be right every time and when
+cheating: the probability for this is $\frac{1}{2}^n$, number
+that for already relatively small $n$, say 10, is incredibly
+small.
There are some interesting observations we can make about
@@ -261,18 +263,22 @@
follow the following protocol:
\begin{enumerate}
-\item Alice generates an isomorphic graph $H$ which she
- sends to Bob.
-\item After receiving $H$, Bob asks Alice either for an
+\item Alice generates an isomorphic graph $H$ which she sends
+ to Bob (in each iteration she needs to generate a
+ different $H$).
+\item
+ After receiving $H$, Bob asks Alice either for an
isomorphism between $G_1$ and $H$, or $G_2$ and $H$.
\item Alice and Bob repeat this procedure $n$ times.
\end{enumerate}
\noindent In Step 1 it is important that Alice always
-generates a fresh isomorphic graph. As said before,
-this is relatively easy to generate by consistently
-relabelling nodes. If she started from $G_1$, Alice will
-have generated
+generates a fresh isomorphic graph. I let you think what
+would happen if Alice sends out twice the same graph $H$.
+
+As said before, this is relatively easy to generate by
+consistently relabelling nodes. If she started from $G_1$,
+Alice will have generated
\begin{equation}
H = \sigma'(G_1)\label{hiso}
@@ -323,7 +329,8 @@
\noindent As can be seen the protocol runs for some
agreed number of iterations. The $H_i$ Alice needs to
-produce, need to be all distinct. I let you think why?
+produce, need to be all distinct. I hope you now know
+why?
It is also crucial that in each iteration, Alice first sends
$H_i$ and then Bob can decide which isomorphism he wants:
@@ -354,7 +361,8 @@
$H_{1..n}$ (they need to be all distinct)
\item she feeds the $H_{1..n}$ into a hashing function
(for example encoded as as matrix)
-\item she takes the first $n$ bits of the output:
+\item she takes the first $n$ bits of the output of the hashing
+ function:
whenever the output is $0$, she shows an isomorphism
with $G_1$; for $1$ she shows an isomorphism
with $G_2$
@@ -362,7 +370,7 @@
\noindent The reason why this works and achieves the same
goal as the interactive variant is that Alice has no
-control over the hashing functions. It would be
+control over the hashing function. It would be
computationally just too hard to assemble a set of
$H_{1..n}$ such that she can force where $0$s and $1$s
in the hash values are such that it would pass an external
@@ -378,7 +386,7 @@
any NP problem can be used as part of a ZKP protocol.
-\subsubsection*{Using Modular Arithmetic for ZKP Protocols}
+\subsubsection*{Using Modular Logarithms for ZKP Protocols}
While information can be encoded into graph isomorphisms, it
is not the most convenient carrier of information. Clearly it
@@ -393,17 +401,207 @@
\end{center}
\noindent holds. The secret Alice tries to keep secret is $x$.
-\bigskip
+The point of the modular logarithm is that it is very hard
+from the public data to calculate $x$ (for large primes).
+Now the protocol proceeds in three stages:
+
+\begin{itemize}
+\item {\bf Commitment Stage}
+ \begin{enumerate}
+ \item Alice generates $z$ random numbers $r_1, \ldots, r_z$,
+ all less than $p - 1$. Alice then sends Bob for all
+ $i = 1,\ldots, z$:
+ \[ h_i = A^{r_i}\; mod\; p\]
+ \item Bob generates $z$ random bits, say $b_1,\ldots, b_z$. He can do this
+ by flipping $z$ times a coin, for example.
+ \item For each bit $b_i$, Alice sends Bob an $s_i$ where
+
+ \begin{center}
+ \begin{tabular}{ll}
+ if $b_i = 0$: & $s_i = r_i$\\
+ if $b_i = 1$: & $s_i = (r_i - r_j) \;mod\; (p -1)$\\
+ \end{tabular}
+ \end{center}
+
+ where $r_j$ is the lowest $j$ where $b_j = 1$.
+
+ \end{enumerate}
+ \end{itemize}
+
+\noindent For understanding the last step, let $z$ be just 4.
+We have four random values $r_i$ chosen by Alice and four
+random bits $b_i$ chosen subsequently by Bob, for example
+
+ \begin{center}
+ \begin{tabular}{lcccc}
+ $r_i$:\; & 4 & 9 & 1 & 3\\
+ $b_i$:\; & 0 & 1 & 0 & 1\\
+ & & $\uparrow$ \\
+ & & $j$
+ \end{tabular}
+ \end{center}
+
+ \noindent The highlighted column is the lowest where $b_i =
+ 1$ (counted from the left). That means $r_j = 9$. The reason
+ for letting Alice choose the random numbers $r_1, \ldots, r_z$
+ will become clear shortly. Next is the confirmation
+ phase where Bob essentially checks whether Alice has sent
+ him ``correct'' $s_i$ and $h_i$.
+
+\begin{itemize}
+\item {\bf Confirmation Stage}
+ \begin{enumerate}
+ \item For each $b_i$ Bob checks whether $s_i$
+ conform to the protocol
+
+ \begin{center}
+ \begin{tabular}{ll}
+ if $b_i = 0$: & $A^{s_i} \stackrel{?}{\equiv} h_i\;mod\;p$\\
+ if $b_i = 1$: & $A^{s_i} \stackrel{?}{\equiv} h_i * h_j^{-1} \;mod\; p$\\
+ \end{tabular}
+ \end{center}
+ \end{enumerate}
+\end{itemize}
+
+\noindent To understand the case for $b_i = 1$, you have
+to do the following calculation:
+
+\begin{center}
+\begin{tabular}{r@{\hspace{1mm}}c@{\hspace{1mm}}l}
+$A^{s_i}$ & $=$ & $A^{r_i - r_j}$\\
+ & $=$ & $A^{r_i} * A^{-r_j}$\\
+ & $=$ & $h_{r_i} * h_{r_j}^{-1}\;mod\;p$
+\end{tabular}
+\end{center}
+
+\noindent What is interesting that so far nothing has been
+sent about $x$, which is the secret Alice has. Also notice
+that Bob does not know $r_j$. He received
+
+\begin{center}
+$r_j - r_j$, $r_m - r_j$, \ldots, $r_p - r_j \;mod \;p - 1$
+\end{center}
+
+\noindent whenever his corresponding bits were $1$. So Bob
+does not know $r_j$ and also does not know any $r_i$ where the
+bit was $1$. Information about the $x$ is sent in the next
+stage (obviously not revealing $x$).
+
+\begin{itemize}
+\item {\bf Proving Stage}
+
+\begin{enumerate}
+\item Alice proves she knows $x$, the discrete log of $B$,
+by sending
+
+\begin{center}
+$s_{z+1} = x - r_j\;mod\;p-1$
+\end{center}
+
+\item Bob confirms
+
+\begin{center}
+$A^{s_{z+1}} \stackrel{?}{\equiv} B * h_j^{-1} \;mod \; p$
+\end{center}
+\end{enumerate}
+\end{itemize}
+
+\noindent To understand the last step, you have to do the trick
+again that
+
+\[A^{s_{z+1}} = A^{x-r_j} = \ldots
+\]
\noindent
-\ldots still to be completed (for example can be attacked by
-MITM attacks)
+which I leave to you.
+
+Now the question is how can Alice cheat? In order to cheat she
+has to coordinate what she sends as $h_i$ in step 1 and $s_i$
+in step 3 of the commitment stage, and also what to send as
+$s_{z+1}$ in the proving stage. For the latter of course
+Alice does not know $x$, so she just chooses some random
+number for $s_{z+1}$ and calculates
+
+\[A^{s_{z+1}}\]
+
+\noindent
+and then solves the equation
+
+\[A^{s_{z+1}} \equiv B * y \;mod\;p\]
+
+\noindent for $y$. This is easy since no logarithm needs to be
+computed. If Alice can guess the $j$ where the first 1 will
+appear in Bob's bit vector, then she sends the inverse of $y$
+as $h_j$ and 0 as $s_j$. However, notice that when she
+calculates a solution for $y$ she does not know $r_j$. For this she
+would need to calculate the modular logarithm
+
+
+\[
+y \equiv A^{r_j}\;mod\;p
+\]
+
+\noindent which is hard (see step 1 in the commitment stage).
+
+Having settled on what $h_j$ should be, now what should Alice
+send as the other $h_i$ and other $s_i$? If the $b_i$ happens
+to be a 1, then the $h_i$ and other $s_i$ need to satisfy the
+test
+
+\[A^{s_i} \stackrel{?}{\equiv} h_i * h_j^{-1} \;mod\; p\]
+
+\noindent where she has already settled on the value of
+$h_j^{-1}$. Lets say she choses $s_i$ at random, then she just
+needs to solve
+
+\[A^{s_i} \equiv z * h_j^{-1} \;mod\; p\]
+
+\noindent for $z$. Again that is easy, but it does not allow
+us to know $r_i$, because then we would again need to solve
+a modular logarithm problem. Let us call an $h_i$ which was
+solved the easy way as \emph{bogus}. Alice has to produce
+bogus $h_i$ for all bits that are going to be $1$ in advance!
+This means she has to guess all the bits correctly. (Yes?)
+
+Let us see what happens if she guesses wrongly: Suppose the
+bit $b_i = 1$ where she thought she will get a 0. Then she has
+already sent an $h_i$ and $h_j$ and now must find an $s_i$
+such that
+
+\[A^{s_i} \equiv h_i * h_j^{-1} \;mod\; p\]
+
+\noindent holds. For this remember in calculating $h_i$, she
+just chose a random $s_i$. Now she has to send a genuine one.
+But this is of course too hard. If she knew the genuine $r_i$
+and $r_j$ for $h_i$ and $h_j$, it would be easy (in this case
+$s_i = r_i - r_j$). But she does not. So she will be found
+out. If $b_i = 0$, but she thought she will get a 1, then
+she has to send a $s_i$ which satisfies
+
+\[A^{s_i} \equiv h_i\;mod\;p\]
+
+\noindent Again she does not know $r_i$. So it is a too hard
+task and she will be found out again.
+
+To sum up, in order for Alice to successfully cheat Bob, she
+needs to guess \emph{all} bits correctly. She has only a
+$\frac{1}{2^z}$ chance of doing this.
\subsubsection*{Further Reading}
Make sure you understand what NP problems
-are.\footnote{\url{http://en.wikipedia.org/wiki/NP_(complexity)}} They
-are the building blocks for zero-knowledge proofs.
+are.\footnote{\url{http://en.wikipedia.org/wiki/NP_(complexity)}}
+They are the building blocks for zero-knowledge proofs.
+Zero-Knowldege proofs are not yet widely used in production
+systems, but it is slowly gaining ground. One application
+where they pop up are crypto currencies.
+
+If you want to brush up on the modular logarithm problem,
+the Khan Academy has a nice video:
+
+\begin{center}
+\url{https://www.khanacademy.org/video/discrete-logarithm-problem}
+\end{center}
\end{document}