The Adversarial Computation

Build me an unbreakable cryptosystem.

Mort Yao

2017-01-01

Theory of computation (computability and complexity) forms the basis for modern cryptography.

Generally, the pseudorandom generators used in probabilistic algorithms are meant to yield random bits that look as if drawn from the uniform distribution, so they are worth mentioning here.

Cryptographic schemes are defined as tuples of deterministic or probabilistic algorithms.

We also give a brief, formalized overview of some classical ciphers and their security.

Lessons learned from these classical ciphers: while perfect secrecy is easy to achieve (one-time pads), designing practical cryptographic schemes (with shorter keys, yet still computationally hard to break) can be difficult.

Where do random bits come from?

The construction of private-key encryption schemes involves probabilistic algorithms. We simply assume that an unlimited supply of independent, unbiased random bits is available to these cryptographic algorithms. In practice, however, this is a non-trivial issue: the source of randomness must provide high-entropy data in order to yield cryptographically secure random bits.

In the perfectly secret scheme of one-time pads, the key generation algorithm \(\mathsf{Gen}\) requires access to a source of randomness in order to choose the uniformly random key \(k \in \mathcal{K}\). Practically, high-entropy data may be collected via physical input, or even produced entirely by hand.
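In practice, such high-entropy randomness is typically delegated to the operating system's entropy pool; in Python, for instance, the `secrets` module draws from the OS CSPRNG. A minimal sketch of the one-time pad \((\mathsf{Gen}, \mathsf{Enc}, \mathsf{Dec})\) on byte strings, assuming the OS source is trustworthy:

```python
import secrets

def gen(n: int) -> bytes:
    """Gen: choose a uniform n-byte key from the OS entropy source."""
    return secrets.token_bytes(n)

def enc(k: bytes, m: bytes) -> bytes:
    """Enc_k(m) = k XOR m; requires |k| >= |m|, and k must never be reused."""
    assert len(k) >= len(m)
    return bytes(a ^ b for a, b in zip(k, m))

dec = enc  # Dec is the same operation: (k XOR m) XOR k = m
```

Since XOR is its own inverse, encryption and decryption coincide here; the entire security burden rests on \(k\) being uniform, secret, and used only once.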

Theoretically, without external intervention, we have:

Conjecture 3.1. Pseudorandom generators exist.

Theorem 3.2. (Pseudorandom generator theorem) Pseudorandom generators exist if and only if one-way functions exist.
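For reference, the standard definition (following Katz and Lindell): a pseudorandom generator is a deterministic polynomial-time algorithm \(G\) with expansion factor \(\ell(n) > n\) such that, for every probabilistic polynomial-time distinguisher \(D\), there is a negligible function \(\mathsf{negl}\) with

\[ \bigl|\, \Pr[D(G(s)) = 1] - \Pr[D(r) = 1] \,\bigr| \leq \mathsf{negl}(n), \]

where \(s\) is uniform over \(\{0,1\}^n\) and \(r\) is uniform over \(\{0,1\}^{\ell(n)}\). A one-way function, in turn, is one that is easy to compute but hard to invert on average; Theorem 3.2 says these two primitives stand or fall together.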

Pseudorandomness is also a basic construction in CPA-secure encryption algorithms (\(\mathsf{Enc}\)), e.g., in stream ciphers and block ciphers.
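To see how a PRG enters the construction of \(\mathsf{Enc}\), consider a stream cipher in its simplest form: a short key is stretched into a long pseudorandom pad, which is XORed with the message (whereas the one-time pad required \(|k| \geq |m|\)). Below is a minimal sketch; the counter-mode SHA-256 keystream is only a stand-in for a real PRG, and with a fixed pad per key this scheme is at best one-message secure, not CPA-secure on its own:

```python
import hashlib

def prg(seed: bytes, nbytes: int) -> bytes:
    """A toy PRG: stretch a short seed by hashing (seed || counter)."""
    out, counter = b"", 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:nbytes]

def enc(k: bytes, m: bytes) -> bytes:
    """Stream-cipher Enc_k(m) = G(k) XOR m, with the pad generated on demand."""
    return bytes(a ^ b for a, b in zip(prg(k, len(m)), m))

dec = enc  # XOR with the same pad inverts the encryption
```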

So what is an acceptable level of pseudorandomness, if we are not sure whether such generators theoretically exist? Intuitively, if one cannot distinguish between a “pseudorandom” string (generated by a PRG) and a truly random string (chosen according to the uniform distribution), we have confidence that the PRG is a good one. Various statistical tests have been designed for testing the randomness of PRGs.
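One of the simplest such tests is the monobit (frequency) test from the NIST SP 800-22 suite, which checks whether the numbers of ones and zeros in the output are plausibly balanced. A minimal sketch:

```python
import math

def monobit_test(bits: str) -> float:
    """NIST SP 800-22 monobit frequency test on a string of '0'/'1'.
    Returns a p-value; samples from a uniform source yield p >= 0.01
    about 99% of the time, so small p-values indicate bias."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```

Passing a fixed battery of statistics is necessary but far from sufficient: a true PRG must fool every efficient distinguisher, not just the ones we happened to write down.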

Pseudorandomness and IND-CPA

It holds true that:

Corollary 3.3. By redefining the key space, we can assume that any encryption scheme \(\Pi=(\mathsf{Gen},\mathsf{Enc},\mathsf{Dec})\) satisfies

  1. \(\mathsf{Gen}\) chooses a uniform key.
  2. \(\mathsf{Enc}\) is deterministic.

If so, why do we still need a probabilistic \(\mathsf{Enc}\) in CPA-secure encryption schemes? Can’t we just make \(\mathsf{Enc}\) deterministic while still being CPA-secure?

The first thing to realize is that chosen-plaintext attacks are geared towards multiple encryptions (with the same secret key \(k\)). If \(\mathsf{Enc}\) is deterministic, equal plaintexts always produce equal ciphertexts, which an adversary with oracle access can detect trivially; worse, once the adversary obtains a pair \((m_0, c_0)\) such that \(\Pr[C=c_0\,|\,M=m_0] = 1\), the key itself may already be leaked. (Recall that the adversary knows the deterministic algorithm \(\mathsf{Enc}_k\), so recovering \(k\) from known \(m_0\) and \(c_0\) can be quite feasible; e.g., in a one-time pad, \(k = m_0 \oplus c_0\).) The only way around this is to make \(\mathsf{Enc}_k\) probabilistic (e.g., constructed from a pseudorandom function), such that an adversary cannot recover the key efficiently, i.e., in polynomial time.
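Concretely, the standard PRF-based construction encrypts as \(\mathsf{Enc}_k(m) = (r, F_k(r) \oplus m)\) for a fresh uniform \(r\), so even identical plaintexts yield different ciphertexts. A minimal sketch, with HMAC-SHA256 standing in for the pseudorandom function \(F\) (an illustrative assumption; the scheme is CPA-secure when \(F\) is a PRF):

```python
import hashlib
import hmac
import secrets

BLOCK = 32  # 256-bit blocks, matching the HMAC-SHA256 output length

def prf(k: bytes, r: bytes) -> bytes:
    """F_k(r), with HMAC-SHA256 as a stand-in pseudorandom function."""
    return hmac.new(k, r, hashlib.sha256).digest()

def gen() -> bytes:
    """Gen: choose a uniform 256-bit key."""
    return secrets.token_bytes(BLOCK)

def enc(k: bytes, m: bytes) -> tuple[bytes, bytes]:
    """Enc_k(m) = (r, F_k(r) XOR m) for a fresh uniform r per encryption."""
    assert len(m) == BLOCK
    r = secrets.token_bytes(BLOCK)
    return r, bytes(a ^ b for a, b in zip(prf(k, r), m))

def dec(k: bytes, c: tuple[bytes, bytes]) -> bytes:
    """Dec_k((r, s)) = F_k(r) XOR s."""
    r, s = c
    return bytes(a ^ b for a, b in zip(prf(k, r), s))
```

Because \(r\) is drawn fresh on every call, querying the encryption oracle twice on the same message yields independent-looking ciphertexts, which is exactly what blocks the repeated-plaintext attack above.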

Note that perfect secrecy is not achievable under CPA, since there is a small possibility that the adversary recovers the key (by, for example, traversing an exponentially large lookup table of all possible random bits) and then succeeds in the subsequent indistinguishability experiment with probability slightly higher than \(1/2\), albeit only negligibly so.

Historical exploits of many-time pad

The one-time pad is one of the few provably secure encryption schemes, and its secrecy does not rely on any computational hardness assumption. However, it requires that \(|\mathcal{K}| \geq |\mathcal{M}|\) (which is in fact a necessary condition for any perfectly secret scheme), so its real-world use is limited.

The one-time key \(k\) (chosen uniformly from the key space \(\mathcal{K}\)) must not simply be reused across multiple encryptions. Assume that \(|\mathcal{K}| = |\mathcal{M}|\). For encryptions of \(n\) messages under the same key, the effective message space grows to size \(|\mathcal{M}|^n\) while the key space remains \(\mathcal{K}\), so \(|\mathcal{K}| < |\mathcal{M}|^n\) and the necessary condition for perfect secrecy no longer holds. Such a degraded scheme (a many-time pad) is theoretically insecure and vulnerable to several practical cryptanalyses: given \(c_1 = m_1 \oplus k\) and \(c_2 = m_2 \oplus k\), an eavesdropper computes \(c_1 \oplus c_2 = m_1 \oplus m_2\), in which the key has cancelled out entirely, as demonstrated below.
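A minimal demonstration of the leak, with made-up messages; note that the recovered XOR of plaintexts does not depend on \(k\) at all:

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"attack at dawn!!"
m2 = b"retreat at noon!"
k = secrets.token_bytes(len(m1))  # a "one-time" key, mistakenly reused

c1, c2 = xor(m1, k), xor(m2, k)
leak = xor(c1, c2)                # = m1 XOR m2; the key has cancelled out
assert leak == xor(m1, m2)

# Wherever one plaintext is known or guessed (a "crib"), the other is exposed:
assert xor(leak, m2) == m1
```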

A historical exploit of this vulnerability of the many-time pad occurred in the VENONA project, in which the U.S. Army’s Signal Intelligence Service (a forerunner of the NSA) worked to decrypt messages sent by Soviet intelligence agencies (notably the KGB) over a span of four decades. Because the KGB mistakenly reused some portions of their one-time pad key material, the SIS was able to decrypt a substantial number of the messages.1

References and further reading

Books:

J. Katz and Y. Lindell, Introduction to Modern Cryptography, 2nd ed. Chapman & Hall/CRC, 2014.

Papers:

[1] C. E. Shannon, “Communication theory of secrecy systems,” Bell System Technical Journal, vol. 28, no. 4, pp. 656–715, 1949.


  1. R. L. Benson, “The Venona Story.” https://www.nsa.gov/about/cryptologic-heritage/historical-figures-publications/publications/coldwar/assets/files/venona_story.pdf