Friday, December 20, 2019

Network Graph Visualization Software

Just a short list of network/graph visualization software we have used for various things. I keep losing track of them, so I figured it was worth writing them down.

  • GraphViz is maybe the most venerable.
  • NetworkX is a pretty cool toolkit for Python. Very easy to use, a great starting point.
  • D3's Network Graph can do some nice things, well integrated into web pages.
  • Graphs and Networks in Mathematica. I have used a couple of different approaches in Mathematica, including just treating the graph as a variable. I should dig out examples; I wasn't fully happy with what I managed with simple code. But if you're willing to go through the syntactic pain, you can do amazing things in Mathematica.
  • OMNeT++ and ns-3 are full-on (classical) network simulators we have used for some sims, and they have built-in or add-on visualization tools.


Do you have a favorite or a dislike among this list? What have I missed?

Saturday, November 30, 2019

2.2 Birthday Paradox, or, When Should I Change my Encryption Keys?

You're probably familiar with the basic idea of the birthday paradox: If you have $n$ people in a room, what's the probability of two of them sharing a birthday?  More concretely, how many people have to be in the room before the probability of two sharing a birthday exceeds 50%?  It's a surprisingly small number, but it's not a true paradox in the logical sense.  (It's closely related to the pigeonhole principle, and it's the same math that governs hash collision probabilities.)

Modern ciphers are designed so that the ciphertext (output of the encryption) looks random; we can say that the ciphertext has very high entropy.  If the block size is $n$ bits, each of the $2^n$ possible bit combinations should be an equally likely result of encrypting a data block.  (The encryption is deterministic, but the output looks random.)

If we monitor a long stream of encrypted data, there is some probability that we will find two blocks that have the same ciphertext.  We call that a collision.  If the two blocks were encrypted with the same key, we gain information about the plaintext.

(Side question: How can you have birthday paradox collisions of the ciphertext?  Wouldn't that mean that the encryption process is lossy, and that it would be impossible to reliably decrypt that ciphertext back to both original blocks?

Answer: there is more information involved in both the encryption and decryption than just that specific block and the key.  This is the cipher block chaining mentioned at the end of Sec. 1.)

In CBC mode, what you gain is the XOR of two plaintext blocks.  If you get to choose which two, this could be incredibly valuable information; given that it's just two random blocks, it's less so, but not unimportant.  As it happens, if block numbers $i$ and $j$ have the same ciphertext, $c[i] = c[j]$, then the XOR of plaintext blocks $i$ and $j$ is just $c[i-1] \oplus c[j-1]$, which the attacker already holds.
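To make that concrete, here is a minimal sketch (my own illustration, not the SWEET32 authors' code) of what an attacker does with a captured CBC stream; it assumes blocks is an in-order list of equal-length ciphertext blocks encrypted under a single key:

def find_cbc_collisions(blocks):
    """If c[i] == c[j] in CBC mode, then p[i] XOR p[j] == c[i-1] XOR c[j-1]."""
    first_seen = {}                    # ciphertext block -> index of first occurrence
    leaks = []
    for j, c in enumerate(blocks):
        i = first_seen.setdefault(c, j)
        if i != j and i > 0:           # a collision, and block i has a predecessor
            # the XOR of plaintext blocks i and j, recovered without the key
            leaked = bytes(x ^ y for x, y in zip(blocks[i - 1], blocks[j - 1]))
            leaks.append((i, j, leaked))
    return leaks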

I spent quite a bit of time working through the math of the birthday paradox, only to come back to one of the first sources I found: the 2016 SWEET32 paper by Bhargavan and Leurent.  Sec. 2.2 of that paper has a compact description of the commonly-quoted square root limit, as well as why and how much to be conservative relative to that value.

If the block size is $n$ bits, there are $N = 2^n$ possible ciphertexts, and if the number of blocks encrypted is the square root of that number, $2^{n/2}$, the probability of at least one collision is above 39%, which arises in the limit as the expression $1-1/e^{1/2}$.  (See the appendix of these notes for some example calculations of this and the following to play around with.)

Assuming you consider a 39% chance of disclosing some information to be an unacceptably large probability, when should you rekey?  That's the first-level answer to our ultimate question of when to rekey using QKD, which is of course back where we started this conversation.  Say, for example, we want the probability of an attacker recovering a plaintext block using the birthday attack to be less than one in a billion.

If we have some power of two $D = 2^d$ ciphertext blocks, then the expected number of collisions is approximately $2^{2d-n-1}$.  For our one-in-a-billion probability, using $\log_2(10^9) \approx 30$, we need to set up our session lifetime such that $2d-n-1 < -30$, or $d < (n-29)/2$.
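Here's a tiny calculator for that budget, a sketch based on the approximation above (the function and variable names are mine); the numbers in the next few paragraphs fall out of it directly:

import math

def key_data_budget(block_bits, target_prob):
    """Largest d with expected collisions 2**(2d - n - 1) below target_prob,
    plus the corresponding data volume in bytes for n-bit blocks."""
    d = math.floor((block_bits + 1 + math.log2(target_prob)) / 2)
    return d, (2 ** d) * (block_bits // 8)

print(key_data_budget(64, 1e-9))     # DES: (17, 1048576) -- about one megabyte
print(key_data_budget(128, 1e-9))    # AES: (49, 9007199254740992) -- 2**53 bytes, about 8 petabytes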

Since DES has only a 64-bit block, we should set $d \le 17$.  That's a startlingly small number: $D = 2^{17}$ blocks and each block is eight bytes, so we should limit the use of a single key to one megabyte! An impractically small size.

If, on the other hand, you are okay with a disclosure probability of one in a million, the budget grows only with the square root of that factor of a thousand (the collision probability scales with the square of the data volume), to about $2^{22}$ blocks, or roughly 32 megabytes per key.  And if you are trying to protect a 40Gbps link -- a common backbone bandwidth today, and coming to the home in the foreseeable future -- that still means changing keys every few milliseconds!

Ah, but AES uses $n = 128$, a block size twice as large.  Now we can allow $d \le 49$ for that one-in-a-billion probability.  $D = 2^{49}$ blocks times our block size of 16 bytes is $2^{53}$ bytes, about 8 petabytes, which takes on the order of three weeks to push through our 40Gbps link.  So change the keys at least once every half hour, as a nice round and very conservative interval, and you're good.

Keep in mind a few points:
  1. what's disclosed here is not the key, and not even pure plaintext, but plaintext-related information; still, the insanely smart and devious cryptanalysts can do amazing things with small amounts of data;
  2. this depends only on the block size of a block cipher, not on the key length;
  3. part of this arises due to the CBC (cipher block chaining) mode, but ECB (electronic code book) mode is worse; there are other modes that I haven't looked at closely; and
  4. given that there are still other attacks, minimizing the amount of data under any given encryption key is still good practice.
Next, let's look at a couple of the cryptanalysis techniques.

(If you missed it, you can go back to the beginning of What Every Quantum Researcher and Engineer Should Know about Classical Cryptography.)

Quantum Information Theory, Computing, Algorithms, Networking, Engineering and Education Journals


(Edited on 20/2/29 and 20/5/19, as noted in line.)

The field of quantum computing, broadly constituted, is changing rapidly. Hundreds, nay thousands, of new people are entering the field each year right now. Eventually, things will settle out, but in the short run what is publishable research and how we publish it are going to undergo a lot of stress.  And eventually publishing will become less critical for a large number of quantum computing graduates, as they will move into the nascent quantum industry, where different metrics of success will be applied and ideas will affect society in different ways. But for now, the primary mode of work is research-oriented people creating and testing new ideas, and the standard approach to disseminating those ideas is peer-reviewed publications.

One of the problems is that architecture and engineering are neither theory nor experiment. Each is its own beast. Where do we send papers describing advances in software tools, architectural principles and evaluation, network protocols, etc.? Many of the existing journals are not good venues for this engineering-oriented work, but neither are most of the ACM and IEEE top journals and conferences.

Another item high on my list is how to value creation of new educational tools and techniques, e.g., our MOOC.

But before fussing too much, it's worth taking a look at what we currently have for publication venues...

Very roughly in terms of age, but grouped by organization (in short, the order in which my brain chose to produce/organize them of its own accord):
  1. Nature and Science: You know the drill. Top experimental papers only. 20 papers/year each.
  2. Physical Review Letters. Top papers, including theory and algorithms; HHL, for example. Highly compressed; I refer to a PRL as "the cryptographic hash of a longer paper still to come".
  3. Physical Review A. Has been an excellent venue for almost anything quantum (my first quantum arithmetic paper, for example, and recent engineering-oriented work on quantum networks), and has explicitly added to its charter since the founding of APS's Division of Quantum Information, but is overwhelmed; acceptance rate is declining and topics are narrowing.
  4. Physical Review X. Cody's architecture paper was explicitly described by the editors as, "Hey, this isn't really physics in the traditional sense, but we think it's cool." (Well, that's a liberal paraphrase.) (At almost 250 citations, according to Scholar, good call on their part if they value citations to their journal.)
  5. IOP New Journal of Physics. An excellent venue, longer papers than PRL but good visibility. But seems to draw from same reviewer and reader pool as PRA.
  6. IOP Quantum Science and Technology. Looks to be an excellent new venue, and quantum-focused, which is a win, and we're happy with it, but seems to draw from same reviewer and reader pool as PRA, at least so far. Hopefully engineering-oriented papers will continue to find a home here.
  7. Quantum Information and Computation (QIC). Has been the journal of record for much quantum algorithms work, less so for experimental work and engineering. Subscriptions are expensive, but papers can be put on the arXiv.
  8. Quantum Information Processing (QIP). Along with QIC, a top journal, with more experimental work. Also expensive, also arXiv-friendly.
  9. International Journal of Quantum Information (IJQI). Gets less attention than QIC and QIP. Also arXiv-friendly.
  10. ACM Journal of Emerging Technology in Computing Systems (JETC). One of my favorite journals. I spent two years as an associate editor, and wouldn't mind doing another stint here. arXiv-friendly. Not quantum only, but definitely has a good quantum component. Will be interesting to see if that changes with the introduction of...
  11. ACM Transactions on Quantum Computing (TQC). If the quality of a review board is an indicator of the quality of a journal, destined to be the top thing in our field. True rock stars, and great breadth. Will probably be limited to a few dozen high-impact papers a year?
  12. Nature Quantum Information (npj QI). Aims to be a top journal. Open access, which is great, but the author fees are outrageous. I won't pay them.
  13. Nature Communications and Scientific Reports. In my limited experience, erratic in review quality and editorial procedures, and no help on formatting and copy editing, but still have the outrageous Nature open access fees. A worse deal all the way around than npj QI.
  14. Science Advances. I haven't interacted with this yet at all. Also has high open access fees.
  15. Quantum. The new star in our firmament. An overlay journal dependent on the arXiv. Outstanding, young editorial board, this is the cool coffee shop where all the hipsters hang out. Go here to be seen and to hear the newest tunes!
  16. IEEE Transactions on Quantum Engineering. I have heard only rumors of this and either my Google-fu is poor, or there's nothing about it on the web. Could be good, but unless it's an exception to the truly egregious average IEEE journal latency, not useful for students. (Edit on 20/2/29: Link added. Erik DeBenedictis is EiC, the journal will be good! First papers published in early 2020.)
  17. IET Quantum Communication is brand new, and has a lot of "wireless" in its call, since that's the community the founders come from.
  18. PRX Quantum will start publishing in mid-2020 (Edit: this one added 20/2/29.)
  19. Wiley Quantum Engineering (Edit: this one added 20/2/29.)
  20. Springer Quantum Machine Intelligence (Edit: this one added 20/5/19.)
Of course, there are conferences and workshops, too.

And I'm working on a short slide deck on how contributions to open source and non-traditional publishing (e.g. my #QuantumComputerArchitecture tweetstorm) should count.

But for the moment, the above twenty or so journals are where the primary action is.

Friday, November 29, 2019

2.1.2 Diffusion and Confusion

Claude Shannon, in his seminal article on the mathematics of cryptography, defined two concepts he called diffusion and confusion.  In diffusion, information from the original plaintext message is spread across more than one symbol in the output ciphertext; the farther the information spreads, the better.  Shannon defined confusion as making "the relation between...[the ciphertext] $E$ and the...[key] $K$ a very complex and involved one."

Horst Feistel, whose work at IBM led to DES, called diffusion the avalanche effect. Webster and Tavares defined the strict avalanche criterion (SAC), requiring that changing a single input bit flips each of the output bits with 50% probability.
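Here is a quick, empirical way to see the avalanche effect; this sketch uses SHA-256 from Python's standard hashlib as a stand-in primitive (my choice of example, not something from these papers), but the same test applies to any block cipher or S-P network:

import hashlib
import os
import random

def avalanche_fraction(trials=1000, msg_bytes=16):
    """Flip one random input bit and measure the fraction of output bits that flip."""
    flipped_bits = 0
    for _ in range(trials):
        msg = bytearray(os.urandom(msg_bytes))
        before = int.from_bytes(hashlib.sha256(msg).digest(), "big")
        bit = random.randrange(msg_bytes * 8)
        msg[bit // 8] ^= 1 << (bit % 8)          # flip exactly one input bit
        after = int.from_bytes(hashlib.sha256(msg).digest(), "big")
        flipped_bits += bin(before ^ after).count("1")
    return flipped_bits / (trials * 256)         # SAC says this should be close to 0.5

print(avalanche_fraction())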

The Handbook of Applied Cryptography says the following (p. 20 in my edition):
A substitution in a round is said to add confusion to the encryption process whereas a transposition [permutation] is said to add diffusion. Confusion is intended to make the relationship between the key and the ciphertext as complex as possible.  Diffusion refers to rearranging or spreading out the bits in the message so that any redundancy in the plaintext is spread out over the ciphertext.
I haven't seen it written quite this directly (but then, I'm not that well read in crypto), but I think it's fair to say that confusion is achieved by the nonlinearity of the S-boxes.

These two concepts don't seem to figure prominently in the cryptography books and papers I have been reading, but it seems to me that they ultimately underlie much of the process of cryptanalysis: spreading data broadly reduces the relationship exhibited by two ciphertexts even when the plaintexts are closely related, expanding the search space; and the nonlinearity means that even a large set of straightforward equations is not enough to simply and mathematically relate the input and output.

Thursday, November 07, 2019

2.1 Playing Defense / 2.1.1 Entropy

2.1.1 Entropy

The single most important concept in cryptography -- indeed, in all of information theory, and inarguably one of the most important in all of computer science -- is Claude Shannon's entropy.  The entropy is, roughly, the amount of disorder in a sequence of data, or the amount of information that must be transmitted to reconstruct the data.  When the entropy of a cleartext message is high, it is hard to predict the next symbol (bit or byte or letter or floating point number); purely random data has very high entropy.  When it is low, the cleartext has a skewed probability distribution such that, e.g., the letter 'e' or the number 0 is more common than other values.

Of course, because encryption tries to hide information, encrypted data looks as random as possible: it has very high entropy.  Thus, an easy way to automate a test to see if we have successfully decrypted a message is to calculate the entropy of our provisionally decrypted plaintext; low entropy indicates a high probability that we have (at least partially) succeeded.

The entropy is defined as
$H_{2}(X)=-\sum_{i=1}^{M} \frac{\operatorname{count}_{i}}{N}\log_{2}\left(\frac{\operatorname{count}_{i}}{N}\right)$
where $M$ is the number of distinct values that appear in our data sequence, $\operatorname{count}_i$ is the number of times the $i$th of those values appears, and $N$ is the total number of symbols in the sequence, so that $\operatorname{count}_i/N$ is the observed probability of value $i$.

From the fantastic website RosettaCode.org, we have a simple example in Python of calculating the entropy in bits per byte:


import math
from collections import Counter

def entropy(s):
    p, lns = Counter(s), float(len(s))
    return -sum(count/lns * math.log(count/lns, 2) for count in p.values())

entropy("1223334444")
(This function can actually calculate entropy of strings of other types besides characters, as well.)

Note that this definition of entropy does not take into account any long-range structure; it is purely based on the overall probability of the appearance of a given set of symbols.  Thus, to this function, the strings 0101...01 and 000...01...111 have a bit-level entropy just as high as a string composed of exactly 50/50 randomly chosen 0s and 1s, although by applying a little intelligence we can easily reconstruct the former but not the latter (at least, not exactly).  Many lossless data compression programs operate on stateful _sequences_ of data to predict the next symbol.  For example, the common Unix utilities gzip and compress use algorithms from the Lempel-Ziv family (DEFLATE and LZW, respectively), which build up a dictionary of previously seen sequences as they go, so that repetitive data and long runs are compressed effectively; this is especially good at sequences such as 0101...01 or 000...01...111.
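As a quick illustration of that distinction, here is a small experiment using zlib (DEFLATE, a Lempel-Ziv variant) from the standard library; all three strings have essentially the same symbol-level entropy, but only the structured ones compress well:

import random
import zlib

regular = b"01" * 5000                                        # 0101...01
runs    = b"0" * 5000 + b"1" * 5000                           # 000...0111...1
coins   = bytes(random.choice(b"01") for _ in range(10000))   # 50/50 random 0s and 1s

for name, s in (("regular", regular), ("runs", runs), ("random", coins)):
    print(name, len(s), "->", len(zlib.compress(s, 9)), "bytes compressed")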

(Taking still further advantage of brainpower, the ultimate in compression is measured using the _Kolmogorov complexity_: the length of the shortest program that will reproduce the sequence.  This has very low complexity for anything highly regular, as well as seemingly irregular sequences such as the output of a pseudo-random number generator or the digits of $\pi$.  I am not aware of any automated form for calculating the Kolmogorov complexity of an arbitrary sequence of data.)

This, obviously, is one of the motivations for the substitution phase in a substitution-permutation network; with permutation alone, the symbol-level entropy of the original cleartext is preserved in the ciphertext, which in some cases reveals a great deal about the original data.



Wednesday, October 30, 2019

2. Cryptanalysis

The general idea of trying to decode messages that are encrypted is known as cryptanalysis.

Some of the planet's smartest people have worked on this for all of their lifetimes, building on work in progress since well before we were born, so my summary here is inevitably going to be superficial and perhaps wrong.  Comments welcome; there is a lot of _bad_ information about cryptography on the web, and I'd prefer to be corrected and make mine _good_ information for the web.

(Somewhere, right now, maybe even as I type, employees of one or more of the world's major intelligence agencies are snickering at how naive my presentation here is.  We don't know how much they really know, but we do know that e.g. the RSA public key cryptosystem and the technique of differential cryptanalysis were known to government agencies for years, perhaps decades, before they became public.)

Wikipedia's AES article says, "For cryptographers, a cryptographic 'break' is anything faster than a brute-force attack – i.e., performing one trial decryption for each possible key in sequence (see Cryptanalysis). A break can thus include results that are infeasible with current technology." The page then goes on to list quite a number of attacks on AES, none considered practical yet.

Given how long people have been working on cryptography, naturally there are many techniques for attacking the ciphers.  Here I'll mention just a couple of the modern techniques known for what we might call "honest" attacks on block cipher cryptography, granting the assumption that the algorithm has no flaws, and ignoring side channel attacks (such as measuring the power consumed during encryption) and attacks on infrastructure, people, etc. (some of which are known as "black bag", or burglary, attacks, and some of which are "rubber hose" attacks, extracting information from humans by coercion or torture).  David Kahn referred to these latter attacks as "practical cryptanalysis" in his classic book, The Codebreakers.

The older one of the two is differential cryptanalysis, apparently discovered independently at least three times: publicly by Biham and Shamir in the late 1980s, in the mid-70s by IBM (where the goal of defending against the technique drove DES design decisions), and earlier still by NSA. The second is linear cryptanalysis, introduced by Mitsuru Matsui in the early 1990s.

(Other techniques that I know nothing about include integral, algebraic, and biclique.)

Before talking about those two in any detail, there's a more straightforward technique that leaks information about the plaintext, if not the key itself: the birthday paradox.  But before we get into the cryptanalysis itself, let's look a little bit at how the designers built defenses into the block ciphers.

1.4 Notes & References

Coming soon to a blog near you!

1.3.3 Limitations of Block Ciphers

One obvious problem with the most straightforward application of a block cipher is that it's deterministic.  If you just apply the transform to each block independently, it's easy to implement; this is known as electronic code book (ECB) mode.  BUT: If you see the same ciphertext $c$ in two different places in the message stream, you know that the two input blocks were the same!  This leaks a huge amount of information to a sophisticated attacker, and is considered unacceptable.

One answer to this is to make the encryption slightly less deterministic by XORing in the ciphertext of the previous block into the current one before performing the encryption.  This is cipher block chaining (CBC) mode, the standard mode of operation.  (There are at least two more modes that I know nothing about.)  CBC has the undesirable side effect of requiring encryption to be done serially, but attacks can be parallelized.  Nevertheless, it's the most commonly used mode.
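To see why ECB is considered unacceptable and what chaining buys you, here's a toy sketch: one-byte "blocks" and a random byte permutation standing in for a real block cipher (real blocks are 64 or 128 bits, but the leak is the same):

import random

rng = random.Random(1234)              # the "key": it seeds a secret permutation
SBOX = list(range(256))
rng.shuffle(SBOX)

def E(b):                              # toy one-byte block "cipher"
    return SBOX[b]

def ecb(plaintext):
    return bytes(E(p) for p in plaintext)

def cbc(plaintext, iv=0x55):
    out, prev = [], iv
    for p in plaintext:
        c = E(p ^ prev)                # XOR in the previous ciphertext block first
        out.append(c)
        prev = c
    return bytes(out)

msg = b"ATTACK AT DAWN. ATTACK AT DAWN."
print(ecb(msg).hex())                  # the repeated phrase shows up as repeated ciphertext
print(cbc(msg).hex())                  # chaining hides the repetition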

1.3.2 AES (the Advanced Encryption Standard)

AES, the Advanced Encryption Standard, has replaced DES.  It is a type of block cipher known as a substitution-permutation cipher.  AES uses a block size of 128 bits, and a key size of 128, 192 or 256 bits. The key size affects the number of rounds of substitution-permutation, requiring 10, 12 or 14, respectively.
AES began life in 1997 when NIST announced a competition for a successor to DES, due to the problems with DES described earlier and later. Two Belgian cryptographers, Vincent Rijmen and Joan Daemen, submitted a proposal they dubbed Rijndael, which ultimately beat out fourteen other serious competitors to become, with some modifications, the Advanced Encryption Standard, published as the standard in 2001.
A substitution-permutation network, as you might guess, alternates substituting new bits for old ones with permuting (rearranging) the bits, conducted over a series of rounds. DES has some characteristics of a substitution-permutation network, but is not purely so. A well-designed s-p net will maximize Shannon's confusion and diffusion (the avalanche effect), which we will see shortly.
AES begins by laying out the sixteen bytes in a $4\times 4$ array in column major form: the first byte in the upper left, then the next byte below it, with the fifth byte returning to the top of the second column. The subsequent operations mix the bytes both within rows and across columns.
First, in the KeyExpansion phase, the round keys are created from the original key.
Next, the appropriate number of rounds is performed. Each round consists of the following steps:
  1. SubBytes
  2. ShiftRows
  3. MixColumns
  4. AddRoundKey
An extra AddRoundKey is performed before the first round, in addition to the one at the end of each round. In the last round, the MixColumns step is omitted.
The S box in SubBytes is a single, fixed, byte-level operation, $1:1$, unlike the DES S boxes, which are more complex and vary depending on the position within the block. The AES S box is nonlinear ($f(x+y) \ne f(x)+f(y)$) and was designed specifically to defeat both linear and differential cryptanalysis (which we will see later), but if you don't trust the specific values, they can be modified. The S box can be implemented as a straightforward 256-byte lookup table, with the input byte as the index, or computed on the fly as a multiplicative inverse in GF($2^8$) followed by a bitwise affine transformation.
The ShiftRows operation rotates all rows but the first. The MixColumns operation is more complex, executed as a matrix multiplication over the finite field GF($2^8$). These two operations maximize the diffusion of the original data. The AddRoundKey is a simple XOR of the 128-bit round key into the block.
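For the curious, the arithmetic underneath MixColumns is multiplication in GF($2^8$) with AES's reduction polynomial $x^8+x^4+x^3+x+1$ (0x11B); here's a sketch of just that operation (the MixColumns matrix rows are the constants 2, 3, 1, 1 in rotation):

def gf_mul(a, b):
    """Multiply two bytes as elements of GF(2^8) with the AES polynomial 0x11B."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        high_bit = a & 0x80
        a = (a << 1) & 0xFF
        if high_bit:
            a ^= 0x1B              # reduce modulo x^8 + x^4 + x^3 + x + 1
        b >>= 1
    return product

# e.g., the first output byte of a MixColumns column is
# gf_mul(2, s0) ^ gf_mul(3, s1) ^ s2 ^ s3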
AES is easily implemented in either software or hardware. Some modern CPUs include special instructions to accelerate AES encryption, and even for those that don't, heavily optimized implementations are available.  In the absence of special hardware, encrypting each 128-bit block of data requires several hundred CPU instructions. For example, my Mac laptop, with a 2.2GHz Intel Core i7 processor, encrypts AES-CBC at the following rates:
  • 128-bit key: 1.25 Gbps for 16-byte chunks, 1.44 Gbps for 8192-byte chunks
  • 192-bit key: 1.06 Gbps for 16-byte chunks, 1.18 Gbps for 8192-byte chunks
  • 256-bit key: 912 Mbps for 16-byte chunks, 1.01 Gbps for 8192-byte chunks
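If you want to reproduce numbers like these yourself, here's a rough sketch; it assumes the third-party pycryptodome package (pip install pycryptodome), and the exact figures depend heavily on your CPU and whether it has AES instructions:

import os
import time
from Crypto.Cipher import AES          # pycryptodome

def cbc_throughput_gbps(key_bytes, chunk_bytes, total_bytes=1 << 28):
    """Encrypt ~256 MB of random data in CBC mode and report throughput in Gbps."""
    cipher = AES.new(os.urandom(key_bytes), AES.MODE_CBC, iv=os.urandom(16))
    chunk = os.urandom(chunk_bytes)
    start = time.perf_counter()
    for _ in range(total_bytes // chunk_bytes):
        cipher.encrypt(chunk)          # CBC state chains across calls on the same object
    return total_bytes * 8 / (time.perf_counter() - start) / 1e9

for key_bytes in (16, 24, 32):         # AES-128, AES-192, AES-256
    print(key_bytes * 8, round(cbc_throughput_gbps(key_bytes, 8192), 2), "Gbps")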
Despite two decades of effort, there are no known attacks on AES that allow recovery of the key in substantially less time than brute force; the best attacks roughly halve that cost, and require tremendous amounts of storage.

Thursday, October 24, 2019

Google's Quantum Supremacy Paper

[Important update, 10/25 0630. Dave Bacon has corrected me.  The chip is fully programmable. I had thought that the choice of coupler patterns (Fig. S25 in the SOM) would limit what could and could not be done. I thought those were the only patterns the hardware supported, but it's just the set that they chose; in fact, any couplers can be on/off in any cycle, as long as they aren't using the same qubits. I stand corrected! Thanks, Dave!]

[Edit 10/24. 1730 JP time: softened the language a bit in the first couple of paragraphs; it came out harsher than I intended in the original version.]

We interrupt your irregularly scheduled programming on cryptography (What Every Quantum Researcher and Engineer Should Know about Classical Cryptography) for some comments on Google's new supremacy paper...

By now, everyone on the entire planet and some people in space have heard that Google has claimed the achievement of "quantum supremacy". As the resident quantum pundit on Dave Farber's IP list, I suppose I'm obliged to say a few things, but I really don't have much to add beyond what some of the most prominent people have said, so this is largely a collection of links I mostly endorse...

Fundamentally, Google claims that their 53-qubit superconducting processor can solve a problem that would take a classical supercomputer 10,000 years.  But the problem is rather constrained; arguably, they defined the problem as whatever the computer is capable of doing.  That sounds harsh, and I don't mean it to be; it's still a tour de force that will stand as an important milestone, and it is programmable, but it's not yet fully general. Actually, let me soften that a bit; the chip itself was designed in conjunction with an understanding of what they are physically capable of doing, but very much with an eye toward what would most clearly demonstrate supremacy within their physical constraints.  So it's more fair to say it's hardware/software co-design. (And as I always tell my students, you can't design any computer without a plan for its workload!)

If you squint, it's something like a satisfiability problem (SAT, in CS theory terms).  In a SAT problem, you're given a Boolean circuit and challenged to find a set of input values that makes the (single) output TRUE.  In this case, measuring out all 53 qubits, it's not a single value that's the output, but rather a probability distribution. Using quantum entanglement and interference, in fact, it's not a flat "any number out of $2^{53}$" distribution, but instead has complex statistical characteristics.  It's those characteristics that they are looking for.  They started with a small problem size (using a subset of the qubits), calculated the distribution, saw that the machine conforms (but see the next paragraph!), then increased the size up past the level where their code can simulate it, and declared victory.

There are three big catches to this: one, as already stated it's not fully programmable; two, it's not clear that this is beyond the capability of classical systems (see below); three, I believe them when they say it works, but noise remains the limiting factor, and it takes tremendous statistical skill to pull the signal out of the noise -- indeed, the fidelity (roughly, probability that it did the right thing) for their largest problem is only 0.2%!  That is, only one time in 500 does the circuit run properly, with no errors.  And given that the output is supposed to be a probability distribution, and that the space of answers is about $10^{16}$, that means figuring out whether it differs from pure noise is tough.

The paper itself:
https://www.nature.com/articles/s41586-019-1666-5
53 qubits and 76 authors by my count, which makes 1.43 authors per qubit.  The team is headed by John Martinis (pronounced close to "Martinez"), one of the long-time key people in superconducting qubits.  John was at UC Santa Barbara before Google lured him and his entire team away a few years ago, and put truly outstanding resources at their disposal.

There is also substantial (67 pages) Supplementary Online Material (SOM) at the bottom of the web page; don't miss it if you want details of everything from the fabrication (two chips bonded face-to-face, one with the classical parts and one with the quantum parts!), to the cross-entropy benchmarking (XEB) used to statistically prove their proposition that the device works, to the calibration procedure (tremendously difficult), to the algorithm itself.

News & views from William Oliver:
https://www.nature.com/articles/d41586-019-03173-4

The five-minute Google feel-good video:
https://www.youtube.com/watch?v=-ZNEzzDcllU

Blog post by Sundar Pichai, Google CEO:
https://blog.google/perspectives/sundar-pichai/what-our-quantum-computing-milestone-means

They built on techniques developed by Sergio Boixo & co.:
https://arxiv.org/abs/1608.00263

Naturally, IBM, one of the other biggest names in quantum computing, is pushing back:
https://www.ibm.com/blogs/research/2019/10/on-quantum-supremacy/
(full disclosure: I'm Vice Center Chair of Keio's Quantum Computing Center, and we are working with IBM; my students have recently been using IBM's own 53-qubit processor.)

They're not questioning the experiment itself, but whether it's truly post-classical.  Simulators have been advancing rapidly, and _ideas_ for simulators and what counts as quantum supremacy have been moving even more rapidly.

(n.b.: there is pushback on the term "quantum supremacy"; in particular, the Europeans don't like it, they think it echoes Nazi terminology.)

Anyway...the most straightforward way of simulating a quantum computer takes memory exponential in the number of qubits:

10 qubits: 16KB
20 qubits: 16MB
30 qubits: 16GB
40 qubits: 16TB
50 qubits: 16PB
53 qubits: 128PB (144,115,188,075,855,872 bytes)

and, of course, a separate computation for each entry.  But numerous techniques compress that dramatically; one technique, called tensor networks, (very, very, very roughly) allows a query for the value of a state to reuse previously-calculated results, with the whole thing compressed into a tree of pointers to values, plus modifiers.  IBM's Edwin Pednault and team have simulated up to 56 qubits in special cases:
https://arxiv.org/abs/1710.05867
and this team's new work forms the core of the IBM pushback against Google:
https://arxiv.org/abs/1910.09534

Indeed, Alibaba simulated some special cases related to supremacy last year, up to *144* qubits:
https://arxiv.org/abs/1805.01450

There has been some recent discussion of what it is that truly determines the difficulty of a quantum computation; it's not just the number of qubits or the number of entangling two-qubit gates, it's a more complex function. I should dig up some references on that.

Scott Aaronson's quantum supremacy FAQ:
https://www.scottaaronson.com/blog/?p=4317
and a new post on the topic:
https://www.scottaaronson.com/blog/?p=4372

Simon Devitt's podcast, Meet the meQuanics, just did an episode with some of the inventors of some of the theoretical techniques Google used:
https://soundcloud.com/mequanics/meet-the-mequanics-e53-quantum-supremacy-a-montenaro-m-bremner-r-mann

More useful discussion:
https://spectrum.ieee.org/tech-talk/computing/hardware/how-googles-quantum-supremacy-plays-into-quantum-computings-long-game

And from John Preskill himself, inventor of the term:
https://www.quantamagazine.org/john-preskill-explains-quantum-supremacy-20191002/

Finally, let me point out that Chris Monroe is quoted in Financial Review, talking about his ion trap experiment from a couple of years ago -- featuring, you guessed it, 53 qubits (they were trying for 52 and wound up with 53):
https://www.afr.com/technology/why-google-wasn-t-really-the-first-to-achieve-quantum-supremacy-20190927-p52vg6
https://www.nature.com/articles/nature24654
https://arxiv.org/abs/1708.01044
52 was chosen to get ahead of Misha Lukin's group, which had 51:
https://www.nature.com/articles/nature24622
https://arxiv.org/abs/1707.04344

More than enough for now.  Hope this helps!

Thursday, October 17, 2019

1.3.1 DES (the Data Encryption Standard)

DES, the Data Encryption Standard, was designed by IBM in the 1970s, consulting with the NSA.  DES is a type of cipher known as an iterative block cipher, which breaks data into blocks and repeats a set of operations on them.  The core structure of DES is called a Feistel network, after its inventor, Horst Feistel.

DES uses a 56-bit key, though products exported from the U.S. were limited to using 40 meaningful key bits [wikipedia/40-bit-encryption]. It was later upgraded to triple-DES, using three 56-bit key pieces and repeating DES three times, giving a nominal 168 bits of key (meet-in-the-middle attacks reduce the effective strength to roughly 112 bits).  But it's not just the key size that matters in a block cipher.  The block size itself matters a great deal, as we'll see below.  For DES, that block size is only 64 bits.

DES operates in sixteen rounds, each of which uses a 48-bit subkey generated from the original 56-bit key using a key scheduling algorithm.  In each round, half of the block is tweaked and half initially left alone, then XORed with the tweaked half.  The two halves are swapped before the next round.

The "tweaking" of the right half of the block is done by first expanding the 32 bits into 48 by replicating half of the bits, XORing with the 48-bit subkey, then dividing it into 6-bit chunks and pushing each chunk through one of eight substitution boxes, or S boxes.  Each S box turns 6 bits into 4, using a lookup table defined as part of the algorithm (that is, this operation is not key-dependent).  The S boxes are nonlinear (but not affine), which is the source of the true security of DES; if the S boxes were linear, breaking DES would be easy (or so I am told).

Decrypting DES is exactly the same operation as encrypting, except that the subkeys are used in reverse order.

Slightly more formally, the sequence of operations in a DES encryption is:


  1. Apply initial permutation (IP) (a fixed operation)
  2. For i = 1 to 16 do
    1. divide the block into two 32-bit halves
    2. expand the right half to 48 bits (a fixed operation)
    3. calculate subkey i
      1. split key into two 28-bit halves
      2.  rotate each half 1 or 2 bits (a fixed operation according to the key schedule)
      3. select a subset of 48 bits (a fixed operation according to  the schedule)
    4. XOR subkey i with the expanded right half
    5. split into 8 six-bit pieces
    6. push each 6-bit piece through a $6\rightarrow 4$ S-box
    7. permute and recombine the pieces (a fixed operation)
    8. XOR the result into the left half of the block
    9.  swap halves of the block
  3. Apply the final permutation (FP) (a fixed operation)

The $6 \rightarrow 4$ S boxes are obviously not reversible in isolation, but in a Feistel network their output is only ever XORed into the other half of the block, so ultimately no information is lost as the block passes through the entire network.

The success of cryptanalysis is often measured in terms of the number of rounds it can break in a multi-round cipher like DES.  e.g., if DES were six rounds instead of sixteen, we could do yea much.  Various attacks have been able to penetrate DES to different depths.  Of course, the more straightforward approach is sheer brute force; as early as 1977, Diffie and Hellman published a critique in which they argued that brute force stood at the edge of feasible, and indeed in 1998, a special-purpose machine named Deep Crack, built by the Electronic Frontier Foundation, cracked a DES key.  Further advances have made it possible even for dedicated hobbyists to crack a DES key.

Triple-DES (also called 3DES) has several modes of operation, but is usually used with three independent 56-bit keys, $K1$, $K2$, and $K3$, with encryption performed as $C = E_{K3}(D_{K2}(E_{K1}(P)))$ where $P$ is the plaintext, $E$ and $D$ are the encryption and decryption operations, and $C$ is the ciphertext.
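That E-D-E composition is easy to write down given a DES primitive; here's a minimal sketch for a single 8-byte block, assuming the third-party pycryptodome package (each key is 8 bytes: 56 meaningful bits plus parity):

from Crypto.Cipher import DES          # pycryptodome

def tdes_ede_encrypt_block(block, k1, k2, k3):
    """C = E_K3(D_K2(E_K1(P))) for one 8-byte block."""
    step1 = DES.new(k1, DES.MODE_ECB).encrypt(block)
    step2 = DES.new(k2, DES.MODE_ECB).decrypt(step1)
    return DES.new(k3, DES.MODE_ECB).encrypt(step2)

def tdes_ede_decrypt_block(block, k1, k2, k3):
    """P = D_K1(E_K2(D_K3(C))), the reverse composition."""
    step1 = DES.new(k3, DES.MODE_ECB).decrypt(block)
    step2 = DES.new(k2, DES.MODE_ECB).encrypt(step1)
    return DES.new(k1, DES.MODE_ECB).decrypt(step2)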

DES was withdrawn as a standard in 2005, after having been replaced by AES in 2001, although the U.S. government still allows 3DES until 2030 for sensitive information.

Monday, October 14, 2019

1.3 Bulk Data Encryption

D-H and RSA are pretty easy to understand, and known to be vulnerable to quantum computers, hence attract a lot of attention. The workhorse symmetric block ciphers for bulk encryption are actually much more complex mathematically, and hence harder to understand, but ultimately can be executed efficiently on modern microprocessors and dedicated chips.

A block cipher takes the data to be encrypted, and breaks it into fixed-size chunks called blocks.  If the last block isn't full, it is filled out with meaningless data.  We also take a key, and using the key perform mathematical operations on the data block.  In a symmetric system, by definition, the decryption key is the same as the encryption key.  Generally, the output block is the same size as the input block.  There is no requirement that the block and key are the same size.

Ideally, the output block (the ciphertext) will look completely random: about 50% zeroes and 50% ones, regardless of input, and with no detectable pattern.  That is, its entropy will be very high.  Of course, it cannot be completely random, as it must be possible for the data to be decrypted by someone holding the appropriate key.  A good cipher, however, will provide few clues to an attacker.  A single bit's difference in either the key or the original plaintext should result in about half of the ciphertext bits being flipped, so that being "close" offers no guidance on a next step.

Thus, a symmetric key system's security is often linked to its key size; with $k$ key bits, it should require, on average, $2^{k-1}$ trial decryptions to find the key and to be able to decrypt the message.  We will discuss this further when we get to cryptanalysis.

Many encryption algorithms have been designed over the years, but two in particular are too important to ignore, so we will examine them: DES, the Data Encryption Standard, which is still in use in modified form but is primarily of historical interest now, and AES, the Advanced Encryption Standard, which is used for most communications today.

Friday, October 11, 2019

1.2 Key Generation

See the previous section here.

The other algorithm we talk about a lot as being both important and vulnerable to Shor's factoring algorithm is Diffie-Hellman key exchange, which is used for creating the session key we will use for bulk data encryption.  It's important that every session have its own separate encryption key; we don't want multiple conversations sharing the same key, for a lot of obvious reasons.

D-H works as follows:

  1. Alice and Bob publicly agree on a modulus $p$ and a base $g$ (in cleartext is okay)
  2. Alice and Bob each pick a secret random number $a$ and $b$
  3. Alice calculates $A = g^a \bmod p$, Bob calculates $B = g^b \bmod p$
  4. Alice sends Bob $A$, Bob sends Alice $B$ (in cleartext is okay)
  5. Alice calculates $s = B^a \bmod p$, Bob calculates $s = A^b \bmod p$

Both Alice and Bob now have the same secret $s$, which they can use as an encryption key.
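In code, with toy numbers (a real exchange uses a prime of 2048 bits or more and carefully chosen parameters; pow(g, a, p) is Python's built-in modular exponentiation):

import random

p, g = 23, 5                           # public modulus and base (toy values)
a = random.randrange(2, p - 1)         # Alice's secret
b = random.randrange(2, p - 1)         # Bob's secret
A = pow(g, a, p)                       # Alice sends this to Bob, in the clear
B = pow(g, b, p)                       # Bob sends this to Alice, in the clear
assert pow(B, a, p) == pow(A, b, p)    # both sides arrive at the same secret s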

Note that, as-is, D-H is vulnerable to a man-in-the-middle attack, and so must be coupled with some form of authentication so that Alice and Bob each know that the other is who they say they are.

Thursday, October 10, 2019

1.1 Authentication

Previous post in this series is here.

Okay, time to dive into this in more technical depth. Let's go phase by phase:

1.1 Authentication

Authentication, or proving you are who you say you are, is often described as something you have (a physical key), something you know (a secret password), or something you are (biometrics, or physically unclonable functions, PUFs).

The authentication phase can be accomplished using a pre-shared secret key and symmetric encryption of some message and response, or it can be done via asymmetric, public-key cryptography.

The best-known and most-used form of public key crypto is RSA, developed by Rivest, Shamir and Adleman (but also previously discovered by Cocks, of a British intelligence agency):

1. Pick two large primes, $p$ and $q$, let $n = pq$.
2. Calculate $\phi(p,q) = (p-1)(q-1)$.
3. Pick $e$, relatively prime to $\phi(p,q)$.
4. Calculate $d$ such that $de \equiv 1 \pmod{\phi(p,q)}$; that is, $d$ is the inverse of $e$ modulo $\phi(p,q)$.

The tuple $\langle n,e\rangle$ is now your public key, and $d$ is your private key. You can publish $\langle n,e\rangle$ any way you like, including in the New York Times. To send you a message $m$, I calculate the ciphertext $c$,

$c = m^e \bmod n$

and send $c$ to you.  You can recover the message $m$ by

    $m = c^d \bmod n$

That's all there is to it!  Except that it's expensive, so we don't use it for every message.  If I send you "Hi, Bob, it's Tuesday, let's use 42 as our key," using your public key and you reply, "Tuesday is beautiful, Alice, and 42 is fine," using my public key, we confirm that we are definitely Bob and Alice, known as authentication -- assuming, of course, that you trust that the key that you are holding really and truly belongs to me, and I haven't leaked it out somehow!
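The whole thing fits in a few lines with toy numbers (real keys use primes of a thousand-plus bits each; the modular inverse via pow(e, -1, phi) needs Python 3.8+):

p, q = 61, 53
n = p * q                    # 3233, part of the public key
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, relatively prime to phi
d = pow(e, -1, phi)          # 2753, the private key
m = 42                       # the message, encoded as a number smaller than n
c = pow(m, e, n)             # encrypt with the public key <n, e>
assert pow(c, d, n) == m     # decrypt with the private key d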

As you might guess from the publication of $n$, RSA is the one that's vulnerable to factoring larger integers, and hence of interest to spooks once Shor's algorithm came about.

Current EU recommendations are for 3072-bit RSA keys, recently (2018) upgraded from 2048, for "near-term protection" (up to ten years). They recommend an extraordinary 15360 bits for "long-term protection" (thirty to fifty years). [Cloudflare, Keylength.com, Ecrypt, Lenstra]

(Note: there are many more references to come; they will be delivered in a batch at the end.  It's not practical to list them all here in every article as we go, but I'll try to give you some of the key(ha!) ones.)

1.0 Encrypted Communications

See the previous posting if you haven't.

An encrypted conversation involves, at the macro level, three phases:

  1. Authentication: proving that you are who you say you are
  2. Key generation: creating the keys used for bulk data encryption
  3. Encryption/sending/decryption of the user's valuable bulk data

That's a bit of an oversimplification, as we'll see below when we talk about IPsec, but good enough for now.

You probably all know that there are two major kinds of cryptography -- symmetric key and asymmetric key, also known as public key.  Due to the high costs of computation for public key, bulk data encryption is done using symmetric key.  Symmetric encryption comes in two kinds, block ciphers and stream ciphers.  These days pretty much everything seems to be block, but I'm not sure why.

Some additional terminology:

  • cleartext: data that isn't encrypted and isn't really intended to be (sometimes confused with the below, even by me)
  • plaintext: the original, unencrypted message
  • ciphertext: the encrypted message
  • integrity: the data hasn't been tampered with
  • confidentiality or privacy: the data hasn't been disclosed to anyone unauthorized
  • session keys: the keys used for one communication session
  • forward secrecy: keeping your data secret in the future, esp. by building a cryptosystem that doesn't reuse keys
  • rekeying: changing the keys used in the middle of a session
  • subkeys: keys used for a portion of an encryption process, derived from a subset of the bits of the session key
  • cryptoperiod: the time that a specific key is authorized for use

Attacks on encrypted communications generally fall into one of three
categories:

  • unknown plaintext: the hardest problem; how do you recognize when you've found the message?  With many but not all systems, failure will leave only unintelligible, random data, while success will produce words from the dictionary or other recognizable text.
  • known plaintext: when an attacker knows what the plaintext corresponding to a particular ciphertext is, and attempts to find the key; not uncommon given the regularity of communications such as web browsing or email.
  • chosen plaintext: when the attacker can control the text to be encrypted, but obviously not the key; rarer than known plaintext, but can happen with small devices that a person may "own" but not always completely control, or if the attacker partially controls some subset of resources, such as a related web server, or has compromised one or more hosts behind an encryption gateway.

We also need this definition:

  • brute force/exhaustive search: checking every possible key, which of course is $2^n$ for an $n$-bit key, resulting in an expected hit time of half that number of trials; if you have a method that will find a key in substantially less than $2^{n-1}$ trials, you have "broken" the cipher, even if your attack isn't necessarily practical in the short run.

And three mathematical definitions I know for algebra on the real numbers, but I'm a little fuzzy on in the bitwise, cryptographic context:

  • linear function: in this context, $f$ is a linear function of the input bits if it is just the XOR of some subset of them, so that $f(x_1 + x_2) = f(x_1) + f(x_2)$ (where $+$ is XOR) and $f(0,0) = 0$ (the origin is preserved).  Multiplication of input bits (AND) is not allowed.
  • affine function: a linear function plus an optional constant, so that possibly $f(0,0) = 1$ (the origin isn't necessarily preserved); the equivalent of a translation in a real space $R^n$.
  • nonlinear function: one that is neither linear nor affine, so that $f(x+y) \ne f(x) + f(y)$ for some values $x$ and $y$ even after allowing for a constant offset.  ANDs of input bits and arbitrary mappings, such as the S boxes, are what make a function nonlinear.
Next posting in this series is here.

Wednesday, October 09, 2019

0.0 Introduction

I just posted this introduction to the project here.  I hope you'll follow along as I fill in the postings.

There are two fundamental reasons for periodic rekeying:
  1. Having a long string of data encrypted with the same key increases the ability of an attacker to find the key.
  2. As a practical matter, reducing the amount of data that is exposed in the event that a key is broken is good stewardship of your data.
So changing the key "often" makes it both harder and less valuable for an attacker to find a key.  That's the tl;dr.  Turning it into concrete, numeric recommendations is hard, but it looks like rekeying every half hour is pretty good, if your key bandwidth isn't high enough to do full one-time pad.

The above I knew off the top of my head, but I'm not a cryptographer and had more or less taken the need for changing the keys (known as rekeying) on faith.  So, let's dig a little deeper...there won't be any new knowledge in this, but I hope it helps bring together some information and organizes it to both save you the googling and calculating I've done and to provide a clear picture.

I used to work on a box known as an IPsec gateway, but I was the OS guy, not the cryptographer; Darrell Piper, Dan Harkins and Dave Kashtan did most of the crypto work.  I am also a board member of the WIDE Project, which through its KAME Project created the Racoon software that was an important early implementation of the key exchange protocol for IPsec, with much of that work done by Shoichi Sakane.  Much of what I know about IPsec and other security was just incidental radiation from them, or gained from the security-oriented papers we had to read when I did my master's at USC back before the Enlightenment.

I'm afraid this has gotten longer than I originally intended, so I hope it's all useful.  It's so long, a table of contents is in order:
  1. Brief, mostly non-mathematical intro to cryptography and its terminology, in the context of encrypted communications.
  2. General ideas behind cryptanalysis and the importance of limiting data volume, since that's the topic that started this conversation.
  3. How these ideas play out in IPsec, one of the Internet's key security protocols.
  4. How these ideas play out in TLS/SSL.
  5. Quantum and crypto.
  6. Post-quantum crypto.
  7. ACT NOW and order the above package, and as a FREE bonus, Ronco (well, Rodco) will send you a few thoughts on Bitcoin and its vulnerability to the development of quantum computers!
  8. References
Next posting is here.

What Every Quantum Researcher and Engineer Should Know about Classical Cryptography

At a conference in Berlin in June, I was asked, paraphrasing, about why we change the cryptographic keys used during long communication sessions, and how often they ought to be changed.  I waved my hands a little bit, but I wasn't completely satisfied with my own answer. It turned out to be a good opportunity for me to dig into a little background I'd been meaning to go through.  It also proved to be a remarkably hard question to answer from googling up pages, including cryptography research papers and the extensive mailing list archives from development of key Internet protocols.  I fell down a pretty serious rabbit hole here, and haven't found my way out yet.

What I have wound up with is about 25 pages of notes on classical cryptography, and I have decided to turn them into something that will benefit the quantum community at large.  I'm tentatively titling it, "What Every Quantum Researcher and Engineer Should Know about Classical Cryptography", in homage to Goldberg's classic paper on floating point arithmetic.

The target audience for this is primarily quantum computing researchers who are familiar with Shor's algorithm, Grover's algorithm and QKD, since those are among the first things you learn, but who have only a very rough idea of what it means to actually encrypt data and to use encryption in a real-world setting.

I'll be posting the notes gradually over the next few weeks here on this blog, and I hope you will comment and help me improve them.

Table of Contents:
  1. Introduction
  2. Encrypted Communications
    1. Authentication
    2. Key Generation
    3. Bulk Data Encryption
      1. DES
      2. AES
      3. Limitations of Block Ciphers
    4. Notes & References
  3. Cryptanalysis
    1. Defense
      1. Entropy
      2. Diffusion and Confusion
    2. The Birthday Paradox, or, When Should I Change my Encryption Keys?
    3. Differential Cryptanalysis
    4. Linear Cryptanalysis
    5. Known and Chosen Plaintexts in Real Systems
    6. Notes & References
  4. IPsec and the IETF
    1. Internet Standards
    2. (Classical) IPsec
    3. Digging into the Cryptanalysis of IPsec
    4. IPsec with QKD
    5. Notes & References
  5. TLS/SSL and cryptography
    1. TLS Records and Basic Limits
    2. Keying and Rekeying
    3. Other Attacks on TLS
    4. TLS and QKD
    5. Notes & References
  6. Quantum Attacks on Classical Crypto
    1. Shor 'Nuff
    2. Grover's Amplifier
  7. Post-Quantum Cryptography
  8. Bitcoin/Blockchain and Quantum Computers
  9. Conclusion
  10. Additional References
See the first installment here.

Tuesday, October 08, 2019

testing, testing, is this thing on?

My interest in running a blog was tailing off when I lost access to this account, so I let it go for a long time. Now, I have some things to say, and I may have recovered access here.