Monday, October 14, 2019

1.3 Bulk Data Encryption

D-H and RSA are fairly easy to understand and are known to be vulnerable to quantum computers, so they attract a lot of attention. The workhorse symmetric block ciphers used for bulk encryption are actually much more complex mathematically, and hence harder to understand, but they can be executed efficiently on modern microprocessors and dedicated chips.

A block cipher takes the data to be encrypted and breaks it into fixed-size chunks called blocks.  If the last block isn't full, it is filled out with padding, which carries no message content but usually follows a fixed scheme so that the receiver can strip it off again.  We also take a key, and using the key we perform mathematical operations on each data block.  In a symmetric system, by definition, the decryption key is the same as the encryption key.  Generally, the output block is the same size as the input block.  There is no requirement that the block and the key be the same size.
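To make the splitting and padding step concrete, here is a minimal sketch in Python. It assumes a 16-byte block and PKCS#7-style padding (append n bytes, each with value n); the text above does not name a particular scheme, so both choices, along with the function names, are illustrative assumptions. Note that this style of padding adds a whole extra block when the data already fills its last block, so the receiver can always tell where the padding starts.

```python
BLOCK_SIZE = 16  # bytes; an assumed size for illustration


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """PKCS#7-style padding: append n bytes, each holding the value n."""
    n = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([n]) * n


def unpad(padded: bytes) -> bytes:
    """Strip the padding by reading its length from the final byte."""
    n = padded[-1]
    return padded[:-n]


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Pad the data, then cut it into fixed-size blocks."""
    padded = pad(data, block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


blocks = split_blocks(b"A" * 20)
print(len(blocks), [len(b) for b in blocks])
```

Each block in the resulting list would then be handed to the cipher together with the key.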

Ideally, the output block (the ciphertext) will look completely random: about 50% zeroes and 50% ones, regardless of input, and with no detectable pattern.  That is, its entropy will be very high.  Of course, it cannot be completely random, as it must be possible for the data to be decrypted by someone holding the appropriate key.  A good cipher, however, will provide few clues to an attacker.  A single bit's difference in either the key or the original plaintext should result in about half of the ciphertext bits being flipped (often called the avalanche effect), so that being "close" offers no guidance on a next step.
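The "about half the bits flip" property can be observed with a toy experiment. The sketch below is not DES or AES. It is a hypothetical 8-round Feistel construction that borrows SHA-256 from Python's standard library as its round function, chosen only because it needs no external packages and can still be decrypted with the same key. All names (`toy_encrypt`, `flip_bit`, and so on) are made up for illustration, and the cipher should not be used for anything real.

```python
import hashlib

ROUNDS = 8  # assumed round count for this toy
HALF = 8    # bytes per half, giving a 16-byte (128-bit) block


def _round_fn(half: bytes, key: bytes, rnd: int) -> bytes:
    # Keyed round function: SHA-256 stands in for a real cipher's round logic.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(block: bytes, key: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, _xor(left, _round_fn(right, key, rnd))
    return left + right


def toy_decrypt(block: bytes, key: bytes) -> bytes:
    # Same key, rounds undone in reverse order.
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(ROUNDS)):
        left, right = _xor(right, _round_fn(left, key, rnd)), left
    return left + right


def flip_bit(data: bytes, bit: int) -> bytes:
    out = bytearray(data)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def bit_diff(a: bytes, b: bytes) -> int:
    # Number of bit positions in which a and b differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


key = b"0123456789abcdef"
plain = b"sixteen byte msg"
base = toy_encrypt(plain, key)
print("plaintext bit flipped:", bit_diff(base, toy_encrypt(flip_bit(plain, 0), key)), "of 128")
print("key bit flipped:      ", bit_diff(base, toy_encrypt(plain, flip_bit(key, 0))), "of 128")
```

Both counts should land somewhere in the neighborhood of 64 out of 128, with the exact values depending on the inputs. A cipher where one flipped input bit changed only a handful of output bits would be leaking information about how "close" a guess is.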

Thus, a symmetric key system's security is often linked to its key size: with $k$ key bits, and assuming the attacker can do no better than trying keys one after another, it should require, on average, $2^{k-1}$ trial decryptions to find the key and to be able to decrypt the message.  We will discuss this further when we get to cryptanalysis.
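The $2^{k-1}$ figure follows from a short calculation, sketched here under the assumption that the attacker tries each of the $N = 2^k$ possible keys once, in an order unrelated to the true key. The correct key is then equally likely to turn up at any position from $1$ to $N$, so the expected number of trials is

$$E[\text{trials}] = \frac{1}{N}\sum_{i=1}^{N} i = \frac{N+1}{2} = 2^{k-1} + \frac{1}{2} \approx 2^{k-1}.$$

For a 56-bit key, for example, that is about $2^{55} \approx 3.6 \times 10^{16}$ trial decryptions, and each additional key bit doubles the work.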

Many encryption algorithms have been designed over the years, but two in particular are too important to ignore, so we will examine them: DES, the Data Encryption Standard, which is still in use in modified form but is primarily of historical interest now, and AES, the Advanced Encryption Standard, which is used for most communications today.
