March 28, 2024

An Introduction to Cryptography: Symmetric Encryption

The world of computer science can be intimidating, especially if you try to figure it out on your own. That's whyI decided to write an overview of the basics of cryptography forthose who would like to delve into it, but do not know where to start. The review is based on a course on cryptography from Stanford University professor Dan Bonech, available on Coursera.

</p>

I decided to take this course since Iblockchain developer with no traditional education in computer science. I studied economics at university, but then I went into computer programming. When I started programming, I aimed to "get closer to the computer" - to figure out what is hidden under all the abstraction that I dealt with while doing web development. The transition from web programming to cryptocurrency and distributed systems has been insanely fun, thanks in part to a deeper understanding of cryptography. However, I needed a stronger foundation. Since this is a fairly large area, I donated $ 70 to get information on the forum, which is moderated by Stanford University faculty members. But the course can be taken for free without checking the assignments by teachers.

So let's get started.

What is cryptography?

Essentially, cryptography is the practice of communicating securely in the presence of potential third-party ill-wishers. Safe communication has two main components:

  1. Interception protection:data confidentiality.
  2. Data manipulation protection:integrity of information, which means that no one can change the data you send, causing the recipient to accept the distorted data as valid.

Data privacy is achieved through encryption, which can take two forms: symmetric and asymmetric.

  • Symmetric encryptionuses a single key that must be exchanged between all communication participants.
  • Asymmetric encryptionuses private keys. Each participant has a pair of public and private keys to encrypt and decrypt messages.

In this article, we will focus on symmetric encryption.

Two types of ciphers

Encryption ensures the confidentiality of data and includes two important components:

  1. The secret key:in the context of symmetric encryption, we can assume that our participants Alice and Bob have a shared secret key.
  2. Cipher:encryption and decryption algorithms.

It is important to note that the encryption and decryption algorithms are known to everyone. Only the key is kept secret.

There are two types of symmetric ciphers: streaming and block. To properly understand these ciphers, it will be helpful to be familiar with bit operations (operations performed on bits), in particular with the exclusive "or" (XOR). In short, this is the addition of two bits, in which, if they are different (0 and 1), the result is 1, and if they are the same (two zeros or two ones), the result is 0. In what follows, it is assumed that the reader knows what is XOR and that this operation is usually denoted by the sign ⊕.

Stream cipher

Stream cipher- this is a symmetric cipher, where over the open(original) text (in byte form) when encrypting bit by bit, an XOR operation is performed using the key (also in byte form). The same process is used for decryption. Given the nature of the XOR operation, if we perform it on the ciphertext (ciphertext) using the key, we will get the original plaintext.

How a stream cipher works. : Dan Boneh

Attentive readers have probably guessed thatthe key and the plaintext must have something in common, and something very important. Right! Key and plaintext must be the same length. This is, of course, not very practical.

To make the stream cipher more practical, we usegenerators pseudo-randomnumbers.Pseudo-random number generator- Thisdeterministic procedure, which based on the input produces a longerpseudo-random result. “Deterministic” means that this procedure, given the same input, always produces the same output (i.e. “abc123” always produces “8474f24e0d72e1b949ffd2...”). Word"Pseudo-random"means that although the exit is not reallyrandom (because it is deterministic based on a specific input), it is indistinguishable from a true random string. In other words, if you have a set of inputs and outputs, then it is impossible to understand which output corresponds to which input. If you use a shared secret key as input, you can obtain a longer pseudorandom key that can be used to XOR a plaintext of the same length.

The described implementation of the stream cipher is called"Disposable pad". Its very important property is that the key can only be used ONCE. If it is reused, the security of the messages is compromised.

Below is a slide from the course.PRG(K) denotes the pseudo-random sequence generated from our public key K. The symbol ⊕ denotes the XOR operation; C – ciphertext; m – message (plain text).

: Dan Boneh

In essence, the slide shows that ifuse the key twice, then you can perform an XOR operation on two ciphertexts, which will be equal to an XOR operation on two plain texts. Since there is a lot of redundancy in human-readable language, a smart attacker can use this information to completely recover messages.

To avoid repeating the one-time pad key but keeping one shared secret key, the conceptnonsa.Nonsis an arbitrary number thatcryptographic communication can only be used once. Along with the ciphertext, you can also send a nonce, which will be added to the secret key to produce a different pseudo-random key each time you encrypt.

(You may have noticed that on the slide above it says"Attack 1". If you have a question about"Attack 2", then it refers to the fact that a stream cipher providesconfidentiality, but notdata integrity, mentioned at the beginning of the article).

Block cipher

The second type of cipher is block ciphers. A block cipher uses a fixed-length input and encrypts the original text over and over again, using a different key ("round key") each time, until it produces a ciphertext of the same length. 3DES and AES are two examples of block ciphers that use 48 and 128 bit input, respectively.

Block cipher scheme. : Dan Boneh

The slide above shows the basic architectureblock cipher. As you can see, a key expansion algorithm is used to obtain a new key in each round. The original text (m) is encrypted over and over until a ciphertext (C) of the same length is received.

For the sake of brevity, I will be looking at AES in this article. Despite the historical significance of DES / 3DES, AES is more widely used today.

: Dan Boneh

AES is built likepermutation network (SP-network)... AES uses blocks of 128 bits, which is 16 bytes. 16 bytes are written as a 4-by-4 matrix. Such a matrix is ​​convenient for data manipulation. In each round, the process is as follows:

  1. We perform an XOR operation on the original message and the first round key(k0).
  2. We perform substitution, replacing data blocks with others based on a specific table(ByteSub procedure).
  3. We carry out the permutation by shifting and mixing the bits(procedures ShiftRow and MixColumn).
  4. The process is repeated for 10 rounds.

In the last round the procedure is omittedMixColumn, the XOR operation is performed with the last round key and we get the final ciphertext. The reverse process is used for decryption. For those who want to go deeper, I can advise you to familiarize yourself with the Feistel network, which is used in 3DES to compare different block ciphers.

After the launch of the Westmere architecture, Intel began to build special instructions for AES optimization into its processors, and AMD soon followed.

Block cipher modes

Unlike stream ciphers, block ciphers usefixed length input. Obviously we want to process more than 16 bytes of data at the same time. Therefore, it is further important to understand the modes in which block ciphers can be applied to large amounts of data. The first of these is the Electronic Code Book (ECB) mode. ECB simply breaks the data into blocks of 16 bytes and performs AES encryption consistently. This can be done in parallel and very quickly. However, this is not very secure.

: Dan Boneh

The reason is that if 16 byte messagesare repeated, the ciphertext will also be repeated. This is how a potential attacker can obtain information about our data. You can illustrate this vulnerability with an example of an image encrypted with ECB. In the picture below, you can see that dark hair and a T-shirt allow you to distinguish outlines by which you can understand that a snapshot of someone's face is encrypted.

: Dan Boneh, B. Preneel

It is important that encryption schemes besemantically stable.Semantic stabilitymeans that if there is a ciphertext,corresponding to one of two plaintexts, the adversary cannot know with probability greater than 1/2 which plaintext it corresponds to. Obviously, ECB mode is semantically unstable. The encrypted image reveals a lot of information that allows you to guess about the source.

ECB is an example of one-time key mode (that is, like a one-time pad, the key cannot be reused). Another, more secure mode with a one-time key -deterministic counter mode... You can search for information about it yourself, and I will move on to safe modes with reusable keys.

In modeciphertext block concatenation (CBC)Each 16-byte block of plaintext is XORed with the ciphertext of the previous block, followed by block cipher encryption (AES).

: Dan Boneh

Let's start with randominitialization vectors (IV), that is, the initial value in the cyclicprocess. In the case of CBC, the IV value must be random (and therefore unpredictable), and therefore unique in every transaction. The first block of ciphertext is just an unencrypted random IV. To get the rest of the ciphertext, we first XOR the random IV and the first plaintext block (m [0]). The result is encrypted with the round key k, and we get the first encrypted ciphertext block (c [0]). Next, an XOR operation is performed on this ciphertext and the next block of plaintext (m [1]), the result is encrypted with a round key k, and we get the next block of ciphertext (c [1]). The process continues until all blocks are encrypted.

: Dan Boneh

The process is reversed for decryption.

An important component of CBC encryption is the unpredictable random IV. If the IV is predictable, then our encryption scheme will become vulnerable to an attack based onmatched plaintext.Coded plaintext attack (CPA)implies that an attacker can obtainciphertexts of arbitrary plaintexts and using them to reveal information about encrypted messages. Therefore, an unpredictable IV providesCPA protection.

I'll try to explain what this might look likeattack. It is possible to perform CPA in the presence of a predictable IV due to the properties of XOR. The result of an XOR operation on two identical values ​​(for example, 0101 ⊕ 0101) will always be zero. Therefore, if you suspect that the ciphertext c[0] corresponds to a certain plaintext m[0], you can test the hypothesis using a predictable IV. If the plaintext was encrypted using IV1 (c[0] = E(k, m[0] ⊕ IV1)), then we can encrypt the new plaintext and see if the result matches c[0]. Since IV can be predicted to be IV2, you do m[0] ⊕ IV1 ⊕ IV2. CBC will XOR this input and the next IV (IV2): c[1] = E(k, m[0] ⊕ IV1 ⊕ IV2 ⊕ IV2). Therefore, the two IV2s cancel out and we again encrypt E(k, IV1 ⊕ m[0]), which again gives c[0], in which case we can guess what was previously encrypted with IV1.

Finally, I would like to consider another block cipher mode: the randomized counter mode (CTR). This is the last, safest mode and is also more efficient than CBC.

: Dan Boneh

The randomized counter mode also usesa random IV, but it serves a different purpose here. Our key is added (for example, via AES) with the iterated version of our IV: for example, in each iteration, you can increase the IV by one so that the result does not repeat. We do this until we get a key the same length as our plaintext. Now, as with the one-time-key stream cipher, we XOR our plaintext and pseudo-random key to get the ciphertext. If your computer has multiple AES processors, this is very efficient as it can run in parallel. In CBC, each ciphertext depends on the previous block, so it is impossible to compute it in parallel.

To get a pseudo-random key from additionIV and our private key, a block cipher is optional. Block ciphers must be reversible. If you carefully consider the mechanism of the randomized counter mode, you will notice that you do not need to perform F (k, IV) in reverse order for decryption. Given the properties of XOR, it is enough to take the same pseudo-random key and XOR it with our ciphertext. That is, to decrypt, you need to repeat the operation, and not carry it out in the reverse order.

More abstractly (so far I have avoidedabstract concepts), this means that the procedure we use to add our private key and IV (F (k, IV)) must be a pseudo-random function (PRF), not a pseudo-random permutation (PRP). We've actually come across these concepts in this article. Both PRF and PRP are deterministic procedures that give a pseudo-random output (AES, XOR) for a given input. However, PRP has a strict reversibility requirement. As a matter of fact, the concepts PRP and block cipher (eg AES) are often used interchangeably. But PRF should NOT be reversible.

So this is where our review ends.symmetric encryption. We've covered stream and block ciphers. Then, since block ciphers are only capable of processing 16 bytes at a time, we talked about the modes of applying them to large plaintext. Finally, we also explained the difference between PRP and PRF.

</p>