Maceió — Winter 2019
July 2020
In this course, you will learn about
The recommended reading will introduce you to the most important authors and articles in cryptology.
On completion of this chapter, you will have learned …
Cryptography serves to protect information by encryption (or enciphering), the shuffling of data (that is, the transformation of intelligible into indecipherable data) that only additional secret information, the key, can feasibly undo it (decryption or deciphering).
encrypt/encipher: to shuffle data so that only additional secret information can feasibly undo it.
key: the additional secret information that is practically indispensable to decrypt.
decrypt/decipher: to invert encryption.
That is, the shuffled (enciphered) data can practically only be recovered (deciphered) by knowledge of the key. Since the original data is in principle still recoverable, it can be thought of as concealment.
Because historically only written messages were encrypted, the source data, though a string of 1
s and 0
s (the viewpoint adopted in symmetric cryptography) or a number (that adopted in asymmetric cryptography), is called plaintext and the encrypted data the ciphertext.
plaintext respectively ciphertext: the data to be encrypted respectively the encrypted data.
Historically, the key to reverse this transformation (of intelligible data into indecipherable data) was both necessary to decipher and to encipher, symmetric encryption. That is, in the past, the key used to encrypt and decrypt was always the same: Symmetric cryptography had been used by the Egyptians almost 2000 years before Christ, and was used, for example,
Engima
machine, andAES
algorithm).symmetric cryptography: cryptography is symmetric when the same key is used to encrypt and decrypt.
In the 70s asymmetric cryptography was invented, in which the key to encipher (the public key) and the key to decipher (the secret or private key) are different.
asymmetric cryptography: cryptography is asymmetric when different keys are used to encrypt and decrypt. The key to encipher is public and the key to decipher is private (or secret).
In fact, only the key to decipher is private, kept secret, while the key to encrypt is public, known to everyone. In comparison with symmetric cryptography, asymmetric encryption avoids the risk of compromising the key to decipher that is involved
On top, It is useful, that the keys exchange their roles, the private key enciphers, and the public one deciphers, a digital signature: While the encrypted message will no longer be secret, every owner of the public key can check whether the original message was encrypted by the private key.
Nowadays such asymmetric cryptography algorithms are ubiquitous on the Internet: Examples are
RSA
which is based on the difficulty of factoring in prime numbers, orECC
which is based on the difficulty of computing points in finite curves,which protect (financial) transactions on secure sites (those indicated by a padlock in the browser’s address bar).
Up to the digital age, cryptography mainly studied the transformation of intelligle text into indecipherable text. Since then, cryptography studies the transformation of processible (digital) data into indecipherable (digital) data. This data is, for example, a digital file (text, image, sound, video, …). It is considered a bit sequence (denoted by a string of 0s and 1s) or byte sequence (denoted by a string of hexadecimal pairs 00, 01, …, FE, FF) or a number (denoted as usual by their decimal expansion 0, 1, 2, 3 …). Let us recall that every $1011...$ bit sequence is a number $n$ via its binary expansion $n = 1 + 0 \cdot 2 + 1 \cdot 2^2 + 1 \cdot 2^3 + \cdots$ (and vice versa).
The point of view of a sequence of bits (or, more exactly, of hexadecimal digits whose sixteen symbols 0
– 9
and A
– F
correspond to a group of four bits) is preferred in symmetric cryptography whose algorithms transform them, for instance, by permutation and substitution of their digits. The point of view of a number is preferred in asymmetric cryptography whose algorithms operate on it by mathematical functions such as raising to a power (raising to a power) and exponentiation.
The key, the additional secret information, can take various form; which form is mainly a question of convenience, most common are:
For example, in the ancient Scytale algorithm (see Section 2) that uses a role of parchment wrapped around a stick, the key consists of the circumference (in letters) of the stick, a small number. Nowadays, PIN
codes (= Personal Identification Number) or passwords are ubiquitous in day-to-day life; to facilitate memorization the memorization of complete secret sentences (= pass phrases) is encouraged.
Asymmetric encryption depends on larger keys and therefore stores them in files (of 64-letter texts, called ASCII
-armor) of a couple of kilobytes. For example (where ...
indicates tens of skipped lines):
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: SKS 1.1.6
Comment: Hostname: pgp.mit.edu
mQENBFcFAs8BCACrW3TP/ZiMRQJqWP0SEzXqm2cBZ+fyBUrvcu1fGU890pd4
3JdiWIreHx/sbJdW1wjABeW8xS1bM67nLW9VVHUPLi9QP3VGfmqmXqbWIB7O
...
-----END PGP PUBLIC KEY BLOCK-----
The prefix Crypto-, comes from Greek kryptós, “hidden.”
Cryptography (from the Greek gráphein, “to write”) is the art of hidden writing: shuffling information so that it is indecipherable to all but the intended recipient.
Cryptography: the art of transforming information so that it is indecipherable to all but the intended recipient.
That is, cryptography is the art of transforming information such that it is incomprehensible to all but the intended recipient. Useful, since Antiquity, for example to conceal military messages from the enemy. Since then, (electronic binary) data has replaced text, and what used to be concealing written messages exchanged by messengers or kept secret has become symmetric cryptography: securing data flowing between computers or stored on a computer.
Since the 70s, asymmetric cryptography makes it possible (by digital signatures) to verify the identities of participants and undeniably (non-repudiation) register their transactions in electronic commerce.
Cryptographic methods (or Ciphers) are generically classified
AES
, or different keys to encrypt and decrypt (asymmetric, or two-key, or public-key) cipher, such as RSA
or ECC
.Among the symmetric ciphers, these are generically classified
AES
or RSA
) or single bits (stream ciphers such as RC4
): While stream ciphers typically are simpler, faster and predestined for real time transmissions, they tend to be less secure and are therefore less commonly used (for example, a Wi-Fi network is commonly secured by a block cipher such as AES
).Cryptanalysis (from the Greek analýein, “to unravel”) is the art of untying the hidden writing: the breaking of ciphers, that is, recovering or forging enciphered information without knowledge of the key.
Cryptanalysis: the art of deciphering ciphertext without knowledge of the key.
Cryptanalysis (colloquially “code breaking”) is the art of deciphering the enciphered information without knowledge of the secret information, the key, that is normally required to do so; usually by finding a secret key.
Cryptanalysis of public-key algorithms relies on the efficient computation of mathematical functions on the integers. For instance, cryptanalysis of the most famous public-key algorithm, RSA
, requires the factorization of a number with $> 500$ decimal digits into its prime factors, which is computationally infeasible (without knowledge of the key).
Cryptanalysis of symmetric ciphers depends on the propagation of patterns in the plaintext to the ciphertext. For example, in a monoalphabetic substitution cipher (in which each letter is replaced by another letter, say A
by Z
), the numbers of occurrences with which letters occur in the plaintext alphabet and in the ciphertext alphabet are identical (if A
occurred ten times, then so does Z
). If the most frequent letters of the plaintext can be guessed, so those of the ciphertext.
A powerful technique is Differential cryptanalysis that studies how differences (between two plaintexts) in the input affect those at the output (of the corresponding ciphertexts). In the case of a block cipher, it refers to tracing the (probabilities of) differences through the network of transformations. Differential cryptanalysis attacks are usually Chosen-plaintext attacks, that is, the attacker can obtain the corresponding ciphertexts for some set of plaintexts of her choosing.
Cryptology (from the Greek lógos, “word,” “reason,” “teaching” or “meaning”) is the science of hiding, the science of trusted communication which embraces cryptography and cryptanalysis; according to Webster (1913) it is “the scientific study of cryptography and cryptanalysis.” Though cryptology is often considered a synonym for cryptography and occasionally for cryptanalysis, cryptology is the most general term.
Cryptology: the science of trusted communication, including cryptography and in particular cryptanalysis.
Secrecy, though still important, is no longer the sole purpose of cryptology since the advent of public-key cryptography in the 80s. To replace by electronic devices what had historically been done by paperwork, digital signatures and authentication were introduced.
Adjectives often used synonymously are secret, private, and confidential. They all describe information which is not known about by other people or not meant to be known about. Something is
Frequently confused, and misused, terms in cryptology are code and cipher, often employed as though they were synonymous: A code is a rule for replacing one information symbol by another. A cipher likewise, but the rules governing the replacement (the key) are a secret known only to the transmitter and the legitimate recipient.
A codification or an encoding is a rule for replacing one bit of information (for example, a letter) with another one, usually to prepare it for processing by a computer.
encoding: a rule for replacing one bit of information (for example, a letter) with another one, usually to process it by a computer.
For example,
ASCII
code) from 1963 that represents on computers 128
characters (and operations such as backspace and carriage return) by seven-bit numbers, that is, by sequences of seven 1s and 0s. For example, in ASCII a lowercase a
is always 1100001
, a lowercase b
is always 1100010
, and so on (whereas an uppercase A
is always 1000001
, an uppercase B
is always 1000010
).UTF-8
(8-bit Unicode Transformation Format) is a variable-length character encoding by Ken Thompson and Rob Pike to represent any universal character in the Unicode standard (which possibly has up to $2^2 \approx4$ billion characters, and includes the alphabets of many languages, such as English, Chinese, …, as well as meaningful symbols such as emoticons) by a sequence of between 1
up to 4
bytes, and which is backwards compatible with ASCII
, and is becoming the de facto standard.A cipher, like an encoding, also replaces information (which may be anything from a single bit to an entire sequence of symbols) with another one. However, the replacement is made according to a rule defined by a key so that anyone without its knowledge cannot invert the replacement.
cipher: a rule for replacing information (for example, a text) so that its inverse is only feasible by knowledge of the key.
Information is frequently both encoded and enciphered: For example, a text is encoded, for example, by ASCII, and then encrypted, for example, by the Advanced Encryption Standard (AES
).
Please distinguish between cryptology, cryptography and cryptanalysis:
Please distinguish between encoding and encryption:
While both encoding and encryption transform information into a computer readable format, only for encryption this transformation is not invertible without knowledge of the key.
A snappy acronym to resume the fundamental aims of information security is the CIA
, which stands for: C
onfidentiality, I
ntegrity and A
vailability. That is, confidentiality of information, integrity of information and availability of information.
CIA: stands for
C
onfidentiality,I
ntegrity andA
vailability.
More comprehensive are “the five pillars of Information Assurance,” that add authentication and non-repudiation: Confidentiality, integrity, availability, authentication and non-repudiation of information.
The five pillars of Information Assurance: are formed by Confidentiality, integrity, availability, authentication and non-repudiation of information.
Cryptography helps to achieve all of these to good effect: Good encryption, as achieved by thoroughly tested standard algorithms such as AES
or RSA
, is practically impossible to break computationally; instead, keys are stolen or plaintext is stolen before encryption or after decryption. While cryptography provides high technical security, human failure, for example, arising out of convenience or undue trust, is the weakest point in information security.
Information that is confidential is meant to be kept secret, that is, should not be disclosed to other people, for example information that is known only by someone’s doctor or bank. In law, confidential is the relation existing between, for example, a client and her counsel or agent, regarding the trust placed in one by the other. In the information security standard ISO/IEC 27002
the International Organization for Standardization (ISO) defines confidentiality as “ensuring that information is accessible only to those authorized to have access.” In IT, it means ensuring that sensitive information stored on computers is not disclosed to unauthorized persons, programs or devices. For example, avoiding that anyone with access to a network can use common tools to eavesdrop on traffic and intercept valuable private information.
Integrity is the state of being whole, the condition of being unified or sound in construction.
(Data) Integrity is about the reliable, complete and error-free, transmission and reception or storage of data: that the original data had not been altered or corrupted; in particular, is valid in accordance with expectations.
When the data has been altered, either through electronic damage by software or physical damage to the disk, the data is unreadable. For example when we download a file we verify its integrity by calculating its hash and comparing with the hash published by source. Without such a check, someone could, for example, package a Trojan horse into an installer on Microsoft Windows (that, as a last resort, hopefully would already be known and detected by an antivirus program such as Microsoft Defender).
Though unrelated to cryptology, in IT security availability of information against threats such as DoS (Denial of Service) attacks (to deny users of the Website access to a Website by flooding it with requests) or accidents, such as power outages, or natural disasters such as earthquakes. To achieve it, it is best to have a safety margin and include redundancy, in particular, to have
Authentic (from Greek authentes, real or genuine) means according to Webster (1913)
Authentication thus is the verification of something (or someone) as “authentic.” This might involve confirming the identity of a person or the origins of an object.
In IT, authentication means
To verify her identity, a person proves that she is who she claims to be by showing some evidence. Devices that are used for authentication include passwords, personal identification numbers, smart cards, and biometric identification systems. For example, to login, she enters her user identification and password.
A common attack is that of the “man in the middle,” where the attacker assumes to either correspondent the identity of the other correspondent. To solve this, certificates, digital signatures by third parties, are used. Either,
OpenPGP
, by signatures among persons known to each other over ends, orVeriSign
.Repudiation is a legal term for disavowal of a legal bind (such as an agreement or obligation); someone who repudiates:
For example, a forged or forced signature is repudiable.
Non-repudiation is the assurance:
In computing, this means that authentication can hardly be refuted afterwards. This is achieved by a digital signature.
For example,
In today’s global economy, where face-to-face agreements are often impossible, non-repudiation is essential for safe commerce.
In practice, is sensitive information obtained by
failure?
What does CIA in IT security stand for: Confidentiality, Integrity and Availability.
Please list the five pillars of information security: Confidentiality, integrity, availability, authentication and non-repudiation of information.
The history of cryptography dates back at least 4000
years. We distinguish three periods:
Till the 20th century, its methods were classic, mainly pen and paper.
In the early 20th century, they were replaced by more efficient and sophisticated methods by complex electromechanical machines, mainly rotor machines for polyalphabetic substitution, such as the Enigma rotor machine used by the axis powers during World War II.
Since then, digitalization, the replacement of analog devices by digital computers, allowed methods of ever greater complexity. Namely, the most tested algorithms are
DES
(or its threefold iteration 3DES
) and its successor AES
for symmetric cryptography,RSA
and its successor ECC
(Elliptic Curve Cryptography) for asymmetric cryptography.From antiquity till World War I, cryptography was carried out by hand and thus limited in complexity and extent to at most a few pages. The principles of cryptanalysis were known, but the security that could be practically achieved was limited without automatization. Therefore, given sufficient ciphertext and effort, cryptanalysis was practically always successful.
The principles of cryptanalysis were first understood by the Arabs. They used both substitution and transposition ciphers, and knew both letter frequency distributions and probable plaintext in cryptanalysis. Around 1412, al-Kalka-shandī gave in his encyclopedia Subīal-aīshī a manual on how to cryptanalyze ciphertext using letter frequency counts with lengthy examples.
A scytale (from Latin scytala) consists of a rod with a band of parchment wound around it on which is written a secret message. It was rolled spirally upon a rod, and then written upon. The secret writing on the strip wound around the rod is only readable if the parchment was to be wound on a rod of the same thickness; It is a transposition cipher, that is, shuffles, or transposes, the letters of the plaintext.
Caesar’s Cipher is one of the simplest and most widely-known chiphers, named after Julius Caesar (100 – 44 BC), who used it to communicate with his generals. It is a substitution cipher that replaces, substitutes, each alphabetic letter of the plaintext by a fixed alphabetic letter. In Caesar’s Cipher, each letter in the plaintext is shifted through the alphabet the same number of positions; that is, each letter in the plaintext is replaced by a letter some fixed number of positions further down the alphabet.
Francis Bacon’s cipher from 1605 is an arrangement of the letters a
and b
in five-letter combinations (of which there are $2^5 = 32$ ) that each represent a letter of the alphabet (of which there are 26
). Nowadays we would call a code, but at the time it illustrated the important principle that only two different signs can be used to transmit any information.
In 1470, Leon Battista Alberti described in Trattati in Cifra (“Treatise on Ciphers”) the first cipher disk to shift the letters of the alphabet cyclically. He recommended changing the offset after every three or four words, thus conceiving a polyalphabetic cipher in which the same alphabetic letters are replaced by different ones. The same device was used more than four centuries later by the U.S. Army in World War I.
The best known cipher of World War I is the German ADFGVX
cipher:
A
, D
, F
, G
, V
, and X
.Invented by Fritz Nebel, it was introduced in March 1918 for use by mobile units. The French Bureau du Chiffre, in particular, Georges Painvin, broke the cipher a month later — still too late as the German attacks had already ceded.
The mechanization of cryptography began after World War I by the development of so-called rotor cipher machines:
These rotors are stacked. The rotation of one rotor causes the next one to rotate $1 / 26$ of a full revolution. (Just like in an odometer where after a wheel has completed a full revolution, the next one advances $1 / 10$ of a full revolution.) In operation, there is an electrical path through all rotors. Closing the key contact of the plaintext letter on a typewriter-like keyboard
In the US, Edward H. Hebern made in 1917 the first patent claim to accomplish polyalphabetic substitution by cascading a collection of monoalphabetic substitution rotors, wiring the output of the first rotor to the input of the following rotor, and so on. In Europe, Already in 1915 such a rotor machine had been built by two Dutch naval officers, Lieut. R.P.C. Spengler and Lieut. Theo van Hengel, and independently by a Dutch mechanical engineer and wireless operator, Lieut. W.K. Maurits. Around the same time as Hebern, Arthur Scherbius from Germany (who filed his patent in February 1918) and Hugo A. Koch from the Netherlands (a year later), also built rotor machines, which were commercialized and evolved into the German Enigma used in World War II.
In Japan, the Japanese Foreign Office put into service its first rotor machine in 1930, which was cryptanalyzed in 1936, using solely the ciphertexts, by the U.S. Army’s Signal Intelligence Service (SIS). (In 1939 a new cipher machine was introduced, in which rotors were replaced by telephone stepping switches, but readily broken by the SIS again solely relying on ciphertext; even more so, their keys could be foreseen.)
Arthur Scherbius was born in Frankfurt am Main on 20 October 1878 as son of a businessman. After studying at the Technical College in Munich, he completed his doctoral dissertation at the Technical College in Hanover in 1903, then worked for several major electrical companies in Germany and Switzerland. In 1918, he submitted a patent for a cipher machine based on rotating wired wheels and founded his own firm, Scherbius and Ritter. Since both the imperial navy and the Foreign Office declined interest, he entered the commercial market in 1923 and advertised the Enigma machine, as it was now called, in trade publications or at the congress of the International Postal Union. This sparked again the interest of the German navy in the need for a secure cipher, and a slightly changed version was in production by 1925. Still, the corporation continued to struggle for profitability because commercial as public demand was confined to a few hundred machines. While Scherbius fell victim to a fatal accident involving his horse-drawn carriage, and died in 1929, his corporation survived and by 1935 amply supplied the German forces under Hitler’s rearmament program.
Polish and British cryptanalysis solved the German Enigma cipher (as well as two telegraph ciphers, Lorenz-Schlüsselmaschine and Siemens & Halske T52). To this end
Hans-Thilo Schmidt, decorated with an Iron Cross in World War I, worked as a clerk at a cipher office (previously lead by his brother). In June 1931, he contacted the intelligence officer at the French embassy and agreed with Rodolphe Lemoine to reveal information about the Enigma machine, copies of the instruction manual, operating procedures and lists of the key settings. However, French cryptanalysts made little headway, and the material was passed on to Great Britain and Poland, whose specialists had more success:
The commercial version of the Enigma had a rotor at the entry and his wiring was unknown. However, the Polish cryptanalyst Marian Rejweski, guided by the German inclination for order, found out that it did not exist in the military version; what is more, he inferred the internal wirings of the cylinders by distance, that is, by mere cryptanalysis of the enciphered messages.
The Britisch cryptanalysts in Bletchley Park (among them the mathematician Alan Turing, a founding father of theoretical computer science) could reduce by likely candidates the number of possible keys from 150
trillions to around a million, a number that allowed a work force of around 4200
(among them $80\%$ women) an exhaustive key-search with the help of the Turing Bomb, an ingenious electromechanical code-breaking machine that imitated a simultaneous run of many Enigma machines and efficiently checked the likelihood of their results.
Schmidt continued to inform the Allies throughout war till the arrest (and confession) of Lemoine in Paris which lead to that of Schmidt by the Gestapo in Berlin on 1 April 1943 and his death in prison.
After World War II, cryptographic machines stayed conceptually the same till the early 80s: faster rotor machines in which rotors had been replaced by electronic substitutions, but still merely concatenating shifted monoalphabetic substitutions to obtain a polyalphabetic substitution.
However, such letter per letter substitutions are still linear over the letters, so the ciphertext obtained from a plaintext will reveal how to decrypt all letters of a plaintext of (at most) the same length. That is, a letter per letter substitution diffuses little, that is, hardly spreads out changes; optimal diffusion is attained whenever the change of a single letter of the plaintext causes the change of half of the letters of the ciphertext. If the attacker has access to the ciphertexts of many plaintexts, possibly of his own choosing, then he can obtain the key by the ciphertexts of two plaintexts that differ in a single position.
Instead, computers made it possible to combine such substitutions (such as Caesar’s Cipher) with transpositions (such as the Scytale), achieving far better diffusion, which lead to the creation of one of the most widely used ciphers in history, the Data Encryption Standard (DES
), in 1976.
In January 1997 the U.S. National Institute of Standards and Technology (NIST; former National Bureau of Standards, NBS) announced a public contest for a replacement of the aging DES
, the Advanced Encryption Standard (AES
). Among 15 viable candidates from 12 countries, in October 2000 Rijndael, created by two Belgian cryptographers, Joan Daemen and Vincent Rijmen, was chosen and became the AES
.
Since improvements in computing power allowed to find the fixed 56
-bit DES key by exhaustive key-search (brute force), the NIST specifications for the AES demanded an increasing key length, if need be. Rijndael not only was shown immune to the most sophisticated known attacks such as differential cryptanalysis (in Daemen & Rijmen (1999) and Daemen & Rijmen (2002)) and of an elegant and simple design, but is also both small enough to be implemented on smart cards (at less than 10 000
bytes of code) and flexible enough to allow longer key lengths.
Since the ’80s, the advent of public-key cryptography in the information age made digital signatures and authentication possible; giving way to electronic information slowly replacing graspable documents.
Asymmetric encryption was first suggested publicly at Diffie & Hellman (1976).
Conceptually it relies on a trap function (more specifically, in op.cit. the modular exponential), an invertible function that is easily computable but whose inverse is hardly computable in the absence of additional information, the secret key.
To encrypt, the function is applied; to decrypt its inverse with the secret key. For example, in the approach according to Diffie and Hellman, this function is the exponential, however, over a different domain than the real numbers we are used to.
In fact, Diffie & Hellman (1976) introduced only a scheme for exchanging a secret key through an insecure channel. It was first put it into practice
RSA
cryptographic algorithm was introduced, orElGamal
algorithm, more recent, but the closest example of the original scheme.Not only do these algorithms enable ciphering by a public key (thus removing the problem of its secret communication), but, by using the private key instead to encipher, made possible digital signatures, which might have been its commercial breakthrough. These algorithms still stand strong, but others, such as elliptic curve cryptography, are nowadays deemed more efficient at the same security.
26
including the trivial one.Kerckhoff principle postulates the independence of a cryptographic algorithm’s security from its secrecy:
Kerckhoffs’ principle: The ciphertext should be secure even if everything about it, except the key, is public knowledge.
While knowledge of the key compromises a single encryption, knowledge of the algorithm will compromise all encryptions ever carried out. A public algorithm guarantees the difficulty of decryption depending only on the knowledge of the key, but not on the algorithm. The more it is used, the more likely it becomes that the algorithm will be eventually known. For the algorithm to be useful, it thus needs to be safe even though it is public.
Claude Shannon (1916 – 2001) paraphrased it as: “the enemy knows the system,” Shannon’s maxim. The opposite would be to rely on a potentially weak, but unknown algorithm, “security through obscurity”; ample historic evidence shows the futility of such a proposition (for example, the above ADFGVX cipher of Section 1.3 comes to mind).
Shannon’s principles of
give criteria for an uninferable relation between the ciphertext and
Ideally, when one letter in the key respectively in the plaintext changes, then half of the ciphertext changes, that is, each letter in the ciphertext changes with a probability of 50%. While the output of the cipher, the ciphertext, depends deterministically on the input, the plaintext, and the key, the algorithm aims to obfuscate this relationship to make it as complicated, intertwined, scrambled as possible: each letter of the output, of the ciphertext, depends on each letter of the input, of the plaintext, and of the key.
DES
and AES
.Cryptography protects information by shuffling data (that is, transforming it from intelligible into indecipherable data) so that only additional secret information, the key, can feasibly reverse it. Up to the end of the ’70s, the key used to encrypt and decrypt was always the same: symmetric or (single-key) cryptography. In the 70s asymmetric cryptography was invented, in which the key to encipher (the public key) and the key to decipher (the secret or private key) are different. In fact, only the key to decipher is private, kept secret, while the key to encrypt is public, known to everyone. When the keys exchange their roles, the private key enciphers, and the public one deciphers, then the encryption is a digital signature. while the encrypted message will no longer be secret, every owner of the public key can check whether the original message was encrypted by the private key. Because historically only written messages were encrypted, the source data, though a stream of 1s and 0s (the viewpoint adopted in symmetric cryptography) or a number (that adopted in asymmetric cryptography), is called plaintext and the encrypted data the ciphertext.
The security
RSA
, requires the factorization of a number with $> 500$ decimal digits into its prime factors, which is computationally infeasible (without knowledge of the key);Good encryption, as achieved by standard algorithms such as AES
or RSA
, is practically impossible to break computationally; instead, keys are stolen or plaintext is stolen before encryption or after decryption.
A hash function is an algorithm that generates an output of fixed (byte) size (usually around 16
to 64
bytes) from an input of variable (byte) size, for example, a text or image file, a compressed archive. The output string of fixed length that a cryptographic hash function produces from a string of any length (an important message, say) is a kind of inimitable “signature.” A person who knows the “hash value” cannot know the original message, but only the person who knows the original message can prove that the “hash value” is produced from that message.
The article Simmons & others (2016) gives a good summary of cryptology, in particular, historically; read its introduction and the section on history. As does the first chapter of Menezes et al. (1997), which focuses more on the techniques. The most recent work is Aumasson (2017), and a concise but demanding overflight of modern cryptography. Get started by reading its first chapter as well.
Some classics are Frederick) Friedman (1976) which is a manual for cryptanalysis for the U.S. military, originally not intended for publication.
The books Kahn (1996) and Singh (2000) trace out the history of cryptanalysis in an entertaining way.
The book Schneier (2007) is a classic for anyone interested in understanding and implementing modern cryptographic algorithms.
On completion of this chapter, you will have learned …
… that the two fundamental symmetric cryptographic algorithms are
… that the only cryptographically perfectly secure cipher is the one-time pad in which the key is as long as the plaintext
… that modern algorithms like DES
and AES
are Substitution and Permutation Networks that break the plaintext up into short blocks of the same size as the key and, on each block, iterate
Up to the end of the ’70s, before the publication of Diffie & Hellman (1976) and Rivest et al. (1978), all (known) cryptographic algorithms were symmetric (or single-key), that is, used the same key to encipher and decipher. Thus every historic algorithm, as sophisticated as it may be, be it Caesar’s Cipher, the Scytale or the Enigma, was symmetric.
While asymmetric algorithms depend on a computationally difficult problem, such as the factorization of a composed number into its prime factors, and regard the input as a natural number, symmetric ones operate on the input as a string (of bits or letters) by (iterated) substitutions and transpositions.
The only perfectly secure cipher is the one-time pad in which the key is as long as the plaintext and the ciphertext is obtained by adding, letter by letter, each letter of the key to the corresponding (that is, at the same position) letter of the plaintext.
However, such a large key is impractical for more complex messages, such as text, image or video files: In modern times, it means that to encrypt a hard drive, another hard drive that carries the key is needed.
To compensate the shorter key length, modern algorithms, ideally, create so much intertwining that they achieve almost perfect diffusion, that is, the change of a single bit of the input or key causes the change of around half of the output bits. Modern algorithms, such as DES
or AES
, are substitution and permutation network block ciphers, meaning that they encrypt one chunk of data at a time by iterated transpositions and substitutions.
The two basic operations to encrypt are transposition and substitution:
The historical prototypical algorithms for these two operations are:
A
as D
, B
as E
, and so forth, andWe will see that even with many possible keys an algorithm, such as that given by any permutation of the alphabet which has almost $2^{80}$ keys, can be easily broken if it preserves regularities, like the frequency of the letters.
As a criterion for security, there is that of diffusion by Shannon: Ideally, if a letter in the plaintext changes, then half of the letters in the ciphertext changes. Section 2.2 will show how modern algorithms, called substitution and permutation networks, join and iterate these two complementary prototypical algorithms to reach this goal.
ideal diffusion (according to Shannon): if a bit in the plaintext or key changes, then half of the bits in the ciphertext changes.
In a substitution cipher the key determines substitutions of the plaintext alphabet (considered as a set of units of symbols such as single letters or pairs of letters) by the ciphertext alphabet. For example, if the units of the plaintext and ciphertext are both the letters of the Latin alphabet, then a substitution permutes the letters of the Latin alphabet. If the substitution cipher is monoalphabetic (such as Caesar’s Cipher), then the same substitution is applied to every letter of the plaintext independent of its position. If the substitution cipher is polyalphabetic (such as the Enigma), then the substitution varies with the position of the letter in the plaintext. To encrypt, each alphabetical unit of the plaintext is replaced by the substituted alphabetical unit, and inversely to decrypt.
Substitution Cipher: a cipher that replaces each alphabetical unit of the plaintext by a corresponding alphabetical unit.
Every monoalphabetic substitution cipher, that is, every plaintext symbol is always encrypted into the same ciphertext symbol, is insecure: the frequency distributions of symbols in the plaintext and in the ciphertext are identical, only the symbols having been relabeled. Therefore, for example in English, around 25
letters of ciphertext suffice for cryptanalysis.
The main approach to reduce the preservation of the single-letter frequencies in the ciphertext is to use several cipher alphabets, that is, polyalphabetic substitution.
The simplest substitution cipher is a cyclical shift of the plaintext alphabet; Caesar’s cipher.
Caesar’s Cipher A substitution cipher that shifts the alphabetical position of every plaintext letter by the same distance.
This method was used by Roman emperors Caesar (100 – 44 B.C.) and Augustus (63 – 14 B.C.): fix a distance $d$ between letters in alphabetical order, that is, a number between 0 and 25, and shift (forward) each letter of the (latin) alphabet by this distance $d$ . We imagine that the alphabet is circular, that is, that the letters are arranged in a ring, so that the shift of a letter at the end of the alphabet results in a letter at the beginning of the alphabet.
For example, if $d = 3$ , then $A \mapsto D, B \mapsto E, ..., W \mapsto Z, X \mapsto A, ..., Z \mapsto C.$
There are 26
keys (including the trivial key $d = 0$ ).
To decipher, each letter is shifted by the negative distance $-d$ , that is, $d$ positions backwards. If the letters of the alphabet form a wheel, then the letters are transferred
By the cyclicity of the letter arrangement, we observe that a transfer of $d$ positions in counterclockwise direction equals one of $26-d$ positions in clockwise direction.
Instead of replacing each letter by one shifted by the same distance $d$ , let us replace each letter with some letter, for example:
A | B | … | Y | Z |
↓ | ↓ | … | ↓ | ↓ |
E | Z | … | G | A |
To revert the encipherment, never two letters be sent to the same letter! That is, we shuffle the letters among themselves. This way we obtain $26 \cdot 25 \cdot s1 = 26! > 10^6$ keys (which is around the number of passwords with 80
bits).
A transposition (or permutation) cipher encrypts the plaintext by permuting its units (and decrypts by the inverse permutation). Each alphabetical unit stays the same; the encryption depends only on the positions of the units.
Transposition Cipher: Transpose the alphabetical units of the plaintext.
The Scytale or Licurgo’s Baton (= a Spartan legislator around 800 B.C.) is a cipher used by the Spartans, as follows:
The letters thus transposed on the strip could only be deciphered by a stick with the same circumference (and being long enough) in the same way as the text was encrypted:
Here, the key is given by the stick’s circumference, that is, the number of letters that fit around the stick.
For example, if the stick has a circumference of 2
letters (and a length of 3
letters), the two rows
B | I | G |
S | U | M |
become the three rows
B | S |
I | U |
G | M |
which, once concatenated (to reveal neither the circumference nor the length), become
B | S | I | U | G | M |
Let us apply the established security criteria to the substitution ciphers:
This simple substitution cipher violates all desirable qualities: For example, Kerckhoff’s principle that the algorithm be public: Once the method is known, considering the small amount of 25 keys, the ciphertext gives way in short time to a brute-force attack:
brute-force attack: an exhaustive key-search that checks each possible key.
A substitution by any permutation of the letters of the alphabet, such as,
A | B | … | Y | Z |
↓ | ↓ | … | ↓ | ↓ |
E | Z | … | G | A |
has $26 \cdot 25 \cdots 1 = 26! > 10^{26}$ keys, so a brute-force attack is computationally infeasible.
But it violates the goals of diffusion and confusion. If the key (= permutation of the alphabet) exchanges the letter $\alpha$ for the letter $\beta$ , then there’s
In fact, the algorithm allows statistical attacks on the frequency of letters, bigrams (= pairs of letters) and trigrams (= triples of letters). See Section 12.1.
Also the scytale is weak in any sense given by the security principles. It violates
In fact, the maximum value of the circumference of the stick in letters is $< n/2$ where $n$ = the number of letters in the ciphertext. So a brute-force attack is feasible.
It has
In fact, the algorithm is prone to statistical attacks on the frequency of bigrams (= pairs of letters), trigrams (= triples of letters), and higher tuples. For example, a promising try would be the choice of circumference as number $c$ that maximizes the frequency of the ‘th
’ bigram between the letter strings at positions $1, 1 + c, 1 + 2c, ...$ , $2, 2 + c, 2 + 2c, ...$ . For example, if we look $\text{TEHMHTUB}$ we notice that T
and H
are one letter apart, which leads us to the guess that the circumference is three letters, $c = 3$ , yielding the decipherment $\text{THE THUMB}.$
A product cipher composes ciphers, that is, if the product is two-fold, then the output of one cipher is the input of the other.
product cipher: a composition of ciphers where the output of one cipher serves as the input of the next.
The ciphertext of the product cipher is the ciphertext of the final cipher. Combining transpositions only with transpositions or substitutions only with substitutions, the obtained cipher is again a transposition or substitution, and hardly more secure. However, mixing them, a transposition with substitutions, indeed can make the cipher more secure.
A fractionation cipher is a product cipher that:
The most famous fractionation cipher was the ADFGVX
cipher used by the German forces during World War I:
A | D | F | G | V | X | |
---|---|---|---|---|---|---|
A | a | b | c | d | e | f |
D | g | h | i | j | k | l |
F | m | n | o | p | q | r |
G | s | t | u | v | w | x |
V | y | z | 0 | 1 | 2 | 3 |
X | 4 | 5 | 6 | 7 | 8 | 9 |
A
, D
, F
, G
, V
, and X
that indicate the row and column of the letter or digit.However, it was cryptanalyzed within a month by the French cryptanalyst Georges J. Painvin in 1918 when the German army entered in Paris. We will see in Section 2.2 how modern ciphers refine this idea of a product cipher to obtain good diffusion.
Classic ciphers usually replaced single letters, sometimes pairs of letters. Systems that operated on trigrams or larger groups of letters were regarded as too tedious and never widely used.
Instead, it is safer to substitute a whole block (of letters instead of a single letter, say) according to the key. However, the alphabet of this replacement would be gigantic, so this ideal is practically unattainable, especially on hardware as limited as a smart card with an 8
bit processor. For a block of, for example, $4$ bytes, this substitution table would already have a $4$ gigabytes (= $2^{32} \cdot 4$ bytes). However, in modern single-key cryptography a block of information commonly has $16$ bytes, about 27 alphabetic characters (whereas two-key cryptography based on the RSA algorithm commonly uses blocks of $256$ bits, about $620$ alphabetic characters).
Instead, for example, AES
only replaces each byte, each entry in a block, a replacement table of $2^8 = 256$ entries of $1$ byte (and afterwards transposes the entries.) We will see that these operations complement each other so well that they are practically as safe as a substitution of the whole block.
A block cipher partitions the plaintext into blocks of the same size and enciphers each block by a common key: While a block could consist of a single symbol, normally it is larger. For example, in the Data Encryption Standard the block size is 64
bits and in the Advanced Encryption Standard 128
bits.
stream cipher versus block cipher: a stream cipher operates on single characters (for example, single bytes) while a block cipher operates on groups of characters (for example, each
16
bytes large)
A stream cipher partitions the plaintext into units, normally of a single character, and then encrypts the $i$ -th unit of the plaintext with the $i$ -th unit of a key stream. Examples are the one-time pad, rotor machines (such as the Enigma) and DES
used in Triple DES
(in which the output from one encryption is the input of the next encryption).
In a stream cipher, the same section of the key stream that was used to encipher must be used to decipher. Thus, the sender’s and recipient’s key stream must be synchronized initially and constantly thereafter.
A Feistel Cipher (after Horst Feistel, the inventor of DES
) or a substitution and permutation network (SPN
) groups the text (= byte sequence) into $n$ -byte blocks (for example, $n = 16$ for AES
and enciphers each block by iteration (for example, 10
times in AES
, and 5
times in our prototypical model) of the following three steps, in given order:
XOR
) the key,Substitution and Permutation Network: a cipher that iteratively substitutes and permutes each block after adding a key.
That is, after
One-time pad
,are applied
AES
algorithm (each byte, pair of hexadecimal letters, by another), andAES
, that groups the text into a $4\times 4$ square (whose entries are pairs of hexadecimal letters), the entries in each row (and the columns) are permuted.These two simple operations,
complement each other well, that is, they generate high confusion and diffusion after a few iterations. In the first and last round, the steps before respectively after the addition of the key are omitted because they do not increase the cryptographic security: Since the algorithm is public (according to Kerckhoff’s principle), any attacker is capable of undoing all those steps that do not require knowledge of the key.
Though seemingly a Feistel Cipher differs from classical ciphers, it is after all a product cipher, made up of transpositions and substitutions.
The Data Encryption Standard (DES), was made a public standard in 1977 after it won public competition announced by the U.S. National Bureau of Standards (NBS; now the National Institute of Standards and Technology, NIST). IBM (International Business Machines Corporation) submitted the patented Lucifer algorithm invented by one of the company’s researchers, Horst Feistel, a few years earlier (after whom the substitution and permutation network was labelled Feistel Cipher). Its internal functions were slightly altered by the NSA (and National Security Agency) and the (effective) key size shortened from 112 bits to 56 bits, before it became officially the new Data Encryption Standard.
DES: Block cipher with an effective key length of
56
bits conceived by Horst Feistel from IBM that won a U.S. National competition to become a cryptographic standard in 1977.
DES
is a product block cipher of 16 iterations, or rounds, of substitution and transposition (permutation). Its block size and key size is 64 bits. However, only 56 of the key bits can be chosen; the remaining eight are redundant parity check bits.
As the name of its inventor Horst Feistel suggests, it is a Feistel Cipher, or substitution and permutation network, similar to the prototype discussed above. It groups the text (= byte sequence) into 32
-bit blocks with sub-blocks of 4
bits and enciphers each block in 16
iterations of the following three steps, called the Feistel function, for short F-function, in given order:
add (XOR
) the key,
substitution of each 4
-bit sub-blocks of the block by the S
-box (in hexadecimal notation), and
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
E | 4 | D | 1 | 2 | F | B | 8 | 3 | A | 6 | C | 5 | 9 | 0 | 7 |
permutation of all the sub-blocks.
At each round $i$ , the output from the preceding round is split into the 32 left-most bits, $L(i)$ , and the 32 right-most bits, $R(i)$ . $R(i)$ will become $L(i+1)$ , whereas $R(i + 1)$ is the output of a complex function, $L(i) + f(R(i), K(i + 1))$ whose input is
This process is repeated 16 times.
Essential for the security of DES is the non-linear S
-box of the F
-function $f$ specified by the Bureau of Standards; it is not only non-linear, that is, $f(A) + f(B) \neq f(A + B)$ but maximizes confusion and diffusion as identified by Claude Shannon for a secure cipher in Section 2.1.
The security of the DES
like any algorithm is no greater than the effort to search $2^{56}$ keys. When introduced in 1977, this was considered an infeasible computational task, but already in 1999 a special-purpose computer achieved this in three days. A workaround, called “Triple DES
” or 3DES, was devised that effectively gave the DES
a 112-bit key (using two normal DES
keys).
3DES: Triple application of
DES
to double the key size of the DES algorithm.
(Which is after all the key size for the algorithm originally proposed by IBM for the Data Encryption Standard.) The encryption would be $E(1) \circ D(2) \circ E(1)$ while decryption would be $D(1) \circ E(2) \circ D(1)$ , that is, the encryption steps are:
while the decryption steps are:
If the two keys coincide, then this cipher becomes an ordinary single-key DES; thus, triple DES is backward compatible with equipment implemented for (single) DES.
DES
is the first cryptographic algorithm to fulfill Kerckhoff’s principle of being public: every detail of its implementation is published. (Before, for example, the implementation records of the Japanese and German cipher machines in World War II were released only half a century after their cryptanalysis.)
Shortly after its introduction as a cryptographic standard, the use of the DES algorithm was made mandatory for all (electronic) financial transactions of the U.S. government and banks of the Federal Reserve. Standards organizations worldwide adopted the DES, turning it into an international standard for business data security. It only waned slowly after its successor AES
was adopted around 2000 (after its shortcomings became more and more apparent and could only be worked around, by provisional means such as 3DES
).
What key size does DES use?
Name a cryptographic weakness of DES: Short key length.
What does 3DES stand for? Tripe DES, that is, tripe application of DES.
What key size does 3DES use?
In January 1997 the U.S. National Institute of Standards and Technology (NIST; former National Bureau of Standards, NBS) announced a public contest (National Institute for Standards and Technology (2000)) for an Advanced Encryption Standard (AES) to replace the former symmetric encryption standard, the Data Encryption Standard (DES
). Since improvements in computing power allowed to find the fixed 56-bit DES key by exhaustive key-search (brute force) in a matter of days, the NIST specifications for the AES
demanded an ever increasable key length, if ever need be. The winner of this competition, the algorithm that became the AES
, was Rijndael
(named after its creators Vincent Rijmen and Joan Daemen):
Rijndael
: 86 positive votes, 10 negative votes.Serpent
: 59 votes in favour, 7 against.Twofish
: 31 positive, 21 negative votesRC6
: 23 positive, 37 negative votesMARS
: 13 votes in favour, 84 against.AES: Substitution and Permutation network with a (variable) key length of usually
128
bits conceived by Vincent Rijmen and Joan Daemen that won a U.S. National competition to become a cryptographic standard in 2000 and succeedDES
.
The creators of AES
could demonstrate in Daemen & Rijmen (1999) that these two operations complement each other so well that, after several iterations, they almost compensate for the absence of a replacement of the entire block by another. For a more detailed source, see Daemen & Rijmen (2002).
As was the case with DES
, the AES
, decades after its introduction, still stands strong against any attacks of cryptanalysis, but foreseeably will not yield to developments in computing, as happened to the DES
, also thanks to its adjustable key size.
Among the competitors of public contest by the NIST, none of them stood out for its greater security, but Rijndael
for its simplicity, or clarity, and in particular computational economy in implementation. Since this algorithm is to be run everywhere, for example on 8-bit smart card processors (smartcards
), the decision was made in favour of Rijndael
. Rijndael
not only was secure, but thanks to its elegant and simple design, also both small enough to be implemented on smart cards (at less than 10,000 bytes of code).
To this day, this algorithm remains unbroken and is considered the safest; there is no need for another standard symmetric cryptographic algorithm. And indeed, it runs everywhere: For example, to encrypt a wireless network, a single key is used, so the encryption algorithm is symmetrical. The safest option, and therefore most recommended, is AES
.
The AES
algorithm is a block cipher, that is, it groups the plaintext (and the keys) into byte blocks of $4 \times B$-byte rectangles where $B
:=
\text{number of columns in the rectangle}
=
4, 6 or 8.$ Commonly, and for us from now on, $B = 4$ , that is, the rectangle is a $4 \times 4$-square (containing $16$ bytes or, equivalently, $128$ bits). Each entry of the block is a byte (= sequence of eight binary digits = eight-bit binary number).
On a hexadecimal basis (= whose numbers are 0
– 9
, A = 10
, B = 11
, C = 12
, D = 13
, E = 14
and F = 15
), such a square is for example
A1 |
13 |
B1 |
4A |
A3 |
AF |
04 |
1E |
3D |
13 |
C1 |
55 |
B1 |
92 |
83 |
72 |
The AES
algorithm enciphers each byte block $B$ iteratively, in a number of rounds $R$ which depends on the number of columns of $B$: there are $R = 10$ rounds for $B = 4$ columns, $R = 12$ rounds for $B = 6$ columns and $R = 14$ rounds for $B = 8$ columns. For us, as we assume $B = 4$ columns, $R = 10$.
The Substitution and Permutation cipher AES
operates repeatedly as follows on each block:
S-box
) with $2^8 = 256$ entries of $1$ byte each.XOR
in each entry).The first step is a substitution. The second and third step count as a horizontal permutation (of the entries in each row) respectively vertical permutation (of the entries in each column).
CrypTool 1
offers in Menu Individual Procedures -> Visualization of Algorithms -> AES
Animation
entry to see the animation in Figure 2.1 of the rounds, andInspector
entry in Figure 2.2 to experiment with the values of plaintext and key.In these rounds, keys are generated, the plaintext replaced and transposed by the following operations:
Round $r = 0$ :
AddRoundKey
to add (by XOR
) the key to the plaintext (square) blockRounds $r = 1, ..., R-1$ : to encrypt, apply the following functions:
SubBytes
to replace each entry (= byte = sequence of eight bits) with a better distributed sequence of bits,ShiftRows
to permute the entries of each row of the block,MixColumn
to exchange the entries (= bytes = eight-digit binary number) of each column of the block by sums of multiples of them,AddRoundKey
to generate a key from the previous round’s key and add it (by XOR
) to the block.Round $r = R$ : to encrypt, apply the following functions:
SubBytes
ShiftRows
AddRoundKey
That is, compared to previous rounds, the MixColumn
function is omitted: It turns out that MixColumn
and AddRoundKey
, after a slight change of AddRoundKey
, can change the order without changing the end result of both operations. In this equivalent order, the operation MixColumn
does not increase cryptographic security, as the last operation invertible without knowledge of the key. So it can be omitted.
The function MixColumn
(and at its origin SubBytes
) uses the multiplication of the so-called Rijndael field
$\mathbb{F}_{2^8}$ to compute the multiple of (the eight-digit binary number given by) a byte; it will be presented at the end of this chapter. Briefly, the field defines, on all eight-digit binary numbers, an addition given by XOR
and a multiplication given by a polynomial division with remainder: The eight-digit binary numbers $a = a_7 ... a_0$ and $b = b_7 ... b_0$ to be multiplied are identified with polynomials $a(x) = a_7 x^7 + \cdots + a_0 \quad \text{ and } \quad b(x) = b_7 + \cdots + b_0$ which are then multiplied as usual to give a polynomial $C(x) = a(x) b(x)$. To yield a polynomial with degree $\leq 7$, the remainder $c(x) = c_7 x^7 + \cdots + c_0$ of $C(x)$ by polynomial division with $m(x) = x^8 + x^4 + x^3 + x + 1.$ is computed. The product $c = a \cdot b$ is then given by the coefficients $c_7 ... c_0$.
Let us describe all round functions in more detail:
SubBytes
substitutes each byte of the block by another byte given by the S-box
substitution table.
To calculate the value of the entry by which the S-box
substitutes each byte:
Calculate its multiplicative inverse $B$ in $\mathbb{F}_{2^8}$ ,
Calculate $a_i
=
b_i + b_{i + 4} + b_{i + 5} + b_{i + 6} + b_{i + 7} + c_i$ where $i$ = 0
, 1
, …, 7
is the index of each bit of a byte, and
In matrix form, $\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \\ a_6 \\ a_7 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$ in hexadecimal notation (where the row number corresponds to the first hexadecimal digit and the column number to the second hexadecimal digit of the byte to be replaced):
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 63 | 7c | 77 | 7b | f2 | 6b | 6f | c5 | 30 | 01 | 67 | 2b | fe | d7 | ab | 76 |
1 | ca | 82 | c9 | 7d | fa | 59 | 47 | f0 | ad | d4 | a2 | af | 9c | a4 | 72 | c0 |
2 | b7 | fd | 93 | 26 | 36 | 3f | f7 | cc | 34 | a5 | e5 | f1 | 71 | d8 | 31 | 15 |
3 | 04 | c7 | 23 | c3 | 18 | 96 | 05 | 9a | 07 | 12 | 80 | e2 | eb | 27 | b2 | 75 |
4 | 09 | 83 | 2c | 1a | 1b | 6e | 5a | a0 | 52 | 3b | d6 | b3 | 29 | e3 | 2f | 84 |
5 | 53 | d1 | 00 | ed | 20 | fc | b1 | 5b | 6a | cb | be | 39 | 4a | 4c | 58 | cf |
6 | d0 | ef | aa | fb | 43 | 4d | 33 | 85 | 45 | f9 | 02 | 7f | 50 | 3c | 9f | a8 |
7 | 51 | a3 | 40 | 8f | 92 | 9d | 38 | f5 | bc | b6 | da | 21 | 10 | ff | f3 | d2 |
8 | cd | 0c | 13 | ec | 5f | 97 | 44 | 17 | c4 | a7 | 7e | 3d | 64 | 5d | 19 | 73 |
9 | 60 | 81 | 4f | dc | 22 | 2a | 90 | 88 | 46 | ee | b8 | 14 | de | 5e | 0b | db |
A | e0 | 32 | 3a | 0a | 49 | 06 | 24 | 5c | c2 | d3 | ac | 62 | 91 | 95 | e4 | 79 |
B | e7 | c8 | 37 | 6d | 8d | d5 | 4e | a9 | 6c | 56 | f4 | ea | 65 | 7a | ae | 08 |
C | ba | 78 | 25 | 2e | 1c | a6 | b4 | c6 | e8 | dd | 74 | 1f | 4b | bd | 8b | 8a |
D | 70 | 3e | b5 | 66 | 48 | 03 | f6 | 0e | 61 | 35 | 57 | b9 | 86 | c1 | 1d | 9e |
E | e1 | f8 | 98 | 11 | 69 | d9 | 8e | 94 | 9b | 1e | 87 | e9 | ce | 55 | 28 | df |
F | 8c | a1 | 89 | 0d | bf | e6 | 42 | 68 | 41 | 99 | 2d | 0f | b0 | 54 | bb | 16 |
ShiftRows
shifts the $l$-th row (counted starting from zero, that is, $l$ runs through $0$ , $1$ , $2$ and $3$; in particular, the first row is not shifted) $l$ positions to the left (where shift is cyclic). That is, the (square) block with entries
B_{00} | B_{01} | B_{02} | B_{03} |
B_{10} | B_{11} | B_{12} | B_{13} |
B_{20} | B_{21} | B_{22} | B_{23} |
B_{30} | B_{31} | B_{32} | B_{33} |
is transformed into one with entries
B_{00} | B_{01} | B_{02} | B_{03} |
B_{11} | B_{12} | B_{13} | B_{10} |
B_{22} | B_{23} | B_{20} | B_{21} |
B_{33} | B_{30} | B_{31} | B_{32} |
MixColumn
exchanges all entries of each column of the block by a sum of multiples of them. This is done by multiplying each column by a fixed matrix. More exactly,
then $\begin{pmatrix} a_{0,j} \\ a_{1,j} \\ a_{2,j} \\ a_{3,j} \end{pmatrix} = \begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix} \begin{pmatrix} b_{0,j} \\ b_{1,j} \\ b_{2,j} \\ b_{3,j} \end{pmatrix}$ For example, the byte $a_{0,j}$ is computed by $a_{0,j} = 2 \cdot b_{0,j} + 3 \cdot b_{1,j} + b_{2,j} + b_{3,j}$
AddRoundKey
adds, by the XOR
operation, the key $W(r)$ of the current round $r$ to the current block $B$ of the ciphertext, the status, that is, $B$ is transformed into $B \oplus W(r).$ The key is generated column by column. We denote them by $W(r)_0$, $W(r)_1$, $W(r)_2$ and $W(r)_3$; that is, $W(r)
=
W(r)_0 \mid W(r)_1 \mid W(r)_2 \mid W(r)_3.$ Since the key has $16$ bytes, each column has $4$ bytes.
ScheduleCore
applied to the first column of the previous round key (which we denote by $T$ ); here ScheduleChore
is the composition of transformations:
SubWord
: Substitutes each of the 4
bytes of $T$ according to the S-box
of SubBytes
.RotWord
: Shift $T$ one byte to the left (in a circular manner, that is, the last byte becomes the first).Rcon(r)
: Adds (by XOR
) to $T$ the constant value, in hexadecimal notation, $(02)^{r-1} 00 00 00$ (where the power, that is, the iterated product, is calculated in the Rijndael field
$\mathbb{F}_{2^8}$). That is, the only byte that changes is the first one, by adding either the value $2^{r-1}$ (for $r \leq 8$ ) or the value $2^{r-1}$ in $F_{2^8}$ for $r = 9, 10$.We note that the only transformation that is not affine (that is, the composition of a linear application and a constant shift) is the multiplicative inversion in the Rijndael field $\mathbb{F}_{2^8}$ in the SubBytes
operation. In fact
SubBytes
are applied, in this order,
ShiftRows
is a permutation, in particular, linear.MixColumn
is an addition, in particular, linear.AddRoundKey
is the translation by the round key.Regarding the goals of ideal diffusion and confusion, we can point out that in each step about half of the bits (in SubBytes
) or bytes (in MixColumn
and ShiftRows
)is replaced and transposed. To convince oneself of the complementarity of the simple operations for high security, that is, that they generate in conjunction high confusion and diffusion after few iterations:
it is worth to experiment in Individual Procedures -> Visualization of Algorithms -> AES -> Inspector
with some pathological values, for example:
00
, and00
and plaintext entries equals 00
except one entry equals 01
, that is, change a single bit.We see how this small initial difference spreads out, already generating totally different results after, say, four rounds! This makes plausible the immunity of AES
against differential cryptanalysis.
In case all key and plaintext entries are equal to 00
, we also understand the impact of adding the Rcon(r)
constant to the key in each round: that’s where all the confusion comes from!
The function MixColumn
(as well as the computation of the power of Rcon
in AddRoundKey
) uses the multiplication given by the so-called Rijndael field
, denoted $\mathbb{F}_{2^8}$, to compute the multiple of (the number given by a) byte; let us quickly introduce it:
A group is a set $G$ with
Generally,
Example. The set of nonzero rational numbers $\mathbb{Q}^*$ with the multiplication operation $\cdot$ is a group.
If the group $G$ is commutative, that is, if the operation satisfies the commutativity law, then commonly
Example. The set of rational numbers $\mathbb{Q}$ with the addition operation $+$ is a commutative group.
A field is a set $F$ with an addition and multiplication operation $+$ and $\cdot$ such that
Example. The set of rational numbers $\mathbb{Q}$ with addition $+$ and multiplication $\cdot$ is a field.
A byte, a sequence $b_7$ … $b_1$, $b_0$ of eight bits in $\{ 0, 1 \}$ is considered a polynomial with binary coefficients by $b_7 ... b_1 b_0
\mapsto
b_7 X^7 + \cdots + b_1 X + b_0$ For example, the hexadecimal number $0x57$ , or binary number $0101 0111$ , corresponds to the polynomial $x^6 + x^4 + x^2 + x + 1.$ All additions and multiplications in AES
take place in the binary field $\mathbb{F}_{2^8}$ with $2^8 = 256$ elements, which is a set of numbers with addition and multiplication that satisfies the associativity, commutativity and distributivity law (like, for example, $\mathbb{Q}$ ) defined as follows: Let $\mathbb{F}_2
=
\{ 0, 1 \}$ be the field of two elements with
XOR
), andLet $\mathbb{F}_2[X] = \mathbb{Z} / 2 \mathbb{Z} = \text{the polynomials on } \mathbb{F}_2,$ that is, the finite sums $a_0 + a_1 X + a_2 X^2 + \cdot s + a_n X^n$ to $a_0$ , $a_1$ , …, $a_n$ to $\mathbb{F}_2$ and be $\mathbb{F}_{2^8} := \mathbb{F}_2[X] / (X^8 + X^4 + X^3 + X+1).$ That is, the result of both operations $+$ and $\cdot$ in $\mathbb{F}_2 X$ is the remainder of the division by $X^{8 + X^4 + X} 3 + X + 1$ .
The $+$ addition of two polynomials is the addition in $\mathbb{F}_2$ coefficient to coefficient. That is, as bytes, the $+$ addition is given by the XOR
addition.
The multiplication $\cdot$ is given by the natural multiplication followed by the division with rest by the polynomial $m(x) = x^8 + x^4 + x^3 + x + 1.$ For example, in hexadecimal notation, $57 \cdot 83 = C1$ , because $(x^6 + x^4 + x^2 + x + 1)(x^7 + x + 1) = x^{13} + x^{11} + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1$ and $x^{13} + x^{11} + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1 = (x^5 + x^3 + 1) (x^8 + x^4 + x^3 + x + 1) + x^7 + x^6 + 1$ The multiplication by the polynomial $1$ does not change anything, it is the neutral element. For every polynomial $b(x)$ , Euclid’s extended algorithm, calculates polynomials $a(x)$ and $c(x)$ such that $b(x)a(x) + m(x)c(x) = 1.$ That is, in the division with the remaining $a(x) b(x)$ for $m(x)$ left over $1$ . This means that $a$ is the inverse multiplicative in $\mathbb{F}_{2^8}$ , $b^{-1}(x) = a(x) \text{ in } \mathbb{F}_{2^8}.$ When we invert a byte $b$ into $\mathbb{F}_{2^8}$ , we mean byte $a = b^{-1}$.
How many rounds has AES for a 128
bit key?
Which are the steps of each round? SubBytes
, ShiftRows
, MixColumn
and AddRoundKey
Which one of the steps is non-linear?
SubBytes
ShiftRows
MixColumn
AddRoundKey
There are two basic operations in ciphering: transpositions and substitution.
Either of these ciphers is insecure because they preserve statistical data of the plaintext: For example, a mere (monoalphabetic) substitution cipher falls victim to the frequency distributions of symbols in the plaintext being carried over to the ciphertext. Instead, a modern cipher combines substitution and permutation ciphers, so called substitution and permutation network or Feistel cipher
While the only cipher proved to be perfectly secure, that is no method of cryptanalysis is faster than exhaustive key-search, modern ciphers such as DES
from 1976 or AES
from 2000 in practice achieve, up to (the k)now, the same, that is, no cryptanalytic method faster than exhaustive key-search is known. The key criterion for this feat is high diffusion as defined by Shannon, that is, if a bit in the plaintext or key changes, then half of the bits in the ciphertext changes. Compare it to that of ancient algorithms!
Read the section on symmetric cryptography in the article Simmons & others (2016).
Read (at least the beginnings of) Chapter 9
on hash functions in Menezes et al. (1997), and (at least the beginnings of) that in the most recent work Aumasson (2017), Chapter 6
.
Cryptanalyze a substitution cipher in Esslinger & others (2008).
Follow the encryption process
AES
by the AES
inspector in Esslinger & others (2008).See the book Sweigart (2013a) for implementing some simpler (symmetric) algorithms in Python
, a readable beginner-friendly programming language.
Read the parts of the book Schneier (2007) on understanding and implementing modern symmetric cryptographic algorithms.
On completion of this chapter, you will have learned about the manifold uses of (cryptographic) hash functions whose outputs serve as IDs of their input (for example, a large file).
A hash function is an algorithm that generates an output of fixed (byte) size (usually around 16
to 64
bytes) from an input of variable (byte) size, for example, a text or image file, a compressed archive.
hash function: algorithm that generates a fixed-size output from variable-size input
As a slogan it transforms a large amount of data into a small amount of information.
A hash function takes an input (or “message”), a variable-size string (of bytes), and returns a fixed-size string, the hash value (or, depending on its use, also (message) digest, digital fingerprint or checksum).
For example, the hash md5
(of 16 bytes) of the word “key” in hexadecimal coding (that is, whose digits run through 0
, …,9
, a
,b
, c
,d
,e
and f
) is 146c07ef2479cedcd54c7c2af5cf3a80
.
One distinguishes between
A (simple) hash function, or checksum function, should satisfy:
That is, with respect to the second property, a hash function should behave as much as possible like a random function, while still being a fast and deterministic algorithm.
checksum function: algorithm that quickly generates a fixed-size output from variable-size input without collisions.
For example, the most naive checksum would be the sum of the bits of the input, truncated to the fixed output size. It is almost a hash function: it is fast, and it is indeed unlikely that two different messages give the same hash. However, one easily obtains two almost identical messages with the same hash. Tiny alterations could therefore go undetected.
A cryptographic (or one-way) hash function should, moreover, satisfy:
Thus the output string of fixed length that a cryptographic hash function produces from a string of any length (an important message, say) is a kind of inimitable signature. A person who knows the hash value cannot know the original message; only the person who knows the original message can prove that the “hash value” is produced from that message.
cryptographic (or one-way) hash function: a hash function such that, given an output, it is unfeasible to calculate a corresponding input.
More exactly:
weak collision resistance: computationally unfeasible to find an unknown message for a given hash.
strong collision resistance: computationally unfeasible to find two messages with the same hash.
Otherwise, an attacker could substitute an authorized message with an unauthorized one.
Hash functions are used for
querying database entries,
error detection and correction, for example,
in cryptography to identify data but conceal its content, for example,
for data authenticity checks,
for authentication, for example,
A checksum is a method of error detection in data transmission (where it also bears the name message digest) and storage. That is, a checksum detects whether the received or stored data was not accidentally or intentionally changed, that is, is free from errors or tampering. It is a hash of (a segment of) computer data that is calculated before and after transmission or storage.
That is, it is a function which, when applied to any data, generates a relatively short number, usually between 128
and 512
bits. This number is then sent with the text to a recipient who reapplies the function to the data and compares the result with the original number. If they coincide, then most probably the message has not been altered during transmission; if not, then it is practically certain that the message was altered.
Most naively, all the bits are added up, and the sum is transmitted or stored as part of the data to be compared with the sum of the bits after transmission or storage. Another possibility is a parity bit that counts whether the number of nonzero bits, for example, in a byte, is even or odd. (The sum over all bits for the exclusive-or operation instead of the usual addition operation.) Some errors — such as reordering the bytes in the message, adding or removing zero valued bytes, and multiple errors which increase and decrease the checksum so that they cancel each other out — cannot be detected using this checksum.
The simplest such hash function that avoids these shortfalls against accidental alterations is CRC
, which will be discussed below. It is faster than cryptographic checksums, but does not protect against intentional modifications.
A hash table stores and sorts data by a table in which every entry is indexed by its hash (for a hash function that is fixed once and for all for the table). That is, its key is the hash of its value. This key has to be unique, and therefore the hash function ideally collision free. If not, then, given a key, first the address where several entries with this key are stored has to be looked up, and then the sought-for entry among them, causing a slow down.
Therefore, the hash size has to be chosen wisely before creating the table, just large enough to avoid hash collisions in the far future. If this can be achieved, then the hash table will always find information at the same speed, no matter how much data is put in. That is, hash tables often find information faster than other data structures, such as search trees, and frequently used; for example, for associative arrays, databases, and caches.
In practice, even for checksums, most hash functions are cryptographic. Though slower, they are still fast enough on most hardware. In fact, sometimes, for example to store passwords, they have to deliberately slow so that the passwords cannot be found quickly by their hash values through an exhaustive search among probable candidates (see rainbow tables in Section 12.6.2).
The most common cryptographic hash functions used to be MD5
, usually with an output length of 128
bit, invented by Ron Rivest of the Massachusetts Institute of Technology in 1991. By 1996 methods were developed to create collisions for the MD5
algorithm, that is, two messages with the same MD5
hash. MD5CRK
was a concerted effort in 2004 by Jean-Luc Cooke and his company, CertainKey Cryptosystems
, to prove the MD5
algorithm insecure by finding a collision. The project started in March and ended in August 2004 after a collision for MD5 was found. In 2005, further security defects were detected. In 2007 the NIST (National Institute of Standards and Technology) opened a competition to design a new hash function and gave it the name SHA
hash functions, that became a Federal Information Processing standard.
One exception is the Cyclic Redundance Check (CRC
) which is a fast simple hash function to detect noise, expected accidental errors, for example, while reading a disc, such as a DVD
, or in network traffic. The CRC uses a binary generating polynomial (a formal sum in an unknown whose only coefficients are 0
and 1
such as $X^2 + X$ ). The CRC is computed by:
1001
corresponds to $X^3 + 1$ ) by the generating polynomial, andThe choice of the generator polynomial is the most important one to be made when implementing the CRC algorithm; it should maximize the error-detection and minimize the chances of collision. The most important attribute of the polynomial is its length (or degree) as it determines the length of the output. Most commonly used polynomial lengths are 9 bits (CRC-8), 17 bits (CRC-16), 33 bits (CRC-32) and 65 bits (CRC-64).
In fact, the type of a CRC identifies the generating polynomial in hexadecimal format (whose 16
digests run through 0
, …, 9
and A
, …, F
). A frequent CRC type is that used by Ethernet, PKZIP, WinZip, and PNG; the polynomial $0x04$ .
Again, the CRC can only be relied on to confirm the integrity against accidental modification; through intentional modification, an attacker can cause changes in the data that remain undetected by a CRC. To prevent against this, cryptographic hash functions could be used to verify data integrity.
SHA
stands for Secure Hash Algorithm. The SHA
hash functions are cryptographic hash functions made by the National Security Agency (NSA) and the National Institute of Standards and Technology. SHA-1
is the successor to MD5
with a hash size of 160
bits, an earlier, widely-used hash function, that fell victim to more and more suspicious security shortcomings (even though it is not downright broken; for example, there is no known computationally feasible way to produce an input for a given hash ). SHA-1 was notably used in the Digital Signature Algorithm (DSA) as prescribed by the Digital Signature Standard (DSS) by the Internet Engineering Task Force.
In the meanwhile, there are three SHA algorithms SHA-1
, SHA-2
and SHA-3
, released in 2015 of ever-increasing security, mitigating the shortcomings of each predecessor. “SHA-2” permits hashes of different bit sizes; to indicate the number of bits, it is appended to the prefix “SHA,” for example, “SHA-224,” “SHA-256,” “SHA-384,” and “SHA-512.”
A hash function should be
So if, for example, the output has 256
bits, then ideally each value should have the same probability $2^{-256}$. That is, the output identifies the input practically uniquely (with a collision chance of ideally $2^{-256}$); So one might think of a data hash, for example, from a file, as its ID card (or more accurately, identity number); a hash identifies much data by little information.
Since the length of the hash sequence is limited (rarely $\geq 512$ bits), while the length of the input sequence is unlimited, there are collisions, that is equal hashes from different files. However, the algorithm minimizes the probability of collisions by distributing their values as evenly as possible: Intuitively, make them as random as possible; more accurately, every possible fixed-length sequence is a value and the probability of each of the values is the same.
It is cryptographic (or one-way)
Cryptographic or One-Way hash function: a hash function such that it is computationally infeasible to find an input for a given output and similar inputs have dissimilar output.
More exactly, the algorithm should resist
According to Kerckhoff’s principle, the algorithm should also be public. In practice,
MD4
, MD5
and SHA-1
that do not resist against collisions attacks, but are still in use.For example, the CRC
algorithm is a hash function (not cryptographic); Common cryptographic hash functions are, for example, MD4
, MD5
, SHA-1
, SHA-256
and SHA-3
.
For example, the output of the hash function SHA-256
of ongel
is bcaec91f56ef60299f60fbce80be31c49bdb36bc500525b8690cc68a6fb4b7f6
. The output of a hash function, called hash, but also message digest or digital fingerprint, depending on the input, is used, for example, for message integrity and authentication.
The most commonly used hash (cryptographic) algorithms even today are 16
bytes (=128
bits) MD4
and MD5
or SHA-1
, which uses 20
bytes (= 160
bits). Although all of these, MD4
, MD5
and SHA-1
do not withstand collision attacks, they remain popular for use. Their implementation details are described in RFC
s (Request for Comments): an RFC
publicly specifies in a text file the details of a proposed Internet standard or of a new version of an existing standard and are commonly drafted by university and corporate researchers to get feedback from others. An RFC is discussed on the Internet and in formal meetings of the working group tasked by the Internet Engineering Task Force (IETF). For example, networking standards such as IP and Ethernet have been documented in RFC
s.
MD4
(Message-Digest algorithm):
RSA
algorithm (and the RSA Data Security
company);RFC 1320
.MD5
:
Developed by RSA Data Security
.
Described in RFC 132
.
Vulnerable to collisions, but not to creating a second reverse image. Often used for
P2P
, or Peer-to-Peer, in English), andSHA-1
(Secure Hash Algorithm):
instead of MD4
, MD5
or the ancient SHA-1
, recommended are more recent versions like SHA-256
and SHA-3
of the Secure Hash Algorithm
.
Secure Hash Algorithm: Hash algorithm recommended by the National Institute for Standards and Technology.
Similar to modern symmetric ciphers that follow the design laid out by Feistel’s Lucifer algorithm in the 70s, cryptographic hash functions follow the Merkle-Damgård construction:
The Merkle meta-method, or the Merkle-Damgård construction, builds from a lossless compression function (that is, whose inputs have the same length as the output) a lossy compression function, that is, a hash function. This method allows us to reduce the immunity of the entire hash function against
to the corresponding immunity of the compression function.
This construction needs
IV
= initialization value),C
, andWith $h_0 = \mathrm{IV}$, one computes for $i = 1, 2, ...$ $h_i = C(m_i, h_{i-1}).$ That is, for the computation of the hash of the current block, the value of the compression function of the last block enters jointly with the current block value.
The C
compression function consists of a Feistel Cipher (or substitution and permutation network) $c$ where
XOR
) to $H_{i-1}$ . This is $h_i = h_{i-1} \oplus c(h_{i-1}, m_i)$XOR
) to $H_{i-1}$ . This is $h_i = h_{i-1} \oplus c(m_i, g(h_{i-1}))$The addition of $h_{i-1}$ ensures that the C
compression function is no longer invertible (unlike the $c$ cipher for a fixed key); that is, for a given output, it is no longer possible to know the (unique) input. Otherwise, to create a collision, given an output $h_i$ , one could decipher by different keys, $m_i '$ and $m_i ''$ to get different entries $h_{i-1}'$ and $h_{i-1}''$ .
To reduce the immunity from hash to compression function, the padding $m \overline m$ needs to meet sufficient conditions:
The simplest pad that meets these conditions is the one that attaches the length $|m|$ to $m$ and fills the segment between the end of $m$ and $|m|$ by the number of 0
s prescribed by the block size, that is, the concatenation $\overline m = m \, \| \, 0^k \, \| \, |m|.$
Observation: To avoid collisions, it is not enough for the padding to fill with zeros the rest of the message: This way, two messages that only differ in the number of final zeros at the last end would have the same padding!
Instead, the simplest way would be to attach a digit 1
, and then the rest with 0
s. However, we will see that this would allow collisions with the Merkle meta-method if the initial value IV
was chosen in the following way:
Denote $B_1 , ..., B_k$ the message blocks and IV
the starting value. The hash is calculated by iterating the compression function $C : (X,B) \to C(X,B)$ $H(M) = C(C(..C(C(\mathrm{IV},B_1),B_2)...B_{k-1}),B_k)$ The clou of Merkle’s meta-method is the reduction of collisions from the hash function to the compression function: a hash collision would imply a collision of the compression function, that is, different pairs of $X',B'$ and $X'',B''$ blocks with $C(X',B')=C(X'',B'')$ .
To see this, we note that
Without the length in the padding, the collision of two messages with different lengths can ultimately only be reduced to a pre-image of the initial value IV
under the compression function, that is, a value B
such that $\mathrm{IV}
=
C(\mathrm{IV}, B).$
If its choice were arbitrary, the authors of MD5
and SHA-256
could however have inserted the following back door: Both algorithms use Mayer-Davis’s scheme, that is, a compression function $C(X,K) = X \oplus E(X, K)$ for a Feistel cipher E
with a key $K$ ; in particular, with $K$ fixed the decryption function $E_K = E( \cdot,K)$ is invertible! Now, if the authors wanted to, they could have chosen a key $K$ and set $\mathrm{IV} := E_K^{-1}(0)$ exactly so that IV = C(IV,K)
. Then C(IV, B) = C(C(IV, K), B)
, that is, a collision between the hashes of $B$ and $K \oplus B$ !
Since, for example, MD5
and SHA-256
choose as IV
a value whose pre-image is supposedly not known (for example, the hexadecimal digits in ascending order in MD5
or the decimal digits of the first eight primes in SHA-256
) this problem is more theoretical than practical.
We won’t study the inner workings of SHA-256
in detail, but a schematic look at its design shows that it follows the Merkle-Damgård construction:
It is then iterated in roughly 64
rounds:
On Lynn-Miller (2007), you can trace the bytes of an input of your choosing at each step of the SHA-1
algorithm.
Uses of cryptographic hash functions abound:
If the roles of the public and private key are flipped, then the encryption is a digital signature: while the encrypted message will no longer be secret, every owner of the public key can check whether the original message was encrypted by the private key. However, in theory, signing a file (using the RSA
algorithm) that was encrypted using the RSA
algorithm would decrypt it. Therefore, in practice, since for a signature it suffices to unequivocally identify the signed file (but its content often secret, for example, when signing a secret key), usually a cryptographic hash is encrypted by the private key.
A message authentication code algorithm uses a one-way hash function (such as MD5 or SHA-1) and a block cipher that accepts a secret key and a message as input to produce a MAC. The MAC value provides the intended receivers (who know the secret key) to detect any changes to the message content by:
Hash functions (not necessarily cryptographic, like CRC
) are used:
for fast data query (that is, at fixed time, regardless of the number of entries, for example,
to ensure the integrity of a file in case of accidental modifications, that is, to detect differences between the file and a reference version (typically the one before the file is transported).
Cryptographic Hash functions are used:
key stretching
), intuitively make them less predictable; that is, as KDF
(= Key Derivation Function
);
PBKDF
(= Password Based Key Derivation Function
).Observation: Note the difference between authenticity and authentication: The former guarantees the equality of data received and sent from a person (for example, in the digital signature), the latter the identity of that person (for example, in a secure site access).
The cryptographic hash algorithms listed above, MD4/5
, SHA
… distribute the values evenly, but are fast; so they are unsuitable for password creation because they are vulnerable to brute-force attacks. To prevent these, the PBKDF
algorithm, for example, PBKDF1
, PBKDF2
, bcrypt
, scrypt
, Argon2
(a new and more promising candidate), are
bcrypt
,scrypt
algorithm;salt
, an additional, unique, usually random argument. Without salt
, the algorithm is prone to attacks by so-called Rainbow Tables
, tables of the hashes of the most common passwords.A h table (or scatter table) uses a hash function to address the entries (= rows) of a array, that is, a data table.
Each entry has a name, that is, a unique identification (= key). For example, the key is the name of a person. The key data, for example, your telephone number, is stored in a table row. These rows are numbered from $0$.
The row number, your address, of the key is determined by your hash. As an advantage, at a fixed time, the data can
While,
Collisions, that is, two entrances with the same hash are more frequent than you might think; see the Birthday Paradox. To avoid collisions, it is necessary
When collisions occur, a strategy is
instead of the hash number address a single entry, use Chaining:
address a bucket, a bucket of multiple entries, a list, or
use "Open Hashing. put the entry in another free position, for example,
Double Hashing
).Up to $80$ of the table being filled, on average $3$ collisions occur. That is, finding an entry takes $3$ operations. To compare, with $1000$ entries, it would take on average
However, after this factor collisions occur so often that the use of another data structure, for example a binary tree, is recommended.
A spreading or Merkle tree (invented in 1979 by Ralph Merkle
) groups data into blocks by a tree (binary), whose vertices (= nodes) are hashes and whose sheets (= end nodes) contain the data blocks, in order to be able to quickly check (in linear time) each data block by computing the root hash.
In a $n$ (= $2^n$ sheets )deep scattering tree the data block of each sheet is verifiable
The main use of Merkle trees is to ensure that blocks of data received from other pairs in a point-to-point network (P2P) are received intact and unchanged.
A Merkle tree is a (usually binary) hashes
tree whose leaves are data blocks from a file (or set of files). Each node above the leaves on the tree is the hash of its two children. For example, in the image,
Hash(34)
is the hash of the concatenation of the hashes Hash(3)
and Hash(4)
, that is, $\mathrm{Hash(34)} = \mathrm{Hash(3)} || \mathrm{Hash}(4))$,Hash(1234)
is the hash of the concatenation of the hashes Hash(12)
and Hash(34)
, that is, $\mathrm{Hash(1234)} = \mathrm{Hash(12)} || \mathrm{Hash}(34))$, andHash(12345678)
is the hash of the concatenation of the hashes Hash(1234)
and Hash(5678)
, that is, $\mathrm{Hash(12345678)}
=
||mathrm{Hash}(5678)).$ .Normally a cryptographic shuffling function' is used, for example,
SHA-1. However, if the Merkle tree only needs to protect against unintentional damage, "checkums" are not necessarily cryptographic, such as
CRC`s are used.
At the top of the Merkle tree resides the root-dispersion or master-dispersion. For example, on a P2P
network, root dispersion is received from a trusted source, e.g. from a recognized website. The Merkle tree itself is received from any point on the P2P
network (not particularly reliable). This (not particularly reliable) Merkle tree is compared by calculating the leaf hashes with the reliable root dispersion to check the integrity of the Merkle tree.
The main advantage of a tree (deep $n$ with $2^n$ leaves, that is, blocks of data), rather than a dispersion
list, is that the integrity of each leaf can be verified by calculating $n$ (rather than $2^n$) hashes (and its comparison with the root hash).
For example, in Figure 3.1, the $3$ integrity check requires only
Hash(4)
Hash(12)
, and Hash(5678)
, andHash(3)
Hash(34)
, Hash(1234)
, and Hash(12345678)
Hash(12345678)
and the hash obtained from the trusted source.A hash function transforms any data (such as a file), in other words, a variable-length string, into a fixed-length string (which is usually 16
or 32
bytes long); as a slogan, it transforms a large amount of data into a small amount of information. Their output, a hash, can be thought of as an ID-card of their input (such as a file); to this end, the hash function should
So if, for example, the output has 256
bits, then ideally each value should have the same probability $2^{-256}$. That is, the output identifies the input practically uniquely (with a collision chance of ideally $2^{-256}$);
It is cryptographic (or one-way)
Recommended are, among others, the more recent versions SHA-256
and SHA-3
of the Secure Hash Algorithm
.
How do a checksum and cryptographic hash function differ? Only for a cryptographic hash function it is unfeasible to create collisions.
What are typical uses for checksums? (Accidental) error detection and correction, for example, noise on reading a Compact Disc or in network traffic, and many more.
What are typical uses of cryptographic hash functions? (Intentional) alteration correction in storage or network traffic, and many more.
Name common cryptographic hash functions. The MD and the SHA family.
Which hash function is not cryptographic ? SHA-1, MD4, CRC, WHIRLPOOL
Which cipher is a stream cipher? RSA, Enigma, AES, DES
Read the section on symmetric cryptography in the article Simmons & others (2016).
Read (at least the beginnings of) Chapter 9
on hash functions in Menezes et al. (1997), and (at least the beginnings of) that in the most recent work Aumasson (2017), Chapter 6
.
Observe the diffusion created by a cryptographic hash function in Lynn-Miller (2007).
On completion of this chapter, you will have learned …
The big practical problem of single-key cryptography is key distribution, more exactly:
In 1976, Whitfield Diffie and Martin Hellman conceived that the key distribution problem could be solved by an algorithm that satisfied:
(computationally) easy creation of a matched pair of keys for encryption and decryption,
(computationally) easy encryption and decryption,
(computationally) infeasible recovery of one of the keys despite knowledge of:
(computationally) infeasible recovery of the plaintext for almost all keys $k$ and messages $x$ .
Observation: This was the first public appearance of two-key cryptography. However, the British Government Communications Headquarters (GCHQ) knew it around a decade earlier.
If a user, say Alice, of such an algorithm keeps her decryption key secret but makes her encryption key public, for example, in a public directory, then:
The security of two-key cryptographic algorithms relies on the computational difficulty of a mathematical problem, for example, factoring a number that is the product of two large primes; ideally, computing the secret key is equivalent to solving the hard problem, so that the algorithm is at least as secure as the underlying mathematical problem is difficult. This has not been proved for any of the standard algorithms, although it is believed to hold for each of them.
In comparison with symmetric cryptography, asymmetric encryption avoids the risk of compromising the key to decipher that is involved in exchanging the key with the cipherer. This secure communication with anyone via an insecure channel is a great advantage compared to symmetric cryptography. Let us recall the classic methods to exchange a symmetric key, before looking at its asymmetric counterpart: While asymmetric cryptography made it possible to exchange a secret key overtly, this convenience comes at the risk of the unknown identity of the key holder, prone to a man-in-the-middle attack which Public Key Infrastructure work around by the use of certificates, digital signatures by third parties of public keys.
A symmetric key must be passed secretly. Possible methods are:
Derivation from a base key using a Key Derivation Function (KDF), a cryptographic hash function which derives a secret key from secret — and possibly other public — information, for instance, a unique number,
Creating a key from key parts held by different persons, for example, as an analogue to the one-time pad: If $s$ is the secret (binary number, then $s = s_1 \oplus s_2 \oplus ... \oplus s_n$ for the partial secrets $s_1$, $s_2$, … Reconstruction of $s$ is only possible if all $s_1$, $s_2$, … are combined
transmission via a different channel, for example:
personally,
a sealed letter,
by telephone, or
by quantum entanglement, where a change of state of one quantum particle (say from spin-up to spin-down) instantly implies that of the other and vice-versa.
This information of state cannot be intercepted and lends itself to bit transmission; However, entanglement is fragile and can not be fully controlled. Quantum entanglement for key distribution was put into practice in Sasaki et al. (2011).
Diffie and Hellman’s achievement was to separate secrecy from authentication: Ciphertexts created with the secret key can be deciphered by everyone who uses the corresponding public key; but the secret-key holder has no information about the owner of the public key!
Thus the public keys in the directory must be authenticated. Otherwise A
could be tricked into communicating with C
when he thinks he is communicating with B
by substituting C
key for B
in A
’s copy of the directory; the man-in-the-middle attack (MIM):
Man-in-the-middle attack: an attacker intercepts the messages between the correspondents and assumes towards each of them either correspondent’s identity.
In an MITM
, the attacker places himself between the correspondents, assuming towards each one of them the identity of the other to intercept their messages.
So both Alice and Bob are convinced to use each other’s public key, but they are actually Eve’s!
In practical terms, this problem occurs for example in the 1982 ‘ARP’ Address Resolution Protocol (in RFC 826) which standardizes the address resolution (conversion) of the Internet layer into link layer addresses. That is, ARP
maps a network address (for example, an IPv4
address) to a physical address as an Ethernet
address (or MAC
address). (On Internet Protocol Version 6 (IPv6) networks, ARP has been replaced by NDP
, the Neighbor Discovery Protocol).
The ARP
poisoning attack proceeds as follows:
Maria Bonita wants to intercept Alice’s messages to Bob, all three being part of the same physical network.
Maria Bonita sends a arp who-has
packet to Alice which contains as the source IP address that of Bob whose identity we want to usurp (ARP
spoofing) and the physical MAC
address of Maria Bonita’s network card.
MAC
address of Maria Bonita with Bob’s IP
address.IP
level, it will be Maria Bonita who will receive Alice’s packages!Let us recall that there are two keys, a public key and a private key. Usually:
Thus, a text can be transferred from the encipherer (Alice) to one person only, the decipherer (Bob).
The roles of the public and private keys can be reversed:
Thus, the encipherer can prove to all decipherers (those who have the public key) their ownership of the private key; the digital signature.
The (mathematical) algorithm behind the encryption either by the public key (for hiding the content of digital messages) or by the private key (for adding digital signatures) is in theory almost the same: only the roles of the arguments of the one-way function are exchanged (for example, in the RSA
algorithm). In practice, however, usually:
That is, while
A sub-key of a master key is a key (whose cryptographic checksum is) signed by the master key. The owner often creates subkeys in order to use the main key (public and private) only
to sign
to revoke a key (that is, sign a revocation),
to change the expiration date of a personal key.
The subkeys are used for all other daily purposes (signing and deciphering).
This way, if a sub key is eventually compromised (which is more likely that that of a main key due to its everyday use), then the main key will not be compromised. In this case,
We saw how subkeys work in practice in the common command-line program GPG
discussed in Section 13.4. A good reference is https://wiki.debian.org/Subkeys.
For more security, you create (for example, in GPG
)
first a (public and private) main key and
then several sub keys with expiry date, for everyday use:
Before their expiry, the keys are either extended, or revoked to create others.
As for using different keys to sign and encrypt,
it’s necessary for some algorithms, for example,
it is safer (but more inconvenient, which can lead to the user’s sloppiness and then in practice it is less safe!) to have different keys to decipher and to sign,
because
because, for example in the RSA
algorithm, the signature and the decipherment (by the private key) are equal (in theory, though in practice implemented differently) algorithms!
Therefore, signing (by the private key) a document encrypted by the corresponding public key is equivalent to deciphering it! However, this possibility is theoretical, but it does not practice: All implementations of the RSA
protect the user by the fact that always and exclusively
Only the main key is immutably linked to the identity of the owner, and all others replaceable: While the
main key is kept in a safe at home and only sees the light when keys need to be signed (be it from the owner or from someone else [for example, to establish the web of trust]).
In practice,
is stored on a flash drive or memory card,
and to be more durable, it is even printed, for example:
By the paperkey
program that extracts the secret part (of the file) of the private key (that is, it omits all public information like
and encodes this part in hexadecimal notation (stored in a text file). (So if the key has, for example, $4096$ bits, the file generated by paperkey
has $\geq 1024$ bits). This command
gpg --export-secret-key 0xAE46173C6C25A1A1! > ~/private.sec
paperkey --secret-key ~/private.sec > ~/private.paperkey
!
) the main secret key (0xAE46173C6C25A1A1
) of the private keys, andpaperkey
into a text file.By the qrencoder
program (for keys whose file has $< 2953$ characters) which encodes it into a QR
code.
the sub keys are stored in a smartcard that is accessed by a USB reader with its own keyboard. Compared to using a digital file, it has the advantage that
reading the keys on a smart card is much more difficult than a file (stored on a USB stick or hard drive)
leaves fewer traces:
(Perfect) Forward Secrecy
means that, after the correspondents exchanged their (permanent) public keys and established mutual trust,
MITM
attack (see Section 4.2),This way, even if the correspondence was eavesdropped and recorded, it cannot be deciphered later; in particular, it cannot be deciphered by obtaining a correspondent’s private key.
For example, the TLS
protocol, which encrypts communication of much of the Internet, has since version 1.2 support for Perfect Forward Secrecy: In the handshake between client and server in Section 13.3 : after the client has received (and trusted) the server certificate, the server and the client exchange a ephemeral public key which serves to encrypt the communication of only this correspondence. This ephemeral key is signed by the public (permanent) key of the server. (The creation of this asymmetric key in Perfect Forward Secrecy makes the creation of a symmetric preliminary key by the client in the penultimate step in the handshake in the TLS
protocol superfluous.)
Every granted signature shall first be thought through solemnly. The same goes for digital signatures:
A signature proves that the owner of the private key, say Alice, acknowledged the content. In e-mail communication, it avoids the risk that an attacker
However, if the communication contains something Alice doesn’t want to be seen by others (for example, to be read out loud by a prosecutor in court), better not prove her acknowledgement! Since this message may eavesdropped, her correspondent may change his mind about the privacy of their conversation, or his account is hacked, …
To give an analogy to our analog reality, an automatic digital signature by Alice compares to the recording of every private conversation of Alice.
In fact, usually Alice wants to prove only to Bob that she’s the sender, but not to third parties! For this, in a group signature, an ephemeral key to sign is created and shared (that is, the public and private key) between Alice and Bob. This way, Alice and Bob are sure that the message was sent by the other correspondent, but a third party only that it was sent among the group members.
A Public Key Infrastructure (PKI) of a network establishes trust among its spatially separated users by first authenticating them, then authorizing their public keys by signing them (referred to as digital certificates) and finally distributing them. In institutions and corporations, a PKI is often implemented as “a trust hierarchy” of Certification Authorities, whereas in looser communities it can be decentralized and trust mutually established by the users themselves.
A Public Key Infrastructure (PKI): establishes trust among its spatially separated users of a network by first authenticating them, then authorizing their public keys by signing them (referred to as digital certificates) and finally distributing them.
A PKI includes:
(Digital) Certificates: public keys which are signed to authenticate their users. Other then the name and key, they contain additional personal data, such as an e-mail address, and usually an expiry date.
Certificate Revocation List (CRL): A list of certificates that have been revoked before their validity expires, for example, because
the key has been compromised, or
the key owner is no longer trustable, for example, because of her departure.
Directory Service: a searchable database of the emitted certificates; for example, in a trust hierarchy an LDAP server (Lightweight Directory Access Protocol; a standard used by large companies to administer access of users to files, printers, servers, and application data), and in the web of trust a server that hosts a database searchable by a web form.
The identity of the key owner is confirmed by third parties, that is, other identities with private keys that confirm by their digital signatures that it is Alice who owns the private key.
However, the problem of the public key identity arises again: How can we ensure the identities of the private key owners? There are two solutions:
For short: while in the Web-of-Trust the connections built by trust form a graph, in the approach by hierarchical authorities they form a tree.
In the approach via hierarchical authorities, private key owners are distinguished by hierarchical levels. At the highest level lie the root authorities on which one trusts unconditionally.
hierarchical authorities: Key owners that confirm others’ identities by digital signatures and are organized in a hierarchy, where trust passes from a higher to a lower level; total trust is placed in those at the highest level, the root authorities.
For example,
VeriSign, GeoTrust, Commode, … are major US certifying companies;
as a recent addition, the (US intermediary) non-profit authority Let us encrypt
;
a look at the /etc/ssl/certs
folder in the Linux distribution openSUSE
reveals that there is, for example,
a German authority (TeleSec
of Deutsche Telekom AG
, the former national telecommunications operator),
three Spanish (Firmaprofesional
, ACCVRAIZ1
— Agencia de Tecnología y Certificación Electrónica, and ACC RAIZ FNMT
— Fábrica Nacional de Moneda y Timbre), and
many U.S. authorities.
In the web of trust, private key owners cannot be distinguished from each other.
web of trust: Private key owners confirm other’s identities by successively passing trust to each other among equals.
The absence of root authorities, unconditionally trusted entities, is compensated for by the
trust initially established by
having obtained the public key personally (for example, at key-sign parties, meetings where participants exchange and sign their public keys mutually), or
by
SMS
, instant messenger, …);then the trust is successively (transitively) passed from one to the other: If Alice trusts Bob, and Bob trusts Charles, then Alice trusts Charles.
On the Internet,
X.509
, principally used to encrypt the communication between a user and a (commercial) Website (but also between users in corporate environments, such as, S/MIME
e-mail encryption), andOpenPGP
scheme (as implemented by the GnuPG
program), with its main use of encrypting e-mails. This scheme radically rejects any hierarchy: the user can publish a public key with an e-mail address on a public key server such as that of the MIT without even confirming (by an activation e-mail) that he has access to the account of this e-mail address.The IETF (Internet Engineering Task Force) proposed (in RFC 63941: DANE use cases and RFC 66982: DANE protocol) the DANE protocol that aims to cryptographically harden the TLS, DTLS, SMTP, and S/MIME protocols using DNSSEC. By DNSsec, a DNS resolver can authenticate a DNS resolution, that is, whether it is identical to that on the authoritative DNS server, by checking its signature (of the authoritative DNS server). Instead of relying, like these protocols, entirely on certificate authorities (CAs), domain holders
Using CAs, there is no restriction on which CAs can issue certificates for which domains. If an attacker can gain control of a single CA among the many CAs that the client trusts, then she can emit fake certificates for every domain. DANE allows clients to ask the DNS servers, which certificate are trustworthy so that the domain holder can restrict the scope of a CA. When the user is passed a domain name certificate (as part of the initial TLS handshake), the client can check the certificate against a TLSA resource-record (TLSA-RR) published in the DNS for the service name which is authenticated by the authoritative DNS server.
The most common standard for a PKI is the hierarchy of X.509
certificate authorities. X.509 was first published in 1998 and is defined by the International Telecommunications Union’s Standardization sector (ITU-T), X.509 establishes in particular a standard format of electronic certificate and an algorithm for the validation of certification path. The IETF developed the most important profile, PKIX Certificate and CRL Profile, or “PKIX” for short, as part of RFC 3280, currently RFC 5280. It is supported by all common web browsers, such as Chrome and Firefox, which come with a list of trustworthy X.509
certificate authorities.
In more detail, the TLSA-RR contains the entry Certificate Usage, whose value (from 0
to 3
, the lower, the more restrictive) restricts the authority allowed to validate the certificate for the user:
PKIX-TA (CA constraint).
The client’s trust resides in a PKIX authority.
PKIX-EE (Service Certificate Constraint).
The client’s trust resides in a PKIX certificate.
DANE-TA (Trust Anchor assertion).
The client’s trust resides in an authority which, in contrast to PKIX-TA, does not have to be a PKIX certificate authority.
DANE-EE (Domain-issued certificate). The client’s trust resides in an certificate which, in contrast to PKIX-EE, does not have to be a PKIX certificate.
The DANE check therefore serves to confirm certificates issued by public certification authorities. With DANE values (2 and 3), the domain holder has the option of creating his own, even self-signed certificates for his TLS-secured services, without having to involve a certification authority known to the client. By choosing between “Trust Anchor” (TA) and “End Entity” (EE), the domain owner can decide for himself whether to anchor DANE security to a CA or server certificate.
hybrid encryption: a two-key algorithm is used to authenticate the correspondents by digitally signing the messages or to exchange a key for single-key cryptography for efficient communication thereafter,
The most common key exchange method is to create a shared secret between the two parties by the Diffie-Hellman protocol and then hash it to create the encryption key. To avoid a man-in-the-middle attack, the exchange is authenticated by a certificate, that is, by signing the messages with a long term private key to which the other party holds a public key to verify. Since single-key cryptographic algorithms are more efficient than two-key cryptographic algorithms by a considerable factor, the main use of two-key encryption is thus so-called hybrid encryption where the two-key algorithm is used
For example, in the TLS
(Transport Layer Security; former SSL) protocol, which encrypts secure sites on the World Wide Web, a cryptographic package such as TLS_RSA_WITH_3DES_EDE_CBC_SHA
(identification code 0x00 0x0a
) uses
RSA
to authenticate and exchange the keys,3DES
in CBC
mode to encrypt the connection, andSHA
as a cryptographic hash.X.509
(as used by S/MIME
) and OpenPGP
Single-key (or symmetric) cryptography suffers from the key distribution problem: to pass the same secret key to all, often distant, correspondents. Two-key (or asymmetric) cryptography solves this problem seemingly at once, by enabling the use of different keys to encrypt and decrypt. However, the identity of the key owner must be confirmed; if not personally, then by third parties, that is, identities with private keys that confirm by their digital signatures that it is Alice who owns the private key. However, the problem of the public key identity arises again: How can we ensure the identities of these private key owners? There are two solutions: In the approach via hierarchical authorities, private key owners are distinguished by hierarchical levels. At the highest level lie the root authorities on which one trusts unconditionally. In the web of trust, trust is transferred from one to the other, that is, trust is transitive: If Alice trusts Bob, and Bob trusts Charles, then Alice trusts Charles.
Read the section on asymmetric cryptography in the article Simmons & others (2016). Read in Menezes et al. (1997) Sections 3.1 and 3.3.
Read the parts of the book Schneier (2007) on understanding and implementing modern asymmetric cryptographic algorithms.
On completion of this chapter, you will have learned what a trapdoor function is, and:
The security of two-key cryptographic algorithms relies on the computational difficulty of a mathematical problem, for example, factoring a number that is the product of two large primes; ideally, computing the secret key is equivalent to solving the hard problem, so that the algorithm is at least as secure as the underlying mathematical problem is difficult. This has not been proved for any of the standard algorithms, although it is believed to hold for each of them.
To see the usefulness of (modular) arithmetic in cryptography, recall that asymmetric cryptography is based on a trapdoor function, which
The ease of calculating the function corresponds to the ease of encryption, while the difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. For example, RSA
uses
While both, the function itself and even its inverse, are easily computed using the usual multiplication of numbers, instead cryptographic algorithms (such as RSA
) use modular arithmetic to entangle the computation of the inverse function without knowledge of the key (which in RSA
is root extraction).
We already know modular or circular arithmetic from the arithmetic of the clock, where $m = 12$ is considered equal to $0$ : Because the indicator restarts counting from $0$ after each turn, for example, $3$ hours after $11$ hours is $2$ o’clock: $11 + 3 = 14 = 12 + 2 = 2.$ Over these finite circular domains, called finite rings and defined in Section 5.3, (the graphs of) these functions become irregular and practically incomputable, at least without knowledge of a shortcut, the key.
In what follows, we
The difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. In an asymmetric cryptographic algorithm,
are based on an invertible function such that
For example, the inverses of the trap-door functions
RSA
algorithm), andDiffie-Hellman
algorithm)are given by
They are
Observation. Both functions are algebraic, that is, they are expressed by a formula of sums, products, and powers. Analytical functions, that is, infinite convergent sums, for example, sine, cosine, …, are inconvenient by the rounding errors.
The domain of these functions is not the set of integers $\mathbb{Z}$ (or that of the real numbers $\mathbb{R}$ that includes them), because both functions, exponentiation and raising to a power, are continuous over $\mathbb{R}$:
If the domain of these functions were $\mathbb{R}$ , then their inverses could be approximated over $\mathbb{R}$ , for example, by iterated bisection where the inverse point is besieged in intervals that are halved at every iteration: Given $y_0$, find an $x_0$ such that $f(x) = y_0$ is equivalent to finding a zero $x_0$ of the function $x \mapsto f(x) - y_0.$
(Start) Choose an interval $[a, b]$ such that $f(a) < 0 \text{ and } f(b) > 0.$
(Recalibration) Calculate the midpoint $m := (a + b)/2$ of the interval $[a , b]$:
Otherwise:
and recalibrate the newly obtained interval $[m, b]$ respectively $[a, m]$.
By the Intermediate Value Theorem, the zero is guaranteed to be in the interval, which at each step decreases and converges to the intersection.
Finding the zero of the polynomial $F(x) = x^3 + 3x^2 + 12x + 8$ by bisection with starting points $x_1 = -5$ and $x_2 = 0$ , yields in steps $i = 2, ..., 15$ the successive approximations
$i$ | $x_i$ | $F(x)$ | $x_i - x_{i-1}$ |
---|---|---|---|
$2$ | $0$ | $8$ | $5$ |
$3$ | $-2.5$ | $-18.875$ | $2.5$ |
$4$ | $-1.25$ | $-4.26563$ | $1.25$ |
$5$ | $-0.625$ | $1.42773$ | $0.625$ |
$6$ | $-0.9375$ | $-1.43726$ | $0.3125$ |
$7$ | $-0.7813$ | $-0.02078$ | $0.15625$ |
$8$ | $-0.7031$ | $0.69804$ | $0.07813$ |
$9$ | $-0.7422$ | $0.33745$ | $0.03906$ |
$10$ | $-0.7617$ | $0.15806$ | $0.01953$ |
$11$ | $-0.7715$ | $0.06857$ | $0.00977$ |
$12$ | $-0.7764$ | $0.02388$ | $0.00488$ |
$13$ | $-0.7783$ | $0.00154$ | $0.00244$ |
$14$ | $-0.7800$ | $-0.00962$ | $0.00122$ |
To avoid the iterative approximation of the zero of a function and thus complicate the computation of the inverse function (besides facilitating the computation of the proper function), the domain of a trapdoor function is a finite ring denoted by $\mathbb{Z} / m \mathbb{Z} = \{ 0, 1, ..., m-1 \}$ for a natural number $m$.
finite ring: A finite set that contains $0$ and $1$ and over which a sum $+$ is explained that obeys the laws of associativity and commutativity.
In such a finite ring $m = 1 + \cdots + 1 = 0;$ necessarily, every addition (and thus every multiplication and every raising to a power) has result $< m$. So the addition of $\mathbb{Z} / m \mathbb{Z}$ is different from that on $\mathbb{Z}$ (or $\mathbb{R}$). For example, for $m = 7$ , $2^2 = 2 \cdot 2 = 4 \text{ and } 3^2 = 3 \cdot 3 = 7 + 2 = 0 + 2 = 2.$ We will introduce these finite rings first by the examples $\mathbb{Z} / 12 \mathbb{Z}$ and $\mathbb{Z} / 7 \mathbb{Z}$, the rings given by the clock hours respectively weekdays), then for every $m$ .
When we look at the graphs of the functions, which are so regular on $\mathbb{R}$ , we note that over the finite ring $\mathbb{Z} /101 \mathbb{Z}$, either graph, that of
is initially as regular over $\mathbb{Z} / 101 \mathbb{Z}$ as over $\mathbb{Z}$ , but starting
begins to behave erratically. (Except for the symmetry of the parabola on the central axis $x = 50.5$ due to $(-x)^2 = x^2$).
Task. Experiment with the function plotter Grau (2018d) to view the erratic behavior of other function graphs over finite domains.
We apply modular arithmetic in everyday life when we add times in the daily, weekly and yearly cycle of clock hours, weekdays and months. It is this circularity that explains the naming “ring.”
The prototypical example of modular arithmetic is the arithmetic of the clock in which the pointer comes back to start after $12$ hours; formally, $12 \equiv 0,$ which implies, for example, $9 + 4 \equiv 1 \quad \text{ and } \quad 1-2 \equiv 11. \qquad(5.1)$ That is, $4$ hours after $9$ hours is $1$ o’clock, and $2$ hours before $1$ o’clock is $11$ o’clock. We can go further: $9 + 24 \equiv 9$; that is, if it is $9$ o’clock now, then in $24$ hours (one day later) as well.
Another example of modular arithmetic in everyday life are the days of the week: after $7$ days, the days of the week start over: If we enumerate ‘Saturday,’ ‘Sunday,’ ‘Monday,’ ‘Tuesday,’ ‘Wednesday,’ ‘Thursday’ and ‘Friday’ by $0$ , $1$ , $2$ , $3$ , $4$ , $5$ , $6$ , then $7 \equiv 0,$ which implies, for example, $4 + 4 \equiv 1 \quad \text{ and } \quad 1-2 \equiv 5.$ Indeed, $4$ days after Wednesday is Sunday, and $2$ days before Sunday is Friday. We can go further: $5 + 14 \equiv 5$; that is if now it is Thursday, then in $14$ days (two weeks later) as well.
Another example of modular arithmetic in everyday life are the months of the year: after $12$ months, the months of the year start over. If we number ‘January,’ ‘February,’ … for $1$, $2$, … then, as in the clock, $12 \equiv 0,$ which implies, for example, $10 + 3 \equiv 1 \quad \text{ and } \quad 1-2 \equiv 11;$ That is, a quarter after October the year starts over, and $2$ months before January is November. We can go further: $5 + 24 \equiv 5$; that is, if it’s ‘May’ now, then in $2$ years as well.
Formally, we derive the equation Equation 5.1 from the equalities $9 + 4 = 13 = 12 + 1 \equiv 0 + 1 = 1 \quad \text{ and } \quad 1-2 = -1 = -1 + 0 \equiv -1 + 12 = 11.$ and $9 + 24 = 9 + 2 \cdot 12 \equiv 9 + 2 \cdot 0 = 9.$ In general, for every $a$ and $x$ in $\mathbb{Z}$ , $a + 12 \cdot x \equiv a$ or, equivalently, for all $a$ and $b$ in $\mathbb{Z}$ , $a \equiv b \quad \text{ if } 12 \text{ divides } a - b.$
Definition. Let $a$ and $b$ be positive integers. That $a$ divided by $b$ has remainder $r$ means that there is such an integer $q$ that $a = b \cdot q + r \quad \text{ with } 0 \leq r < b.$
Example. For $a = 230$ and $b = 17$ , we compute $230 = 17 \cdot 13 + 9$ . That is, the remainder of $230$ divided by $17$ is $9$ .
In other words, for every $a$ and $b$ in $\mathbb{Z}$ , $a \equiv b$ if and only if $a$ and $b$ leave the same remainder divided by $12$.
There is nothing special about the number $m = 12$ (of clock hours). For example, analogous equalities would hold if the clock indicated $m = 16$ hours (as many as a day on the planet Neptune has):
Definition. Let $m$ be a natural number. The integers $a$ and $b$ are congruent modulo $m$, formally, $a \equiv b \mod \, m$ if $m | a - b$ , that is, if their difference $a-b$ is divisible by $m$. In other words, if $a$ and $b$ leave the same remainder divided by $m$.
The number $m$ is called modulus.
Or, phrased differently, $a \equiv b \mod m$ if $a$ and $b$ leave the same remainder divided by $m$ .
Congruence: Two integer $a$ and $b$ are congruent modulo $m$ if they leave the same remainder after division by $m$ .
Let us finally define these finite domains $\mathbb{Z} /m \mathbb{Z}$ (the ring of integers modulo $m$) for a natural, usually prime, number $m$, on which the trap-door functions in asymmetric cryptography live; those that make a cryptanalyst’s life so difficult when trying to compute their inverse (in contrast to the domains $\mathbb{R}$ or $\mathbb{Z}$).
Given an integer $m \geq 1$, we want to define the ring $\mathbb{Z} /m \mathbb{Z}$ (loosely, a set with an addition $+$ and multiplication $\cdot$ governed by certain laws) such that $m = 1 + \cdots + 1 = 0.$ More exactly, such a ring
If this equality for $+$ is to hold over $\mathbb{Z} /m \mathbb{Z}$, then the addition $+$ over $\mathbb{Z} /m \mathbb{Z}$ has to be defined differently from that over $\mathbb{Z}$. We put
That is, to add and multiply in $\mathbb{Z} /m \mathbb{Z}$,
For example, for $m = 4$ we get the addition and multiplication tables
+ | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0 | 1 | 2 | 3 |
1 | 1 | 2 | 3 | 0 |
2 | 2 | 3 | 0 | 1 |
3 | 3 | 0 | 1 | 2 |
and
* | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
1 | 0 | 1 | 2 | 3 |
2 | 0 | 2 | 0 | 2 |
3 | 0 | 3 | 2 | 1 |
Exercise. Show that an integer is divisible by $3$ (respectively $9$) if, and only if, the sum of its decimal digits is divisible by $3$ (respectively $9$).
In Python
, the modular operator is denoted by the percentage symbol %
. For example, in the interactive shell, we get:
>>> 15 % 12
3
>>> 210 % 12
6
The base $g$ of $x \mapsto g^x$ and the exponent $e$ of $x \mapsto x^e$ determine whether the function is invertible or not.
The exponential function invertible if and only if it is onto, that is, every number, except $0$ is a value of the function. For example, for $m = 101$, the values are contained in $\mathbb{Z} / 101 \mathbb{Z}^* := \mathbb{Z} / 101 \mathbb{Z} - \{ 0 \}$:
Theorem on the existence of a primitive root. A generator of $\mathbb{Z} / m \mathbb{Z}^*$ exists if, and only if,
A function (between a finite domain and counter-domain) is invertible if and only if it is injective, that is, sends different arguments to different values.
Task. Experiment with the function plotter Grau (2018d) to find examples of functions that are injective or not, that is, whose graph has two points at the same level.
If the exponent $E$ is even, $E = 2 e$ for an integer $e$ , then raising to the power $x \mapsto x^E$ satisfies $(-x)^E = (-x)^e = ((-x)^2 )^e = (x^2 )^e = x^e = x^E$, that is, sends the arguments $-x$ and $x$ to the same value. Thus, it is not injective. For example, for $m = 101$ and $e = 1$ , we observe this symmetry in Figure 5.3 along the central axis $x = 50, 5$ (but note that its restriction onto $0, ..., 50$ is injective).
Theorem. The raising to a power $x^E$ is injective over $\mathbb{Z} / m \mathbb{Z}$
RSA
) if and only if $E$ has no common divisor with neither $p-1$ nor $q-1$ .Example. For example, for $m = 21 = 3 \cdot 7$ , the exponent $E = 5$ gives the invertible function $x \mapsto x^E$ on $\mathbb{Z} / m \mathbb{Z}$.
What is the remainder of $8^8$ divided by $7$ ?
Because $8 \equiv 1 \mod 7$ , it is $8^8 \equiv 1^8 = 1$ .
Why is a number divisible by $3$ if and only if the sum of its decimal digits is divisible by $3$ ? Because $10 \equiv 1 \mod 3$ , we have $10^2 , 10^3 , ... \equiv 1 \mod 3$ . Therefore, say $a = a_2 \cdot 10^2 + a_1 \cdot 10 + a_0 \equiv a_2 + a_1 + a_0 \mod 3$ . We have $3 | a$ if and only if $a \equiv 0 \mod 3$ .
While the finite domains $\mathbb{Z} / m \mathbb{Z}$ for a natural number $m$ complicate the computation of the inverse function of the trapdoor function, they actually facilitate the computation of the function itself given by the raising to a power:
RSA
algorithm, $x \mapsto x^E$, andGiven a base $b$ and an exponent $e$ in $\mathbb{Z}$ to calculate $b^e \text{ in } \mathbb{Z} / M \mathbb{Z},$
expand the exponent in binary base, that is, $e = e_0 + e_1 2 + e_2 2^2 + \cdots + e_s 2^s \quad \text{ with } e_0, e_1, ..., e_s \text{ in } \{ 0, 1 \},$
compute $b^1, b^2, b^{2^2}, ..., b^{2^s} \mod M.$
Because $b^{2^{n + 1}} = b^{2^n \cdot 2} = (b^{2^n})^2$, that is, each power is the square of the previous one (bounded by $M$), each power is, in turn, is easily computable.
raise to the power: $b^e = b^{e_0 + e_1 2 + e_2 2^2 + \cdots+ e_s 2^s} = b^{e_0} (b^2)^{e_1} (b^{2^2})^{e_2} \cdots (b^{2^s})^{e_s}$ Only powers $e_0, e_1, ..., e_s$ equal to $1$ matter, the others can be omitted.
This algorithm takes $2 \log_2(e)$ module multiplications.
To calculate $3^5$ module $7$, expand $5 = 1 + 0 \cdot 2^1 + 1 \cdot 2^2$ and calculate $3^1 = 3, 3^2 = 9 \equiv 2, 3^{2^2} = (3^2)^2 \equiv 2^2 = 4 \mod 7,$ yielding: $3^5 = 3^{1 + 2^2} = 3^1 \cdot 3^{2^2} = 3 \cdot 4 \equiv 5 \mod 7.$
To calculate $3^{11}$ module $5$, expand $11 = 1 + 1 \cdot 2^1 + 0 \cdot 2^2 + 1 \cdot 2^3$ and calculate $3^1 = 3, 3^2 = 9 \equiv 4, 3^{2^2} = (3^2)^2 \equiv 4^2 = 1 \text{ and } 3^{2^3} = 3^{2^2 \cdot 2} = (3^{2^2})^2 \equiv 1^2 = 1 \mod 5,$ yielding $3^{11} = 3^{1 + 2^1 + 2^3} = 3^1 \cdot 3^{2^1} \cdot 3^{2^3} = 3 \cdot 4 \cdot 1 = 12 \equiv 2 \mod 5.$
Asymmetric cryptography relies on a trapdoor function, which
RSA
), butRSA
) must be practically incomputable without knowledge of a shortcut, the key!This difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. To complicate the computation of the inverse function (besides facilitating the computation of the proper function) is done using modular (or circular) arithmetic*, that we already know from the arithmetic of the clock, where $m = 12$ is considered equal to $0$.
Read the section on asymmetric cryptography in the article Simmons & others (2016). Read in Menezes et al. (1997), Sections 2.4, 2.5 and 2.6 on the basic notions of number theory behind public key cryptography. Use Grau (2018d) to get an intuition for the graphs over finite domains.
See the book Sweigart (2013a) for implementing some simpler (asymmetric) algorithms in Python
, a readable beginner-friendly programming language.
Read the parts of the book Schneier (2007) on understanding and implementing modern asymmetric cryptographic algorithms.
On completion of this chapter, you will have learned the Diffie-Hellman key exchange using the (Discrete) Logarithm.
In 1976, Whitfield Diffie and Martin Hellman conceived that the key distribution problem could be solved by an algorithm that satisfied:
(computationally) easy creation of a matched pair of keys for encryption and decryption,
(computationally) easy encryption and decryption,
(computationally) infeasible recovery of one of the keys despite knowledge of:
(computationally) infeasible recovery of the plaintext for almost all keys $k$ and messages $x$ .
Observation: This was the first public appearance of two-key cryptography. However, the British Government Communications Headquarters (GCHQ) knew it around a decade earlier.
RSA
about three years before it was independently developed by Rivest, Shamir, and Adleman, andThe first published protocol to overtly agree on a mutual secret key is the Diffie-Hellman key exchange protocol published in Diffie & Hellman (1976).
This is not yet two-key cryptography, because the single secret key is known to both correspondents (in the following called Alice and Bob). The asymmetric cryptographic algorithms that build on this protocol (for example, ElGamal
and ECC
), generate a unique key for every message.
Diffie-Hellman Key exchange: overt agreement on a common secret key whose security relies on the infeasibility of computing the logarithm modulo a large number.
Notation. Let us denote, in every asymmetric encryption algorithm,
For both correspondent, say Alice and Bob, to overtly agree on a secret key, they first combine
Then
CrypTool 1
offers in the menu entry Individual Procedures -> Protocols
a dialogue to experiment with key values in the Diffie-Hellman
.
Observation. This protocol shows how to overtly build a shared secret key. This key can then be used to encrypt all further communication, for example, by an asymmetric algorithm such as AES
. However, the protocol shows
ElGamal (1985) showed first how to build an encryption and signature algorithm on top of the Diffie-Hellman protocol. While the encryption algorithm is rarely employed (albeit, for example, the standard cryptography command-line tool GnuPG
offers it as first alternative to RSA
), the signature algorithm lies behind that of the Digital Signature Algorithm (DSA), which is used in the US government’s Digital Signature Standard (DSS), issued in 1994 by the National Institute of Standards and Technology (NIST).
Elliptic Curve DSA (ECDSA) is a variant of the Digital Signature Algorithm (DSA) which uses points on finite (elliptic) curves instead of integers. The general number field sieve computes keys on DSA in subexponential time (whereas the ideal would be exponential time) . Elliptic curve groups are (yet) not vulnerable to a general number field sieve attack, so they can be (supposedly) securely implemented with smaller key sizes.
The security of the Diffie-Hellman key exchange is based on the difficulty of computing the logarithm modulo $p$ . An eavesdropper would obtain the secret key $A^b = B^a$ from $A$ and $B$, if he could compute the logarithm $\log_g$ as inverse of the exponentiation $x \mapsto g^x = y$, that is $a = \log_g A \quad \text{ or } b = \log_g B \, \mod \, p;$ While a power is easily computable (even more so using the fast power algorithm in Section 5.6), even more so in modular arithmetic, its inverse, the logarithm, the exponent for a given power, is practically incomputable for $p$ and $g$ appropriately chosen, that is:
the prime number $p$
the powers $g$ , $g^2$, … of the base $g$ generate a large set (that is, its cardinality is a multiple of $q$).
Let us look for such appropriate numbers:
Euclid’s Theorem. There are infinitely many prime numbers.
Demonstration: Otherwise, there are only finitely many prime numbers, say $p_1$, …, $p_n$ are all of them. Consider $q = p_1 ... p_n + 1$ . Since $q$ is greater than $p_1$, …, $p_n$, it cannot be prime. Let $p$ be a prime number that divides $q$ . Because $p_1$, …, $p_n$ are prime, $p$ divides at least one of $p_1$, …, $p_n$. However, by its definition, $q$ has remainder $1$ divided by every prime $p_1$, …, $p_n$. The last two statements are in contradiction! Therefore, there must be an infinite number of primes.
Euclid’s Theorem. There are infinitely many prime numbers.
Euclid’s Theorem guarantees that there are arbitrarily many big prime numbers (in particular, $>2048$ bits).
Thank Heavens, for almost every prime number $p$ there is a prime number $q$ large ($> 768$ bits) that splits $p-1$ .
The Theorem on the existence of a Primitive Root ensures that (since the modulus is prime) there is always a number $g$ in $\mathbb{F}_p^*$ such that $\{ g, g^2, g^3, ..., g^{p-1} \} = \mathbf{F}_p^*$ That is, every number $1$ , $2$ , $3$ , …, $p-1$ is a suitable power of $g$ . In particular, the cardinality of $1$ , $g$ , $g^2$, …, $g^{p-1}$ is a multiple of any prime $q$ that divides $p-1$ . In practice, the numbers $p$ and $g$ are taken from a reliable source, such as a standards committee.
Since initially (for $x < \log_g p$ ) the values $g^x$ over $\mathbb{Z} / p \mathbb{Z}$ equal to the values $g^x$ over $\mathbb{Z}$ , the secret numbers $a$ and $b$ should be large enough, that is, $> \log_g p$ . To ensure this, in practice these numbers are artificially increased, that is, the message is padded.
At present, the fastest algorithm to calculate the logarithm $x$ from $g^x$ , is an adaption of the general number field sieve, see Gordon (1993), that achieves subexponential runtime. That is, roughly, the number of operations to calculate the logarithm of an integer of $n$ bits is exponential in $n^{1/3}.$
The Diffie-Hellman Key exchange protocol shows how to build overtly a mutual secret key based on the difficulty of computing the logarithm modulo $p$ . This key can then be used to encrypt all further communication, for example, by an asymmetric algorithm such as AES
.
However, it shows
ElGamal (1985) showed first how to build an encryption and signature algorithm on top of the Diffie-Hellman protocol; in particular, it gave rise to the Digital Signature Algorithm, DSA.
Read in Menezes et al. (1997) Sections 3.1, 3.3 and try to understand as much as possible in 3.2 and 3.6 on the number theoretic problems behind public key cryptography.
Use CrypTool 1
to experiment with the Diffie-Hellman
protocol.
See the book Sweigart (2013a) for implementing some simpler (asymmetric) algorithms in Python
, a readable beginner-friendly programming language.
Read the parts of the book Schneier (2007) on understanding and implementing modern asymmetric cryptographic algorithms.
On completion of this chapter, you will have learned …
what a trapdoor function is, and:
We study multiplication in the finite rings $\mathbb{Z} /n \mathbb{Z}$. We take a particular interest in what numbers we can divide into them. It will be the Euclid Algorithm that computes the answer for us.
The private key
RSA
algorithm, orElGamal
algorithm (based on Diffie-Hellman key exchange),defines a function that is the inverse of the function defined by the public key. This inverse is computed via the greatest common divisor between the two numbers. This, in turn, is computed by Euclid’s Algorithm, an iterated division with rest.
We then introduce
Definition. A common divisor of two whole numbers $a$ and $b$ is a natural number that divides both $a$ and $b$. The greatest common divisor of two whole numbers $a$ and $b$ is the greatest natural number that divides both $a$ and $b$. Denote by $\mathrm{gcd} (a,b)$ the greatest common divisor of $a$ and $b$ , $\mathrm{gcd}(a,b) = \mathrm{the greatest natural number that divides} a \mathrm{and} b.$
Example. The greatest common divisor of $12$ and $18$ is $6$ .
Iterated Division with rest yields an efficient algorithm to calculate the largest common divisor, Euclid’s Algorithm.
Definition. The integers $a$ and $b$ are relatively prime if $\mathrm{gcd}(a,b) = 1$, that is, if no integer $> 1$ divides $a$ and $b$.
For all integer numbers $a$ and $b$, integer numbers $a/g$ and $b/g$ for $g = \mathrm{gcd}(a,b)$ are relatively prime.
Division with rest helps us build an efficient algorithm to calculate the largest common divisor. Let us look back on Division with Rest:
Definition. Be $a$ and $b$ positive integer numbers. That $a$ divided by $b$ has quotient $q$ and rest $r$ means $a = b \cdot q + r \quad \text{ with } 0 \leq r < b. \qquad(7.1)$
Example. For $a = 19$ and $b = 5$, we get $19 = 5 \cdot 3 + 4$. That is, the remainder of $19$ divided by $5$ is $4$.
A linear combination (or sum of multiples) of two whole numbers $a$ and $b$ is a sum $s = \lambda a + \mu b$ for whole numbers $\lambda$ and $\mu$.
Example. For $a = 15$ and $b = 9$, a sum of multiples of them is $s = 2 \cdot a + (-3) \cdot b = 2 \cdot 15 - 3 \cdot 9 = 3.$
In particular, looking at the division with remainder in Equation 7.1, for an entire number $d$, we observe:
That is, $d$ splits $a$ and $b$ if, and only if, $d$ splits $b$ and $r$. That is, the common dividers of $a$ and $b$ are the same as those of $b$ and $r$. In particular, $\mathrm{gcd}(a,b) = \mathrm{gcd}(b,r).$ By dividing the numbers $b$ and $r$ (which is $< b$), we obtain $b = r \cdot q' + r' \quad \text{ with } 0 \leq r' < r$ e $\mathrm{gcd}(b,r) = \mathrm{gcd}(r,r').$ Iterating, and thus diminishing the remainder, we arrive at $s = r^{'...'}$ and $r^{'...''}$ with $r^{'...''} = 0$, that is $\mathrm{gcd}(a,b) = ... = \mathrm{gcd}(s,0) = s.$
That is, the highest common divisor is the penultimate remainder (or the last one other than $0$).
Example. To calculate $\mathrm{gcd}(748, 528)$, we get
$748 = 528 \cdot 1 + 220$
$528 = 220 \cdot 2 + 88$
$220 = 88 \cdot 2 + 44$
$88 = 44 \cdot 2 + 0$
thus $\mathrm{gcd}(528, 220) = 44$.
CrypTool 1
, in the entry Indiv. Procedures -> Number Theory Interactive -> Learning Tool for Number Theory
, Section 1.3
, page 15, shows an animation of this algorithm:
Theorem. (Euclidean algorithm) Let $a$ and $b$ be positive whole numbers with $a \geq b$ . The following algorithm calculates $\mathrm{gcd} (a,b)$ in a finite number of steps:
(start) Put $r_0 = a$ and $r_1 = b$ , and $i = 1$ .
(division) Divide $r_{i-1}$ by $r_i$ with rest to get $r_{i-1} = r_i q_i + r_{i + 1} \quad \text{ with } 0 \leq r_{i + 1} < r_i.$ Then
Demonstration: We need to demonstrate that the algorithm ends with the highest common divisor of $a$ and $b$:
Like $r_0 > r_1 > ...$, finally $r_I = 0$ for $I$ large enough, and the algorithm ends.
For Equation 7.1, we have $\mathrm{gcd}(r_{i-1},r_i) = \mathrm{gcd}(r_i,r_{i + 1}) \quad \text{ for all } i = 1, 2, ...$ As ultimately $r_{I + 1} = 0$ for $I$ big enough, we have $\mathrm{gcd}(a,b) = \mathrm{gcd}(r_0,r_1) = ... = \mathrm{gcd}(r_i,r_{i + 1}). = \mathrm{gcd}(r_I,0) = r_I$ That is, $r_I = \mathrm{gcd}(a,b)$.
Observation: All it takes is $2 \cdot \log_2 b + 1$ divisions with rest for the algorithm to finish.
Demonstration: We demonstrate $r_{i + 2} < 1/2 \cdot r_i. \quad (\dagger)$ We have
In the latter case, it follows $r_i = r_{i + 1} \cdot 1 + r_{i + 2}$ and then $r_{i + 2} = r_i - r_{i + 1} < r_i - 1/2 \cdot r_i = 1/2 \cdot r_i.$ For ($\dagger$) we get iteratively that just $2 \cdot \log_2 b + 1$ divisions with rest for the algorithm finish.
In fact, it turns out that a factor of $1.45$ is enough $\log_2(b) + 1.68$ with rest, and on average $0.85$ is enough $\log_2(b) + 0.14$.
For the computation of the exponent of the decryption function, we need more information than the largest common divisor (calculated by the Euclid Algorithm). In fact, one observes (Euclid’s Extended Algorithm) that in each step of the Euclid’s Algorithm the largest common divisor $\mathrm{gcd} (x, m)$ of $x$ and $m$ is a linear combination (or sum of multiples) of $x$ and $m$ , that is, $\mathrm{gcd}(x, m) = \lambda x + \mu m \quad \text{ for integers } x \text{ and } m.$ The inverse of $x$ modulo $m$ is one of these multiples:
Example. For $a = 15$ and $b = 9$, a sum of multiples of them is $s = 2 \cdot a + (-3) \cdot b = 2 \cdot 15 - 3 \cdot 9 = 3.$
Theorem. (Euclid’s Extended Algorithm) For any positive integers $a$ and $b$ , their highest common divisor $\mathrm{gcd} (a,b)$ is a linear combination of $a$ and $b$ ; that is, there are integers $u$ and $v$ such that $\mathrm{gcd}(a,b) = au + bv.$
CrypTool 1, in the Indiv. Procedures -> Number Theory Interactive -> Learning Tool for Number Theory
, Section 1.3
, page 17, shows an animation of this algorithm:
Example. We have $\mathrm{gcd} (528, 220) = 44$ and, indeed, $44 = 5 \cdot 748 - 7 \cdot 528.$
Demonstration: As $r_0 = a$, $r_1 = b$, and $r_2 = r_0 - q_1 r_1$, it follows that $r_2$ is a linear combination of $a$ and $b$. In general, since $r_{i-1}$ and $r_i$ are linear combinations of $a$ and $b$, first $q_i r_i$ is a linear combination of $a$ and $b$, and so $r_{i + 1} = r_{i-1} - q_i r_i$ is a linear combination of $a$ and $b$. In particular, if $r_{I + 1} = 0$ then $r_I = \mathrm{gcd}(r_I, r_{I + 1}) = \mathrm{gcd}(a,b)$ is a linear combination of $a$ and $b$.
CrypTool 1, in the menu entry Indiv. Procedures -> Number Theory Interactive -> Learning Tool for Number Theory
, Section 1.3
, page 17, shows an animation of this algorithm:
Example. Let’s review the calculation of the largest common divisor of $a = 748$ and $b = 528$. Euclid’s algorithm did:
$748 = 528 \cdot 1 + 220$
$528 = 220 \cdot 2 + 88$
$220 = 88 \cdot 2 + \mathbf{44}$
$88 = 44 \cdot 2 + 0$,
which provides the linear combinations
$220 = 748 - 528 \cdot 1 = a - b$
$88 = 528 - 220 \cdot 2 = b - (a-b) \cdot 2 = 3b - 2a$
$\mathbf{44} = 220 - 88 \cdot 2 = (a-b) - (3b - 2a) \cdot 2 = 5a - 7b$.
Indeed, $44 = 5 \cdot 748 - 7 \cdot 528.$
Python
To implement the Euclid algorithm, we will use multiple assignment in Python
:
>>> spam, eggs = 42, 'Hello'
>>> spam
42
>>> eggs
Hello
The names of the variables and their values are listed to the left of $=$ respectively.
Here is a function that implements the Euclid algorithm in Python
; it returns the largest common divisor $\mathrm{gcd}(a,b)$ of two whole $a$ and $b$.
def gcd(a, b):
while a != 0:
= b % a, a
a, b return b
For example, in the interactive shell:
>>> gcd(24, 30)
6
>>> gcd(409119243, 87780243)
6837
The //
operator will figure in the implementation of the extended Euclid algorithm; it divides two numbers and rounds down. That is, it returns the greater integer equal to or less than the result of the division. For example, in the interactive shell:
>>> 41 // 7
5
>>> 41 / 7
5.857142857142857
>>> 10 // 5
2
>>> 10 / 5
2.0
We chose the following implementation of the extended Euclid algorithm:
def egcd(a, b):
= 0,1, 1,0
x,y, u,v while a != 0:
= b//a, b%a
q, r = x-u*q, y-v*q
m, n = a,r, u,v, m,n
b,a, x,y, u,v = b
gcd return gcd, x, y
The private key is calculated by the multiplicative reverse in the modular arithmetic from the public key, both in the RSA
algorithm and the ElGamal
algorithm.
We have just learned how to calculate the largest common divisor by the Euclid Extended Algorithm; we now learn how it is used to calculate this multiplicative reverse.
While in $\mathbb{Q}$ we can divide by any number (except $0$), in $\mathbb{Z}$ only by $\pm 1$! The numbers you can divide by are called invertible or units. The amount of invertible numbers in $\mathbb{Z} / m \mathbb{Z}$ depends on the $m$ module. Roughly speaking, the fewer prime factors in $m$, the more units in $\mathbb{Z} / m \mathbb{Z}$.
Definition. The element $x$ in $\mathbb{Z} / m \mathbb{Z}$ is a unit (or invertible) if there is $y$ in $\mathbb{Z} / m\mathbb{Z}$ such that $y x = 1$. The element $y$ is the inverse of $x$ and denoted by $x^{-1}$. The set of units (where we can multiply and divide) is denoted by $(\mathbb{Z} / m \mathbb{Z})^* := \text {the units in } \mathbb{Z} / m \mathbb{Z}.$ Euler’s totiente function # $\Phi$ is $m \mapsto \# \mathbb{Z} / m \mathbb{Z}^*;$ that is, given $m$, it counts how many units $\mathbb{Z} / m \mathbb{Z}$ has.
Examples:
On the clock, that is, for $\mathbb{Z} / 12 \mathbb{Z}$ the multiplication $v \cdot h$ of one hour $h$ by $v$ corresponds to iterate $v$ times the path taken by the indicator in $h$ (from $0 = 12$.) We note that for $h = 1, 5, 7$ and $11$ there is a iteration of the path that leads the indicator to $1$ (more exactly made $1$, $5$, $7$ and $11$ times), while for all other numbers this iteration leads the indicator to $0$. These possibilities are mutually exclusive, and we conclude $(\mathbb{Z} / 12 \mathbb{Z})^* = \{ 1, 5, 7, 11 \}.$ That’s $\Phi(12) = 11$.
On days of the week, that is, for $\mathbb{Z} /7 \mathbb{Z}$, we get $(\mathbb{Z} / 7 \mathbb{Z})^* = \{ 1, 2, 3, 4, 5, 6 \}.$ That is, the number of units is as large as possible, that is, $Phi(7) = 6$, all numbers except $0$.
For $\mathbb{Z} / 4 \mathbb{Z}$, the multiplication table
* | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
1 | 0 | 1 | 2 | 3 |
2 | 0 | 2 | 0 | 2 |
3 | 0 | 3 | 2 | 1 |
shows $(\mathbb{Z} / 4 \mathbb{Z})^* = \{ 1, 3 \}$ because $1 \cdot 1 = 1$ and $3 \cdot 3 = 1$ in $\mathbb{Z} / 4 \mathbb{Z}$. In contrast, $2 \cdot 2 = 0$ in $\mathbb{Z} / 4 \mathbb{Z}$, in particular $2$ is not a unit. (But a zero divisor; in fact, each element in $\mathbb{Z} / m \mathbb{Z}$ is either a unit, or a zero divider). Thus $\Phi(4) = 2$.
For $\mathbb{Z} / 5 \mathbb{Z}$, the multiplication table
* | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 1 | 2 | 3 | 4 |
2 | 0 | 2 | 4 | 1 | 3 |
3 | 0 | 3 | 1 | 4 | 2 |
4 | 0 | 4 | 3 | 2 | 1 |
reveals the units $(\mathbb{Z} / 5 \mathbb{Z})^* = \{ 1, 2, 3, 4 \}$.
Proposition. Let $\mathbb{N}$ and $x$ in $\mathbb{Z} / m \mathbb{Z}$, that is, $x$ in $\{ 0, 1, ..., m-1 \}$. The number $x$ is a unit in $\mathbb{Z} / m \mathbb{Z}$ if, and only if, $\mathrm{gcd}(x,m) = 1$.
Demonstration: We observe that each common divisor of $x$ and $m$ divides each sum of multiple $s = ux + vm$ of $x$ and $m$; in particular, if $s = 1$, then the largest common divisor of $\bar x$ and $m$ is $1$.
By the Extended Euclid Algorithm, there is $u$ and $v$ in $\mathbb{Z}$ such that $u x + v m = \mathrm{gcd}(x,m).$ So, from the above observation, $\mathrm{gcd}(x,m) = 1$ if, and only if, there is $u$ in $\mathbb{Z}$ such that $u x \equiv 1 \mod m.$ That is, $b x$ is a unit in $\mathbb{Z} / m \mathbb{Z}$ whose inverse is $b x^{-1} = b u$.
Observation. We concluded that for $x$ in $\{ 0, ..., m-1 \}$ with $\mathrm{gcd}(x,m) = 1$, we obtained by the Extended Euclid Algorithm $u$ and $v$ in $\mathbb{Z}$ such that $u x + v m = 1$ The reverse $x^{-1}$ of $x$ in $\mathbb{Z} /m \mathbb{Z}$ is given by the remainder of $u$ divided by $m$.
In Python
, we can then calculate the inverse of $a$ in $\mathbb{Z} / m \mathbb{Z}$ by
def ModInverse(a, m):
if gcd(a, m) != 1:
return None # no mod. inverse exists if a and m not rel. prime
else
= egcd(a,m)
gcd, x, y return x % m
What mathematical function is used to encrypt in RSA
? Raising to a power modulo a composed integer.*
What mathematical function is used to decrypt in RSA
? Taking a root modulo a composed integer.
What is a principal use of RSA
on the Internet today? The verification of certificates.
Asymmetric cryptography relies on a trapdoor function, which
RSA
), butRSA
) must be practically incomputable without knowledge of a shortcut, the key!This difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. To complicate the computation of the inverse function (besides facilitating the computation of the proper function) is done using modular (or circular) arithmetic*, that we already from the arithmetic of the clock, where $m = 12$ is considered equal to $0$ .
Read the section on asymmetric cryptography in the article Simmons & others (2016). Read in Menezes et al. (1997), Sections 3.1, 3.3 and try to understand as much as possible in 3.2 and 3.6 on the number theoretic problems behind public key cryptography
Use CrypTool 1
to experiment with Euclid’s algorithm.
See the book Sweigart (2013a) for implementing some simpler (asymmetric) algorithms in Python
, a readable beginner-friendly programming language.
Read the parts of the book Schneier (2007) on understanding and implementing modern asymmetric cryptographic algorithms.
On completion of this chapter, you will have learned …
what a trapdoor function is, and:
the most common asymmetric cryptographic algorithms and their underlying trapdoor functions:
RSA
Algorithm using the (discrete) Power Function.The best-known public-key algorithm is the Rivest-Shamir-Adleman (RSA) cryptoalgorithm from Rivest et al. (1978). A user secretly chooses a pair of prime numbers $p$ and $q$ so large that factoring the product $N = pq$ is beyond estimated computing powers for the lifetime of the cipher. The number $N$ will be the modulus, that is, our trapdoor function will live on $\mathbb{Z} / N \mathbb{Z} = \{ 0, 1, ..., N-1 \}$.
RSA: algorithm that encrypts by raising to a power and whose security relies on the computational infeasibility of factoring a product of prime numbers.
$N$ is public, but $p$ and $q$ are not. If the factors $p$ and $q$ of $N$ were known, then the secret key can be easily computed. For RSA
to be secure, the factoring must be computationally infeasible; nowadays 2048
bits. The difficulty of factoring roughly doubles for each additional three digits in $N$ .
Having chosen $p$ and $q$, the user selects any integer e less than n and relatively prime to $p - 1$ and $q - 1$, that is, so that $1$ is the only factor in common between e and the product $(p - 1)(q - 1)$. This assures that there is another number d for which the product ed will leave a remainder of $1$ when divided by the least common multiple of $p - 1$ and $q - 1$. With knowledge of $p$ and $q$, the number d can easily be calculated using the Euclidean algorithm. If one does not know $p$ and $q$, it is equally difficult to find either $e$ or $d$ given the other as to factor $n$, which is the basis for the cryptosecurity of the RSA algorithm.
The RSA
algorithm creates
Compared to the Diffie-Hellman
protocol, it has the advantage that it is completely asymmetric: there is no need to share a mutual secret key (and therefore the secret key is kept in a single place only, but not two). Instead a single correspondent has access to the secret key. However, in this case the communication is encrypted only towards the owner of the secret key. To encrypt in both directions,
RSA
key,The keys for encrypting, $E$ and decryption, $d$ , will be constructed via Euler’s Formula, which in turn is based on Fermat’s Little Theorem.
Fermat’s Little Theorem. * If $p$ is a prime number, then for any integer $a$ ,
In particular, for every integer $a$ , $a^p = a^{(p-1)+1} = a^{p-1} a = a.$
For example, if $m = Ed$ , that is, $E$ and $d$ are such that $Ed \equiv 1 \mod p-1$ (so to speak, $d \equiv 1/E \mod p-1$ ) then $a^{Ed} = (a^E)^d \equiv a \, \mod \, p,$ that is, $a^{d} = a^{1/E} = \sqrt[E]{a}!$ That is, the computation of the $E$ -th root $\sqrt[E] \cdot$ is equal to that of the $d$ -th power $\cdot^d$ , a great computational shortcut! The existence of such a shortcut $d$ given $E$ is assured by Euler’s formula:
Theorem. (Euler’s formula) Let $p$ and $q$ be different prime numbers. If $a$ is divisible by neither $p$ nor $q$ , then $a^{(p-1)(q-1)} \equiv 1 \, \mod \, pq.$
Proof: By Fermat’s Little Theorem, $a^{(p-1)(q-1)} = (a^{p-1})^{q-1} \equiv 1^{q-1} = 1 \, \mod \, p$ and $a^{(q-1)(p-1)} = (a^{q-1})^{p-1} \equiv 1^{p-1} = 1 \, \mod \, q$ that is, $p$ and $q$ divide $a^{(p-1)(q-1)} - 1$ . Since $p$ and $q$ are different prime numbers, $pq$ divides $a^{(p-1)(q-1) - 1}$ , that is, $a^{(p-1)(q-1)} \equiv 1 \mod pq$ .
Corollary. (Taking roots modulo N) Let $p$ and $q$ be different prime numbers, $N = pq$ and $\phi(N) = (p-1)(q-1)$ . For every exponent $n$ such that $n \equiv 1 \, \mod \, \phi(N)$ we have $a^n \equiv a \, \mod \, N \quad \text{ for every integer } a.$
Demonstration: If $p$ or $q$ splits $a$ , then $a pq$ equiv 0. Because $n \equiv 1 \mod (p-1)(q-1)$ , that is, there is $\nu$ such that $n-1 = \nu\mod (p-1)(q-1)$ , by Euler’s Formula, $a^n = a^{\nu (p-1)(q-1) + 1} = (a^{(p-1)(q-1)})^\nu \cdot a \equiv 1^\nu \cdot a = a \, \mod \, N.$
Observation (crucial for the RSA
algorithm). If $m \equiv 1 \mod \phi(N)$ , then by Euler’s Formula $a^m \equiv a \mod N$ , that is, taking to the power is the identity function, $\cdot^m
\equiv
\mathrm{id}
\, \mod \, N.$ In particular, if $m = Ed$ is the product of two whole numbers $E$ and $d$ , that is, $Ed \equiv 1 \, \mod \, \phi(N),$ then $a = a^m = a^{Ed} = (a^E)^d.$ That is, $\cdot^d = \cdot^{1/E} \mod N$ . Calculating a power is much easier than a root!
Example. If $p = 3$ and $q = 11$ then $N = pq = 33 \quad \text{ and } \quad \phi(N) = (p-1)(q-1) = 20.$ If $E = 7$ and $d = 3$ , then $n = Ed = 21 \equiv 1 \mod 20$ . For example, for base $2$ , we check $2^E = 2^7 = 128 = 29 + 3 \cdot 33 \equiv 29 \, \mod \, N$ and $29^d = 29^3 = (-4)^3 \equiv -64 = 2 - 2 \cdot 33 \equiv 2 \, \mod \, N.$ That is, $\sqrt[E]{29} = 2 = 29^d \, \mod \, N.$
(Recall that an upper case letter denotes a public number (and vice-versa), whereas a lower case letter denotes a secret number.) For Alice to secretly send the message $m$ to Bob through an insecure channel:
Bob, to generate the key, chooses
Bob, to transmit the key, sends to Alice
Alice, to cipher,
Bob, to decipher,
Computing $d$ by knowing both prime factors of $N$ is Bob’s shortcut. CrypTool 1
offers in the menu Individual Procedures -> RSA Cryptosystem
the entry RSA Demonstration
to experiment with different values of the key and message in RSA
.
In sum, raising to the power $y = x^E \mod N$ encrypts where the exponent $E$ is the public key. Correspondingly, its inverse, taking the $E$-th root $x = y^{1/E} \mod N$ , decrypts. It is practically incomputable. However, modulo $N$ , by Euler’s formula, there is $d$ such that $y^{1 / E} = y^d \, \mod \, N$ for a number $d$ that Euclid’s Algorithm calculates from $E$ as well as $p$ and $q$ . Therefore, the secret key is $d$ , or, sufficiently, the knowledge of the prime factors $p$ and $q$ of $N$ .
Since
are all public, the computational security of RSA
is solely based on the difficulty of finding a root modulo a large number $m
\equiv
\sqrt[E]{M}
\; (=
M^{1/E})
\, \mod \, N.$
An eavesdropper would obtain the secret message $m$ from $N$ , $E$ and $M$ only if he could compute $m \nequiv \sqrt[E]{M} \; (= M^{1/E}) \, \mod \, N.$
The shortcut is the knowledge of the two prime factors $p$ and $q$ of $N = pq$ that makes it possible to calculate
so that, by the Euler Formula, $M^d = m^d \equiv m \mod N$ .
Euclid’s Theorem guarantees that there are arbitrarily large prime numbers (in particular, $> 15360$ bits). While taking powers is computationally easy, taking roots is computationally hard for suitable choices of $N = pq$ and $E$ , that is, for large enough prime numbers $p$ and $q$ (while the choice of the exponent $E$ is free; for example, $E = 2$ ):
The fastest algorithm to calculate the prime factors $p$ and $q$ from $N$ is the general number sieve, see A. K. Lenstra et al. (1993). The number of operations to factor an integer number of $n$ bits is roughly $\exp(\log n^{1/3}).$ Therefore, according to Barker (2016), the National Institute for Standards and Technology (NIST
)
In practice, the plaintext number $m$ needs to be padded, that is, the number $m$ randomly increased. Otherwise, when the plaintext number $m$ and the exponent $E$ are both small (for example, $E = 3$ ), then possibly the enciphered message $M = m^E$ satisfies $M < N$ . In this case, the equality $m = \sqrt[E]{M} \quad \text{ holds in } \mathbb{Z} !$ So $m$ is easily computable, for example, by the Bisection method already presented, (or, numerically more efficient, by Newton’s method).
A simple hypothetical attack is that by Wiener when
If the conditions of the theorem are met, then the secret $d$ can be computed by a linear time algorithm as the denominator of a continuous fraction.
Observation. If $E$ (but not $d$ ) is too small, then an attack is much more difficult; see Boneh & others (1999).
RSA
is still a standard of many asymmetric cryptographic methods. One principal use of RSA
nowadays lies in the verification of (older) certificates emitted by certificate authorities. See Section 13.
Other uses are that by GPG
for asymmetric cryptography, such as the OpenPGP
protocol to encrypt e-mail messages, which creates RSA
keys by default:
The command line program GPG
creates keys and makes it possible to (de)encrypt and sign/authenticate with them. Other (graphical) applications, for example, the Enigmail
extension for the free e-mail client Thunderbird
, use it for all these cryptographic operations.
RSA
? Raising to a power modulo a composed integer.RSA
? Taking a root modulo a composed integer.RSA
on the Internet today? The verification of certificates.Let us recall that in public-key cryptography there are two keys, a public key and a private key. Usually:
Thus, a text can be transferred from the encipherer (Alice) to one person only, the decipherer (Bob). The roles of the public and private keys can be reversed:
Thus, the encipherer can prove to all decipherers (those who have the public key) his ownership of the private key; the digital signature.
digital signature: encryption of a message by the private key followed by decryption by the public key to check whether the original message was encrypted by the private key.
The theory (meaning the mathematics) behind the encryption by the public key (digital messages) or private key (digital signature) is almost the same; only the roles of the trap function arguments are reversed. (For example, in the RSA
algorithm, this exchange of variables is indeed all that happens). In practice, however, usually encrypted by the private key are:
That is, while
In the RSA Signature Algorithm, to sign (instead of encrypt), the only difference is that the exponents $E$ and $d$ exchange their roles. That is, the signed message is $M = m^d$ (instead of $m^E$ ). For Samantha to sign the $m$ message and Victor to verify it,
Samantha, to generate a signet, chooses
Samantha, to transmit the signet, sends to Victor
Samantha, to sign,
Victor, to verify, calculates $M^E = m^{Ed} \equiv m \mod N$ (which holds by Euler’s Formula).
Observation. Signing and the deciphering are both given by $\cdot^d$ for the private key $d$ . So, signing an encrypted document (for the public key $E$ that corresponds to $d$ ) is equivalent to deciphering it! Therefore, in practice,
different key pairs are used
a cryptographic hash $h(d)$ of the document $d$ is signed, a small number that identifies the document.
CrypTool 1
offers in the menu Individual Procedures -> RSA Cryptosystem
the entry Signature Generation
to experiment with different values of the signature and the message.
We note that instead of the original message, it signs:
a cryptographic hash (for example, by the algorithm MD5
) of the original message, and
with additional information, such as
ElGamal (1985) showed first how to build an encryption and signature algorithm on top of the Diffie-Hellman protocol. While the encryption algorithm is rarely employed (albeit, for example, the standard cryptography command-line tool GnuPG
offers it as a first alternative to RSA
), the signature algorithm is behind that of the Digital Signature Algorithm (DSA), which is used in the US government’s Digital Signature Standard (DSS), issued in 1994 by the National Institute of Standards and Technology (NIST). Elliptic Curve DSA (ECDSA) is a variant of the Digital Signature Algorithm (DSA) which uses points on finite (elliptic) curves instead of integers.
Single-key (or symmetric) cryptography suffers from the key distribution problem: to pass the same secret key to all, often distant, correspondents. Two-key (or asymmetric) cryptography solves this problem seemingly at once, by enabling the use of different keys to encrypt and decrypt. However, the identity of the key owner must be confirmed; if not personally, then by third parties, that is, identities with private keys that confirm by their digital signatures that it is Alice who owns the private key. However, the problem of the public key identity arises again: How can we ensure the identities of these private key owners? There are two solutions: In the approach via hierarchical authorities, private key owners are distinguished by hierarchical levels. At the highest level lie the root authorities on which one trusts unconditionally. In the web of trust, trust is transferred from one to the other, that is, trust is transitive: If Alice trusts Bob, and Bob trusts Charles, then Alice trusts Charles.
Asymmetric cryptography relies on a trapdoor function, which
RSA
), butRSA
) must be practically incomputable without knowledge of a shortcut, the key!This difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. To complicate the computation of the inverse function (besides facilitating the computation of the proper function) is done using modular (or circular) arithmetic*, that we already from the arithmetic of the clock, where $m = 12$ is considered equal to $0$ .
The Diffie-Hellman Key exchange protocol shows how to build overtly a mutual secret key based on the difficulty of computing the logarithm modulo $p$ . This key can then be used to encrypt all further communication, for example, by an asymmetric algorithm such as AES
.
However, it shows
ElGamal (1985) showed first how to build an encryption and signature algorithm on top of the Diffie-Hellman protocol; in particular, it gave rise to the Digital Signature Algorithm, DSA.
The best-known public-key algorithm is the Rivest-Shamir-Adleman (RSA
) cryptoalgorithm. A user secretly chooses a pair of prime numbers $p$ and $q$ so large that factoring the product $N = pq$ is beyond estimated computing powers during the lifetime of the cipher; The number $N$ will be the modulus, that is, our trapdoor function will be defined on $\mathbb{Z} / N \mathbb{Z} = \{ 0, 1, ..., N-1 \}$ .
$N$ is public, but $p$ and $q$ are not. If the factors $p$ and $q$ of $N$ were known, then the secret key can be easily computed. For $RSA$ to be secure, the factoring must be computationally infeasible; nowadays 2048
bits. The difficulty of factoring roughly doubles for each additional three digits in $N$ .
To sign (instead of encrypt), the only difference is that the exponents $E$ and $d$ exchange their roles. That is, the signed message is $M = m^d$ (instead of $m^E$ ):
The trapdoor function of of RSA
is raising to a power, $m \mapsto M = M^E$ for a message $m$ , and its computational security relies upon the difficulty of finding a root modulo a large number $m
\equiv
\sqrt[E]{M}
\; (=
M^{1/E})
\, \mod \, N.$ The shortcut is the knowledge of the two prime factors $p$ and $q$ of $N = pq$ that makes it possible to calculate the inverse multiplicative $d = 1/E$ of $E$ , that is, $d$ such that $Ed \equiv 1 \, \mod \, (p-1)(q-1).$ Then $\sqrt[E]{M} \equiv M^d \, \mod \, N;$ computing a power is a lot faster than a root.
What is the inverse of the trapdoor function used in RSA?
What is the fastest known algorithm to attack RSA
?
What is the minimum key size of RSA to be currently considered secure, for example, by the NIST?
Read the section on asymmetric cryptography in the article Simmons & others (2016). Read in Menezes et al. (1997),
Use Grau (2018d) to get an intuition for the graphs over finite domains.
Read Chapter 10 on RSA
.
Use CrypTool 1
to experiment with RSA
.
See the book Sweigart (2013a) for implementing some simpler (asymmetric) algorithms in Python
, a readable beginner-friendly programming language.
Read the parts of the book Schneier (2007) on understanding and implementing modern asymmetric cryptographic algorithms.
Let us recall Euclid’s Theorem which asserts that there are prime numbers arbitrarily large (for example, with $\geq 2048$ binary digits for the RSA
):
Euclides’ theorem. There is an infinitude of prime numbers.
Demonstration: Let’s suppose otherwise, that there’s only a finite number $p_1, ..., p_n$ of prime numbers. Consider $q = p_1 ... p_n + 1.$ Since $q$ is greater than $p_1$, …, $p_n$, not prime. So be $p$ a prime number that shares $q$. On the off chance, $p_1$, …, $p_n$. But by definition $q$ has $1$ left over from any $p_1$, …, $p_n$.
Contradiction! So there’s no greater prime number. $\quad$ q.e.d.
Marin Mersenne (Oizé, 1588 — Paris, 1648) was a French minimum priest who tried to find, without success, a formula for all prime numbers. Motivated by a letter from Fermat in which he suggested that all the numbers $2^{2^p}+1$, Fermat’s Numbers be primes, Mersenne studied the numbers $2^p-1 \quad \text{ for $p$ prime }.$ In 1644 he published the work Cogita physico-mathematica which states that these are primes to $p = 2,3,5,7,13,17,19,31 \text{ and } 127.$ (and mistakenly included $p = 63$ and $p = 257$). Only one computer could show at $1932$ that $2^{257} - 1$ is composed.
Mersenne’s prime numbers, in the form of $2^p-1$ to $p$ prime, are known to be
$2$, $3$, $5$, $7$, $13$, $17$, $19$, $31$, $61$, $89$, $107$, $127$, $521$, $607$, $1279$, $2203$, $2281$, $3217$, $4253$, $4423$, $9689$, $9941$, $11213$, $19937$, $21701$, $23209$, $44497$, $86243$, $110503$, $132049$, $216091$, $756839$, $859433$, $1257787$, $1398269$, $2976221$, $3021377$, $6972593$, $13466917$, $20996011$, $24036583$, $25964951$, $30402457$, $32582657$, $37156667$, $42643801$, $43112609$, $57885161$, $74207281$, $77232917$ e $82589933$
The prime number $2^{82\,589\,933} - 1$ has $24,862,048$ digits. It was found on December 8,2018 and is to this day the most known prime number.
The “CrypTool 1,” in the menu entry "Indiv. Procedures -> Number Theory Interactive -> Compute Mersenne Numbers’ allows you to calculate some prime numbers of Mersenne.
A quick test if the natural number $n$ is compound is the Small Fermat Theorem (formulated as its contraposition): If there is a natural number $a$ such that $a^n \nequiv a \mod n$ then $n$ is compound.
But the reverse implication doesn’t hold: There are $n$ numbers (which are called Carmichael numbers) that are compound but for every natural number $a$, $a^n \equiv a \mod n.$ The lowest such number $n$ is $561$ (which is divisible by $3$).
The simplest algorithm to verify whether a number is prime or not is the Eratosthenes sieve (285 – 194 B.C.).
To illustrate this, let’s determine the prime numbers between $1$ and $30$.
Initially, we’ll determine the largest number by what we’ll divide to see if the number is composed; is the square root of the upper coordinate rounded down. In this case, the $30$ root, rounded down, is $5$.
The next number on the list is prime. Repeat the procedure:
As initially determined, $5$ is the last number we divide by. The final list $2$, $3$, $5$, $7$, $11$, $13$, $17$, $19$, $23$, $29$ contains only prime numbers.
Here’s an implementation in Python
:
def primeSieve(sieveSize):
# Returns a list of prime numbers calculated using
# the Sieve of Eratosthenes algorithm.
= [True] * sieveSize
sieve 0] = False # zero and one are not prime numbers
sieve[1] = False
sieve[# create the sieve
for i in range(2, int(math.sqrt(sieveSize)) + 1):
= i * 2
pointer while pointer < sieveSize:
= False
sieve[pointer] += i
pointer # compile the list of primes
= []
primes for i in range(sieveSize):
if sieve[i] == True:
primes.append(i)return primes
The test of AKS
determines in polynomial time whether $n$ is compound or prime (more exactly, in time $\textit{O}(d)^6$ where $d$ = the number of digits $d$ [binary] of $n$). In practice, the Miller-Rabin test is usually enough to guarantee much more witnesses (= $a$ numbers that prove whether $n$ is composed or not) than Fermat’s Little Theorem.
In fact, when we compare the duration between the two algorithms to check if a number is prime on a computer with a 2GHz Intel Pentium-4
processor, we get
prime number | Miller-Rabin | AKS |
---|---|---|
$7309$ | $0.01$ | $12.34$ |
$9004097$ | $0.01$ | $23:22.55$ |
$2^31-1$ | $0.01$ | $6:03:14.36$ |
The CrypTool 1
offers in the Individual Procedures -> RSA
Menu an entry to experiment with different algorithms to detect prime numbers.
Miller-Rabin
The simplistic tests, to know if a $n$ number is prime or not, are inefficient because they calculate the $n$ factors. Instead of them, to know only if it is prime or not, there is the Miller-Rabin Test. After your demonstration, we’ll give you your opposition; it’s in this formulation that it’s applied.
The Miller-Rabin Test. Be $p > 2$ a prime number, be $n-1 = 2^k q$ for numbers $k$ and $q$ (with $q$ odd). So, for any whole number $a$ indivisible for $p$ is worth
Demonstration: By the Little Theorem of Fermat $a^{p-1} = (a^d)^{2^k} \equiv 1 \mod p$ By iteratively extracting the square root, we obtain
If for an odd number, a possibly prime number, we write $n-1 = 2^k q$, then by Fermat’s Test $n$ is not prime if there is a whole $a$ such that $a^{2^k q} \equiv 1 \mod n$. The Miller-Rabin Test explains the condition $a^{q 2^k} = (a^q)^{2^k} = ((a^q)^2) \cdots)^2 \nequiv 1$:
The Miller-Rabin Test. (Contraposition) It’s $n$ odd and $n-1 = 2^k q$ for numbers $k$ and $q$ (with $q$ odd). An integer $a$ relatively prime to $n$ is a Miller-Rabin Witness (for divisibility) of $n$, if
Question: What are the chances that we declare by the Miller-Rabin Test accidentally a prime number, that is, a number that is actually compound?
Theorem. (About the frequency of witnesses) Be $n$ odd and compound. So at least $75$ of the numbers in $1$, …, $n-1$ are Miller-Rabin Witnesses for $n$.
So, already after $5$ attempts $a_1$, $a_2$, …, $a_5$ without witness we know with a chance $1/4^5 = 1/1024 < 0.1 \%$, that the number is prime!
Let’s implement
# Primality Testing with the Rabin-Miller Algorithm
# http://inventwithpython.com/hacking (BSD Licensed)
import
random
def rabinMiller(num):
# Returns True if num is a prime number.
= num - 1
s = 0
t while s % 2 == 0:
# keep halving s while it is even (and use t
# to count how many times we halve s)
= s // 2
s += 1
t
for trials in range(5): # try to falsify num's primality 5 teams
= random.randrange(2, num - 1)
a = pow(a, s, num)
v if v != 1: # this test does not apply if v is 1.
= 0
i while v != (num - 1):
if i == t - 1:
return False
else:
= i + 1
i = (v ** 2) % in a
v return True
def isPrime(num):
# Return True if num is a prime number. This function does a
# quicker prime number check before calling rabinMiller().
if (num < 2):
return False # 0, 1, and negative numbers are not prime
# About 1/3 of the time we can quickly determine if num is not
# prime by dividing by the first few dozen prime numbers.
# This is quicker # than rabinMiller(), but unlike rabinMiller()
# is not guaranteed to prove that a number is prime.
= [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
lowPrimes 47, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,
109, 113, 127, 131, 137, 149, 151, 157, 163, 167, 173, 179, 181,
191, 193, 197, 199, 211, 223, 227, 233, 239, 241, 251, 257, 263,
269, 271, 277, 281, 283, 293, 307, 311, 313, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 431, 433,
439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509,
523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607,
613, 617, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691,
701, 709, 719, 727, 739, 743, 751, 757, 761, 769, 773, 787, 797,
809, 811, 821, 823, 827, 829, 853, 857, 859, 863, 877, 881, 883,
887, 907, 911, 919, 929, 937, 941, 947, 967, 971, 977, 983, 991,
997]
if in a lowPrimes:
return True
# See if any of the low prime numbers can divide into one
for prime in lowPrimes:
if (num % prime == 0):
return False
# If all else fails, call rabinMiller() to check if num is a prime.
return rabinMiller(num)
def generateLargePrime(keysize = 1024):
# Return a random prime number of keysize bits in size.
while True:
= random.randrange(2**(keysize-1), 2**(keysize))
num if isPrime(num):
return num
Let’s observe for modules that are not primes, that is, a product of primes factors, that the difficulty increases linearly in the number of factors, unlike it increases exponentially in the number of bits of each factor:
If the $m = pq$ module is the product of two factors $p$ and $q$ without common factor, then the modular logarithm $\log_g \mod m$ can be computed, by the Chinese Theorem of the Remains, by the logarithms $\log_g \mod p \quad \text{ and } \quad \log_g \mod q$ More exactly, there are integers $a$ and $b$, computed (in linear time in the number of $p$ and $q$ bits) by the Euclid Algorithm (extended), such that $ap + bq = 1$ and $\log_g \mod m = a (\log_g \mod p) + b (\log_g \mod q).$
If the modulus $m = p^e$ is a power of a prime $p$, then Bach (1984) shows how the modular logarithm module $m$ for a $g$ base $\log_g \colon \mathbb{Z} / m \mathbb{Z}^* \to \mathbb{Z} / \phi(m) \mathbb{Z}$ can be computed in polynomial time from the $p$ module logarithm. Let’s expose the steps to a prime number $p > 2$:
Let us recall the Section 7.2 about the existence of the primitive root for every module for which $\mathbb{Z} / p^e \mathbb{Z}^*$ is cyclical in order $(p-1)p^{e-1}$. Therefore, there is a multiplicative application $\mathbb{Z} / p^e \mathbb{Z}^* \to \mu_{p-1} \times U_1$ given by $x \mapsto x^{p^{e-1}}, x / x^{p^{e-1}} \quad (*)$ where $\mu_{p-1} = \{ \zeta \in \mathbb{Z} / p^e \mathbb{Z}^* : \zeta^{p-1} = 1 \}$ denote the group of $(p-1)$-highest roots of the unit and $U_1 = 1 + p \mathbb{Z} / p^e \mathbb{Z}$ The unitary units.
We have the isomorphism $\mu_{p-1} \to \mathbb{F}_p^*$ given for $x \mapsto x \mod p$ and its inverse for $X \mapsto X^{p^{e-1}}$ for every $X$ in $\mathbb{Z} / p^e \mathbb{Z}$ such that $X \equiv x \mod p$. (Note that the restriction of homomorphism $\mathbb{Z} / p^e \mathbb{Z}^* \to U_1$ given by $x^{1 - p^{e-1}}$ to $U_1$ is the identity because the order of $U_1$ is $p^{e-1}$).
We have the logarithm to the $g$ base $\log_g \colon \mathbb{F}_p^* \to \mathbb{Z} / (p-1) \mathbb{Z}$ and we have the natural logarithm $\log \colon U_1 \to p (\mathbb{Z} / p^e \mathbb{Z})$ which is calculated in polynomial time by the formula $u \mapsto [x^{p^e} - 1]/p^e; \qquad(9.1)$ and which provides the logarithm $\log_g \colon U_1 \to p (\mathbb{Z} / p^e \mathbb{Z})$ for the base $g$ by the scaling $\log_g = \log \cdot / \log g.$
By the Chinese Remainder Theorem, we have the isomorphism $\mathbb{Z} / (p-1) \mathbb{Z} \times \mathbb{Z} / p^{e-1} \mathbb{Z} \to \mathbb{Z} / (p-1) p^{e-1} \mathbb{Z}$ given by the product and its inverse given by $y \mapsto (a y \mod p, b y \mod p^{e-1})$ where $a$ and $b$ satisfy $a (p-1) + b (p^{e-1}) = 1$ and were obtained by the Euclid Algorithm (extended).
We conclude that, given
the value of $\log_g(y)$ of $\log_g \colon (\mathbb{Z} / p^e \mathbb{Z})^* \to \mathbb{Z} / (p-1)p^{e-1} \mathbb{Z}$ is computed in polynomial time.
Observation. To facilitate computing, instead of projection $\mathbb{Z} / p^e \mathbb{Z}^* \to U_1$ given in $(*)$ for $x \mapsto x^{1 - p^{e-1}}$, it’s faster to use that given in $(*)$ by $\pi \colon x \mapsto x^{p-1}$. However, its $U_1$ restriction is not identity. So you need to use it instead of $\log g \colon U_1 \to p \mathbb{Z} / p^e \mathbb{Z}$ the scaled logarithm $(p-1)^{-1} \log_g$ in order to obtain $\log_g = (p-1)^{-1} \log_g \circ \pi = (\log(g) p-1)^{-1} \log \circ \pi \colon U_1 \to p \mathbb{Z} / p^e \mathbb{Z}.$
Let’s explain Equation 9.1 which defines the logarithm $\log \colon U_1 \to p (\mathbb{Z} / p^e \mathbb{Z})$: let us remember the definition of the exponential on $\mathbb{R}$ for compound interest $\exp(x) = \lim \left(1+\frac{1}{n}\right)^n,$ which leads to the definition of the inverse function $\log(x) = \lim n (x^{1 / n} - 1) = x^\epsilon - 1$ where $\epsilon = 1 / n \to 0$.
Now, at $\mathbb{Z} / p^e \mathbb{Z}$, we have $1$, $p$, $p$, $2$, …, $p = 0$, that is, $p^n$, which may motivate the idea of considering $p$ as small. So, the good analog about $U_1$ is $\log(x) = \lim \frac{1}{p^{e-1}}(x^{p^{e-1}} - 1).$ In fact, over $U_1$, $\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - ...$ is a well defined value in $p \mathbb{Z} / p^e \mathbb{Z}$, because if $p$ divides $x$, then no denominator of the fraction cut is divisible by $p$ and all indivisible numbers by $p$ are invertible by $\mathbb{Z} / p^e \mathbb{Z}$. Likewise, over $p \mathbb{Z} / p^e \mathbb{Z}$, $\exp(x) = \sum_{n \geq 0} \frac{x^n}{n!}$ is a well defined value at $1 + p \mathbb{Z} / p^e \mathbb{Z}$, because if $p$ divides $x$, then no cut denominator is divisible by $p$ and all indivisible numbers by $p$ are invertible by $\mathbb{Z} / p^e \mathbb{Z}$.
Of particular interest is the base $e^p$ of the natural logarithm at $1 + p \mathbb{Z} / p^e \mathbb{Z}$, that is, the argument $y$ such that $\log y = 1$. For example, for $p = 7$ and $e = 4$, we calculate $\exp(p) = \sum_{n \geq 0} \frac{p^n}{n!} = 1 + p + \frac{p^2}{2} + \frac{p^3}{3!} = 1 + 7 \cdot 127 = 1 + 1 \cdot 7 + 4 \cdot 7^2 + 2 \cdot 7^3.$
Asymmetric cryptography relies on a trapdoor function, which
RSA
), butRSA
) must be practically incomputable without knowledge of a shortcut, the key!This difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. To complicate the computation of the inverse function (besides facilitating the computation of the proper function) is done using modular (or circular) arithmetic*, that we already from the arithmetic of the clock, where $m = 12$ is considered equal to $0$ .
Use CrypTool 1
to experiment with various prime detection algorithms.
On completion of this chapter, you will have learned …
Denote $\mathbb{F}_p := \mathbb{Z} / p \mathbb{Z}$ . Among all curves, the clou of the elliptic curves (given by an equation $y^2 = x^3 + ax + b$ ) is that one can add points on them: $p + q + r = 0$ if a line passes for $p, q$ and $r$ . By restricting the solutions to points $(x, y)$ in $(\mathbb{F}_p \times\mathbb{F}_p$ for a large prime number $p$ and fixing a point $P$ on the curve,
Diffie-Hellman over Elliptic Curves: (an analog of) the Diffie-Hellman protocol, in which iterated multiplication of a number modulo $p$ is replaced by iterated addition of a point on a finite elliptic curve.
The Diffie-Hellman
protocol (over $\mathbb{F}_p$) has an analog over Elliptic Curves:
The advantage of using
is that depending on the number of bits $n$ of $p$ (regarding the fastest presently known algorithms):
For example, the security obtained by a 2048
bits key for the multiplicative logarithm equals approximately that of a 224
bits key for the logarithm over an elliptic curve. To a length of 512
bits of a key for an elliptic curve, corresponds a length of 15360
bits of an RSA
key.
In the next sections, we:
introduce these general finite fields (because common finite elliptic curves are defined over more general finite fields than those of the form $\mathbb{F}_p = \mathbb{Z} / p \mathbb{Z}$ for a prime number $p$ that we know so far).
introduce elliptic curves.
study the addition of points of an elliptic curve.
present the Diffie-Hellman Key Exchange over elliptic curves, and
look at the cryptographic problem behind the curves and the algorithms that solve it.
We realized that we already use the modular arithmetic in everyday life, for example for the modulus $m = 12$ , the arithmetic of the clock, and for $m = 7$ , the days of the week. More generally, we defined, for any integer $m$ the finite ring $\mathbb{Z} /m \mathbb{Z}$ (= a finite set where we can add and multiply), roughly,
If $m = p$ is prime, then it can be shown that $\mathbb{Z} / p \mathbb{Z}$ is a field denoted by $\mathbb{F}_p$: for every $a$ in $\mathbb{F}_p$, except $0$ , there is always $a^{-1}$ in $\mathbb{F}_p$, the inverse multiplicative of $a$ defined by satisfying $a a^{-1} = 1$ . In other words, in a field we can divide by every number except $0$ . (The most common examples are the infinite fields $\mathbb{Q}$ and $\mathbb{R}$ .)
$\mathbb{F}_q$ for $q = p^n$: A ring of polynomials with coefficients in $\mathbb{F}_p$ of degree $n$
In cryptography, elliptic curves are defined over fields $\mathbb{F}_q$ whose cardinality is a power $q = p^n$ of a prime number $p$ (and not just $\mathbb{F}_p$ till now); for example, $q = 2^n$ for a large number $n$ . The case $p = 2$ is particularly suitable for computing (cryptographic). The fields of the form $\mathbb{F}_{2^n}$ are called binary.
In Section 2.4, we already made acquaintance with the Rijndael field, which was defined by polynomials of degree $7$ with binary coefficient. More generally, the field $\mathbb{F}_q$ to $q = p^n$ is defined by polynomials of degree $n$ over $\mathbb{F}_p$, $\mathbb{F}_q[X] = \{ a_n X^n + a_{n-1} X^{n-1} + \cdots + a_0 \text{ with } a_n, a_{n-1}, ..., a_0 \text{ in } \mathbb{F}_q \}.$
The $+$ addition of two polynomials is the addition of polynomials, that is, the coefficient to coefficient addition in $\mathbb{F}_p$, and
to multiply two polynomials,
For example, for $q = p^n$ with $p = 2$ and $n = 8$ , we get the field $\mathbb{F}_{2^8}$ used in AES
. As a set $\mathbb{F}_q
=
\{ a_7 X^7 + \cdots + a_0 \text{ with } a_7, ..., a_0 \text{ in } \mathbb{F}_p \}.$ that is, the finite sums $a_0 + a_1 X + a_2 X^2 + \cdots + a_7 X^7$ for $a_0$, $a_1$, …, $a_7$ in $\{ 0, 1 \}$ .
An elliptic curve $E$ over a finite field (in which $0 \neq 2, 3$ ) is an equation $y^2 = x^3 + ax + b$ for coefficients $a$ and $b$ such that the curve is not singular, that is, its discriminant is nonzero, $4 a^3 + 27 b^2 \neq 0$ .
Note.
The equation $y^2 = x^3 + ax + b$ is the form of Weierstraß, but there are several others that have proved to be computationally more efficient, such as that of Montgomery $B y^2 = x^3 + A x^2 + x \quad \text{ where } B(A^2 - 4) \neq 0.$
If the characteristic is $2$ , that is, $\mathbb{F}_q$ with $q = 2^n$ , then the equation is $y^2 + cxy + dy = x^3 + ax + b$ .
After choosing a domain (for example, $\mathbb{Z}$ , $\mathbb{Q}$ , $\mathbb{R}$ , $\mathbb{C}$ or $\mathbb{F}_p$ for a prime number $p$ ) the points $(x, y)$ that solve this equation, $E(x,y) = 0$ , form a curve in the plane. This plane,
In addition to the points in the plane, there is also the point at infinity (or ideal point) that is denoted $0$ . Thus, the points of the elliptic curve are given by $E := \{ \text{ all points } (x,y) \text{ such that } E(x,y) = 0\} \cup \{ 0 \}$ where the notion of point depends on the domain: On a finite field $\mathbb{F}_q$, the number of points $\# E$ is limited by $q + 1 - t$ where $t \leq 2 \sqrt{q}$ , that is, asymptotically equal to $\# \mathbb{F}_q^* = q -1$ . It can be computed by Schoof’s algorithm Schoof (1995) in about $n^5$ operations for $n = \log_2 q$ the number of binary digits of $q$ .
Just as the coefficients $a$ and $b$ , this plane can be a rational, real, complex or finite plan, that is, a finite field $\mathbb{F}_q$ for a power $q = p^n$ of a prime number $p$ .
For the domain $\mathbb{R}$ , the curves take the following forms in the real plane for different $a$ and $b$ parameters:
While on finite fields, we obtain a discrete set of points (symmetrical around the middle horizontal axis).
For the cryptographic algorithm on this curve to be safe, that is, the computation of the logarithm on it takes time, there are restrictions on the choices of $q = p^n$ and the elliptic curve (that is, on its defining coefficients $a$ and $b$ ). For example,
that the coefficients $a$ and $b$ are such that
The probability that a randomly generated curve (that is, one whose coefficients $a$ and $b$ in the equation $b y^2 = x^3 + a x^2 + x$ are randomly generated) is vulnerable to one of these attacks is negligible.
Ideally, these choices are publicly justified.
A safe choice is for example: The Curve25519
given by $y^2 = x^3 + 486 662 x^2 + x$ over $\mathbb{F}_q$ with $q = p^2$ where $p = 2^{255} - 19$ (which explains its name); its number of points is $\#E = 2^{252} + 277423177773585193790883648493$ . (This curve became popular as an unbiased alternative to the recommended, and soon distrusted, curves by the National Institute for Standards and Technology, NIST
).
nistk163
, nistk283
, nistk571
and nistb163
, nistb283
, nistb571
defined over the binary field with 163
, 283
and 571
bits.Brainpool P256r
as specified in the RFC7027
used to encrypt the data on the German microchip ID card.To ensure that the coefficients are not chosen to intentionally compromise cryptographic security, they are often obtained at random, that is,
SHA-256
.How is the sum of two points on an elliptic curve defined? Geometrically the sum of three points $p$ , $q$ and $r$ on an elliptic curve in the Euclidean plane (that is, in $\mathbb{R} \times\mathbb{R}$ ) is defined by the equality $p + q + r = 0$ if a line passes $p, q$ and $r$ . However, over finite fields, this geometric intuition no longer applies, and we need an algebraic definition (which is also the form that the computer expects).
If we look at the real points of the curve $E$ , that is, at all the points $(x, y)$ in $\mathbb{R} \times\mathbb{R}$ such that $E(x,y) = 0$ , the addition has a geometric meaning: We have $P + Q + R = 0$ if $P$ , $Q$ and $R$ are on the same line. More exactly:
The reflection of $-R$ along the $x$ -axis is the point $R = P + Q$ .
CrypTool 1
demonstrates this geometric addition in an animation accessible from the Point Addition on Elliptic Curves
entry in the menu Procedures -> Number Theory - Interactive
.
This geometric description of the addition leads us to the following algebraic description: Expressed by Cartesian coordinates, the addition of two points of an elliptic curve is given by an algebraic formula, that is, it involves only the basic operations of addition, multiplication (and raising to a power). (Thus, we can replace the unknowns by values in any domain, be it $\mathbb{Q}$ , $\mathbb{R}$ , or $\mathbb{F}_q$.)
Proposition: Denote $P + Q = R$ by $(x_P, y_P) + (x_Q, y_Q) = (x_R, y_R).$ If the curve $E$ is given by $Y^2 = X^3 + aX + b$ and the points $P$ , $Q$ and $R$ are non-zero, then $x_R = s^2 - x_P - x_Q \quad \text{ and } \quad y_R = s(x_P - x_R) - y_P \quad (*)$ where $s$ is the degree of inclination of the line that passes through $P$ and $Q$ given by $s = \frac{y_Q - y_P}{x_Q - x_P} \quad \text{ if } x_Q \neq x_P, \quad \text{ and } \quad s = \frac{3x_P^2 + a}{2y_P} \quad \text{ if } x_Q = x_P.$
Demonstration: For a cubic curve not in the normal Weierstrass shape, we can still define a group structure by designating one of its nine inflection points as the $O$ identity. On the projective plane, each line will cross a cubic at three points when considering multiplicity. For a $P$ point, $-P$ is defined as the third exclusive point in the line passing $O$ and $P$ . So for all $P$ and $Q$ , $P$ + $Q$ is defined as $-R$ where $R$ is the third exclusive point in the line containing $P$ and $Q$ .
Be $K$ a field on which the curve is defined (that is, the coefficients of the equation or defining equations of the curve are in $K$ ) and denotes the curve as $E$ . So, the rational $K$ points are the $E$ points whose coordinates are in $K$ , including the point at infinity. The $-$ -rational points set is indicated by $E$ ($K$ ). It also forms a group, because the properties of the polynomial equations show that if $P$ is in $E$ ($K$ ), then $-P$ is also in $E$ ($K$ ), and if two of $P$ , $Q$ and $R$ are in $E$ ($K$ ), then it is the third. Also, if $K$ is a subfield of $L$ , then $E$ ($K$ ) is a subgroup of $E$ ($L$ ). Given the curve $Y^2 = X^3 + aX + b$ on the field $K$ (such that $0 \neq 2, 3$ ), and the points $P$ = ($x_P$, $y_P$) and $Q$ = ($x_Q$, $y_Q$) on the curve.
If $x_P \neq x_Q$, then $y = sx + d$ the Cartesian equation of the line intersecting $P$ and $Q$ with the slope $s = \frac{y_P - y_Q}{x_P - x_Q}.$
For the values $x_P$, $x_Q$ and $x_R$ the equation of the line is equal to the curve $(s x + d)^2 = x^3 + ax + b,$ or, equivalently, $0 = x^3 - s^2 x^2 - 2xd + ax + b - d^2.$ The roots of this equation are exactly $x_P$, $x_Q$ and $x_R$; thence $(x - x_P) (x - x_Q) (x - x_R) = x^3 + x^2 (- x_P - x_Q - x_R) + x (x_P x_Q + x_R + x_Q x_R) - x_P x_Q x_R.$
Therefore, the coefficient of $x^2$ gives the value $x_R$; the value of $y_R$ follows by replacing $x_R$ in the Cartesian equation of the line. We conclude that the coordinates ($x_R$, $y_R$) = $R$ = -($P$ + $Q$ ) are $x_R = s^2 - x_P - x_Q \quad \text{ and } \quad y_R = y_P + s (x_R - x_P).$
If $x_P$ = $x_Q$, then
Observation. For certain specific curves, these formulas can be simplified: For example, in an Edwards’ Curve of the shape $x^2 + xy = 1 + d x^2 y^2$ for $d \neq 0, 1$ (with neutral element $0$ the point $(0, 1)$ ), the sum of the points $p = (x_P , y_P )$ and $q = (x_Q , y_Q )$ is given by the formula $(x_P,y_P) + (x_Q,y_Q) = \left(\frac{x_P y_Q + x_Q y_P}{1 + dx_P x_Q y_P y_Q}, \frac{y_P y_Q - x_P x_Q}{1 - d x_P x_Q y_P y_Q}\right)$ (and the inverse of a point $(x, y)$ is $(-x, y)$ ). If $d$ is not a square in $\mathbb{F}_q$, then there are no exceptional points: The denominators $1 + dx_P x_Q y_P y_Q$ and $1 - dx_P x_Q y_P y_Q$ are always different from zero.
If, instead of $\mathbb{R}$ , we look at the points with entries in a finite field $\mathbb{F}_q$ from the curve $E$ , that is, all points $(x, y)$ in $\mathbb{F}_q$ such that $E(x,y) = 0$ , the addition is uniquely defined by the formula $(*)$ .
CrypTool 1
demonstrates this addition in the entry Point Addition on Elliptic Curves
of the menu Indiv. Procedures -> Number Theory - Interactive
.
The addition leads to scaling by iterated addition. That is,
As the group of points on a finite field $\mathbb{F}_q$ is finite (of cardinality approximately $q$ ), necessarily for any point $P$ the set $\langle P \rangle := \{ P, 2P, ... \}$ is finite. That is, there is $n$ and $n + m$ in $\{ 0, 1, ..., q-1 \}$ such that $n P = (n + m) P$ , that is, there is a whole $m < q$ such that $m P = 0$ .
Grau (2018a) shows for an elliptic curve over a finite field $\mathbb{F}_p$ the addition table between the points, and for each point $P$ the finite cyclic group $\langle P \rangle = \{ P, 2 P, ... \}$ generated by it. The cardinality $m = \# \langle P \rangle$ is the smallest $m$ such that $m P = 0$ and is called the order of the point $P$ .
Elliptic Curve Cryptography uses Diffie-Hellman Key Exchange to
Encryption by the ECC
is standardized by the ECIES
(Elliptic Curve Integrated Encryption Scheme), a hybrid procedure (asymmetric cryptography with symmetric cryptography).
Once the mutually secret $c$ key (= a point on the finite elliptic curve) is agreed on, Alice and Bob derive from it a key to a symmetric cipher like AES
or 3DES
. The derivation function that transforms a secret information into the appropriate format is called a Key Derivation Function, KDF
. Such a standardized function is ANSI-X9.63-KDF
with the SHA-1
option. For example, the TLS
protocol
Let us transfer the Diffie-Hellman protocol from multiplication in a finite field to addition on a finite elliptic curve: Denote $G$ a point on the curve, and $x G = G + \cdots + G$ the $x$ -fold iterated addition over the finite elliptic curve (instead of $g$ and $g^x = g \cdot s g$ for a finite field).
one chooses first
To resist
Then one chooses
The critical cryptographic number is the order $n$ of the base point $G$ that should be big enough.
To resist
To find a base point $G$ whose order $n$ is big enough, proceed as follows:
That the point thus found has order $n$ can be shown by Langrange’s Theorem which asserts that $\# H | \# G$ where
For example, $G = \mathbb{Z} / 10 \mathbb{Z}$ is a group and $2 \cdot H = \{ 0, 2, 4, 6, 8 \}$ a subgroup.
Demonstration: For every point $P$ we have $\# E P = n h P = 0$ by Lagrange’s Theorem. That is, $n G = 0$ for $G = h P$ , and for the Lagrange Theorem $\# \langle G \rangle$ divides $n$ . Since $n$ is prime, either $\langle G \rangle = 1$ , or $\langle G \rangle = p$ . Since $\langle G \rangle \neq 1$ , or $\# \langle G \rangle > 1$ , so $\# \langle G \rangle = n$ .
Example (of a base point).
The elliptic curve Curve25519
with $y^2 = x^3 + 486 662 x^2 + x$ over $\mathbb{F}_q$ with $q = p^2$ where $p = 2^{255} - 19$ , uses as base point $G = (x_G , y_G )$ uniquely determined by
In the ECDH
(Elliptic Curve Diffie-Hellman) protocol, for Alice and Bob to overtly build a secret key, they first combine
and then
We note that for both to compute the same key $c$ , the addition must satisfy the associative and commutative law; that is, it is indispensable that $E$ be a group.
The ECDHE
protocol, where the additional final E
stands for ‘Ephemeral,’ is, regarding the key exchange, the same as the ECDH
protocol, but discards the keys (which are necessarily signed by permanent keys to testify the identity) after the session. Corbellini (2015b) is an implementation in Python
of ECDH
.
How much older is RSA
compared to ECC
? Around 20 years.
How does the ECC
Diffie-Hellman key exchange compare to the original Diffie-Hellman key exchange? The ECC
Diffie-Hellman key exchange uses points on an elliptic curve given by pairs of numbers whereas the original Diffie-Hellman key exchange uses simple numbers.
How do the key sizes between RSA
and ECC
compare? Usual are 2048
bits for RSA
and 224
bits for ECC
.
How much faster is ECC
compared to RSA
?
Asymmetric cryptography relies on a trapdoor function, which
RSA
), butRSA
) must be practically incomputable without knowledge of a shortcut, the key!This difficulty of calculating the inverse corresponds to the difficulty of decryption, that is, inverting the encryption. To complicate the computation of the inverse function (besides facilitating the computation of the proper function) is done using modular (or circular) arithmetic*, that we already from the arithmetic of the clock, where $m = 12$ is considered equal to $0$ .
Let us denote $\mathbb{F}_p = \mathbb{Z} / p \mathbb{Z}$ .
The most current cryptography widely in use uses elliptic curves (given by an equation $y^2 = x^3 + ax + b$ ) where one can add points on them: $p + q + r = 0$ if a line passes for $p, q$ and $r$ . By restricting the solutions to points $(x, y)$ in $(\mathbb{F}_p \times\mathbb{F}_p$ for a large prime number $p$ and fixing a point $P$ on the curve,
while it is easy to compute the exponential, that is, for $n$ , compute $Q = nP = P + ... + P,$
in contrast, given a point $Q = P + ... + P$ , it is difficult to compute the logarithm: that is, how many times $P$ has been added, the number $n$ such that $Q = nP$ . By virtue of this point addition, the Diffie-Hellman
protocol (over $\mathbb{F}_p$) has an analog over Elliptic Curves.
Instead of multiplying repeatedly ($n$ times) the base $g$ in $\mathbb{F}_p^*$ , that is, computing $g^n = g \cdots g,$
add repeatedly ($n$ times) a point $G$ , that is, compute $n \cdot G = G + \cdots + G.$
The advantage of using elliptic curves are shorter key sizes: Because, depending on the number of bits $n$ of the used modulus $p$ , regarding the fastest presently known algorithms:
For example, the security of a 2048
bit key for the multiplicative logarithm equals approximately that of a 224
bit key for the logarithm over an elliptic curve. To a length of 512
bits of a key for an elliptic curve, corresponds a length of 15360
bits of an RSA
key.
ECC
, elliptic curve cryptography, is becoming the new standard because its cryptographic problem (the logarithm over a finite elliptic curve) is currently computationally more difficult than that of RSA
(the factoring in prime numbers) or DH
(the logarithm over the multiplicative group of a finite field). small keys for the ECC
achieve the same level of security as large keys for the RSA
or DH
. As an example, the security of a 224
bits key from the ECC
corresponds to that of a 2048
bits key from the RSA
or DH
. This factor in reducing key sizes corresponds to a similar factor in reducing computational costs. Regarding usability,
ECC
public key can be shared by spelling it out (it has 56
letters in hexadecimal notation,RSA
or DH
has to be shared in a file (which is for convenience referred to by a fingerprint).What key size in ECC
is as secure as a $256$ bit key in AES
?
What key size in ECC
is as secure as a 256 bit key in AES
?
What certificate size in ECC
is as secure as a 256 bit key in AES
?
Use Grau (2018d) to get an intuition for the graphs over finite domains.
Read Chapter 12 on elliptic curves of Aumasson (2017).
Use CrypTool 1
to observe the addition of points on an elliptic curve.
See the book Sweigart (2013a) for implementing some simpler (asymmetric) algorithms in Python
, a readable beginner-friendly programming language.
Read the parts of the book Schneier (2007) on understanding and implementing modern asymmetric cryptographic algorithms.
On completion of this chapter, you will have learned …
… How a user can be authenticated by:
a password or personal identification number (PIN),
smart card (that has a microprocessor that stores a private key and runs cryptographic algorithms), or
biometric identifiers (recognition of the users’ signature, facial features, fingerprint, …).
… How authentication is most securely achieved over distance.
… How the secret is never revealed by challenge and response protocols.
… How no information other than the knowledge of the secret is leaked by zero-knowledge proofs.
… How Kerberos mediates between users and servers without either one revealing her password to the other.
Authentication is the identification of a person or of data; the confirmation
This chapter treats exclusively the former type, the identification of a person, the user; particularly important on the Internet where the person is far away. In this sense identification is telling a computer or a network who the user is, usually by her user (or account) name. This is followed by authentication, the verification of the identity of a user, that is, convincing a computer that a person is who he or she claims to be.
To authenticate, the user can use as a proof information that
only she knows, such as a password or personal identification number (PIN),
only she has, such as
software certificate (containing public or private keys),
hardware certificate, such as a token or smart card (that has a microprocessor that stores a key and runs cryptographic algorithms),
a device, such as a phone or an e-mail account, to receive a code,
only she is described by (what she is) such as biometric identifiers (recognition of facial features, fingerprint, …).
Authentication should not be confused with authorization, the final confirmation of authentication that determines the user’s eligibility to access certain contents, that is, what a user is allowed to do or see.
The authentication protocol can be simple or two-factor, that is:
and one-way or mutual, that is,
A
, such as the user, authenticates herself to party B
, andB
, such as the server, authenticates himself to party A
.Most operating systems (such as Linux) and applications store a hash of the authentication data rather than the authentication data itself. During authentication, it is verified whether the hash of the authentication data entered matches the hash stored. Even if the hashes are known to an intruder, it is practically impossible to determine an authentication data that matches a given hash.
A password is a secret sequence of letters attached to a user identity that grants access to a system (such as a computer) after deriving it from the data of authorized users stored on the system.
password: a secret sequence of letters attached to a user identity that grants access to a system (such as a computer).
It is the most common approach for authentication: gratis, convenient and private. It should be easy to memorize but difficult to guess.
Comparing authentication by what one knows (such as a password) to
what one is (biometric data such as a fingerprint), the advantages are:
does not require specific (rather sophisticated) hardware,
it is securely stored, and
cannot be forged.
what one has (such as a smart-card), the advantages are:
The user’s limitation in memorizing sequences of symbols. The more meaningful, the more easily guessed (for example, a word of the user’s tongue), but the more patternless, the harder to remember. A compromise is a passphrase, that is, a complete sentence instead of a single word; though longer, its content is more meaningful and thus more easily remembered than a patternless sequence of symbols. To shorten it, the first letter of each word is taken. For example, “Better to light one candle than to curse the darkness.” can become “B2l1ct2ctd.”
Another workaround is a password manager, a program that stores all passwords in a file encrypted by a master password.
However,
Can be acquired over distance; as a workaround, two-factor authentication requires a second ingredient to authenticate, which usually has to be spatially close to the user, such as a hardware token.
Attacks during the entry of another person’s password are:
spying during the entry of the password. Workarounds are inconvenient, for example,
keylogging
login spoofing where one user’s account a login entry form is faked so that the next user’s entered login data, instead of granting access, are stored, display an error message and log the first user out.
asking a user for the password, either through e-mail or on the phone as a purported system administrator.
asking a system administrator for the password by posing as a purported user who has forgotten her password;
asking a user to change her password on a purported entry form.
Attacks on another person’s stored password exploit principally the following deficiency: Since passwords have to be memorized, they tend to be memorizable, that is, follow patterns. For example, are built from (birth)dates, counts and words, in particular names. Common are (reversing, capitalizing, … ):
These more likely candidates can then be guessed first. Or, instead of building likely passwords, the attacker uses those already leaked to begin with.
A challenge-response protocol poses a task that can be solved only by a user with additional authentication data; usually:
challenge-response protocol: poses a task that can be solved only by a user who has the authentication data.
A Zero-knowledge Protocol goes, in theory, further, as it shows how a claimant can prove knowledge of a secret to a verifier such that no other information (than this proof of knowledge) is disclosed; however:
Zero-Knowledge Protocol: a protocol (first presented in Goldwasser et al. (1989)) to prove knowledge of a secret but disclose no other information.
A challenge-response protocol poses a task that can be solved only by a user with additional authentication data. For example,
thus proving that the user could decrypt the original value. For example, smart-cards commonly use such a protocol. Such a (randomly) generated value is a nonce (number used once) and avoids a replay attack where the exchanged data is recorded and sent again later on.
nonce: stands for (a randomly generated) number used once used in a cryptographic protocol (such as one for authentication).
For example, in CRAM-MD5
or DIGEST-MD5
,
Server sends a unique challenge value challenge
to the client.
Client computes response = hash(challenge + secret)
and sends it to the server.
Server calculates the expected value of response
and verifies that it coincides with the client’s response.
Such encrypted or hashed information does not reveal the password itself, but may supply enough information to deduce it by a dictionary (or rainbow table) attack, that is, by probing many probable values. Therefore information that is (randomly) generated anew enters on each exchange, a salt such as the current time.
salt: a (randomly) generated number that enters additionally into the input of a hash function to make its output unique; principally if the additional input is a secret information such as a password.
Observation. Nonce, Salt and IV (Initialization Vector) are all numbers that are (usually) randomly generated, disclosed, used once in a cryptographic process to improve its security by making it unique. The name is given according to its use:
Challenge-Response protocols such as those presented below are used, for example, in object-relational databases such as PostgreSQL
, or e-mail clients such as Mozilla Thunderbird
.
Digest-MD5
was a common challenge-response protocol that uses the MD5
hash function and specified in RFC2831
. It is based on the HTTP Digest Authentication (as specified in RFC2617
) and was obsoleted by Melnikov (2011)
CRAM-MD5 is a challenge-response protocol based on HMAC-MD5, Hashing for Message Authentication as specified in Krawczyk et al. (1997), that uses the MD5
hash function. The RFC draft Zeilenga (2008) recommends obsoleting it by SCRAM
.
nonce
.HMAC(secret, nonce)
.HMAC(secret, nonce)
and checks whether it coincides with the client’s response to be convinced that the client knew secret
.No mutual authentication, that is, the server’s identity is not verified.
The used hash function MD5
is quickly computed, and thus facilitates dictionary attacks. Instead, key stretching, that is, using a hash function that is deliberately computationally expensive is preferable.
Weak password storage:
Dovecot
) store an intermediate hash value of the password: While this prevents storage of the plain password, for authenticating with CRAM-MD5, knowledge of the hash value is equivalent to that of the password itself.Salted Challenge-Response Authentication Mechanism (SCRAM
) is a challenge-response protocol for mutual authentication specified in Menon-Sen et al. (2010) (that should supersede CRAM-MD5
as proposed in Zeilenga (2008)).
While in CRAM
the client password is stored as hash on the server, now knowledge of the hash (instead of the password) suffices to impersonate the client on further authentications: The burden has just been shifted from protecting the password to its hash. SCRAM
prevents this by demanding additional information (on top of the authentication information StoredKey
stored on the server) that was initially derived from the client’s password (ClientKey
). Advantages of SCRAM in comparison to older challenge-response protocols, according to loc.cit., are:
The authentication information stored on the server is insufficient to impersonate the client. In particular,
supports mutual authentication (by the client and server).
When the client creates a password Password
, the server stores derived keys StoredKey
and ServerKey
together with the parameters used for its derivation, as follows:
The client:
SaltedPassword
by applying the password hashing function PBKDF2
(which is variable; by default it is PBKDF2
, but nowadays, for example, bcrypt
is recommended) IterationCount
many times on the input given by the password Password
and the salt Salt
, that is, SaltedPassword := PBKDF2(Password, Salt, IterationCount)
ClientKey
respectively ServerKey
by applying the HMAC function on SaltedPassword
with the public constant strings “Client Key” respectively “Server Key,” that is, $\mathrm{ServerKey} := \mathrm{HMAC}(\mathrm{SaltedPassword}, '\mathrm{Server Key}')$ and $\mathrm{ClientKey} := \mathrm{HMAC}(\mathrm{SaltedPassword}, '\mathrm{Client Key}')$StoredKey
by hashing ClientKey
, that is, StoredKey := H(ClientKey)
and sends ServerKey
and StoredKey
to the server (but not ClientKey
).The server stores StoredKey
, ServerKey
, Salt
and IterationCount
to later check proofs from clients and issue proofs to clients: ClientKey
is used first to authenticate the client against the server and ServerKey
is used later by the server to authenticate against the client.
The server only stores the public part of the root (Salt and IterationCount
) and the leafs (StoredKey
and ServerKey
) of this tree. That is, the password is never sent to the server, but only:
the salt,
the iteration count,
ServerKey
and StoredKey
, that is,
Thus, after a database breach, that is, after an attacker has stolen a ServerKey
, a client’s password does not need to be replaced, but only the salt and iteration count changed and ClientKey
and ServerKey
replaced.
iteration count: Given an initial input, apply a hash function that many times ($-1$ ) to the output.
For the server to authenticate the client:
client-name
and a client-nonce
).salt
, iteration count ic
and a server-nonce
.Therefore, both, the client and server know AuthMessage := client-name, client-nonce, salt, ic, server-nonce
.
The client
creates proof of her knowledge of StoredKey
by computing
ClientSignature := HMAC(StoredKey, AuthMessage)
ClientProof := ClientKey XOR ClientSignature
and sends ClientProof
to the server.
The server
recovers ClientKey
by
computing ClientSignature
(by knowing StoredKey
from storage and AuthMessage
from this exchange), and
deciphering ClientKey'
from the one-time pad ClientProof
by computing
ClientKey' = ClientProof XOR ClientSignature
computes StoredKey'
by H(ClientKey)
, and
checks whether the computed StoredKey'
coincides with the stored StoredKey
; if so the client is successfully authenticated.
If just ClientSignature
were sent, then an attacker who knows StoredKey
could impersonate the client. Instead, ClientProof
additionally requires the client to know ClientKey
. Therefore, the value of ClientKey'
that is calculated on the server should be immediately and irreversibly (say, by zeroing) deleted after verification.
The client initiates the protocol by sending the Client’s First Message to the server that contains:
ClientNonce
randomly generated by the client.In response to the reception of a valid message from the client, the server sends the Server’s First Message to the client that contains:
ServerNonce
randomly generated by the server.salt
randomly generated by the server that enters with the password as input of a hash function.IterationCount
generated by the server that indicates how many times the hash function is applied to the salt and the password to obtain its output.The concatenation of the client’s and server’s message is AuthMessage
AuthMessage = username, ClientNonce, ServerNonce, salt, IterationCount
The client creates the proof for the server by computing:
ClientSignature = HMAC(StoredKey, AuthMessage)
, andwhere StoredKey
can be recomputed by StoredKey = H(ClientKey)
and $\mathrm{ClientKey} = \mathrm{HMAC}(\mathrm{SaltedPassword}, '\mathrm{Client Key}')$ using the salt
and IterationCount
from AuthMessage
and the user’s password.
ClientNonce
and ClientProof
to the serverClientProof
from the client by
recalculating
ClientSignature = HMAC(StoredKey, AuthMessage)
,ClientKey
by $\mathrm{ClientKey}' = \mathrm{ClientProof} \oplus \mathrm{ClientSignature}$,StoredKey' = H(ClientKey')
, andchecking whether $\mathrm{StoredKey'} = \mathrm{StoredKey}$.
If just ClientSignature
were sent, then an attacker that knows StoredKey
could impersonate the client. Instead, ClientProof
additionally requires the client to know ClientKey
. Therefore, the value of ClientKey'
that is calculated on the server should be immediately and irreversibly (say, by zeroing) deleted after verification.
We conclude that instead of the client’s password, sent were
ClientNonce
and server nonce ServerNonce
,salt
,IterationCount
,The server computes ServerSignature = HMAC(ServerKey, AuthMessage)
The server sends ServerSignature
The client checks ServerSignature
from the server by
recalculating
checking whether $\mathrm{ServerSignature'} = \mathrm{ServerSignature}$.
We conclude that instead of the server’s key, sent was
ServerSignature = HMAC(ServerKey, AuthMessage)
.If an attacker knows
StoredKey
from the server, andAuthMessage
and ClientProof
from an authentication exchange,then he can calculate ClientSignature
, thus ClientKey
and can impersonate the client to the server.
A Zero-Knowledge Protocol shows how a claimant can prove knowledge of a secret to a verifier such that no other information is disclosed; that is, not a single bit of information, other than the validity of the claimant’s claim, is disclosed (to anyone, including the verifier).
Proof is meant probabilistically, that is, the claim is true beyond any reasonable doubt. More so, because the proofs are independent of each other, the probability can be increased as much as desired by increasing their number. The impossibility to gather information on the secret from that exchanged relies on the computational difficulty to solve a mathematical problem.
That no information whatsoever is leaked cites the following benefits:
While claiming to know the secret alone is unconvincing, the leakage of information during classical protocols in which a claimant C
proves knowledge of a secret to verifier V
, still compromises the secret as follows:
C
transmits his password to V
, then V
and everyone who eavesdropped this transmission obtains all data to impersonate C
from this moment on.V
and an eavesdropper can during every occurrence accumulate new information on the secret so that it eventually yields to it. For example, if the challenge is the encryption of a plaintext, and the attacker can choose this plaintext, then this is a chosen-plaintext attack.To weigh up the advantages and inconveniences between a zero-knowledge proof and a digital signature by a private key (verified by its corresponding public key):
The security of both, most zero-knowledge protocols and public-key protocols, depends on the unproved assumption that cryptanalysis is computationally as difficult as a mathematical problem (such as the computation of quadratic residuosity, the decomposition of an integer into its prime factors, discrete logarithm, …).
The concept of
An interactive protocol is turned into a non-interactive one by taking the hash of the initial statement and then send the transcript to the verifier (who also checks that the hash was computed correctly).
Practical protocols were introduced in Fiat & Shamir (1987) and Feige et al. (1988).
The latter introduced so-called sigma protocols in three steps:
IEEE 1363.2 defines zero-knowledge password proof as “an interactive zero-knowledge proof of knowledge of password-derived data shared between a prover and the corresponding verifier.”
Ali Baba’s cave illustrates the principles behind a zero-knowledge proof: In a circular cave there is a door invisible from the entrance that bars any passage if not opened by a password (such as “Sesame”). For C
to prove to V
that he knows the password without disclosing it:
C
enters the cave unobserved from V
till standing in front of the door’s left or right.V
demands C
to return to the entrance coming from the left or right.Because the probability that C
entered the cave on the same side as V
asked for to leave on is one half and all proofs are independent, for example, after $10$ proofs the chance C
does not know the password is $2^{-10}$, less than a thousandth.
While V
is convinced that C
knows the password, she cannot convince anybody else. Even if V
recorded the sequence, then it could have been mutually fixed in advance.
Schnorr (1991) presented a zero-knowledge protocol simple enough to run on smart cards. Knowledge of discrete logarithms is proved, that is,
P
knows an integer $x$ , andV
knows $g^x \mod p$where $p$ is a prime number. For P
to prove knowledge of $x$ without revealing it:
P
chooses some integer $a$ and sends $g^a$ to V
.
V
tosses a coin and sends the result $c$ in ${0, 1}$ to P
.
P
sends to V
This is a zero-knowledge protocol, because
If $c = 0$ , then no knowledge of $x$ is needed. If $c = 1$ however, then V
can verify whether P
knows $a+x$ by $g^{a+x} = g^a g^x$ where both values on the right-hand side are known. Since the probability that $c = 1$ is $1/2$ and all proofs are independent, after, say, $10$ proofs the chance V
does not know $x$ is $2^{-10} < 1/1000$ .
The security of the Fiege-Fiat-Shamir protocol rests on the assumed computational difficulty of extracting square roots modulo large composite integers whose prime factors are unknown (similar to RSA
):
1024
bits.P
, as follows:
P
, the prover,
V
.V
, the verifier,
P
.P
, the prover,
V
, the verifier,
P
knows $x^{1/2} = r$ .P
knows $(x / v)^{1/2}$.If $c = 0$ , then no knowledge of the private key $s$ is needed. If $c = 1$ however, then V
can verify whether P
can compute $(x / v)^{1/2}$, thus presumably knows $s$ . Because the probability that $c = 1$ is one half and all proofs are independent, for example, after $10$ proofs the chance V
does not know $x$ is $2^{-10}$, less than a thousandth.
This is nearly a zero-knowledge protocol, however, care must be taken:
V
could gather information by manipulating the “random” bits.How does a challenge-response protocol authenticate a client? By posing a task to the user that she can only solve if having authentication data.
What is a salt? A (randomly) generated number that enters additionally into the input of a hash function to make its output unique.
How does storage of the identification data differ between SCRAM and CRAM? In CRAM
a hash of the client password is stored as is on the server, while in SCRAM
additional data (salt) enters.
Name an advantage and inconvenience of zero-knowledge proof in comparions to a digital signature. The proof of knowledge does not leak any information on the user’s private key, while a digital signature requires less computation.
On the computational difficulty of which mathematical function does the zero-knowledge property of Schnorr’s protocol rely? On the discrete logarithm.
Biometric authentication identifies the user of a computer by
either physical human characteristic, such as:
or behavioral human characteristic, such as:
typing characteristics such as the speed of keystrokes or the occurrence of typos (particularly useful to supplement a log-in dialogue);
handwriting characteristics; either static where an image is used, or dynamic where the traces on a tablet are evaluated by the functions (of time):
$x$ and $y$ -coordinates,
pressure, and
inclination
voice properties (speaker recognition; particularly useful to verify the identity of telephone customers). Each spoken word is decomposed into its formants, the dominant frequencies, and then physiological and behavioral characteristics identified:
the physiological characteristics describe the shape of the person’s vocal tract, that is, of her nose, mouth, jaw, tongue, …
the behavioral characteristics describe the movement of the nose, mouth, jaw, tongue, …
that change the accent, tone, pitch, pace, …
Comparing authentication by what one is (for example, a fingerprint) to
what one knows (for example, a password), the advantages are:
what one has (for example, a token), the advantages are:
For example, the German Chaos Computer Club (CCC) and its collaborators have demonstrated time and again the ease of forging fingerprints, for example, using wood glue and sprayable graphene. The many successful forgery attacks against biometric identification, such as fingerprints and photo ID, leads to the conclusion to treat them, instead of a replacement for passwords or smart-cards, rather
List advantages of authentication by what one knows over what one is:
In a (distributed) system, only if
then the secret itself for authentication could be responsibly sent in the clear. Otherwise, to ensure that the secret for authentication is seen by none other than possibly the intended recipient, the secret itself must never be sent in the clear. Instead, both the user and the system convince each other they know the shared secret (usually a password). That is, the identification data itself is never sent, but only proof she has access to it beyond any doubt. In practice, for a system that is password-based, the most popular approach are challenge-response systems. A challenge-response protocol poses a task that can be solved only by a user with additional authentication data: Usually,
challenge-response protocol: poses a task that can be solved only by a user who has the authentication data.
As best practice, even though the secret itself is never sent, it is still advisable to encrypt all communication to establish authentication, for example, by public-key encryption. Other approaches are:
to reveal the secret only partially, for example in a single-use system uses the identification data only once. For example, TAN
s (TransAction Numbers) in banking. However, if the identification data can be eavesdropped and its use for authentication, and thus invalidation, be prevented (for example, by logging into a forged copy of the bank’s Website), then it can be used later.
to send additional secret information via a second-channel. For example, the sending of an SMS in the mobile TAN (mTAN) system.
single-use system: the identification data is only used once.
second-channel system: the identification data is sent via a second channel.
The FIDO (Fast IDentity Online) Alliance was officially founded in February 2013 to develop open and license-free industry standards for authentication on the Internet in arrangement with many companies such as Google
or Microsoft
. On December 9, 2014, the first standard FIDO v1.0 was released that specifies:
These standards aim to facilitate authentication on the Internet by using a user’s
instead of
That is, no longer needs a user to memorize numerous secure passwords (though at the cost of the drawbacks discussed in Section 11.1). In comparison with previous methods of two-factor authentication such as SMS verification codes, FIDO2 requires the key, such as the smart phone, to be physically near your computer. FIDO2 (“Moving the World Beyond Passwords”) consists of the
W3C Web Authentication Standard (WebAuthn) that allows users to log into Internet accounts using biometrics, mobile devices and/or FIDO security keys.
the Client to Authenticator Protocol (CTAP) of the FIDO Alliance that is based on U2F. (While, with the release of FIDO2, U2F was renamed to CTAP1.) CTAP is, among other use cases, for authentication in desktop applications and web services.
Personal data and private keys are always and exclusively in the hands of the user and are not stored on public servers. To register via FIDO2 an account on a server:
This way, the FIDO2-key can identify itself with an individual key at each server without the server obtaining information on the key pairs for the same FIDO2-key on other servers. That is, the FIDO2-key generates a separate key pair for each service, based on the domain of the other party. For example, Ebay and Google, cannot determine which of their users use the same FIDO2-key. In practice, this is an advantage (on the server-side!) to authentication by a password where a user often uses similar passwords on different servers.
A FIDO2 key (or authenticator or token) is the device by which to authenticate oneself to a service. It can be
either an external device to connect to the user’s PC or smart phone via USB, NFC or Bluetooth; such as
a security token to be inserted into a USB port
a smart card to be inserted into a card reader,
an NFC
token (Near Field Communication; a wireless communication technology to exchange data between closely located devices, that is, about a decimeter apart; jointly developed by Sony and Philips and approved an ISO standard in December 2003),
Smartphones with Bluetooth-Interface IEEE 802.15.1
Bluetooth Tokens (Bluetooth V4.0 Low Energy 2,45 GHz)
or an internal authenticator. That is, software that uses the crypto chip of your PC, smart phone or tablet for FIDO2, supported by Windows 10 and Android 7 and above.
Against misuse of the FIDO2 key, it can be additionally secured biometrically or with a password/PIN. If the stick is lost, then either a registered backup key is available or one has to identify oneself again by, say, the mobile phone number in combination with an e-mail address or alike.
A FIDO2 key can either be used instead of a password or in addition to it, as a second factor. Depending on how a service has implemented FIDO2, the key suffices for logging in (one-factor authentication) or additionally entering a password (two-factor authentication) is necessary. FIDO2 one-factor authentication already is available for Microsoft.com, Outlook.com, Office 365 and OneDrive in the Edge browser. FIDO2 two-factor authentication works, for example, with Google, GitHub, Dropbox and Twitter.
One solution to the problem of key distribution, that is,
is a central authority who is
so that each correspondent has only to protect one key, while the responsibility to protect all the keys among the correspondents is shifted to the central authority.
Kerberos (pronounced “kur-ber-uhs”) is named after Cerberus, the three-headed watchdog at the gate to Hades; while Cerberus authenticates dead souls, Kerberos mutually authenticates users over a network:
Kerberos is a network protocol (as specified in RFC 1510 — The Kerberos Network Authentication Service V5) which permits users to securely authenticate to each other over an insecure network by a trusted third party, the Key Distribution Center (KDC) . Once a user is authenticated (Kerberos), she is authorized by access protocols such as LDAP (Lightweight Directory Access Protocol).
Kerberos: a network protocol to securely authenticate clients to servers over an insecure network by a trusted third party.
Key Distribution Center (KDC): stores the symmetrical keys of all registered users and servers to securely authenticate them.
The Key Distribution Center (KDC) stores the symmetrical keys of all registered users (be it client or server) to authenticate them as an intermediary third-party. Due to its critical role, it is good practice to have a secondary KDC as a fallback.
Kerberos groups users into clients and (service) servers (SSs) that host services for the clients. The authentication protocol authenticates a client only once so that she is trusted by all SSs for the rest of her session. This is achieved by
The KDC is split up into :
an Authentication Server (AS): To authenticate each user in the network, the AS stores a symmetric key for each user, be it client or service server (SS), and which is only known to itself, the AS, and the user.
a Ticket-Granting Server (TGS):
Service Server (SS): the server that hosts a service the user wants to access.
Authentication Server (AS): stores for each user (be it client or server) in the network a symmetric key known only to itself, the AS, and the user.
Ticket-Granting Server (TGS): generates a session key as part of a ticket between two users of the network after authentication.
To allow a user (commonly referred to as client) to securely communicate with another user (commonly referred to as Service Server or Application Server (SS or AP), the server that hosts a service the client wants to access) via the KDC
, the Kerberos protocol defines ten messages, among them:
Code | Meaning |
---|---|
KRB_AS_REQ |
Kerberos Authentication Service Request |
KRB_AS_REP |
Kerberos Authentication Service Reply |
KRB_TGS_REQ |
Kerberos Ticket-Granting Service Request |
KRB_TGS_REP |
Kerberos Ticket-Granting Service Reply |
KRB_AP_REQ |
Kerberos Application Request |
KRB_AP_REP |
Kerberos Application Reply |
The client, to authenticate to the AS,
KRB_AS_REQ
) using a long-term shared secret (client’s key), andKRB_AS_REP
).The client, to authenticate to the SS via the AS,
AS
using her TGT, andKRB_TGS_REQ
) a ticket from the TGS
(KRB_TGS_REP
) that contains a session key between the client and the SS.The TGS, to create the ticket,
The client, to authenticate directly to the SS:
decrypts the client-to-server session key using her own key, and
sends to the SS (KRB_AP_REQ
and KRB_AP_REP
)
The SS
The client decrypts, using the client-to-server session key, the incremented timestamp and checks it. If the check succeeds, then the client can trust the server and can start issuing service requests to the server.
The server provides the requested services to the client.
The Kerberos protocol is based on the Needham-Schroeder authentication protocol, invented by Roger Needham and Michael Schroeder in 1978, and designed to securely establish a session key between two parties on an insecure network (the Internet, for example) by a symmetric-key algorithm.
MIT developed Kerberos to protect network services provided by Project Athena that aimed at establishing a computing environment with up 10,000 workstations on varying hardware, but where a user could on every workstation access all files or applications in the same user interface; similar to what a browser achieved today. Versions 1, 2 and 3 were only used at MIT itself. Version 4 employed the Data Encryption Standard encryption algorithm, which uses 56-bit keys, and was published in 1989.
In 2005, the Internet Engineering Task Force Kerberos working group introduced a new updated specifications for Kerberos Version 5, and in 2007, MIT formed the http://www.kerberos.org for continuation of development.
MIT Kerberos is the reference implementation that supports Unix, Linux, Solaris, Windows and macOS, but other commercial and non-commercial implementations exist: Most notably, Microsoft added a slightly altered version of Kerberos V5 authentication in Windows 2000.
Kerberos originally used the DES
algorithm for encryption and the CRC-32, MD4, MD5 algorithms for hashing, but nowadays Kerberos implementations can use any algorithm for encryption and hashing.
/tmp
directory and only deleted after their expiration, thus, in multi-user system, risk to be stolen.5
minutes difference and preferably achieved by the Network Time Protocol; NTP) of the involved parties because of the period of validityLet us detail each step from logon and authentication to service authorization and request
The client sends a cleartext message to the Authentication Server (AS) asking for services on behalf of the user (KRB_AS_REQ
).
The AS checks whether the client is in his database.
If she is, then the AS sends back the following two messages to the client:
Client/Ticket-Granting Server (TGS) Session Key, encrypted using the secret key of the client.
Ticket-granting Ticket (TGT), encrypted using the secret key of the TGS, and which includes the
Once the client receives messages A
and B
, she retrieves the Client/TGS Session Key by decrypting message A
using her secret key.
The Client/TGS Session Key is used for further communication with TGS. At this point, the client has enough information to authenticate herself to the TGS.
We observe that neither the user nor any eavesdropper on the network can decrypt message B
, since they do not know the TGS’s secret key used for encryption.
To request services, the client sends the following two messages C
and D
to the TGS:
Composed of the
ID
) of the requested service.Authenticator, encrypted using the Client/TGS Session Key, and which is composed of the
Once the TGS receives messages C and D, he retrieves the Client/TGS Session Key by
The TGS
retrieves the client ID and its timestamp by decrypting message D
(Authenticator) using the Client/TGS Session Key, and
sends the following two messages E
and F
to the client:
Client/Server Session Key, encrypted using the Client/TGS Session Key.
Client-to-Server ticket, encrypted using the Service Server ’s (SS) secret key, which includes the
At this point, after the client receives messages E
and F
from the TGS, the client has enough information to authenticate herself to the SS.
The client connects to the SS and sends the following two messages F
and G
:
f. from the previous step (the Client-to-Server ticket, encrypted using the SS's secret key).
g. a new Authenticator, encrypted using the Client/Server Session Key, and which includes the
- client ID, and
- timestamp.
The SS
retrieves the Client/Server Session Key by decrypting the ticket using his own secret key,
retrieves the Authenticator by decrypting it using the session key, and
authenticates to the client by sending the following message H
to the client :
The client checks whether the timestamp is correctly updated by decrypting Message H
using the Client/Server Session Key. If so, then the client can trust the server and can start issuing service requests to the server.
The server provides the requested services to the client.
A smart card has the form of a credit card, but contains a microprocessor to securely store and process information. (In contrast, a magnetic-stripe card only stores little information (around $< 100$ bytes) but cannot process it.) Security algorithms are implemented on the card so that only properly authorized parties can access and change this data after they have authenticated themselves.
The first smart card was invented in 1973. While it initally consisted only of read and write memory, a microprocessor was added in 1976. In 1981, the French state telephone company introduced it to store a pre-paid credit, which was reduced when calls were made. As memory capacity, computing power, and data encryption capabilities of the microprocessor increased, smart cards are graudally replacing cash, credit or debit cards, health records, and keys.
Random-Access Memory (RAM) reads and writes data, but only stores information as long as there is electricity and not permanently.
Read-Only Memory (ROM): No information can be written, but it can be stored permanently. The Operating System and encryption algorithms are stored.
For improved security, the ROM is buried in lower layers of silicon.
Electrically Erasable Programmable Read Only Memory (EEPROM) stores information permanently, but is slow and one can only read/write to it a limited number of times (around 100 000 times). Typically, a smart card has 8K – 128 kByte of EEPROM. For improved security, the EEPROM is shielded in a metal coating.
The Processor used to be an 8-bit microcontroller, but increasingly more powerful 16 and 32-bit chips are being used.
A coprocessor is often included to improve the speed of encryption computations.
Most of the processing in smart cards is dedicated to cryptographic operations; in particular, encryption between on-chip components To function, a smart card needs external power and clock signal provided through contact with a smart card reader (that usually can write as well). The operating system on most smart cards implements a standard set of control commands such as those standardized in ISO 7816 or CEN 726.
There is a single Input/Output port controlled by small data packets called APDUs (Application Protocol Data Units). Because data flows only at around 9600 bits per second and half-duplex, that is, the data can either flow from the reader to the card or from the card to the reader, but not simultaneously in both directions, the smart card can only be read out slowly, thus complicating attacks. The reader sends a command to the smart card, the card executes the command and returns the result (if any) to the reader; then waits for the next command. The smart card and the reader authenticate each other by a challenge-response protocol using a symmetric key encryption:
(and possibly with interchanged roles). Once mutually authenticated, each exchanged message is verified by a message authentication code (HMAC) which is calculated using as input the message, encryption key, and a random number.
For example, the UICC (Universal Integrated Circuit Card) or Universal Subscriber-Identity Module (USIM) is a smart card used in mobile phones in GSM and UMTS networks. It ensures the integrity and security of all kinds of personal data, and it typically holds a few hundred kilobytes. The Subscriber Identification Modules (SIM) is an application in the UICC that stores the subscriber’s information to authenticate her with the network:
Comparing authentication by what one has (for example, a smart card) to
what one knows (for example, a password), the advantages are:
to what one is (for example, the fingerprint), the advantages are:
less sophisticated and expensive hardware (such as an iris scanner), and
cryptographic keys are stored securely on a device; thus avoids forgeries with fingerprints or vein scanners; In particular, storing keys on a smartcard, that is accessed by a USB reader with its own keyboard, instead of a digital file has the advantage that reading the keys from a smart card:
List advantages of authentication by what one knows over what one has:
List disadvantages of authentication by what one knows over what one has or is:
Anonymity comes from the Greek word anonymía, “name-less,” and means “namelessness.” Colloquially, it means that a person’s identity is unknown. More abstractly, that an element (for example, a person or a computer) within a set (for example, a group or network) is unidentifiable (within this set).
Protecting one’s identity from being disclosed is not only necessary for someone who acts against the law, for example, when attempting to exploit a computer in a network, but also as a precaution to a possible abuse; for example, to amass data of the user’s online habits to build a profile for targeted advertising.
Countermeasures are
However, these countermeasures involve inconveniences, such as an involved setup and slower data transfer. As a less compromising practical measure is to adapt best practices for protecting one’s privacy on the world wide web, for example, by using the Firefox
browser with add-ons that filter out
such as
uBlock Origin
,Privacy Badger
Don't track me Google
, andDecentraleyes
.Also, there is a need for anonymised datasets, for example:
for hospitals to release patient information for medical research but withhold personal information.
to verify whether a password has leaked, for example, by the I Been Pwned
web form by Troy Hunt that contains over half a billion leaked passwords. This prevents Credential Stuffing where usernames and passwords from compromised websites are entered into website forms until compromised user accounts are found; an attack likely to succeed, because
For example, the I Been Pwned
web form transmits only the first five digits of the SHA-1
of the password to compare, thus leaking little personal information. This achieves what is generally called k-Anonymity where a data set has for each record $k-1$ identical copies.
Identity, from Late Latin identitas, from Latin idem, same, and entitas, entity, are the characteristics by which someone or something is uniquely recognizable, that is, held by no other person or thing. Identity theft is the assumption of another person’s identity, usually another person’s name or other personal information such as her PINs, social security numbers, driver’s license number or banking details, to commit fraud, for example, to open a fraudulent bank or credit card account to receive loans. With the advent of digitalized records and (the anonymity on) the Internet, identity theft has become more and more common.
For example, in SIM-card swapping, the attacker obtains a victim’s mobile-phone number to assume, temporarily, her online identity. The attacker initially obtains personal data about the victim, usually her name, mobile phone number and postal address. He then exploits that mobile operators usually offer their customers a new SIM card, for example, after the phone is lost, onto which the previous phone number is transferred. The attacker now pretends to be the actual customer to the mobile phone operator, for example, by phone in the customer service center (where often the date of birth and postal address suffice to identify oneself if no customer password was agreed on signing the contract). For example, it suffices to know the mobile phone number to reset the password of an Instagram account.
Identification is telling a computer who the user is, usually by her user (or account) name. This is followed by authentication, the verification of the user’s identity, that is, convincing a computer that a person is who he or she claims to be. To authenticate, the user prove her identity by information that
Each method with its proper advantages and disadvantages. In particular, passwords, the most common method, have to be memorizable, which in practice weakens them. The FIDO2 standard aims at replacing (or at least complementing) them by hardware and biometric authentication.
Instead of revealing the secret itself when authenticating, it is safer to prove only its knowledge. In a challenge-response authentication protocol, the successful response to the challenge requires its knowledge (such as encryption and decryption of random data by the secret key). In a zero-knowledge protocol, in contrast to a challenge-response protocol, no information whatsoever can be won on the secret key provided the computation of a mathematical function is assumed infeasible.
The Kerberos protocol mediates between users and servers by a central server that stores symmetric keys of all parties; then instead of the parties mutually proving the knowledge of their symmetric keys, it creates for each correspondence a session key of limited validity.
Read Bentz (2019) to get an idea how Kerberos works and Quisquater et al. (1989) for zero-knowledge proofs.
Read Schnorr (1991) to understand a basic zero-knowledge protocol.
Have a look at Menon-Sen et al. (2010) to get acquainted with the format of Request for Comments, in particular, thoroughly understand the SCRAM protocol.
On completion of this chapter, you will have learned …
Let us recall that the prefix Crypto-, comes from Greek kryptós, “hidden” and analysis from the Greek analýein, “to untie”: the breaking of ciphers, that is, recovering or forging enciphered information without knowledge of the key.
Cryptanalysis: Recovering or forging enciphered information without knowledge of the key.
History delivers plenty cryptanalytic success-stories; for example, among many others, the decryption of the German rotor machine Enigma by the Polish and British forces.
During World War I:
ADFGVX
cipher used by the German forces within a month just before the German army entered in Paris in 1918 (however, still too late to save much).During World War II:
Polish and British cryptanalysis of
The U.S. Army’s Signal Intelligence Service’s (SIS) cryptanalysis of the Japanese rotor machines . The pivotal Battle of Midway for the naval war in the Pacific was won by awareness of the Japanese attack on the Aleutian Islands being a diversionary maneuver and the Japanese order of attack on Midway.
In World War II the Battle of Midway, which marked the turning point of the naval war in the Pacific, was won by the United States largely because cryptanalysis had provided Admiral Chester W. Nimitz with information about the Japanese diversionary attack on the Aleutian Islands and about the Japanese order of attack on Midway.
During a debate over the Falkland Islands War of 1982, a member of Parliament, in a now-famous gaffe, revealed that the British were reading Argentine diplomatic ciphers with as much ease as Argentine code clerks.
Cryptanalytic feats were achieved also by the defeated powers, but no success-stories to be told.
While we will present some established principles of cryptanalysis, in practice the cryptanalyst’s intuition and ability to recognize subtle patterns in the ciphertext were paramount, but difficult to convey. Today, however, cryptanalysis is based on mathematics and put into practice by efficient use of extensive computing power.
Cryptanalysis of single-key ciphers relies on patterns in the plaintext carrying over to the ciphertext. For example, in a monoalphabetic substitution cipher (that is, each alphabetic letter a
, b
, … in the plaintext is replaced, independently of its position, by another alphabetic letter), the frequencies of the occurrences of the letters in the plaintext alphabet are the same as those in the ciphertext alphabet. This can be put to good cryptanalytic use by
A substitution by any permutation of the letters of the alphabet, such as,
A | B | … | Y | Z |
↓ | ↓ | … | ↓ | ↓ |
E | Z | … | G | A |
has $26 \cdot 25 \cdots 1 = 26! > 10^{26}$ keys, so a brute-force attack is computationally infeasible. But it violates the goals of diffusion and confusion: If the key (that is, the given permutation of the alphabet) exchanges the letter $\alpha$ for the letter $\beta$ , then there’s
In fact, the algorithm allows statistical attacks on the frequency of
in English For example,
e
,th
, andthe
.Thus substituting
e
),th
), …the
), …is a good starting point to decipher the text: The more ciphertext, the more likely that this substitution coincides with that the text was enciphered by.
For example, using these frequencies on the ciphertext
ACB ACBGA ACSBDQT
gives
THE THE** TH*E***
for yet to be deciphered letters marked by *
. By the restrictions of English vocabulary and sentence structure, yielding
THE THEFT THREAPS
As an exercise, Dear reader, build a short English sentence with common letter frequencies, ask a friend to encrypt it and try your hands at cryptanalyzing it!
Homophones: Multiple cipher symbols for the same plaintext letter.
To hide the frequencies of the alphabetic letters, one approach is to use homophones, a number of ciphertext symbols for the same alphabetic plaintext letter, chosen in proportion to the frequency of the letter in the plaintext; for example, twice as many symbols for E
as for S
and so forth, so that each cipher symbol occurs on average equally often in the ciphertext. However, other frequencies in the plaintext (partially) still resist encryption (and ease cryptanalysis), such as digraphs: TH
occurs most often, about 20
times as frequently as HT
, and so forth.
In practice the security of a cipher relies foremost
In contrast to single-key cryptography whose cryptanalysis exploits statistical patterns, by its reliance on computationally difficult mathematical problems (that is, whose runtime grows exponentially in the bit-length of the input), the cryptanalysis of two-key cryptography is that of computational mathematics: to find an algorithm that quickly computes the solutions of the difficult mathematical problem. Ideally, one whose runtime is polynomial in the number of input bits. However, in practice, for example, for
RSA
oralgorithms whose runtime grows slower than exponentially in the number of input bits (sub-exponential) are known, but none with polynomial runtime.
ECC
, elliptic curve cryptography, is becoming the new standard because its cryptographic problem (the logarithm over a finite elliptic curve) is (from what we know) computationally more difficult than that of RSA
(the factoring in prime numbers) or DH
(the logarithm over the multiplicative group of a finite field). Therefore, small keys for the ECC
achieve the same level of security as large keys for the RSA
or DH
. As an example, the security of a 224
bits key from the ECC
corresponds to that of a 2048
bits key from the RSA
or DH
. This factor in reducing key sizes corresponds to a similar factor in reducing computational costs. Regarding usability,
ECC
public key can be shared by spelling it out (it has 56
letters in hexadecimal notation,RSA
or DH
has to be shared in a file (which is for convenience referred to by a fingerprint).Let us compare the cryptographic problem behind ECC
to that of RSA
and DH
:
Both groups, the multiplicative group of a finite field and the group of a finite elliptic curve, are finite cyclic groups, that is, generated by an element $g$ of finite order $n$ . Mathematically speaking, they are equal to a group of type $G = \langle g\rangle = \{g^0, g^1, g^2, ..., g^{n-1}\} \simeq \{ 0, 1, 2, ..., n-1 \} = \mathbb{Z} / n \mathbb{Z}$ However, one way of this identification, from $\mathbb{Z} / n \mathbb{Z}$ to $G$ , is a lot faster than the other way around:
Given a generator $g$ in $G$ , the identification of $\mathbb{Z} / n \mathbb{Z}$ with $G$ for $x \mapsto g^x$, the exponential $\exp \colon \mathbb{Z} / n \mathbb{Z} \to G$ is quickly computed in every commutative group $G$ : Given $d$ in $1, ..., n-1$ , to calculate $g^d$, expand $d$ in binary base $d = d_0 + 2 d_1 + 2^2 d_2 + \cdots + 2^m d_m$ for $d_0$, …, $d_m$ in ${ 0, 1}$ and calculates by multiplying by $2$ successively $g^2$ , …, $g^{2m}\}$. So $g^d = g^{d_0 + 2 d_1 + 2^2 d_2 + \cdots + 2^m d_m} = g^{d_0} g^{2 d_1} g^{2 d_2} \cdots g^{2^m d_m}.$ That is, we calculate $g^d$ in $\leq 2 \log_2 n$ group operations.
Example. For $G$ the group of points of an elliptic curve, $P$ in $G$ , and $d$ in $\mathbb{N}$ we calculate successively $2 P$ , $2^2 = 2 \cdot P$ , $2^3 P = 2 \cdot 2 \cdot 2 P$ . Let $i_0$, $i_1$, … be the indices of the binary digits … that are different from $0$ . So $d P = 2^{i_0} P + 2^{i_1} P + \cdots$
Given a $g$ generator in $G$ , the computation of the reverse identification, the logarithm $\log \colon G \to \mathbb{F}_p,$ that is, given $y$ in $G$ , calculate $x$ in $0, ..., n-1$ such that $y = g^x$ is usually hard: By Shoup’s theorem in Shoup (1997), every generic algorithm, that is, using only the operations of the group, takes at least $n^{1/2}$ operations (of the group) to calculate the logarithm.
A generic algorithm that achieves this speed (except a $2$ factor) is the Baby Step, Giant Step
(or Shanks algorithm): Given $h$ in $\langle g\rangle$ , to calculate $d$ in $\{ 0 , 1, ..., n-1 \}$ such that $h = g^d$:
This algorithm works, because:
Note. For elliptic curves, Pollard’s $\rho$ -algorithm is slightly faster.
This estimation of the operations necessary to compute the logarithm of a group, that is, which uses only the operations of the group, applies to generic algorithms. However, specific algorithms, that is, those that use group-specific properties (such as the units of a finite field or that of an elliptic curve), may use less.
For example, for the multiplicative group of a finite field, there are indeed faster algorithms (called sub-exponential) to calculate the prime factors of a product and the logarithm. They are based on Index Calculus which makes use of the properties of this specific group. For example, for the cryptographic problems of RSA
and Diffie-Hellman
on the multiplicative group of a finite field, that is,
there are indeed faster algorithms (called sub-exponential). They are based on Index Calculus which makes use of the properties of this specific group. The fastest known algorithm is the General Number Field Sieve; see A. K. Lenstra et al. (1993) for the computation of the prime factors and Gordon (1993) for that of the logarithm.
Its complexity is sub-exponential, roughly for large $n$ , the number of group operations is $2^{n^{1/3} (C + o(1)) (\log n)^{2/3}}$ where $C = 64/9$ and $o(1)$ means a function $f$ over $\mathbb{N}$ such that $f(n) \to 0$ to $n \infty$ .
In contrast, all known algorithms for the ECC
cryptographic problem, that is, calculating the logarithm over a finite elliptic curve, are generic algorithms. For these generic algorithms, by Shoup’s Theorem, the complexity $2^{n/2}$ in the number of bits $n$ of input is as small as possible. The fastest algorithm at present is Pollard’s $\rho$ -algorithm, which has a roughly exponential complexity of $2^{n/2 + C}$ where $C = \log_2 (\pi/4)^{1/2} \simeq - 0.175$ .
World’s fastest supercomputer, IBM’s Summit (taking up 520 square meters in the Oak Ridge National Laboratory, Tennessee, USA) has around $150$ petaflops, that is, $1.5 \cdot 10^{17}$ floating point operations per second. The number of flops needed to check a key depends for example, on whether the plaintext is known or not, but can be very optimistically assumed to be $1000$. Therefore, Summit can check approximately $1.5 \cdot 10^3$ keys per second, and, a year having $365 \cdot 24 \cdot 60 \cdot 60 = 31536000 \approx 3 \cdot 10^8$ seconds, approximately $4.5 \cdot 10^{11}$ keys a year.
To counter the increasing computing power, one prudently applies a Moore’s Law that stipulates that computing power doubles every other year. Therefore, every twenty years computing power increases by a factor $2^{10} = 1024 \approx 10^3$. Therefore, to ensure that in, say, sixty years, a key not surely be found during a yearlong search by world’s fastest supercomputer at least $4.5 \cdot 10^{20}$ key combinations have to be used.
For a key of bit length $n$, the number of all possible keys is $2^{n }$. If $n = 80$, then there are are $2^{80} \approx 1.2 \cdot 10^{24}$ possible key combinations. While this number is sufficient for now, the probability for the key to be found during a yearlong search by world’s fastest supercomputer being around $1 / 250$, the projected fastest super computer in twenty years will likely find it in half a year. Instead, to be safe in 40 years, a minimal key length of $112$ is recommended.
This Table from A. Lenstra & Verheul (2001) compares the key sizes in bits to a security level comparable between
AES
,Diffie-Hellman
or RSA
.Symmetric Key | Asymmetric Elliptic Key | Classic Asymmetric Key |
---|---|---|
80 |
160 |
1024 |
112 |
224 |
2048 |
128 |
256 |
3072 |
192 |
384 |
7680 |
256 |
512 |
15360 |
The numbers in the table are estimated by the fastest known algorithm to solve the cryptographic problem: Given an input key with $n$ bits,
AES
, the fastest known algorithm is to try out all possible keys, whose complexity (= the number of operations) is $2^n$,RSA
or Diffie-Hellman
, on a finite field, the fastest algorithm is the General number field sieve whose complexity, roughly, for large $n$ is $2^{2 \sqrt[3]{n} \log (n)^{2/3}}$.In practice, the smaller ECC
keys speed up cryptographic operations by a factor of $>5$ compared to RSA
and Diffie-Hellman
(in addition to facilitating their exchange among people and saving bandwidth). However, there are also disadvantages to ECC
compared to RSA
, for example: its signature algorithm depends on the generation of an additional ephemeral key pair by a random number generator that, if its output is predictable or repetitive, reveals the private signature key!
ECC
is newer, so:
RSA
, andCerticom
).Which minimal key size is currently recommend as secure for RSA
and Diffie-Hellman
?
512
bits1024
bits4096
bitsWhich minimal key size is currently recommend as secure for Elliptic Curve Cryptography?
128
bits512
bits1024
bitsWhich minimal key size is currently recommend as secure for AES
?
128
bits256
bits1024
bitsRainbow tables are a method of password cracking that compares a password hash to precomputed hashes of the most likely passwords; a time-memory trade-off: more memory for less computation.
A lookup table is a data structure, usually an array, used to replace a runtime computation by an array indexing operation. That is, a value is first computed and then looked up to save runtime, because retrieving a value from memory is usually faster than computing it. For example, values of a common mathematical function, say the logarithm, can be precomputed to look up a nearby value be from memory.
A rainbow table is a table of cryptographic hashes of the most common passwords. Thus, more likely passwords are revealed sooner. The generation of the table depends on
Common cryptographic hash algorithms such as MD4/5
, SHA
… are fast; thus they are unsuitable for password creation because they are vulnerable to brute-force attacks. For example, MD5
as a cryptographic hash function is designed to be fast and thus lends itself towards a rainbow table attack, while hash functions such as PBKDF1
, PBKDF2
, bcrypt
, scrypt
and the recent Argon2
were designed to prevent this kind of attack by being:
bcrypt
,scrypt
.A rainbow attack can however most effectively be prevented by making the used hash function unique for each password. This is achieved by adding a salt
, an additional unique, usually random, argument.
The key size of the symmetric industry standard cryptographic algorithm DES
was merely 56
bits, so little that it ceded to brute-force attacks shortly after its vetting. Therefore, a twofold encryption for two different keys was thought to effectively double the key size to 112
bits. However, the meet-in-the-middle attack by Diffie and Hellman trades off memory for time to find the key in only $2^{n+1}$ encryptions (using around $2^n$ stored keys) instead of the expected $2^{2n}$ encryptions: Assume the attacker knows a plaintext $P$ and its ciphertext $C$ , that is, $C = \mathrm{E}_{K''}(\mathrm{E}_{K'}(P)),$ where E
denotes the encryption using $K'$ respectively $K''$ . The attacker: