A Quick Guide to Cryptography

Cryptography is the science of securing information by transforming it into an unreadable format, known as ciphertext, which can only be deciphered by someone who has the appropriate key. Its primary goals are confidentiality, integrity, authentication, and non-repudiation.

In this article, we provide a brief history of cryptography to help you get your bearings on the topic and to introduce you to the relevant technologies. Click the links in the document to learn more about these technologies.

Key Terminology

  • Plaintext: The original readable message or data.
  • Ciphertext: The encrypted message or data, which is unreadable without the key.
  • Encryption: The process of converting plaintext into ciphertext.
  • Decryption: The process of converting ciphertext back into plaintext.
  • Key: A piece of information used in the encryption and decryption processes.
  • Cryptologist: A person who studies cryptology, which encompasses both cryptography and cryptanalysis.
  • Cryptanalyst: A person who focuses on breaking cryptographic codes and ciphers.

The Early Years of Cryptography (non-digital)

The ancient Greeks used simple methods to secure messages. These included concealing messages under other mediums, such as wax, or using things like invisible ink. This approach of hiding the existence of a message is known as steganography.

Early Example of Steganography

Of course, once someone knew the method of concealment, uncovering the message was trivial.

The Greeks also used early forms of cryptography. For example, the Spartans used what’s called a scytale, where a strip of seemingly meaningless letters can be deciphered by wrapping the strip around a cylinder of a specific size, as shown in this image from Wikipedia. This is actually an early form of a transposition cipher.

Julius Caesar used a substitution cipher in his messages, where each letter was shifted a certain number of places along the alphabet. As long as you knew the size of the shift, you could decipher it. This technique is often referred to as a monoalphabetic cipher. Unfortunately for Caesar, the Latin alphabet of his day had only 23 letters, so even without knowing the shift in advance you could decipher a message in at most 22 attempts.
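
To make this concrete, here is a minimal Python sketch of a Caesar-style shift cipher, including the brute-force loop that simply tries every shift. For readability it assumes the modern 26-letter English alphabet rather than Caesar's 23-letter Latin one; the function names are just for illustration.

    ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # modern 26-letter alphabet for readability

    def caesar(text, shift):
        """Shift every letter in `text` by `shift` places, wrapping around."""
        result = []
        for ch in text.upper():
            if ch in ALPHABET:
                result.append(ALPHABET[(ALPHABET.index(ch) + shift) % len(ALPHABET)])
            else:
                result.append(ch)  # leave spaces and punctuation untouched
        return "".join(result)

    ciphertext = caesar("ATTACK AT DAWN", 3)   # -> "DWWDFN DW GDZQ"
    print(ciphertext)

    # Brute force: with only len(ALPHABET) - 1 non-trivial shifts to try,
    # an attacker can print every candidate and pick the readable one.
    for shift in range(1, len(ALPHABET)):
        print(shift, caesar(ciphertext, -shift))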

Eventually, people developed more complex monoalphabetic ciphers (such as random substitution patterns) that made decryption quite difficult, at least until the emergence of frequency analysis around 850 A.D. Frequency analysis allowed a cryptanalyst to compare the frequency of ciphertext letters to the known frequency of letters in the plaintext language and then make educated guesses about the substitutions.
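
The core of frequency analysis fits in a few lines of Python: count how often each letter appears in the ciphertext and line the most common ones up against the expected ranking for English. The ciphertext string below is an illustrative, made-up example.

    from collections import Counter

    # Approximate ranking of English letters, most to least frequent.
    ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

    def letter_frequencies(ciphertext):
        """Return ciphertext letters ordered from most to least frequent."""
        counts = Counter(ch for ch in ciphertext.upper() if ch.isalpha())
        return [letter for letter, _ in counts.most_common()]

    # Illustrative ciphertext produced by a simple substitution cipher.
    sample = "XLMW MW E WIGVIX QIWWEKI XLEX AI AMPP EREPCDI"
    print(letter_frequencies(sample)[:5])   # most common ciphertext letters
    print(list(ENGLISH_ORDER[:5]))          # likely plaintext counterparts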

Frequency analysis gave cryptanalysts the upper hand until the emergence of polyalphabetic ciphers in the 15th century. As the name implies, this kind of cipher uses a different shift for each character it encrypts, based on the offsets represented by a keyword. For example, if your keyword is “BAD”, you shift the first letter of your plaintext message by one place in the alphabet, the second letter by zero places, and the third letter by three places.

This shifting substitution pattern meant that two o’s in a plaintext word like ”good” would end up as different letters in the ciphertext, which made frequency analysis far more difficult.
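
Here is a minimal sketch of that keyword-driven (Vigenère-style) shifting in Python, again assuming the modern 26-letter alphabet and letters-only input. Note how the two o's in "GOOD" encrypt to different ciphertext letters.

    ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

    def vigenere(text, keyword, decrypt=False):
        """Shift each letter of `text` by the offset of the matching keyword letter."""
        out = []
        for i, ch in enumerate(text.upper()):
            shift = ALPHABET.index(keyword.upper()[i % len(keyword)])
            if decrypt:
                shift = -shift
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        return "".join(out)

    ciphertext = vigenere("GOOD", "BAD")
    print(ciphertext)                          # the two O's encrypt differently
    print(vigenere(ciphertext, "BAD", True))   # -> "GOOD"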

This new method of encryption gave cryptographers the advantage until the 19th century, when Friedrich Kasiski developed the Kasiski examination. He observed that although the ciphertext showed no obvious letter-frequency patterns, the repeating keyword left periodic traces that could be used to deduce the keyword's length and, ultimately, the keyword itself. Once you knew the keyword, you could decrypt the ciphertext.
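
A rough sketch of the Kasiski idea in Python: find repeated fragments in the ciphertext and measure the distances between them; the keyword length typically divides those distances. This is a simplified illustration with an artificial ciphertext chosen to contain an obvious repetition, not a full implementation.

    from math import gcd
    from functools import reduce

    def repeated_fragment_distances(ciphertext, size=3):
        """Distances between repeated fragments of the given size in the ciphertext."""
        positions = {}
        distances = []
        for i in range(len(ciphertext) - size + 1):
            frag = ciphertext[i:i + size]
            if frag in positions:
                distances.append(i - positions[frag])
            positions[frag] = i
        return distances

    # Artificial ciphertext; the keyword length should divide most distances.
    distances = repeated_fragment_distances("LXFOPVEFRNHRLXFOPVEFRNHR")
    print(distances, "-> likely key length:", reduce(gcd, distances))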

The Germans kicked things up a notch in 1918 with the invention of the Enigma machine. It used a set of rotors with 26 positions representing the letters of the alphabet. Each rotor could be set to a different initial position, which affected the substitution pattern for each letter. Every time a key was pressed, the rotors advanced, continuously changing the pattern. There was no keyword in this instance, but instead an agreed-upon setup configuration.

German Enigma Machine
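
The rotor mechanism can be illustrated with a heavily simplified, single-rotor toy in Python. The real Enigma used several rotors, a reflector, and a plugboard; the wiring string below is just a fixed permutation of the alphabet chosen for illustration. The point is only that the substitution changes with every keypress because the rotor steps.

    import string

    ALPHABET = string.ascii_uppercase
    WIRING = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"  # one fixed permutation, acting as a toy "rotor"

    def toy_rotor_encrypt(text, start_position=0):
        """Encrypt with a single stepping rotor: the mapping shifts after each letter."""
        out = []
        position = start_position
        for ch in text.upper():
            # Offset the input by the rotor position, map through the wiring,
            # then undo the offset on the way out.
            idx = (ALPHABET.index(ch) + position) % 26
            out.append(ALPHABET[(ALPHABET.index(WIRING[idx]) - position) % 26])
            position = (position + 1) % 26   # the rotor advances on every keypress
        return "".join(out)

    # The same plaintext letter produces different ciphertext letters over time.
    print(toy_rotor_encrypt("AAAAA"))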

However, there were still patterns in the ciphertext output, which eventually allowed Polish cryptanalysts to crack the Enigma in the 1930s. The British later automated that work at Bletchley Park using an electromechanical device known as the Bombe.

Up to this point, cryptography was constrained by physical limitations to techniques like character transposition and substitution, which always left behind some kind of detectable pattern. In addition, both parties needed to know in advance which keyword to use for encryption and decryption (or, in the case of a machine like the Enigma, the machine configuration). If this knowledge was intercepted, the interceptor could easily decrypt the message.

The emergence of digital computers and electronic cryptographic devices during and following WW2 changed all this because it opened up encryption and decryption to the digital world.

Digital Cryptography

Digital cryptography can be divided into three categories:

  1. Symmetric-Key Cryptography
  2. Public-Key Cryptography
  3. Hash Functions

There are plenty of dependencies between these technologies in the real world, but separating them makes it easier to understand their history.

Symmetric-Key Cryptography

Symmetric-key cryptography uses the same key for both encrypting and decrypting the data. It is efficient and fast, making it ideal for encrypting large amounts of data. Its main challenge is securely sharing the key between the sender and the recipient. If the key is compromised, the encrypted data can be easily decrypted.

In a way, symmetric-key cryptography is an artifact of the pre-digital era, where the same key was used to encrypt and decrypt monoalphabetic and polyalphabetic ciphers by non-digital means. However, because of its performance advantages over other encryption methods, it is also a logical building block for modern digital cryptography.

Many incremental advances were made in symmetric-key cryptography following WW2, but most of these advances coalesced into two standards adopted by the U.S. federal government:

  • Data Encryption Standard (DES)
  • Advanced Encryption Standard (AES)

The Data Encryption Standard (DES)

DES became the accepted federal standard for digital cryptography in the U.S. in 1977.

Unlike traditional character substitution and transposition ciphers, DES operated on blocks of data (64 bits at a time) rather than on individual characters. It used a series of complex operations, including permutations, substitutions, and XOR operations to produce the ciphertext.

More broadly, DES marked a shift toward encrypting the underlying digital data that represents the plaintext rather than the plaintext characters themselves. Freed from the constraint of manipulating a limited character set, mathematicians could apply far more powerful techniques. It also allowed cryptographers to encrypt all types of data, not just text.

However, the DES key was only 56 bits. While this was secure for the early 1970s, cryptologists weren’t the only ones using new digital capabilities. Cryptanalysts also applied rapidly increasing computer power to decrypt data via brute force methods. As a consequence, DES became increasingly vulnerable by the late 1980s.
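
Some back-of-the-envelope arithmetic makes the brute-force concern concrete. The search rate below is an assumed, illustrative figure, not a measurement of any real attack.

    # Rough comparison of key spaces (illustrative numbers only).
    des_keys = 2 ** 56
    aes128_keys = 2 ** 128

    assumed_rate = 10 ** 12  # hypothetical brute-force rate: one trillion keys/second

    print(f"DES key space:     {des_keys:.2e} keys")
    print(f"AES-128 key space: {aes128_keys:.2e} keys")
    print(f"DES exhausted in roughly {des_keys / assumed_rate / 3600:.0f} hours "
          f"at {assumed_rate:.0e} keys/second")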

As a stopgap measure, people began double and triple encrypting DES messages (i.e., feeding the ciphertext output of one DES encryption into a second encryption), but, ultimately, DES was replaced by the Advanced Encryption Standard (AES) in 2001 as the new U.S. federal encryption standard.

Advanced Encryption Standard (AES)

The main advantages that AES had over DES from a security standpoint were larger key sizes ranging from 128 bits to 256 bits, and more complex encryption algorithms involving more sophisticated mathematical operations.

As a consequence, AES remains the standard for symmetric key encryption in 2024 due to its proven security, efficiency, and broad adoption across various industries and applications. It is resistant to all known practical cryptographic attacks. However, it still has two potential weaknesses:

  • It is still a symmetric-key block cipher, which involves inherent risks associated with key distribution.
  • It may face future vulnerabilities to attacks by quantum computers. While AES with 256-bit keys is expected to remain secure against quantum attacks, advancements in quantum computing could pose a challenge.
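
For reference, here is a minimal sketch of symmetric AES encryption and decryption in Python using the third-party cryptography package (the assumption here is that it is installed, e.g. via pip install cryptography). AES-GCM is shown because it also authenticates the ciphertext.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)   # 256-bit symmetric key
    aesgcm = AESGCM(key)
    nonce = os.urandom(12)                      # unique per message; never reuse with a key

    ciphertext = aesgcm.encrypt(nonce, b"attack at dawn", None)
    plaintext = aesgcm.decrypt(nonce, ciphertext, None)   # the same key decrypts

    print(plaintext)   # b'attack at dawn'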

Public-Key Cryptography

While digital cryptography standards like DES and AES evolved, the cryptographic world also began to address the question of how to securely distribute symmetric keys. This led to the development of public-key cryptography, also known as asymmetric cryptography.

Public-key cryptography uses a pair of keys: a public key that can be widely disseminated and a private key that is kept secret. The public key is used to encrypt data, while the private key is used to decrypt it.

This system enables secure communication over unsecured channels, as only the intended recipient with the private key can decrypt the message, and it also facilitates secure key exchange protocols essential for establishing symmetric encryption keys.
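
As a concrete sketch (again assuming the third-party cryptography package is available), here is RSA encryption with the public key and decryption with the private key. In practice, RSA is typically used to protect a small symmetric key rather than bulk data.

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # Anyone with the public key can encrypt...
    ciphertext = public_key.encrypt(b"symmetric session key goes here", oaep)

    # ...but only the holder of the private key can decrypt.
    print(private_key.decrypt(ciphertext, oaep))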

Here are the relevant milestones:

Early Concepts and Theoretical Foundations (1970s)

  1. Diffie-Hellman Key Exchange (1976): Introduced the concept of public-key cryptography and key exchange. Allowed two parties to securely establish a shared secret key over an insecure channel without having a prior shared secret (a toy numeric sketch follows this list).
  2. RSA Algorithm (1977): Developed the first practical public-key encryption and digital signature algorithm. RSA is based on the difficulty of factoring large integers.
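
To make item 1 concrete, here is a toy numeric sketch of the Diffie-Hellman idea using Python's built-in modular exponentiation. The tiny prime is purely illustrative; real deployments use parameters of 2048 bits or more, or elliptic-curve variants.

    import secrets

    # Toy public parameters (far too small for real use).
    p = 23   # public prime modulus
    g = 5    # public generator

    a = secrets.randbelow(p - 2) + 1   # Alice's private exponent
    b = secrets.randbelow(p - 2) + 1   # Bob's private exponent

    A = pow(g, a, p)   # Alice sends A over the insecure channel
    B = pow(g, b, p)   # Bob sends B over the insecure channel

    # Both sides derive the same shared secret without ever transmitting it.
    assert pow(B, a, p) == pow(A, b, p)
    print("shared secret:", pow(B, a, p))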

Development and Standardization (1980s-1990s)

  1. Introduction of Digital Signatures: Enabled secure and verifiable electronic communication and transactions by extending public-key cryptography to include digital signatures, which provide authentication and non-repudiation.
  2. Public-Key Infrastructure (PKI): Developed standards to manage public keys and digital certificates, ensuring the authenticity of public keys. Adopted standards such as X.509 for digital certificates and established Certificate Authorities (CAs).
  3. Elliptic Curve Cryptography (ECC) (1985): Provided a more efficient form of public-key cryptography using elliptic curves, offering similar security with smaller key sizes compared to RSA.

Widespread Adoption and Integration (1990s-Present)

  1. Secure Communications Protocols:
    • SSL/TLS: Public-key cryptography became a cornerstone of secure internet communication, handling key exchange and authentication in protocols like SSL/TLS.
    • PGP: Pretty Good Privacy (PGP) used public-key cryptography to provide secure email communication.
  2. Cryptographic Standards: Public-key cryptography was integrated into various national and international standards for secure communication and data protection.

Modern Developments and Future Directions (2000s-Present)

  1. Quantum Computing Threat: Emerging quantum computers pose a threat to current public-key cryptographic systems, as they could potentially break widely used algorithms like RSA and ECC. This has spurred development of post-quantum cryptography to create algorithms resistant to quantum attacks. NIST is leading efforts to standardize post-quantum cryptographic algorithms.
  2. Blockchain and Cryptocurrencies: Public-key cryptography underpins the security and functionality of blockchain technologies and cryptocurrencies like Bitcoin, enabling secure and verifiable transactions.

Hash Functions

Hash functions are algorithms that take an input (or ‘message’) and return a fixed-size string of bytes. The output, typically referred to as the hash value or digest, is for all practical purposes unique to each input and cannot feasibly be reversed to recover that input. The key characteristics of hash functions include:

  1. Deterministic: The same input will always produce the same output.
  2. Quick Computation: The hash value should be quick to compute for any given input.
  3. Pre-image Resistance: It should be infeasible to generate the original input given only the hash value.
  4. Small Changes, Big Differences: A small change to the input should produce a significantly different hash.
  5. Collision Resistance: It should be infeasible to find two different inputs that produce the same hash value.
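
The following Python sketch, using the standard-library hashlib module, illustrates characteristics 1 and 4 from the list above: the same input always produces the same digest, while a one-character change produces a completely different digest.

    import hashlib

    def sha256_hex(data: bytes) -> str:
        """Return the SHA-256 digest of `data` as a hex string."""
        return hashlib.sha256(data).hexdigest()

    print(sha256_hex(b"good morning"))
    print(sha256_hex(b"good morning"))   # deterministic: identical to the line above
    print(sha256_hex(b"good morninG"))   # one character changed: very different digest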

Role in Modern Cryptography

Hash functions play a crucial role in various aspects of modern cryptography:

  1. Data Integrity: Hash functions ensure the integrity of data by generating a hash value that can be used to verify the original data has not been altered.
  2. Digital Signatures: They are used in creating digital signatures, where the hash of a message is encrypted with a private key to create a signature.
  3. Password Hashing: In authentication systems, passwords are stored as hashes rather than in plaintext to enhance security. Even if the hash is exposed, the original password remains protected (a short sketch follows this list).
  4. Cryptographic Protocols: Hash functions are integral to various cryptographic protocols, such as SSL/TLS for secure internet communication.
  5. Blockchain and Cryptocurrency: They are fundamental in the functioning of blockchain technology and cryptocurrencies, where they ensure data integrity and security of transactions.
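
As a sketch of item 3, here is salted password hashing with the standard library's PBKDF2 implementation. The iteration count is an illustrative figure; production systems tune it or use a dedicated scheme such as bcrypt, scrypt, or Argon2, and the helper names here are just for illustration.

    import hashlib
    import hmac
    import os

    def hash_password(password, salt=None):
        """Derive a salted hash; store the (salt, digest) pair, never the password."""
        salt = salt or os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
        return salt, digest

    def verify_password(password, salt, stored_digest):
        """Re-derive the hash with the stored salt and compare in constant time."""
        return hmac.compare_digest(hash_password(password, salt)[1], stored_digest)

    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))   # True
    print(verify_password("wrong guess", salt, digest))                    # False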

Key Milestones

1950s-1960s: Early Concepts
  • 1953: Hans Peter Luhn developed a technique that could be considered a precursor to hash functions, for efficient text searching.
1970s: Birth of Modern Hash Functions
  • 1971: Horst Feistel published the first detailed paper on cryptographic hash functions as part of his work on block ciphers.
  • 1975: The National Bureau of Standards (now NIST) published the proposed Data Encryption Standard (DES), formally adopted in 1977; block ciphers like DES later served as building blocks for early cryptographic hash constructions.
1980s: Early Hash Functions
  • 1980: Ralph Merkle introduced the concept of Merkle Trees, using hash functions for efficient and secure verification of data structures.
  • 1989: Ronald Rivest published MD2, followed in the early 1990s by MD4 and MD5; these hash functions became widely used but were eventually found to be vulnerable to collision attacks.
1990s: SHA Series Introduction
  • 1993: The National Institute of Standards and Technology (NIST) published the first Secure Hash Algorithm (SHA), known as SHA-0.
  • 1995: SHA-1 was introduced, addressing vulnerabilities found in SHA-0. It became widely used but was eventually found to be insecure.
2000s: Transition to More Secure Hash Functions
  • 2001: NIST published the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512), providing much stronger security compared to SHA-1.
  • 2004: The discovery of significant weaknesses in MD5 led to its deprecation in favor of more secure alternatives.
2010s: Modern Hash Functions
  • 2012: NIST announced the winner of the SHA-3 competition, Keccak, which was subsequently standardized as SHA-3 in 2015.
  • 2015: SHA-3 was officially published by NIST, providing an alternative to the SHA-2 family with different structural properties.
2020s: Continued Evolution
  • 2020: BLAKE3 was introduced, building on BLAKE2 (itself derived from BLAKE, a finalist in the SHA-3 competition) with a focus on high performance and security.
  • Ongoing: Research continues into post-quantum cryptographic hash functions, preparing for potential future quantum computing threats.

Cryptology — Wrap-up

In this article, we provided you with an overview of cryptology, its history, and its relevant technologies.

We hope this document can serve as a framework for further studies.