Hashing and encryption are both cryptographic operations, but they solve different problems and cannot be substituted for each other. Confusing them is one of the most consequential security mistakes developers make, because the wrong choice can expose passwords to brute-force cracking or make stored data permanently inaccessible.
The cleanest way to understand the difference: encryption is designed to be reversed by the right party. Hashing is designed to be irreversible by anyone. Those different design goals flow from different intended uses. Encryption protects data that needs to be read later. Hashing verifies data without needing to recover the original.
What makes this topic more interesting than a simple comparison is that hashing appears inside encryption protocols. SHA-256 is used inside TLS to verify certificate integrity, inside HMAC to authenticate messages, and inside digital signature schemes. Encryption protocols use hashing as a component. Understanding each individually is a prerequisite to understanding how they work together.
What Hashing Is: Deterministic, Fixed-Length, Irreversible
A cryptographic hash function takes any input of any length and produces a fixed-length output called a hash, digest, or fingerprint. SHA-256 always produces 256 bits of output regardless of whether the input is a single character or a 10-gigabyte file. The hash is deterministic: the same input always produces the same output. Change a single bit in the input and the output changes completely and unpredictably: on average, about half of the output bits flip. This property is called the avalanche effect.
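The three properties described above (fixed length, determinism, avalanche effect) can be observed directly with Python's standard `hashlib` module. This is a minimal sketch; the helper names `sha256_hex` and `differing_bits` are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters (256 bits)."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


# Fixed length: a 1-byte input and a 1,000,000-byte input both give 64 hex chars.
print(len(sha256_hex(b"a")), len(sha256_hex(b"a" * 1_000_000)))

# Deterministic: hashing the same bytes twice gives the same digest.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))

# Avalanche: b"hello" and b"helln" differ in a single input bit
# ('o' is 0x6f, 'n' is 0x6e), yet roughly half of the 256 output bits change.
print(differing_bits(b"hello", b"helln"))
```

The exact number of differing bits depends on the inputs, but for a sound hash function it stays close to 128 out of 256.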
Three security properties define a cryptographically strong hash function. Pre-image resistance: given a hash output, it is computationally infeasible to find any input that produces it. Second pre-image resistance: given an input and its hash, it is computationally infeasible to find a different input that produces the same hash. Collision resistance: it is computationally infeasible to find any two different inputs that produce the same hash output.
None of these properties are absolute. MD5 and SHA-1 are hash functions that were once considered secure but are now considered broken for security applications because collision attacks against them became practical. SHA-256 and SHA-3 are currently considered secure against known attacks. The practical meaning of a broken hash function for collision resistance is that an attacker can produce two different files with the same hash, which undermines uses like file integrity verification and certificate signing.
What hashing is used for
- Password storage: systems store the hash of a password rather than the password itself. At login, the entered password is hashed and compared to the stored hash. The original password is never stored.
- Certificate integrity: when a CA signs an SSL certificate, it first hashes the certificate data with SHA-256, then signs the hash. Verifying the signature confirms both the CA’s identity and that the certificate data has not been altered.
- File integrity verification: download pages for operating systems and software packages publish SHA-256 hashes of the files. After downloading, you hash the downloaded file and compare it to the published value. A match confirms the download was not corrupted or tampered with.
- Digital signatures: the same pattern used in SSL certificates applies broadly. You sign a hash of the data rather than the raw data. The hash provides a fixed-length representation that asymmetric cryptography can efficiently sign.
- HMAC (Hash-based Message Authentication Code): combines hashing with a secret key to produce a message authentication code. Used in TLS to verify that messages have not been altered in transit by someone without the session key.
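The file integrity check from the list above can be sketched in a few lines. The function names and the chunk size here are illustrative assumptions; the published hash would come from the download page of whatever you are verifying.

```python
import hashlib
import hmac
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """Compare the computed digest with the published one (case-insensitive)."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), expected)
```

Note that a match only proves the file equals whatever the publisher hashed. If an attacker can change both the file and the published hash, the check proves nothing, which is why published hashes are often signed as well.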
What Encryption Is: Reversible, Key-Dependent, Confidentiality-Focused
Encryption transforms plaintext into ciphertext using a key, in a process designed to be reversed by anyone who holds the correct key. The purpose is confidentiality: making data unreadable to anyone who does not have the key, while preserving the ability to recover the original data when needed.
Two categories of encryption algorithms are in widespread use. Symmetric encryption uses the same key for both encryption and decryption. AES is the dominant symmetric algorithm and is used for bulk data encryption: encrypting files, database fields, disk volumes, and the application data inside TLS sessions. Asymmetric encryption uses a mathematically linked key pair: one key encrypts and only the other decrypts. RSA and elliptic curve algorithms are the main asymmetric systems used for key exchange, digital signatures, and small data items like session key encryption.
In practice, most encrypted communication combines both: asymmetric cryptography is used to securely exchange a symmetric key, then symmetric encryption is used for the actual data. TLS uses this pattern: the handshake uses asymmetric cryptography to derive a shared session key, and AES encrypts the HTTP traffic using that session key. Pure asymmetric encryption of large data is computationally impractical.
What encryption is used for
- Data in transit: TLS encrypts all HTTPS traffic, email transmission, VPN tunnels, and API calls. Without encryption in transit, anyone on the network path can read the data.
- Data at rest: full-disk encryption, encrypted database columns, and encrypted file storage protect data if the physical storage is compromised.
- End-to-end messaging: Signal, WhatsApp, and S/MIME email encrypt messages so that only the intended recipient can read them. The service provider cannot decrypt the content.
- Secure key exchange: asymmetric cryptography allows two parties who have never communicated to establish a shared secret over an insecure channel. This is the foundation of HTTPS and every other public-key protocol.
The Core Difference: Purpose and Reversibility
The most important distinction between hashing and encryption is not technical but purposive: why is each being used, and does recovering the original data matter?
If the original data must be recoverable, encryption is required. If recovering the original data is explicitly not wanted, hashing is the correct tool.
Password storage is the canonical example of deliberately not wanting to recover original data. A system that stores passwords correctly never needs to read a stored password back. It only needs to answer the question: does what the user just entered match what was stored? Hashing answers this without storing the original password. If the database is breached, the attacker has hashes, not passwords.
| Property | Hashing | Encryption |
| Reversible? | No. A hash cannot be reversed to recover the input. | Yes. Ciphertext can be decrypted with the correct key. |
| Key required? | No. Hash functions are keyless (HMAC uses a key, but the basic hash function does not). | Yes. Encryption always uses one or more keys. |
| Output length | Fixed regardless of input length (SHA-256 always produces 256 bits) | Varies with input length and algorithm |
| Same input, same output? | Yes. Hashing is deterministic. | Not necessarily for good encryption. IVs and nonces add randomness to prevent pattern detection. |
| Primary purpose | Integrity verification, fingerprinting, password verification, digital signatures | Confidentiality. Keeping data unreadable except to key holders. |
| Can a breach expose the originals? | Not directly. Brute-force and rainbow table attacks may crack weak hashes. | Yes. Stolen keys decrypt all data encrypted with them. |
| Typical algorithms | MD5 (broken), SHA-1 (broken), SHA-256, SHA-3; bcrypt and Argon2 for passwords | AES (symmetric), RSA (asymmetric). ECDH is key agreement and ECDSA is a signature scheme; neither encrypts data directly. |
The Most Common Misapplication: Using SHA-256 to Store Passwords
SHA-256 is a secure hash algorithm. Using SHA-256 to hash passwords before storing them is one of the most widespread security mistakes in web development, and the reason it is wrong is not immediately obvious.
SHA-256 is designed to be fast. A modern GPU can compute billions of SHA-256 hashes per second. An attacker who obtains a database of SHA-256-hashed passwords can run every common password, every dictionary word, and every combination of letters and digits through SHA-256 billions of times per second and compare the results to the stolen hashes. Cracking a database of SHA-256-hashed common passwords takes seconds to minutes.
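The attack described above needs very little code, which is part of the problem. The following sketch runs a tiny, made-up wordlist against a set of unsalted SHA-256 hashes; the function name and sample passwords are illustrative, and a real attacker would run the same loop on GPUs with wordlists of billions of entries.

```python
import hashlib
from typing import Dict, Iterable, Set


def crack_unsalted_sha256(stolen_hashes: Set[str], candidates: Iterable[str]) -> Dict[str, str]:
    """Map each stolen hex digest to the candidate password that produces it."""
    recovered: Dict[str, str] = {}
    for guess in candidates:
        digest = hashlib.sha256(guess.encode("utf-8")).hexdigest()
        if digest in stolen_hashes:
            recovered[digest] = guess
    return recovered


# Hypothetical breached table: one common password, one long random-looking one.
stolen = {
    hashlib.sha256(b"password123").hexdigest(),
    hashlib.sha256(b"x9#Lq!v2Tz@47mWp").hexdigest(),
}
wordlist = ["123456", "qwerty", "password123", "letmein"]

found = crack_unsalted_sha256(stolen, wordlist)
print(len(found), "of", len(stolen), "hashes recovered")
```

Each guess costs one fast hash and one set lookup, and because there is no salt, a single pass over the wordlist tests every account in the database at once.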
The correct tools for password storage are bcrypt, scrypt, and Argon2. These are purpose-built password hashing functions designed to be computationally expensive: they include configurable work factors that make each hash computation take tens or hundreds of milliseconds of CPU time, and scrypt and Argon2 can also be tuned to require large amounts of memory. An attacker cracking bcrypt hashes with a GPU is limited, at typical work factors, to roughly tens of thousands of attempts per second rather than billions, making brute-force attacks orders of magnitude slower.
Password hashing functions also incorporate a salt: a random value that is unique per password and stored alongside the hash. The salt ensures that two users with the same password produce different stored hashes, preventing rainbow table attacks (precomputed tables mapping common passwords to their hashes) from being effective.
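As a rough illustration of salted, deliberately slow password hashing, here is a sketch using `hashlib.scrypt` from the Python standard library (available when Python is built against a recent OpenSSL). The cost parameters and helper names are assumptions for the example, not tuned recommendations; production systems typically use a maintained library for bcrypt or Argon2id.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed cost parameters for illustration only; tune them for your hardware.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def _derive(password: str, salt: bytes) -> bytes:
    """Run scrypt over the password with the given salt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, hash). A fresh random salt is generated per password."""
    salt = secrets.token_bytes(16)
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    return hmac.compare_digest(_derive(password, salt), stored_hash)
```

Both the salt and the hash are stored; the salt is not a secret. Its job is to make every stored hash unique so that precomputed tables and cross-account comparisons stop working.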
Never use SHA-256, SHA-512, or MD5 to hash passwords directly. Use bcrypt, scrypt, or Argon2id, the algorithms OWASP recommends for password storage. NIST SP 800-63B likewise requires passwords to be salted and hashed with a suitable one-way key derivation function. SHA-256 is correct for file integrity, certificate signing, and HMAC. It is wrong for passwords specifically because it is too fast and has no built-in salting. Common passwords hashed with plain SHA-256 can be recovered from a breached database in seconds to minutes on GPU hardware, and many less common ones fall within hours to days.
Where Hashing Appears Inside Encryption Protocols
The relationship between hashing and encryption is not just comparison and contrast. Hashing is a component of most encryption protocols and systems. Understanding this eliminates the false impression that they are completely separate domains.
SSL certificate signing
When a Certificate Authority issues an SSL certificate, the signing process uses both hashing and asymmetric cryptography. The CA hashes the certificate data using SHA-256, producing a 256-bit digest. The CA then signs that digest with its private key using RSA or ECDSA. The result is the certificate’s digital signature. When a browser validates the certificate, it independently computes the SHA-256 hash of the certificate data it received and checks the signature against that hash using the CA’s public key. With RSA, this check amounts to recovering the signed digest from the signature and comparing it to the computed one; ECDSA verifies through a different calculation that never recovers the digest. If the check passes, the certificate is confirmed unaltered and genuinely signed by the CA. Signing is not encryption, even though both use a key pair: nothing in the certificate is hidden. Hashing makes this efficient and tamper-evident: signing a fixed 256-bit hash is fast, and any change to the certificate changes the hash.
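The hash-then-sign pattern can be shown with a toy, textbook-RSA sketch using only Python integers (Python 3.8+ for the modular inverse). Everything about the key here is an assumption made for illustration and is deliberately insecure: the two primes are publicly known Mersenne primes, and there is no signature padding, which real RSA signatures (PKCS #1 v1.5 or PSS) require. ECDSA signatures work differently and are not modeled.

```python
import hashlib

# TOY KEY, insecure by construction: both primes are well-known Mersenne primes.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (modular inverse of E)


def digest_as_int(data: bytes) -> int:
    """SHA-256 digest of data, interpreted as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(data: bytes) -> int:
    """Sign the fixed-size digest, not the data itself (textbook RSA, no padding)."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Recover the signed digest with the public key and compare to a fresh hash."""
    return pow(signature, E, N) == digest_as_int(data)


certificate_data = b"subject=example.test; public-key=..."  # placeholder bytes
signature = sign(certificate_data)
print(verify(certificate_data, signature))          # True
print(verify(certificate_data + b"!", signature))   # False
```

However large the signed data is, the private-key operation only ever runs on a 256-bit value, and any change to the data produces a different digest that no longer matches the signature.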
HMAC in TLS
TLS 1.2 and earlier use HMAC in their non-AEAD cipher suites to verify that application data records have not been altered in transit. HMAC combines a secret key with a hash function: the MAC is computed over the record contents using a MAC key derived from the session secrets, producing an authentication tag. The recipient recomputes the HMAC and compares it to the transmitted tag. If they match, the message arrived unaltered from someone who holds the session keys. Without this integrity check, an attacker who intercepts TLS traffic could modify ciphertext and the recipient would have no way to detect it. TLS 1.3 drops the separate record MAC and allows only AEAD (Authenticated Encryption with Associated Data) ciphers, which integrate the same kind of integrity verification directly into the encryption operation.
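The tag-and-recompute flow can be sketched with Python's standard `hmac` module. This is not the TLS record format, only the underlying primitive; the key, message, and helper names are illustrative.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = secrets.token_bytes(32)   # known to sender and recipient only
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                          # True
print(verify_tag(shared_key, b"transfer 900 to account 42", tag))    # False
```

Unlike a plain hash, the tag cannot be recomputed by an attacker who alters the message, because producing a valid tag requires the key.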
HKDF in TLS 1.3 key derivation
TLS 1.3 derives session encryption keys from handshake secrets using HKDF (HMAC-based Key Derivation Function), which uses SHA-256 or SHA-384 as its underlying hash function, depending on the cipher suite. A full handshake produces a shared secret through ephemeral Diffie-Hellman key exchange. HKDF first extracts a uniform pseudorandom key from that shared secret, then expands it into the multiple secrets TLS 1.3 needs: keys for encrypting client-to-server traffic, server-to-client traffic, and handshake messages, plus the secret used for session resumption. Hashing is the mechanism through which a single shared secret becomes cryptographically independent keys for different purposes.
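To make the extract-and-expand idea concrete, here is a compact sketch of HKDF with SHA-256 following the construction in RFC 5869. The labels and the stand-in shared secret are assumptions for the example; the real TLS 1.3 key schedule wraps HKDF in its own labeled structure and is considerably more involved.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_extract(salt: bytes, input_key_material: bytes) -> bytes:
    """Condense the input secret into a fixed-length pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default: a string of zero bytes
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the pseudorandom key into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


# Stand-in for a Diffie-Hellman shared secret (hypothetical value).
shared_secret = bytes(range(32))
prk = hkdf_extract(b"", shared_secret)

# Different labels give unrelated keys from the same secret.
client_key = hkdf_expand(prk, b"example client traffic", 32)
server_key = hkdf_expand(prk, b"example server traffic", 32)
print(client_key != server_key)  # True
```

Because each output is an HMAC keyed by the extracted secret, knowing one derived key gives no practical way to compute another or to work back to the shared secret.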
The pattern that appears throughout cryptography: hash functions are used as building blocks inside encryption protocols because they provide deterministic, collision-resistant compression of arbitrary data into a fixed size. Signing a hash is faster than signing raw data. Deriving keys from a hash produces independent key material. Authenticating messages with a hash prevents tampering. Encryption protocols do not replace hashing; they use it.
Decision Guide: Which to Use and When
| Situation | Use | Why |
| Storing user passwords | Bcrypt, scrypt, or Argon2id (password hashing, not plain SHA-256) | Recoverable originals not needed; must be slow and salted to resist brute-force |
| Verifying a downloaded file was not corrupted | Hashing (SHA-256) | Compute hash of received file, compare to published value; original not needed |
| Protecting credit card numbers in a database | Encryption (AES-256) | Original number must be recoverable to process transactions |
| Verifying email content was not altered (S/MIME) | Hashing inside digital signature | Hash is signed with sender’s private key; recipient verifies hash match |
| Transmitting sensitive data over the network | Encryption (TLS) | Data must be readable by recipient; hashing alone would not allow recovery |
| Generating a unique identifier for a document | Hashing (SHA-256) | Fixed-length fingerprint that changes if document changes; original not needed |
| Storing API keys or tokens | Hashing (SHA-256; a salt adds little when tokens are long and random) | Application verifies submitted token by hashing it; original not stored |
| Encrypting a database backup | Encryption (AES-256) | Backup must be restorable; irreversible hashing would make it permanent garbage |
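The API token row in the table above can be sketched as follows. The sketch assumes tokens are long random values generated by the server, which is why a fast hash is acceptable here even though it is not for passwords; it also omits a salt so the digest can double as a lookup key, which is a design assumption of this example. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def issue_token() -> Tuple[str, str]:
    """Return (token, digest). Show the token to the user once; store only the digest."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    digest = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, digest


def token_matches(submitted: str, stored_digest: str) -> bool:
    """Hash the submitted token and compare to the stored digest in constant time."""
    submitted_digest = hashlib.sha256(submitted.encode("utf-8")).hexdigest()
    return hmac.compare_digest(submitted_digest, stored_digest)
```

If the token table leaks, the attacker holds digests of values with far too much entropy to guess, so the fast hash does not help them the way it does against human-chosen passwords.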
Frequently Asked Questions
What is the main difference between hashing and encryption?
Hashing is a one-way process that produces a fixed-length fingerprint of any input. It is designed to be irreversible: you cannot recover the original data from a hash. Encryption is a two-way process that transforms data into ciphertext using a key, designed so that the original data can be recovered by anyone with the correct key. The key question when choosing between them: do you need to recover the original data? If yes, encrypt. If deliberately not, hash. Passwords are deliberately not recovered, so they are hashed. Financial records must be recoverable, so they are encrypted.
Is SHA-256 encryption?
No. SHA-256 is a cryptographic hash function, not an encryption algorithm. It is one-way and produces a fixed 256-bit output from any input. There is no key and no decryption operation. SHA-256 is used for integrity verification, digital signatures, and as a component of encryption protocols. It does not provide confidentiality in the way encryption does: the output cannot be used to reconstruct the original data, so there is nothing for an authorized party to decrypt.
Can hashed data be reversed or cracked?
A cryptographic hash function cannot be mathematically reversed. There is no algorithm to compute the original input from a SHA-256 hash. However, hashes can be cracked through brute-force: trying every possible input and comparing the resulting hash to the target. Fast hash functions like SHA-256 and MD5 make brute-force practical for short or common inputs like passwords. Password hashing functions (bcrypt, Argon2) are designed to be slow, which makes brute-force impractical for all but the weakest passwords. Salting eliminates rainbow table attacks by making precomputed hash tables useless.
Why is hashing used in SSL certificates if it cannot protect confidentiality?
SSL certificate signing uses hashing for integrity, not confidentiality. When a CA signs a certificate, it hashes the certificate data with SHA-256 and signs the hash with its private key. Anyone with the CA’s public key can verify the signature and confirm the certificate data has not been changed since the CA signed it. This is integrity verification, not confidentiality. The certificate is a public document that should be readable by anyone. The hash ensures that no one can modify it without invalidating the signature.
What is salting and why does it matter for password storage?
A salt is a random value generated uniquely for each password before hashing. The salt is stored alongside the hash. When verifying a login, the system retrieves the stored salt, appends it to the entered password, hashes the combination, and compares to the stored hash. Salting has two effects. First, two users with the same password produce different hashes because their salts differ, so an attacker who sees the hash database cannot identify accounts with the same password. Second, salting defeats rainbow table attacks, which are precomputed tables mapping common passwords to their hashes. Since the salt is unique per password, precomputed tables cannot be used.
What is HMAC and how does it combine hashing and encryption?
HMAC (Hash-based Message Authentication Code) does not use encryption at all: it combines a hash function with a secret key to produce a message authentication code. The HMAC over a message is computed using both the message content and a shared secret key. Anyone with the key can verify the HMAC. Anyone without the key cannot forge a valid HMAC even if they know the hash algorithm. TLS 1.2 and earlier use HMAC to verify that data records have not been tampered with in transit, and TLS 1.3 still relies on it inside key derivation. HMAC adds the key-dependence that plain hashing lacks, making it suitable for authentication as well as integrity verification.
