ZK Journey Week 3 — Why Prime? From Group Theory to Real Cryptography
Last week was theory — groups, rings, fields. This week, the question that made everything click: "But WHY does the modulus have to be prime?" One proof, two algorithms, and suddenly the theory had teeth.
The Question That Started Everything
Week 3 was an office hours session — no slides, no formal lecture, just João answering questions from the cohort. And it turned out to be one of the best sessions so far, because the discussion went from abstract theory straight to real cryptography.
Rinke opened with a deceptively simple question: "How can you prove that when the modulus is prime, all numbers have a multiplicative inverse?"
In Week 2, we just stated this as a fact — prime modulus means every non-zero element has an inverse, which is why prime fields work. But we never proved it. This week, we got the proof. And once you see the proof, everything about why cryptography cares about primes becomes obvious.
Proving Inverses Exist — Two Ways
João showed two different proofs. Both arrive at the same conclusion, but from different directions.
Proof 1: Fermat's Little Theorem
Fermat's Little Theorem says that if p is prime and gcd(a, p) = 1 (which is guaranteed when a is between 1 and p−1, since p is prime), then:
# Fermat's Little Theorem:
a^(p−1) ≡ 1 (mod p)
# Rewrite a^(p−1) as a^(p−2) × a:
a^(p−2) × a ≡ 1 (mod p)
# Divide both sides by a (legal because gcd(a, p) = 1); the inverse of a is a^(p−2):
a^(−1) ≡ a^(p−2) (mod p)
That's it. If p is prime and a ≠ 0, then a^(p−2) mod p is the inverse of a. You don't need to search for it; you can compute it directly. Python's pow(a, -1, p) returns the same value (under the hood it uses the extended Euclidean algorithm rather than exponentiation, but for a prime modulus the result is identical).
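The whole argument fits in a few lines of Python. A minimal sanity check, using a small illustrative prime rather than a cryptographic one:

```python
# Computing a modular inverse via Fermat's Little Theorem: a^(p-2) mod p.
# Small prime for demonstration; real systems use ~256-bit primes.

p = 101  # a small prime

for a in range(1, p):
    inv = pow(a, p - 2, p)       # Fermat: a^(p-2) mod p
    assert (a * inv) % p == 1    # it really is the inverse
    assert inv == pow(a, -1, p)  # matches Python's built-in modular inverse

print(pow(3, p - 2, p))  # → 34, since 3 × 34 = 102 ≡ 1 (mod 101)
```

Every non-zero element passes, which is exactly the "no exceptions when p is prime" claim from the proof.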
Proof 2: Extended Euclidean Algorithm
Bézout's identity says that for any two integers x and p there exist integers A and B with x·A + p·B = gcd(x, p), and the Extended Euclidean Algorithm computes them. When the gcd is 1:
# If gcd(x, p) = 1 (true when p is prime and 0 < x < p):
1 = x·A + p·B
# Take both sides mod p — the p·B term vanishes:
1 ≡ x·A (mod p)
# So A is the multiplicative inverse of x mod p
x^(−1) ≡ A (mod p)
Same conclusion, different route. The Extended Euclidean Algorithm actually finds the coefficients A and B, so it constructively gives you the inverse.
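The algorithm itself is short. A minimal recursive sketch in Python (illustrative values, not the course's code):

```python
# Extended Euclidean Algorithm: returns (g, A, B) with g = gcd(x, p) = x*A + p*B.
# When g == 1, A mod p is the multiplicative inverse of x mod p.

def extended_gcd(x, p):
    if p == 0:
        return x, 1, 0
    g, a, b = extended_gcd(p, x % p)
    # Unwind: gcd(p, x % p) = p*a + (x - (x // p)*p)*b
    return g, b, a - (x // p) * b

g, A, B = extended_gcd(7, 101)
assert g == 1
assert 7 * A + 101 * B == 1   # Bézout's identity holds
print(A % 101)                # → 29, since 7 × 29 = 203 ≡ 1 (mod 101)
```

Unlike the Fermat route, this works for any modulus coprime to x, prime or not, which is why it is the general-purpose way to compute inverses.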
The common thread: Both proofs depend on gcd(a, p) = 1. And here's the key insight — if p is prime, then gcd(a, p) = 1 for ALL a between 1 and p−1, automatically. No exceptions. That's what makes primes special: they're coprime with everything smaller than them.
What Breaks When P Isn't Prime
João then showed exactly what goes wrong with a non-prime modulus. Let's take p = 6 and try to find multiplicative inverses:
# Does 2 have an inverse? Need: 2 × ? ≡ 1 (mod 6)
2 × 1 = 2,  2 × 2 = 4,  2 × 3 = 0,  2 × 4 = 2,  2 × 5 = 4
# None give 1. No inverse. ❌ (gcd(2,6) = 2 ≠ 1)
# Does 3 have an inverse?
3 × 1 = 3,  3 × 2 = 0,  3 × 3 = 3,  3 × 4 = 0,  3 × 5 = 3
# No inverse. ❌ (gcd(3,6) = 3 ≠ 1)
# Does 4 have an inverse?
4 × 1 = 4,  4 × 2 = 2,  4 × 3 = 0,  4 × 4 = 4,  4 × 5 = 2
# No inverse. ❌ (gcd(4,6) = 2 ≠ 1)
# Does 5 have an inverse?
5 × 1 = 5,  5 × 2 = 4,  5 × 3 = 3,  5 × 4 = 2,  5 × 5 = 1
# Yes! 5 × 5 ≡ 1 (mod 6). ✅ (gcd(5,6) = 1)
Out of {1, 2, 3, 4, 5}, only 1 and 5 have inverses. The others — 2, 3, and 4 — are all non-coprime with 6, so they're stuck without inverses. You can't divide by 2 in this system. Division is broken.
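This breakage is easy to verify by brute force. A small Python sketch (the helper name `invertible` is mine, not from the session):

```python
# Which residues mod n have multiplicative inverses? Brute-force check.
from math import gcd

def invertible(n):
    # An element a is invertible iff some b in 1..n-1 gives a*b ≡ 1 (mod n).
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]

print(invertible(6))  # → [1, 5]: only the elements coprime with 6
print(invertible(7))  # → [1, 2, 3, 4, 5, 6]: prime modulus, everything works

# The coprimality criterion predicts exactly the same set:
assert invertible(6) == [a for a in range(1, 6) if gcd(a, 6) == 1]
```

The brute-force search and the gcd test always agree, which is the "common thread" from both proofs made executable.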
The Units of a Modulus
João introduced a concept I hadn't seen before: the units of a modulus n. These are the elements whose gcd with n equals 1 — the ones that do have multiplicative inverses. They form a group under multiplication.
For n = 7: units are {1, 2, 3, 4, 5, 6} — all 6 non-zero elements form a group
For n = 8: units are {1, 3, 5, 7} — only 4 out of 7 elements work
For n = 12: units are {1, 5, 7, 11} — only 4 out of 11 elements work
When n is prime: every non-zero element is a unit. That's the best case.
Euler's totient function φ(n) counts how many units n has.
When p is prime, the units are {1, 2, ..., p−1} — the full set. When p isn't prime, you lose elements. Some numbers become "dead" — you can't divide by them, can't find their inverses, and the group shrinks. For cryptography, you want the largest possible group. Prime gives you the maximum.
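Computing the units and φ(n) takes nothing more than a gcd test. A short illustrative sketch:

```python
# Units of Z/nZ and Euler's totient φ(n), by direct gcd test.
from math import gcd

def units(n):
    # The units are exactly the residues coprime with n.
    return [a for a in range(1, n) if gcd(a, n) == 1]

def phi(n):
    # Euler's totient counts the units.
    return len(units(n))

print(units(7))   # → [1, 2, 3, 4, 5, 6]  (prime: all p−1 elements)
print(units(8))   # → [1, 3, 5, 7]
print(units(12))  # → [1, 5, 7, 11]
assert phi(7) == 6 and phi(8) == 4 and phi(12) == 4
```

For any prime p, `phi(p)` is p−1, the maximum possible, which is the "prime gives you the largest group" point in code form.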
The Closure Argument — Why Primes Protect Groups
Brandon asked a sharp question: "If the multiplicative group 𝔽p* doesn't include zero, how can we be sure multiplying two elements never produces zero?"
João's answer was elegant. For a × b ≡ 0 (mod p), we'd need a × b to be a multiple of p. But if p is prime, the only way to factor p is 1 × p or p × 1. Since both a and b are between 1 and p−1, neither of them is p. So the product can never be a multiple of p. Zero can never appear.
Non-prime breaks closure. With mod 8: {1, 2, 3, 4, 5, 6, 7} under multiplication — but 2 × 4 = 8 ≡ 0 (mod 8). Zero just appeared inside a set that doesn't contain zero. The group isn't closed anymore. With mod 6: 2 × 3 = 6 ≡ 0. Same problem. A non-prime modulus means the set "leaks" — two perfectly valid elements can combine to produce something outside the set.
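The closure failure can also be checked mechanically. A small sketch that searches for the "zero divisor" pairs (the helper name is mine):

```python
# Closure check: can two non-zero residues multiply to zero mod n?
def zero_divisors(n):
    # All ordered pairs (a, b) of non-zero residues with a*b ≡ 0 (mod n).
    return sorted({(a, b) for a in range(1, n) for b in range(1, n)
                   if (a * b) % n == 0})

print(zero_divisors(7))  # → []: prime modulus, zero never appears
print(zero_divisors(6))  # → [(2, 3), (3, 2), (3, 4), (4, 3)]: closure broken
```

For any prime modulus the list is empty, which is exactly João's factoring argument: a product of two elements below p can never be a multiple of p.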
This immediately answered Brandon's earlier question about Diffie-Hellman: "If p is not prime, I don't have a group. And if I don't have a group, I can't define the discrete log problem." The entire security foundation of Diffie-Hellman depends on having a proper group, and you only get that with a prime modulus.
The Discrete Log Problem — How Big Is Big Enough?
Rinke asked the most practical question of the session: "In real terms, how big does the prime need to be for the discrete log problem to be secure?"
João's answer was blunt: it depends on which group you're using.
Traditional discrete log (Y = G^x mod p):
- Problem: know Y and G, find x
- Known sub-exponential attacks exist (number field sieve), plus generic ones (baby-step giant-step, Pohlig-Hellman)
- Required key size: ~4000-5000 bits
- Pre-ECC era: ElGamal signatures used this

Elliptic curve discrete log (P = x·G):
- Problem: know point P and generator G, find x
- No known efficient attacks
- Required key size: ~256 bits
- Modern era: ECDSA, used in Ethereum/Bitcoin
That's a massive difference. For the same security level, ECC needs keys roughly 20x smaller than traditional discrete log. This is why elliptic curves took over cryptography in the 2000s — same security, much smaller keys, faster operations. The old approach (ElGamal, which João mentioned by name) worked on the same philosophical principle, but needed 4000+ bit primes to be secure. ECC achieves equivalent security with 256 bits.
Why the difference? Because for the integer discrete log, mathematicians have found some clever sub-exponential algorithms — not fast enough to actually break it with large keys, but fast enough to require bigger keys. For the elliptic curve discrete log, no such algorithms exist. The best known attack is essentially brute force (Pollard's rho, which is still exponential). No mathematical shortcut has been found, so the keys can stay small.
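To make the generic-attack idea concrete, here is a minimal baby-step giant-step sketch on a toy group. The parameters are illustrative; against a 256-bit group this approach needs on the order of 2^128 steps, which is why generic attacks alone don't break well-chosen groups:

```python
# Baby-step giant-step: solves g^x ≡ y (mod p) in ~sqrt(p) time and memory.
# Toy parameters only; 2 is a generator of the full group mod 101.
from math import isqrt

def bsgs(g, y, p):
    m = isqrt(p - 1) + 1
    # Baby steps: precompute g^j for j = 0..m-1.
    baby = {pow(g, j, p): j for j in range(m)}
    factor = pow(g, -m, p)  # g^(-m) mod p (Python 3.8+ supports this)
    gamma = y
    # Giant steps: check y * g^(-i*m) against the baby table.
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = (gamma * factor) % p
    return None  # no solution in this group

p, g = 101, 2
print(bsgs(g, pow(g, 57, p), p))  # → 57: the secret exponent is recovered
```

At p = 101 the table has only 11 entries; at a 256-bit prime it would need ~2^128 entries, which is the whole point of large groups.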
João's phrasing stuck with me: "It's hard to find something better than elliptic curves because we don't know basically any kind of attack." The security of ECC isn't proven — it's based on the absence of known attacks. If someone discovers an efficient algorithm for the elliptic curve discrete log tomorrow, the entire foundation of blockchain cryptography collapses. But after decades of research, no one has.
The Quantum Elephant in the Room
Rinke raised the next obvious question: "But elliptic curves will also be broken by Shor's algorithm, right?"
Yes. Shor's algorithm, running on a sufficiently powerful quantum computer, would break both traditional discrete log and elliptic curve discrete log. Both. The entire foundation of modern public-key cryptography — RSA, Diffie-Hellman, ECDSA, everything based on the hardness of factoring or discrete logs — would fall.
João's response was pragmatic: quantum computers capable of running Shor's at scale don't exist yet. Classical cryptography remains secure against current hardware. But the threat is taken seriously enough that the entire field of post-quantum cryptography exists to prepare for that day.
Brandon brought up lattice-based cryptography — a family of cryptographic schemes based on entirely different hard problems (shortest vector problem, learning with errors) that are believed to be quantum-resistant. João acknowledged he wasn't deeply familiar with lattices, but confirmed they're a leading candidate for post-quantum standards. They use different algebraic structures than elliptic curves — they're not based on the discrete log problem at all.
STARKs: Built on hash functions only → Quantum-resistant
João: "If you want quantum resistance, you need to drop most of the cryptography you know." STARKs achieve this by using only hash functions — no elliptic curves, no pairings, no discrete log assumptions.
This is a major reason STARKs exist alongside SNARKs. Groth16 (what we're building toward in this course) produces the smallest proofs and fastest verification — but it's built on elliptic curves, so it's not quantum-safe. STARKs produce larger proofs but rely only on hash functions, making them quantum-resistant. In practice, some systems even wrap a STARK proof inside a Groth16 proof — using the STARK for the heavy computation and Groth16 for the final compression.
Hash Functions — Two Completely Different Worlds
This was the part of the discussion I found most surprising. A student asked whether hash functions use elliptic curves behind the scenes. João's answer: absolutely not. Traditional hash functions and elliptic curve cryptography are fundamentally different beasts.
Traditional Hashes: No Math, Just Chaos
SHA-2, SHA-3, MD5 — these are based on bit mixing. You take an input, shuffle it, rotate it, permute it, do another round, shuffle, rotate, permute, and output a fixed-size result. There is no mathematical structure. No hard problem. No formula. Just carefully designed chaos.
And here's the remarkable thing: no one has ever proven these are secure. There is no mathematical proof that SHA-256 is collision-resistant. It's secure because, after decades of cryptanalysis, nobody has been able to break it. That's a very different kind of security than "this is as hard as the discrete log problem."
João made the distinction sharply: Diffie-Hellman's security is based on the discrete log problem — a formally defined hard problem. If you break the DLP, you break Diffie-Hellman. SHA's security is based on nobody having broken it yet. There's no underlying hard problem to point to. Same for AES, DES, and all symmetric encryption. They're "ugly" — no elegant math, just effective bit mangling.
But this "ugliness" is exactly why hash functions are quantum-resistant. Shor's algorithm exploits mathematical structure — the periodicity in modular exponentiation, the group structure of elliptic curves. Hash functions have no structure to exploit. The best quantum attack on SHA-256 is Grover's algorithm, which only gives a square-root speedup (effectively halving the bit security from 256 to 128 — still plenty secure).
ZK-Friendly Hashes: Math Returns
Then João introduced a completely different category: hash functions designed specifically for zero-knowledge proofs. Names like Poseidon and Pedersen.
Unlike SHA, these hashes do have algebraic structure. They're built from mathematical formulas — operations like repeated exponentiation over finite fields. This structure is exactly what makes them useful for ZK: if you have a formula, you can build a circuit to verify that formula. When we learn circuits later in the course, this will make sense — a circuit is essentially a mathematical equation that a verifier checks.
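To illustrate what "a hash with a formula" looks like, here is a toy MiMC-style construction over a tiny prime field. The constants and structure are purely illustrative, nothing like production Poseidon:

```python
# Toy MiMC-style round function: x -> (x + k + c_i)^3 mod p, repeated.
# Every step is plain field arithmetic (additions and one cubing), so it
# translates directly into the gates of a ZK circuit.
# Illustrative constants only; NOT a real, secure hash.

p = 101  # small prime field; cubing is a bijection since gcd(3, p-1) = 1
ROUND_CONSTANTS = [3, 17, 42, 58, 77, 91]  # arbitrary per-round constants

def toy_mimc(x, k):
    for c in ROUND_CONSTANTS:
        x = pow(x + k + c, 3, p)  # one algebraic round
    return (x + k) % p

print(toy_mimc(5, 9))
# For a fixed key k, the map x -> toy_mimc(x, k) is a permutation of the field:
assert len({toy_mimc(x, 9) for x in range(p)}) == p
```

Compare that handful of field operations per round with the thousands of bit-level shuffles in a SHA-256 round, and the circuit-size gap in the table above stops being surprising.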
Traditional hashes (SHA-2, SHA-3):
- Just bit manipulation
- Proven secure? No (practical security only)
- Quantum-resistant? Yes
- ZK circuit size: ~millions of gates

ZK-friendly hashes (Poseidon, Pedersen):
- Has a mathematical formula
- Proven secure? No (but analyzable)
- Quantum-resistant? Uncertain
- ZK circuit size: ~hundreds of gates
That difference in circuit size is staggering — millions vs hundreds. If you're building a ZK proof that needs to verify a hash, using Poseidon instead of SHA-256 makes your circuit thousands of times smaller and cheaper. That's why ZK-friendly hashes were invented: not because SHA is insecure, but because SHA is impossibly expensive to prove in a circuit.
Practical note: Ethereum and Bitcoin signatures use SHA (specifically Keccak-256 for Ethereum). If your ZK circuit needs to verify an Ethereum signature, you're stuck using SHA inside the circuit — and paying the cost. But if you're designing a new system from scratch, you'd choose Poseidon or similar to keep circuits small. This is a real engineering trade-off in ZK system design.
Ethereum Signatures — S, R, and V
The final topic connected everything back to blockchain. João explained the three components of an Ethereum ECDSA signature: S, R, and V. As someone who audits Move smart contracts daily, I've seen these values in transaction objects and signature verification code — but I'd never understood where they come from mathematically.
R: comes from an elliptic curve point computed during signing (specifically, from that point's x-coordinate).
S: computed from the message hash, the private key, and R; it's the part that binds the signature to the message.
V: The recovery parameter. Allows recovering the public key (and therefore the address) from the signature itself.
This is why Ethereum transactions don't include the sender's address — it's derived from the signature using V. The signature proves who you are AND what you signed, simultaneously.
João mentioned that coding the ECDSA algorithm from scratch used to be one of the course homeworks — understanding where S comes from, where R comes from, and why V is needed. The S and R values come from elliptic curve point operations (which we'll learn about when we cover elliptic curves). V essentially tells the verifier which of two possible public keys the signature corresponds to, since the math produces an ambiguity that V resolves.
The security researcher angle: Understanding that V enables public key recovery explains why ecrecover in Solidity and similar functions in Move can derive a signer's address from just a signature. It also explains a class of bugs: if V is malleated (flipped between 27 and 28, or between 0 and 1 depending on the chain), the same message can produce two valid-looking signatures that recover to different addresses. Signature malleability bugs have been found in real protocols.
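The malleability trick is concrete enough to sketch. The curve order n below is the real secp256k1 constant; the r/s values are placeholders, and the low-s normalization mirrors the defense Ethereum adopted in EIP-2:

```python
# ECDSA signature malleability sketch: if (r, s) verifies, so does (r, n - s)
# with the recovery id flipped. N is the published order of secp256k1;
# the r/s values below are made-up placeholders, not a real signature.

N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def malleate(r, s, v):
    # The mirrored signature: same r, negated s, flipped recovery id (0/1 form).
    return r, N - s, v ^ 1

def normalize_low_s(r, s, v):
    # The standard defense (EIP-2 style): only accept s <= n/2.
    return (r, s, v) if s <= N // 2 else (r, N - s, v ^ 1)

r, s, v = 0x1234, 0x5678, 0  # placeholder values
r2, s2, v2 = malleate(r, s, v)
assert (s + s2) % N == 0  # the two s values are negatives of each other mod n
assert normalize_low_s(r2, s2, v2) == normalize_low_s(r, s, v)
```

A verifier that normalizes (or rejects) high-s signatures collapses the two variants into one, which is why checking for this is a standard audit item.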
"Why Do We Need to Know This Math?"
A student asked the question that I think everyone in the cohort was thinking: "Would knowing this math help me spot security mistakes in ZK implementations?"
João's answer was nuanced and honest. It depends on what you're doing:
- Writing ZK circuits? You need to know how to write circuits. The abstract algebra is helpful context, but you don't need to derive Fermat's Little Theorem to write a valid Circom template. Know what a group is, what a field is, and you're functional.
- Implementing a library (SnarkJS, Plonky3)? You need deep math. Multiplying two 256-bit numbers efficiently requires Number Theoretic Transforms, which are built on group theory. The math isn't optional — it's the implementation.
- Reading documentation and papers? You need the vocabulary. Papers will say "group," "field," "isomorphism," "homomorphism" without explanation. If you can't parse these terms, you can't read the literature.
- Auditing ZK systems? You need to know what the constraints should be. If a protocol uses a non-prime modulus where a prime is required, if a circuit fails to constrain a witness value, if a hash function is used where a ZK-friendly hash should be — you need enough understanding to recognize when something is wrong.
João's broader point resonated with how I think about auditing: "You can always go deeper. You can always ask 'why' one more time. But at some point, you accept certain facts and work with them. The important thing is to know enough that when something doesn't look right, you notice."
That's exactly how security research works too. You don't need to be able to prove every theorem. But you need enough understanding that when a protocol says "we use a 256-bit prime field" and you see they're actually using a non-prime modulus, alarm bells go off.
What's Next
Next week: elliptic curves. We're finally going to construct a group from geometric objects — points on a curve — and see the point addition operation that makes elliptic curve cryptography work. Everything from Weeks 1-3 (modular arithmetic, group theory, why primes matter) converges in elliptic curves. The finite field we built is the coordinate system. The group properties we learned are what the points satisfy. And the prime modulus, whose necessity we just proved, defines the field those coordinates live in.
João mentioned homomorphisms and isomorphisms: structure-preserving maps between groups. Public-key cryptography is built on maps that are easy to compute in one direction and hard to invert: easy to compute a public key from a private key, hard to go back. Elliptic curves are where that idea becomes concrete.
Week 3 done. The theory has teeth now — primes aren't arbitrary, hash functions aren't magic, and every Ethereum signature carries elliptic curve math inside it. Next week we meet the curves themselves. 🔑
📝 ZK and cryptography evolve fast — if any tip here is outdated, incorrect, or no longer applies, please reach out on X so I can update this article accordingly.
Instructor: João Paulo Morais (PhD Physics, 12+ years in ZK)
Duration: ~13-14 weeks
Goal: Code Groth16 from scratch
Week: 3 / 14
Format: Office hours Q&A — theory meets practice
Topics: Inverse proofs, units, discrete log sizes, ECC vs ElGamal, quantum, hash functions, ECDSA
Rare Skills: rareskills.io
Follow my journey: @thepantherplus