Tighten cuid() regex and deprecate CUID v1 #5880
@pullfrog review
Reviewed PR #5880. One inline comment on the v3 regex.
Clean, well-motivated change. The regex tightening and deprecation are both correct in v4. One issue in v3: the /i flag on the new regex silently defeats the base36 restriction for the body characters.
Task list (5/5 completed)
- Read the full diff and understand the changes
- Review regex changes in v3 and v4 for correctness
- Review deprecation annotations
- Review test changes and coverage
- Submit review
Claude Opus | 𝕏
- const cuidRegex = /^c[^\s-]{8,}$/i;
+ const cuidRegex = /^c[0-9a-z]{6,}$/i;
The /i flag makes [0-9a-z] match uppercase A-Z too, so "cAAAAAAAAA" still passes in v3 even though the intent is to restrict the body to base36 (lowercase). The old regex [^\s-] was inherently case-agnostic so /i was a no-op, but with the new allowlist character class it has real semantic impact.
The v4 regex in regexes.ts gets this right — it drops /i and uses [cC] explicitly. The v3 regex should match:
- const cuidRegex = /^c[0-9a-z]{6,}$/i;
+ const cuidRegex = /^[cC][0-9a-z]{6,}$/;
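To make the effect of the `/i` flag concrete, here is a minimal sketch comparing the two candidate v3 patterns from the suggestion above. Only the two regex literals come from the review; the names `withFlag`, `withoutFlag`, and `sample` are illustrative and do not appear in the Zod source.

```typescript
// Illustrative names; only the two regex literals are taken from the review.
const withFlag = /^c[0-9a-z]{6,}$/i;      // proposed v3 regex: /i widens [0-9a-z] to also match A-Z
const withoutFlag = /^[cC][0-9a-z]{6,}$/; // suggested fix: explicit [cC] prefix, lowercase-only body

const sample = "cAAAAAAAAA"; // uppercase body, which real CUID v1 output never contains

console.log(withFlag.test(sample));        // true: the uppercase body slips through
console.log(withoutFlag.test(sample));     // false: the body is limited to 0-9 and a-z
console.log(withoutFlag.test("Cabc1234")); // true: an uppercase prefix is still accepted
```

The explicit `[cC]` class keeps the prefix case-insensitive without loosening the body, which is the behavior the v4 regex already has.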
- Change cuid regex from `/^[cC][^\s-]{8,}$/` to `/^[cC][0-9a-z]{6,}$/`
in v4. Restricts the body to base36 (the actual CUID v1 charset) and
aligns the length floor with Eric Elliott's documented guidance
("starts with c, contains at least 7 chars", paralleldrive/cuid#88).
Brings the regex stylistically in line with the other ID validators
(ulid, xid, ksuid, nanoid, cuid2), all of which use allowlist
character classes.
The change is backwards-compatible for any real CUID v1 output
(always lowercase base36) and additionally accepts shorter cuid-shaped
values that the prior `{8,}` floor incorrectly rejected. It rejects
the punctuation-shaped false positives reported in #3621 and similar.
- Mark CUID v1 surfaces `@deprecated` (function, chain method, schema
class, internals, regex export) pointing at z.cuid2(). Per the
upstream repo, "Deprecated collision-resistant id spec. Insecure
because it leaks timestamps. Use cuid2 instead."
- Add regression tests asserting that previously-accepted non-base36
payloads are now rejected. Update one template-literal test that was
relying on the old {8,} floor.
Zod 3 is in maintenance mode and is intentionally left untouched.
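
A deprecation like the one described in the commit message is usually expressed with a JSDoc `@deprecated` tag, which TypeScript-aware editors surface at call sites without changing runtime behavior. The following is a rough sketch only, not the Zod source: `isCuid`, `cuidV1Pattern`, and the wording of the deprecation note are hypothetical.

```typescript
// Hypothetical stand-alone validator; Zod's real API surface is different.
const cuidV1Pattern = /^[cC][0-9a-z]{6,}$/; // v4 regex quoted in the commit message

/**
 * Checks whether a string is shaped like a CUID v1.
 *
 * @deprecated CUID v1 leaks timestamps; prefer a cuid2-based check.
 *             (Illustrative wording, not the message used in the PR.)
 */
function isCuid(value: string): boolean {
  return cuidV1Pattern.test(value);
}

// The function still works at runtime; editors merely flag the call as deprecated.
console.log(isCuid("cly63t164000245zw008pggon")); // true
```

Because the tag is purely advisory, existing callers keep compiling and running, which matches the backwards-compatible intent stated above.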
4a34790 to c81b9e7
TL;DR: Replaces the overly permissive CUID v1 regex (
Key changes
Summary | 9 files | 1 commit | base:
Stricter CUID v1 regex
The old regex was a denylist: it accepted anything that wasn't whitespace or a hyphen. This meant strings like
CUID v1 deprecation
The deprecation message is consistent across all three packages (core, classic, mini) and reads: "CUID v1 is deprecated by its authors due to information leakage (timestamps embedded in the id). Use
No new issues. Reviewed the following changes:
- Dropped v3 regex and test changes, scoping the PR to v4 only (commit message now notes "Zod 3 is in maintenance mode and is intentionally left untouched")
- All v4 changes (regex tightening, deprecation annotations, tests) are unchanged from the previous version
Prior review feedback:
- `/i` flag on v3 regex defeats base36 restriction: addressed by removing v3 changes entirely
Summary
- Change the cuid regex from `/^[cC][^\s-]{8,}$/` to `/^[cC][0-9a-z]{6,}$/` in both v3 and v4. The body is now restricted to base36 (the actual CUID v1 charset), and the length floor is aligned with Eric Elliott's documented guidance ("starts with c, contains at least 7 chars"). This brings the regex stylistically in line with the other ID validators (`ulid`, `xid`, `ksuid`, `nanoid`, `cuid2`), all of which use allowlist character classes.
- Mark the CUID v1 surfaces (`z.cuid()`, the `.cuid()` chain method, `ZodCUID`/`ZodMiniCUID`, `$ZodCUID*`, `_cuid`, the `cuid` regex export) as `@deprecated`, pointing at `z.cuid2()`. The upstream `paralleldrive/cuid` repo itself reads: "Deprecated collision-resistant id spec. Insecure because it leaks timestamps. Use cuid2 instead."
Why
The `{8,}` length floor was introduced in #438 (May 2021) without documented rationale and has been frozen since. The original PR made an arbitrary choice that diverged from the spec author's "≥7 chars" guidance by accident.
The denylist body (`[^\s-]`) was also unprincipled: it allowed quotes, angle brackets, curly braces, semicolons, and arbitrary Unicode in a string nominally identifying itself as a CUID. Real CUID v1 output is always `c` followed by 24 base36 characters; the validator should reflect that. This change rejects the punctuation-shaped false positives reported in #3621 (`cly63t164000245zw008pggon';select1;` and similar) while remaining compatible with every real CUID v1 ever generated.
The change is strictly more permissive in length (`{8,}` → `{6,}`) and strictly stricter in charset (`[^\s-]` → `[0-9a-z]`). No real CUID is rejected; some non-CUID strings that previously passed now correctly fail.
This is not a security fix and should not be framed as one. CUID format validation is not, and cannot be, a substitute for output encoding (XSS prevention) or parameterized queries (SQLi prevention). It's a routine correctness/consistency improvement.
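The two directions of the change (looser length, tighter charset) can be checked side by side. This is a minimal sketch: both regex literals and the #3621 payload are quoted from the description above, while the variable names and the remaining sample strings are illustrative assumptions.

```typescript
// Regex literals are quoted from the PR description; names are illustrative.
const oldCuid = /^[cC][^\s-]{8,}$/;   // denylist body, at least 8 characters after the prefix
const newCuid = /^[cC][0-9a-z]{6,}$/; // base36 allowlist body, at least 6 characters after the prefix

const samples: string[] = [
  "cly63t164000245zw008pggon",           // c + 24 base36 chars: old true,  new true
  "cly63t164000245zw008pggon';select1;", // payload from #3621:  old true,  new false
  "c<script>",                           // punctuation body:    old true,  new false
  "cabc123",                             // 7 chars in total:    old false, new true
];

for (const s of samples) {
  console.log(JSON.stringify(s), { old: oldCuid.test(s), new: newCuid.test(s) });
}
```

Only the last sample is newly accepted, and only the two punctuation-bearing samples are newly rejected, which is the asymmetry the paragraph above describes.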
Known limitation (not addressed)
#3053 (`corrigendum` validates as a cuid) is not fully fixed. Any all-lowercase English word of seven or more letters starting with `c` will still pass; that's an inherent limitation of regex-based CUID validation without a fixed length, not a bug we can solve here. Deprecating `z.cuid()` in favor of `z.cuid2()` is the proper long-term resolution.
Test plan
- `pnpm vitest run`: 3575 tests pass
- `pnpm lint` clean (via lint-staged on commit)
- `Type Errors no errors` in vitest output
- `cuid` tests in v3 and v4 still pass with updated snapshots
- `{8,}` length floor
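
The known limitation noted above is easy to reproduce. In this sketch, `v4Cuid` is the regex from this PR, while `fixedLength` is a hypothetical stricter variant that is not part of the PR; it is derived only from the description's remark that real CUID v1 output is `c` followed by 24 base36 characters, and it would conflict with the "at least 7 chars" guidance the PR follows.

```typescript
const v4Cuid = /^[cC][0-9a-z]{6,}$/;   // regex introduced by this PR
const fixedLength = /^c[0-9a-z]{24}$/; // hypothetical fixed-length variant, NOT part of the PR

console.log(v4Cuid.test("corrigendum"));                    // true: an English word still passes
console.log(fixedLength.test("corrigendum"));               // false: wrong length
console.log(fixedLength.test("cly63t164000245zw008pggon")); // true: c + 24 base36 chars
```

A fixed length narrows the false positives but cannot eliminate them, since any 25-character lowercase alphanumeric string starting with `c` would still match.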