Schemata uses 80+ unique regex patterns across hundreds of JSON Schema files to validate Nostr data structures. This document catalogs every pattern, organized by category, with examples and codegen classification.
- 1. Cryptographic
- 2. Network
- 3. Nostr Coordinates
- 4. Numeric
- 5. Structured Metadata (imeta)
- 6. Identity
- 7. Protocol-Specific
- 8. Codegen Pipeline
Hex-encoded cryptographic values used throughout Nostr events.

| Pattern | Description | Used For | Codegen Op |
|---|---|---|---|
| `^[a-f0-9]{64}$` | 32-byte lowercase hex | pubkey, event id, shared secret | `hex(64, lower)` |
| `^[a-fA-F0-9]{64}$` | 32-byte mixed-case hex | SHA-256 hash (case-insensitive) | `hex(64, mixed)` |
| `^[a-f0-9]{128}$` | 64-byte lowercase hex | Schnorr signature | `hex(128, lower)` |
| `^[a-f0-9]{40}$` | 20-byte lowercase hex | Git SHA-1 hash | `hex(40, lower)` |
| `^[a-fA-F0-9]{40}$` | 20-byte mixed-case hex | SHA-1 hash (case-insensitive) | `hex(40, mixed)` |
| `^[a-f0-9]{7,40}$` | Abbreviated git SHA | Git short hash | `hex_range(7, 40, lower)` |
| `^0x[0-9a-f]{4}$` | 4-digit hex with 0x prefix | Color codes, identifiers | `hex_prefixed("0x", 4)` |
| `^(02\|03)[a-f0-9]{64}$` | Compressed secp256k1 public key (33 bytes) | NIP-61 P2PK pubkey tag | |

Examples:
- Valid pubkey (x-only): `a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2`
- Valid compressed pubkey (P2PK): `0229f1fad861410f3dcabb3cd75ceb0e8b7cc6a8d1fa17dbd10e8133c000326a96`
- Valid signature: 128 hex chars
- Invalid: `A1B2C3...` (uppercase in lowercase-only field), `xyz123` (non-hex)
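
The hex patterns above reduce to a length check plus a per-character range test. The following is a minimal sketch of that idea, not the generated code itself; the names `checkHex` and `checkCompressedPubkey` are hypothetical.

```typescript
// Hypothetical helper names; real generated code may be structured differently.
type HexCase = "lower" | "mixed";

function isHexCode(code: number, hexCase: HexCase): boolean {
  const isDigit = code >= 0x30 && code <= 0x39; // 0-9
  const isLower = code >= 0x61 && code <= 0x66; // a-f
  const isUpper = code >= 0x41 && code <= 0x46; // A-F
  return isDigit || isLower || (hexCase === "mixed" && isUpper);
}

// Equivalent in spirit to ^[a-f0-9]{len}$ (lower) or ^[a-fA-F0-9]{len}$ (mixed).
function checkHex(s: string, len: number, hexCase: HexCase): boolean {
  if (s.length !== len) return false;
  for (let i = 0; i < s.length; i++) {
    if (!isHexCode(s.charCodeAt(i), hexCase)) return false;
  }
  return true;
}

// Equivalent in spirit to ^(02|03)[a-f0-9]{64}$.
function checkCompressedPubkey(s: string): boolean {
  const hasPrefix = s.startsWith("02") || s.startsWith("03");
  return hasPrefix && checkHex(s.slice(2), 64, "lower");
}
```

With these helpers, the x-only pubkey example passes `checkHex(value, 64, "lower")`, while its uppercase form passes only with `"mixed"`.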
URL patterns for relay WebSocket connections, HTTP endpoints, and streaming protocols.

| Pattern | Description | Used For | Codegen Op |
|---|---|---|---|
| `^wss?://[a-zA-Z0-9._-]+(?::[0-9]+)?(?:/.*)?$` | Strict relay URL | Centralized relay URL type (`@/relay-url.yaml`) | `relay_url` |
| `^(ws://\|wss://).+$` | Permissive relay URL | Legacy relay positions (being migrated) | |
| `^wss?://.+` | Relay URL (no end anchor) | Relay hints | `starts_with_any` |
| `^(https?://).+$` | HTTP(S) URL | Web resources | `starts_with_any` |
| `^https?://.+$` | HTTP(S) URL (anchored) | Image/media URLs | `starts_with_any` |
| `^https?://\S+$` | HTTP(S) no whitespace | Strict URL references | `prefix_no_whitespace` |
| `^(https?://\|rtmp://\|ws://\|wss://).+$` | Multi-protocol | Streaming URLs | `starts_with_any` |
| `^/.+` | Absolute path | Resource paths | `starts_with_any` |

Valid relay URLs (strict pattern):
- `wss://relay.damus.io` -- standard relay
- `wss://relay.ditto.pub` -- NIP-50 search relay
- `wss://nos.lol` -- public relay
- `ws://localhost:7777` -- local dev
- `wss://relay.example_host.com` -- underscore in hostname
- `wss://abc123def456.onion` -- Tor hidden service

Invalid relay URLs (strict pattern):
- `wss://!!!` -- invalid hostname chars
- `wss:// has spaces` -- spaces not allowed
- `http://relay.example.com` -- wrong protocol
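
The strict relay pattern can be checked with a single left-to-right scan: scheme, hostname characters, optional port, optional path. The sketch below assumes that approach; `checkRelayUrl` is a hypothetical name and may not match what the `relay_url` op actually emits.

```typescript
// Hostname characters allowed by the strict pattern: [a-zA-Z0-9._-]
function isHostCode(c: number): boolean {
  return (
    (c >= 0x30 && c <= 0x39) || // 0-9
    (c >= 0x41 && c <= 0x5a) || // A-Z
    (c >= 0x61 && c <= 0x7a) || // a-z
    c === 0x2e || c === 0x5f || c === 0x2d // . _ -
  );
}

// Mirrors ^wss?://[a-zA-Z0-9._-]+(?::[0-9]+)?(?:/.*)?$ without a regex engine.
function checkRelayUrl(s: string): boolean {
  let i: number;
  if (s.startsWith("wss://")) i = 6;
  else if (s.startsWith("ws://")) i = 5;
  else return false;

  // Hostname: at least one allowed character.
  const hostStart = i;
  while (i < s.length && isHostCode(s.charCodeAt(i))) i++;
  if (i === hostStart) return false;

  // Optional port: ':' followed by at least one digit.
  if (s.charCodeAt(i) === 0x3a) {
    i++;
    const portStart = i;
    while (i < s.length && s.charCodeAt(i) >= 0x30 && s.charCodeAt(i) <= 0x39) i++;
    if (i === portStart) return false;
  }

  // Optional path: '/' followed by anything except line terminators,
  // because '.' in an ECMAScript regex does not match them.
  if (i === s.length) return true;
  if (s.charCodeAt(i) !== 0x2f) return false;
  for (let j = i + 1; j < s.length; j++) {
    const c = s.charCodeAt(j);
    if (c === 0x0a || c === 0x0d || c === 0x2028 || c === 0x2029) return false;
  }
  return true;
}
```

Run against the lists above, the six valid examples are accepted and the three invalid ones are rejected.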
Event address patterns used in `a` tags and kind-specific references.

| Pattern | Description | Used For | Codegen Op |
|---|---|---|---|
| `^\d+:[a-f0-9]{64}:.+$` | Generic address | `a` tag (kind:pubkey:dtag) | `a_tag` |
| `^30311:[a-f0-9]{64}:.+$` | Kind 30311 address | Live event `a` tag | `a_tag` |
| `^30312:[a-f0-9]{64}:.+$` | Kind 30312 address | Room `a` tag | `a_tag` |
| `^31990:[a-f0-9]{64}:.+$` | Kind 31990 address | NIP-89 handler address | `a_tag` |
| `^(31922\|31923):[a-f0-9]{64}:.+$` | Calendar kinds | Calendar event address | `a_tag` |

Example: `30311:a1b2c3...64hex...:my-live-event`
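
An address has three colon-separated parts, so a native check can split on the first colon and then validate fixed offsets. This is a sketch under that assumption; `checkATag` and its optional `allowedKinds` parameter are hypothetical and only illustrate how the kind-specific variants could share one routine.

```typescript
// Mirrors ^\d+:[a-f0-9]{64}:.+$; pass allowedKinds for kind-specific variants.
function checkATag(s: string, allowedKinds?: readonly string[]): boolean {
  // Kind: one or more ASCII digits before the first ':'.
  const firstColon = s.indexOf(":");
  if (firstColon <= 0) return false;
  for (let i = 0; i < firstColon; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x30 || c > 0x39) return false;
  }
  if (allowedKinds !== undefined && allowedKinds.indexOf(s.slice(0, firstColon)) === -1) {
    return false;
  }

  // Pubkey: exactly 64 lowercase hex chars, then ':' and at least one more char.
  const pubkeyStart = firstColon + 1;
  const pubkeyEnd = pubkeyStart + 64;
  if (s.length < pubkeyEnd + 2 || s.charCodeAt(pubkeyEnd) !== 0x3a) return false;
  for (let i = pubkeyStart; i < pubkeyEnd; i++) {
    const c = s.charCodeAt(i);
    const isHex = (c >= 0x30 && c <= 0x39) || (c >= 0x61 && c <= 0x66);
    if (!isHex) return false;
  }

  // d-tag: '.+' in an ECMAScript regex excludes line terminators.
  for (let i = pubkeyEnd + 1; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c === 0x0a || c === 0x0d || c === 0x2028 || c === 0x2029) return false;
  }
  return true;
}
```

For the calendar pattern, a call such as `checkATag(value, ["31922", "31923"])` restricts the kind to the two listed values.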
Number patterns for timestamps, dimensions, and amounts.

| Pattern | Description | Used For | Codegen Op |
|---|---|---|---|
| `^[0-9]+$` | Unsigned integer string | Timestamps, amounts, expiration | `all_digits` |
| `^\d+$` | Unsigned integer (shorthand) | Equivalent to above | `all_digits` |
| `^-?[0-9]+$` | Signed integer string | Relative values | `all_digits(allowNeg)` |
| `^\d+(?:\.\d+)?$` | Decimal number | Prices, coordinates | `decimal` |
| `^\d+x\d+$` / `^[0-9]+x[0-9]+$` | Dimensions | Image/video dimensions (WxH) | `dim` |
| `^[0-9]+(\.[0-9]+)*$` | Version number | Semantic versioning | `dotted_digits` |

Examples:
- `1711234567` -- valid timestamp
- `-100` -- valid signed integer
- `19.99` -- valid decimal
- `1920x1080` -- valid dimensions
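
The numeric patterns share one primitive: skip a run of ASCII digits and see where it ends. The sketch below builds `all_digits`, `decimal`, and `dim` checks on that primitive; the function names are hypothetical.

```typescript
// Returns the index of the first non-digit at or after `from`.
function skipDigits(s: string, from: number): number {
  let i = from;
  while (i < s.length && s.charCodeAt(i) >= 0x30 && s.charCodeAt(i) <= 0x39) i++;
  return i;
}

// ^[0-9]+$ or, with allowNeg, ^-?[0-9]+$
function checkAllDigits(s: string, allowNeg: boolean = false): boolean {
  const start = allowNeg && s.startsWith("-") ? 1 : 0;
  return s.length > start && skipDigits(s, start) === s.length;
}

// ^\d+(?:\.\d+)?$
function checkDecimal(s: string): boolean {
  const intEnd = skipDigits(s, 0);
  if (intEnd === 0) return false; // no integer part
  if (intEnd === s.length) return true; // integer only
  if (s.charCodeAt(intEnd) !== 0x2e) return false; // expected '.'
  const fracEnd = skipDigits(s, intEnd + 1);
  return fracEnd > intEnd + 1 && fracEnd === s.length;
}

// ^\d+x\d+$
function checkDim(s: string): boolean {
  const widthEnd = skipDigits(s, 0);
  if (widthEnd === 0 || s.charCodeAt(widthEnd) !== 0x78) return false; // expected 'x'
  const heightEnd = skipDigits(s, widthEnd + 1);
  return heightEnd > widthEnd + 1 && heightEnd === s.length;
}
```

Each of the four examples above passes its corresponding check, and `-100` passes only when `allowNeg` is set.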
Content prefix patterns for imeta tag entries (NIP-92).

| Pattern | Description | Codegen Op |
|---|---|---|
| `^url https?://\S+$` | Media URL entry | `prefix_no_whitespace` |
| `^m [a-zA-Z].*/.*` | MIME type entry | `mime_type` |
| `^dim [0-9]{1,5}x[0-9]{1,5}$` | Dimensions entry | `imeta_dim` |
| `^blurhash [A-Za-z0-9+/]+.*$` | Blurhash entry | `prefix_nonempty` |
| `^x [a-f0-9]{64}$` | SHA-256 hash entry | `hex_prefixed` |
| `^alt .+$` | Alt text entry | `prefix_nonempty` |
| `^ox [a-f0-9]{64}$` | Original hash entry | `hex_prefixed` |
| `^size \d+$` | File size entry | `prefix_delim_rest` |
| `^fallback https?://\S+$` | Fallback URL entry | `prefix_no_whitespace` |
| `^thumb \d+x\d+$` | Thumbnail dimensions | `dim` (prefixed) |

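
Most imeta entries are a literal key followed by a constrained tail. As an illustration, the sketch below covers two of the ops from the table, `prefix_no_whitespace` and `imeta_dim`; the function names and the way prefixes are passed in are assumptions, not the tool's actual interface.

```typescript
// Code units matched by ECMAScript `\s`; `\S` is the complement.
function isWhitespaceCode(c: number): boolean {
  return (
    (c >= 0x09 && c <= 0x0d) || c === 0x20 || c === 0xa0 || c === 0x1680 ||
    (c >= 0x2000 && c <= 0x200a) || c === 0x2028 || c === 0x2029 ||
    c === 0x202f || c === 0x205f || c === 0x3000 || c === 0xfeff
  );
}

// prefix_no_whitespace: one of `prefixes`, then one or more non-whitespace chars.
function checkPrefixNoWhitespace(s: string, prefixes: readonly string[]): boolean {
  return prefixes.some((p) => {
    if (!s.startsWith(p) || s.length === p.length) return false;
    for (let i = p.length; i < s.length; i++) {
      if (isWhitespaceCode(s.charCodeAt(i))) return false;
    }
    return true;
  });
}

// imeta_dim: "dim " + 1-5 digits + "x" + 1-5 digits, as in ^dim [0-9]{1,5}x[0-9]{1,5}$
function checkImetaDim(s: string): boolean {
  if (!s.startsWith("dim ")) return false;
  let i = 4;
  for (let part = 0; part < 2; part++) {
    const start = i;
    while (i < s.length && i - start < 5 && s.charCodeAt(i) >= 0x30 && s.charCodeAt(i) <= 0x39) i++;
    if (i === start) return false; // need at least one digit
    if (part === 0) {
      if (s.charCodeAt(i) !== 0x78) return false; // expected 'x' after the width
      i++;
    }
  }
  return i === s.length;
}
```

Under these assumptions, `^url https?://\S+$` corresponds to `checkPrefixNoWhitespace(entry, ["url http://", "url https://"])`.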
Patterns for identifiers, bech32-encoded entities, and user references.

| Pattern | Description | Used For | Codegen Op |
|---|---|---|---|
| `^npub1[02-9ac-hj-np-z]{58}$` | Public key (fixed 63 chars) | npub display format | `bech32("npub", 58)` |
| `^note1[02-9ac-hj-np-z]{58}$` | Event ID (fixed 63 chars) | note display format | `bech32("note", 58)` |
| `^nprofile1[02-9ac-hj-np-z]+$` | Profile (TLV, variable) | nprofile entities | `bech32("nprofile")` |
| `^nevent1[02-9ac-hj-np-z]+$` | Event (TLV, variable) | nevent entities | `bech32("nevent")` |
| `^naddr1[02-9ac-hj-np-z]+$` | Address (TLV, variable) | naddr entities | `bech32("naddr")` |
| `^nostr:(npub\|note)1...` | NIP-27 URI | nostr: protocol links | `nostr_uri` |
| `^lnurl1[02-9ac-hj-np-z]+$` | LNURL (bech32) | Lightning URL encoding | `bech32("lnurl")` |

Note: These patterns validate character set, prefix, and length only. Bech32 checksum validation requires actual decoding and is not expressible in regex. Use a bech32 library for full validation.
The character class `[02-9ac-hj-np-z]` is the standard Bech32 alphabet: the lowercase alphanumerics minus `1` (the separator), `b`, `i`, and `o` (confusable with digits).
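
A shape-only bech32 check follows directly from the character class described above. The sketch below validates prefix, data length, and alphabet, and deliberately does not verify the checksum; `checkBech32` is a hypothetical name modeled on the `bech32(hrp, dataLen?)` op.

```typescript
// True for code units in [02-9ac-hj-np-z].
function isBech32Code(c: number): boolean {
  if (c === 0x30) return true; // 0
  if (c >= 0x32 && c <= 0x39) return true; // 2-9
  if (c < 0x61 || c > 0x7a) return false; // not a-z
  return c !== 0x62 && c !== 0x69 && c !== 0x6f; // a-z except b, i, o
}

// Shape check only: hrp + "1" + data characters. No checksum verification.
// With dataLen the data part must have exactly that length, otherwise at least 1.
function checkBech32(s: string, hrp: string, dataLen?: number): boolean {
  const prefix = hrp + "1";
  if (!s.startsWith(prefix)) return false;
  const dataCount = s.length - prefix.length;
  if (dataLen === undefined ? dataCount < 1 : dataCount !== dataLen) return false;
  for (let i = prefix.length; i < s.length; i++) {
    if (!isBech32Code(s.charCodeAt(i))) return false;
  }
  return true;
}
```

A string that passes this check can still be an invalid entity, so a real decoder is needed wherever the payload is actually used.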

| Pattern | Description | Codegen Op |
|---|---|---|
| `^(([_A-Za-z0-9.-]+)\|_)@...` | NIP-05 identifier | `nip05_identifier` |
| `^[a-z0-9._-]+$` | Simple identifier | `chars_in("a-z0-9._-")` |
| `^[A-Za-z0-9]+$` | Alphanumeric | `chars_in("A-Za-z0-9")` |
| `^[A-Za-z]+$` | Alpha only | `chars_in("A-Za-z")` |
| `^[A-Z]+$` | Uppercase alpha | `chars_in("A-Z")` |
| `^[A-Za-z]{3,6}$` | Short alpha code | `chars_in("A-Za-z", 3, 6)` |

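
The `chars_in` rows describe a character-set membership test with optional length bounds. The sketch below models the charset as a list of inclusive ranges rather than parsing the `"a-z0-9._-"` string form, which is an assumption made for brevity; `checkCharsIn`, `ALPHA`, and `SIMPLE_ID` are hypothetical names.

```typescript
// Each range is an inclusive [first, last] pair of single characters.
type CharRange = readonly [string, string];

// Hypothetical range tables for two patterns from the table above.
const ALPHA: readonly CharRange[] = [["A", "Z"], ["a", "z"]];
const SIMPLE_ID: readonly CharRange[] = [["a", "z"], ["0", "9"], [".", "."], ["_", "_"], ["-", "-"]];

// True when every character of s falls in one of the ranges and
// the length is within [min, max]. Defaults model the `+` quantifier.
function checkCharsIn(
  s: string,
  ranges: readonly CharRange[],
  min: number = 1,
  max: number = Infinity,
): boolean {
  if (s.length < min || s.length > max) return false;
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    let inSet = false;
    for (const [lo, hi] of ranges) {
      if (code >= lo.charCodeAt(0) && code <= hi.charCodeAt(0)) {
        inSet = true;
        break;
      }
    }
    if (!inSet) return false;
  }
  return true;
}
```

Here `checkCharsIn(code, ALPHA, 3, 6)` plays the role of `^[A-Za-z]{3,6}$`, and `checkCharsIn(id, SIMPLE_ID)` plays the role of `^[a-z0-9._-]+$`.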

| Pattern | Description | Codegen Op |
|---|---|---|
| `^lnbc[a-z0-9]*1[02-9ac-hj-np-z]+$` | BOLT11 invoice | `ln_invoice` |
| `^lnurl1[02-9ac-hj-np-z]+$` | LNURL bech32 | `bech32("lnurl")` |


| Pattern | Description | Codegen Op |
|---|---|---|
| `^[a-f0-9]{40}$` | SHA-1 commit hash | `hex(40, lower)` |
| `^[a-f0-9]{7,40}$` | Abbreviated commit | `hex_range(7, 40)` |
| `^refs/(heads\|tags)/[^\s]+$` | Git branch/tag ref | `prefix_no_whitespace` |
| `^refs/.*` | Any git ref | `starts_with_any` |


| Pattern | Description | Codegen Op |
|---|---|---|
| `^[0-9]{4}-[0-9]{2}-[0-9]{2}$` | ISO 8601 date | `date_iso` |
| `^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}...)?$` | Full ISO 8601 datetime | `datetime_iso` |

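
The date pattern is purely positional, so a native check can test each of the ten positions directly. The sketch below does exactly what the regex does and no more: like the pattern, it validates the `YYYY-MM-DD` shape and does not reject impossible calendar dates. `checkDateIso` is a hypothetical name.

```typescript
// Mirrors ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ (shape only, no calendar validation).
function checkDateIso(s: string): boolean {
  if (s.length !== 10) return false;
  for (let i = 0; i < 10; i++) {
    const c = s.charCodeAt(i);
    if (i === 4 || i === 7) {
      if (c !== 0x2d) return false; // '-' separators at fixed positions
    } else if (c < 0x30 || c > 0x39) {
      return false; // every other position must be a digit
    }
  }
  return true;
}
```

Note that a value such as `2024-13-99` passes, just as it does under the regex; range checks on month and day would need a separate step.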

| Pattern | Description | Codegen Op |
|---|---|---|
| `^[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*/...` | MIME type (RFC 2045) | `mime_type_strict` |

| Pattern | Description | Codegen Op |
|---|---|---|
| `^-----BEGIN PGP SIGNATURE-----...` | PGP signature block | `wrapped` |

| Pattern | Description | Codegen Op |
|---|---|---|
| `^(?:[A-Za-z0-9+/]{4})*...` | Base64 (standard) | `base64` |
| `^[A-Za-z0-9+/]+={0,2}\?iv=...` | Base64 with IV (NIP-04) | `nip04_encrypted` |
| `^10\.\d{4,9}/[-._;()/:A-Z0-9]+$` | DOI (Digital Object Identifier) | `doi` |

The schemata-codegen tool converts these regex patterns into native, regex-free code for multiple target languages.
- Extract: Codegen reads `dist/` JSON schemas and extracts all `pattern` fields
- Classify: `classify-pattern.ts` converts each regex into a `PatternCheck` IR node
- Emit: Language-specific emitters render each `PatternCheck` as native code
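
To make the Extract step concrete, here is a minimal sketch of collecting `pattern` strings from a parsed schema. It is an illustration only: `extractPatterns` and the `Json` type are hypothetical, and the walk is deliberately naive (it treats any string-valued `pattern` key as a regex, including ones that might sit inside `enum`, `const`, or `examples` values, and it ignores `patternProperties` keys).

```typescript
// Minimal JSON value type for a parsed schema document.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively collects every string-valued "pattern" key into a set,
// so duplicate regexes across schemas are counted once.
function extractPatterns(node: Json, out: Set<string> = new Set<string>()): Set<string> {
  if (Array.isArray(node)) {
    for (const child of node) extractPatterns(child, out);
  } else if (node !== null && typeof node === "object") {
    for (const key of Object.keys(node)) {
      const value = node[key];
      if (value === undefined) continue;
      if (key === "pattern" && typeof value === "string") {
        out.add(value);
      } else {
        // Also covers a property that happens to be named "pattern".
        extractPatterns(value, out);
      }
    }
  }
  return out;
}
```

Deduplicating at this stage is what makes a count of unique patterns meaningful before classification.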
PatternCheck =
| hex(len, case) -- fixed-length hex string
| hex_range(min, max, case) -- variable-length hex string
| hex_prefixed(prefix, len) -- hex with literal prefix
| all_digits(allowNeg?) -- numeric string
| starts_with_any(prefixes) -- prefix check
| chars_in(charset, min?, max?) -- character set validation
| bech32(hrp, dataLen?) -- bech32-encoded string
| date_iso -- ISO 8601 date (YYYY-MM-DD)
| datetime_iso -- ISO 8601 datetime with optional time/timezone
| decimal -- decimal number (digits with optional .digits)
| relay_url -- WebSocket relay URL with hostname/port/path validation
| a_tag -- Nostr coordinate (kind:hex64:dtag)
| nostr_uri -- NIP-27 nostr: URI with bech32 entity
| ln_invoice -- BOLT11 Lightning invoice
| nip05_identifier -- NIP-05 internet identifier
| nip04_encrypted -- NIP-04 encrypted payload (base64?iv=base64)
| mime_type / mime_type_strict -- RFC 2045 content type
| base64 -- standard Base64
| doi -- Digital Object Identifier
| dim -- dimensions (WxH)
| imeta_dim -- prefixed dimensions (dim WxH)
| prefix_no_whitespace -- prefix + no-whitespace tail
| prefix_nonempty -- prefix + non-empty tail
| prefix_delim_rest -- charset + delimiter + tail
| wrapped(open, close) -- delimited block (e.g., PGP signature)
| compound(checks[]) -- AND combination
| regex(pattern) -- fallback (preserved as-is)
Input: ^[a-f0-9]{64}$
Output: { op: 'hex', len: 64, case: 'lower' }
C: check_hex_64(s)
--> iterates 64 chars, checks each is 0-9 or a-f, verifies null terminator
Rust: check_hex_64(s)
--> s.len() == 64 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
Go: checkHex64(s)
--> len(s) == 64 && all chars in [0-9a-f]
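
The Input/Output step in the example above can be sketched as a small classifier for the hex family. This is an assumption about how such a classifier might look, not the contents of `classify-pattern.ts`; `classifyHex` and the `HexCheck` type are hypothetical. It uses a regex at build time to recognize the pattern text, which does not affect the regex-free generated validators.

```typescript
// Hypothetical subset of the IR covering only the hex family plus the fallback.
type HexCheck =
  | { op: "hex"; len: number; case: "lower" | "mixed" }
  | { op: "hex_range"; min: number; max: number; case: "lower" | "mixed" }
  | { op: "regex"; pattern: string };

// Recognizes ^[a-f0-9]{N}$, ^[a-fA-F0-9]{N}$ and their {min,max} variants.
// Anything else is preserved as a regex fallback node.
function classifyHex(pattern: string): HexCheck {
  const m = /^\^\[a-f(A-F)?0-9\]\{(\d+)(?:,(\d+))?\}\$$/.exec(pattern);
  if (m === null) return { op: "regex", pattern };
  const hexCase = m[1] !== undefined ? "mixed" : "lower";
  const min = Number(m[2]);
  if (m[3] === undefined) return { op: "hex", len: min, case: hexCase };
  return { op: "hex_range", min, max: Number(m[3]), case: hexCase };
}
```

For the input shown above, `classifyHex("^[a-f0-9]{64}$")` yields `{ op: "hex", len: 64, case: "lower" }`.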
100% of patterns are classified into native ops (no regex dependency needed). All 80 unique patterns in the current schema set have dedicated native implementations across 13 languages: C, Rust, Go, TypeScript, Python, Java, Kotlin, Swift, Dart, C#, C++, PHP, Ruby.
This means generated validators can run without importing any regex library. Each native op is implemented with exhaustive compiler guards (TypeScript switch with never default, Rust match exhaustiveness, etc.) so adding a new op is a compile error until all 13 emitters handle it.
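
The exhaustiveness guard mentioned above can be illustrated with a TypeScript discriminated union and a `never` default. This is a sketch with a hypothetical three-op subset of the IR and a hypothetical `emitRust` function; string escaping via `JSON.stringify` is a simplification that only holds for plain ASCII prefixes.

```typescript
// Hypothetical three-op subset of the IR.
type Check =
  | { op: "hex"; len: number }
  | { op: "all_digits" }
  | { op: "starts_with_any"; prefixes: string[] };

// Accepts only `never`: reaching this with a real value is a type error.
function assertNever(x: never): never {
  throw new Error("unhandled op: " + JSON.stringify(x));
}

// Renders a Rust boolean expression over a `&str` binding named `s`.
function emitRust(check: Check): string {
  switch (check.op) {
    case "hex":
      return `s.len() == ${check.len} && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))`;
    case "all_digits":
      return "!s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())";
    case "starts_with_any":
      return check.prefixes.map((p) => `s.starts_with(${JSON.stringify(p)})`).join(" || ");
    default:
      // If a new op is added to Check without a case above, `check` is no
      // longer `never` here and this line fails to compile.
      return assertNever(check);
  }
}
```

Adding a fourth member to `Check` without a matching `case` makes the `default` branch a compile error, which is the property the paragraph above relies on.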
- Add the regex to the appropriate schema YAML in `nips/`
- Run `pnpm build` to generate `dist/` JSON
- If the pattern should be native, add a classifier in `classify-pattern.ts`
- Add rendering in each emitter (`emit-c.ts`, `emit-rust.ts`, etc.)
- Add tests in `tests/classify-pattern.test.ts`
- schemata-codegen -- pattern classifier and multi-language code generator
- NIP-01 -- base protocol types
- NIP-19 -- bech32 entity encoding