# Email Address Validation (Pre-Send Gate)

**Owner:** CodeCraft (tooling)
**Implemented:** 2026-04-18 (after Quran outreach 2-bounce incident)
**Script:** `~/system/tools/email-address-validate.js`
**Cache:** `~/system/databases/email-address-cache.sqlite` (7-day TTL)

## Why

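First sends to uncatalogued institutional addresses (Al-Burhan, FIN Sarajevo) bounced because the addresses rested on typos and wrong-user assumptions. An MX-lookup gate in front of SMTP send catches the domain-level part of that failure class (mistyped or nonexistent domains) before a message leaves the building. Wrong-mailbox errors on a valid domain are only partly covered; see "What it does NOT catch".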

## What it does

| Layer | Catches | Cost |
|-------|---------|------|
| Syntax check (RFC 5322 simplified) | Malformed addresses, e.g. `invalid@` (missing domain) | negligible |
| DNS MX lookup | Nonexistent domain, missing MX records | ~50–200 ms first time, cached after |
| SMTP RCPT probe (optional, `--probe`) | Hard 550 rejections on strict servers | ~1–5 s |
| Cache | Repeat validation on known addresses | 0 ms |
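The first two layers plus the cache can be sketched roughly as follows. This is a minimal illustration, not the contents of `email-address-validate.js`: the names `checkSyntax`, `validateAddress`, and `makeCachedValidator` are hypothetical, the regex is a loose stand-in for the simplified RFC 5322 check, and an in-memory `Map` replaces the SQLite cache. The resolver is injectable so the logic can be exercised without network access.

```javascript
const dns = require('node:dns/promises');

// Loose syntax check (assumption): exactly one "@", a non-empty local part,
// and a domain containing at least one dot. Not a full RFC 5322 parser.
function checkSyntax(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// DNS error codes treated as "this domain cannot receive mail".
const HARD_DNS_ERRORS = new Set(['ENOTFOUND', 'ENODATA']);

// Returns a result whose `code` mirrors the CLI exit codes:
// 0 valid, 1 invalid, 2 transient/unknown.
async function validateAddress(email, resolveMx = (d) => dns.resolveMx(d)) {
  if (!checkSyntax(email)) return { code: 1, reason: 'syntax' };
  const domain = email.slice(email.lastIndexOf('@') + 1);
  try {
    const records = await resolveMx(domain);
    if (records.length === 0) return { code: 1, reason: 'no-mx' };
    return { code: 0, reason: 'ok', mx: records.map((r) => r.exchange) };
  } catch (err) {
    if (HARD_DNS_ERRORS.has(err.code)) return { code: 1, reason: 'no-mx' };
    return { code: 2, reason: 'dns-error' }; // timeouts, SERVFAIL, etc.
  }
}

const TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7-day TTL, as stated for the cache

// In-memory stand-in for the SQLite cache (assumption: keyed by address).
function makeCachedValidator(validate, now = Date.now) {
  const cache = new Map();
  return async (email) => {
    const hit = cache.get(email);
    if (hit && now() - hit.at < TTL_MS) return { ...hit.result, cached: true };
    const result = await validate(email);
    // Assumption: transient results are not cached, so they are retried.
    if (result.code !== 2) cache.set(email, { at: now(), result });
    return result;
  };
}
```

Keeping transient DNS failures out of the cache is a design choice made for this sketch only; whether the real script does the same is not stated here.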

## What it does NOT catch

**Gmail-hosted domains respond `250 accepted` to RCPT even for nonexistent recipients.** The real NDR (non-delivery report) arrives seconds or minutes later from the submission pipeline. Since many academic and institutional domains are on Google Workspace, RCPT probing is not a reliable filter for the class of errors we hit.

**Mitigation:** the validator prints a WARN when sending to a first-seen Gmail-hosted address, instructing the caller to verify manually.
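One way the first-seen warning could be derived from the MX lookup is sketched below. The function names, the suffix list, and the exact warning text are assumptions for illustration; the real validator may detect Google hosting differently.

```javascript
// MX host suffixes assumed to indicate Google-hosted mail
// (e.g. aspmx.l.google.com, smtp.google.com, aspmx2.googlemail.com).
const GOOGLE_MX_SUFFIXES = ['.google.com', '.googlemail.com'];

// True if any MX host for the domain points at Google's mail servers.
function isGoogleHosted(mxHosts) {
  return mxHosts.some((host) => {
    const h = host.toLowerCase().replace(/\.$/, ''); // drop trailing root dot
    return GOOGLE_MX_SUFFIXES.some((suffix) => h.endsWith(suffix));
  });
}

// Returns a warning string for first-seen Google-hosted addresses, else null.
// `seen` is a hypothetical Set of addresses that were validated before.
function firstSeenWarning(email, mxHosts, seen) {
  if (seen.has(email) || !isGoogleHosted(mxHosts)) return null;
  return `WARN: ${email} is a first-seen Gmail-hosted address; ` +
    'RCPT probing cannot confirm the mailbox, verify manually.';
}
```

For example, `firstSeenWarning('x@uni.example', ['aspmx.l.google.com'], new Set())` returns a warning string, while the same call with the address already in `seen` returns `null`.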

## Integration

`sendEmail()` in `mail-native.js` calls the validator before the Himalaya/SMTP path. A validator exit code of 1 (syntax failure or no MX) hard-blocks the send; `--force` bypasses the block.
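The gate could look roughly like the sketch below. It is not the code in `mail-native.js`: `runValidator` and `gateDecision` are hypothetical names, and the handling of exit code 2 (warn and leave the decision to the caller) is an assumption, since only the exit-1 hard block is specified here.

```javascript
const { spawnSync } = require('node:child_process');
const os = require('node:os');
const path = require('node:path');

const VALIDATOR = path.join(os.homedir(), 'system/tools/email-address-validate.js');

// Runs the validator CLI and returns its exit code.
function runValidator(email) {
  const res = spawnSync(process.execPath, [VALIDATOR, email], { encoding: 'utf8' });
  // `status` is null when the process could not be spawned or was killed
  // by a signal; treat that as transient/unknown (assumption).
  return res.status === null ? 2 : res.status;
}

// Maps a validator exit code to a send decision.
function gateDecision(exitCode, force = false) {
  if (exitCode === 0) return 'send';
  if (exitCode === 1) return force ? 'send' : 'block'; // hard block unless forced
  return 'warn'; // assumption: transient/unknown results do not hard-block
}

// Usage inside a hypothetical send path:
//   const decision = gateDecision(runValidator(to), options.force);
//   if (decision === 'block') throw new Error(`Address rejected: ${to}`);
```

Separating the exit-code mapping from the process spawn keeps the policy testable without invoking the real validator.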

## CLI

```bash
node ~/system/tools/email-address-validate.js <email>          # syntax + MX, cached
node ~/system/tools/email-address-validate.js <email> --probe  # + SMTP RCPT (see caveat above)
node ~/system/tools/email-address-validate.js <email> --force  # skip cache
```

Exit codes: 0 valid, 1 invalid, 2 transient/unknown.

## Related

- `~/system/docs/quran-research/` — the outbound sprint that exposed the gap.
- `~/system/tools/email-safety.js` — existing content/subject gates.