IMAP → Paperless Archive Pipe (archive.alai.no)
Overview
This pipe automates archival of email attachments (contracts, invoices, signed documents) from ALAI's IMAP inboxes into the centralized Paperless-ngx document management system at archive.alai.no.
Use Cases:
- Archive signed contracts received via email (e.g., SINTEF LOI, client MSAs)
- Store invoices, receipts, and financial documents
- Preserve legal correspondence with timestamped audit trail
- Upload arbitrary files that belong in long-term document archive
Architecture
The pipeline consists of two independent CLI tools that can be chained:
┌──────────────────┐
│ email-inbox.db │ (SQLite: all inboxes synced from one.com Dovecot IMAP)
└────────┬─────────┘
│
▼
┌────────────────────────────────────────┐
│ email-attachment-fetcher.js │ → /tmp/email-attachments/<msgid>/
│ (Extracts attachments from email DB) │
└────────┬───────────────────────────────┘
│
▼
┌────────────────────────────────────────┐
│ paperless-upload.js │ → HTTPS POST multipart/form-data
│ (Uploads file with metadata) │
└────────┬───────────────────────────────┘
│
▼ (3 headers: CF-Access-Client-Id, CF-Access-Client-Secret, Authorization)
│
┌────────────────────────────────────────┐
│ archive.alai.no/api/documents/ │ (Paperless-ngx behind CF Access)
│ post_document/ │
└─────────────────────────────────────────┘
Key Components:
- IMAP Source: one.com Dovecot server (imap.one.com:993) synced to ~/system/databases/email-inbox.db
- Fetcher: /Users/makinja/system/tools/email-attachment-fetcher.js
- Uploader: /Users/makinja/system/tools/paperless-upload.js
- Destination: Paperless-ngx on Azure VM (4.223.110.181) exposed via Cloudflare Access
Credentials
| Item Name | Bitwarden ID | Purpose | Fields |
|---|---|---|---|
| archive-alai-no CF Access | e4fd63de-5989-4316-9092-1dfa72f2d2ee | CF Access service token for archive.alai.no | CF_ACCESS_CLIENT_ID, CF_ACCESS_CLIENT_SECRET |
| Paperless API Token — anvil | 94227e4d-c55a-48fa-9421-05c649c5451e | Paperless API authentication | paperless_token |
Fetching Credentials:
BW_SESSION=$(cat /tmp/bw-session)
CF_CLIENT_ID=$(bw get item e4fd63de-5989-4316-9092-1dfa72f2d2ee --session "$BW_SESSION" | jq -r '.fields[] | select(.name=="CF_ACCESS_CLIENT_ID") | .value')
CF_CLIENT_SECRET=$(bw get item e4fd63de-5989-4316-9092-1dfa72f2d2ee --session "$BW_SESSION" | jq -r '.fields[] | select(.name=="CF_ACCESS_CLIENT_SECRET") | .value')
PAPERLESS_TOKEN=$(bw get item 94227e4d-c55a-48fa-9421-05c649c5451e --session "$BW_SESSION" | jq -r '.fields[] | select(.name=="paperless_token") | .value')
Note: Both scripts fetch credentials from Bitwarden automatically when the BW_SESSION environment variable is set or /tmp/bw-session exists.
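The jq filters above select a custom field by name from the JSON that `bw get item` prints. A minimal Node.js sketch of the same lookup, plus assembling the three request headers, could look like the following. The helper names `getField` and `buildAuthHeaders` are hypothetical and are not taken from the actual scripts.

```javascript
// Hypothetical helpers; the real scripts may structure this differently.

// Return the value of a custom field from a parsed `bw get item` JSON object.
function getField(item, name) {
  const field = (item.fields || []).find((f) => f.name === name);
  if (!field) {
    throw new Error(`Bitwarden item has no field named ${name}`);
  }
  return field.value;
}

// Build the three headers the pipe sends to archive.alai.no.
function buildAuthHeaders(cfItem, paperlessItem) {
  return {
    'CF-Access-Client-Id': getField(cfItem, 'CF_ACCESS_CLIENT_ID'),
    'CF-Access-Client-Secret': getField(cfItem, 'CF_ACCESS_CLIENT_SECRET'),
    Authorization: `Token ${getField(paperlessItem, 'paperless_token')}`,
  };
}
```

Failing loudly on a missing field is a deliberate choice here: an empty header value would otherwise surface later as a confusing 302 or 401.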
Usage Examples
Example 1: Archive a Single Email's Attachment
The most common workflow is to fetch an attachment from the email DB and upload it to Paperless:
# Step 1: Find the email ID (search by subject or sender)
node ~/system/tools/email-inbox.js list --account alem --limit 20
# Step 2: Extract attachments (creates /tmp/email-attachments/<msgid>/)
node ~/system/tools/email-attachment-fetcher.js 5480
# Step 3: Upload to Paperless with metadata
node ~/system/tools/paperless-upload.js \
--file "/tmp/email-attachments/<msgid>/SINTEF_LOI_signed.pdf" \
--correspondent "SINTEF" \
--document-type "Contract" \
--tags "legal,signed,sintef" \
--title "SINTEF Letter of Intent - Forskningsrådet Application"
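Steps 2 and 3 can also be driven from a small Node.js wrapper. The sketch below assumes the caller already knows the attachment directory that the fetcher created, and it only uses the uploader flags listed in this runbook. `buildUploadArgs` and `uploadAll` are hypothetical names.

```javascript
const { execFileSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

const TOOLS = path.join(os.homedir(), 'system', 'tools');

// Turn a metadata object into the argv expected by paperless-upload.js.
function buildUploadArgs(file, meta = {}) {
  const args = ['--file', file];
  if (meta.correspondent) args.push('--correspondent', meta.correspondent);
  if (meta.documentType) args.push('--document-type', meta.documentType);
  if (meta.tags && meta.tags.length) args.push('--tags', meta.tags.join(','));
  if (meta.title) args.push('--title', meta.title);
  return args;
}

// Upload every regular file found in an attachment directory.
// execFileSync throws if the uploader exits with a non-zero code.
function uploadAll(dir, meta) {
  for (const name of fs.readdirSync(dir)) {
    const file = path.join(dir, name);
    if (!fs.statSync(file).isFile()) continue;
    const script = path.join(TOOLS, 'paperless-upload.js');
    execFileSync('node', [script, ...buildUploadArgs(file, meta)], { stdio: 'inherit' });
  }
}
```

Note that `uploadAll` applies the same metadata, including the title, to every file in the directory, so for emails with several attachments it may be better to omit `title` and set it per document afterwards.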
Example 2: Archive Arbitrary File (Skip Email Fetch)
Upload any local file directly:
node ~/system/tools/paperless-upload.js \
--file "/Users/makinja/Downloads/Invoice_12345.pdf" \
--correspondent "SnowIT" \
--document-type "Invoice" \
--tags "billing,2026-05" \
--title "SnowIT Monthly Invoice - May 2026"
Example 3: SINTEF LOI First-Run (Historical Reference)
Exact commands used for the first production run (2026-05-08):
# Email ID 5480 from [email protected] inbox
node ~/system/tools/email-attachment-fetcher.js 5480
# Extracted: /tmp/email-attachments/<[email protected]>/SINTEF_LOI_signed.pdf
node ~/system/tools/paperless-upload.js \
--file "/tmp/email-attachments/[email protected]/SINTEF_LOI_signed.pdf" \
--correspondent "SINTEF" \
--document-type "Contract" \
--tags "legal,signed,sintef,forskningsradet" \
--title "SINTEF Letter of Intent - Forskningsrådet Application"
# Result: Paperless doc #127
# https://archive.alai.no/documents/127/
Example 4: Using Message-ID Instead of Email DB ID
node ~/system/tools/email-attachment-fetcher.js \
--message-id "<[email protected]>" \
--account alem
Script Details
email-attachment-fetcher.js
Location: /Users/makinja/system/tools/email-attachment-fetcher.js
SHA-256: a3a03d83516c2cc44bb8b0a3753d5c41f0feb9aff54f93fef5a1bb9e3699d739
Syntax:
node email-attachment-fetcher.js <email_db_id>
node email-attachment-fetcher.js --message-id <mid> --account <account>
Output: /tmp/email-attachments/<msgid>/<filename1>, <filename2>, ...
paperless-upload.js
Location: /Users/makinja/system/tools/paperless-upload.js
SHA-256: d185ed2f3f7ec816cb68f2a421e5762219449ebda420653d1a2f16558d2e06dd
Syntax:
node paperless-upload.js --file <path> [OPTIONS]
Options:
--correspondent NAME Auto-creates if missing
--document-type NAME Auto-creates if missing
--tags csv,list Auto-creates if missing
--title "Document Title"
--no-poll Skip task completion polling
Exit Codes:
- 0 = Success
- 1 = Server error (network/API failure)
- 2 = Authentication failure
- 3 = Input validation error
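A calling script can branch on these exit codes. The following is a minimal sketch of such a wrapper; the names `describeExit`, `isRetryable`, and `runUpload` are hypothetical, and treating only code 1 as retryable is an assumption rather than documented behavior.

```javascript
const { spawnSync } = require('child_process');

// Exit codes as documented in this runbook.
const EXIT_MEANINGS = {
  0: 'success',
  1: 'server error (network/API failure)',
  2: 'authentication failure',
  3: 'input validation error',
};

function describeExit(code) {
  return EXIT_MEANINGS[code] || `unknown exit code ${code}`;
}

// Assumption: only server errors are worth retrying; auth and input errors need a fix first.
function isRetryable(code) {
  return code === 1;
}

// Run the uploader and return its exit code together with the documented meaning.
function runUpload(scriptPath, args) {
  const result = spawnSync('node', [scriptPath, ...args], { stdio: 'inherit' });
  return { code: result.status, meaning: describeExit(result.status) };
}
```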
Behavior:
- Polls Paperless task API for up to 30 seconds to confirm document consumption
- Auto-resolves correspondent/document-type/tag IDs via Paperless API (creates if missing)
- Sends 3 auth headers: CF-Access-Client-Id, CF-Access-Client-Secret, Authorization: Token ...
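For orientation, the upload request described above can be sketched as follows. This is not the code of paperless-upload.js; it is an illustration that assumes Node.js 18+ (global `fetch`, `FormData`, `Blob`) and the Paperless-ngx `post_document` form fields `document`, `title`, `correspondent`, `document_type`, and `tags`. The function names and the `ids` option shape are hypothetical.

```javascript
// Requires Node.js 18+ for the global FormData, Blob, and fetch.
const fs = require('fs');
const path = require('path');

// Assemble the multipart body for POST /api/documents/post_document/.
// `ids` holds numeric IDs already resolved from names (hypothetical shape).
function buildUploadForm(fileBuffer, filename, { title, ids = {} } = {}) {
  const form = new FormData();
  form.append('document', new Blob([fileBuffer]), filename);
  if (title) form.append('title', title);
  if (ids.correspondent != null) form.append('correspondent', String(ids.correspondent));
  if (ids.documentType != null) form.append('document_type', String(ids.documentType));
  for (const tagId of ids.tags || []) form.append('tags', String(tagId)); // one field per tag
  return form;
}

// POST the form with the three auth headers; resolves to the consumption task UUID.
async function uploadDocument(filePath, options, headers, baseUrl = 'https://archive.alai.no') {
  const form = buildUploadForm(fs.readFileSync(filePath), path.basename(filePath), options);
  const res = await fetch(`${baseUrl}/api/documents/post_document/`, {
    method: 'POST',
    headers, // do not set Content-Type; fetch adds the multipart boundary
    body: form,
    redirect: 'manual', // surface a CF Access 302 instead of following it
  });
  if (!res.ok) throw new Error(`Upload failed with HTTP ${res.status}`);
  return res.json();
}
```

Setting `redirect: 'manual'` keeps a Cloudflare Access rejection visible as a 302 rather than silently following the redirect to the login page.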
CF Access Service-Token Rotation
Current Token:
- Created: 2026-05-08
- Expires: 2027-05-08 (1 year TTL)
- Bypass Policy ID: 5df57dcf-eeec-4634-8668-68d5b8751334
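Because the token has a fixed expiry date, a scheduled check can flag it before the pipe starts returning 302s. The sketch below is one possible way to do that; the function names and the 30-day warning window are assumptions, not part of the existing tooling.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days from `now` until midnight UTC on the expiry date (negative once expired).
function daysUntilExpiry(expiryDate, now = new Date()) {
  const expiry = new Date(`${expiryDate}T00:00:00Z`);
  return Math.floor((expiry.getTime() - now.getTime()) / MS_PER_DAY);
}

// True when the token should be rotated; the 30-day warning window is an assumption.
function needsRotation(expiryDate, now = new Date(), warnDays = 30) {
  return daysUntilExpiry(expiryDate, now) <= warnDays;
}
```

For example, `needsRotation('2027-05-08')` could be run from a daily job and trigger a reminder once it returns true.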
Rotation Procedure:
- Log in to Cloudflare Dashboard → Zero Trust → Access → Service Auth
- Find policy for archive.alai.no
- Click "Create Service Token" → name it archive-pipe-YYYYMMv2
- Copy Client ID and Secret (shown only once)
- Update Bitwarden item e4fd63de-5989-4316-9092-1dfa72f2d2ee:
  - Replace CF_ACCESS_CLIENT_ID
  - Replace CF_ACCESS_CLIENT_SECRET
- Test with curl:
curl -I \
-H "CF-Access-Client-Id: <new_id>" \
-H "CF-Access-Client-Secret: <new_secret>" \
"https://archive.alai.no/api/"
# Expected: HTTP 200 or 401 (not 302)
- If 200 → revoke old token in Cloudflare dashboard
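The status code of the test request indicates which layer answered. As a rough sketch of that reasoning (hypothetical function names, Node.js 18+ assumed for global `fetch`):

```javascript
// Map the HTTP status from the test request to the layer that produced it.
function diagnoseStatus(status) {
  if (status === 302) return 'cf-access-rejected'; // redirected to the Cloudflare login page
  if (status === 401) return 'paperless-auth-rejected'; // passed CF Access, Paperless wants a token
  if (status >= 200 && status < 300) return 'ok';
  return 'unexpected';
}

// Send the same headers as the curl test without following redirects.
async function checkAccess(headers, url = 'https://archive.alai.no/api/') {
  const res = await fetch(url, { method: 'HEAD', headers, redirect: 'manual' });
  return diagnoseStatus(res.status);
}
```

Since the curl test sends no Authorization header, a 401 still shows that the new service token was accepted by Cloudflare Access.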
Troubleshooting
HTTP 302 Redirect from archive.alai.no
Symptom: curl returns 302 Found to Cloudflare login page
Cause: Missing or expired CF Access service token
Fix:
- Verify token exists in Bitwarden item e4fd63de-5989-4316-9092-1dfa72f2d2ee
- Check token expiry in Cloudflare dashboard (Zero Trust → Service Auth)
- If expired → rotate per procedure above
- Verify script is passing headers (check paperless-upload.js code around lines 40-60)
HTTP 401 Unauthorized from Paperless API
Symptom: paperless-upload.js exits with code 2
Cause: Invalid or missing Paperless API token
Fix:
- Verify token in Bitwarden item 94227e4d-c55a-48fa-9421-05c649c5451e
- Test token directly:
PAPERLESS_TOKEN="..."
curl -s -H "Authorization: Token $PAPERLESS_TOKEN" \
-H "CF-Access-Client-Id: ..." \
-H "CF-Access-Client-Secret: ..." \
"https://archive.alai.no/api/correspondents/" | jq -r '.count'
- If null or error → regenerate token in Paperless UI (Settings → API Tokens) and update Bitwarden
Tag/Correspondent/Document-Type Creation Failures
Symptom: Script errors with "Failed to create correspondent X"
Cause: Paperless API permissions or schema validation failure
Fix:
- Check Paperless UI → ensure API user has documents.add_* permissions
- Verify tag/correspondent names don't contain invalid characters (use alphanumeric + spaces only)
- Check Paperless logs on Azure VM:
ssh -i ~/.ssh/azure_alai [email protected] sudo docker logs paperless-webserver --tail 100
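To reproduce the failing step outside the uploader, the find-or-create logic can be approximated as below. This is a sketch, not the script's actual code: it assumes the Paperless-ngx `name__iexact` list filter and a JSON POST to `/api/correspondents/`, and `findOrCreateCorrespondent` and the injectable `fetchImpl` parameter are hypothetical.

```javascript
// Look up a correspondent by exact (case-insensitive) name, creating it if absent.
// `fetchImpl` is injectable so the logic can be exercised without network access.
async function findOrCreateCorrespondent(
  name,
  headers,
  { baseUrl = 'https://archive.alai.no', fetchImpl = fetch } = {}
) {
  const query = new URLSearchParams({ name__iexact: name });
  const found = await fetchImpl(`${baseUrl}/api/correspondents/?${query}`, { headers });
  if (!found.ok) throw new Error(`Lookup failed with HTTP ${found.status}`);
  const page = await found.json();
  if (page.count > 0) return page.results[0].id;

  const created = await fetchImpl(`${baseUrl}/api/correspondents/`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ name }),
  });
  if (!created.ok) {
    // The response body usually carries the validation or permission detail.
    const detail = await created.text();
    throw new Error(`Failed to create correspondent ${name}: HTTP ${created.status} ${detail}`);
  }
  return (await created.json()).id;
}
```

A 403 in the error message points at missing permissions, while a 400 with a field-level message points at a name the API refuses.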
Email Attachment Not Found
Symptom: email-attachment-fetcher.js reports "No attachments found"
Causes:
- Email has no attachments (e.g., inline HTML only)
- Email not yet synced to email-inbox.db (daemon runs every 5 minutes)
- Wrong email ID or message-ID
Fix:
- Verify email exists:
node ~/system/tools/email-inbox.js show <id>
- Force IMAP sync:
node ~/system/tools/email-inbox.js sync --account alem
- Check attachment MIME parts in raw email (look for Content-Disposition: attachment)
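The last check can be automated against a raw message source. The helper below is a rough heuristic with a hypothetical name: it only counts headers that explicitly declare `attachment`, so files embedded as `inline` parts are not counted.

```javascript
// Count MIME parts declared as attachments in a raw RFC 5322 message.
// The header name is matched case-insensitively; the value may be folded onto the next line.
function countAttachmentParts(rawEmail) {
  const matches = rawEmail.match(/^content-disposition:\s*attachment\b/gim);
  return matches ? matches.length : 0;
}
```

A result of 0 for an email that visibly carries a file suggests the file is an inline part, which would explain a "No attachments found" message.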
File Upload Stalls (No Response After 30s)
Cause: Paperless task processing slow or stuck
Fix:
- Use the --no-poll flag to skip task polling (the script returns as soon as the upload request completes)
- Check document manually in Paperless UI after 1-2 minutes
- Restart Paperless workers if stuck:
ssh -i ~/.ssh/azure_alai [email protected] sudo docker restart paperless-worker
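When a task needs longer than 30 seconds, it can also be checked from a script instead of the UI. The sketch below assumes you have the task UUID (for example from the upload response) and that the Paperless-ngx tasks endpoint returns an array of objects with `status`, `result`, and `related_document` fields. `interpretTask` and `waitForTask` are hypothetical names, and Node.js 18+ is assumed for global `fetch`.

```javascript
// Interpret the array returned by GET /api/tasks/?task_id=<uuid>.
function interpretTask(tasks) {
  const task = Array.isArray(tasks) ? tasks[0] : undefined;
  if (!task) return { state: 'unknown' };
  if (task.status === 'SUCCESS') return { state: 'done', documentId: task.related_document };
  if (task.status === 'FAILURE') return { state: 'failed', reason: task.result };
  return { state: 'pending' }; // any other status, e.g. PENDING or STARTED
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the task finishes or the timeout elapses.
async function waitForTask(
  taskId,
  headers,
  { baseUrl = 'https://archive.alai.no', timeoutMs = 120000, intervalMs = 2000, fetchImpl = fetch } = {}
) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const url = `${baseUrl}/api/tasks/?task_id=${encodeURIComponent(taskId)}`;
    const res = await fetchImpl(url, { headers });
    if (!res.ok) throw new Error(`Task lookup failed with HTTP ${res.status}`);
    const outcome = interpretTask(await res.json());
    if (outcome.state === 'done' || outcome.state === 'failed') return outcome;
    await sleep(intervalMs);
  }
  return { state: 'timeout' };
}
```

A task that stays in `pending` for several minutes is a reasonable signal that the worker restart in the last step is warranted.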
Provenance
This runbook documents the IMAP→Paperless archive pipeline built and validated under:
- MC Task: #100004 (Subtask 4 of 5)
- Builder Teams:
  - FlowForge (Subtask 1): CF Access service token creation
  - CodeCraft (Subtask 2): email-attachment-fetcher.js CLI
  - CodeCraft (Subtask 3): paperless-upload.js CLI
- First Production Use: 2026-05-08 20:02 UTC (SINTEF LOI archive → Paperless doc #127)
- Documentation: Skillforge (Subtask 4)
- Operator: John (orchestrator)
Related Documentation
- archive.alai.no — Paperless-ngx Setup & Operations (Infrastructure runbook)
- Email Inbox — Setup & Operations (IMAP sync daemon)
- Source: ~/system/tools/email-attachment-fetcher.js
- Source: ~/system/tools/paperless-upload.js
Last Updated: 2026-05-08 | MC #100004 | Skillforge