Atomic-write pattern for shared state files (POSIX os.replace)
1. Why This Matters
In a multi-session environment where hooks, tools, and daemons write to shared state files (JSON configs, task markers, session identifiers), a naive open() + write() + close() pattern creates a torn-write hazard:
- Concurrent sessions racing to write the same file can corrupt each other's writes (last-writer-wins with no atomicity guarantee)
- Crash mid-write (SIGKILL, disk-full, context compaction, kernel panic) leaves the file in a partial or zero-byte state
- Silent corruption of session isolation guarantees — hooks reading an empty or malformed file may silently fall back to legacy global state or fail-open, defeating ZAKON enforcement
Impact: ZAKON #27 (active-thread enforcement) and ZAKON #28 (max-depth gate) rely on per-session state files that must NEVER contain partial writes. A torn write to /tmp/mc-active-task-$PID causes the hook to fall back to the global /tmp/mc-active-task, silently defeating session isolation.
2. The Pattern — POSIX Atomic Rename
2.1 Python Pattern
The correct pattern uses tempfile + fsync + os.replace() to guarantee atomicity:
import os
import tempfile


# get_session_task_file() is defined elsewhere and is not shown here.
def write_active_task(task_id, claude_pid=None):
    """Write active task for this session (atomic POSIX rename pattern).

    Writes to a tempfile in the same directory as the target, then uses
    os.replace() for an atomic swap. A crash or SIGKILL during the write
    leaves the target either absent (first write) or containing the previous
    complete value, never a partial write.
    """
    task_file = get_session_task_file(claude_pid)
    dir_ = os.path.dirname(task_file) or "."
    fd, tmp = tempfile.mkstemp(prefix=".active-task-", dir=dir_)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(task_id))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, task_file)
    except Exception:
        # Remove the temp file so failed writes do not accumulate.
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
Why this works:
- tempfile.mkstemp() creates a unique temp file in the SAME directory (same filesystem) as the target
- Write content to the temp file, flush buffers, call fsync() to ensure data is on disk
- os.replace(tmp, target) performs an atomic rename (a single rename(2) syscall on POSIX)
- Readers see either the old complete file OR the new complete file, never a partial write
- If the process crashes before os.replace(), the temp file is abandoned but the target is untouched (or absent if first write)
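One side effect of the first step is easy to miss: tempfile.mkstemp() always creates the temp file with mode 0600, so after os.replace() the target carries those owner-only permissions. The sketch below shows one way to keep the target's existing permission bits. The helper name atomic_write_text, the ".atomic-" prefix, and the 0644 default are illustrative assumptions, not names from the hooks codebase.

```python
import os
import stat
import tempfile


def atomic_write_text(path, text, default_mode=0o644):
    """Atomically replace `path` with `text`, keeping the target's permission bits.

    Hypothetical helper: mkstemp() creates the temp file as 0o600, so without
    the chmod step the target would become owner-only after the first replace.
    """
    dir_ = os.path.dirname(path) or "."
    try:
        # Reuse the existing target's permission bits when there is one.
        mode = stat.S_IMODE(os.stat(path).st_mode)
    except FileNotFoundError:
        mode = default_mode  # assumed default for a first write

    fd, tmp = tempfile.mkstemp(prefix=".atomic-", dir=dir_)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.chmod(tmp, mode)    # set permissions before the file becomes visible
        os.replace(tmp, path)  # atomic swap on the same filesystem
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
```

Whether this matters depends on who reads the state file. If every reader runs as the same user, the 0600 default is harmless.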
2.2 Bash Pattern
For bash hooks writing to state files, use the mktemp + mv pattern:
# Atomic write in bash using mktemp + mv
TARGET="/tmp/some-state-file.json"
CONTENT='{"count":0,"ts":"2026-05-03T10:00:00Z"}'
# Create temp file in same directory as target (same filesystem requirement)
TMP=$(mktemp "${TARGET}.XXXXXX")
echo "$CONTENT" > "$TMP"
mv -f "$TMP" "$TARGET" # POSIX atomic on same filesystem
Why mv is atomic: On POSIX, mv within the same filesystem calls rename(2), which is atomic. Same guarantee as Python's os.replace().
Constraints:
- mktemp template must use same directory as $TARGET (guarantees same filesystem, required for atomic mv)
- Use printf or echo to write to $TMP, NOT to $TARGET
- mv -f atomically replaces $TARGET (POSIX guarantees this on same filesystem)
- No portable fsync in bash: durability across power loss requires Python/Node.js with explicit os.fsync()
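For the durability point in the last constraint, a minimal Python sketch follows. It goes one step beyond fsync on the file by also calling fsync on the parent directory, which is what makes the rename itself durable on many Linux filesystems. The name durable_atomic_write and the ".durable-" prefix are hypothetical, and opening a directory with os.open() is assumed to work (true on Linux and macOS).

```python
import os
import tempfile


def durable_atomic_write(path, data):
    """Atomic replace plus a parent-directory fsync (hypothetical helper).

    fsync() on the temp file makes its contents durable; fsync() on the
    directory makes the renamed directory entry durable as well.
    """
    dir_ = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(prefix=".durable-", dir=dir_)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise

    # Flush the directory entry so the rename survives a power loss.
    dir_fd = os.open(dir_, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A bash hook that needs power-loss durability could hand its payload to a helper of this shape instead of using mktemp + mv. For crash safety alone (no power loss), the bash pattern above is sufficient.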
3. What It Replaces — The Anti-Pattern
3.1 Python Anti-Pattern
DO NOT USE:
# WRONG: non-atomic, torn-write hazard
def write_active_task_WRONG(task_id, task_file):
    with open(task_file, "w") as f:
        f.write(str(task_id))
Why this is broken:
- The open("w") call truncates the file immediately (size=0 bytes)
- The write() may be buffered and not hit disk until close() or explicit flush()
- A SIGKILL or crash between truncate and flush leaves a zero-byte file
- A concurrent reader during the write window sees partial content or empty file
- No reader/writer can distinguish "empty because not written yet" from "empty because crashed mid-write"
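The truncate window described above can be observed directly. The sketch below opens a file for writing and reads it through a second handle before any data is written. The function name observe_truncate_window is illustrative only, and the task IDs are the placeholder values used elsewhere in this document.

```python
import os
import tempfile


def observe_truncate_window(path):
    """Return what a reader sees after open(path, "w") but before any write."""
    with open(path, "w") as f:           # truncates the file to 0 bytes right here
        with open(path, "r") as reader:  # a concurrent reader arriving in the window
            seen_before_write = reader.read()
        f.write("NEW-TASK-22222")
    return seen_before_write


with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "task.txt")
    with open(target, "w") as f:
        f.write("OLD-TASK-11111")
    seen = observe_truncate_window(target)
    print(repr(seen))  # prints '' : the old value is gone, the new one not yet written
```

The reader gets an empty string rather than the old or the new task ID, which is exactly the state a hook cannot tell apart from "never written".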
3.2 Bash Anti-Pattern
DO NOT USE:
# WRONG — torn-write hazard in bash
echo "$TASK_ID" > /tmp/mc-active-task-$$
The > operator truncates the file immediately, then writes. A crash between truncate and write completion leaves a zero-byte or partial file — identical hazard to the Python anti-pattern.
4. Same-Filesystem Requirement
The dir= kwarg in tempfile.mkstemp(prefix=".active-task-", dir=dir_) is critical:
- os.replace() is atomic ONLY when the source and target are on the same filesystem
- A cross-device rename (e.g., /tmp → /home on different partitions) makes os.replace() fail with OSError (EXDEV); tools that fall back instead, such as mv or shutil.move(), degrade to copy-then-delete, which is NOT atomic
- By creating the temp file in the same directory as the target (os.path.dirname(task_file)), we guarantee same-device
- If dirname is empty (target in cwd), fall back to "."
Verification: df -h /tmp vs df -h ~/.claude/hooks — if different mount points, you MUST use dir= kwarg with target's parent directory.
For bash: Use mktemp "${TARGET}.XXXXXX" template — the suffix pattern ensures temp file is created in the same directory as $TARGET.
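The df comparison can also be done programmatically. The sketch below compares st_dev values and turns the EXDEV failure into an explicit error. All three function names are hypothetical. Matching st_dev values are only a heuristic: on Linux, rename(2) also refuses to cross two mount points of the same filesystem, so creating the temp file in the target's own directory remains the reliable approach.

```python
import errno
import os


def same_filesystem(path_a, path_b):
    """True when both existing paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def temp_dir_for(target):
    """Directory in which the temp file for `target` should be created."""
    return os.path.dirname(os.path.abspath(target))


def replace_same_fs(tmp, target):
    """os.replace() that reports a cross-filesystem rename explicitly."""
    try:
        os.replace(tmp, target)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            # No silent copy-then-delete fallback: that would not be atomic.
            raise RuntimeError(
                f"{tmp} and {target} are on different filesystems; "
                "create the temp file in the target's directory"
            ) from exc
        raise
```

For example, same_filesystem("/tmp", os.path.expanduser("~/.claude/hooks")) answers the same question as the two df calls, provided both directories exist.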
5. Crash Recovery Semantics
| Scenario | Before os.replace() | After os.replace() |
|---|---|---|
| First write, no prior file | Target absent, temp exists | Target exists with new content |
| Overwrite existing file | Target has old content, temp exists | Target has new content |
| Crash during write() | Target unchanged (or absent), temp partial/incomplete | N/A (replace() never called) |
| Crash during fsync() | Target unchanged, temp may have partial data on disk | N/A |
| Crash after os.replace() | N/A | Target has new complete content (atomic swap already done) |
Key guarantee: The target file NEVER contains partial writes. A reader always sees either:
- File absent (no write has completed yet), OR
- File with the last successfully-completed write's full content
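The exception handler (except: os.unlink(tmp)) cleans up the temp file when the write fails with a Python exception, preventing temp-file accumulation. A hard kill (SIGKILL, power loss) bypasses the handler, so an orphaned temp file can remain next to the target; the target itself is still intact.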
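Temp files orphaned by a hard kill can be collected by a periodic sweep. The sketch below is one possible approach, not something the hooks are known to do: it removes files that carry the temp prefix and are older than a threshold, so that in-flight writes (which have a recent mtime) are left alone. The function name and the one-hour default are assumptions.

```python
import os
import time


def sweep_stale_temps(directory, prefix=".active-task-", max_age_seconds=3600):
    """Delete leftover temp files older than max_age_seconds; return their names."""
    removed = []
    cutoff = time.time() - max_age_seconds
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.name.startswith(prefix):
                continue  # never touch the real state files
            try:
                if not entry.is_file(follow_symlinks=False):
                    continue
                if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                    os.unlink(entry.path)
                    removed.append(entry.name)
            except FileNotFoundError:
                pass  # the writer finished or cleaned up in the meantime
    return sorted(removed)
```

The prefix must match whatever the writer passes to mkstemp(); for the bash pattern the temp name is the target name plus a random suffix, so a prefix-based sweep would need a different match rule there.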
6. Testing Pattern
Unit-test crash recovery by forcing a step of the write path to raise an exception:
import os
import tempfile
import unittest
from unittest.mock import patch


class TestAtomicWrite(unittest.TestCase):
    def test_crash_during_overwrite_preserves_old_content(self):
        """If write crashes after target exists, old content is preserved."""
        with tempfile.TemporaryDirectory() as tmpdir:
            target = os.path.join(tmpdir, "test-task.txt")
            # Write initial content
            with open(target, "w") as f:
                f.write("OLD-TASK-11111")
            # Simulate a crash during the second write. Patch os.fsync rather
            # than builtins.open: the writer goes through os.fdopen(), which a
            # builtins.open patch does not intercept.
            with patch("os.fsync", side_effect=OSError("Simulated crash")):
                with self.assertRaises(OSError):
                    # write_active_task_atomic(task_id, target) is assumed to be
                    # the explicit-path variant of write_active_task() under test.
                    write_active_task_atomic("NEW-TASK-22222", target)
            # Old content must survive
            with open(target, "r") as f:
                content = f.read()
            self.assertEqual(content, "OLD-TASK-11111")
            # No temp files leaked
            leaked_temps = [f for f in os.listdir(tmpdir) if f.startswith(".active-task-")]
            self.assertEqual(len(leaked_temps), 0)
What this validates:
- Exception during write → old content survives intact
- No temp files leaked to disk (cleanup path works)
- File state is never partial or corrupt
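The exception-based test covers the crash path. The concurrent-reader guarantee can be exercised separately, as in the rough sketch below: a writer thread keeps replacing the file while the main thread reads it and counts any read that matches neither complete value. atomic_write is a minimal stand-in for the pattern in section 2.1, and both function names are hypothetical.

```python
import os
import tempfile
import threading


def atomic_write(path, text):
    """Minimal stand-in for the temp file + os.replace() pattern."""
    fd, tmp = tempfile.mkstemp(prefix=".active-task-", dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise


def count_torn_reads(path, values, writes=100):
    """Rewrite `path` in a thread; count reads that match no complete value."""
    atomic_write(path, values[0])  # the target must exist before readers start
    done = threading.Event()

    def writer():
        try:
            for i in range(writes):
                atomic_write(path, values[i % len(values)])
        finally:
            done.set()  # always release the reader loop, even on failure

    thread = threading.Thread(target=writer)
    thread.start()
    torn = 0
    while not done.is_set():
        with open(path) as f:
            if f.read() not in values:
                torn += 1
    thread.join()
    return torn
```

With large payloads such as "A" * 4096 and "B" * 4096, the expected count is 0. Running the same loop against the open("w") anti-pattern is a way to see empty or partial reads appear, although that outcome depends on timing.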
7. When to Apply
Use this pattern for any hook/lib writing JSON or state files where torn writes = corruption:
- /tmp/mc-active-task-$SESSION_ID: ZAKON #28 depth gate relies on this
- /tmp/active-thread-$SESSION_ID.txt: ZAKON #27 active-thread enforcement shadow file
- ~/.claude/session-state.md shadow files (if per-session scoping is added)
- Counter files (/tmp/john-mc-turn-counter.json, /tmp/ceo-approved-token-uses-*.count)
- Mehanik clearance markers (/tmp/mehanik-cleared-<MC> with session_id field)
- Any file where a concurrent reader must NEVER see partial data
Do NOT use for:
- Log files (append-only, partial writes acceptable)
- Human-edited markdown files (git-tracked, editor handles temp files)
- SQLite databases (has internal transaction layer)
8. Sites Covered
This pattern has been applied to the following high-risk state file writes:
8.1 Python Sites (Phase 2A — MC #99076)
- ~/.claude/hooks/archive/lib-legacy/session_id.py:138-161: write_active_task() function (S8 surface: /tmp/mc-active-task-$SESSION_ID)
8.2 Bash Hook Sites (Phase 2B-2 — MC #99080)
8 atomic-write patches applied across 4 hooks covering surfaces S3, S8, S9, S10:
| File | Line | Pattern | Surface | Description |
|---|---|---|---|---|
| mc-turn-reset.sh | 12 | Python tempfile.mkstemp + os.replace | S8 | Reset MC turn counter |
| mc-turn-reset.sh | 20 | Bash mktemp + mv | S3 | Reset CEO_APPROVED token counter |
| mc-turn-reset.sh | 23 | Bash mktemp + mv | S9 | Reset dispatch turn counter |
| ceo-intent-classifier.sh | 38 | Python tempfile.mkstemp + os.replace | S10 | Write CEO intent classification |
| one-ceo-turn-dispatch-cap.sh | 33 | Python tempfile.mkstemp + os.replace | S9 | Increment dispatch counter |
| one-ceo-turn-dispatch-cap.sh | 50 | Python tempfile.mkstemp + os.replace | S9 | Rollback dispatch counter on failure |
| one-ceo-turn-mc-cap.sh | 40 | Python tempfile.mkstemp + os.replace | S8 | Increment MC add counter |
| one-ceo-turn-mc-cap.sh | 59 | Python tempfile.mkstemp + os.replace | S8 | Rollback MC counter on failure |
Validation: All 8 sites passed Proveo crash-safety testing (AC5: runtime exception AFTER write+fsync but BEFORE os.replace/mv — old content preserved, no temp file leak). See /tmp/proveo-99080-2026-05-03.json.
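The code at the counter sites is not reproduced in this document. As a rough sketch of what an "increment counter" write of this kind might look like, the hypothetical helper below reads a JSON counter, adds one, and writes it back via temp file + os.replace(). The "count" key mirrors the sample JSON in section 2.2; the function name and prefix are assumptions.

```python
import json
import os
import tempfile


def increment_counter(path, key="count"):
    """Read a JSON counter, add 1, and write it back atomically.

    The replace protects readers from torn writes. It does NOT serialize the
    read-modify-write cycle: two processes incrementing at the same moment
    can still lose one update unless a lock is added around the whole call.
    """
    try:
        with open(path) as f:
            state = json.load(f)
    except FileNotFoundError:
        state = {}  # first increment: no counter file yet
    state[key] = int(state.get(key, 0)) + 1

    dir_ = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(prefix=".counter-", dir=dir_)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
    return state[key]
```

The distinction in the docstring is the design point: atomic rename guarantees that every reader sees a complete file, while correctness of the count under concurrent writers is a separate concern.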
8.3 Shadow-File Pattern for Human-Editable Shared State (Phase 2D — MC #99084)
For human-readable source files that must remain unmodified by automation (e.g., ~/.claude/session-state.md) but where enforcement hooks need per-session isolation, Phase 2D introduced the shadow-file pattern:
When to Use Shadow Files
- The source file is human-editable markdown or config that the CEO directly modifies
- Enforcement hooks need to read session-specific values without blocking concurrent sessions
- Direct atomic write to the human-readable source would defeat its purpose (CEO must see/edit the canonical value)
- Session isolation requires structural sharding (separate files per session), not locking
The Shadow-File Pattern
Write a per-session machine-readable shadow file at /tmp/<key>-${SESSION_ID}.txt (atomically via mktemp+mv) at the same point the human-readable source is updated. Enforcement hooks read shadow-first with fallback to the human-readable source.
# Shadow write (in user-message-logger.sh at UserPromptSubmit)
# SESSION_ID resolution: stdin JSON → env CLAUDE_SESSION_ID → pid-$$ → REJECT (never "default")
_SHADOW_SESSION_ID="$SESSION_ID"
if [[ -z "$_SHADOW_SESSION_ID" ]]; then
    _SHADOW_SESSION_ID="${CLAUDE_SESSION_ID:-}"
fi
if [[ -z "$_SHADOW_SESSION_ID" ]]; then
    _SHADOW_SESSION_ID="pid-$$"
fi
_SHADOW_TARGET="/tmp/active-thread-${_SHADOW_SESSION_ID}.txt"
_SESSION_STATE_FILE="$HOME/.claude/session-state.md"
# Extract ACTIVE_THREAD IDs from session-state.md
# (the embedded Python must start at column 0 inside the quoted string)
_ACTIVE_THREAD_VALUE=$(python3 -c "
import re, sys
with open('$_SESSION_STATE_FILE', 'r') as f:
    content = f.read()
match = re.search(r'## ACTIVE_THREAD:.*?(?=\n---|\n## [A-Z]|\Z)', content, re.DOTALL)
if not match:
    sys.exit(1)
block = match.group(0)
ids = re.findall(r'#(\d{4,6})', block)
print('\n'.join(sorted(set(ids))))
" 2>/dev/null)
if [[ -n "$_ACTIVE_THREAD_VALUE" ]]; then
    # Atomic write: mktemp + mv
    _SHADOW_TMP=$(mktemp "${_SHADOW_TARGET}.XXXXXX")
    printf '%s\n' "$_ACTIVE_THREAD_VALUE" > "$_SHADOW_TMP"
    mv -f "$_SHADOW_TMP" "$_SHADOW_TARGET"
fi
# Shadow-first read (in active-thread-lock.sh)
_SHADOW_PATH="/tmp/active-thread-${SESSION_ID}.txt"
APPROVED_IDS=""
if [[ -f "$_SHADOW_PATH" ]]; then
    # Shadow file present: read per-session ACTIVE_THREAD (atomic, no stale-read risk)
    APPROVED_IDS=$(cat "$_SHADOW_PATH" 2>/dev/null || echo "")
else
    # Fallback: read session-state.md (global, backward-compatible)
    if [[ ! -f "$SESSION_STATE" ]]; then
        echo "[active-thread-lock] session-state.md not found and no shadow file — fail-open." >&2
        exit 0
    fi
    APPROVED_IDS=$(python3 -c "
import re, sys
with open('$SESSION_STATE', 'r') as f:
    content = f.read()
match = re.search(r'## ACTIVE_THREAD:.*?(?=\n---|\n## [A-Z]|\Z)', content, re.DOTALL)
if match:
    block = match.group(0)
    ids = re.findall(r'#(\d{4,6})', block)
    print('\n'.join(sorted(set(ids))))
" 2>/dev/null)
fi
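The extraction step embedded in both hooks can be lifted into a standalone function so that it is easy to test in isolation. The sketch below reuses the regular expressions from the hooks unchanged; the function name and the sample markdown are made up for illustration.

```python
import re

# Same block-matching expression as in the hooks: from the ACTIVE_THREAD heading
# up to the next "---" rule, the next "## X" heading, or the end of the file.
ACTIVE_THREAD_BLOCK = re.compile(
    r"## ACTIVE_THREAD:.*?(?=\n---|\n## [A-Z]|\Z)", re.DOTALL
)


def extract_active_thread_ids(markdown):
    """Return the sorted, de-duplicated #NNNN IDs inside the ACTIVE_THREAD block."""
    match = ACTIVE_THREAD_BLOCK.search(markdown)
    if not match:
        return []
    # IDs are kept as strings, so the sort is lexicographic, as in the hooks.
    return sorted(set(re.findall(r"#(\d{4,6})", match.group(0))))


sample = """# Session state

## ACTIVE_THREAD: isolation fix
- #99076 atomic write
- #99080 bash hooks, follows #99076

## BACKLOG
- #12345 unrelated
"""
print(extract_active_thread_ids(sample))  # ['99076', '99080']
```

The ID under BACKLOG is excluded because the lookahead stops the block at the next heading, and the repeated #99076 collapses through set().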
Properties
- Structural isolation: Sessions read from sharded storage (/tmp/active-thread-${SESSION_ID}.txt), no lock contention
- CEO-facing source unchanged: ~/.claude/session-state.md remains canonical human-editable markdown
- SESSION_ID resolution chain: stdin JSON → env CLAUDE_SESSION_ID → pid-$$ → REJECT (NEVER literal "default")
- Fail-open fallback: If shadow absent, enforcement reads session-state.md (backward-compatible with pre-Phase-2D behavior)
- Atomic shadow write: mktemp+mv ensures concurrent sessions cannot corrupt each other's shadow files
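The SESSION_ID resolution chain can be sketched as a small pure function. Several details are assumptions rather than facts from the hooks: the stdin JSON field is taken to be named session_id, a literal "default" from any source is skipped rather than treated as fatal, and the function name is hypothetical.

```python
import json
import os


def resolve_session_id(stdin_text="", env=None, pid=None):
    """stdin JSON -> env CLAUDE_SESSION_ID -> pid-<pid>; never returns "default"."""
    env = os.environ if env is None else env
    pid = os.getpid() if pid is None else pid

    candidates = []
    try:
        payload = json.loads(stdin_text) if stdin_text.strip() else {}
        if isinstance(payload, dict):
            candidates.append(payload.get("session_id"))  # assumed field name
    except json.JSONDecodeError:
        pass  # malformed stdin: fall through to the next source
    candidates.append(env.get("CLAUDE_SESSION_ID"))
    candidates.append(f"pid-{pid}")

    for candidate in candidates:
        if isinstance(candidate, str) and candidate and candidate != "default":
            return candidate
    # REJECT step: kept for completeness; the pid fallback normally prevents it.
    raise ValueError("no usable session id")
```

Passing env and pid explicitly keeps the function deterministic in tests; a hook would call it with the raw stdin text and the defaults.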
Shadow-File Sites
- ~/.claude/hooks/user-message-logger.sh lines 49-84: Shadow write for /tmp/active-thread-${SESSION_ID}.txt (ACTIVE_THREAD extraction from session-state.md)
- ~/.claude/hooks/active-thread-lock.sh lines 23-46 (SESSION_ID resolution) + lines 84-114 (shadow-first read with session-state.md fallback)
Validation: Proveo PASS (6/6 ACs) — concurrent sessions with distinct session_id values read their own shadow files with no cross-session leak. Sessions without shadow files fall back to session-state.md with identical enforcement behavior. No "default" terminal value. See /tmp/proveo-99084-2026-05-03.json.
9. Reference
- MC #99076: Phase 2A atomic-write patch on session_id.py (Python pattern)
- MC #99080: Phase 2B-2 atomic-write patches on 4 bash hooks (8 line-level sites)
- MC #99084: Phase 2D shadow-file pattern for human-editable shared state (session-state.md ACTIVE_THREAD field)
- MC #99078: Phase 2B-1 bash atomicity audit (identified 8 UNSAFE sites)
- MC #99069: Session Isolation Audit (parent task, genesis of the finding)
- Spec: ~/system/specs/session-isolation-audit-2026-05-03.md §3 W1 (Weakness 1) + Appendix A
- Spec: ~/system/specs/bash-atomicity-audit-2026-05-03.md (Phase 2B-1 full inventory + fix templates)
- Source: ~/.claude/hooks/archive/lib-legacy/session_id.py lines 138-161 (Python pattern reference)
- Source: ~/.claude/hooks/mc-turn-reset.sh, ceo-intent-classifier.sh, one-ceo-turn-dispatch-cap.sh, one-ceo-turn-mc-cap.sh (bash pattern implementations)
- Source: ~/.claude/hooks/user-message-logger.sh (shadow write implementation), ~/.claude/hooks/active-thread-lock.sh (shadow-first read)
- Tests: ~/.claude/hooks/archive/lib-legacy/test_session_id_atomic.py (5 unit tests covering crash-recovery)
- Proveo Reports:
  - /tmp/postflight-99076/proveo-report.md (Phase 2A Python validation)
  - /tmp/postflight-99080/proveo-report.md (Phase 2B-2 bash validation)
  - /tmp/postflight-99084/proveo-report.md (Phase 2D shadow-file validation)
10. Further Reading
- Martin Kleppmann panelist review (/tmp/forged-99069-martin-kleppmann.md §2 Weakness 1): "write_active_task() is not atomic. Lines 138-142 use a bare open(task_file, 'w') write with no mktemp + os.replace() pattern. If the hook is interrupted mid-write (SIGKILL, context compaction crash, disk-full), the file is left in a partial or zero-byte state."
- rename(2) man page (Linux): "If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing."
- Best-in-class reference: one-ceo-turn-mc-cap.sh:108-113 (already used mktemp + mv for counter increment before the Phase 2B audit; correct pattern)
Generated by Skillforge for MC #99076 — Phase 2A Session Isolation Fix
Updated: 2026-05-03 (MC #99080 — Phase 2B-2 bash hook atomicity expansion)
Updated: 2026-05-03 (MC #99084 — Phase 2D shadow-file pattern for human-editable shared state)
Last verified: 2026-05-03 — Proveo Phase 2D report (PASS 6/6)