Git WAL¶
This document describes the Write-Ahead Log (WAL) format used by grite.
Overview¶
The WAL is an append-only event log stored in git refs. It is the source of truth for all grite state.
Location¶
The WAL is stored at the ref refs/grite/wal. This ref points to a git commit containing the current WAL state.
Structure¶
refs/grite/wal
└── commit (HEAD of WAL)
    ├── parent commit (previous WAL state)
    └── tree
        └── blob (CBOR-encoded events)
Commit Chain¶
Each append creates a new commit whose parent is the previous WAL HEAD. The ref always points to the latest commit.
Tree Layout¶
Each commit contains a tree holding a blob of event data.
Chunk Format¶
Events are encoded in chunks using the GRITCHNK format.
Header¶
Events¶
After the header, events are CBOR-encoded and concatenated.
Example¶
CBOR Encoding¶
Each event is encoded as a CBOR array:
[
  1,                      # schema version
  h'<issue_id>',          # 16 bytes
  h'<actor_id>',          # 16 bytes
  1700000000000,          # ts_unix_ms
  h'<parent_event_id>',   # 32 bytes or null
  3,                      # kind tag
  ["comment body"],       # kind payload
  h'<signature>'          # optional signature
]
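To make the array layout above concrete, here is a minimal sketch of a CBOR encoder covering only the types the event array uses (integers, byte strings, text strings, arrays, and null). It follows the standard CBOR major-type encoding and is not grite's own encoder. The all-zero ids are placeholders, and encoding an absent signature as null is an assumption, since the format above only says the signature is optional.

```python
import struct


def cbor_head(major: int, n: int) -> bytes:
    """Encode the initial byte and argument for a CBOR major type."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 1 << 8:
        return bytes([(major << 5) | 24, n])
    if n < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def cbor_encode(value) -> bytes:
    """Encode None, bool, int, bytes, str, and lists as CBOR."""
    if value is None:
        return b"\xf6"
    if isinstance(value, bool):  # must be checked before int
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        if value >= 0:
            return cbor_head(0, value)
        return cbor_head(1, -1 - value)
    if isinstance(value, bytes):
        return cbor_head(2, len(value)) + value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return cbor_head(3, len(raw)) + raw
    if isinstance(value, (list, tuple)):
        return cbor_head(4, len(value)) + b"".join(cbor_encode(v) for v in value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Placeholder event mirroring the array above (ids are dummy zero bytes).
event = [
    1,                 # schema version
    bytes(16),         # issue_id
    bytes(16),         # actor_id
    1700000000000,     # ts_unix_ms
    None,              # parent_event_id (null: no parent)
    3,                 # kind tag (CommentAdded)
    ["comment body"],  # kind payload
    None,              # signature (assumed null when absent)
]
encoded = cbor_encode(event)
```

A real implementation would more likely use an existing CBOR library; the hand-rolled encoder is only meant to show how few primitives the event format needs.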
Kind Payloads¶
| Tag | Kind | Payload |
|---|---|---|
| 1 | IssueCreated | [title, body, [labels...]] |
| 2 | IssueUpdated | [title_or_null, body_or_null] |
| 3 | CommentAdded | [body] |
| 4 | LabelAdded | [label] |
| 5 | LabelRemoved | [label] |
| 6 | StateChanged | ["open" or "closed"] |
| 7 | LinkAdded | [url, note_or_null] |
| 8 | AssigneeAdded | [user] |
| 9 | AssigneeRemoved | [user] |
| 10 | AttachmentAdded | [name, h'<sha256>', mime] |
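As an illustration of how the table might be used when decoding, the sketch below maps a decoded event array to a typed record and checks the payload length against the table. The `Event` dataclass, the `KINDS` dict, and `event_from_array` are hypothetical names introduced here, and accepting either 7 or 8 elements is an assumption about how the optional signature is represented.

```python
from dataclasses import dataclass
from typing import Optional

# Kind tag -> (name, number of payload elements), taken from the table above.
KINDS = {
    1: ("IssueCreated", 3),
    2: ("IssueUpdated", 2),
    3: ("CommentAdded", 1),
    4: ("LabelAdded", 1),
    5: ("LabelRemoved", 1),
    6: ("StateChanged", 1),
    7: ("LinkAdded", 2),
    8: ("AssigneeAdded", 1),
    9: ("AssigneeRemoved", 1),
    10: ("AttachmentAdded", 3),
}


@dataclass
class Event:
    schema_version: int
    issue_id: bytes
    actor_id: bytes
    ts_unix_ms: int
    parent_event_id: Optional[bytes]
    kind: str
    payload: list
    signature: Optional[bytes]


def event_from_array(arr: list) -> Event:
    """Turn a decoded CBOR event array into an Event, validating the kind."""
    if len(arr) not in (7, 8):
        raise ValueError(f"expected 7 or 8 elements, got {len(arr)}")
    version, issue_id, actor_id, ts, parent, tag, payload = arr[:7]
    signature = arr[7] if len(arr) == 8 else None
    if tag not in KINDS:
        raise ValueError(f"unknown kind tag: {tag}")
    name, arity = KINDS[tag]
    if len(payload) != arity:
        raise ValueError(f"{name} expects {arity} payload element(s)")
    return Event(version, issue_id, actor_id, ts, parent, name, list(payload), signature)
```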
Append Algorithm¶
To append new events:
- Read current WAL ref to get HEAD commit
- Create new blob with chunk header + CBOR events
- Create tree containing the blob
- Create commit with HEAD as parent
- Update ref atomically via git update-ref
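The steps above could be driven with git plumbing commands roughly as follows. This is a sketch, not grite's implementation: the blob file name `events.bin` and the commit message are assumptions, and `run_git` and `append_chunk` are hypothetical helpers. The git commands themselves (`rev-parse`, `hash-object`, `mktree`, `commit-tree`, `update-ref`) are standard plumbing, and passing the old value to `update-ref` makes the final step fail if the ref moved in the meantime.

```python
import subprocess
from typing import Callable

WAL_REF = "refs/grite/wal"


def run_git(args: list, stdin: bytes = b"") -> str:
    """Run a git command in the current repo and return its stripped stdout."""
    out = subprocess.run(["git", *args], input=stdin, capture_output=True, check=True)
    return out.stdout.decode().strip()


def append_chunk(chunk: bytes, git: Callable[..., str] = run_git,
                 blob_name: str = "events.bin") -> str:
    """Append one chunk (header + CBOR events) as a new WAL commit."""
    # 1. Read the current WAL ref; an empty string means the WAL does not exist yet.
    try:
        head = git(["rev-parse", "--verify", "--quiet", WAL_REF])
    except subprocess.CalledProcessError:
        head = ""
    # 2. Write the chunk as a blob.
    blob = git(["hash-object", "-w", "--stdin"], chunk)
    # 3. Create a tree containing that blob (file name is an assumption).
    tree = git(["mktree"], f"100644 blob {blob}\t{blob_name}\n".encode())
    # 4. Create a commit with the old HEAD as parent, if there is one.
    parents = ["-p", head] if head else []
    commit = git(["commit-tree", "-m", "grite: append events", *parents, tree])
    # 5. Move the ref only if it still equals `head`; an empty old value
    #    tells update-ref that the ref must not exist yet.
    git(["update-ref", WAL_REF, commit, head])
    return commit
```

The `git` parameter exists so the command sequence can be exercised with a stub instead of a real repository.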
Atomicity¶
git update-ref is atomic, and when it is given the expected old value it acts as a compare-and-swap. If two processes race:
- One succeeds
- Other gets "ref already updated" error
- Loser must re-read HEAD and retry
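The retry rule can be modeled without git at all. The sketch below uses an in-memory `RefStore` (a hypothetical stand-in for the ref, not part of grite) whose `compare_and_swap` plays the role of `git update-ref` with an expected old value. The loser of a race simply re-reads HEAD and builds its commit again.

```python
import threading
from typing import Callable, Optional


class RefStore:
    """In-memory stand-in for a single git ref with compare-and-swap."""

    def __init__(self) -> None:
        self._value: Optional[str] = None
        self._lock = threading.Lock()

    def read(self) -> Optional[str]:
        with self._lock:
            return self._value

    def compare_and_swap(self, expected: Optional[str], new: str) -> bool:
        """Set the ref to `new` only if it still equals `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def append_with_retry(store: RefStore,
                      make_commit: Callable[[Optional[str]], str],
                      max_attempts: int = 5) -> str:
    """Re-read HEAD and rebuild the commit until the swap succeeds."""
    for _ in range(max_attempts):
        head = store.read()
        new_commit = make_commit(head)  # commit whose parent is `head`
        if store.compare_and_swap(head, new_commit):
            return new_commit
    raise RuntimeError("gave up after repeated ref conflicts")
```

Bounding the number of attempts is a design choice of this sketch; the source only says the loser must retry.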
Read Algorithm¶
To read all events:
- Get current WAL ref
- Walk commit history from HEAD to root
- For each commit, read blob from tree
- Parse chunk header
- Decode CBOR events
- Return events in chronological order
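The walk-then-reverse shape of this algorithm can be shown on an in-memory commit graph. In this sketch each commit is a `(parent, events)` pair keyed by a made-up id, and events are plain strings. Chunk parsing and CBOR decoding are left out, and it assumes that events inside one blob are already stored in append order.

```python
from typing import Dict, List, Optional, Tuple

# commit id -> (parent id or None, events stored in that commit's blob)
Commit = Tuple[Optional[str], List[str]]


def read_all_events(commits: Dict[str, Commit], head: Optional[str]) -> List[str]:
    """Walk from HEAD to the root, then return events oldest first."""
    batches: List[List[str]] = []
    cursor = head
    while cursor is not None:
        parent, events = commits[cursor]
        batches.append(events)  # collected newest commit first
        cursor = parent
    ordered: List[str] = []
    for events in reversed(batches):  # flip to oldest commit first
        ordered.extend(events)
    return ordered


wal = {
    "c1": (None, ["create issue"]),
    "c2": ("c1", ["add label", "add comment"]),
    "c3": ("c2", ["close issue"]),
}
history = read_all_events(wal, "c3")
```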
Optimization: Snapshots¶
For large WALs, snapshots accelerate reads. A snapshot contains the consolidated events up to a point, so reading starts from the latest snapshot instead of the root commit.
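A snapshot-aware read might look like the sketch below. It assumes the snapshot records which WAL commit it covers (`snapshot_head`, a hypothetical field not described in this document), so the walk can stop there instead of continuing to the root.

```python
from typing import Dict, List, Optional, Tuple

# commit id -> (parent id or None, events stored in that commit's blob)
Commit = Tuple[Optional[str], List[str]]


def read_with_snapshot(commits: Dict[str, Commit], head: Optional[str],
                       snapshot_events: List[str],
                       snapshot_head: Optional[str]) -> List[str]:
    """Start from the snapshot and replay only the commits appended after it."""
    tail: List[List[str]] = []
    cursor = head
    while cursor is not None and cursor != snapshot_head:
        parent, events = commits[cursor]
        tail.append(events)
        cursor = parent
    ordered = list(snapshot_events)
    for events in reversed(tail):  # oldest post-snapshot commit first
        ordered.extend(events)
    return ordered
```

With a snapshot covering everything up to some commit, only the commits after it are visited, which is where the speedup for large WALs comes from.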
Sync Operations¶
Fetch (Pull)¶
After fetch:
- Local ref updated to remote HEAD
- New events read from commits
- Events inserted into sled
- Projections updated
Push¶
If rejected (non-fast-forward):
- Fetch remote changes
- Identify local-only events (not in remote)
- Re-append local events on top of remote HEAD
- Push again
This "rebase" preserves all events from all actors.
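The core of the recovery step is a set difference followed by a re-append. The sketch below models events as dicts with an `id` key, which is an assumption made for illustration, and shows that every remote event and every local-only event survives.

```python
from typing import Dict, List, Tuple

Event = Dict[str, str]


def rebase_local_events(local: List[Event],
                        remote: List[Event]) -> Tuple[List[Event], List[Event]]:
    """Return (merged log, local-only events) after a rejected push."""
    remote_ids = {event["id"] for event in remote}
    # Events the remote has not seen yet, kept in their local order.
    local_only = [event for event in local if event["id"] not in remote_ids]
    # Re-append them on top of the remote history.
    return remote + local_only, local_only


local_log = [{"id": "a"}, {"id": "b"}, {"id": "x"}]   # x was created locally
remote_log = [{"id": "a"}, {"id": "b"}, {"id": "y"}]  # y came from another actor
merged, to_push = rebase_local_events(local_log, remote_log)
```

In this example `merged` ends with the remote-only event `y` followed by the local-only event `x`, and `to_push` holds just `x`, which is what the next push attempt would carry.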
Snapshots¶
Purpose¶
Snapshots accelerate rebuilds for large WALs.
Location¶
Snapshots are stored under refs/grite/snapshots/<ts>.
Format¶
Same as WAL: commit with tree containing CBOR blob.
Creation¶
Creating a snapshot captures all events up to the current WAL HEAD.
Garbage Collection¶
Garbage collection removes old snapshots according to policy.
Distributed Locks¶
Location¶
Locks are stored under refs/grite/locks/<hash>.
Format¶
Lock data is stored in a commit tree:
{
  "resource": "issue:abc123",
  "owner": "<actor_id>",
  "nonce": "<random>",
  "expires_unix_ms": 1700003600000
}
Acquisition¶
- Check if lock ref exists
- If exists and not expired, fail (conflict)
- If not exists or expired, create commit with lock data
- Atomic update-ref to claim lock
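The acquisition rules can be sketched against an in-memory table of locks. Here a plain dict keyed by resource stands in for the lock refs, and the final assignment stands in for the atomic update-ref. The function name `try_acquire`, the `ttl_ms` parameter, and the nonce length are assumptions for illustration.

```python
import secrets
import time
from typing import Dict, Optional

Lock = Dict[str, object]


def try_acquire(locks: Dict[str, Lock], resource: str, owner: str,
                ttl_ms: int, now_ms: Optional[int] = None) -> Optional[Lock]:
    """Return the new lock record, or None if a live lock already exists."""
    now = int(time.time() * 1000) if now_ms is None else now_ms
    current = locks.get(resource)
    # An existing, unexpired lock means conflict.
    if current is not None and current["expires_unix_ms"] > now:
        return None
    # Missing or expired: build the lock record in the format shown above.
    lock: Lock = {
        "resource": resource,
        "owner": owner,
        "nonce": secrets.token_hex(16),
        "expires_unix_ms": now + ttl_ms,
    }
    locks[resource] = lock  # stands in for the atomic update-ref
    return lock
```

Against real refs, the last step would need to be a compare-and-swap on the lock ref so that two actors who both saw an expired lock cannot both claim it.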
Release¶
Delete the lock ref.
Ref Summary¶
| Ref | Purpose |
|---|---|
| refs/grite/wal | Append-only event log |
| refs/grite/snapshots/<ts> | Point-in-time snapshots |
| refs/grite/locks/<hash> | Distributed lease locks |
Design Rationale¶
Why CBOR?¶
- Compact binary format
- Self-describing
- Cross-language support
- Canonical encoding possible
Why Append-Only?¶
- Simple conflict resolution
- Full history preserved
- Easy to sync and merge
- Natural audit trail
Why Git Refs?¶
- Atomic updates
- Built-in replication
- Works with any git remote
- No separate storage system
Next Steps¶
- Storage Layout - Local file organization
- CRDT Merging - Event merge semantics
- Syncing Guide - Practical sync usage