git dogs is an experiment in refactoring git to make it suitable for modern workflows. The data model and the syncing protocol are left as-is to stay compatible with the existing mass of git repos. The rest is reworked: the system is syntax-aware, diffing and merging are finer-grained and draw heavily on CRDT ideas, and content is indexed for efficient search.
The project dogfoods from day 1.
The repo is structured into *dogs*. Each dog has its purview, the data and functions it is responsible for. Dogs coordinate to carry out complex tasks.

- Bro: interactive syntax-highlighted pager/viewer.
- Spot: structural code search, grep, regex, and replace across a repo. Maintains a trigram index for instant lookups.
- Graf: does token-level diffing, 3-way merges, history navigation. Maintains a history index.
- Sniff: serves the worktree, detects changes.
- Keeper: keeps the data per se (git blobs, trees, commits).
New dogs may join, old dogs may learn new tricks. If it works, it gets used. If it's used, it evolves.
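
To make the trigram index mentioned for Spot more concrete, here is a minimal Python sketch of the general technique. It is not Spot's code: `TrigramIndex`, `add` and `search` are hypothetical names, and the index lives in memory rather than on disk. The idea is that a literal query is broken into 3-character pieces, the posting lists for those pieces are intersected to get candidate files, and only the candidates are actually scanned.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy in-memory trigram index (hypothetical, for illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of file paths
        self.files = {}                   # path -> file content

    def add(self, path, text):
        self.files[path] = text
        for gram in trigrams(text):
            self.postings[gram].add(path)

    def search(self, needle):
        grams = trigrams(needle)
        if grams:
            # A file can only contain the needle if it contains every trigram.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        else:
            # Needle shorter than 3 characters: the index cannot narrow it down.
            candidates = set(self.files)
        # Confirm each candidate with a real substring check.
        return sorted(p for p in candidates if needle in self.files[p])
```

Sharing trigrams does not guarantee a match, which is why the final substring check is kept: the index only narrows down which files need to be read.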
Build (requires libsodium, lz4, zlib and cmake; ninja is recommended):

```
mkdir build && cd build
CC=clang CXX=clang++ cmake -GNinja -DCMAKE_BUILD_TYPE=Release ..
ninja
```

`be` is the dispatcher that ties the dogs together. Every verb is a
pipeline: `be get` fans out to keeper, sniff, spot and graf in turn;
`be post` walks the worktree into a commit and advances refs. See
beagle/GURI.md for the URI grammar.

```
be get ssh://host/repo.git    # clone (fetch + checkout + index)
be get ?v1.2                  # checkout the "v1.2" ref locally
be get path/to/file.c         # open the file in the pager (bro)
be get path/to/file.c#TODO    # grep (spot) inside one file
be get .#FuncName             # structural search across repo
```

```
be put src/foo.c src/bar.c    # stage two files
be put                        # stage everything dirty
be delete src/obsolete.c      # stage removal of one path
be delete                     # stage every tracked file rm'd on disk
```

Each put / delete grows or shrinks the staged base tree in
keeper's object store. HEAD does not move.

```
be post -m "fix the parser"     # commit; auto-stage dirty if needed
be post -m "release" //origin   # commit and push to the remote
```

be post wraps the current base tree into a commit with parent =
HEAD, advances HEAD, and updates refs. With a remote authority in the
URI the keeper push step is included; without, it's purely local.
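
The staging and commit behaviour described above can be pictured with a toy model. This is only a sketch under loud assumptions: `ToyRepo` and its fields are hypothetical, everything is kept in memory, and object ids are a plain SHA-1 of the raw bytes rather than git's real object encoding. It is not how keeper stores data. It only shows the two invariants from the text: put/delete change the staged base tree without moving HEAD, and post creates a commit whose parent is the old HEAD and then advances HEAD.

```python
import hashlib


class ToyRepo:
    """Hypothetical in-memory model; not keeper's real object format."""

    def __init__(self):
        self.objects = {}  # object id -> bytes
        self.base = {}     # staged base tree: path -> blob id
        self.commits = {}  # commit id -> (tree snapshot, parent id, message)
        self.head = None   # id of the current commit, None before the first

    def _store(self, data):
        oid = hashlib.sha1(data).hexdigest()
        self.objects[oid] = data
        return oid

    def put(self, path, content):
        """Stage a file: the base tree grows or changes, HEAD stays put."""
        self.base[path] = self._store(content)

    def delete(self, path):
        """Stage a removal: the base tree shrinks, HEAD stays put."""
        self.base.pop(path, None)

    def post(self, message):
        """Wrap the base tree into a commit with parent = HEAD, advance HEAD."""
        tree = tuple(sorted(self.base.items()))
        payload = repr((tree, self.head, message)).encode()
        cid = self._store(payload)
        self.commits[cid] = (dict(self.base), self.head, message)
        self.head = cid
        return cid
```

Pushing to a remote is left out of the model entirely; it only covers the local case.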

```
be diff path/to/file.c?v1..v2   # one file, ref-to-ref
be diff path/to/file.c?v1       # one file, ref vs worktree
be diff ?v1..v2                 # whole tree, ref-to-ref
be diff ?v1                     # whole tree, ref vs worktree
```

The URI shape picks the mode: a ?from..to query diffs two refs, a
bare ?ref diffs that ref against the current worktree. Add a path
for a single file, leave it off for the whole tree. Output is the
same token-level unified hunk format graf diff emits on disk.
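
The mode selection described above can be sketched as a small parser. This is a simplified illustration only: the real grammar is in beagle/GURI.md, and `diff_mode` and its result keys are hypothetical names. It covers just the four shapes listed, using `None` in the `to` slot to mean the current worktree.

```python
def diff_mode(uri):
    """Classify a `be diff` argument by its shape (simplified sketch)."""
    path, _, query = uri.partition("?")
    if not query:
        raise ValueError("expected a ?ref or ?from..to query")
    scope = "file" if path else "tree"
    if ".." in query:
        # ?from..to: compare two refs
        old, new = query.split("..", 1)
    else:
        # bare ?ref: compare that ref against the worktree
        old, new = query, None
    return {"scope": scope, "path": path or None, "from": old, "to": new}
```

For example, `diff_mode("path/to/file.c?v1..v2")` yields a file-scoped, ref-to-ref comparison, while `diff_mode("?v1")` yields a whole-tree comparison of `v1` against the worktree.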

```
mkdir my-repo && cd my-repo
echo 'int main(){return 0;}' > hello.c
be post -m "initial"                # first commit, auto-stages hello.c
echo 'printf("hi\n");' >> hello.c
be put hello.c                      # stage the edit
be post -m "greet"                  # commit
be get ?$(cat .dogs/sniff/HEAD)     # round-trip: check out HEAD again
```

```
bro file.c                  # syntax-highlighted pager
graf --diff old.c new.c     # token-level diff
graf --install              # register as git diff/merge driver
```

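
To show what "token-level" means compared with a line-level diff, here is a minimal Python sketch. It is not graf's algorithm: the tokenizer is a crude regex standing in for the syntax-aware scanners, the matching is done by the standard library's `difflib.SequenceMatcher`, and `tokenize` and `token_diff` are hypothetical names.

```python
import difflib
import re

# Crude stand-in tokenizer: words, whitespace runs, single punctuation chars.
TOKEN = re.compile(r"\w+|\s+|[^\w\s]")


def tokenize(src):
    """Split source text into tokens; joining them restores the input."""
    return TOKEN.findall(src)


def token_diff(old, new):
    """Return (tag, old_text, new_text) for every changed token run."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, "".join(a[i1:i2]), "".join(b[j1:j2])))
    return changes
```

Comparing `int main(){return 0;}` with `int main(){return 1;}` reports a single replaced token (`0` to `1`), where a line-based diff would flag the whole line.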
See each dog's INDEX.md (e.g. sniff/INDEX.md,
keeper/INDEX.md) for the full API surface.
Is this git based? This is git-compatible.
Is this VC funded? Nope. The project runs on old hardware discarded by a university. Heavy jobs (e.g. massive fuzzing) all run on a discounted 32-core Hetzner server. Coding is mostly done by Claude Max, in 2 to 5 parallel sessions.
Trigram indexing idea from Russ Cox. Tokenizers started with tree-sitter, later rewritten as ragel scanners for speed. The Merkle scheme is by Linus Torvalds AFAIK.