A persistent, high-performance key-value storage engine built in Java 21+ using the Log-Structured Merge-Tree (LSM-Tree) architecture. No external database libraries—pure Java.
- Features
- Architecture Overview
- Prerequisites
- Setup Guide
- Building & Testing
- Usage
- Configuration
- Project Structure
- Design Notes
- License
- Concurrent reads and writes — Lock-free Memtable (CAS), thread-safe WAL and store
- Durability — Write-ahead log (WAL) with crash recovery
- Efficient lookups — In-memory Memtable, Bloom filters, and sparse indexes on SSTables
- Background compaction — Size-tiered compaction on a virtual thread with K-way merge
- Range scans — Ordered iteration over key ranges
- Tombstones — Deletes via tombstones with correct recovery semantics
┌─────────────────────────────────────────────────┐
│ LSMStore │
│ (coordinates WAL → Memtable, reads, compaction) │
└─────────────────────────────────────────────────┘
│
┌────────────────────────────────┼────────────────────────────────┐
▼ ▼ ▼
┌───────────┐ ┌─────────────┐ ┌─────────────┐
│ WAL │ │ Memtable │ │ SSTables │
│ (append- │ │ (SkipList, │ │ (immutable │
│ only) │ ── replay ──► │ in-memory) │ ── flush ──► │ on disk) │
└───────────┘ └─────────────┘ └─────────────┘
│ │ │
│ │ │
▼ ▼ ▼
current.wal Sorted K/V sstable_1.sst
(binary log) (~64MB then sstable_2.sst
immutable) + Bloom + index
- WAL: Every put/delete is appended before updating the Memtable; on startup the WAL is replayed to reconstruct the Memtable.
- Memtable: In-memory concurrent SkipList; when it exceeds a size threshold it becomes immutable and is flushed to a new SSTable.
- SSTables: Immutable sorted files with a sparse index and Bloom filter; reads check Memtable first, then SSTables (newest first).
- Compaction: A background virtual thread merges the oldest N SSTables into one using a K-way merge to limit the number of on-disk files.
- Java 21 or later (for Virtual Threads, Records, and modern APIs)
- Gradle 8.x+ (or use the included wrapper; no global install required)
Check your Java version:
java -version
# Should show version 21 or highercd /path/to/your/workspace
# If using Git:
# git clone <repository-url> JQL
# cd JQL-
macOS (Homebrew):
brew install openjdk@21
Then setJAVA_HOMEif needed, e.g.
export JAVA_HOME=$(/usr/libexec/java_home -v 21) -
Linux (apt):
sudo apt install openjdk-21-jdk -
Windows:
Install Eclipse Temurin 21 or Oracle JDK 21 and setJAVA_HOME.
If the project doesn’t already have gradlew and gradlew.bat:
gradle wrapper --gradle-version 9.3If you don’t have Gradle installed, download the binary from gradle.org or use SDKMAN: sdk install gradle.
./gradlew --version
./gradlew tasksYou should see the Java version (21+) and a list of available tasks.
| Command | Description |
|---|---|
./gradlew build |
Compile, run tests, and build (no publishing) |
./gradlew compileJava |
Compile main sources only |
./gradlew test |
Run all tests |
./gradlew clean |
Remove build/ |
./gradlew clean build |
Clean and full build |
Run tests (recommended after setup):
./gradlew testTests include: put/get, overwrites, deletes, range scans, restart recovery, concurrent reads/writes, and flush-to-SSTable behavior.
import core.LSMStore;
import java.nio.file.Path;
Path dataDir = Path.of("/var/data/jql-store");
try (LSMStore store = new LSMStore(dataDir)) {
// Put
store.put("user:1000".getBytes(), "Alice".getBytes());
// Get
byte[] value = store.get("user:1000".getBytes());
// value != null => "Alice"
// Delete (tombstone)
store.delete("user:1000".getBytes());
store.get("user:1000".getBytes()); // null
// Range scan [from, to] (inclusive; null = unbounded)
var entries = store.scan("user:".getBytes(), "user:~".getBytes());
for (var e : entries) {
System.out.println(new String(e.getKey()) + " -> " + new String(e.getValue()));
}
}
// store.close() called automatically; WAL and SSTables are on diskData survives process restarts. Open the same directory again to recover from the WAL and existing SSTables:
// First run
try (LSMStore store = new LSMStore(dataDir)) {
store.put("k".getBytes(), "v".getBytes());
}
// After restart: open same directory
try (LSMStore store = new LSMStore(dataDir)) {
byte[] v = store.get("k".getBytes());
// v equals "v"
}Default flush threshold is 64 MiB. You can override it:
long thresholdBytes = 32 * 1024 * 1024; // 32 MiB
try (LSMStore store = new LSMStore(dataDir, thresholdBytes)) {
// ...
}| Component | Default / behavior |
|---|---|
| Memtable threshold | 64 MiB (DEFAULT_MEMTABLE_THRESHOLD_BYTES in LSMStore) |
| WAL file | current.wal in the store directory |
| SSTable naming | sstable_<generation>.sst (e.g. sstable_1.sst) |
| Compaction | Runs in a virtual thread; when there are ≥ 4 SSTables, the oldest 4 are merged into one |
| Sparse index | Every 16th key is stored in the in-memory index (SSTable.SPARSE_INDEX_INTERVAL) |
| Bloom filter | 1% target false positive rate for SSTables |
To change compaction threshold or index interval, edit the constants in LSMStore and SSTable respectively.
JQL/
├── build.gradle.kts # Gradle build (Kotlin DSL)
├── settings.gradle.kts
├── README.md # This file
├── gradlew / gradlew.bat # Gradle wrapper
└── src/
├── main/java/core/
│ ├── LSMStore.java # Main entry: WAL → Memtable, reads, compaction
│ ├── WAL.java # Write-ahead log (binary format, replay)
│ ├── Memtable.java # In-memory concurrent SkipList
│ ├── SSTable.java # Immutable on-disk sorted table + index + Bloom
│ ├── BloomFilter.java # BitSet-based probabilistic filter
│ └── ByteArrayComparator.java
└── test/java/core/
└── LSMTest.java # Unit and integration tests (including restart)
- SkipList (not Red-Black tree): Lock-free concurrent inserts via VarHandle CAS; no rebalancing; natural ordering for range scans.
- WAL format:
Key Size (4B) | Key | Value Size (4B, -1 = tombstone) | Valuefor simple parsing and recovery. - SSTables are immutable: Simplifies concurrency and compaction; updates and deletes are handled by newer tables and tombstones.
- Size-tiered compaction: Merges several small SSTables into one larger one to bound read amplification and file count.
- Virtual thread for compaction: Background work doesn’t block the main request path; Java 21 virtual threads keep overhead low.
This project is provided as-is for reference and learning. Use and modify according to your needs.