Built to learn Go and to have a tool I can use daily for network monitoring when a server alerts on network issues. I suck at Linux networking!
Yes, fucking yes.
- See docs/ai-context/: README.md, PROJECT_STRUCTURE.md, HIGH_LEVEL_DESIGN.md
Runtime requirements:
# Linux 4.9+ with root/sudo — no external tools needed for core features.
# The app uses kernel APIs directly (netlink sockets) for:
# - Socket operations (replaces ss)
# - Conntrack queries (replaces conntrack CLI)
# - Firewall rules (replaces iptables/ip6tables via nftables)
#
# Optional: only needed if kernel API detection fails (rare on modern distros)
sudo apt install -y conntrack iproute2 iptables # Ubuntu/Debian
sudo yum install -y conntrack-tools iproute iptables # CentOS/RHEL
# Required: only for Trace Packet feature (tcpdump is the only mandatory external tool)
sudo apt install -y tcpdump # Ubuntu/Debian
sudo yum install -y tcpdump # CentOS/RHEL

For bandwidth monitoring, enable conntrack byte accounting:
sudo sysctl -w net.netfilter.nf_conntrack_acct=1
# To make it persist across reboots:
echo "net.netfilter.nf_conntrack_acct=1" | sudo tee -a /etc/sysctl.d/99-conntrack.confKernel / distro notes:
Kernel / distro notes:
- Linux only (reads `/proc` network data, uses netlink kernel APIs).
- Kernel 4.9+ recommended — enables SOCK_DESTROY for socket killing and all netlink features (quick check below).
- Conntrack section needs the `nf_conntrack` module loaded (`/proc/sys/net/netfilter/nf_conntrack_count`).
- Release binary is statically built to run across common Linux distros (Ubuntu 20.04+, CentOS 7+).
- On startup, the app auto-detects kernel capabilities and shows `API:kernel` (green) or `API:exec(...)` (yellow) in the status bar.
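A quick pre-flight check of the points above (plain shell, independent of the app):

uname -r                                        # kernel version, want 4.9+
lsmod | grep -w nf_conntrack                    # is the conntrack module loaded?
cat /proc/sys/net/netfilter/nf_conntrack_count  # exists only while the module is active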
Privilege notes:
- Run with `sudo` (or root) for full functionality. `sudo`/`CAP_NET_ADMIN` is required for: netlink socket access, block/unblock peer, kill active flows, conntrack stats, and full process mapping from `/proc/[pid]/fd`.
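If you prefer not to run the whole binary as root, one possible alternative (a sketch, not an officially supported mode) is a file capability; note that reading other users' `/proc/[pid]/fd` may still require root, so process mapping can degrade:

sudo setcap cap_net_admin+ep /usr/local/bin/holyf-network
getcap /usr/local/bin/holyf-network   # verify the capability was applied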
curl -sL https://github.com/BlackMetalz/holyf-network/releases/latest/download/holyf-network-linux-amd64 -o /tmp/holyf-network.bin
chmod +x /tmp/holyf-network.bin
sudo mv /tmp/holyf-network.bin /usr/local/bin/holyf-network
sudo holyf-network -v

# Default refresh rate is 30 seconds
sudo holyf-network
# Set refresh rate to 5 seconds
sudo holyf-network -r 5
# Keyboard shortcuts inside TUI
# Tab / Shift+Tab: move focus between panels
# Ctrl+1: dashboard view (default — Top Connections + System Health + Diagnosis)
# Ctrl+2: bandwidth chart view (full-screen RX/TX time-series charts)
# Up / Down: select row in Top Connections
# [ / ]: previous / next page in Top Connections when rows exceed visible height
# o: toggle Top Connections IN/OUT mode
# Enter / k: block selected Top Connections row (IN mode only)
# K: K8s pod lookup by port (scan network namespaces → PID, pod, deployment)
# T: trace packet for selected Top Connections row (bounded tcpdump flow)
# t: open Trace History (latest trace runs, Enter=view detail)
# Shift+B: sort by Bandwidth (press again toggles DESC/ASC)
# Shift+C: sort by Conns (press again toggles DESC/ASC)
# Shift+P: sort by Port (press again toggles DESC/ASC)
# i: explain Send-Q / Recv-Q / TX/s / RX/s
# Shift+I: explain Interface Stats (RX/TX, Packet rate, App CPU/RSS, Errors, Drops)
# g: toggle grouped view (peer + process, capped to top 20 groups by CONNS)
# /: search Top Connections by text (contains match)
# f: port filter (local port in IN mode, remote port in OUT mode; press f again to clear all filters)
# m: toggle sensitive IP masking
# s: sort Connection States by count (toggle DESC/ASC)
# b: list active blocks and remove selected peer block
# d: show Diagnosis History (latest 20 changes in current live session)
# h: show action log (latest 20)
# p: pause/resume auto-refresh
# r: refresh now
# z: zoom/unzoom Top Connections only
# ?: show help
# q: quit

# Build local binary
make build
# Build + run with sudo
make local
# Build + run with extra args
make local ARGS="-r 5"

Live refresh model:
- `-r`/`--refresh` controls the main full refresh loop (Top Connections + System Health + Diagnosis).
- The `System Health` panel has a dedicated 1s refresh lane for faster RX/TX and bandwidth chart visibility.
- The `App Usage` line shows the holyf-network process CPU cores + RSS, sampled on the configured `-r`/`--refresh` interval.
- `Ctrl+2` switches to bandwidth chart view — two side-by-side RX/TX time-series charts (Braille rendering, 60s window).
- `p` (pause) pauses both refresh lanes.
- Status bar shows `API:kernel` (green) when using kernel APIs, `API:exec(...)` (yellow) when falling back to CLI tools. `LINK:<speed>Mb/s` is shown only when NIC speed is detectable.
Mitigation behavior:
- `minutes > 0`: block first, then kill active connections, then auto-unblock when the timer expires.
- `minutes = 0`: kill active connections only (no block rule, no timer).
- Kill success ignores `TIME_WAIT`; partial results show as `remaining N (storm/race)`.
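To audit blocks outside the TUI (the `b` hotkey lists them in-app), you can dump the host's nftables ruleset; the app manages its rules via nftables, but the table/chain names are internal to the app, so this is just generic `nft` usage:

sudo nft list ruleset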
- Beginner TCP/network foundation (English, operator-first): docs/NETWORK_FOUNDATIONS_FOR_SRE_EN.md
- Beginner TCP/network foundation (Vietnamese, operator-first): docs/NETWORK_FOUNDATIONS_FOR_SRE_VI.md
- Tcpdump for beginners (Vietnamese, practical packet-capture basics): docs/TCPDUMP_FOR_BEGINNERS_VI.md
- Tcpdump trace feature in holyf-network (Vietnamese, UX/guardrails): docs/TCPDUMP_TRACE_FEATURE_VI.md
- Incident mental checklist (English, 1-page quick reference): docs/INCIDENT_MENTAL_CHECKLIST_EN.md
- Incident mental checklist (Vietnamese, 1-page quick reference): docs/INCIDENT_MENTAL_CHECKLIST_VI.md
- User metrics guide (Vietnamese, practical ops): docs/USER_METRICS_GUIDE_VI.md
- User metrics guide (English, practical ops): docs/USER_METRICS_GUIDE_EN.md
- Daemon snapshot file spec: docs/SNAPSHOT_FORMAT.md
- `PID/NAME` (example `44011/sshd`): socket mapped to a host process.
- `ct/nat`: synthetic row from the conntrack/NAT host-facing visibility path.
- `-`: process info not available.
# Start daemon in background (default interval = 30s)
holyf-network daemon start
# Typical production start with explicit controls
holyf-network daemon start \
--interface eth0 \
--interval 30 \
--top-limit 500 \
--data-dir ~/.holyf-network/snapshots \
--retention-hours 168
# Check status
holyf-network daemon status
# Prune old segment files immediately (on-demand)
holyf-network daemon prune
# Stop daemon
holyf-network daemon stop
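For a managed install, a minimal systemd unit sketch — not shipped with the project, and it assumes the `daemon run` mode (mentioned in the config precedence notes below) stays in the foreground; adjust the interface and flags to your host:

sudo tee /etc/systemd/system/holyf-network-daemon.service >/dev/null <<'EOF'
[Unit]
Description=holyf-network snapshot daemon
After=network-online.target

[Service]
ExecStart=/usr/local/bin/holyf-network daemon run --interface eth0 --interval 30
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now holyf-network-daemon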
Daemon notes:
- Storage format: daily JSON Lines (`.jsonl`, one JSON object per line) by server local time (`connections-YYYYMMDD.jsonl`).
  - Spec: https://jsonlines.org/
  - Full format contract: docs/SNAPSHOT_FORMAT.md
- Snapshot payload stores aggregate-only rows in two arrays:
  - `incoming_groups`: grouped by `peer_ip + local service port + proc_name`
  - `outgoing_groups`: grouped by `peer_ip + remote service port + proc_name`
- No raw connection list is persisted in history.
- For Docker/NAT traffic, daemon snapshots can persist `proc_name=ct/nat` rows (same semantics as live view).
- `--top-limit` is the max aggregate rows stored per side per snapshot (`IN` cap + `OUT` cap).
- Retention policy: remove old segments beyond `--retention-hours`.
  - In daemon runtime, prune runs once at startup and then daily at local 00:00.
  - Use `holyf-network daemon prune` for an immediate manual prune.
- A lock file prevents multiple daemons from writing to the same `--data-dir`.
- Default paths on Linux root:
  - snapshots: `/var/lib/holyf-network/snapshots`
  - daemon log: `/var/log/holyf-network/daemon.log`
  - active state: `/run/holyf-network/daemon.state`
- Optional daemon defaults file: `/etc/holyf-network/daemon.json`
  - The file is optional; if absent, built-in defaults are used.
  - Partial config is allowed, for example only overriding `data-dir`.
  - Precedence:
    - `daemon start/run`: CLI flags -> `daemon.json` -> built-in defaults
    - `replay`: explicit `--data-dir`/`--file` -> active daemon state -> `daemon.json` -> built-in defaults
    - `daemon status/stop/prune`: explicit target flags -> active daemon state -> `daemon.json` -> built-in defaults
  - Example:
    {
      "data-dir": "/var/lib/holyf-network/snapshots"
    }
- `daemon status/stop` without explicit `--data-dir`/`--pid-file` uses active-state targeting.
- `daemon prune` without explicit `--data-dir`/`--pid-file` also uses active-state targeting.
- `daemon prune` retention source precedence:
  - `--retention-hours` flag
  - active-state `retention_hours` (if present)
  - `daemon.json` `retention-hours` (if present)
  - default `168h`
- Explicit flags (`--data-dir` or `--pid-file`) force `status/stop/prune` to target that explicit location.
- For bandwidth monitoring, use a shorter interval (`--interval 5..10`) to capture bursts better.
- For connection trend monitoring, keep the default interval (30s) to reduce noise/storage.
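To check how much disk the snapshot history is using under the default root path:

sudo du -sh /var/lib/holyf-network/snapshots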
Snapshot file quick inspect:
# count snapshots in one day file
wc -l /var/lib/holyf-network/snapshots/connections-YYYYMMDD.jsonl
# read latest 3 snapshots
tail -n 3 /var/lib/holyf-network/snapshots/connections-YYYYMMDD.jsonl

One line = one snapshot record (JSON object), for example:
{"captured_at":"2026-03-08T12:56:30.196962352+07:00","interface":"eth0","top_limit_per_side":500,"sample_seconds":29.999999695,"bandwidth_available":true,"incoming_groups":[{"peer_ip":"172.25.110.116","port":22,"proc_name":"sshd","conn_count":2,"tx_queue":0,"rx_queue":0,"total_queue":0,"tx_bytes_delta":377892,"rx_bytes_delta":41164,"total_bytes_delta":419056,"tx_bytes_per_sec":12596.400128063402,"rx_bytes_per_sec":1372.1333472833558,"total_bytes_per_sec":13968.533475346758,"states":{"ESTABLISHED":2}}],"outgoing_groups":[{"peer_ip":"20.205.243.168","port":443,"proc_name":"curl","conn_count":1,"tx_queue":0,"rx_queue":0,"total_queue":0,"tx_bytes_delta":0,"rx_bytes_delta":0,"total_bytes_delta":0,"tx_bytes_per_sec":0,"rx_bytes_per_sec":0,"total_bytes_per_sec":0,"states":{"ESTABLISHED":1}}],"version":"v0.3.46"}# Open replay UI for current day (server local time)
# Open replay UI for current day (server local time)
holyf-network replay
# Open replay UI with masked IP prefixes
holyf-network replay --sensitive-ip
# Open one specific daily snapshot segment file
holyf-network replay --file connections-20260304.jsonl
# shorthand:
holyf-network replay -f connections-20260304.jsonl
# Replay only snapshots inside a time window (inclusive)
holyf-network replay -b 20:00 -e 23:59
# With --file, clock-only time binds to that file's date
holyf-network replay --file connections-20260304.jsonl -b 20:00 -e 23:59
# shorthand:
holyf-network replay -f connections-20260304.jsonl -b 20:00 -e 23:59

Replay path resolution:
- Default (`holyf-network replay`): uses the active daemon `data_dir` from the state file.
- If no active daemon state is found, it falls back to the system default snapshot path.
- `--data-dir` still exists as an advanced override (hidden from help output).
Replay hotkeys:
- `[` / `]`: previous/next snapshot
- `a` / `e`: oldest/latest snapshot
- `t`: jump to a specific timestamp
- `L`: live tail — auto-jump to the newest snapshot as the daemon writes new data (like `tail -f`)
- `o`: toggle replay `IN`/`OUT`
- `Up`/`Down`: select row
- `f`: port filter (press again to clear filters)
- `/`: grep-like contains filter for the current snapshot
- `Shift+B`/`C`/`P`: sort mode (press the same key to toggle DESC/ASC)
- `i`: explain Send-Q / Recv-Q / TX/s / RX/s
- `Shift+I`: alias of `i` (explain queue/bandwidth columns)
- `m`: mask IP display
- `x`: toggle skip-empty snapshot navigation
- `?`: help
- `q`: quit
`t` accepts: `YYYY-MM-DD HH:MM[:SS]`, `HH:MM[:SS]` (today), `yesterday HH:MM`, or RFC3339.
`--begin`/`-b` and `--end`/`-e` use the same time parsing semantics as `t`.
When `--file` is set and `-b`/`-e` are clock-only (`HH:MM[:SS]`), replay uses that segment's date.
If only `-b` is provided, replay auto-sets `-e` to end-of-day of `-b`.
If only `-e` is provided, replay auto-sets `-b` to start-of-day of `-e`.
Time-window filtering is inclusive (`captured_at >= begin` and `captured_at <= end`).
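For example (illustrative values only), the date-qualified form with seconds, pinned to one segment file:

holyf-network replay -f connections-20260304.jsonl -b "2026-03-04 20:30:00" -e "2026-03-04 21:00:00"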
Replay mode is read-only: no block/kill actions are executed. Replay view is aggregate-only:
- `IN`: rows are grouped by `peer_ip + local service port + proc_name`
- `OUT`: rows are grouped by `peer_ip + remote service port + proc_name`
- Replay does not inherit the live top-20 group cap; it shows all stored rows in the snapshot, limited only by panel height.
History reader accepts only this aggregate snapshot format.
For metric semantics (`Send-Q`, `Recv-Q`, `TX/s`, `RX/s`, `Conntrack Used/Max`, `ct/nat`) and troubleshooting, see docs/USER_METRICS_GUIDE_EN.md (or the Vietnamese version: docs/USER_METRICS_GUIDE_VI.md).
Default threshold file:
# config/health_thresholds.toml
[retrans_percent]
warn = 2.0
crit = 5.0
[drops_per_sec]
warn = 10
crit = 50
[conntrack_percent]
warn = 70
crit = 85
[bandwidth_per_sec]
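# values are bytes/sec: warn = 50 MiB/s, crit = 150 MiB/s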
warn = 52428800
crit = 157286400
[bandwidth_per_snapshot]
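# values are bytes per snapshot interval: warn = 500 MiB, crit = 2 GiB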
warn = 524288000
crit = 2147483648
[retrans_sample]
# Only evaluate retrans health if BOTH conditions are met.
min_established = 20
min_out_segs_per_sec = 60

If you want to use custom thresholds:
# Use custom health strip thresholds
holyf-network --health-config /etc/holyf-network/health_thresholds.toml

Notes:
- If the retrans sample is below thresholds, the UI shows `LOW SAMPLE` and does not trigger retrans WARN/CRIT.
- `bandwidth_per_sec` controls BW column warn/crit coloring (`TX/s`, `RX/s`).
- `bandwidth_per_snapshot` remains used for internal total-delta evaluation/sorting.