notasong

Detect whether an audio file is actually a song or just a podcast, interview, lecture, or random YouTube rip. Returns a confidence score so you can gate your pipeline.

Built as a companion to anysong — since anysong can pull from YouTube, not everything that comes back is music.

notasong check ~/music/lollipop_by_lil_wayne.mp3
# => song  confidence=0.94

notasong check ~/downloads/joe_rogan_clip.mp3
# => not_song  confidence=0.12

How It Works

notasong analyzes audio files using ffmpeg/ffprobe and scores them across multiple signals that distinguish music from speech and non-music content.

Detection Signals

| Signal | What it measures | Music vs. speech |
| --- | --- | --- |
| Spectral spread | How energy is distributed across frequencies | Music uses the full spectrum; speech clusters in 300 Hz-3 kHz |
| Beat regularity | Presence of a consistent rhythmic pattern | Songs have steady BPM; podcasts don't |
| Harmonic ratio | Tonal vs. noisy content | Instruments are harmonic; speech is mixed |
| Dynamic range | Loudness variation over time | Produced music is compressed; speech has wide swings |
| Silence ratio | Percentage of near-silent segments | Podcasts have conversational pauses; songs don't |
| Duration | Track length | Short songs are still songs; podcasts/lectures run long |
| Zero-crossing rate | How often the waveform crosses zero | Speech has higher ZCR variability than music |
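To make one of these signals concrete, here is a minimal sketch of how zero-crossing-rate variability could be measured on decoded PCM samples. It is not taken from zcr.go: the function names, the fixed frame size, and the use of standard deviation as the variability measure are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// zeroCrossingRate returns the fraction of adjacent sample pairs whose signs differ.
func zeroCrossingRate(frame []float64) float64 {
	if len(frame) < 2 {
		return 0
	}
	crossings := 0
	for i := 1; i < len(frame); i++ {
		if (frame[i-1] >= 0) != (frame[i] >= 0) {
			crossings++
		}
	}
	return float64(crossings) / float64(len(frame)-1)
}

// zcrVariability splits samples into fixed-size frames and returns the
// standard deviation of the per-frame zero-crossing rates.
// (Hypothetical helper; the real analyzer may measure variability differently.)
func zcrVariability(samples []float64, frameSize int) float64 {
	if frameSize < 2 {
		return 0
	}
	var rates []float64
	for start := 0; start+frameSize <= len(samples); start += frameSize {
		rates = append(rates, zeroCrossingRate(samples[start:start+frameSize]))
	}
	if len(rates) == 0 {
		return 0
	}
	mean := 0.0
	for _, r := range rates {
		mean += r
	}
	mean /= float64(len(rates))
	variance := 0.0
	for _, r := range rates {
		d := r - mean
		variance += d * d
	}
	return math.Sqrt(variance / float64(len(rates)))
}

func main() {
	// Every frame of this signal alternates sign, so the ZCR never changes.
	steady := []float64{1, -1, 1, -1, 1, -1, 1, -1}
	// This signal alternates in the first frame and holds its sign in the second.
	mixed := []float64{1, -1, 1, -1, 1, 1, 1, 1}
	fmt.Println(zcrVariability(steady, 4), zcrVariability(mixed, 4))
}
```

A higher value means the zero-crossing rate changes more from frame to frame, which the table above associates with speech.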

Each signal produces a sub-score between 0.0 and 1.0. The final confidence is a weighted average. A file scoring >= 0.80 is classified as a song.
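The weighted average can be sketched as follows. The weights are copied from the Scoring Weights section; the function names and the sub-score values in main are made up for illustration, and the actual code in analyzer.go may be organized differently.

```go
package main

import "fmt"

// weight pairs a signal name with its share of the final score.
type weight struct {
	name  string
	value float64
}

// Weights as listed in the Scoring Weights section (they sum to 1.0).
var weights = []weight{
	{"spectral_spread", 0.20},
	{"beat_regularity", 0.25},
	{"harmonic_ratio", 0.20},
	{"dynamic_range", 0.10},
	{"silence_ratio", 0.10},
	{"duration", 0.10},
	{"zcr", 0.05},
}

// confidence returns the weighted average of the sub-scores.
// A signal missing from the map counts as 0.0 (an assumption of this sketch).
func confidence(subScores map[string]float64) float64 {
	var sum, total float64
	for _, w := range weights {
		sum += w.value * subScores[w.name]
		total += w.value
	}
	return sum / total
}

// classify applies the threshold: scores at or above it are songs.
func classify(conf, threshold float64) string {
	if conf >= threshold {
		return "song"
	}
	return "not_song"
}

func main() {
	// Hypothetical sub-scores, each between 0.0 and 1.0.
	subScores := map[string]float64{
		"spectral_spread": 0.90,
		"beat_regularity": 0.95,
		"harmonic_ratio":  0.90,
		"dynamic_range":   0.80,
		"silence_ratio":   1.00,
		"duration":        1.00,
		"zcr":             0.70,
	}
	c := confidence(subScores)
	fmt.Printf("%s  confidence=%.2f\n", classify(c, 0.80), c)
}
```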

Architecture

notasong/
  cmd/           # CLI (cobra)
    root.go
    check.go     # `notasong check <file>` command
    batch.go     # `notasong batch <dir>` command
  analyzer/
    analyzer.go  # Orchestrates all detectors, produces final score
    spectral.go  # Spectral spread + harmonic ratio via ffprobe
    rhythm.go    # Beat detection via onset analysis
    dynamics.go  # Dynamic range + silence ratio
    zcr.go       # Zero-crossing rate analysis
    duration.go  # Duration heuristic
    types.go     # Report struct, signal weights, thresholds
  ffutil/
    probe.go     # Wrapper around ffprobe JSON output
    extract.go   # Raw PCM / stats extraction via ffmpeg
  main.go
  go.mod
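As a rough idea of what a wrapper around ffprobe's JSON output might look like, the sketch below reads a file's duration. It is not the contents of probe.go: the type and function names are hypothetical, and only the `format.duration` field of ffprobe's output is used.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// probeOutput mirrors only the part of ffprobe's JSON output used here.
type probeOutput struct {
	Format struct {
		Duration string `json:"duration"` // ffprobe reports seconds as a string
	} `json:"format"`
}

// parseDuration extracts the container duration in seconds from ffprobe JSON.
func parseDuration(raw []byte) (float64, error) {
	var out probeOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return 0, fmt.Errorf("decode ffprobe output: %w", err)
	}
	if out.Format.Duration == "" {
		return 0, fmt.Errorf("ffprobe output has no format.duration")
	}
	return strconv.ParseFloat(out.Format.Duration, 64)
}

// probeDuration runs ffprobe on path and returns the duration in seconds.
func probeDuration(path string) (float64, error) {
	cmd := exec.Command("ffprobe", "-v", "error", "-show_format", "-of", "json", path)
	raw, err := cmd.Output()
	if err != nil {
		return 0, fmt.Errorf("run ffprobe: %w", err)
	}
	return parseDuration(raw)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: probe <audio-file>")
		return
	}
	seconds, err := probeDuration(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("duration: %.1fs\n", seconds)
}
```

Keeping the parsing separate from the process call makes the JSON handling testable without ffprobe installed.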

Install

From Source (requires Go 1.22+)

git clone https://github.com/damoahdominic/notasong.git
cd notasong
go build -o notasong .
sudo mv notasong /usr/local/bin/

Prerequisites

  • ffmpeg + ffprobe: apt install ffmpeg or brew install ffmpeg

No Python. No ML models. Just ffmpeg and a single Go binary.

Usage

Check a single file

notasong check song.mp3
# song  confidence=0.91  file=song.mp3

notasong check --json song.mp3
# {"file":"song.mp3","classification":"song","confidence":0.91,"signals":{...}}
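If you consume the `--json` output from another Go program, a decoder along these lines may be enough. The struct is an assumption based only on the fields visible in the example above; the shape of `signals` is not shown, so this sketch keeps it as raw JSON rather than guessing.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// result holds the fields visible in the --json example.
type result struct {
	File           string          `json:"file"`
	Classification string          `json:"classification"`
	Confidence     float64         `json:"confidence"`
	Signals        json.RawMessage `json:"signals"` // shape not documented here; kept raw
}

// parseResult decodes one line of `notasong check --json` output.
func parseResult(line []byte) (result, error) {
	var r result
	if err := json.Unmarshal(line, &r); err != nil {
		return result{}, fmt.Errorf("decode notasong output: %w", err)
	}
	return r, nil
}

func main() {
	// An empty object stands in for the elided signals payload.
	line := []byte(`{"file":"song.mp3","classification":"song","confidence":0.91,"signals":{}}`)
	r, err := parseResult(line)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s  confidence=%.2f  file=%s\n", r.Classification, r.Confidence, r.File)
}
```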

Check a directory

notasong batch ~/music/
# song      0.94  lollipop_by_lil_wayne.mp3
# song      0.89  wild_thoughts_by_rihanna.mp3
# not_song  0.23  random_yt_clip.mp3

notasong batch ~/music/ --threshold 0.80 --fail
# exits non-zero if any file scores below threshold
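The `--fail` gate can be pictured as the logic below. This is a sketch, not the code in batch.go: the type and function names are hypothetical, and the exit status of 1 is an assumption (the README only promises a non-zero exit).

```go
package main

import "fmt"

// scored is one file's result from a batch run.
type scored struct {
	File       string
	Confidence float64
}

// belowThreshold returns the files whose confidence is strictly below threshold.
func belowThreshold(results []scored, threshold float64) []scored {
	var failed []scored
	for _, r := range results {
		if r.Confidence < threshold {
			failed = append(failed, r)
		}
	}
	return failed
}

// exitCode maps the failures to a process exit status: 0 if none, 1 otherwise.
func exitCode(failed []scored) int {
	if len(failed) > 0 {
		return 1
	}
	return 0
}

func main() {
	// Scores taken from the batch example above.
	results := []scored{
		{"lollipop_by_lil_wayne.mp3", 0.94},
		{"wild_thoughts_by_rihanna.mp3", 0.89},
		{"random_yt_clip.mp3", 0.23},
	}
	failed := belowThreshold(results, 0.80)
	for _, f := range failed {
		fmt.Printf("below threshold: %s (%.2f)\n", f.File, f.Confidence)
	}
	fmt.Println("would exit with status", exitCode(failed))
}
```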

Use as a Go library

import (
    "fmt"
    "log"

    "github.com/damoahdominic/notasong/analyzer"
)

report, err := analyzer.Check("track.mp3")
if err != nil {
    log.Fatal(err)
}

if report.IsSong(0.80) {
    fmt.Println("it's a song")
}

Integration with anysong

The end goal is for anysong to call notasong after each download to verify the result:

anysong download "Lil Wayne Lollipop"
# downloads → ~/music/lollipop_by_lil_wayne.mp3
# notasong check → 0.94 → keep

anysong download "some weird query"
# downloads → ~/music/some_weird_query.mp3
# notasong check → 0.35 → warn/skip/delete
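One way a downloader could wire this in is to shell out to the CLI after each download. The sketch below is not anysong code: the `verdict` type and both function names are hypothetical, and it assumes `notasong check --json` exits zero and writes a single JSON object with a `confidence` field to stdout, as in the usage example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// verdict is a hypothetical post-download decision.
type verdict string

const (
	keep   verdict = "keep"
	reject verdict = "reject" // the caller decides whether to warn, skip, or delete
)

// decide keeps files that score at or above the threshold.
func decide(confidence, threshold float64) verdict {
	if confidence >= threshold {
		return keep
	}
	return reject
}

// checkFile runs `notasong check --json` on path and returns the confidence.
func checkFile(path string) (float64, error) {
	raw, err := exec.Command("notasong", "check", "--json", path).Output()
	if err != nil {
		return 0, fmt.Errorf("run notasong: %w", err)
	}
	var out struct {
		Confidence float64 `json:"confidence"`
	}
	if err := json.Unmarshal(raw, &out); err != nil {
		return 0, fmt.Errorf("decode notasong output: %w", err)
	}
	return out.Confidence, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: verify <downloaded-file>")
		return
	}
	conf, err := checkFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("notasong check → %.2f → %s\n", conf, decide(conf, 0.80))
}
```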

Test Results

Tested against a library of real audio files spanning hip-hop, pop, jazz, electronic, and ambient genres — both full-length tracks and 30-second clips.

Full-length songs

| Artist | Track | Duration | Score | Result |
| --- | --- | --- | --- | --- |
| MF DOOM | Absolutely | 2:43 | 0.88 | song |
| Sarkodie | CEO Flow | 4:13 | 0.88 | song |
| Meduza | Don't Wanna Go Home | 2:38 | 0.88 | song |
| Migos | Hannah Montana | 3:33 | 0.88 | song |
| Lil Wayne | Love Me | 4:13 | 0.88 | song |
| Static Garden | Nightfall | 4:10 | 0.88 | song |
| George Benson | Breezin' | 5:42 | 0.81 | song |
| Midnight Mind | Sibilance | 3:32 | 0.80 | song |

8/8 full-length songs correctly classified (100%)

30-second clips (short songs are still songs)

A 30-second clip of music is still music. Duration alone doesn't decide what's a song; the audio content does. Only recordings of people talking (podcasts, interviews, lectures) should be classified as not_song.

| Artist | Track | Duration | Result |
| --- | --- | --- | --- |
| Michael Jackson | Black or White | 0:30 | song |
| Drake | Hotline Bling | 0:30 | song |
| 50 Cent | Just a Lil Bit | 0:30 | song |
| 2Pac | Life Goes On | 0:30 | song |
| Lil Wayne | Lollipop | 0:30 | song |
| Lil Wayne | Mrs. Officer | 0:30 | song |
| Robyn | Sucker for Love | 0:30 | song |
| Stat Quo | Billion Bucks | 0:30 | song |
| Jay-Z | Renegade | 0:30 | song |
| George Benson | People Get Ready | 0:30 | song |
| Michael Jackson | Heal the World | 0:30 | song |

Key takeaways

  • Short songs are songs. A 30-second clip of Lollipop is still Lollipop.
  • Jazz (George Benson) scores slightly lower due to wider dynamic range and less compressed mastering — but still passes.
  • At the 0.80 default threshold, every tested track was classified as a song, across all tested genres and durations.

Scoring Weights

spectral_spread   0.20
beat_regularity   0.25
harmonic_ratio    0.20
dynamic_range     0.10
silence_ratio     0.10
duration          0.10
zcr               0.05

Beat regularity gets the highest weight — it's the strongest single indicator that something is a produced song vs spoken word.

License

MIT
