Starred repositories
Open-source meeting transcription API for Google Meet, Microsoft Teams & Zoom. Auto-join bots, real-time WebSocket transcripts, MCP server for AI agents. Self-host or use hosted SaaS.
A native liquid glass view for iOS, with fallbacks for older versions and Android.
YABS - a simple bash script to estimate Linux server performance using fio, iperf3, & Geekbench
Fast, accurate & comprehensive text measurement & layout
Code for openai.fm, a demo for the OpenAI Speech API
SOTA industrial-grade voice activity detection & audio event detection, supporting 100+ languages, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD
Generates an image from a DOM node using HTML5 canvas
Offline voice input app for macOS on Apple Silicon — powered by MLX-Audio (Whisper/Qwen3-ASR)
Capture system loopback audio on macOS 12.3+, Windows and Linux
Build ultra fast, tiny, and cross-platform desktop apps with TypeScript.
The Swiss Army knife of lossless video/audio editing
C inference for Qwen3-ASR 0.6B and 1.7B transcription models
A fast and soft pattern search for trillion-scale corpora.
Offline streaming speech-to-text in the browser
A Streaming-Native Serving Engine for TTS/STS Models
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Real-time software for turn-taking, backchannel, and head-nodding prediction
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
[TACL'26] VoiceBench: Benchmarking LLM-Based Voice Assistants
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
🎙️ AI Dictation App - Open Source and Local-first ⚡ Type 3x faster, no keyboard needed. 🆓 Powered by open source models, works offline, fast and accurate.
Chrome extension that analyzes tweets on the X timeline based on the X algorithm weights
Massive open Japanese speech corpus