VocalFold

System-wide voice-to-text dictation for Windows using AI

☕ Support VocalFold

Enjoying VocalFold? Support the development of this free, open-source project!

☕ Buy me a coffee on Ko-fi — Your support helps keep this project alive and growing!

Overview

VocalFold is a Windows desktop application that transcribes your voice to text locally using Whisper.NET, the .NET bindings for OpenAI's Whisper model. Press a global hotkey, speak, and your words are typed at the cursor position in any application.

Features

🎤 Voice Recording: Activate with global hotkey (Ctrl+Windows by default)
🤖 AI Transcription: Local processing using Whisper.NET (no cloud services)
⚡ GPU Acceleration: Support for NVIDIA (CUDA), AMD (Vulkan), and Intel (Vulkan) GPUs
⌨️ Text Output: Types text at cursor position (works in any app)
🔒 Privacy-First: All processing happens locally, no data leaves your machine
🎯 Background Operation: Runs in system tray, always available
🚀 Windows Startup: Optional auto-start with Windows
🎨 Modern Web UI: Configure settings via a web interface
📝 Keyword Replacement: Create shortcuts for frequently used phrases
🗂️ Category Organization: Organize keywords into collapsible categories

System Requirements

Operating System

Windows 11 (recommended)
Windows 10 (supported)

GPU Support (Optional, CPU fallback available)

VocalFold automatically selects the best available GPU runtime:

NVIDIA GPUs (CUDA):

Supported: RTX 20 series or newer (RTX 2060, 3060, 3080, 4080, etc.)
Requires: NVIDIA CUDA Toolkit 12.x (https://developer.nvidia.com/cuda-downloads)
Performance: Excellent (~0.5s for 5s speech with Base model)
VRAM: 4GB minimum, 8GB+ recommended

AMD GPUs (Vulkan):

Supported: Radeon RX 6000 series or newer (RX 6700 XT, 6800 XT, 7900 XTX, etc.)
Requires: Latest AMD Adrenalin drivers with Vulkan support (https://www.amd.com/en/support)
Performance: Good (~1-2s for 5s speech with Base model)
VRAM: 4GB minimum, 8GB+ recommended
Note: Older GPUs (RX 5000 series) may have slower performance

Intel GPUs (Vulkan):

Supported: Intel Arc series (A750, A770, etc.)
Requires: Latest Intel Graphics drivers with Vulkan support (https://www.intel.com/content/www/us/en/download-center/home.html)
Performance: Moderate (~2-3s for 5s speech with Base model)
VRAM: 8GB+ recommended

No GPU / Unsupported GPU (CPU Fallback):

Performance: Slow (~5-10s for 5s speech with Base model)
Recommended: Use Tiny or Base model for acceptable speed
Works: Fully functional, just slower

Runtime Priority

VocalFold automatically detects and uses the best available runtime:

CUDA (NVIDIA GPUs) - Best performance
Vulkan (AMD/Intel GPUs) - Good performance
CPU (Fallback) - Slowest but functional
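
The fallback order above can be sketched in a few lines. This is an illustration in Python, not VocalFold's actual F# code: the function name `pick_runtime` and the idea of passing in a set of detected runtime names are assumptions made for the example.

```python
# Preference order taken from the list above: CUDA, then Vulkan, then CPU.
RUNTIME_PRIORITY = ["cuda", "vulkan", "cpu"]


def pick_runtime(available):
    """Return the highest-priority runtime found in `available`.

    `available` is a hypothetical collection of runtime names that some
    detection step reported as usable. CPU is always treated as usable,
    so the function never fails to return a runtime.
    """
    usable = set(available) | {"cpu"}
    return next(r for r in RUNTIME_PRIORITY if r in usable)


print(pick_runtime({"vulkan", "cuda"}))  # cuda
print(pick_runtime({"vulkan"}))          # vulkan
print(pick_runtime(set()))               # cpu
```

Keeping the order in a single list means a new runtime could be slotted in without touching the selection logic.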

Other Requirements

.NET 9.0 SDK (for building from source)
RAM: 8GB minimum, 16GB recommended

Installation

Option 1: Download Pre-built Executable (Recommended)

Download VocalFold.exe from the Releases page
Install GPU drivers (see GPU Support section above)
Run VocalFold.exe
First run will download the AI model (~150MB, one-time)
Application runs in system tray

Option 2: Build from Source

# Clone repository
git clone https://github.com/yourusername/VocalFold.git
cd VocalFold

# Build project
dotnet restore
dotnet build

# Run application
dotnet run --project VocalFold

# Or build standalone executable
dotnet publish -c Release -r win-x64 --self-contained true -p:PublishSingleFile=true

Usage

Basic Usage

Launch VocalFold (runs in system tray)
Click in any text field (Notepad, browser, Word, etc.)
Press Ctrl+Windows (or your configured hotkey)
Speak clearly into your microphone
Your words appear as text at the cursor position

Configuring Settings

Right-click the VocalFold tray icon
Click Settings
Configure in the web UI:
- Change global hotkey
- Adjust typing speed
- Manage keyword replacements
- Organize keywords into categories
- Enable/disable voice input

Keyword Replacements

Create shortcuts for frequently used text:

Say "comma" → types ","
Say "period" → types "."
Say "email signature" → types your full email signature
Say "code snippet" → types your code template

Organize keywords into categories for better management.
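
One way such a replacement pass could work is sketched below in Python. It is not VocalFold's implementation: the table contents, the signature text, and the function name `apply_keywords` are placeholders chosen for the example, and the matching rules (whole words only, case-insensitive, longest phrase first) are assumptions.

```python
import re

# Hypothetical keyword table; the spoken phrases come from the examples above.
KEYWORDS = {
    "comma": ",",
    "period": ".",
    "email signature": "Best regards,\nAlex",  # placeholder signature text
}


def apply_keywords(text, keywords):
    """Replace whole-word keyword phrases in `text`, ignoring case.

    Longer phrases are tried first so that a multi-word keyword is not
    shadowed by a shorter keyword it happens to contain.
    """
    if not keywords:
        return text
    phrases = sorted(keywords, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    lookup = {phrase.lower(): repl for phrase, repl in keywords.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)


print(apply_keywords("Hello comma world period", KEYWORDS))  # Hello , world .
```

Note that this sketch leaves the space before each punctuation mark in place; a fuller version would need a rule for tidying that spacing.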

Troubleshooting

GPU Not Detected

Symptoms: Slow transcription (5-10 seconds), console shows "CPU mode"

Solutions:

For NVIDIA Users:

Install NVIDIA CUDA Toolkit 12.x: https://developer.nvidia.com/cuda-downloads
Verify installation: Open command prompt, run nvcc --version
Restart VocalFold

For AMD Users:

Install latest AMD Adrenalin drivers: https://www.amd.com/en/support
Verify Vulkan support:
- Download Vulkan SDK: https://vulkan.lunarg.com/
- Run vulkaninfo to check availability
Ensure GPU is RX 6000 series or newer (RX 5000 series cards work but are slower)
Restart VocalFold

For Intel Users:

Install latest Intel Graphics drivers: https://www.intel.com/content/www/us/en/download-center/home.html
Verify Vulkan support (Arc series required)
Restart VocalFold
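
Before running `nvcc --version` or `vulkaninfo`, it can help to confirm the tools are on your PATH at all. The Python sketch below does only that; the helper name `find_tools` is made up for the example, and a missing tool does not by itself prove that GPU acceleration is unavailable.

```python
import shutil

# Command-line tools named in the steps above, with the package that ships them.
TOOLS = {"nvcc": "NVIDIA CUDA Toolkit", "vulkaninfo": "Vulkan SDK"}


def find_tools(tools, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    `which` is injectable so the lookup can be swapped out when testing.
    """
    return {name: which(name) for name in tools}


for name, path in find_tools(TOOLS).items():
    status = path if path else "not found on PATH"
    print(f"{name} ({TOOLS[name]}): {status}")
```

If a tool is reported as not found right after installing it, opening a new command prompt usually picks up the updated PATH.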

Performance Benchmarks

Use these benchmarks to verify that GPU acceleration is working:

Test method:

Record 5 seconds of clear speech
Use Base model
Measure transcription time (shown in console)

Expected performance:

NVIDIA RTX 3080: <1s
NVIDIA RTX 3060: <1.5s
AMD RX 6800 XT: <1.5s
AMD RX 6700 XT: <2s
AMD RX 5700 XT: <3s
Intel Arc A750: <2.5s
CPU (i7-10700K): 5-8s

If your performance is significantly worse, GPU acceleration may not be working correctly.
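
The comparison can be written down as a small check. The Python sketch below copies the upper bounds from the list above; the function name `looks_gpu_accelerated` and the `slack` factor of 2 are arbitrary choices for the example, standing in for "significantly worse".

```python
# Upper bounds in seconds for 5 s of speech with the Base model, copied from
# the list above. The CPU entry uses the top of its 5-8 s range.
EXPECTED_MAX_SECONDS = {
    "NVIDIA RTX 3080": 1.0,
    "NVIDIA RTX 3060": 1.5,
    "AMD RX 6800 XT": 1.5,
    "AMD RX 6700 XT": 2.0,
    "AMD RX 5700 XT": 3.0,
    "Intel Arc A750": 2.5,
    "CPU (i7-10700K)": 8.0,
}


def looks_gpu_accelerated(device, measured_seconds, slack=2.0):
    """Return True if the measured time is within `slack` times the expected bound.

    `slack` is a tolerance invented for this sketch, not a project setting.
    """
    return measured_seconds <= EXPECTED_MAX_SECONDS[device] * slack


print(looks_gpu_accelerated("NVIDIA RTX 3060", 1.2))  # True
print(looks_gpu_accelerated("NVIDIA RTX 3060", 6.5))  # False
```

A result of 6.5 s on an RTX 3060 falls in the CPU range, which suggests the CUDA runtime was not picked up.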

Other Issues

Hotkey not working: Check for conflicts with other applications
Microphone not detected: Check Windows sound settings
Application crashes: Check logs, report issue on GitHub

Technology Stack

Language: F# (.NET 9.0)
AI Engine: Whisper.NET with CUDA and Vulkan runtimes
Audio: NAudio
Input Simulation: InputSimulatorCore
Web UI: F# Fable + React + TailwindCSS
Web Server: Giraffe + ASP.NET Core

Privacy & Security

✅ All processing happens locally on your machine
✅ No audio data is sent to external services
✅ No recordings stored to disk
✅ No telemetry or tracking
✅ Offline operation (after initial model download)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

[Your License Here]

Acknowledgments

Whisper.NET - .NET bindings for OpenAI's Whisper
NAudio - Audio library for .NET
OpenAI - Original Whisper model

Made with ❤️ using F#

