edgetts

Documentation

English guide: README.md
Chinese guide: README.zh-CN.md
API reference: https://pkg.go.dev/github.com/lib-x/edgetts
Releases: https://github.com/lib-x/edgetts/releases

A Go library for Microsoft Edge TTS with a simpler API for common use cases.

Highlights

Client-based API for reusable configuration.
Package-level convenience functions for one-off calls.
Text and SSML are first-class, symmetric inputs.
Output to []byte, file, io.Writer, stream, directory, and ZIP.
Voice listing and filtering helpers.
Legacy Speech API kept as a deprecated compatibility layer.

Install

go get github.com/lib-x/edgetts

Quick start

Save text to mp3

package main

import (
    "context"

    "github.com/lib-x/edgetts"
)

func main() {
    err := edgetts.Save(
        context.Background(),
        "Hello, world.",
        "hello.mp3",
        edgetts.WithVoice("en-US-GuyNeural"),
    )
    if err != nil {
        panic(err)
    }
}

Reuse a client

client := edgetts.New(
    edgetts.WithVoice("en-US-GuyNeural"),
    edgetts.WithRate("+10%"),
)

data, err := client.Bytes(context.Background(), "This is a reusable client example.")

Runnable demo

A runnable demo is included in this repository:

go run ./cmd/demo -text "hello world" -voice en-US-GuyNeural -output hello.mp3

Write to a file through streaming output:

go run ./cmd/demo -text "hello world" -voice en-US-GuyNeural -output hello.mp3 -stream

Use SSML input:

go run ./cmd/demo -type ssml -text '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><voice name="en-US-GuyNeural"><prosody rate="+10%">hello world</prosody></voice></speak>' -output hello.mp3

If -output is omitted, the demo generates audio in memory and prints the byte size.

Package-level convenience API

Best for one-off calls.

Text to bytes

data, err := edgetts.Bytes(
    ctx,
    "hello world",
    edgetts.WithVoice("en-US-GuyNeural"),
)

SSML to bytes

ssml := `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><voice name="en-US-GuyNeural"><prosody rate="+10%">hello world</prosody></voice></speak>`
data, err := edgetts.BytesSSML(ctx, ssml)
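
When the spoken text comes from user input, characters such as `&` and `<` would break a hand-written SSML string. The sketch below shows one way to build the same `<speak>`/`<voice>`/`<prosody>` envelope with the text XML-escaped, using only the standard library. `buildSSML` is a hypothetical helper and not part of edgetts; the resulting string could then be passed to `edgetts.BytesSSML`.

```go
package main

import (
    "bytes"
    "encoding/xml"
    "fmt"
)

// buildSSML is a hypothetical helper (not part of edgetts). It wraps text in
// the same <speak>/<voice>/<prosody> envelope used in the examples above and
// XML-escapes the text so characters such as & and < cannot break the markup.
// The voice and rate values are inserted as-is, so they must be trusted input.
func buildSSML(voice, rate, text string) (string, error) {
    var escaped bytes.Buffer
    if err := xml.EscapeText(&escaped, []byte(text)); err != nil {
        return "", err
    }
    return fmt.Sprintf(
        `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">`+
            `<voice name="%s"><prosody rate="%s">%s</prosody></voice></speak>`,
        voice, rate, escaped.String(),
    ), nil
}

func main() {
    ssml, err := buildSSML("en-US-GuyNeural", "+10%", "Tom & Jerry")
    if err != nil {
        panic(err)
    }
    fmt.Println(ssml)
}
```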

Text directly to file

err := edgetts.Save(ctx, "hello world", "hello.mp3", edgetts.WithVoice("en-US-GuyNeural"))

SSML directly to file

err := edgetts.SaveSSML(ctx, ssml, "hello.mp3")

Client API

Best for reusable defaults, service-side usage, and batch workflows.

Create a reusable client

client := edgetts.New(
    edgetts.WithVoice("en-US-GuyNeural"),
    edgetts.WithRate("+15%"),
)

Text / SSML with explicit request objects

textReq := edgetts.Text("hello world", edgetts.WithVoice("en-US-GuyNeural"))
textData, err := client.Do(ctx, textReq)

ssmlReq := edgetts.SSML(`<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><voice name="en-US-GuyNeural"><prosody pitch="+5Hz">hello world</prosody></voice></speak>`)
ssmlData, err := client.Do(ctx, ssmlReq)

_ = textData
_ = ssmlData

Output shapes

Write text to an `io.Writer`

var buf bytes.Buffer
_, err := client.WriteTo(ctx, "hello world", &buf)

Write SSML to an `io.Writer`

var buf bytes.Buffer
_, err := client.WriteSSMLTo(ctx, ssml, &buf)

Stream text audio

stream, err := client.Stream(ctx, "hello world")
if err != nil {
    return err
}
defer stream.Close()

_, err = io.Copy(w, stream)

Stream SSML audio

stream, err := client.StreamSSML(ctx, ssml)
if err != nil {
    return err
}
defer stream.Close()

_, err = io.Copy(w, stream)

Stream directly in an HTTP handler

client := edgetts.New(edgetts.WithVoice("en-US-GuyNeural"))

http.HandleFunc("/tts", func(w http.ResponseWriter, r *http.Request) {
    stream, err := client.Stream(r.Context(), "hello from streaming tts")
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    defer stream.Close()

    w.Header().Set("Content-Type", "audio/mpeg")
    _, _ = io.Copy(w, stream)
})
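
The handler above speaks a fixed string. A service would more likely take the text from the request, so the sketch below adds a query parameter and a length check before streaming. It uses only the standard library: `streamFunc` is a hypothetical stand-in for a call shaped like `client.Stream`, and the `text` parameter name and the 500-byte limit are arbitrary choices for illustration.

```go
package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "net/url"
    "strings"
)

// streamFunc stands in for a call shaped like client.Stream(ctx, text).
// It is a hypothetical type, not part of edgetts.
type streamFunc func(ctx context.Context, text string) (io.ReadCloser, error)

// maxTextLen is an arbitrary illustrative limit on the request text, in bytes.
const maxTextLen = 500

// ttsHandler validates the "text" query parameter, opens a stream for it,
// and copies the audio to the response as it arrives.
func ttsHandler(open streamFunc) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        text := strings.TrimSpace(r.URL.Query().Get("text"))
        if text == "" || len(text) > maxTextLen {
            msg := fmt.Sprintf("text must be 1 to %d bytes", maxTextLen)
            http.Error(w, msg, http.StatusBadRequest)
            return
        }
        stream, err := open(r.Context(), text)
        if err != nil {
            http.Error(w, err.Error(), http.StatusBadGateway)
            return
        }
        defer stream.Close()

        w.Header().Set("Content-Type", "audio/mpeg")
        _, _ = io.Copy(w, stream)
    }
}

// tryRequest runs the handler in-process against a fake stream, so the
// sketch can be exercised without a network connection.
func tryRequest(text string) (int, string) {
    fake := func(_ context.Context, text string) (io.ReadCloser, error) {
        return io.NopCloser(strings.NewReader("audio:" + text)), nil
    }
    req := httptest.NewRequest(http.MethodGet, "/tts?text="+url.QueryEscape(text), nil)
    rec := httptest.NewRecorder()
    ttsHandler(fake)(rec, req)
    return rec.Code, rec.Body.String()
}

func main() {
    fmt.Println(tryRequest("hello"))
    fmt.Println(tryRequest(""))
}
```

Passing `r.Context()` through means the upstream call can be abandoned when the HTTP client disconnects, provided the underlying stream honors context cancellation.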

Save SSML directly to file

err := client.SaveSSML(ctx, ssml, "speech.mp3")

Batch

Save batch into a directory

results, err := client.SaveBatch(ctx, "out", []edgetts.BatchItem{
    {Name: "a.mp3", Request: edgetts.Text("hello", edgetts.WithVoice("en-US-GuyNeural"))},
    {Name: "b.mp3", Request: edgetts.Text("welcome", edgetts.WithVoice("en-US-JennyNeural"))},
})

Each BatchResult contains:

Name
Bytes
N
Err
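
Because each result carries its own `Err`, a caller can report per-item failures separately from the overall call. The sketch below shows one way to do that. It defines a local `batchResult` type that mirrors the field names listed above; the field types (`int64` for `N`, `error` for `Err`) are assumptions, the `Bytes` field is omitted, and `summarize` is a hypothetical helper. It requires Go 1.20 or later for `errors.Join`.

```go
package main

import (
    "errors"
    "fmt"
)

// batchResult mirrors the field names documented for BatchResult.
// The field types here are assumptions made for this sketch.
type batchResult struct {
    Name string
    N    int64
    Err  error
}

// errDemo is a placeholder failure used by the example in main.
var errDemo = errors.New("synthesis failed")

// summarize returns the total byte count of the successful items and a single
// joined error naming every failed item. The error is nil if nothing failed.
func summarize(results []batchResult) (int64, error) {
    var written int64
    var errs []error
    for _, r := range results {
        if r.Err != nil {
            errs = append(errs, fmt.Errorf("%s: %w", r.Name, r.Err))
            continue
        }
        written += r.N
    }
    return written, errors.Join(errs...)
}

func main() {
    written, err := summarize([]batchResult{
        {Name: "a.mp3", N: 1024},
        {Name: "b.mp3", Err: errDemo},
    })
    fmt.Println(written, err)
}
```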

Write batch into a zip file

f, err := os.Create("tts.zip")
if err != nil {
    return err
}
defer f.Close()

err = client.WriteZIP(ctx, f, []edgetts.BatchItem{
    {Name: "a.mp3", Request: edgetts.Text("hello", edgetts.WithVoice("en-US-GuyNeural"))},
    {Name: "b.mp3", Request: edgetts.SSML(ssml)},
}, map[string]any{"source": "demo"})
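
To check what ended up in the archive, the standard `archive/zip` package can read it back. The sketch below lists entry names and sizes. It builds a small stand-in archive in memory rather than calling `WriteZIP`, and both `listZIP` and `sampleZIP` are hypothetical helpers. How edgetts stores the metadata map is not covered here, so a real archive may contain entries beyond the audio files.

```go
package main

import (
    "archive/zip"
    "bytes"
    "fmt"
)

// listZIP returns the name and uncompressed size of every entry in a ZIP
// archive held in memory.
func listZIP(data []byte) (map[string]uint64, error) {
    zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
    if err != nil {
        return nil, err
    }
    sizes := make(map[string]uint64, len(zr.File))
    for _, f := range zr.File {
        sizes[f.Name] = f.UncompressedSize64
    }
    return sizes, nil
}

// sampleZIP builds a small stand-in archive so the sketch runs without
// performing any synthesis.
func sampleZIP() ([]byte, error) {
    var buf bytes.Buffer
    zw := zip.NewWriter(&buf)
    for name, body := range map[string]string{"a.mp3": "aaaa", "b.mp3": "bb"} {
        w, err := zw.Create(name)
        if err != nil {
            return nil, err
        }
        if _, err := w.Write([]byte(body)); err != nil {
            return nil, err
        }
    }
    if err := zw.Close(); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}

func main() {
    data, err := sampleZIP()
    if err != nil {
        panic(err)
    }
    sizes, err := listZIP(data)
    if err != nil {
        panic(err)
    }
    fmt.Println(len(sizes), "entries")
}
```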

Voices

List voices

voices, err := client.Voices(ctx)

Filter voices

matches := edgetts.FilterVoices(voices, edgetts.VoiceFilter{
    Locale: "en-US",
    Gender: "Female",
})

Find the first matching voice

voice, err := client.FindVoice(ctx, edgetts.VoiceFilter{
    ShortName: "en-US-GuyNeural",
})

Runnable demo flags

go run ./cmd/demo -h

Main flags:

-type (text or ssml)
-text
-output
-voice
-rate
-pitch
-volume
-stream

Migration guide

The old Speech API still works, but it is no longer the recommended entry point.

| Old usage | New usage |
| --- | --- |
| `NewSpeech(opts...)` | `client := edgetts.New(opts...)` |
| `speech.AddSingleTask(text, w); speech.StartTasks()` | `client.WriteTo(ctx, text, w)` |
| `speech.AddSingleTask(text, file); speech.StartTasks()` | `client.Save(ctx, text, path)` |
| `speech.GetVoiceList()` | `client.Voices(ctx)` |
| `AddPackTask(...)` | `client.SaveBatch(...)` or `client.WriteZIP(...)` |
| Text tasks with per-call options | `client.Do(ctx, edgetts.Text(...))` |
| SSML advanced flows | `client.Do(ctx, edgetts.SSML(...))` or `client.StreamSSML(...)` |

Migration example

Old:

speech, err := edgetts.NewSpeech(edgetts.WithVoice("en-US-GuyNeural"))
if err != nil {
    panic(err)
}

file, err := os.Create("hello.mp3")
if err != nil {
    panic(err)
}
defer file.Close()

if err := speech.AddSingleTask("hello world", file); err != nil {
    panic(err)
}
if err := speech.StartTasks(); err != nil {
    panic(err)
}

New:

client := edgetts.New(edgetts.WithVoice("en-US-GuyNeural"))
if err := client.Save(context.Background(), "hello world", "hello.mp3"); err != nil {
    panic(err)
}

Legacy compatibility

The old Speech task API still exists as a compatibility wrapper, but new code should prefer Client and the package-level helpers.

Notes

Speech is still available for compatibility, but new integrations should use Client.
Synthesis calls the upstream Edge TTS endpoint over the network, so results depend on that endpoint's availability and behavior.
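
Since synthesis depends on a remote endpoint, callers may want a per-attempt timeout and a few retries around it. The sketch below shows one possible wrapper using only the standard library. `synthesizeFunc` is a hypothetical stand-in for any call that produces audio (for example a closure around `client.Bytes`), and the attempt count, timeout, and pause values are arbitrary.

```go
package main

import (
    "context"
    "errors"
    "fmt"
    "time"
)

// synthesizeFunc stands in for any call that produces audio.
// It is a hypothetical type, not part of edgetts.
type synthesizeFunc func(ctx context.Context) ([]byte, error)

// withRetry calls fn up to attempts times. Each attempt gets its own timeout,
// and the pause between attempts doubles each time. It stops early if ctx is
// cancelled while waiting.
func withRetry(ctx context.Context, attempts int, perTry, pause time.Duration, fn synthesizeFunc) ([]byte, error) {
    if attempts < 1 {
        attempts = 1
    }
    var lastErr error
    for i := 0; i < attempts; i++ {
        tryCtx, cancel := context.WithTimeout(ctx, perTry)
        data, err := fn(tryCtx)
        cancel()
        if err == nil {
            return data, nil
        }
        lastErr = err
        if i == attempts-1 {
            break // no point in sleeping after the final attempt
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-time.After(pause):
            pause *= 2
        }
    }
    return nil, fmt.Errorf("synthesis failed after %d attempts: %w", attempts, lastErr)
}

// demoRetry runs withRetry against a fake synthesizer that fails twice and
// then succeeds. It returns the audio length, the call count, and the error.
func demoRetry() (int, int, error) {
    calls := 0
    fake := func(ctx context.Context) ([]byte, error) {
        calls++
        if calls < 3 {
            return nil, errors.New("temporary upstream error")
        }
        return []byte("audio"), nil
    }
    data, err := withRetry(context.Background(), 5, time.Second, time.Millisecond, fake)
    return len(data), calls, err
}

func main() {
    fmt.Println(demoRetry())
}
```

Whether a retry is appropriate depends on the failure: this sketch retries on any error, which is a simplification.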
