TTS Provider Setup

Step-by-step guides for setting up API keys for each text-to-speech provider.

Audiobook Studio supports three TTS providers: ElevenLabs, OpenAI TTS, and Google Cloud TTS. You only need to set up the one you plan to use. Each project is locked to the provider that was active when it was created.

Your TTS provider is chosen per-project at creation time based on your default in Settings. To switch providers for future projects, change your default before creating a new project.

ElevenLabs

ElevenLabs offers high-quality, expressive voices with a large library of community and cloned voices. It is the default provider.

Getting Your API Key

Create an ElevenLabs account

Go to elevenlabs.io and sign up for a free or paid account.

Open your API key settings

Once logged in, click your profile icon in the bottom-left corner and select Profile + API key.

Create a new API key

Click + Create API Key. Give it a name like "Audiobook Studio". When prompted for permissions, enable:

Text to Speech — required for generating audio
Voices Read — required for listing and previewing voices
User Read — allows the app to show your usage and character quota

Paste into Settings

Copy the key and paste it into the ElevenLabs API Key field in Audiobook Studio's Settings (gear icon in the top bar).

ElevenLabs has a free tier with limited characters per month. Paid plans offer more characters and access to premium voices.
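If you want to confirm the key works before pasting it into Settings, a quick check outside the app can help. The sketch below is not part of Audiobook Studio. It assumes the public ElevenLabs REST endpoints `/v1/voices` and `/v1/user/subscription` and the `xi-api-key` header, and the helper names are made up for illustration.

```python
import json
import urllib.request

ELEVENLABS_BASE = "https://api.elevenlabs.io/v1"


def build_elevenlabs_request(api_key: str, path: str) -> urllib.request.Request:
    """Build a GET request authenticated with the xi-api-key header."""
    return urllib.request.Request(
        f"{ELEVENLABS_BASE}{path}",
        headers={"xi-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def remaining_characters(subscription: dict) -> int:
    """Characters left this period, assuming the payload carries
    character_limit and character_count fields."""
    return subscription["character_limit"] - subscription["character_count"]


# Example usage (performs real network calls, so it is left commented out):
# voices = fetch_json(build_elevenlabs_request("YOUR_KEY", "/voices"))
# quota = fetch_json(build_elevenlabs_request("YOUR_KEY", "/user/subscription"))
# print(len(voices["voices"]), "voices,", remaining_characters(quota), "characters left")
```

A key without the Voices Read or User Read permission would be expected to fail on the matching call, which makes this a simple way to see which permission is missing.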

OpenAI TTS

OpenAI offers simple, high-quality voices. The standard models (tts-1 and tts-1-hd) include 9 voices, while gpt-4o-mini-tts adds 4 extra voices for a total of 13.

Getting Your API Key

Create an OpenAI account

Go to platform.openai.com and sign up or log in.

Add billing credits

Navigate to Settings → Billing and add credits. OpenAI's TTS API is pay-per-use — you are charged based on the number of characters processed.

Create an API key

Go to API keys (in the left sidebar) and click + Create new secret key. Give it a name like "Audiobook Studio". No special permissions are needed — the default "All" scope works fine.

Paste into Settings

Copy the key and paste it into the OpenAI TTS API Key field in Audiobook Studio's Settings. You can also choose which model to use (tts-1 for faster generation, tts-1-hd for higher quality).

An OpenAI API key is shown only once, when it is created. If you lose it, you'll need to create a new one.
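To check that a new key and your billing credits are working, you can send one short request yourself. This is a minimal sketch, assuming the public OpenAI `/v1/audio/speech` endpoint with Bearer authentication. The function names, the sample voice `alloy`, and the output path are illustrative choices, not anything Audiobook Studio requires.

```python
import json
import urllib.request

OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key: str, text: str,
                         model: str = "tts-1",
                         voice: str = "alloy") -> urllib.request.Request:
    """Build a POST request asking the speech endpoint to narrate `text`."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(request: urllib.request.Request, out_path: str) -> int:
    """Send the request and write the returned audio bytes to out_path.

    Returns the number of bytes written. Raises HTTPError if the key is
    rejected or the account has no credits.
    """
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)


# Example usage (performs a real, billable request, so it is commented out):
# save_speech(build_speech_request("YOUR_KEY", "Testing one two three."), "test.mp3")
```

Swapping `model="tts-1"` for `model="tts-1-hd"` is a cheap way to compare the two quality settings on the same sentence before choosing one in Settings.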

Google Cloud TTS

Google Cloud Text-to-Speech offers a wide variety of voices including Standard, WaveNet, Neural2, Journey, Studio, and the latest Chirp 3 HD voices. Setup requires a few more steps than the other providers because you need to enable the API in Google Cloud Console.

Getting Your API Key

Create a Google Cloud account

Go to console.cloud.google.com and sign up or log in with your Google account. New accounts get $300 in free credits.

Create a project (or select an existing one)

In the top bar, click the project dropdown and select New Project. Give it a name like "Audiobook Studio" and click Create. Make sure this project is selected in the top bar for the following steps.

Enable the Text-to-Speech API

This is the most important step. Go to APIs & Services → Library, search for "Cloud Text-to-Speech API", and click Enable. Without this, you'll get a "permission denied" error.

Create an API key

Go to APIs & Services → Credentials, click + Create Credentials → API key. A new key will be generated.

(Recommended) Restrict the API key

Click on the newly created key to edit it. Under API restrictions, select Restrict key and choose only Cloud Text-to-Speech API. This limits what the key can access for better security.

Paste into Settings

Copy the key and paste it into the Google Cloud TTS API Key field in Audiobook Studio's Settings.

You must enable the Cloud Text-to-Speech API (the "Enable the Text-to-Speech API" step above) before the key will work. If you skip this step, you'll see a "permission denied" error when loading voices. After you enable the API, it may take a minute or two to take effect.

Google Cloud offers a free tier for Text-to-Speech: up to 4 million characters per month for Standard voices and up to 1 million characters per month for WaveNet/Neural2 voices. Check cloud.google.com/text-to-speech/pricing for current pricing and limits.
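To get a feel for whether a manuscript fits inside the free tier, the arithmetic is simple enough to script. This sketch uses only the two monthly allowances quoted above. The tier labels and the function name are made up for illustration, and it does not model prices, so treat the result as a character count, not a bill.

```python
# Monthly free allowances quoted in this guide; confirm them on the pricing page.
FREE_CHARACTERS_PER_MONTH = {
    "standard": 4_000_000,
    "wavenet_neural2": 1_000_000,
}


def billable_characters(manuscript_chars: int, voice_tier: str,
                        already_used: int = 0) -> int:
    """Characters that would fall outside this month's free allowance.

    already_used is how many characters of this tier you have already
    synthesized this month (an input you supply, not something looked up).
    """
    allowance = FREE_CHARACTERS_PER_MONTH[voice_tier]
    remaining_free = max(0, allowance - already_used)
    return max(0, manuscript_chars - remaining_free)


# A 1.5 million character manuscript:
print(billable_characters(1_500_000, "standard"))         # 0
print(billable_characters(1_500_000, "wavenet_neural2"))  # 500000
```

Regenerating chapters counts again toward the allowance, so passing a realistic `already_used` value gives a more honest estimate for revisions.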

Choosing a Provider

Not sure which provider to use? Here's a quick comparison:

ElevenLabs: Best voice quality and variety. Large library of community voices. Supports voice cloning. Most expressive for fiction and audiobooks.
OpenAI TTS: Simple setup, consistent quality. Fewer voices, but all are high quality. Good for straightforward narration.
Google Cloud TTS: Widest selection of voice types (Standard, WaveNet, Neural2, Journey, Studio, Chirp 3 HD). Generous free tier. Requires a few extra setup steps.

API Key Security

Treat your API keys like passwords. Never share them publicly or post them online. Anyone with your key can generate audio and consume your credits. Your keys are stored securely in your account and are never shared with other users.

Create a dedicated key for Audiobook Studio so you can revoke it independently if needed.
Use key restrictions where available (ElevenLabs permissions, Google API restrictions, OpenAI key permissions) to limit what the key can do.
Monitor your usage and billing in your provider's dashboard to catch unexpected charges early.
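If you also use these keys in your own scripts, the same advice applies there: keep keys out of source files and out of logs. Below is a minimal sketch of one common approach, reading keys from environment variables and masking them before printing. The variable names `ELEVENLABS_API_KEY`, `OPENAI_API_KEY`, and `GOOGLE_TTS_API_KEY` are assumptions chosen for this example, not names that Audiobook Studio reads.

```python
import os

# Hypothetical environment variable names, one per provider.
ENV_VARS = {
    "elevenlabs": "ELEVENLABS_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_TTS_API_KEY",
}


def load_key(provider: str, environ=None) -> str:
    """Read a provider's key from the environment instead of hard-coding it."""
    environ = os.environ if environ is None else environ
    name = ENV_VARS[provider]
    key = environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key is safe to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Example usage:
# key = load_key("openai")
# print("Using key", mask_key(key))
```

Because each provider has its own variable, revoking and replacing one dedicated key only means updating a single value.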