ASR Provider Setup

This page covers how to register with ASR providers, obtain credentials, and configure each provider in BiBi Keyboard (说点啥).

Before you start

Open Settings → ASR Settings and select your ASR provider.
Cloud providers usually require an API Key / Access Token.
Local models require you to download or import model files; the first load may take a few seconds.

Security

API keys and access tokens are sensitive. Do not share them publicly. If you suspect leakage, revoke the key/token immediately and create a new one.
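
When a key has to appear in a log, bug report, or screenshot, one common precaution is to redact most of it first. The helper below is a minimal sketch of that idea; the function name and the example key are hypothetical and are not part of BiBi Keyboard.

```python
def mask_secret(secret: str, visible: int = 4) -> str:
    """Return a redacted form of a key that is safer to show in logs."""
    if len(secret) <= visible * 2:
        # Too short to reveal any part safely, so hide it entirely.
        return "*" * len(secret)
    hidden = len(secret) - visible * 2
    # Keep only the first and last few characters for identification.
    return secret[:visible] + "*" * hidden + secret[-visible:]


# Hypothetical key, used only for illustration.
print(mask_secret("sk-example-1234567890"))
```

Masking only reduces accidental exposure; a key that has already leaked should still be revoked and replaced.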

Provider overview

Provider	Type	Streaming	Best for
Volcengine	Cloud	✅	Low-latency, real-time streaming
SiliconFlow	Cloud	❌	Beginner-friendly, low cost
DashScope (Alibaba)	Cloud	✅	Balanced accuracy and cost
Soniox	Cloud	✅	Stable streaming, international usage
Gemini	Cloud	❌	Low-volume usage, file-based recognition
ElevenLabs	Cloud	✅/❌	High accuracy (model-dependent)
OpenAI (compatible)	Cloud	❌	OpenAI/compatible file transcription
Zhipu GLM	Cloud	❌	Simple integration, lower cost
Local models (SenseVoice / Paraformer / FunASR Nano / TeleSpeech)	Local	Partial ✅	Privacy-first, offline usage
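
For readers who want to compare providers programmatically, the table above can be written as plain data. This is only an illustrative sketch: the `Provider` class and the streaming labels ("yes", "no", "model-dependent", "partial") are assumptions chosen here to mirror the table, not an API of the app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    kind: str       # "cloud" or "local"
    streaming: str  # "yes", "no", "model-dependent", or "partial"


# Rows transcribed from the provider overview table.
PROVIDERS = [
    Provider("Volcengine", "cloud", "yes"),
    Provider("SiliconFlow", "cloud", "no"),
    Provider("DashScope", "cloud", "yes"),
    Provider("Soniox", "cloud", "yes"),
    Provider("Gemini", "cloud", "no"),
    Provider("ElevenLabs", "cloud", "model-dependent"),
    Provider("OpenAI (compatible)", "cloud", "no"),
    Provider("Zhipu GLM", "cloud", "no"),
    Provider("Local models", "local", "partial"),
]


def always_streaming(providers):
    """Names of providers whose table entry is an unconditional streaming checkmark."""
    return [p.name for p in providers if p.streaming == "yes"]


print(always_streaming(PROVIDERS))
```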

Volcengine

Volcengine (Doubao Voice) has strong Chinese recognition and supports both streaming and non-streaming modes.

1. Create an app and enable ASR services

Open the console: https://console.volcengine.com/speech/app?opt=create
Enable these capabilities:
- Streaming Speech Recognition Large Model
- Audio File Recognition Large Model (Express)

2. Get APP ID and Access Token

Open the service page: https://console.volcengine.com/speech/service/10011
Copy the APP ID and Access Token from the credentials section

3. Configure in BiBi Keyboard

Open Settings → ASR Settings
Select Volcengine
Paste APP ID into X-Api-App-Key
Paste Access Token into X-Api-Access-Key
If you want streaming, enable “Use Streaming (WebSocket)”
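
The two field labels suggest how the credentials are paired. The sketch below assumes, without confirmation from this page, that they correspond to HTTP headers of the same names; the function name and sample values are hypothetical, and a real request may need further fields that are not covered here.

```python
def volcengine_headers(app_id: str, access_token: str) -> dict:
    """Pair the console credentials with the field names used in the app."""
    app_id = app_id.strip()
    access_token = access_token.strip()
    if not app_id or not access_token:
        raise ValueError("both APP ID and Access Token are required")
    return {
        "X-Api-App-Key": app_id,           # APP ID from the console
        "X-Api-Access-Key": access_token,  # Access Token from the console
    }


# Hypothetical credentials, used only for illustration.
print(volcengine_headers(" 1234567890 ", "your-access-token"))
```

Stripping whitespace guards against a common copy-and-paste mistake, where a trailing space or newline comes along with the token.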

Note

If you enabled both streaming and audio-file recognition when creating the app, they share the same credentials.

SiliconFlow

SiliconFlow provides a built-in free ASR option (no key required) as well as paid models that require your own API key.

Quick start (no API key required)

In Settings → ASR Settings, select SiliconFlow
Keep the “Free ASR” toggles enabled
Switch between the free models (e.g. FunAudioLLM/SenseVoiceSmall, TeleAI/TeleSpeechASR) as needed

Use your own API key (optional)

Sign up / log in: https://cloud.siliconflow.cn/
Create an API key in the console
Paste it into the SiliconFlow section in BiBi Keyboard

DashScope (Alibaba Bailian / Qwen)

DashScope offers good accuracy and cost efficiency, with partial streaming support.

1. Create an API key

Open: https://bailian.console.aliyun.com/?tab=model#/api-key
Create and copy an API key

2. Configure in BiBi Keyboard

Open Settings → ASR Settings and select DashScope
Paste the API key and save

Soniox

Soniox supports both streaming and non-streaming recognition.

Log in: https://console.soniox.com
In your project, go to API keys
Create and copy the API key, then paste it into BiBi Keyboard

Gemini

Gemini is commonly used for file-based recognition and low-volume usage.

Open: https://aistudio.google.com/api-keys
Create and copy a key
Paste it into the Gemini section in BiBi Keyboard

ElevenLabs

ElevenLabs scribe_v1 is non-streaming only; scribe_v2 is streaming only.
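
Because each model covers only one mode, choosing a model is effectively choosing a mode. A minimal sketch of that mapping, using the two model names stated above (the function name is hypothetical):

```python
# Mode supported by each model, as described in this section.
ELEVENLABS_MODELS = {
    "scribe_v1": "non-streaming",
    "scribe_v2": "streaming",
}


def pick_elevenlabs_model(want_streaming: bool) -> str:
    """Return the model name whose only supported mode matches the request."""
    mode = "streaming" if want_streaming else "non-streaming"
    for model, supported in ELEVENLABS_MODELS.items():
        if supported == mode:
            return model
    raise LookupError(f"no model supports {mode} mode")


print(pick_elevenlabs_model(want_streaming=True))
```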

Open: https://elevenlabs.io/app/settings/api-keys
Create an API key
Enable Speech to Text permission for the key

OpenAI (compatible endpoints)

The OpenAI provider supports OpenAI-format transcription endpoints (and compatible third-party endpoints).

In Settings → ASR Settings, select OpenAI
Fill in:
- ASR Endpoint (e.g. https://api.openai.com/v1/audio/transcriptions or a compatible endpoint)
- API Key (Bearer)
- Model name (e.g. gpt-4o-mini-transcribe / whisper-1)
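
To show how the three settings fit together, here is a sketch of an OpenAI-format transcription request built with the Python standard library. It assumes the usual shape of that format: a multipart form with `model` and `file` fields and a Bearer token in the `Authorization` header. The function name, the placeholder key, and the dummy audio bytes are hypothetical, and the request is only constructed here, not sent.

```python
import urllib.request
import uuid


def build_transcription_request(endpoint, api_key, model, audio, filename="audio.wav"):
    """Build (but do not send) a multipart POST for an OpenAI-format endpoint."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Form field: model name
        f'--{boundary}\r\nContent-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n".encode(),
        # Form field: the audio file itself
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode(),
        audio,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_transcription_request(
    "https://api.openai.com/v1/audio/transcriptions",
    "sk-your-key",        # placeholder, not a real key
    "whisper-1",
    b"placeholder-audio",  # dummy bytes standing in for a real recording
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it. Testing a compatible third-party endpoint this way can help confirm that the URL, key, and model name are valid before entering them in the app.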

Zhipu GLM

Zhipu GLM is simple to integrate and is typically used in non-streaming mode.

Get an API key: https://bigmodel.cn/usercenter/proj-mgmt/apikeys
Paste it into the Zhipu section in BiBi Keyboard

Local model setup

Local models are ideal for offline usage and privacy. Each model trades off speed, quality, and streaming support.

Model selection tips

SenseVoice: non-streaming; fast and balanced; supports language settings
FunASR Nano: non-streaming; slower but often higher quality
Paraformer: streaming supported; decent quality
TeleSpeech: non-streaming; slightly better dialect support
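
The tips above can be read as a simple decision order. The sketch below is one possible reading; the priority (streaming first, then dialects, then quality) and the function name are assumptions, not a recommendation from the app.

```python
def pick_local_model(need_streaming=False, prefer_dialects=False, prefer_quality=False):
    """Suggest a local model from the selection tips, checking needs in order."""
    if need_streaming:
        return "Paraformer"   # the only model listed with streaming support
    if prefer_dialects:
        return "TeleSpeech"   # slightly better dialect support
    if prefer_quality:
        return "FunASR Nano"  # slower but often higher quality
    return "SenseVoice"       # fast and balanced default


print(pick_local_model(prefer_quality=True))
```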

Download in-app (recommended)

Select a local provider (e.g. SenseVoice / Paraformer)
In the model manager, choose a variant and download
If notification permission is granted, you can track download/unzip progress in notifications

Import from local files (optional)

If you prefer adding models from local files, download the ZIP first, then choose "Import from local" in the model manager.
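
A partially downloaded ZIP is a likely cause of a failed import, so it can be worth checking the archive first. The sketch below uses Python's standard `zipfile` module; the function name and the file name in the demo are hypothetical, since this page does not describe the expected archive layout.

```python
import io
import zipfile


def inspect_model_zip(path_or_file):
    """Check that an archive is a readable ZIP and return its entry names."""
    if not zipfile.is_zipfile(path_or_file):
        raise ValueError("not a valid ZIP archive")
    with zipfile.ZipFile(path_or_file) as archive:
        # testzip() returns the first corrupt entry name, or None if all are fine.
        bad_entry = archive.testzip()
        if bad_entry is not None:
            raise ValueError(f"corrupt entry: {bad_entry}")
        return archive.namelist()


# Demo with an in-memory archive; a real check would pass the downloaded file path.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("model.onnx", b"placeholder")  # hypothetical entry name
print(inspect_model_zip(demo))
```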

Direct download links (GitHub Releases)

The links below point to BiBi-Keyboard model ZIPs. If a link returns 404 or the download is slow, use the models page (Releases: models) or a GitHub mirror site.

https://github.com/BryceWG/BiBi-Keyboard/releases/tag/models
