
AI Post-processing

AI post-processing uses a Large Language Model (LLM) to refine ASR transcripts. Typical improvements include removing filler words, fixing recognition slips, adjusting punctuation, and polishing tone, which makes voice typing smoother and more natural.

Quick Setup

BiBi Keyboard ships with the SiliconFlow free service, so no API key is required:

  1. Open Settings → AI Post-processing
  2. Enable "AI post-processing"
  3. Ensure vendor is SF_FREE (default)
  4. Pick a prompt preset (recommended: "General post-process")
  5. Tap the magic wand button on the keyboard to enable AI post-processing mode
  6. Done — transcripts will be refined automatically

About the free service

  • Has a free quota (see SiliconFlow website for details)
  • Example models: Qwen/Qwen3-8B, THUDM/GLM-4-9B, etc.
  • If you need other models, register on SiliconFlow and use your own API key

Configure a paid vendor

Example with DeepSeek:

  1. Sign up at https://platform.deepseek.com/
  2. Create an API key and add credits
  3. In Settings → AI Post-processing:
    • Vendor: DEEPSEEK
    • API key: paste your key
    • Model: e.g. deepseek-chat
    • Temperature: recommended 0.2
  4. Save and test

Configure a custom vendor

For any OpenAI-compatible API:

  1. Vendor: CUSTOM
  2. Fill in:
    • Endpoint: e.g. https://your-api.com/v1
    • API key
    • Model: e.g. gpt-3.5-turbo
    • Temperature: recommended 0.2
  3. Save and test

Custom endpoint requirements

  • Must be compatible with OpenAI Chat Completions API
  • The path is typically /v1/chat/completions (the app will append it automatically)
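For reference, a request to an OpenAI-compatible endpoint has the shape sketched below. This is illustrative only (the endpoint, key, and model are placeholders); it mirrors how the app appends `/chat/completions` to the base URL you configure.

```python
import json
import urllib.request

def build_request(endpoint: str, api_key: str, model: str, transcript: str,
                  temperature: float = 0.2) -> urllib.request.Request:
    """Build an OpenAI-compatible Chat Completions request.

    `endpoint` is the base URL (e.g. "https://your-api.com/v1");
    "/chat/completions" is appended, as the app does automatically.
    """
    body = {
        "model": model,
        "temperature": temperature,
        "messages": [
            # The system prompt stands in for the selected prompt preset.
            {"role": "system",
             "content": "Remove filler words and fix punctuation. Keep the original meaning."},
            {"role": "user", "content": transcript},
        ],
    }
    return urllib.request.Request(
        endpoint.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Hypothetical usage (not executed here):
# req = build_request("https://your-api.com/v1", "sk-...", "gpt-3.5-turbo", "um, hello there")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```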

Overview

Pipeline

Recording → ASR → [AI post-processing] → Insert text

Streaming Preview & Typewriter Effect

When your LLM vendor supports streaming output, AI post-processing can show a live preview while the model is generating. You can toggle the "typewriter effect" under Settings → AI Post-processing to make the preview output smoother.

Note

The typewriter effect only affects how the streaming preview is displayed. It does not change the final inserted text.

Good for

  • Speech to writing: meeting notes, reports
  • Long-form input: reduce manual edits afterwards
  • Professional content: more formal/consistent output
  • Multilingual: combine with translation prompts for cross-language voice input

Not recommended

  • Casual chat (spoken style may feel more natural)
  • Very short input (single word, numbers)
  • Latency-sensitive scenarios

Supported LLM Providers

BiBi Keyboard supports 12 LLM providers. All of them use an OpenAI-compatible API format:

| Vendor | Sign-up link |
| --- | --- |
| SF_FREE (SiliconFlow Free) | https://cloud.siliconflow.cn/i/g8thUcWa |
| DEEPSEEK | https://platform.deepseek.com/ |
| ZHIPU | https://bigmodel.cn/usercenter/proj-mgmt/apikeys |
| MOONSHOT | https://platform.moonshot.cn/console/api-keys |
| VOLCENGINE | https://console.volcengine.com/ark |
| OPENAI | https://platform.openai.com/signup |
| GEMINI | https://aistudio.google.com/apikey |
| GROQ | https://console.groq.com/keys |
| CEREBRAS | https://cloud.cerebras.ai/platform |
| FIREWORKS | https://fireworks.ai/ |
| OHMYGPT | https://x.dogenet.win/i/CXuHm49s |
| CUSTOM | - |

Reasoning mode

Some providers expose a "thinking/reasoning" mode. The model reasons before producing output, which can help for complex editing but usually increases latency and token usage.

Prompt Presets

BiBi Keyboard includes 5 built-in prompt presets and supports custom ones.

Built-in presets

| Preset name | Use case | Effect |
| --- | --- | --- |
| General post-process | daily voice input | remove filler words, fix slips, keep original meaning |
| Basic polishing | formal rewrite | grammar fixes, punctuation, smoother expression |
| Translate to English | cross-language | translate transcript into English |
| Extract key points | meeting notes | extract key info into a bullet list |
| Extract to-dos | task tracking | identify tasks and generate a checklist |

Custom prompts

Go to Settings → AI Post-processing → Prompt presets:

  1. Tap "Add preset"
  2. Write your prompt (role, task, rules, output format, etc.)
  3. Save and apply quickly in AI Edit
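A custom prompt can follow the role/task/rules/output-format structure mentioned above. The example below is an illustrative sketch, not one of the built-in presets:

```
Role: You are a transcript editor.
Task: Clean up the voice transcript the user provides.
Rules:
- Remove filler words (um, uh, you know)
- Fix obvious mis-recognitions from context
- Keep the original meaning and language
Output: Return only the cleaned text, with no explanations.
```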

Configuration

Basic

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| postProcessEnabled | Boolean | false | master switch |
| postprocTypewriterEnabled | Boolean | true | typewriter effect for streaming preview (UI only) |
| llmVendor | LlmVendor | SF_FREE | selected LLM vendor |
| llmEndpoint | String | vendor default | API endpoint (auto for built-in vendors) |
| llmApiKey | String | "" | API key (not needed for free service) |
| llmModel | String | vendor default | model name |
| llmTemperature | Float | 0.2 | temperature (0-2; lower = more deterministic) |

Advanced

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| postprocSkipUnderChars | Int | 0 | skip AI post-processing if the input is shorter than this (0 = disabled) |
| activePromptId | String | "" | active prompt preset id |
| promptPresetsJson | String | "" | prompt preset list as JSON |
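The interaction between the master switch and the length threshold can be thought of as the gate below (a sketch of the documented behavior, not the app's actual code):

```python
def should_postprocess(transcript: str, enabled: bool, skip_under_chars: int) -> bool:
    """AI post-processing runs only when the master switch
    (postProcessEnabled) is on and the transcript is at least
    postprocSkipUnderChars characters long; 0 disables the length check."""
    if not enabled:
        return False
    if skip_under_chars > 0 and len(transcript) < skip_under_chars:
        return False
    return True
```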

Temperature hints

  • 0 - 0.3: highly consistent, good for precise edits
  • 0.4 - 0.7: balanced creativity and stability
  • 0.8 - 2.0: more creative but less stable

Reasoning Mode Support

Different vendors control reasoning mode in different ways:

| Vendor | Control method | Supported models | Notes |
| --- | --- | --- | --- |
| DEEPSEEK | model choice | deepseek-reasoner | choose the reasoner model |
| MOONSHOT | model choice | kimi-k2-thinking | choose the thinking model |
| SF_FREE | toggle param | Qwen3 series, DeepSeek-V3.1, etc. | enable "Reasoning mode" in settings |
| GEMINI | toggle param | gemini-2.5-flash+ | reasoning_effort param |
| GROQ | toggle param | qwen3-32b, gpt-oss series | reasoning_effort param |
| CEREBRAS | toggle param | gpt-oss-120b | reasoning_effort param |
| VOLCENGINE | toggle param | doubao-seed series, deepseek | thinking.type param |
| ZHIPU | toggle param | glm-4.6, glm-4.5 series | thinking.type param |
| OHMYGPT | toggle param | gemini-2.5, claude, gpt-5 series | reasoning_effort param |

When to use reasoning mode

  • ✅ complex rewrites (technical terms, strict formatting)
  • ✅ tasks needing reasoning (e.g. to-do extraction)
  • ✅ multi-step transformations (e.g. translate + polish)
  • ❌ simple filler-word removal (extra latency without benefit)

Model Selection & Fetching Model List

In Settings → AI Post-processing, you can tap "Fetch model list" to query available models from your vendor and add commonly used ones into the in-app dropdown.
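Under the hood, fetching the model list amounts to a GET on the vendor's OpenAI-compatible `/v1/models` endpoint. The sketch below is illustrative (the endpoint and key are placeholders), with the response parsing separated out:

```python
import json
import urllib.request

def parse_model_ids(payload: dict) -> list:
    """Extract model ids from an OpenAI-compatible GET /v1/models response,
    whose shape is {"object": "list", "data": [{"id": ...}, ...]}."""
    return [m["id"] for m in payload.get("data", [])]

def fetch_model_list(endpoint: str, api_key: str) -> list:
    """Query the vendor's model list; `endpoint` is the base URL (".../v1")."""
    req = urllib.request.Request(
        endpoint.rstrip("/") + "/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_model_ids(json.load(resp))
```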

Tip

For the CUSTOM vendor, if your backend has a default model, the model field can be left empty. If the test call fails, fill in the model name your provider requires.

Advanced: Custom Reasoning Params (JSON)

For some vendors, the reasoning settings expose JSON fields such as "Reasoning params (on/off)" that attach extra request parameters depending on whether reasoning is enabled.

  • Leave it empty if you’re not sure (defaults are fine)
  • Must be valid JSON objects (example: {"reasoning_effort":"medium"})
  • Parameter names depend on vendor documentation
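The effect of these fields can be sketched as merging the chosen JSON object into the request body. The parameter name in the example is illustrative; use whatever your vendor documents:

```python
import json

def apply_reasoning_params(body: dict, reasoning_enabled: bool,
                           params_on: str, params_off: str) -> dict:
    """Merge the "Reasoning params (on/off)" JSON into the request body.
    `params_on` / `params_off` are the raw JSON strings from settings;
    an empty string means "no extra parameters"."""
    raw = params_on if reasoning_enabled else params_off
    if raw.strip():
        extra = json.loads(raw)  # must parse to a JSON object
        if not isinstance(extra, dict):
            raise ValueError("reasoning params must be a JSON object")
        body = {**body, **extra}
    return body

# e.g. apply_reasoning_params({"model": "m"}, True, '{"reasoning_effort":"medium"}', '')
```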

Tips

There are three ways to trigger AI post-processing:

| Trigger | Description | Best for |
| --- | --- | --- |
| Auto post-process | runs automatically after each voice input | daily use; output is final |
| AI Edit | select text and open AI Edit; choose a prompt preset for the edit | iterative edits / retrying |
| Skip short input | set a minimum length threshold to skip AI for short phrases | avoiding overhead on tiny inputs |

Troubleshooting

AI post-processing does not run

Checklist:

  1. ✅ enabled (postProcessEnabled = true)
  2. ✅ input length ≥ postprocSkipUnderChars
  3. ✅ vendor config is valid (API key works if needed)
  4. ✅ network works
  5. ✅ quota not exhausted

Output is not as expected

Possible causes:

  • Prompt too vague → add constraints and examples
  • Temperature too high → reduce to ~0.2
  • Model too weak → try a stronger model
  • Input too long → might exceed model context limits

Too slow

Ideas:

  1. Switch to faster vendors (Groq, Cerebras)
  2. Use smaller models (e.g. Qwen3-8B instead of 235B)
  3. Disable reasoning mode
  4. Simplify the prompt

Released under the Apache 2.0 License.