
AI Post-processing

AI post-processing uses a Large Language Model (LLM) to refine ASR transcripts. Typical improvements include removing filler words, fixing typos, adjusting punctuation, and polishing tone, making voice typing smoother and more natural.

Overview

Pipeline

Recording → ASR → [AI post-processing] → Insert text
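The pipeline above can be sketched in a few lines. This is an illustration only (the function names and signatures are hypothetical, not the app's actual code):

```python
def voice_typing_pipeline(audio, asr, postprocess, ai_enabled=True):
    """Recording -> ASR -> optional AI post-processing -> text to insert."""
    transcript = asr(audio)                   # speech-to-text
    if ai_enabled:
        transcript = postprocess(transcript)  # LLM cleanup: fillers, typos, punctuation
    return transcript                         # inserted into the focused text field
```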

Good for

  • Speech to writing: meeting notes, reports
  • Long-form input: reduce manual edits afterwards
  • Professional content: more formal/consistent output
  • Multilingual: combine with translation prompts for cross-language voice input

Not recommended

  • Casual chat (spoken style may feel more natural)
  • Very short input (single word, numbers)
  • Latency-sensitive scenarios

Supported LLM Providers

BiBi Keyboard supports 11 LLM providers. All of them use an OpenAI-compatible API format:

| Vendor | Default model | Notes | Sign-up link |
| --- | --- | --- | --- |
| SF_FREE (SiliconFlow Free) | Qwen/Qwen3-8B | 🆓 free, no config | https://cloud.siliconflow.cn/i/g8thUcWa |
| DEEPSEEK | deepseek-chat | 💰 good value; reasoning mode | https://platform.deepseek.com/ |
| ZHIPU | glm-4.6 | 🇨🇳 China-based; reasoning mode | https://bigmodel.cn/usercenter/proj-mgmt/apikeys |
| MOONSHOT | kimi-k2-0905-preview | 🧠 long context; reasoning mode | https://platform.moonshot.cn/console/api-keys |
| VOLCENGINE | doubao-seed-1-6-flash | 🇨🇳 Doubao; reasoning mode | https://console.volcengine.com/ark |
| OPENAI | gpt-4o-mini | 🌍 OpenAI | https://platform.openai.com/signup |
| GEMINI | gemini-2.0-flash | 🚀 fast; reasoning supported | https://aistudio.google.com/apikey |
| GROQ | llama-3.3-70b-versatile | ⚡ very fast inference | https://console.groq.com/keys |
| CEREBRAS | llama-3.3-70b | ⚡ fast inference | https://cloud.cerebras.ai/platform |
| OHMYGPT | gpt-4o-mini | 🔀 multi-vendor relay | https://x.dogenet.win/i/CXuHm49s |
| CUSTOM | user-defined | 🛠️ any OpenAI-compatible API | - |

Reasoning mode

Some providers expose a "thinking/reasoning" mode. The model reasons before producing output, which can help for complex editing but usually increases latency and token usage.

Prompt Presets

BiBi Keyboard includes 5 built-in prompt presets and supports custom ones.

Built-in presets

| Preset name | Use case | Effect |
| --- | --- | --- |
| General post-process | daily voice input | remove filler words, fix slips, keep original meaning |
| Basic polishing | formal rewrite | grammar fixes, punctuation, smoother expression |
| Translate to English | cross-language | translate transcript into English |
| Extract key points | meeting notes | extract key info into a bullet list |
| Extract to-dos | task tracking | identify tasks and generate a checklist |

Custom prompts

Go to Settings → AI Post-processing → Prompt presets:

  1. Tap "Add preset"
  2. Write your prompt (role, task, rules, output format, etc.)
  3. Save and apply quickly in AI Edit
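Since presets are stored as a JSON string (the promptPresetsJson setting), a custom preset list might look like the sketch below. The field names (`id`, `name`, `prompt`) are assumptions for illustration; the app's actual schema is internal:

```python
import json

# Hypothetical preset schema -- field names are illustrative, not the app's real schema.
presets = [
    {
        "id": "meeting-notes",
        "name": "Meeting notes",
        "prompt": (
            "Role: technical editor. Task: clean up the ASR transcript. "
            "Rules: remove filler words, keep the speaker's meaning. "
            "Output: plain text only, no explanations."
        ),
    },
]
prompt_presets_json = json.dumps(presets, ensure_ascii=False)  # stored as one JSON string
active_prompt_id = presets[0]["id"]                            # the currently applied preset
```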

Configuration

Basic

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| postProcessEnabled | Boolean | false | master switch |
| llmVendor | LlmVendor | SF_FREE | selected LLM vendor |
| llmEndpoint | String | vendor default | API endpoint (auto for built-in vendors) |
| llmApiKey | String | "" | API key (not needed for free service) |
| llmModel | String | vendor default | model name |
| llmTemperature | Float | 0.2 | temperature (0-2; lower = more deterministic) |

Advanced

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| postprocSkipUnderChars | Int | 0 | skip AI post-processing if the input is shorter than this (0 = disabled) |
| activePromptId | String | "" | active prompt preset id |
| promptPresetsJson | String | "" | prompt preset list as JSON |
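The interaction between postProcessEnabled and postprocSkipUnderChars can be sketched as a single predicate (a minimal sketch of the documented behavior, not the app's actual code):

```python
def should_postprocess(text: str, enabled: bool, skip_under_chars: int = 0) -> bool:
    """Mirror the postProcessEnabled / postprocSkipUnderChars settings (sketch)."""
    if not enabled:
        return False  # master switch off
    if skip_under_chars > 0 and len(text) < skip_under_chars:
        return False  # short inputs (single words, numbers) pass through unchanged
    return True
```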

Temperature hints

  • 0 - 0.3: highly consistent, good for precise edits
  • 0.4 - 0.7: balanced creativity and stability
  • 0.8 - 2.0: more creative but less stable

Quick Setup

BiBi Keyboard ships with the SiliconFlow free service, so no API key is required:

  1. Open Settings → AI Post-processing
  2. Enable "AI post-processing"
  3. Ensure vendor is SF_FREE (default)
  4. Pick a prompt preset (recommended: "General post-process")
  5. Tap the magic wand button on the keyboard to enable AI post-processing mode
  6. Done — transcripts will be refined automatically

About the free service

  • Has a free quota (see SiliconFlow website for details)
  • Example models: Qwen/Qwen3-8B, THUDM/GLM-4-9B, etc.
  • If you need other models, register on SiliconFlow and use your own API key

Configure a paid vendor

Example with DeepSeek:

  1. Sign up at https://platform.deepseek.com/
  2. Create an API key and add credits
  3. In Settings → AI Post-processing:
    • Vendor: DEEPSEEK
    • API key: paste your key
    • Model: e.g. deepseek-chat
    • Temperature: recommended 0.2
  4. Save and test
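Since every vendor uses the OpenAI-compatible Chat Completions format, a request built from these settings looks roughly like the sketch below. The helper function is hypothetical (illustration only); the base URL shown is DeepSeek's documented OpenAI-compatible endpoint:

```python
def build_chat_request(endpoint, api_key, model, system_prompt, transcript, temperature=0.2):
    """Build the URL, headers, and JSON body of an OpenAI-compatible chat request."""
    url = endpoint.rstrip("/") + "/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    body = {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system_prompt},   # the prompt preset
            {"role": "user", "content": transcript},        # the ASR transcript
        ],
    }
    return url, headers, body

url, headers, body = build_chat_request(
    "https://api.deepseek.com/v1",
    "sk-...",                       # your DeepSeek API key
    "deepseek-chat",
    "Remove filler words; keep the meaning.",
    "um, so, ship it tomorrow",
)
```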

Configure a custom vendor

For any OpenAI-compatible API:

  1. Vendor: CUSTOM
  2. Fill in:
    • Endpoint: e.g. https://your-api.com/v1
    • API key
    • Model: e.g. gpt-3.5-turbo
    • Temperature: recommended 0.2
  3. Save and test

Custom endpoint requirements

  • Must be compatible with OpenAI Chat Completions API
  • The path is typically /v1/chat/completions (the app will append it automatically)
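The automatic path appending can be sketched as below. This is an assumption about the behavior described above (append the standard path unless the endpoint already ends with it); the app's exact normalization rules may differ:

```python
def normalize_endpoint(endpoint: str) -> str:
    """Append the Chat Completions path unless it is already present (sketch)."""
    base = endpoint.rstrip("/")
    if base.endswith("/chat/completions"):
        return base
    return base + "/chat/completions"
```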

Reasoning Mode Support

Different vendors control reasoning mode in different ways:

| Vendor | Control method | Supported models | Notes |
| --- | --- | --- | --- |
| DEEPSEEK | model choice | deepseek-reasoner | choose the reasoner model |
| MOONSHOT | model choice | kimi-k2-thinking | choose the thinking model |
| SF_FREE | toggle param | Qwen3 series, DeepSeek-V3.1, etc. | enable "Reasoning mode" in settings |
| GEMINI | toggle param | gemini-2.5-flash+ | reasoning_effort param |
| GROQ | toggle param | qwen3-32b, gpt-oss series | reasoning_effort param |
| CEREBRAS | toggle param | gpt-oss-120b | reasoning_effort param |
| VOLCENGINE | toggle param | doubao-seed series, deepseek | thinking.type param |
| ZHIPU | toggle param | glm-4.6, glm-4.5 series | thinking.type param |
| OHMYGPT | toggle param | gemini-2.5, claude, gpt-5 series | reasoning_effort param |
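The two control styles can be sketched as a small dispatcher. Parameter names come from the table above; the `"medium"` effort value and the exact request shapes are illustrative assumptions, and vendors may nest these fields differently:

```python
def reasoning_params(vendor: str, enable: bool) -> dict:
    """Extra request fields for vendors that toggle reasoning via a parameter (sketch)."""
    if not enable:
        return {}
    if vendor in ("GEMINI", "GROQ", "CEREBRAS", "OHMYGPT"):
        return {"reasoning_effort": "medium"}   # illustrative effort level
    if vendor in ("VOLCENGINE", "ZHIPU"):
        return {"thinking": {"type": "enabled"}}
    # DEEPSEEK / MOONSHOT select reasoning by model choice (e.g. deepseek-reasoner),
    # not by an extra request parameter.
    return {}
```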

When to use reasoning mode

  • ✅ complex rewrites (technical terms, strict formatting)
  • ✅ tasks needing reasoning (e.g. to-do extraction)
  • ✅ multi-step transformations (e.g. translate + polish)
  • ❌ simple filler-word removal (extra latency without benefit)

Tips

There are three ways to trigger AI post-processing:

| Trigger | Description | Best for |
| --- | --- | --- |
| Auto post-process | runs automatically after each voice input | daily use; output is final |
| AI Edit | select text, open AI Edit, and choose a prompt preset for the edit | iterative edits / retrying |
| Skip short input | set a minimum length threshold to skip AI for short phrases | avoiding overhead on tiny inputs |

Troubleshooting

AI post-processing does not run

Checklist:

  1. ✅ enabled (postProcessEnabled = true)
  2. ✅ input length ≥ postprocSkipUnderChars
  3. ✅ vendor config is valid (API key works if needed)
  4. ✅ network works
  5. ✅ quota not exhausted

Output is not as expected

Possible causes:

  • Prompt too vague → add constraints and examples
  • Temperature too high → reduce to ~0.2
  • Model too weak → try a stronger model
  • Input too long → might exceed model context limits

Too slow

Ideas:

  1. Switch to faster vendors (Groq, Cerebras)
  2. Use smaller models (e.g. Qwen3-8B instead of 235B)
  3. Disable reasoning mode
  4. Simplify the prompt

Released under the Apache 2.0 License.