AIDL Integration (Fcitx Linking)
BiBi Keyboard (asr-keyboard) provides a standard AIDL-compatible service, allowing other apps (e.g. the modified Fcitx (Little Penguin) IME) to use BiBi Keyboard's speech recognition.
The server side implements the Binder protocol by hand, but it is fully compatible with AIDL-generated stubs and proxies. Clients can use code generated from the .aidl files, or call the service via raw Binder transact (as Fcitx does).
User Guide
Currently, only the modified Fcitx IME supports this. Steps:
- Install the latest BiBi Keyboard (OSS or Pro; Pro is preferred if installed)
- Enable external linking in BiBi Keyboard:
Settings → Input Settings → Allow external IME linking (AIDL)
- Install the modified Fcitx IME
- In Fcitx, enable BiBi Keyboard linking:
Settings → Virtual Keyboard → Long-press Space Bar Behavior → Voice Input (AIDL)
- While using Fcitx, long-press Space to start voice input, release to finish
Package priority (same as Fcitx implementation):
- com.brycewg.asrkb.pro
- com.brycewg.asrkb
Clients should try binding in this order and prefer the installed Pro package (same interface and behavior).
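The priority rule can be sketched as a small pure function. This is only an illustration: pickAsrkbPackage and the isInstalled predicate are hypothetical names, and on Android the predicate would presumably wrap a PackageManager lookup.

```kotlin
// Candidate packages in binding priority order: Pro first, then OSS.
val ASRKB_PACKAGES = listOf("com.brycewg.asrkb.pro", "com.brycewg.asrkb")

// Returns the first candidate for which `isInstalled` is true, or null if
// neither package is present. `isInstalled` is a hypothetical hook; a real
// client would supply its own installed-package check.
fun pickAsrkbPackage(isInstalled: (String) -> Boolean): String? =
    ASRKB_PACKAGES.firstOrNull(isInstalled)
```

A client would then attempt to bind to the returned package and, if binding fails, fall through to the next candidate.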
Developer Guide
Service interface (IExternalSpeechService)
Descriptor: com.brycewg.asrkb.aidl.IExternalSpeechService
Transaction codes (match AIDL stub):
| Method | Transaction code | Description |
|---|---|---|
| startSession | FIRST_CALL_TRANSACTION + 0 | server-recording session |
| stopSession | FIRST_CALL_TRANSACTION + 1 | stop current session |
| cancelSession | FIRST_CALL_TRANSACTION + 2 | cancel current session |
| isRecording | FIRST_CALL_TRANSACTION + 3 | whether session is recording |
| isAnyRecording | FIRST_CALL_TRANSACTION + 4 | whether any session is recording |
| getVersion | FIRST_CALL_TRANSACTION + 5 | app version name |
| startPcmSession | FIRST_CALL_TRANSACTION + 6 | client-pushed PCM session |
| writePcm | FIRST_CALL_TRANSACTION + 7 | push one PCM frame |
| finishPcm | FIRST_CALL_TRANSACTION + 8 | finish PCM input and process |
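For clients that use raw Binder transact, the table could be mirrored as constants along these lines. The enum name SpeechTransaction is an assumption made for this sketch. FIRST_CALL_TRANSACTION is redeclared here as 1 (the value of android.os.IBinder.FIRST_CALL_TRANSACTION) only so the snippet compiles without the Android SDK.

```kotlin
// Same value as android.os.IBinder.FIRST_CALL_TRANSACTION; redeclared so this
// sketch is self-contained. Real clients should reference the SDK constant.
const val FIRST_CALL_TRANSACTION = 1

// Interface descriptor, as documented for IExternalSpeechService.
const val SERVICE_DESCRIPTOR = "com.brycewg.asrkb.aidl.IExternalSpeechService"

// One entry per row of the transaction table (hypothetical enum name).
enum class SpeechTransaction(val offset: Int) {
    START_SESSION(0),
    STOP_SESSION(1),
    CANCEL_SESSION(2),
    IS_RECORDING(3),
    IS_ANY_RECORDING(4),
    GET_VERSION(5),
    START_PCM_SESSION(6),
    WRITE_PCM(7),
    FINISH_PCM(8);

    // The value to pass as the `code` argument of a raw transact() call.
    val code: Int get() = FIRST_CALL_TRANSACTION + offset
}
```

Keeping the offsets in one place makes it harder for the client and the server table to drift apart.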
AIDL signatures (written here in Kotlin notation):
fun startSession(config: SpeechConfig?, callback: ISpeechCallback): Int
fun stopSession(sessionId: Int)
fun cancelSession(sessionId: Int)
fun isRecording(sessionId: Int): Boolean
fun isAnyRecording(): Boolean
fun getVersion(): String
fun startPcmSession(config: SpeechConfig?, callback: ISpeechCallback): Int
fun writePcm(sessionId: Int, pcm: ByteArray, sampleRate: Int, channels: Int)
fun finishPcm(sessionId: Int)
Config object (SpeechConfig)
SpeechConfig is a nullable Parcelable with fields:
- vendorId: String?
- streamingPreferred: Boolean
- punctuationEnabled: Boolean?
- autoStopOnSilence: Boolean?
- sessionTag: String?
Current behavior:
- startSession / startPcmSession ignore all config fields except vendorId == "mock". The actual vendor/streaming mode always follows BiBi Keyboard's current in-app settings.
- Only startSession supports the vendorId == "mock" connectivity test mode (see below).
Callback interface (ISpeechCallback)
Descriptor: com.brycewg.asrkb.aidl.ISpeechCallback
fun onState(sessionId: Int, state: Int, message: String)
fun onPartial(sessionId: Int, text: String)
fun onFinal(sessionId: Int, text: String)
fun onError(sessionId: Int, code: Int, message: String)
fun onAmplitude(sessionId: Int, amplitude: Float)
Key Methods
startSession (server-recording)
BiBi Keyboard handles recording and audio upload.
Return values:
- >0: success; returns server-generated sessionId
- -2: busy (an existing session is recording)
- -3: feature disabled or engine not ready
- -4: BiBi Keyboard lacks RECORD_AUDIO permission
Callbacks:
- For feature disabled (-3), it will call onError(-1, 403, "feature disabled") first.
- For permission (-4), it will call onError(-1, 401, "record permission denied") first.
- -2 / -3 (engine not ready) are indicated only via return value (no extra callback).
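A client may find it convenient to turn the integer returned by startSession into a typed result. The following is a minimal sketch; StartResult and interpretStartSession are hypothetical names, and any value not listed above is kept as Unknown rather than guessed at.

```kotlin
// Typed view of the documented startSession return values (hypothetical type).
sealed interface StartResult {
    data class Started(val sessionId: Int) : StartResult   // > 0
    object Busy : StartResult                               // -2
    object DisabledOrNotReady : StartResult                 // -3
    object NoRecordPermission : StartResult                 // -4
    data class Unknown(val raw: Int) : StartResult          // anything else
}

// Maps the raw return value onto the cases documented for startSession.
fun interpretStartSession(ret: Int): StartResult = when {
    ret > 0 -> StartResult.Started(ret)
    ret == -2 -> StartResult.Busy
    ret == -3 -> StartResult.DisabledOrNotReady
    ret == -4 -> StartResult.NoRecordPermission
    else -> StartResult.Unknown(ret)
}
```

Since -3 covers both "feature disabled" and "engine not ready", a client that needs to tell them apart would have to check whether onError with code 403 arrived before the call returned.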
Connectivity test (mock)
When SpeechConfig.vendorId == "mock", it skips real recording: the server directly calls onPartial("【testing】...") and onFinal("External AIDL integration OK (mock)") without needing record permission.
startPcmSession / writePcm / finishPcm (client-pushed PCM)
The client records audio and pushes PCM frames to BiBi Keyboard (Fcitx uses this mode).
startPcmSession return values:
- >0: success; returns sessionId
- -2: busy
- -3: feature disabled
- -5: current vendor does not support pushed PCM (unsupported)
Notes:
- Pushed PCM mode does not check BiBi Keyboard's microphone permission; the client handles recording permission itself.
- Recommended format: PCM16LE / 16000Hz / mono, around 200ms per frame. The server currently does not strictly validate sample rate/channels, but mismatches may hurt results for some engines.
- finishPcm(sessionId) is equivalent to stopSession(sessionId) and indicates end of audio input, waiting for final result.
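With the recommended format, one 200 ms frame works out to 16000 samples/s × 0.2 s × 1 channel × 2 bytes = 6400 bytes. The sketch below shows one way a client might cut a captured buffer into frames of that size. The constant and function names are assumptions, not part of the service API.

```kotlin
const val SAMPLE_RATE_HZ = 16_000
const val CHANNELS = 1
const val BYTES_PER_SAMPLE = 2      // PCM16LE: two bytes per sample
const val FRAME_MILLIS = 200

// 16000 * 200 / 1000 * 1 * 2 = 6400 bytes per frame.
const val FRAME_BYTES = SAMPLE_RATE_HZ * FRAME_MILLIS / 1000 * CHANNELS * BYTES_PER_SAMPLE

// Splits a PCM buffer into consecutive frames of at most `frameBytes` bytes.
// The last frame may be shorter; an empty buffer yields no frames.
fun splitIntoFrames(pcm: ByteArray, frameBytes: Int = FRAME_BYTES): List<ByteArray> {
    require(frameBytes > 0 && frameBytes % BYTES_PER_SAMPLE == 0) {
        "frameBytes must be a positive multiple of the sample size"
    }
    val frames = ArrayList<ByteArray>()
    var offset = 0
    while (offset < pcm.size) {
        val end = minOf(offset + frameBytes, pcm.size)
        frames.add(pcm.copyOfRange(offset, end))
        offset = end
    }
    return frames
}
```

Each resulting frame would then be passed to writePcm(sessionId, frame, 16000, 1), followed by finishPcm(sessionId) once recording ends.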
stopSession / cancelSession
Both are void; no success status is returned.
- stopSession: end input and enter processing; later you will receive onFinal or onError.
- cancelSession: cancel and clean up; no final result is guaranteed, but a late onFinal is not ruled out either (the AIDL notes "final result not guaranteed"), so clients should tolerate both outcomes.
isRecording / isAnyRecording
- isRecording(sessionId): whether the given session is recording/accepting input.
- isAnyRecording(): whether any active session exists.
getVersion
Returns the app's semantic version name (BuildConfig.VERSION_NAME), e.g. "1.6.0".
Callback States & Errors
onState: state values
| state (Int) | Meaning | Common message |
|---|---|---|
| 0 | idle / finished | final / canceled |
| 1 | recording | recording |
| 2 | processing | processing |
| 3 | error | error text |
onError: code values
| code | Meaning |
|---|---|
| 401 | missing record permission (only possible from startSession) |
| 403 | external linking is disabled |
| 500 | server internal error (engine/network/etc.) |
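On the client side, the state and error values from the two tables could be decoded roughly as follows. SessionState and describeError are hypothetical names, and values outside the tables are reported as unknown rather than assigned a meaning.

```kotlin
// Mirrors the onState table (hypothetical enum name).
enum class SessionState(val value: Int) {
    IDLE(0),        // idle / finished
    RECORDING(1),
    PROCESSING(2),
    ERROR(3);

    companion object {
        // Returns null for values the table does not define.
        fun fromInt(value: Int): SessionState? =
            values().firstOrNull { it.value == value }
    }
}

// Mirrors the onError table; unlisted codes are passed through as unknown.
fun describeError(code: Int): String = when (code) {
    401 -> "missing record permission"
    403 -> "external linking is disabled"
    500 -> "server internal error"
    else -> "unknown error code $code"
}
```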
Enable Requirements
External AIDL linking requires:
Prefs.externalAidlEnabled == true
Settings entry: Settings → Input Settings → External IME linking.
Vendor & Streaming Mode Decision
External calls always follow BiBi Keyboard's current settings; the vendor and streaming preferences in SpeechConfig are ignored.
Cloud vendors:
- Volc: prefs.volcStreamingEnabled selects VolcStreamAsrEngine
- DashScope: prefs.dashStreamingEnabled
- Soniox: prefs.sonioxStreamingEnabled
- ElevenLabs: prefs.elevenStreamingEnabled
- OpenAI / Gemini / SiliconFlow / Zhipu: fixed non-streaming file engines
Local vendors:
- Paraformer / Zipformer: streaming only
- SenseVoice / TeleSpeech: non-streaming file engines (pseudo-streaming is UI-only and not exposed externally)
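The decision described above can be summarized as a lookup from vendor and preferences to "streaming or not". This is a sketch, not the server's actual code: the Vendor enum and StreamingPrefs class are invented for the example, and it assumes each listed preference simply switches its vendor between the streaming and file engine.

```kotlin
// Hypothetical vendor list, taken from the names in this section.
enum class Vendor {
    VOLC, DASHSCOPE, SONIOX, ELEVENLABS,
    OPENAI, GEMINI, SILICONFLOW, ZHIPU,
    PARAFORMER, ZIPFORMER, SENSEVOICE, TELESPEECH
}

// Stand-in for the four streaming preferences named above.
data class StreamingPrefs(
    val volcStreamingEnabled: Boolean = false,
    val dashStreamingEnabled: Boolean = false,
    val sonioxStreamingEnabled: Boolean = false,
    val elevenStreamingEnabled: Boolean = false
)

// True when an external session for `vendor` would use a streaming engine.
fun usesStreamingEngine(vendor: Vendor, prefs: StreamingPrefs): Boolean = when (vendor) {
    Vendor.VOLC -> prefs.volcStreamingEnabled
    Vendor.DASHSCOPE -> prefs.dashStreamingEnabled
    Vendor.SONIOX -> prefs.sonioxStreamingEnabled
    Vendor.ELEVENLABS -> prefs.elevenStreamingEnabled
    // Cloud vendors with fixed non-streaming file engines.
    Vendor.OPENAI, Vendor.GEMINI, Vendor.SILICONFLOW, Vendor.ZHIPU -> false
    // Local vendors: streaming only.
    Vendor.PARAFORMER, Vendor.ZIPFORMER -> true
    // Local vendors: non-streaming file engines.
    Vendor.SENSEVOICE, Vendor.TELESPEECH -> false
}
```

Clients using non-streaming vendors should not expect meaningful onPartial updates before onFinal.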
Result Filters
Final results (onFinal) go through unified post-filters:
- If trimFinalTrailingPunct is enabled, trim trailing punctuation/emoji.
- Speech preset replacement: if matched, replace with preset content.
- If postProcessEnabled is enabled and LLM keys are valid, run AI post-processing; on failure/empty output, fall back to simple processing.
Entry points: AsrFinalFilters.applySimple / AsrFinalFilters.applyWithAi.
Session Cleanup
Server removes the session and releases resources:
- after onFinal
- after onError
- immediately on cancelSession
Clients should proactively call cancelSession on window/focus changes to avoid dangling sessions.
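One way to follow this advice is to keep a small tracker on the client that remembers the live session id and cancels it on focus loss. The class below is a sketch under that assumption. SessionTracker and its method names are hypothetical, and the cancel callback stands in for the real cancelSession call.

```kotlin
// Tracks at most one live session and cancels it when focus is lost.
// `cancel` is a hypothetical hook that would invoke cancelSession(sessionId).
class SessionTracker(private val cancel: (Int) -> Unit) {
    var activeSessionId: Int? = null
        private set

    // Call with the value returned by startSession / startPcmSession.
    // Non-positive values are failures and are not tracked.
    fun onStarted(sessionId: Int) {
        if (sessionId > 0) activeSessionId = sessionId
    }

    // Call from onFinal / onError: the server has already released the session.
    fun onTerminal(sessionId: Int) {
        if (activeSessionId == sessionId) activeSessionId = null
    }

    // Call on window/focus changes: cancels the live session, if there is one.
    fun onFocusLost() {
        val id = activeSessionId ?: return
        activeSessionId = null
        cancel(id)
    }
}
```

Because the id is cleared before cancel is invoked, repeated focus events do not send duplicate cancelSession calls.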
Fcitx (bibi/lexi) Integration Example
The modified Fcitx IME (bibi) integrates BiBi Keyboard linking. The example directory in this repo still uses the old name fcitx5-android-lexi-keyboard.
Client implementation file:
fcitx5-android-lexi-keyboard/app/src/main/java/org/fcitx/fcitx5/android/link/AsrkbSpeechClient.kt
Key behaviors:
- bind package order: com.brycewg.asrkb.pro → com.brycewg.asrkb
- component: com.brycewg.asrkb.api.ExternalSpeechService
- raw Binder transact calls (no AIDL generated classes)
- uses pushed PCM mode: startPcmSession → loop writePcm → finishPcm / cancelSession
- long-press Space to start; on release, call finishPcm if any PCM was sent, else cancelSession
- onPartial for preview (setComposingText)
- onFinal commits (commitText) and unbinds service
- onAmplitude drives waveform overlay animation
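The release rule (finishPcm if any PCM was sent, otherwise cancelSession) can be isolated into a tiny state holder. This is an illustrative sketch, not an excerpt from AsrkbSpeechClient.kt. PushToTalk and ReleaseAction are names made up for the example.

```kotlin
// What the client should call when the Space key is released.
enum class ReleaseAction { FINISH_PCM, CANCEL_SESSION }

// Counts frames pushed during one long-press and decides the release action.
class PushToTalk {
    private var framesSent = 0

    // Call after each successful writePcm.
    fun onFrameWritten() {
        framesSent++
    }

    // Call on key release; resets the counter for the next long-press.
    fun onRelease(): ReleaseAction {
        val action =
            if (framesSent > 0) ReleaseAction.FINISH_PCM else ReleaseAction.CANCEL_SESSION
        framesSent = 0
        return action
    }
}
```

Cancelling when nothing was sent avoids asking the engine to process an empty recording after an accidental tap.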
Error handling:
- -2: show "busy"
- -3 / 403: show "enable external IME linking in BiBi Keyboard"
- -5: show "current vendor doesn't support pushed PCM"
- record permission is handled by Fcitx; pushed PCM mode won't return 401 / -4 from server
Best Practices
- Bind with Context.BIND_AUTO_CREATE and try Pro → OSS in order.
- Store the sessionId returned by startSession / startPcmSession. Do not generate or reuse ids.
- Call cancelSession(sessionId) on focus/window changes.
- Use onPartial for live preview; after onFinal / onError, clean up and unbind.
- If you want the client to control recording (recommended for IME use), prefer pushed PCM mode.
