Bekur listens to a real consultation — Amharic, English, and the code-switching in between — and writes the structured clinical note automatically. Here is how the system works, how we measure it, and how we keep patient data safe.
A consultation is messy — two voices, two languages, interruptions, and clinical shorthand. Bekur turns that raw audio into a structured note in four stages, each one tuned for Ethiopian clinical reality.
Bekur records the consultation through the clinician's device. Audio is encrypted and processed securely.
Speech recognition transcribes Amharic and English in real time, tuned on Ethiopian accents, names, and the spoken patterns of a real clinic — not Western datasets.
Doctors mix Amharic and English mid-sentence. Our models follow the switch word by word, so a phrase that starts in Amharic and ends with an English drug name still resolves correctly.
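To make "word by word" concrete, here is a minimal text-level sketch that tags each token of a mixed utterance by its script. It is an illustration only and not Bekur's model, which works on audio. The function names, the tag labels ("am", "en", "other"), and the sample sentence are all assumptions made for this example.

```python
# Illustrative sketch: tag tokens by Unicode script.
# This is NOT Bekur's recognizer; names and labels are hypothetical.

# Unicode blocks that hold Ethiopic (Ge'ez script) characters.
ETHIOPIC_RANGES = (
    (0x1200, 0x137F),  # Ethiopic
    (0x1380, 0x139F),  # Ethiopic Supplement
    (0x2D80, 0x2DDF),  # Ethiopic Extended
    (0xAB00, 0xAB2F),  # Ethiopic Extended-A
)


def is_ethiopic(ch):
    """Return True if the character falls in an Ethiopic block."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ETHIOPIC_RANGES)


def tag_token(token):
    """Tag one whitespace-separated token as 'am', 'en', or 'other'."""
    if any(is_ethiopic(ch) for ch in token):
        return "am"
    if any(ch.isascii() and ch.isalpha() for ch in token):
        return "en"
    return "other"  # digits, punctuation, and anything else


def tag_utterance(text):
    """Return (token, tag) pairs for a mixed Amharic/English string."""
    return [(tok, tag_token(tok)) for tok in text.split()]


# An Amharic sentence that switches to an English drug name and unit.
tags = tag_utterance("ለታካሚው amoxicillin 500 mg ስጥ")
```

Script detection only works on written text, where Amharic and English use different alphabets. Following the switch in speech is a harder problem that this sketch does not attempt.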
A clinical language model maps the conversation into a structured SOAP note — Subjective, Objective, Assessment, Plan — ready for the clinician to review, edit, and sign.
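As a rough picture of what the final stage hands to the clinician, the sketch below models a SOAP note as a small data structure that stays a draft until someone signs it. The class name, field names, and sample text are assumptions for illustration; the actual note schema is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoapNote:
    """A drafted SOAP note. Hypothetical shape, for illustration only."""
    subjective: str
    objective: str
    assessment: str
    plan: str
    signed_by: Optional[str] = None  # stays None until a clinician signs

    @property
    def is_final(self) -> bool:
        """A note counts as final only once a clinician has signed it."""
        return self.signed_by is not None

    def sign(self, clinician: str) -> None:
        """Record the reviewing clinician; reject an empty name."""
        if not clinician.strip():
            raise ValueError("a signing clinician is required")
        self.signed_by = clinician


# Made-up example content, not real patient data.
draft = SoapNote(
    subjective="Cough and fever for three days.",
    objective="Temperature 38.4 C.",
    assessment="Suspected lower respiratory tract infection.",
    plan="Amoxicillin 500 mg three times daily; review in 5 days.",
)
was_final_before = draft.is_final
draft.sign("Dr. Example")
```

Keeping the signature on the note itself means a draft cannot be mistaken for a finished record.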
A medical scribe is only useful if clinicians trust it. We treat accuracy as an engineering discipline — measured against clinician-reviewed ground truth, on real Ethiopian consultations, in both languages.
Amharic and English are both recognized natively, including mid-sentence code-switching.
Every generated note is reviewed and signed by a clinician before it enters the record.
The share of a doctor's day documentation can consume — the time Bekur is built to give back.
We benchmark transcription and structuring against notes written by Ethiopian clinicians, so our metrics reflect the work as it is actually done.
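One widely used way to score a transcript against a reviewed reference is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the reference into the hypothesis, divided by the number of reference words. The text above does not say which metrics Bekur reports, so treat this as a sketch of the general idea; the function name and the sample sentences are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# The hypothesis drops one of six reference words ("twice").
wer = word_error_rate(
    "give amoxicillin 500 mg twice daily",
    "give amoxicillin 500 mg daily",
)
```

Whitespace splitting is a simplification. For a heavily inflected language like Amharic, a character-level or morpheme-level variant may be more informative, since one wrong suffix otherwise counts as a whole wrong word.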
Bekur drafts; the clinician decides. Nothing is finalized without review, and every edit becomes a signal that makes the next note better.
When the model is unsure — an unfamiliar drug, a noisy room — it flags the uncertainty rather than inventing detail. We optimize against confident mistakes.
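A simple way to picture flagging is a confidence threshold: any word the recognizer scores below the cutoff is marked for the clinician's attention instead of being written as if it were certain. The sketch below assumes per-word confidence scores between 0 and 1, a hypothetical cutoff of 0.80, and a made-up `[?word]` marker; none of these details come from Bekur itself.

```python
def flag_uncertain(words, threshold=0.80):
    """Join words into text, marking low-confidence ones as [?word].

    `words` is a list of (word, confidence) pairs. The threshold and
    the marker format are illustrative assumptions.
    """
    rendered = []
    for word, confidence in words:
        if confidence < threshold:
            rendered.append(f"[?{word}]")  # surface the doubt to the reviewer
        else:
            rendered.append(word)
    return " ".join(rendered)


# Made-up scores: the drug name was heard poorly.
line = flag_uncertain([
    ("start", 0.98),
    ("metformin", 0.55),
    ("500", 0.95),
    ("mg", 0.97),
])
```

Here `line` is `start [?metformin] 500 mg`, so the reviewer sees exactly which word needs checking.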
English medical AI rides on decades of digitized notes, open datasets, and billions of labeled words. Amharic has almost none of that. Building a clinical scribe for Ethiopia means doing the harder science: making models that learn from far less, in a language with rich morphology and a non-Latin script.
Amharic words inflect heavily — a single root can take dozens of forms — and clinical speech blends it with English drug names, lab terms, and abbreviations. Off-the-shelf models trained on Western data simply break here. So we build the data, the evaluation, and the models for the language as it is actually spoken in an Ethiopian consulting room.
This is the work: collecting and respecting consented clinical speech, tuning recognition for local accents and code-switching, and teaching a language model to structure messy bilingual dialogue into a note a doctor would sign. We build in the open because credible medical AI for low-resource languages should be earned in public.
“If technology only works in the languages with the most data, it will never reach the patients who need it most. We build for Amharic first, on purpose.”
Little labeled Amharic clinical speech exists. We build consented datasets carefully, never scraping patient conversations without permission.
One Amharic root takes many forms. Recognition and structuring must handle inflection that Western tokenizers were never designed for.
Clinical Amharic is interleaved with English terms. Models must follow the switch fluidly rather than treating one language as noise.
A wrong drug or dose is unacceptable. We optimize against confident errors and keep a clinician in the loop on every note.
What clinic and practice leaders ask before bringing Bekur to their team.
Bekur listens in Amharic and English and writes the clinical note automatically. Join the waitlist and we'll show you how it works on your own consultations.