BetterKhmer is a text normalization tool for the Khmer (Cambodian) language. It solves a subtle but important problem: the same Khmer word can be typed in multiple ways that look identical on screen but are stored as different character sequences. This causes search engines to return wrong results, allows malicious websites to disguise themselves, and makes code review unreliable. BetterKhmer converts all of these differently encoded versions into one correct, consistent form. The tool has been ported to 18 programming languages, and all ports produce identical results, verified against over 10,000 test cases from real Khmer text.
How It Works
The problem: searching for words in Khmer gives inconsistent results, or some text looks wrong in certain apps.
The cause: the same Khmer word can be stored in multiple ways that look identical on screen but have different byte sequences underneath.
The solution: a tool that converts all those differently encoded versions into one consistent, correct form that works everywhere.
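The storage difference described above can be seen directly in a few lines of Python. This is a minimal sketch and not part of BetterKhmer: the two sample strings are an illustrative assumption, chosen because many fonts draw the vowel sign OO (U+17C4) much like the pair E (U+17C1) followed by AA (U+17B6). The helper name `describe` is hypothetical.

```python
# Two code-point sequences that many fonts render alike.
# Escapes are used so the difference is visible in the source.
single = "\u1780\u17C4"        # KA + VOWEL SIGN OO
split = "\u1780\u17C1\u17B6"   # KA + VOWEL SIGN E + VOWEL SIGN AA


def describe(text: str) -> list[str]:
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# A plain string comparison sees two different words.
same = single == split

# The UTF-8 encodings differ in both content and length.
single_bytes = single.encode("utf-8")
split_bytes = split.encode("utf-8")

print(describe(single), len(single_bytes))
print(describe(split), len(split_bytes))
print("equal:", same)
```

Because the comparison is false, a search index or an equality check that works on raw strings treats the two spellings as unrelated words.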
The 18 ports cover four groups of languages:
Popular scripting languages for web and automation
Fast compiled languages for performance-critical apps
Enterprise languages for large business applications
Systems languages for maximum control and speed
No package managers or complicated setup: just grab the single source file for your language and add it to your project.
Pass any Khmer text through the function and get back a perfectly standardized version that renders correctly everywhere.
Search finds the right results, comparisons work reliably, and your app displays Khmer text correctly for everyone.
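The effect on search and comparison can be sketched as follows. BetterKhmer's actual function name and rule set are not given here, so `toy_normalize` and `TOY_RULES` are hypothetical stand-ins with a single illustrative rule; in a real project the call would go to the normalizer from the source file for your language. The idea is only that both stored text and queries pass through the same normalizer before they are compared.

```python
# Hypothetical stand-in for a real normalizer. The single rule below is an
# illustrative assumption, not BetterKhmer's rule set.
TOY_RULES = {
    "\u17C1\u17B6": "\u17C4",  # VOWEL SIGN E + AA -> VOWEL SIGN OO
}


def toy_normalize(text: str) -> str:
    """Rewrite each known variant sequence to its chosen canonical form."""
    for variant, canonical in TOY_RULES.items():
        text = text.replace(variant, canonical)
    return text


def build_index(words: list[str]) -> set[str]:
    """Store only normalized forms so every spelling maps to one key."""
    return {toy_normalize(word) for word in words}


def contains(index: set[str], query: str) -> bool:
    """Normalize the query the same way before looking it up."""
    return toy_normalize(query) in index


stored = "\u1780\u17C4"         # indexed spelling
typed = "\u1780\u17C1\u17B6"    # spelling entered by a user

index = build_index([stored])
raw_match = typed in {stored}         # lookup without normalization
normalized_match = contains(index, typed)

print("raw lookup:", raw_match)
print("normalized lookup:", normalized_match)
```

Normalizing at both index time and query time is the design choice that matters: normalizing only one side leaves the mismatch in place.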