Mohd-Mursaleen

A native Android application, written in Kotlin, that runs Google's Gemma 3n E2B multimodal LLM entirely on-device via Google's LiteRT-LM SDK.

Found Apr 21, 2026 at 11 stars.
AI Summary

An Android app that runs a multimodal AI model locally for text chat, image analysis, and serving responses to other apps on the device.

How It Works

1
📱 Discover the app

You hear about a handy Android app that lets you run a smart AI helper right on your phone without internet.

2
🔽 Install on your phone

Download and install the app on your Android device to get started.

3
⬇️ Get the AI brain

Open the app and tap to download the big AI knowledge file that makes it smart (it shows progress so you know it's working).

4
⏳ Wait for setup

The app loads the AI into your phone's memory, which takes a minute or two the first time.

5
💬 Start chatting

Switch to the chat tab and type messages to talk with your private AI assistant as it responds in real time.

6
👁️ Analyze photos

Pick a picture from your gallery, add a question like 'what's in this?', and get a detailed description instantly.

7
🔄 Share with other apps

Turn on the sharing option so programs on your phone can ask the AI questions too.

Private AI ready

Now you have a fast, offline AI for chats and image insights anytime, all on your phone.

AI-Generated Review

What is LiteRT-Server?

LiteRT-Server is a complete native Android app in Kotlin that downloads and runs Google's Gemma 3n E2B multimodal LLM entirely on-device using the LiteRT-LM SDK. It solves the problem of offline AI inference by spinning up a local HTTP server on port 8080 with OpenAI-compatible endpoints for chat and vision analysis: send text prompts or image paths via curl and get streaming responses back. Users get an in-app chat UI, an image analyzer, server controls, and model management, all tested on mid-range hardware like the Snapdragon 845.
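The OpenAI-compatible chat endpoint described above can be exercised from any HTTP client. A minimal Python sketch follows; the /chat path and port come from the review, but the payload field names are assumptions (the review only says the API mimics OpenAI), so check the in-app curl examples for the exact shape:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # the app's local server port, per the description

def build_chat_request(prompt: str, stream: bool = True) -> bytes:
    """Build an OpenAI-style chat payload.

    Field names here are assumptions: the review only says the
    endpoints 'mimic OpenAI', so verify against the app's own examples.
    """
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }).encode("utf-8")

def post_chat(prompt: str) -> str:
    """POST the prompt to the assumed /chat endpoint and return the raw body.

    Requires the app's server to be running on the device.
    """
    req = urllib.request.Request(
        BASE_URL + "/chat",
        data=build_chat_request(prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

From Termux this would run directly against localhost; from a desktop you would first forward the port, e.g. with `adb forward tcp:8080 tcp:8080`.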

Why is it gaining traction?

It stands out by packing GPU-accelerated multimodal LLM serving into a single APK, with resumable 2.5GB model downloads from Hugging Face, automatic CPU fallback, and a battery-optimized foreground service for Android 15. Developers like the zero-setup API (health checks, /chat, /vision, /reset) that mimics OpenAI, plus request logging and curl examples right in the app, with no cloud dependency for local experiments. As a complete GitHub project for on-device AI, it's a quick win over heavier frameworks.
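As a companion sketch for the /vision route listed above: the route name comes from the review, while the field names (image_path, prompt) are hypothetical, since the review only says the endpoint accepts an image path plus a question.

```python
import json
import urllib.request

SERVER = "http://localhost:8080"  # local server port, per the description

def build_vision_request(image_path: str, question: str) -> bytes:
    """Build a payload for the assumed /vision endpoint.

    The review says the endpoint takes image paths plus a prompt;
    these exact field names are assumptions.
    """
    return json.dumps({
        "image_path": image_path,
        "prompt": question,
    }).encode("utf-8")

def analyze_image(image_path: str, question: str) -> str:
    """POST to the assumed /vision route; needs the app's server running."""
    req = urllib.request.Request(
        SERVER + "/vision",
        data=build_vision_request(image_path, question),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Note that the image path must be one the app itself can read on-device (e.g. somewhere under shared storage), since the server resolves it locally rather than receiving the image bytes.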

Who should use this?

Android devs prototyping local AI features like chatbots or image captioning without server costs; ML tinkerers wanting to test Gemma models offline on a phone; and Termux users scripting against a localhost LLM server. Ideal for mobile hackers who want full native control in edge-AI setups.

Verdict

Worth forking for local LLM proofs-of-concept on Android, but at 11 stars it's early-stage: the docs are solid via the README and APK link, but expect tweaks for production. Grab it if you're into local LLM server experiments.
