Choose your model
All models run locally in your browser. No data leaves your device.
Three paths to in-browser AI
WebLLM + MLC
Models are compiled ahead of time with Apache TVM / MLC. The compiler lowers the model's computation into optimized WebGPU compute shaders that run directly on your GPU.
SmolLM2 1.7B, Phi-3.5, Llama 3.2
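A minimal sketch of the WebLLM path, using the library's OpenAI-compatible chat API. The exact model ID string is an assumption for illustration; check the WebLLM prebuilt model list for the IDs your build ships.

```javascript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Downloads and compiles the model to WebGPU shaders on first load,
// then caches the artifacts in the browser.
// NOTE: model ID below is illustrative; use an ID from the prebuilt list.
const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (p) => console.log(p.text),
});

// OpenAI-style chat completion, running entirely on the local GPU.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Summarize WebGPU in one sentence." }],
});
console.log(reply.choices[0].message.content);
```

This requires a WebGPU-capable browser; there is no WebAssembly fallback on this path.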
Transformers.js + ONNX
ONNX Runtime Web interprets the model graph and executes it on WebGPU, falling back to WebAssembly when WebGPU is unavailable.
Qwen3.5 0.8B, 2B, 4B
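A minimal sketch of the Transformers.js path via its `pipeline` helper. The repository name is an assumption for illustration; substitute the ONNX-converted model you actually serve.

```javascript
import { pipeline } from "@huggingface/transformers";

// device: "webgpu" asks ONNX Runtime Web for the WebGPU backend;
// omit it (or use "wasm") to run on the WebAssembly fallback.
// NOTE: model repo below is illustrative.
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-0.5B-Instruct",
  { device: "webgpu" }
);

const out = await generator("Why run models in the browser?", {
  max_new_tokens: 64,
});
console.log(out[0].generated_text);
```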
MediaPipe + LiteRT
Google's MediaPipe LLM Inference API runs Gemma models and supports multimodal input (text and images).
Gemma 3n E2B, Gemma 3n E4B
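A minimal sketch of the MediaPipe path using the `@mediapipe/tasks-genai` package. The WASM CDN URL and the `.task` model path are assumptions for illustration; point them at wherever you host the runtime and the Gemma bundle.

```javascript
import { FilesetResolver, LlmInference } from "@mediapipe/tasks-genai";

// Resolve the WASM runtime files (URL is illustrative; self-host if preferred).
const genaiFiles = await FilesetResolver.forGenAiTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm"
);

// Load a Gemma model bundle (.task path is illustrative).
const llm = await LlmInference.createFromOptions(genaiFiles, {
  baseOptions: { modelAssetPath: "/models/gemma-3n-e2b.task" },
  maxTokens: 512,
});

const answer = await llm.generateResponse("Why is the sky blue?");
console.log(answer);
```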
System Prompt
Generation
Knowledge Base
By using ThinkHere you agree to our Terms of Use and Privacy Policy · A Qanata Labs product