Description of App
Run private AI on iPhone, Mac, and visionOS with local models, multimodal support, built-in tools, RAG, and optional remote endpoints.
Noema brings large language model intelligence to all your devices, fully offline. Download lightweight models directly from Hugging Face, connect supported remote endpoints, and pair models with curated textbooks and your own PDFs or EPUBs. The privacy-first design means your data never leaves your device when running locally, whether you are on iPhone, Mac, or visionOS.
- Native macOS app: Run the full Noema experience on your desktop with a rebuilt interface that feels at home on macOS.
- visionOS support: Use Noema in spatial computing environments, with windows you can place around your workspace.
- Noema Relay: Connect your iPhone to your Mac via CloudKit, with no local Wi-Fi required, so one device can host a model while another becomes the client.
- Vision support for models: Attach photos to your prompts and use multimodal models for on-device image understanding and analysis.
- Open Textbook Library integration: Browse and import entire textbooks through the built-in Explore view; Noema indexes them locally so you can search and retrieve relevant passages on demand.
- Bring your own data: Add personal documents in PDF or EPUB formats, which are embedded and indexed on-device to power retrieval-augmented generation.
- Integrated Hugging Face search: Discover and install quantized models from the Hugging Face Hub with one-tap installation, automatic dependency management, and real-time download progress.
- Remote model support: Connect to supported remote endpoints including OpenRouter and LM Studio, with updated LM Studio REST v1 compatibility and a smoother model download flow through Explore.
- Expanded model runtime support: Run models across GGUF, MLX, ExecuTorch, Core ML, and Apple Foundation Models, giving you flexible on-device options across Apple hardware.
- RAM check and model size helper: A built-in advisor estimates each model's memory footprint and shows when it fits your device's budget; it can also estimate the maximum context length that fits in RAM.
- Advanced settings for power users: Fine-tune context length, quantization, and GPU acceleration; enable tool calling for built-in search and other functions; and customize model parameters for optimal performance.
- Built-in tool calling and Python support: Use integrated tools, including Python, to extend model capabilities for more advanced workflows.
- Built-in search and RAG: Use integrated search tools and retrieval-augmented generation to query your data without hitting context limits.
- Localization upgrades: Experience Noema in 10 languages, so international teams can work in the interface that suits them best.
- Private and offline by default: Local models run entirely on-device, and your conversations and files stay on your device unless you choose to use a connected remote provider.
Accessibility Comments