Private beta · context-aware orchestration

AI that knows where to look.

Lumina routes every request through a transparent pipeline that chooses the right model, adds live web and knowledge context, applies memory when it helps, and streams answers you can inspect.

  • No credit card
  • Memory controls
  • Web + knowledge grounding
01
Model routing
Classify the task, then choose the path
02
Grounded context
Web, knowledge, and tool signals
03
Useful memory
Personal context with user controls

Model routing

Lumina chooses the best route for the task.

Instead of making you pick from a model menu, Lumina classifies the request and automatically routes it to the right path for speed, balanced quality, or deeper reasoning.

Lumina Fast
Short, lightweight tasks
Lumina Balanced
Everyday analysis and writing
Lumina Thinking
Reasoning-heavy work
Task classifier
Understands the request first
Web context
Fresh external signals
Knowledge retrieval
Your scoped documents
Memory context
Useful preferences and history
Tool execution
Approved actions when needed
Built so Lumina can improve routing without changing your workspace
Roadmap preview · Model Fusion

One prompt. Multiple perspectives. One fused answer.

Model Fusion is the next step in Lumina routing: for high-value tasks, Lumina will run a panel of model routes, compare their reasoning, and synthesize the strongest answer instead of stopping at the first response.

  • Deep research briefs
  • Strategic decision memos
  • Architecture and code reviews
  • Document-heavy analysis
Help shape Model Fusion

Fusion trace

Planned quality pass

In development
1
Prompt

One high-stakes request

2
Panel

Multiple Lumina model routes

3
Review

Compare strengths and disagreement

4
Fuse

One synthesized answer

Panel-based reasoning

Send important work through multiple Lumina routes instead of trusting a single model perspective.

Disagreement analysis

Surface where routes agree, where they conflict, and which evidence should influence the final answer.

Fused final answer

Synthesize the strongest reasoning, context, and sources into one cleaner response for decisions.

Quality controls

Reserve fusion for research, strategy, code review, and other tasks where extra quality is worth the cost.
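
The panel, review, and fuse steps above can be sketched in a few lines. This is a minimal illustration and not Lumina's implementation: Model Fusion is still on the roadmap, and the names `PanelAnswer`, `FusionResult`, `fuse`, and the self-reported `confidence` score are all hypothetical. The sketch swaps in a simple confidence-weighted vote where the real feature would compare reasoning and evidence.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PanelAnswer:
    route: str         # which model route produced the answer
    answer: str
    confidence: float  # hypothetical self-reported score in [0, 1]


@dataclass
class FusionResult:
    answer: str
    agreement: float         # share of the panel backing the final answer
    dissenting_routes: list  # routes that disagreed with it


def _normalize(text):
    # Treat answers as equal if they differ only in case or outer whitespace.
    return text.strip().lower()


def fuse(panel):
    """Pick the answer with the highest summed confidence and report dissent."""
    if not panel:
        raise ValueError("panel must contain at least one answer")

    # Review: group identical answers and add up their confidence.
    weights = defaultdict(float)
    for item in panel:
        weights[_normalize(item.answer)] += item.confidence
    winner = max(weights, key=weights.get)

    supporters = [p for p in panel if _normalize(p.answer) == winner]
    dissenters = [p.route for p in panel if _normalize(p.answer) != winner]

    # Fuse: return the most confident wording among the supporting routes.
    best = max(supporters, key=lambda p: p.confidence)
    return FusionResult(best.answer.strip(), len(supporters) / len(panel), dissenters)
```

A panel where two routes agree and one dissents would return the shared answer, an agreement of two thirds, and the dissenting route by name, which is the kind of signal a disagreement view could surface.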

Platform

Context beats another generic chatbot.

Lumina adds the orchestration layer between your request and the model: routing, web context, memory, knowledge retrieval, and tool execution.

Adaptive routing

Right model, right moment

Classify the task first, then route to the model path that fits the work.

  • Task-aware model selection
  • Lumina Fast, Balanced, and Thinking paths
  • Automatic route selection for each task
  • Streaming responses through one interface
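
To make "classify first, then route" concrete, here is a minimal sketch. It assumes a keyword-and-length heuristic purely for illustration; the page does not describe how Lumina's classifier works, and `classify`, `route`, `RoutingDecision`, and the hint list are hypothetical names. Only the three route labels come from the page.

```python
from dataclasses import dataclass

# Route labels as named on this page; everything else here is an assumption.
ROUTES = {
    "fast": "Lumina Fast",
    "balanced": "Lumina Balanced",
    "thinking": "Lumina Thinking",
}

# Hypothetical phrases that suggest reasoning-heavy work.
REASONING_HINTS = ("step by step", "trade-off", "root cause", "architecture")

REASONS = {
    "fast": "short prompt with no reasoning cues",
    "balanced": "everyday analysis or writing",
    "thinking": "prompt contains a reasoning cue",
}


@dataclass
class RoutingDecision:
    task_type: str
    route: str
    reason: str  # kept so the decision can be shown to the user


def classify(prompt):
    """Label the task before any model is called."""
    text = prompt.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return "thinking"
    if len(text.split()) <= 12:
        return "fast"
    return "balanced"


def route(prompt):
    """Map the classified task to a model path and record why."""
    task_type = classify(prompt)
    return RoutingDecision(task_type, ROUTES[task_type], REASONS[task_type])
```

Returning the reason alongside the route is the design choice worth noting: it is what lets a routing decision be displayed instead of hidden.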

Grounded context

Current answers, not stale guesses

Bring in live web and tool context when the request needs evidence.

  • Search routing across configured sources
  • Firecrawl-powered page extraction
  • Visible tool activity during the stream
  • Context only when it improves the answer

Memory controls

Personalization with boundaries

Use saved preferences and project context without hiding what is remembered.

  • Cross-session memory when enabled
  • Settings panel for stored memories
  • Delete controls for user context
  • No memory dependency for one-off questions

Knowledge retrieval

Answers from your own material

Retrieve scoped document and collection context before generating a response.

  • Collection-aware RAG flows
  • @collection mentions for targeted context
  • Naive and agentic retrieval paths
  • Document context alongside web signals
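
As an illustration of how @collection mentions could narrow retrieval, the sketch below pulls mention names out of a prompt and keeps only chunks from those collections. The mention syntax rules, the `extract_collections` and `scope_chunks` helpers, and the chunk dictionary shape are assumptions for this example, not Lumina's documented behavior.

```python
import re

# Assumed syntax: "@" at the start of a token, followed by letters, digits,
# "_" or "-". The lookbehind keeps email addresses from matching.
MENTION = re.compile(r"(?<!\S)@([A-Za-z0-9_-]+)")


def extract_collections(prompt):
    """Return mentioned collection names in order, without duplicates."""
    names = []
    for name in MENTION.findall(prompt):
        if name not in names:
            names.append(name)
    return names


def scope_chunks(prompt, chunks):
    """Keep only chunks from mentioned collections; keep all if none are mentioned."""
    wanted = extract_collections(prompt)
    if not wanted:
        return list(chunks)
    return [chunk for chunk in chunks if chunk["collection"] in wanted]
```

For example, a prompt that mentions `@q3-reports` would restrict retrieval to chunks whose `collection` field is `q3-reports`, while a prompt with no mentions leaves the candidate set untouched.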

Capabilities

A pipeline built for grounded answers.

Lumina is not another chat wrapper. It is an orchestration layer that decides what context to gather, which model to use, and how to make the response traceable.

Intent classification

Every request is categorized before execution so simple questions stay fast and complex work gets deeper handling.

Automatic model routing

Lumina chooses between fast, balanced, and reasoning-heavy model paths based on what the task needs.

Live web context

Search and page extraction tools add current information when the answer needs more than model memory.

Knowledge retrieval

Pull relevant document and collection context into the prompt for grounded answers from your own material.

Persistent memory

Remember preferences and project context across sessions, with settings that expose what has been stored.

Prompt engineering layer

Task-specific system prompts and context shaping keep responses structured, relevant, and easier to reuse.

Tool execution loop

Let the assistant call approved tools, incorporate results, and continue the answer without hiding the work.

Transparent trust posture

Model routing, memory use, and knowledge context are treated as product surfaces, not invisible magic.

Streaming by default

See progress as the answer forms.

Context-aware retrieval

Add web, memory, and knowledge only when useful.

Inspectable execution

Expose the stages that shaped the response.

Trust posture

Production-ready means no black boxes.

Credibility starts with product behavior teams can inspect: pipeline routing, memory, knowledge scope, and model selection come before badges or metrics.

Routing is explainable

Lumina is built around visible pipeline stages, so model selection and context gathering are not treated as a black box.

Memory stays user-facing

Saved memories are surfaced in settings, with controls to remove context that should no longer be used.

Knowledge is scoped

Document and collection retrieval is added only when it is relevant to the conversation and selected context.

Model routing is transparent

Lumina selects the model route automatically for each task, keeping routing behavior part of the product experience.

How it works

Nine stages underneath. Four moments you can follow.

The pipeline is technical by design, but the product experience stays simple: understand, route, ground, and answer.

1

Understand the request

Lumina classifies the task, identifies whether it needs current context, and prepares the right prompt shape.

2

Route the model

Fast, balanced, and reasoning-heavy paths are selected automatically based on what the task needs.

3

Ground the answer

Web search, memory, and knowledge retrieval are added only when they improve the response.

4

Stream the result

The assistant responds while tool activity and pipeline context stay visible enough to trust and debug.
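
The four moments above can be sketched as a pipeline that records one trace entry per stage, so the path to an answer stays inspectable. This is a simplified illustration under stated assumptions: the freshness word list, the `plan_context` and `run_pipeline` helpers, and the fixed "balanced" route are hypothetical, and the real pipeline has more stages than shown.

```python
import re

# Hypothetical cues that a prompt needs current information.
FRESHNESS_WORDS = {"today", "latest", "current", "news"}


def plan_context(prompt, memory_enabled=False, collections=()):
    """Decide which context sources to attach; add each only when it applies."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    sources = []
    if words & FRESHNESS_WORDS:
        sources.append("web")
    if memory_enabled:
        sources.append("memory")
    if collections:
        sources.append("knowledge")
    return sources


def run_pipeline(prompt, memory_enabled=False, collections=()):
    """Walk the four stages and return a (stage, detail) trace."""
    trace = []
    trace.append(("understand", f"{len(prompt.split())} words"))
    # Placeholder: model routing is out of scope for this sketch.
    trace.append(("route", "balanced"))
    sources = plan_context(prompt, memory_enabled, collections)
    trace.append(("ground", ", ".join(sources) or "none"))
    trace.append(("stream", "ready"))
    return trace
```

The returned trace is a plain list, which makes it easy to render next to a streamed answer or to log for debugging.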

Bring context into every answer.

Join the private beta to help shape an AI assistant that searches when it should, remembers what matters, retrieves your knowledge, and routes work through the right model.

Questions before you join?

Straight answers about the beta, the pipeline, and what Lumina is built to do.

Lumina is a context-aware AI assistant for chat, research, and decision work. It routes each request through a pipeline that can classify the task, choose a model, add web or knowledge context, apply memory, and execute approved tools.

Most chatbots send your prompt straight to one model. Lumina adds orchestration first: task classification, model routing, context retrieval, prompt shaping, optional thinking, and visible tool execution so answers are more grounded and easier to inspect.

Lumina automatically routes each request to the best model path for the task, whether it needs a fast answer, a balanced response, or deeper reasoning.

Model Fusion is a roadmap feature for high-value work. The goal is to let Lumina run multiple model routes, compare their reasoning, detect disagreement, and synthesize one stronger final answer.

Depending on the request and configuration, Lumina can use live web search, scraped page content, saved user memories, and relevant document or collection chunks from the knowledge layer.

Yes. Memory is designed to be user-facing: saved memories can be reviewed from settings, individual memories can be deleted, and memory can be disabled by configuration when personalization is not needed.

Lumina includes a knowledge retrieval layer for document and collection context. The goal is to ground answers in the material you choose instead of relying only on the model's general training.

The beta focuses on AI chat quality: model routing, memory, knowledge retrieval, web context, and tool execution. The waitlist helps us invite users in waves while those surfaces are hardened.

Lumina is in private beta. Join the waitlist to request early access; invites are sent in waves while the hosted product is hardened for broader availability.

Private beta · Inviting in waves

Request access to Lumina

Join the waitlist for a context-aware AI assistant that can route models, search the web, remember useful context, retrieve your knowledge, and shape the Model Fusion roadmap.

Join the private beta

No credit card. We’ll email when your invite is ready.

  • Free to request
  • No spam, ever

Private Beta Access

Invites roll out in waves while the hosted product is hardened.

Product Updates

Get concise notes as routing, memory, knowledge, and Model Fusion features ship.

No Commitment

Free to request access. No credit card or sales call required.