Add Foundry Local as a chat and embedding provider in the AI Chat Web template

### Summary

The AI Chat Web template (`dotnet new aichatweb`, `Microsoft.Extensions.AI.Templates`) lets
you pick a chat and embedding provider from Azure OpenAI, GitHub Models, OpenAI, and Ollama.
The only on-device option today is Ollama, a third-party runtime. This issue proposes adding
[Foundry Local](https://www.foundrylocal.ai/), Microsoft's own on-device inference runtime, as
a first-class provider for both the chat `IChatClient` and the embedding `IEmbeddingGenerator`.

### Why

- It gives the template a Microsoft-owned local inference story: no cloud, no API key, models
  run on the developer's machine on CPU, GPU, or NPU through ONNX Runtime.
- Foundry Local is OpenAI-compatible, so it drops into the same
  `OpenAIClient -> AsIChatClient() / AsIEmbeddingGenerator()` shape the template already uses
  for the OpenAI and GitHub Models providers. The change is small and well-contained.
- The template's provider list is the de facto answer to "what does .NET AI support locally."
  Adding Foundry Local closes the gap and gives the docs and RAG tutorials a reference
  integration to point at.
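
To make the "same shape" point concrete, here is a minimal sketch of the OpenAI-compatible wiring. It is not the template's actual code. `FoundryLocalClients` and `Create` are hypothetical names, and the endpoint and model IDs are assumed to be supplied by the caller, since Foundry Local decides them at runtime. It assumes the `OpenAI` and `Microsoft.Extensions.AI.OpenAI` packages are referenced.

```csharp
using System;
using System.ClientModel;
using Microsoft.Extensions.AI;
using OpenAI;

// Hypothetical helper (not part of the template): builds both abstractions
// from one OpenAI-compatible endpoint.
public static class FoundryLocalClients
{
    public static (IChatClient Chat, IEmbeddingGenerator<string, Embedding<float>> Embeddings) Create(
        Uri endpoint, string chatModelId, string embeddingModelId)
    {
        // The OpenAI client requires a non-empty key; a placeholder is used here
        // on the assumption that the local service ignores it (keyless on localhost).
        var client = new OpenAIClient(
            new ApiKeyCredential("unused"),
            new OpenAIClientOptions { Endpoint = endpoint });

        return (
            client.GetChatClient(chatModelId).AsIChatClient(),
            client.GetEmbeddingClient(embeddingModelId).AsIEmbeddingGenerator());
    }
}
```

A caller would pass the service URL reported by Foundry Local once it has started, plus the IDs of the loaded models, and register the two results in dependency injection the same way the existing OpenAI provider path does.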

### What it would look like

A new provider choice, for example `--provider foundrylocal`, that wires Foundry Local for
both chat and embeddings. Suggested defaults: `qwen3-4b` for chat (Apache-2.0, tool-calling
capable) and `qwen3-embedding-0.6b` for embeddings (1024-dimension vectors).

### Proof of concept

A runnable end-to-end sample is here:
https://github.com/luisquintanilla/foundry-local-aichatweb

It is the standard AI Chat Web app (Blazor, RAG over a local PDF, SQLite vector store) with the
provider swapped to Foundry Local. It builds and runs end to end, and includes a dev container
so you can try it in Codespaces.

An `aspire` branch shows what an Aspire orchestration could look like, using a small custom
`AddFoundryLocal` hosting integration (there is no Aspire hosting integration for Foundry Local
today): https://github.com/luisquintanilla/foundry-local-aichatweb/tree/aspire

### Notes for implementation

- SDK: `Microsoft.AI.Foundry.Local` 1.2.3 (stable, on nuget.org). The app uses the manager to
  start the local OpenAI-compatible web service and load models, then points the standard
  OpenAI client at it. Foundry Local is keyless on localhost.
- The embedding model returns 1024-dimension vectors, so `IngestedChunk.VectorDimensions`
  (currently hardcoded to 1536) needs to become per-provider.
- First run downloads the model weights; later runs start fast.
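
One possible way to express the per-provider dimension is sketched below. `EmbeddingDimensions` and `For` are hypothetical names, and the provider string `foundrylocal` is the value suggested earlier in this issue, not an existing option. Only the two numbers come from this issue: 1536 is the current hardcoded value and 1024 is what the proposed embedding model returns.

```csharp
using System;

// Hypothetical lookup (not existing template code): maps a provider name
// to the vector size its embedding model produces.
public static class EmbeddingDimensions
{
    // Value currently hardcoded in IngestedChunk.VectorDimensions.
    public const int Default = 1536;

    // Size returned by the proposed Foundry Local embedding model.
    public const int FoundryLocal = 1024;

    public static int For(string provider) =>
        string.Equals(provider, "foundrylocal", StringComparison.OrdinalIgnoreCase)
            ? FoundryLocal
            : Default;
}
```

If the dimension is consumed in an attribute argument, it must be a compile-time constant. In that case the choice may need to happen when the template is generated rather than through a runtime lookup like this one.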

### Scope

Non-Aspire first. The Aspire path is still under investigation: Foundry Local has no Aspire
hosting integration today (unlike Ollama's `AddOllama`), so the sample's `aspire` branch
(linked above) prototypes a custom `AddFoundryLocal` integration. We can decide how to bring
Aspire into the template once that prototype settles.

