Genkit Ollama Sample

This sample demonstrates integration with local Ollama models using Genkit Java.

Features Demonstrated

Ollama Plugin Setup - Configure Genkit with local Ollama models
Flow Definitions - Create observable, traceable AI workflows
Text Generation - Generate text with Gemma 3n
Streaming - Real-time response streaming
Code Generation - Generate code
Creative Writing - Generate creative content
Translation & Summarization - Language tasks

Model

This sample uses gemma3n:e4b, Google's Gemma 3n E4B model (the "E" stands for effective parameters).

Prerequisites

Java 21+
Maven 3.6+
Ollama installed and running

Installing Ollama

Download and install Ollama from https://ollama.ai
Pull the model used by this sample:

# Pull the model
ollama pull gemma3n:e4b

Verify that Ollama is running and that the model was pulled:

ollama list
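
The same check can be made from Java before starting the sample. The sketch below is not part of the sample: it assumes Ollama listens on its default address `http://localhost:11434` and calls Ollama's `GET /api/tags` endpoint, which lists locally available models, using the JDK's built-in HTTP client. The class name `OllamaCheck` and its methods are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; not part of the sample.
final class OllamaCheck {

    // Builds the URI of Ollama's model-listing endpoint, tolerating a trailing slash.
    static URI tagsUri(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/api/tags");
    }

    // Returns true if the server answers GET /api/tags with HTTP 200.
    static boolean isReachable(String baseUrl) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(tagsUri(baseUrl))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false; // connection refused, timeout, and similar failures
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        String baseUrl = "http://localhost:11434"; // assumed default address
        System.out.println(isReachable(baseUrl)
                ? "Ollama is reachable at " + baseUrl
                : "Ollama is not reachable at " + baseUrl);
    }
}
```

A `false` result usually means the Ollama server has not been started or is listening on a different address.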

Running the Sample

Option 1: Direct Run

# Navigate to the sample directory
cd java/samples/ollama

# Run the sample
./run.sh
# Or: mvn compile exec:java

Option 2: With Genkit Dev UI (Recommended)

# Navigate to the sample directory
cd java/samples/ollama

# Run with Genkit CLI
genkit start -- ./run.sh

The Dev UI will be available at http://localhost:4000

Custom Ollama Host

If Ollama is running on a different host:

export OLLAMA_HOST=http://your-ollama-server:11434
./run.sh
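
An application that wants to honour the same variable in its own code could read it and fall back to the local default. This is a minimal sketch of that pattern, not the plugin's actual lookup logic; `OllamaHost` and `resolve` are hypothetical names.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of the sample.
final class OllamaHost {

    // Default address used when OLLAMA_HOST is unset or blank.
    static final String DEFAULT_BASE_URL = "http://localhost:11434";

    // Returns the trimmed environment value, or the default if it is null or blank.
    static String resolve(String envValue) {
        return Optional.ofNullable(envValue)
                .map(String::trim)
                .filter(value -> !value.isEmpty())
                .orElse(DEFAULT_BASE_URL);
    }

    public static void main(String[] args) {
        System.out.println("Using Ollama at " + resolve(System.getenv("OLLAMA_HOST")));
    }
}
```

The resolved value could then be passed to the `baseUrl(...)` option shown in the Configuration section.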

Available Flows

Flow	Input	Output	Description
`greeting`	String (name)	String	Simple greeting flow
`chat`	String (message)	String	Chat with Gemma
`tellJoke`	String (topic)	String	Generate a joke
`streamingChat`	String (message)	String	Streaming chat
`generateCode`	String (prompt)	String	Code generation (streaming)
`quickAnswer`	String (question)	String	Fast, brief answers
`creativeWriting`	String (prompt)	String	Creative writing (streaming)
`translate`	String (text)	String	Translate to Spanish
`summarize`	String (text)	String	Text summarization

Example API Calls

Once the server is running on port 8080:

Simple Greeting

curl -X POST http://localhost:8080/api/flows/greeting \
  -H 'Content-Type: application/json' \
  -d '"World"'

Chat

curl -X POST http://localhost:8080/api/flows/chat \
  -H 'Content-Type: application/json' \
  -d '"What is the capital of France?"'

Generate a Joke

curl -X POST http://localhost:8080/api/flows/tellJoke \
  -H 'Content-Type: application/json' \
  -d '"programming"'

Streaming Chat

curl -X POST http://localhost:8080/api/flows/streamingChat \
  -H 'Content-Type: application/json' \
  -d '"Explain quantum computing"'

Code Generation

curl -X POST http://localhost:8080/api/flows/generateCode \
  -H 'Content-Type: application/json' \
  -d '"Write a Python function to find prime numbers up to n"'

Quick Answer

curl -X POST http://localhost:8080/api/flows/quickAnswer \
  -H 'Content-Type: application/json' \
  -d '"What is 2+2?"'

Creative Writing

curl -X POST http://localhost:8080/api/flows/creativeWriting \
  -H 'Content-Type: application/json' \
  -d '"Write a short story about a robot learning to paint"'

Translation

curl -X POST http://localhost:8080/api/flows/translate \
  -H 'Content-Type: application/json' \
  -d '"Hello, how are you?"'

Summarization

curl -X POST http://localhost:8080/api/flows/summarize \
  -H 'Content-Type: application/json' \
  -d '"The quick brown fox jumps over the lazy dog. This sentence contains every letter of the English alphabet."'
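
The curl calls above can also be made from Java. The sketch below uses only the JDK's HTTP client and mirrors the request shape of the curl examples: a POST to `/api/flows/<flow>` with a JSON string as the body. The class `FlowClient` and its methods are hypothetical, and the response is printed as raw text because its exact shape is not described here.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for illustration; not part of the sample.
final class FlowClient {

    // Encodes plain text as a JSON string literal, escaping quotes,
    // backslashes, and control characters.
    static String toJsonString(String text) {
        StringBuilder json = new StringBuilder("\"");
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"' -> json.append("\\\"");
                case '\\' -> json.append("\\\\");
                case '\n' -> json.append("\\n");
                case '\r' -> json.append("\\r");
                case '\t' -> json.append("\\t");
                default -> {
                    if (c < 0x20) {
                        json.append(String.format("\\u%04x", (int) c));
                    } else {
                        json.append(c);
                    }
                }
            }
        }
        return json.append('"').toString();
    }

    // Builds the same request as the curl examples: POST with a JSON string body.
    static HttpRequest buildRequest(String baseUrl, String flowName, String input) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/flows/" + flowName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJsonString(input)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires the sample server to be running on port 8080.
        HttpRequest request = buildRequest("http://localhost:8080", "greeting", "World");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Swapping the flow name and input, for example `buildRequest("http://localhost:8080", "translate", "Hello, how are you?")`, covers the other flows in the table.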

Configuration

The Ollama plugin can be configured with the following options:

OllamaPlugin plugin = new OllamaPlugin(
    OllamaPluginOptions.builder()
        .baseUrl("http://localhost:11434")  // Or use OLLAMA_HOST env var
        .timeout(300)                        // Request timeout in seconds
        .models("gemma3n:e4b")               // Model to register
        .build()
);

Features

Streaming

Stream responses for real-time output:

genkit.generateStream(
    GenerateOptions.builder()
        .model("ollama/gemma3n:e4b")
        .prompt("Tell me a story")
        .build(),
    (chunk) -> {
        System.out.print(chunk.getText());
    });

JSON Output

Request JSON-formatted responses:

genkit.generate(
    GenerateOptions.builder()
        .model("ollama/gemma3n:e4b")
        .prompt("List 3 colors as JSON")
        .output(OutputConfig.builder()
            .format(OutputFormat.JSON)
            .build())
        .build());

Performance Tips

Use GPU acceleration - Ollama uses a supported GPU automatically when one is available
Reduce the context window - A smaller context generally means faster responses
Use streaming - Better UX for longer responses

Troubleshooting

"Connection refused" error

Start the Ollama server if it is not already running:

ollama serve

"Model not found" error

Pull the required model:

ollama pull gemma3n:e4b
