This sample demonstrates how to use Pinecone as a vector database for RAG (Retrieval Augmented Generation) workflows with Genkit.
- Pinecone Account: Sign up at pinecone.io
- Pinecone API Key: Get from the Pinecone console
- Google GenAI API Key: For embeddings and LLM
| Variable | Required | Default | Description |
|---|---|---|---|
| GEMINI_API_KEY | Yes | - | Google GenAI API key |
| PINECONE_API_KEY | Yes | - | Pinecone API key |
| PINECONE_INDEX_NAME | No | genkit-films | Name of the Pinecone index |
| PINECONE_CLOUD | No | aws | Cloud provider (aws, gcp, azure) |
| PINECONE_REGION | No | us-east-1 | Region for the index |
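
As an illustration of how these variables and their defaults could be resolved in Java, here is a minimal sketch. The class `SampleEnv` and its methods are hypothetical names introduced for this example; they are not part of the sample or of Genkit, and the actual sample may read its configuration differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of the sample) that applies the documented defaults. */
public final class SampleEnv {
    private SampleEnv() {}

    /** Returns the variable's value, or the given default when it is unset or blank. */
    public static String orDefault(Map<String, String> env, String name, String defaultValue) {
        String value = env.get(name);
        return (value == null || value.isBlank()) ? defaultValue : value;
    }

    /** Lists the required variables from the table that are unset or blank. */
    public static List<String> missingRequired(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : List.of("GEMINI_API_KEY", "PINECONE_API_KEY")) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        // Optional settings fall back to the defaults listed in the table.
        System.out.println("index  = " + orDefault(env, "PINECONE_INDEX_NAME", "genkit-films"));
        System.out.println("cloud  = " + orDefault(env, "PINECONE_CLOUD", "aws"));
        System.out.println("region = " + orDefault(env, "PINECONE_REGION", "us-east-1"));
        // Report missing keys up front instead of failing on the first API call.
        System.out.println("missing required variables: " + missingRequired(env));
    }
}
```

Checking the required keys before any network call gives a clearer error than a failed request to either service.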
- Copy the example environment file:

      cp .env.example .env

- Edit .env and set your API keys:

      GEMINI_API_KEY=your-gemini-api-key
      PINECONE_API_KEY=your-pinecone-api-key

- Run the sample:

      ./run.sh

# Set required environment variables
export GEMINI_API_KEY="your-gemini-api-key"
export PINECONE_API_KEY="your-pinecone-api-key"
# Optional: Custom index name
# export PINECONE_INDEX_NAME="my-custom-index"
# Run the sample
./run.sh

Or with Maven directly:

mvn exec:java -Dexec.mainClass="com.google.genkit.samples.pinecone.PineconeRAGSample"

The sample is configured to NOT auto-create indexes by default. You have two options:
- Go to the Pinecone Console
- Create a new index with:
  - Name: genkit-films (or your custom name)
  - Dimensions: 768
  - Metric: cosine
  - Cloud: Your preferred cloud provider

Modify the sample code to set createIndexIfNotExists(true):
.addIndex(PineconeIndexConfig.builder()
.indexName(indexName)
.createIndexIfNotExists(true) // Enable auto-creation
// ...
.build())

Indexes sample documents about famous films into Pinecone.
curl -X POST http://localhost:4000/api/flows/indexDocuments \
-H 'Content-Type: application/json' \
-d '{}'

Retrieves documents matching a semantic query.
curl -X POST http://localhost:4000/api/flows/retrieveDocuments \
-H 'Content-Type: application/json' \
-d '{"data": "sci-fi movies"}'

Answers questions using RAG with retrieved context.
curl -X POST http://localhost:4000/api/flows/ragQuery \
-H 'Content-Type: application/json' \
-d '{"data": "What Christopher Nolan films are mentioned?"}'

- Create an index in Pinecone (see Index Setup above)
- Start the sample application
- Index the sample documents:
curl -X POST http://localhost:4000/api/flows/indexDocuments -H 'Content-Type: application/json' -d '{}'
- Query for relevant documents:
curl -X POST http://localhost:4000/api/flows/retrieveDocuments -H 'Content-Type: application/json' -d '{"data": "movies about dreams"}'
- Ask questions using RAG:
curl -X POST http://localhost:4000/api/flows/ragQuery -H 'Content-Type: application/json' -d '{"data": "Which films were directed by Christopher Nolan and what are they about?"}'
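
The same calls can be made from Java with the JDK's built-in `java.net.http` client (Java 11+). The sketch below mirrors the curl commands above; `FlowClient` and its methods are hypothetical names for this example, and the base URL and `{"data": ...}` request shape are taken from those commands rather than from any documented client API.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper (not part of the sample) that mirrors the curl commands. */
public final class FlowClient {
    // Base URL copied from the curl examples; adjust if the sample listens elsewhere.
    public static final String BASE_URL = "http://localhost:4000/api/flows/";

    private FlowClient() {}

    /** Builds a JSON POST request for the named flow; jsonBody is sent unchanged. */
    public static HttpRequest buildRequest(String flowName, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + flowName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** Sends the request and returns the response body, failing on non-2xx statuses. */
    public static String callFlow(HttpClient client, String flowName, String jsonBody)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(flowName, jsonBody), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() < 200 || response.statusCode() >= 300) {
            throw new IOException("Flow " + flowName + " returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

With the sample running, `FlowClient.callFlow(HttpClient.newHttpClient(), "indexDocuments", "{}")` should be roughly equivalent to the first curl command, and passing `"ragQuery"` with a `{"data": "..."}` body corresponds to the last one.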
The sample uses:
- Embedder: googleai/text-embedding-004 (768 dimensions)
- LLM: googleai/gemini-2.0-flash
- Metric: Cosine similarity
- Index Type: Serverless (AWS us-east-1)
You can modify these settings in PineconeRAGSample.java.
Pinecone supports namespaces for multi-tenant applications. To use namespaces:
.addIndex(PineconeIndexConfig.builder()
.indexName("my-index")
.namespace("production") // Add namespace
.embedderName("googleai/text-embedding-004")
.build())

The retriever/indexer will be available at /retriever/pinecone/my-index/production.
Index genkit-films does not exist and createIndexIfNotExists is false
Either create the index in the Pinecone console or enable auto-creation.
Vector dimension does not match index dimension
Ensure your index was created with 768 dimensions to match text-embedding-004.
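
If you generate or transform vectors yourself, a guard like the one below could surface the mismatch before anything is sent to Pinecone. This is only a sketch: `DimensionCheck` and `requireDimension` are hypothetical names, not part of Genkit or the Pinecone client, and the 768 value comes from the configuration described above.

```java
/** Hypothetical guard (not part of the sample) for embedding length mismatches. */
public final class DimensionCheck {
    /** Dimension of googleai/text-embedding-004 vectors, as stated in this README. */
    public static final int EXPECTED_DIMENSION = 768;

    private DimensionCheck() {}

    /** Returns the embedding unchanged, or throws if its length differs from the index dimension. */
    public static float[] requireDimension(float[] embedding, int indexDimension) {
        if (embedding.length != indexDimension) {
            throw new IllegalArgumentException(
                    "Vector dimension " + embedding.length
                            + " does not match index dimension " + indexDimension);
        }
        return embedding;
    }

    public static void main(String[] args) {
        // A 768-element vector passes the check.
        requireDimension(new float[768], EXPECTED_DIMENSION);
        // A vector from a different embedding model is rejected locally.
        try {
            requireDimension(new float[1536], EXPECTED_DIMENSION);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```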
Pinecone has rate limits on operations. For high-volume applications, implement retry logic with exponential backoff.
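
A minimal sketch of retry with exponential backoff follows. `Backoff.retry` is a hypothetical helper written for this example, not a Genkit or Pinecone API. For brevity it retries on any exception; in a real application you would likely retry only on errors that indicate rate limiting (for example an HTTP 429 response, which is an assumption about how the limit is reported).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical retry helper (not part of the sample). */
public final class Backoff {
    private Backoff() {}

    /**
     * Calls the operation up to maxAttempts times. After each failure it sleeps,
     * then doubles the delay (capped at maxDelayMillis) before the next attempt.
     */
    public static <T> T retry(Callable<T> operation, int maxAttempts,
                              long baseDelayMillis, long maxDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // Out of attempts: surface the last failure.
                }
                // Add up to 50% random jitter so concurrent clients do not retry in lockstep.
                long jitter = ThreadLocalRandom.current().nextLong(delay / 2 + 1);
                Thread.sleep(delay + jitter);
                delay = Math.min(delay * 2, maxDelayMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation that fails twice before succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated rate limit");
            }
            return "ok after " + calls[0] + " attempts";
        }, 5, 10, 1000);
        System.out.println(result); // prints "ok after 3 attempts"
    }
}
```

Capping the delay keeps the worst-case wait bounded, and the jitter spreads out retries from multiple workers that hit the limit at the same time.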
Pinecone charges based on:
- Number of vectors stored
- Number of queries
- Pod type (for pod-based indexes)
For development, serverless indexes on the free tier are recommended.