Gemini AI Service

A production-ready Express.js service that integrates Google Gemini AI, with multimodal input (text, image, audio, video) and custom agents.

✨ Features

  • 🤖 Google Gemini AI Integration - Full support for Gemini 1.5 Flash models
  • 💬 Text Chat - Conversational AI with history support
  • 🖼️ Image Analysis - Analyze single or multiple images
  • 🎵 Audio Processing - Audio analysis and transcription
  • 🎥 Video Analysis - Video content understanding
  • 🎨 Image Generation - Generate image descriptions (Imagen 3 coming soon)
  • 🎬 Video Generation - Generate video storyboards (Veo coming soon)
  • 🤝 Custom Agents - Create specialized AI agents with custom instructions
  • 📦 Agent Presets - Pre-configured agents (Support, Code Assistant, Writer, etc.)
  • 🔒 Production Ready - Security, logging, error handling, Docker support
  • 📝 TypeScript - Full type safety and IntelliSense
  • ✅ Tested - Comprehensive test suite with Jest

Note: The image and video generation endpoints currently return detailed text descriptions rather than generated media, because the Imagen 3 and Veo APIs are not yet available in the Node.js SDK. The endpoints will be updated when those APIs become available.

🚀 Quick Start

Prerequisites

Installation

# Clone the repository
git clone https://github.com/viktorferenczi/JsGeminiService.git
cd JsGeminiService

# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

# Start development server
npm run dev

The service will start on http://localhost:5000

Verify Installation

curl http://localhost:5000/health

📦 Available Scripts

npm run dev          # Start development server with hot reload
npm run build        # Build TypeScript to JavaScript
npm start            # Start production server
npm test             # Run tests
npm run test:watch   # Run tests in watch mode
npm run test:coverage # Run tests with coverage
npm run lint         # Lint code
npm run format       # Format code with Prettier

🔧 Configuration

Environment variables (.env):

GEMINI_API_KEY=your_api_key_here
NODE_ENV=development
PORT=5000
MAX_FILE_SIZE=52428800
UPLOAD_DIR=uploads
ALLOWED_ORIGINS=http://localhost:3000,http://localhost:5173
LOG_LEVEL=info
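
The variables above can be read into a typed object at startup. The following is a minimal sketch, not the service's actual loader: the `ServiceConfig` shape, the `loadConfig` and `parsePositiveInt` names, and the fallback defaults (taken from the sample values above) are all assumptions made for illustration. Note that `52428800` bytes is 50 MiB (50 × 1024 × 1024).

```typescript
// Hypothetical config shape; field names mirror the .env keys shown above.
interface ServiceConfig {
  geminiApiKey: string;
  nodeEnv: string;
  port: number;
  maxFileSize: number; // bytes
  uploadDir: string;
  allowedOrigins: string[];
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Parse a positive decimal integer, using the fallback when the variable is unset.
function parsePositiveInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === '') return fallback;
  const trimmed = raw.trim();
  if (!/^[1-9]\d*$/.test(trimmed)) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return Number(trimmed);
}

// Build the config from an environment map; defaults are assumed, not confirmed.
function loadConfig(env: Env): ServiceConfig {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) {
    throw new Error('GEMINI_API_KEY is required');
  }
  return {
    geminiApiKey,
    nodeEnv: env.NODE_ENV ?? 'development',
    port: parsePositiveInt('PORT', env.PORT, 5000),
    maxFileSize: parsePositiveInt('MAX_FILE_SIZE', env.MAX_FILE_SIZE, 50 * 1024 * 1024),
    uploadDir: env.UPLOAD_DIR ?? 'uploads',
    // ALLOWED_ORIGINS is a comma-separated list; blank entries are dropped.
    allowedOrigins: (env.ALLOWED_ORIGINS ?? '')
      .split(',')
      .map((origin) => origin.trim())
      .filter((origin) => origin.length > 0),
    logLevel: env.LOG_LEVEL ?? 'info',
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so that a missing API key fails fast at startup instead of on the first request.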

📚 API Documentation

Health Check

# Check service health
curl http://localhost:5000/health

Text Chat

# Simple chat
curl -X POST http://localhost:5000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?"}'

# Chat with history
curl -X POST http://localhost:5000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What did I just ask?",
    "history": [
      {
        "role": "user",
        "parts": [{"text": "What is AI?"}]
      },
      {
        "role": "model",
        "parts": [{"text": "AI stands for Artificial Intelligence..."}]
      }
    ]
  }'
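
The history example above suggests that the client sends earlier turns with each request. The sketch below shows one way to build that payload in TypeScript. The type names and the helpers `turn`, `buildChatRequest`, and `appendExchange` are hypothetical. Only the `prompt`, `history`, `role`, `parts`, and `text` field names come from the curl examples.

```typescript
// Field names follow the curl example; the type names are illustrative.
type Role = 'user' | 'model';

interface Part {
  text: string;
}

interface HistoryEntry {
  role: Role;
  parts: Part[];
}

interface ChatRequest {
  prompt: string;
  history?: HistoryEntry[];
}

// Create a single history entry with one text part.
function turn(role: Role, text: string): HistoryEntry {
  return { role, parts: [{ text }] };
}

// Build the JSON body for POST /api/chat; history is omitted when empty.
function buildChatRequest(prompt: string, history: HistoryEntry[] = []): ChatRequest {
  const trimmed = prompt.trim();
  if (trimmed === '') {
    throw new Error('prompt must not be empty');
  }
  return history.length > 0 ? { prompt: trimmed, history } : { prompt: trimmed };
}

// Return a new history with the latest user prompt and model reply appended.
function appendExchange(history: HistoryEntry[], prompt: string, reply: string): HistoryEntry[] {
  return [...history, turn('user', prompt), turn('model', reply)];
}
```

After each response, passing the result of `appendExchange` into the next `buildChatRequest` call keeps the conversation context on the client side without mutating the previous array.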

Image Analysis

# Analyze image
curl -X POST http://localhost:5000/api/analyze-image \
  -F "images=@path/to/image.jpg" \
  -F "prompt=Describe this image in detail"

# Analyze multiple images
curl -X POST http://localhost:5000/api/analyze-image \
  -F "images=@image1.jpg" \
  -F "images=@image2.jpg" \
  -F "prompt=Compare these images"

Audio Processing

# Analyze audio
curl -X POST http://localhost:5000/api/analyze-audio \
  -F "audio=@audio.mp3" \
  -F "prompt=Summarize this audio"

# Transcribe audio
curl -X POST http://localhost:5000/api/transcribe \
  -F "audio=@audio.mp3" \
  -F "language=English"

Video Analysis

# Analyze video
curl -X POST http://localhost:5000/api/analyze-video \
  -F "video=@video.mp4" \
  -F "prompt=Describe what happens in this video"

Image Generation

# Generate image
curl -X POST http://localhost:5000/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A beautiful sunset over mountains",
    "numberOfImages": 2,
    "aspectRatio": "16:9"
  }'

Custom Agents

# List all agents
curl http://localhost:5000/api/agents

# Create custom agent
curl -X POST http://localhost:5000/api/agents/create \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "name": "My Assistant",
      "description": "Helpful assistant",
      "systemInstruction": "You are a helpful assistant...",
      "temperature": 0.7
    }
  }'

# Load preset agent
curl -X POST http://localhost:5000/api/agents/preset/customer_support

# Chat with active agent
curl -X POST http://localhost:5000/api/agents/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Chat with specific agent
curl -X POST http://localhost:5000/api/agents/My%20Assistant/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Help me with something"}'

# Get agent history
curl http://localhost:5000/api/agents/My%20Assistant/history

# Reset agent history
curl -X POST http://localhost:5000/api/agents/My%20Assistant/reset
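
Agent names appear in the URL path, so names containing spaces or other reserved characters must be percent-encoded, as with `My%20Assistant` above. The helper below is a small sketch of that. `agentUrl`, `agentMethod`, and `AgentAction` are hypothetical names, and the routes and HTTP methods are inferred from the curl examples.

```typescript
// Per-agent actions seen in the curl examples above.
type AgentAction = 'chat' | 'history' | 'reset';

// Build the URL for a per-agent endpoint, percent-encoding the agent name.
function agentUrl(baseUrl: string, agentName: string, action: AgentAction): string {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${base}/api/agents/${encodeURIComponent(agentName)}/${action}`;
}

// HTTP method for each action, as used in the curl examples.
function agentMethod(action: AgentAction): 'GET' | 'POST' {
  return action === 'history' ? 'GET' : 'POST';
}
```

For example, `agentUrl('http://localhost:5000', 'My Assistant', 'chat')` yields `http://localhost:5000/api/agents/My%20Assistant/chat`.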

💻 Usage with TypeScript/JavaScript

Basic Chat

import axios from 'axios';

const API_URL = 'http://localhost:5000';

// Simple chat
const chat = async (prompt: string) => {
  const response = await axios.post(`${API_URL}/api/chat`, { prompt });
  return response.data.data;
};

const result = await chat('What is the capital of France?');
console.log(result.text);

Image Analysis

import axios from 'axios';
import FormData from 'form-data';
import fs from 'fs';

const API_URL = 'http://localhost:5000';

const analyzeImage = async (imagePath: string, prompt: string) => {
  const form = new FormData();
  form.append('images', fs.createReadStream(imagePath));
  form.append('prompt', prompt);

  const response = await axios.post(`${API_URL}/api/analyze-image`, form, {
    headers: form.getHeaders(),
  });
  
  return response.data.data;
};

const result = await analyzeImage('./photo.jpg', 'What is in this image?');
console.log(result.text);

Working with Agents

import axios from 'axios';

const API_URL = 'http://localhost:5000';

// Create a custom agent
const createAgent = async () => {
  const response = await axios.post(`${API_URL}/api/agents/create`, {
    config: {
      name: 'Code Helper',
      description: 'Helps with coding questions',
      systemInstruction: 'You are an expert programmer...',
      temperature: 0.3,
    },
  });
  return response.data.data;
};

// Load a preset
const loadPreset = async () => {
  const response = await axios.post(
    `${API_URL}/api/agents/preset/code_assistant`
  );
  return response.data.data;
};

// Chat with agent
const chatWithAgent = async (message: string) => {
  const response = await axios.post(`${API_URL}/api/agents/chat`, {
    message,
  });
  return response.data.data;
};

await loadPreset();
const reply = await chatWithAgent('How do I reverse a string in Python?');
console.log(reply.response.text);

🐳 Docker Deployment

Using Docker Compose (Recommended)

# Build and start
docker-compose up -d

# View logs
docker-compose logs -f

# Stop
docker-compose down

Using Docker

# Build
docker build -t gemini-ai-service .

# Run
docker run -d \
  -p 5000:5000 \
  -e GEMINI_API_KEY=your_key_here \
  --name gemini-service \
  gemini-ai-service

🧪 Testing

# Run all tests
npm test

# Run tests in watch mode
npm run test:watch

# Generate coverage report
npm run test:coverage

πŸ“ Project Structure

gemini-ai-service/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ config/           # Configuration
β”‚   β”œβ”€β”€ middleware/       # Express middleware
β”‚   β”œβ”€β”€ routes/           # API routes
β”‚   β”œβ”€β”€ services/         # Business logic
β”‚   β”œβ”€β”€ types/            # TypeScript types
β”‚   β”œβ”€β”€ utils/            # Utility functions
β”‚   β”œβ”€β”€ app.ts            # Express app
β”‚   └── index.ts          # Entry point
β”œβ”€β”€ tests/                # Test files
β”œβ”€β”€ docs/                 # Documentation
β”œβ”€β”€ uploads/              # Uploaded files
β”œβ”€β”€ dist/                 # Compiled JavaScript
└── logs/                 # Log files

🎯 Available Agent Presets

  • customer_support - Friendly customer support assistant
  • code_assistant - Expert programming helper
  • creative_writer - Creative writing assistant
  • data_analyst - Data analysis and insights
  • language_tutor - Language learning tutor (configurable)
  • json_api - Returns responses in JSON format
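
On the client side, the preset names listed above can be captured as a literal union so that typos are caught at compile time. This is a sketch only: `AGENT_PRESETS`, `AgentPreset`, `isAgentPreset`, and `presetUrl` are hypothetical names, and the route is taken from the "Load preset agent" curl example.

```typescript
// Preset names copied from the list above.
const AGENT_PRESETS = [
  'customer_support',
  'code_assistant',
  'creative_writer',
  'data_analyst',
  'language_tutor',
  'json_api',
] as const;

type AgentPreset = (typeof AGENT_PRESETS)[number];

// Type guard for untrusted input such as CLI arguments or form values.
function isAgentPreset(value: string): value is AgentPreset {
  return AGENT_PRESETS.some((preset) => preset === value);
}

// URL for POST /api/agents/preset/<name>.
function presetUrl(baseUrl: string, preset: AgentPreset): string {
  return `${baseUrl.replace(/\/+$/, '')}/api/agents/preset/${preset}`;
}
```

With this in place, `presetUrl(API_URL, 'code_asistant')` fails type checking, while `isAgentPreset` lets runtime strings be narrowed before they are used.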

🔒 Security Features

  • Helmet.js for security headers
  • CORS configuration
  • File upload validation
  • Request size limits
  • Input validation with Joi
  • Safety settings for AI responses

πŸ“ License

MIT

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📧 Support

For issues and questions, please open an issue on GitHub.

πŸ™ Acknowledgments

  • Google Gemini AI for the amazing AI capabilities
  • Express.js community
  • TypeScript team
