
🧠 REA AI Photo Enrichment

Overview

Project: rea-ai-photo-enrichment

Goal: Build a backend that classifies property photos by room type and visible attributes, then enriches them with semantic insights (e.g., style, notable features). The output enhances REA’s search, filtering, and listing relevance.


Architecture Summary

```mermaid
flowchart TD
  U["REA Platform / API Client"] -->|POST /api/enrich| A[FastAPI Backend]
  U -->|POST /api/batch_enrich| A
  A --> P["Pre-selection (Dedup + Quality Filter)"]
  P --> M["Vision Model (OpenCLIP + LoRA / ONNX Triton)"]
  M --> L["Light LLM Enricher (Gemini Flash / Claude Haiku)"]
  L --> C[(LRU Cache)]
  L -->|Structured JSON| A
  subgraph Observability
    ML["MLflow (training)"]
    LF["Langfuse (LLM traces)"]
  end
  A -->|Response| U
```

Core Features

| Layer | Description |
| --- | --- |
| Prefilter | Removes duplicate / low-quality images before inference. |
| Weak Labelling | A captioning model (BLIP) generates open-vocabulary descriptions for images. |
| LLM Refinement | A lightweight LLM (Gemini/Claude) filters captions and extracts structured JSON. |
| Vision Model | OpenCLIP fine-tuned with LoRA adapters on the generated captions. |
| Enrichment (LLM) | Adds style, design keywords, and notable features. |
| Caching | In-memory cache stores results by a hash of the inputs (TTL 24 h); see the sketch below. |
| Observability | MLflow logs model metrics; Langfuse logs LLM traces. |
| Deployment | Dockerized (FastAPI + Triton) for Cloud Run / GKE. |
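
As a concrete illustration of the caching row above: results live in a bounded in-memory cache keyed by a hash of the request inputs and expire after 24 hours. The sketch below assumes the `cachetools` package and a hypothetical key function; the project's actual cache wiring may differ.

```python
import hashlib
import json

from cachetools import TTLCache  # assumed dependency; any TTL-aware cache works

# Illustrative only: bounded in-memory cache with a 24 h TTL, mirroring the
# "hash of the inputs, TTL 24 h" behaviour described in the table above.
enrichment_cache: TTLCache = TTLCache(maxsize=10_000, ttl=24 * 60 * 60)


def make_cache_key(image_url: str, include_semantics: bool, model_version: str) -> str:
    """Derive a stable key from the request inputs (hypothetical field set)."""
    payload = json.dumps(
        {"image_url": image_url, "semantics": include_semantics, "model": model_version},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def get_or_compute(key: str, compute):
    """Return the cached result if present; otherwise compute and store it."""
    if key in enrichment_cache:  # expired entries behave as absent
        return enrichment_cache[key]
    result = compute()
    enrichment_cache[key] = result
    return result
```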

API Endpoints

POST /api/enrich

Analyzes and enriches a single photo.

Input:

```json
{
  "image_url": "https://example.com/photo.jpg",
  "listing_id": "REA123",
  "include_semantics": true
}
```

Output:

```json
{
  "listing_id": "REA123",
  "room_type": {"label": "kitchen", "confidence": 0.91},
  "attributes": [{"label": "oven", "confidence": 0.89}],
  "semantics": {
    "style": "modern",
    "notable_features": ["open layout", "stainless appliances"],
    "confidence": 0.92
  },
  "model_version": "openclip_v1_lora_v1",
  "prompt_version": "enrich_v1_0_0"
}
```
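
For illustration, calling this endpoint from Python might look like the following; `httpx` and the placeholder base URL and token are assumptions, not project requirements.

```python
import httpx

# Hypothetical client call; adjust the base URL and bearer token to your deployment.
response = httpx.post(
    "http://localhost:8080/api/enrich",
    headers={"Authorization": "Bearer changeme"},
    json={
        "image_url": "https://example.com/photo.jpg",
        "listing_id": "REA123",
        "include_semantics": True,
    },
    timeout=30.0,
)
response.raise_for_status()
print(response.json()["room_type"])  # e.g. {"label": "kitchen", "confidence": 0.91}
```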

POST /api/batch_enrich

Asynchronous endpoint for multiple images.
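
The batch request schema is not shown in this README; the sketch below assumes a list of the same image descriptors accepted by /api/enrich and should be checked against the actual request model.

```python
import httpx

# Hypothetical batch payload: a list of the objects accepted by /api/enrich.
batch = {
    "listing_id": "REA123",
    "images": [
        {"image_url": "https://example.com/photo1.jpg"},
        {"image_url": "https://example.com/photo2.jpg"},
    ],
    "include_semantics": True,
}
response = httpx.post(
    "http://localhost:8080/api/batch_enrich",
    headers={"Authorization": "Bearer changeme"},
    json=batch,
    timeout=120.0,
)
print(response.status_code)  # an asynchronous endpoint may return 202 Accepted
```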

POST /api/files/analyze

Accepts one or more directly uploaded files for analysis.
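
A direct upload could be exercised as below; the multipart field name `files` is an assumption to verify against the router.

```python
import httpx

# Hypothetical multipart upload; the form field name "files" is assumed.
with open("kitchen.jpg", "rb") as f:
    response = httpx.post(
        "http://localhost:8080/api/files/analyze",
        headers={"Authorization": "Bearer changeme"},
        files=[("files", ("kitchen.jpg", f, "image/jpeg"))],
    )
print(response.json())
```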

POST /api/feedback (Optional)

Collects human corrections for continuous improvement.


Quick Start

1. Clone & Setup

```bash
git clone https://github.com/rea-group/rea-ai-photo-enrichment.git
cd rea-ai-photo-enrichment
poetry install
```

2. Run Locally (Docker)

```bash
docker compose up --build
```
  • FastAPI: http://localhost:8080
  • Triton Server: http://localhost:8000

3. Deploy

```bash
gcloud run deploy rea-photo-enrichment-api \
  --region australia-southeast1 \
  --allow-unauthenticated \
  --set-env-vars SECRET_KEY=changeme,MODEL_PROVIDER=openclip_lora
```

Business Impact & Observability

| KPI | Definition | Target Gain |
| --- | --- | --- |
| Search Conversion Rate | % of searches leading to enquiry or click | +3 pp |
| Avg Search Time to Match | Median seconds to find relevant property | −20 % |
| Filter Usage Uplift | Users applying new visual filters | +30 % |
| Manual Tagging Reduction | Human labeling workload | −80 % |
| Latency Budget | End-to-end p95 latency | ≤ 700 ms |

Monitored Metrics:

  • Technical: Top-1/Top-3 accuracy, Attribute F1, Latency p95, Cache hit ratio, Error rate.
  • LLM (via Langfuse): usage and cost.

Development Process

This project followed a structured plan outlined in DEVELOPMENT_PLAN.md. The plan breaks the work into clear, sequential phases, from initial setup to final deployment and observability, and was executed by the project owner with assistance from Cline and Gemini Pro.

The development process adhered to a set of defined programming rules, with a strong emphasis on the principles from Robert C. Martin's "Clean Code". This phased, disciplined approach meant each component was built and tested systematically against defined goals and deliverables, making it easier to manage complexity, track progress against clear success criteria, and meet all requirements in a structured way.


Development Environment

  • Python: 3.11
  • ML: PyTorch + PEFT (LoRA), OpenCLIP
  • LLM: LangChain (supporting Gemini Flash, Claude Haiku, and others)
  • Inference: ONNXRuntime / Triton GPU
  • Authentication: Bearer token (SECRET_KEY); a sketch follows this list
  • Cache: LRU Cache
  • Monitoring: MLflow + Langfuse
  • Testing: pytest + pytest-asyncio (≥ 90 % coverage)
  • Code Quality: Ruff + Black + Mypy (120-character line length)
  • Pre-commit: Hooks for formatting, linting, and type-checking on every commit.
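
As a rough sketch of the bearer-token scheme listed above (the dependency name, route wiring, and error handling are assumptions, not the project's actual implementation):

```python
import os

from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

bearer_scheme = HTTPBearer()


def verify_token(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)) -> None:
    """Reject requests whose bearer token does not match SECRET_KEY (illustrative)."""
    if credentials.credentials != os.environ["SECRET_KEY"]:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid token")


app = FastAPI()


@app.post("/api/enrich", dependencies=[Depends(verify_token)])
async def enrich() -> dict:  # request/response models omitted for brevity
    return {"status": "ok"}
```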

Directory Layout

```
.
├── ai_enrichment/
│   ├── app/
│   ├── config/
│   ├── enrichment/
│   ├── prefilter/
│   ├── prompts/
│   ├── routers/
│   ├── schemas/
│   ├── utils/
│   └── vision/
├── data/
├── mlruns/
├── models/
├── property_photo_1k_sample/
├── reports/
└── tests/
```

Data and Labelling Workflow

This project uses an open-vocabulary labelling pipeline to generate rich, semantic labels for training.

  1. Weak Label Generation: The ai_enrichment/vision/run_open_labelling.py script uses a vision-language model to generate captions for images.
  2. LLM Refinement: A lightweight LLM filters these captions and outputs a structured JSON object containing room_type, attributes, and style (a schema sketch follows this list).
  3. Cleaning and Balancing: The ai_enrichment/vision/cleaning_labels.py script normalizes synonymous labels, removes rare or invalid ones, and oversamples minority classes to create a balanced dataset for training.
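
For step 2, the refined-label JSON could be validated with a schema along these lines. This is a sketch assuming Pydantic v2; only the room_type, attributes, and style fields come from the text above, and the exact types are assumptions.

```python
from pydantic import BaseModel, Field


# Illustrative schema for the refined-label JSON described in step 2.
class RefinedLabel(BaseModel):
    room_type: str = Field(..., description="e.g. 'kitchen', 'bedroom'")
    attributes: list[str] = Field(default_factory=list, description="visible features, e.g. 'oven'")
    style: str | None = Field(None, description="e.g. 'modern', 'rustic'")


# Parsing the LLM's output raises a ValidationError on malformed labels.
label = RefinedLabel.model_validate_json(
    '{"room_type": "kitchen", "attributes": ["oven"], "style": "modern"}'
)
print(label.room_type)  # "kitchen"
```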

Testing

Run all tests:

```bash
poetry run pytest
```

Coverage Target: ≥ 90%


Future Work

  • Cloud Data Storage: Migrate datasets to a cloud storage solution (S3/GCS).
  • Improved Training Data: Source higher-quality, human-verified labels.
  • Hyperparameter Tuning: Conduct a more extensive search for optimal LoRA and training settings.
  • Cache Upgrade to Redis: Replace the in-memory cache with a distributed Redis cache.
  • Observability Enhancements: Integrate Prometheus/Grafana and automate evaluation reports.
  • Human Feedback Loop: Implement a /api/feedback endpoint to collect user corrections for continuous improvement.

Trade-offs and Limitations

This prototype trains a LoRA adapter on weak zero-shot pseudo-labels. Given the limited and imbalanced data, high precision was prioritized over recall.

Current limitations:

  • Noisy pseudo-labels from zero-shot CLIP
  • Class imbalance and a limited dataset size (< 5k images)
  • No human validation set
