
Deadpoolhp-4/CSV-parser


DataForge AI banner

Next.js 16 React 19 TypeScript 5 Firebase OpenAI SDK Express 5

Active Build CSV Workspace AES Encrypted Local Verifier

DataForge AI

A premium CSV operations workspace for cleaning, transforming, enriching, verifying, and recovering datasets at speed.

Built for ops, revops, lead-gen, research, and internal tooling teams who live in spreadsheets but need something far more powerful.


Overview

DataForge AI combines a polished Next.js frontend with a local verification sidecar so teams can:

  • upload messy CSVs and inspect them instantly
  • clean and normalize data inside a focused workspace
  • enrich rows through AI-assisted flows
  • verify email lists with a dedicated backend
  • save encrypted projects to Firebase and reopen them later

This repo is best thought of as a product foundation for a modern data-ops console, not just a CSV parser.

Visual Architecture

DataForge AI workflow overview

Why This Repo Is Different

Most CSV tools force you to choose between speed, control, and extensibility. DataForge AI is built to deliver all three:

  • Fast UX: drag-and-drop upload, immediate parsing, focused workspace controls
  • Serious persistence: Firebase auth plus encrypted Firestore project storage
  • Real operations value: local email verification backend, queue controls, result downloads
  • Product-ready structure: clear frontend/backend split with API surfaces already in place

Feature Surface

Area            What it does                                                        Status
Workspace       Upload CSVs, inspect rows, track columns, edit dataset state        Stable
Cleaning        Trim whitespace, title case, deduplicate, sort, type conversion     Stable
Saved Projects  Google sign-in, AES encryption, Firestore storage, reload later     Stable
Verification    Upload verification CSVs, detect email column, run and manage jobs  Stable
AI Enrichment   Model selection, prompt modal, transform API surface                Partially wired

Product Highlights

CSV Workspace

  • Drag-and-drop CSV ingestion
  • Table-based data exploration
  • Zustand-powered client state
  • Responsive workspace layout with dedicated tool panels

Data Cleaning Toolkit

  • Trim leading and trailing whitespace
  • Convert text to title case
  • Remove duplicate rows
  • Sort rows by selected columns
  • Convert values between string and numeric types
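The cleaning operations above are all simple row transforms. As an illustrative sketch of how they can be expressed over parsed CSV rows (function names and the `Row` shape are hypothetical, not the repo's actual helpers):

```typescript
type Row = Record<string, string>;

// Trim leading and trailing whitespace in every cell.
const trimCells = (rows: Row[]): Row[] =>
  rows.map((r) =>
    Object.fromEntries(Object.entries(r).map(([k, v]) => [k, v.trim()]))
  );

// Title-case a single column.
const titleCase = (rows: Row[], col: string): Row[] =>
  rows.map((r) => ({
    ...r,
    [col]: r[col].replace(
      /\w\S*/g,
      (w) => w[0].toUpperCase() + w.slice(1).toLowerCase()
    ),
  }));

// Remove exact duplicate rows, keeping the first occurrence.
const dedupe = (rows: Row[]): Row[] => {
  const seen = new Set<string>();
  return rows.filter((r) => {
    const key = JSON.stringify(r);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
};

// Sort by a column, numerically when both values parse as numbers.
const sortBy = (rows: Row[], col: string): Row[] =>
  [...rows].sort((a, b) => {
    const na = Number(a[col]);
    const nb = Number(b[col]);
    if (!Number.isNaN(na) && !Number.isNaN(nb)) return na - nb;
    return a[col].localeCompare(b[col]);
  });
```

Each transform returns a new array, which keeps undo/redo and Zustand state updates straightforward.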

Saved Project System

  • Sign in with Google via Firebase Auth
  • Encrypt CSV payloads before persistence
  • Chunk large encrypted payloads to respect Firestore size limits
  • Recover projects from the dashboard
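Firestore caps a single document at roughly 1 MiB, which is why large encrypted payloads are chunked before persistence. A minimal sketch of that chunking step (the chunk size, field names, and function names here are illustrative; the app produces the ciphertext string with crypto-js AES before this point):

```typescript
// Leave headroom below Firestore's ~1 MiB per-document limit.
const CHUNK_CHARS = 900_000;

interface ProjectChunk {
  index: number; // position within the payload
  total: number; // how many chunks make up this project
  data: string;  // slice of the encrypted payload
}

// Split an AES ciphertext string into safely sized Firestore documents.
function chunkPayload(ciphertext: string): ProjectChunk[] {
  const total = Math.max(1, Math.ceil(ciphertext.length / CHUNK_CHARS));
  return Array.from({ length: total }, (_, index) => ({
    index,
    total,
    data: ciphertext.slice(index * CHUNK_CHARS, (index + 1) * CHUNK_CHARS),
  }));
}

// On reload: sort chunks by index and reassemble the ciphertext.
function joinChunks(chunks: ProjectChunk[]): string {
  return [...chunks]
    .sort((a, b) => a.index - b.index)
    .map((c) => c.data)
    .join("");
}
```

Storing `index` and `total` on every chunk lets the dashboard detect incomplete projects before attempting decryption.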

Verification Engine

  • Local Express sidecar for verification workflows
  • CSV upload with email-column detection
  • Queue controls for pause, resume, stop, and clear
  • Pollable status and paginated results endpoints
  • Result export via CSV download URLs
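The status endpoint is designed to be polled until a job reaches a terminal state. A sketch of a poll loop with an injected status fetcher (the status strings, names, and timings are assumptions for illustration, not the backend's exact contract):

```typescript
type JobStatus =
  | "queued"
  | "running"
  | "paused"
  | "completed"
  | "stopped"
  | "failed";

// States after which polling should stop.
const TERMINAL: ReadonlySet<JobStatus> = new Set([
  "completed",
  "stopped",
  "failed",
]);

// Poll until the job settles. The fetcher is injected so callers can wrap
// a real fetch of /api/verifier/verification/:id/status, or pass a stub.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  intervalMs = 2000,
  maxAttempts = 300
): Promise<JobStatus> {
  for (let i = 0; i < maxAttempts; i++) {
    const status = await getStatus();
    if (TERMINAL.has(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("verification polling timed out");
}
```

Because `paused` is not terminal, a paused job keeps its poller alive, so the UI updates immediately once the queue is resumed.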

AI Layer

  • Local browser storage for OpenAI API key usage
  • Model discovery and selection
  • Next.js API routes for transforms and model listing

Note: the AI UI is ahead of the current execution wiring. The prompt modal and backend-facing surface exist, but some enrichment buttons are still scaffolded placeholders rather than fully wired production flows.

Stack

Frontend

  • Next.js 16
  • React 19
  • TypeScript
  • Tailwind CSS 4
  • Framer Motion
  • Zustand
  • Papa Parse

Platform

  • Firebase Auth
  • Firestore
  • Firebase Storage
  • OpenAI SDK
  • crypto-js

Backend Sidecar

  • Express 5
  • SQLite
  • Multer
  • Winston
  • Jest

Repository Layout

app/                  Next.js App Router pages and API routes
app/api/lead-ops/     Same-origin proxy to the Lead Ops FastAPI backend
app/lead-ops/         Lead Ops product route
components/           Product UI and interaction surfaces
context/              Auth provider and shared React context
lib/                  Firebase, crypto, store, CSV, and backend clients
brandnav_backend/     Local sidecar for email verification
lead_ops_backend/     FastAPI + Celery service for lead ingestion workflows
public/               Static assets and README visuals
samples/              Example CSV inputs for local testing and demos

Quick Start

1. Install dependencies

npm install
npm run install:backend

2. Configure the frontend

Create .env.local in the repo root:

NEXT_PUBLIC_FIREBASE_API_KEY=...
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=...
NEXT_PUBLIC_FIREBASE_PROJECT_ID=...
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=...
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=...
NEXT_PUBLIC_FIREBASE_APP_ID=...
NEXT_PUBLIC_ENCRYPTION_KEY=replace-this-in-real-environments
NEXT_PUBLIC_BRANDNAV_URL=http://localhost:5000
NEXT_PUBLIC_LEAD_OPS_API_BASE_URL=/api/lead-ops
LEAD_OPS_INTERNAL_URL=http://localhost:8000/api/v1

3. Configure the verifier backend

Create brandnav_backend/.env:

PORT=5000
CORS_ORIGIN=http://localhost:3000
DB_PATH=.sql/user_auth.db
SESSION_SECRET=replace-with-a-long-random-secret
SESSION_MAX_AGE=86400000
ADMIN_EMAIL=your-email@example.com
ADMIN_PASSWORD=change-me
MAX_CSV_ROWS=100000
MAX_CSV_SIZE_MB=100
MX_DOMAIN=your-mail-from-domain
EM_DOMAIN=your-envelope-domain

4. Run locally

npm run dev

This starts:

  • frontend on http://localhost:3000
  • verifier backend on http://localhost:5000

Use the sample imports in samples/lead-imports/ if you want quick demo data without cluttering the repo root.

Docker

Run the whole website stack, including the Next.js app, verifier backend, and Lead Ops services:

docker compose up --build

This brings up:

  • website on http://localhost:3000
  • verifier backend on http://localhost:5000
  • Lead Ops API on http://localhost:8000
  • Lead Ops Postgres on localhost:5432
  • Lead Ops Redis on localhost:6379

The Docker setup defaults to the public no-login flow. Firebase values are optional, and Lead Ops basic auth is disabled by default so the catalog-to-product handoff works immediately.

If one of the default host ports is already taken on your machine, copy .env.example to .env and change WEB_PORT, VERIFIER_PORT, LEAD_OPS_API_PORT, LEAD_OPS_POSTGRES_PORT, or LEAD_OPS_REDIS_PORT before starting compose.

Scripts

Root

npm run dev
npm run dev:next
npm run dev:backend
npm run build
npm run start
npm run lint

Backend

cd brandnav_backend
npm run dev
npm run test
npm run test:watch
npm run test:coverage
npm run type-check

Typical Workflow

  1. Sign in with Google.
  2. Upload a CSV into the workspace.
  3. Clean the dataset with built-in controls.
  4. Run verification if the file contains emails.
  5. Save the transformed result as an encrypted project.
  6. Reopen prior datasets from the dashboard whenever needed.

API Surface

Next.js routes

  • POST /api/transform
  • GET /api/models

Verifier backend routes

  • GET /api/verifier/health
  • POST /api/verifier/csv/upload
  • POST /api/verifier/csv/verify
  • GET /api/verifier/verification/:id/status
  • GET /api/verifier/verification/:id/results
  • POST /api/verifier/verification/:id/pause
  • POST /api/verifier/verification/:id/resume
  • POST /api/verifier/verification/:id/stop
  • POST /api/verifier/verification/queue/clear

Security Notes

  • Project data is AES-encrypted before being written to Firestore.
  • OpenAI API keys are stored in browser local storage, not on the server.
  • The backend uses Helmet, CORS controls, and session protections.
  • NEXT_PUBLIC_ENCRYPTION_KEY should always be set explicitly outside local experiments.
  • The backend exposes a development shutdown route outside production; restrict it to trusted environments.

Current Maturity

Strong Today

  • polished authenticated UI shell
  • fast CSV ingestion and workspace transitions
  • encrypted Firestore project persistence
  • integrated verifier sidecar surface
  • clean separation between frontend product code and backend processing engine

Best Next Investments

  • fully wire the AI enrichment buttons to the transform pipeline
  • add production deployment docs for frontend plus verifier sidecar
  • add .env.example files for both app layers
  • clean the repository’s macOS ._* metadata artifacts
  • add screenshots from a live seeded environment once dependencies are installed

Who This Repo Fits

This is a great foundation for teams building:

  • lead list preparation tools
  • internal CSV cleanup consoles
  • outbound personalization pipelines
  • email hygiene and verification products
  • AI-assisted spreadsheet operations

Contributing

If you extend this repo, keep the bar high:

  • preserve the product-grade UI quality
  • keep frontend and sidecar responsibilities cleanly separated
  • document any new env vars and routes
  • prefer shipping complete user-facing flows over partial abstractions

License

No license is currently defined at the root of the project. Add one before public distribution, client delivery, or commercial reuse.
