This repository was archived by the owner on Nov 15, 2025. It is now read-only.

Phase 4: Intelligent File Organization with Phi-3.5 (128K Context) #46

@thewildofficial

Overview

Leverage Phi-3.5-mini's 128K context window to analyze entire file collections and suggest intelligent directory structures for automatic organization.

Parent Issue: #36
Depends On: #44 (image analysis), #45 (audio processing)
Estimated Time: 2 weeks


Problem Statement

Users have large, disorganized file collections. We need to:

  1. Analyze metadata from all files (images, videos, audio)
  2. Find semantic groupings and relationships
  3. Suggest intuitive directory structures
  4. Automatically organize files based on content

Challenge: With potentially thousands of files, we need a model that can:

  • Process large collections in a single context window
  • Reason about relationships between files
  • Generate hierarchical organization schemes
  • Handle mixed media types (images + videos + audio)

Architecture

File Collection → Extract Metadata (BLIP/Whisper/CLIP)
                         ↓
          Aggregate Metadata Summary (all files)
                         ↓
          Phi-3.5 (128K Context Window)
                         ↓
    Analyze Patterns, Themes, Relationships
                         ↓
    Suggest Directory Structure + Assign Files
                         ↓
         User Review → Apply Organization

Implementation Tasks

Core File Organizer

  • Create src/organization/ package
  • Create src/organization/organizer.py with FileOrganizer class
  • Create src/organization/models.py with organization schemas
  • Implement metadata aggregation logic
  • Add confidence scoring system

Data Models

from dataclasses import dataclass
from typing import Dict, List, Optional
from uuid import UUID

@dataclass
class DirectoryNode:
    """Hierarchical directory structure node."""
    name: str
    description: str
    path: str
    parent: Optional['DirectoryNode']
    children: List['DirectoryNode']
    assigned_files: List[UUID]  # Asset IDs
    category: str
    estimated_size_gb: float

@dataclass
class OrganizationPlan:
    """Complete file organization plan."""
    root_directory: DirectoryNode
    total_files: int
    total_directories: int
    file_assignments: Dict[UUID, str]  # asset_id -> target_path
    confidence_scores: Dict[UUID, float]  # asset_id -> confidence
    rationale: str  # Why this structure was chosen
    alternative_plans: List['OrganizationPlan']
    
@dataclass  
class OrganizationRule:
    """User-defined organization rule."""
    condition: str  # e.g., "category == 'pets' AND tags contains 'dog'"
    target_directory: str
    priority: int
    enabled: bool

Metadata Aggregation

  • Implement aggregate_metadata_summary()
    • Collect all VLMMetadata from images
    • Collect all AudioMetadata from audio files
    • Collect filename patterns
    • Extract temporal information (creation dates, EXIF)
    • Calculate tag frequencies
    • Identify common themes
  • Create compact representation for 128K context
  • Handle large collections (10K+ files) with sampling
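
The compact-representation and sampling steps above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the 4-characters-per-token estimate, the 100,000-token budget, and the names `estimate_tokens` and `sample_for_context` are all hypothetical and not part of the project. It samples round-robin across media types so that a large image library does not crowd out audio or video records.

```python
import json
import random
from collections import defaultdict

# Assumed budget: leaves headroom below the 128K window for prompt and reply.
CONTEXT_BUDGET_TOKENS = 100_000


def estimate_tokens(text: str) -> int:
    """Crude heuristic (an assumption): roughly 4 characters per token."""
    return len(text) // 4 + 1


def sample_for_context(records, budget_tokens=CONTEXT_BUDGET_TOKENS, seed=0):
    """Pick records round-robin across media types until the budget is used."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for rec in records:
        by_type[rec["type"]].append(rec)
    for group in by_type.values():
        rng.shuffle(group)

    chosen, used = [], 0
    while any(by_type.values()):
        for media_type in list(by_type):
            group = by_type[media_type]
            if not group:
                continue
            rec = group.pop()
            cost = estimate_tokens(json.dumps(rec))
            if used + cost > budget_tokens:
                return chosen  # budget exhausted
            chosen.append(rec)
            used += cost
    return chosen
```

A real tokenizer count would be more accurate than the character heuristic; the sketch only shows where a budget check would sit.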

Phi-3.5 Directory Structure Generation

  • Design prompt for organization analysis
  • Implement hierarchical structure generation
  • Add rationale generation for decisions
  • Generate multiple alternative plans
  • Implement file assignment logic with confidence

Example Prompt:

ORGANIZATION_PROMPT_TEMPLATE = """You are an expert file organizer. Analyze this collection and suggest an optimal directory structure.

COLLECTION SUMMARY:
- Total files: {total_files}
- Image files: {image_count} (tags: {top_image_tags})
- Video files: {video_count} (tags: {top_video_tags})  
- Audio files: {audio_count} (tags: {top_audio_tags})
- Date range: {date_range}
- Common themes: {themes}

TOP 50 FILE SAMPLES:
{file_samples}

REQUIREMENTS:
1. Create 2-4 levels of hierarchy (not too deep)
2. Use clear, descriptive folder names
3. Group by primary semantic theme, then sub-categories
4. Consider temporal organization where relevant (e.g., by year/month for events)
5. Balance folder sizes (avoid 1000+ files in one folder)
6. Handle edge cases ("Unsorted", "Misc" for ambiguous items)

Return JSON:
{{
    "directory_structure": [
        {{
            "path": "Photos/Vacations/2024-Summer-Europe",
            "description": "Summer 2024 vacation photos from Europe",
            "category": "travel",
            "expected_file_count": 120
        }},
        ...
    ],
    "file_assignments": [
        {{
            "asset_id": "<uuid>",
            "target_path": "Photos/Vacations/2024-Summer-Europe",
            "confidence": 0.95,
            "reason": "Beach scenes with European landmarks, dated June 2024"
        }},
        ...
    ],
    "rationale": "Organized primarily by content type, then by theme and temporal order...",
    "estimated_accuracy": 0.87
}}"""
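
Because the model is asked to return JSON in a fixed shape, the response could be checked before it is turned into a plan. The following is a minimal sketch, assuming the response arrives as a string; `validate_plan_response` and `REQUIRED_KEYS` are hypothetical names, and the checks shown are only the ones implied by the template above.

```python
import json

# Keys the prompt template asks the model to return.
REQUIRED_KEYS = {"directory_structure", "file_assignments", "rationale"}


def validate_plan_response(raw: str) -> dict:
    """Parse the model output and raise ValueError if it is malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")

    known_paths = {d["path"] for d in data["directory_structure"]}
    for item in data["file_assignments"]:
        # Every assignment must point at a directory the plan declares.
        if item["target_path"] not in known_paths:
            raise ValueError(f"Unknown target path: {item['target_path']}")
        if not 0.0 <= float(item["confidence"]) <= 1.0:
            raise ValueError(f"Confidence out of range: {item['confidence']}")
    return data
```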

Rule-Based + ML Hybrid System

  • Implement user-defined rule engine
  • Priority-based rule application
  • Combine rules with ML suggestions
  • Handle rule conflicts
  • Allow rule templates ("All photos with 'dog' tag → Pets/Dogs")
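
One way the priority-based rule application could look is sketched below. It is a minimal sketch, not the project's rule syntax: it assumes conditions are `field == 'value'` or `field contains 'value'` clauses joined by `AND`, that a larger `priority` wins, and that user rules override the ML suggestion. The helpers `matches` and `resolve_target` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OrganizationRule:
    condition: str
    target_directory: str
    priority: int
    enabled: bool = True


# One clause: <field> == '<value>'  or  <field> contains '<value>'
_CLAUSE = re.compile(r"^\s*(\w+)\s+(==|contains)\s+'([^']*)'\s*$")


def matches(condition: str, metadata: dict) -> bool:
    """Evaluate AND-joined clauses; unsupported syntax raises ValueError."""
    for clause in re.split(r"\s+AND\s+", condition):
        m = _CLAUSE.match(clause)
        if m is None:
            raise ValueError(f"Unsupported clause: {clause!r}")
        field_name, op, value = m.groups()
        actual = metadata.get(field_name)
        if op == "==" and actual != value:
            return False
        if op == "contains" and value not in (actual or []):
            return False
    return True


def resolve_target(
    rules: List[OrganizationRule], metadata: dict, ml_suggestion: Optional[str]
) -> Optional[str]:
    """Highest-priority enabled matching rule wins; otherwise use the ML path."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.enabled and matches(rule.condition, metadata):
            return rule.target_directory
    return ml_suggestion
```

Resolving conflicts purely by priority keeps the behaviour predictable; ties fall back to list order because `sorted` is stable.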

Confidence Scoring & Human Review

  • Calculate confidence scores per file assignment
  • Flag low-confidence assignments for review
  • Implement review interface workflow
  • Allow user corrections and feedback
  • Learn from user corrections (update rules)
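
Flagging low-confidence assignments for review could be as simple as a threshold split. The sketch below assumes a cut-off of 0.7, which is an illustrative value and not one taken from the project; `split_for_review` is a hypothetical helper.

```python
from typing import Dict, List, Tuple
from uuid import UUID

REVIEW_THRESHOLD = 0.7  # assumed cut-off; would need tuning against feedback


def split_for_review(
    confidence_scores: Dict[UUID, float], threshold: float = REVIEW_THRESHOLD
) -> Tuple[List[UUID], List[UUID]]:
    """Return (auto_apply, needs_review), with the least confident first."""
    auto_apply: List[UUID] = []
    needs_review: List[UUID] = []
    for asset_id, score in confidence_scores.items():
        (auto_apply if score >= threshold else needs_review).append(asset_id)
    # Show reviewers the most doubtful assignments first.
    needs_review.sort(key=lambda asset_id: confidence_scores[asset_id])
    return auto_apply, needs_review
```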

Similarity-Based Assignment

  • Use CLIP embeddings for edge cases
  • Find similar files already organized
  • Assign to same directory if high similarity
  • Handle new categories (create if needed)
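
The similarity-based fallback could be sketched as a nearest-neighbour lookup over embeddings of files that are already organized. This is a minimal sketch, assuming embeddings are plain lists of floats and using an illustrative 0.85 threshold; `assign_by_similarity` is a hypothetical helper, and a real system would likely use a vector index instead of a linear scan.

```python
import math
from typing import List, Optional, Sequence, Tuple

SIMILARITY_THRESHOLD = 0.85  # assumed; below this a new category may be needed


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_by_similarity(
    embedding: Sequence[float],
    organized: List[Tuple[Sequence[float], str]],
    threshold: float = SIMILARITY_THRESHOLD,
) -> Tuple[Optional[str], float]:
    """Return (directory, similarity) of the closest organized file.

    The directory is None when nothing is similar enough, which signals
    that a new category might be required.
    """
    best_dir, best_sim = None, 0.0
    for other_embedding, directory in organized:
        sim = cosine_similarity(embedding, other_embedding)
        if sim > best_sim:
            best_dir, best_sim = directory, sim
    if best_sim < threshold:
        return None, best_sim
    return best_dir, best_sim
```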

Dynamic Directory Creation

  • Detect when new category is needed
  • Suggest directory names based on content
  • Avoid over-fragmentation (minimum files per directory)
  • Handle directory name conflicts
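
The over-fragmentation and name-conflict checks could be sketched as below. This assumes a minimum of 5 files per directory (an illustrative value), forward-slash paths as in the prompt template, and a single merge pass; `merge_small_directories` and `unique_directory_name` are hypothetical helpers.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, Set

MIN_FILES_PER_DIRECTORY = 5  # assumed threshold


def merge_small_directories(
    assignments: Dict[str, str], min_files: int = MIN_FILES_PER_DIRECTORY
) -> Dict[str, str]:
    """Move files out of under-populated directories into the parent.

    Single pass only: a parent that is still small afterwards is left alone,
    and top-level directories are never merged away.
    """
    counts = Counter(assignments.values())
    merged = {}
    for asset_id, path in assignments.items():
        p = PurePosixPath(path)
        if counts[path] < min_files and len(p.parts) > 1:
            merged[asset_id] = str(p.parent)
        else:
            merged[asset_id] = path
    return merged


def unique_directory_name(name: str, existing: Set[str]) -> str:
    """Append ' (2)', ' (3)', ... until the name does not collide."""
    if name not in existing:
        return name
    n = 2
    while f"{name} ({n})" in existing:
        n += 1
    return f"{name} ({n})"
```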

Batch Organization Engine

  • Implement organize_collection() method
  • Add dry-run mode (preview without moving files)
  • Implement safe file moving (with rollback)
  • Add progress tracking
  • Handle filesystem errors gracefully
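
Safe moving with rollback and a dry-run mode could be sketched as follows. This is a minimal sketch, assuming all moves happen on the local filesystem and that refusing to overwrite an existing destination is the desired policy; `apply_moves` is a hypothetical helper, and it does not clean up directories it created before a failure.

```python
import shutil
from pathlib import Path
from typing import Iterable, List, Tuple


def apply_moves(
    moves: Iterable[Tuple[Path, Path]], dry_run: bool = True
) -> List[Tuple[Path, Path]]:
    """Move files; on any filesystem error undo the moves already done.

    Returns the moves that were performed, or that would be performed
    when dry_run is True (in which case nothing on disk is touched).
    """
    planned = [(Path(src), Path(dst)) for src, dst in moves]
    if dry_run:
        return planned

    done: List[Tuple[Path, Path]] = []
    try:
        for src, dst in planned:
            if dst.exists():
                raise FileExistsError(f"Refusing to overwrite {dst}")
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            done.append((src, dst))
    except OSError:
        # Roll back in reverse order so earlier moves are restored last.
        for src, dst in reversed(done):
            shutil.move(str(dst), str(src))
        raise
    return done
```

Defaulting `dry_run` to True mirrors the `analyze_and_organize` signature in this issue, so files are only moved when the caller opts in.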

Database Schema

CREATE TABLE organization_plans (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    name VARCHAR(255),
    directory_structure JSONB,
    file_assignments JSONB,
    confidence_scores JSONB,
    rationale TEXT,
    status VARCHAR(50), -- draft|reviewing|applied|rolled_back
    created_at TIMESTAMP DEFAULT NOW(),
    applied_at TIMESTAMP
);

CREATE TABLE organization_rules (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    condition TEXT,
    target_directory VARCHAR(500),
    priority INTEGER,
    enabled BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE organization_feedback (
    id UUID PRIMARY KEY,
    asset_id UUID REFERENCES assets(id),
    suggested_path VARCHAR(500),
    actual_path VARCHAR(500),
    user_accepted BOOLEAN,
    user_id UUID REFERENCES users(id),
    timestamp TIMESTAMP DEFAULT NOW()
);

API Endpoints

  • POST /api/organization/analyze - Analyze collection and generate plan
  • GET /api/organization/plans - List organization plans
  • GET /api/organization/plans/{id} - Get specific plan
  • POST /api/organization/plans/{id}/apply - Apply organization
  • POST /api/organization/plans/{id}/rollback - Undo organization
  • POST /api/organization/rules - Create organization rule
  • GET /api/organization/rules - List user rules
  • POST /api/organization/feedback - Submit user feedback

Testing

  • Unit tests for FileOrganizer
  • Test metadata aggregation with mixed file types
  • Test Phi-3.5 structure generation with various collection sizes
  • Test rule engine with complex conditions
  • Test confidence scoring accuracy
  • Integration tests for full organization pipeline
  • Test with real user collections:
    • Personal photo library (1000+ photos)
    • Mixed media (photos + videos + audio)
    • Professional archive (organized vs unorganized)
    • Time-series data (dated files)
  • Performance benchmarks:
    • 1K files: <2 minutes
    • 10K files: <10 minutes
    • 100K files: <60 minutes (with sampling)

User Interface (Future)

  • Organization preview interface
  • Drag-and-drop to override suggestions
  • Confidence score visualization
  • Before/after directory tree view
  • One-click apply/rollback

Documentation

  • Document organization algorithm
  • Add user guide for file organization
  • Document rule syntax and examples
  • Create best practices guide
  • Add troubleshooting for common issues

Example Implementation

# src/organization/organizer.py

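from typing import List, Dict
from uuid import UUID
import json
from collections import Counter

from src.models.model_client import Phi35Client
from src.organization.models import OrganizationPlan, DirectoryNode
from src.catalog.models import Asset
from sqlalchemy.orm import Session
# NOTE: get_settings() is used below, but its import is not shown in this
# issue; it has to come from the project's settings module.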

class FileOrganizer:
    """Intelligent file organization using Phi-3.5 reasoning."""
    
    def __init__(self, db: Session):
        self.db = db
        self.phi35_client = Phi35Client(get_settings().phi35_endpoint)
    
    async def analyze_and_organize(
        self,
        asset_ids: List[UUID],
        target_root: str = ".",
        dry_run: bool = True
    ) -> OrganizationPlan:
        """Analyze collection and generate organization plan."""
        
        # Step 1: Aggregate metadata
        metadata_summary = await self._aggregate_metadata(asset_ids)
        
        # Step 2: Generate organization plan with Phi-3.5
        plan = await self._generate_organization_plan(
            metadata_summary,
            target_root
        )
        
        # Step 3: Calculate confidence scores
        plan = self._calculate_confidence_scores(plan)
        
        # Step 4: Apply user rules
        plan = self._apply_user_rules(plan)
        
        if not dry_run:
            # Apply organization
            await self._apply_organization(plan)
        
        return plan
    
    async def _aggregate_metadata(
        self,
        asset_ids: List[UUID]
    ) -> Dict:
        """Aggregate metadata from all assets."""
        
        assets = self.db.query(Asset).filter(Asset.id.in_(asset_ids)).all()
        
        all_tags = []
        all_categories = []
        created_dates = []
        file_samples = []
        
        for asset in assets:
            # Extract from VLMMetadata (images); skip missing categories so
            # that None never reaches the Counter or the prompt
            if asset.vlm_metadata:
                all_tags.extend(asset.vlm_metadata.get('tags', []))
                category = asset.vlm_metadata.get('primary_category')
                if category:
                    all_categories.append(category)
            
            # Extract from AudioMetadata
            if asset.audio_metadata:
                all_tags.extend(asset.audio_metadata.get('semantic_tags', []))
                category = asset.audio_metadata.get('primary_category')
                if category:
                    all_categories.append(category)
            
            # Track creation dates for the collection's date range
            if asset.created_at:
                created_dates.append(asset.created_at)
            
            # Sample files for context (first 50 encountered)
            if len(file_samples) < 50:
                file_samples.append({
                    'id': str(asset.id),
                    'filename': asset.filename,
                    'type': asset.media_type,
                    'tags': asset.vlm_metadata.get('tags', [])[:5] if asset.vlm_metadata else [],
                    'created': str(asset.created_at)
                })
        
        # Calculate frequencies
        tag_freq = Counter(all_tags)
        category_freq = Counter(all_categories)
        
        # Date range covered by the collection (None if no dates are known)
        date_range = (
            f"{min(created_dates)} to {max(created_dates)}"
            if created_dates else None
        )
        
        return {
            'total_files': len(assets),
            'image_count': sum(1 for a in assets if a.media_type == 'image'),
            'video_count': sum(1 for a in assets if a.media_type == 'video'),
            'audio_count': sum(1 for a in assets if a.media_type == 'audio'),
            'top_tags': [tag for tag, _ in tag_freq.most_common(20)],
            'top_categories': [cat for cat, _ in category_freq.most_common(10)],
            'date_range': date_range,
            'file_samples': file_samples
        }
    
    async def _generate_organization_plan(
        self,
        metadata: Dict,
        target_root: str
    ) -> OrganizationPlan:
        """Use Phi-3.5 to generate organization plan."""
        
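        # NOTE: top_tags is a single ranking across all media types, so the
        # slices below are placeholders, not true per-type tag lists.
        prompt = self.ORGANIZATION_PROMPT_TEMPLATE.format(
            total_files=metadata['total_files'],
            image_count=metadata['image_count'],
            video_count=metadata['video_count'],
            audio_count=metadata['audio_count'],
            top_image_tags=', '.join(metadata['top_tags'][:10]),
            top_video_tags=', '.join(metadata['top_tags'][10:15]),
            top_audio_tags=', '.join(metadata['top_tags'][15:20]),
            # The template requires {date_range}; omitting it raises KeyError
            date_range=metadata.get('date_range') or 'unknown',
            themes=', '.join(metadata['top_categories']),
            file_samples=json.dumps(metadata['file_samples'], indent=2)
        )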
        
        # Use Phi-3.5's 128K context
        result = await self.phi35_client.generate(
            prompt=prompt,
            max_tokens=4096,  # Allow detailed response
            temperature=0.5,  # Some creativity for structure
            response_format={"type": "json_object"}
        )
        
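        # The model may return malformed JSON despite response_format
        try:
            plan_data = json.loads(result)
        except json.JSONDecodeError as exc:
            raise ValueError("Phi-3.5 returned invalid JSON") from exc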
        
        # Convert to OrganizationPlan object
        return self._parse_plan(plan_data, target_root)
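
The `_parse_plan` step is not shown in this issue. One way its tree-building part might look is sketched below, under stated assumptions: it uses a trimmed-down `DirectoryNode` (fewer fields than the data model above), treats paths as forward-slash separated, and the function name `build_directory_tree` is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # eq=False avoids recursive comparison via parent links
class DirectoryNode:
    name: str
    path: str
    description: str = ""
    category: str = ""
    parent: Optional["DirectoryNode"] = field(default=None, repr=False)
    children: List["DirectoryNode"] = field(default_factory=list)


def build_directory_tree(plan_data: dict, target_root: str = ".") -> DirectoryNode:
    """Turn the flat 'directory_structure' list into a nested tree."""
    root = DirectoryNode(name=target_root, path=target_root)
    index: Dict[str, DirectoryNode] = {target_root: root}

    for entry in plan_data.get("directory_structure", []):
        node, current = root, target_root
        for part in entry["path"].strip("/").split("/"):
            current = f"{current}/{part}"
            if current not in index:
                # Create intermediate directories on demand.
                child = DirectoryNode(name=part, path=current, parent=node)
                node.children.append(child)
                index[current] = child
            node = index[current]
        # Description and category apply to the leaf named by the entry.
        node.description = entry.get("description", "")
        node.category = entry.get("category", "")
    return root
```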

Acceptance Criteria

  • ✅ Can analyze 10K+ files in reasonable time (<10 min)
  • ✅ Generates intuitive directory structures
  • ✅ File assignments >80% user acceptance
  • ✅ Confidence scores correlate with accuracy
  • ✅ Rule engine works with complex conditions
  • ✅ Dry-run preview works correctly
  • ✅ Rollback functionality prevents data loss
  • ✅ User feedback improves suggestions over time
  • ✅ Handles mixed media types correctly
  • ✅ 128K context enables whole-collection reasoning
  • ✅ All tests passing
  • ✅ Documentation complete

Advanced Features (Future Enhancements)

  • 🔄 Continuous organization (auto-organize new files)
  • 🎯 Smart duplicate detection across directories
  • 📊 Organization quality metrics dashboard
  • 🤖 Learn from user's existing organization patterns
  • 🔗 Cross-reference related files (same event, different media types)
  • 📅 Automatic event album creation
  • 🏷️ Suggest tags for untagged files based on directory placement

Use Cases

  • 📸 Organize 10 years of unsorted photos by events, people, locations
  • 🎥 Categorize video library by content type and topic
  • 🎙️ Structure podcast collection by theme and speaker
  • 📁 Clean up messy Downloads folder automatically
  • 🗂️ Migrate from flat structure to hierarchical organization
  • 🔍 Make files discoverable through semantic search
