Skip to content

Retrospective Consortium DB Linking Analysis #7

@ShellLM

Description

@ShellLM

LLM Consortium Database Analysis - Phase 1 Report (Final - Rev 2)

1. Problem Statement & Goal

The responses table in /home/thomas/.config/io.datasette.llm/logs.db needs retrospective population of two columns:

  • consortium_id: A unique identifier to group related requests (members, arbiters, final output) within a single consortium run.
  • is_arbiter_response: A flag (0, 1, or 2) to identify the specific role of a response within a run (0=member/output, 1=final arbiter, 2=intermediate arbiter).

The goal of Phase 1 is to investigate the data, devise a reliable strategy for grouping and identification based on user guidance, and plan the implementation.

2. Investigation Findings

2.1. Database Schema & Structure

  • Confirmed schema: id, model, prompt, response, datetime_utc, consortium_id, is_arbiter_response, etc.
  • Total records: 20,778.
  • Crucially, each record has a unique conversation_id. No existing field groups related consortium run records.
  • Record order by id is the key sequencing factor.

2.2. Key Patterns and Resources

  • Consortium Models List: Located at /home/thomas/Projects/llm/plugins/Utilities/llm-consortium/llm_consortium/consortium_models.txt, containing 100+ specific model names indicating the final output of a run.
  • Deterministic Link (Final Output <-> Final Arbiter): The record immediately preceding (by id) a consortium model record is reliably the final arbiter response for that run. This is the primary anchor for grouping.
  • Arbiter Prompts: Records acting as arbiters (both final and intermediate) typically contain an <arbiter_prompt> tag structure in their prompt field, wrapping <original_prompt> and <model_responses> which contain snippets of member responses.
  • Member Linking: Member responses can be linked to a final arbiter by finding records whose response content exactly matches snippets found within the final arbiter's <model_responses>.
  • Intermediate Arbiters: Other records containing <arbiter_prompt> exist and belong to the same run.

2.3. Policy Decisions (User Confirmation)

  • Intermediate Arbiters: Should receive the same consortium_id as the final arbiter/output. They will be flagged is_arbiter_response = 2 to distinguish them from the final arbiter (is_arbiter_response = 1).
  • Missing/Malformed Final Arbiter: If the record preceding a consortium output is invalid (e.g., missing, or another consortium output), the assignment for that run should be skipped and logged for manual review.
  • Time Window (Search Optimization): A time window (e.g., up to 1 hour before the final arbiter) should be used only to limit the search space for candidate member/intermediate records for performance. The definitive linking mechanism remains content/pattern matching based on the final arbiter's prompt.

3. Final Linking & Identification Strategy

A two-phase approach leveraging the deterministic link and user policies:

Phase 1: Identify Final Arbiter-Consortium Output Pairs & Assign Base IDs

  1. Load Data: Load consortium model names. Read all records from responses table, ordered by id.
  2. Iterate & Identify Pairs: Process records sequentially. When a record R_consortium has a model name matching the consortium list:
    • Look at the immediately preceding record R_arbiter (by id).
    • Validation: Check if R_arbiter exists and is not also a consortium model.
    • If Valid Pair:
      • Generate a unique KSUID-style consortium_id.
      • Prepare updates:
        • R_consortium: consortium_id = new ID, is_arbiter_response = 0
        • R_arbiter: consortium_id = new ID, is_arbiter_response = 1 (Final Arbiter)
      • Store the consortium_id, R_arbiter's id, datetime_utc, and prompt for Phase 2.
    • If Invalid Pair:
      • Log the id of R_consortium and the issue (missing/invalid preceding record) for manual review.
      • Do not assign a consortium_id.

Phase 2: Link Members & Intermediate Arbiters to Groups

  1. Iterate Through Identified Groups: For each consortium_id successfully assigned in Phase 1:
    • Retrieve the associated final arbiter's details (id, datetime_utc, prompt).
    • Parse Final Arbiter Prompt: Extract member response snippets from the <model_responses> section. Handle potential parsing errors.
    • Define Search Window (Optimization): Calculate the start time (e.g., final arbiter datetime_utc - 1 hour) and end time (final arbiter datetime_utc). This window only limits the candidate search space for performance.
    • Query Candidates: Find all records R_candidate such that:
      • R_candidate.datetime_utc is within the search window.
      • R_candidate.id is less than the final arbiter's id.
      • R_candidate.consortium_id is currently NULL (to avoid reprocessing).
    • Process Candidates (Content/Pattern Matching is Key): For each R_candidate:
      • Match Members: If R_candidate.response exactly matches one of the extracted member snippets:
        • Prepare update: R_candidate: consortium_id = current ID, is_arbiter_response = 0
      • Match Intermediate Arbiters: If R_candidate.prompt contains <arbiter_prompt>:
        • Prepare update: R_candidate: consortium_id = current ID, is_arbiter_response = 2 (Intermediate Arbiter)
        • (Ensure this doesn't overwrite a member match if an intermediate arbiter's response happens to match a snippet - prioritize the arbiter=2 flag).

Phase 3: Database Update

  1. Consolidate Updates: Collect all prepared update statements from Phase 1 and Phase 2.
  2. Execute Transaction: Apply all updates within a single database transaction for atomicity.

4. Implementation Plan (Python Script: analyze_consortium_logs.py)

Corrected Mermaid Flowchart:

flowchart TD
    A[Start] --> B[Backup Database]
    B --> C[Load Consortium Models]
    C --> D[Read All Records]
    D --> E[Process Records]
    E --> F[Identify Consortium-Arbiter Pairs]
    F --> G[Generate IDs]
    G --> H[Process Member Records]
    H --> I[Update Database]
    I --> J[End]
Loading

4.1. Key Implementation Points

  • Database Interaction: Use sqlite3 library. Fetch records efficiently. Use transactions for updates.
  • KSUID Generation: Use the ksuid library (pip install ksuid).
  • Arbiter Prompt Parsing: Use robust XML parsing (e.g., xml.etree.ElementTree) to handle potential variations or errors in the <arbiter_prompt> structure.
  • Exact String Matching: Implement precise comparison of response content to extracted snippets.
  • Date/Time Handling: Use Python's datetime module for reliable window calculations (e.g., - timedelta(hours=1)). Ensure timezone awareness if needed (UTC seems standard).
  • Logging: Implement comprehensive logging for progress, assigned groups, skipped records, parsing errors, and database updates.

5. Technical Implementation Snippets (Conceptual - Updated Time Window)

import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta
from ksuid import ksuid # Requires: pip install ksuid

# --- Phase 1 (remains the same) ---
def process_records_phase1(conn, consortium_models):
    # ... (as defined previously) ...
    pass 

# --- Phase 2 (Updated Time Window) ---
def extract_snippets_from_arbiter(prompt_text):
     # ... (as defined previously) ...
     pass

def process_group_phase2(conn, consortium_id, group_info):
    updates_phase2 = []
    final_arbiter_dt = datetime.fromisoformat(group_info['final_arbiter_dt_utc'].replace('+00:00', '')) # Ensure compatible format
    # Use potentially larger window (e.g., 1 hour) for search optimization
    window_start_dt = final_arbiter_dt - timedelta(hours=1) 
    
    member_snippets = extract_snippets_from_arbiter(group_info['final_arbiter_prompt'])
    if not member_snippets:
        print(f"Warning: No member snippets found for consortium_id {consortium_id}")

    # Query candidates within the time window, before the final arbiter, and not yet assigned
    query = """
        SELECT id, prompt, response 
        FROM responses 
        WHERE datetime(datetime_utc) BETWEEN datetime(?) AND datetime(?) 
          AND id < ? 
          AND consortium_id IS NULL 
        ORDER BY id DESC 
    """ 
    
    candidates_cursor = conn.execute(query, (
        window_start_dt.isoformat(), 
        final_arbiter_dt.isoformat(), 
        group_info['final_arbiter_id']
    ))

    matched_ids_in_phase2 = set() # Track IDs updated in this phase to prioritize arbiter=2

    for candidate in candidates_cursor:
        candidate_id = candidate['id']
        
        # Check for Intermediate Arbiter first (higher priority for flagging)
        is_intermediate_arbiter = '<arbiter_prompt>' in candidate['prompt']
        
        if is_intermediate_arbiter:
            if candidate_id not in matched_ids_in_phase2:
                updates_phase2.append({
                    'id': candidate_id, 
                    'consortium_id': consortium_id, 
                    'is_arbiter_response': 2 # Intermediate Arbiter
                })
                matched_ids_in_phase2.add(candidate_id)
        
        # Check for Member match only if not already flagged as intermediate arbiter
        elif member_snippets and candidate_id not in matched_ids_in_phase2:
            # Perform EXACT match - careful with whitespace/newlines
            if candidate['response'] in member_snippets:
                 updates_phase2.append({
                     'id': candidate_id, 
                     'consortium_id': consortium_id, 
                     'is_arbiter_response': 0 # Member
                 })
                 matched_ids_in_phase2.add(candidate_id)

    # Filter out potential duplicates if logic allows (though matched_ids_in_phase2 should prevent it)
    # final_updates_phase2 = list({update['id']: update for update in updates_phase2}.values())


    return updates_phase2

# --- Main Execution Flow ---
# 1. Backup DB
# 2. Load consortium models
# 3. Connect to DB
# 4. updates1, groups2, errors = process_records_phase1(conn, models)
# 5. all_updates = list(updates1) # Use a list directly
# 6. for cid, info in groups2.items():
# 7.    updates2 = process_group_phase2(conn, cid, info)
# 8.    all_updates.extend(updates2)
# 9. Apply all_updates in a transaction using executemany more effectively
#    (Separate lists for consortium_id updates and is_arbiter_response updates)
# 10. Log summary and errors

6. Final Check Before Proceeding (Phase 2)

  • Mermaid diagram syntax corrected.
  • Time window clarification incorporated (search optimization, potentially 1 hour).
  • Strategy remains focused on the reliable id-based link for final pairs and content/pattern matching for members/intermediate arbiters within the search window.
  • Flagging scheme (0, 1, 2) and error handling are defined.

This plan seems complete and addresses the latest feedback. Requesting final approval to proceed to Phase 2: Script Development & Execution.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions