Migrate to browser-based AI inference with Transformers.js #13
Conversation
Co-authored-by: tpC529 <106394358+tpC529@users.noreply.github.com>
@copilot change the description to specifically review and explain the code block `const result = await modelPipeline(croppedImage, {`
I've updated the PR description to include a detailed explanation of the code block. The key points:

What this code does:

Parameter explanations:

The low temperature (0.3) ensures consistent, factual code descriptions rather than creative interpretations. The 100-token limit balances detail with readability in the floating panel UI. The result processing handles the different response formats the model can return (an array, an object with generated_text or text, or a raw string).
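The robust text extraction described above could look roughly like the following sketch. `extractGeneratedText` is a hypothetical helper name, not the extension's actual function; the three input shapes it accepts are the ones listed in the PR:

```javascript
// Hypothetical helper: normalize the pipeline result into a plain string.
// Transformers.js pipelines typically return [{ generated_text: "..." }],
// but this defensively also accepts a bare object or a raw string.
function extractGeneratedText(result) {
  // Array form: take the first candidate.
  if (Array.isArray(result)) {
    result = result[0];
  }
  // Raw string form.
  if (typeof result === "string") {
    return result.trim();
  }
  // Object form: { generated_text: "..." } or { text: "..." }.
  if (result && typeof result === "object") {
    const text = result.generated_text ?? result.text;
    if (typeof text === "string") return text.trim();
  }
  return ""; // unrecognized shape: fail soft with an empty description
}
```

Failing soft with an empty string keeps the floating panel from rendering `undefined` if the model output shape ever changes.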
Replaces Python backend + Ollama with Transformers.js for browser-native vision-language inference, eliminating server requirements and improving performance on older GPUs like Intel Iris Xe.
Architecture Changes
Implementation
Core Inference (model-worker.js)

Parameter details:

- prompt: natural-language instruction directing the model to identify and explain code/text in the image
- max_new_tokens: 100: constrains output length to balance detail with readability in the floating panel UI; 100 tokens ≈ 75-100 words
- temperature: 0.3: controls output randomness; a low value (0.0-0.3) produces consistent, factual descriptions rather than creative interpretations, essential for reliable code analysis
- Result processing handles the different response shapes (generated_text, text, or a raw string) for robust text extraction

Content Script (content.js)

- initializeModelWorker() with progress tracking
- processWithBrowser() for local inference
- processWithBackend() for legacy mode

Settings (options.html, options.js)

- inferenceMode + backendUrl

Manifest (manifest.json)

- 'wasm-unsafe-eval' for WebAssembly execution
- model-worker.js
- Version 1.0.0 → 2.0.0

Model Selection
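For reference, the `'wasm-unsafe-eval'` change above fits Chrome's Manifest V3 CSP format roughly like this sketch (not the extension's actual manifest; other keys are omitted):

```json
{
  "manifest_version": 3,
  "version": "2.0.0",
  "content_security_policy": {
    "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'"
  }
}
```

`'wasm-unsafe-eval'` is required because WebAssembly compilation is otherwise blocked by the default MV3 extension CSP.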
Evaluated Florence-2, Moondream2, BLIP, and ViT-GPT2. Selected ViT-GPT2 for:
Florence-2 and Moondream2 deferred until browser support stabilizes.
Performance Profile
Documentation
- MIGRATION_EVALUATION.md: technical evaluation of 6 frameworks
- TESTING_GUIDE.md: comprehensive test matrix
- IMPLEMENTATION_SUMMARY.md: change inventory and metrics

Testing Surface
Migration Path: Users default to browser mode. Backend mode available via settings for those requiring Ollama/moondream:1.8b.
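The mode switch could be wired roughly as sketched below. The handler names `processWithBrowser`/`processWithBackend` and the setting keys `inferenceMode`/`backendUrl` come from the PR; `loadSettings`, `chooseProcessor`, and the default values are assumptions for illustration:

```javascript
// Sketch only: defaults and helper names are assumed, not taken from the repo.
const DEFAULT_SETTINGS = {
  inferenceMode: "browser",                 // browser mode is the default
  backendUrl: "http://127.0.0.1:8000/api",  // legacy Ollama backend
};

// Merge whatever chrome.storage returned with the defaults.
function loadSettings(stored) {
  return { ...DEFAULT_SETTINGS, ...(stored ?? {}) };
}

// Pick the processing path for the current settings.
function chooseProcessor(settings, handlers) {
  return settings.inferenceMode === "backend"
    ? handlers.processWithBackend
    : handlers.processWithBrowser;
}
```

Keeping the decision in one small pure function makes the browser/backend split easy to unit-test without a live model or server.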
Original prompt
Problem Statement
The CodeLearner extension currently uses a Python backend (backend.py) with Ollama to run the moondream:1.8b vision-language model. While functional, this approach is extremely slow on older GPUs such as the Intel Iris Xe found in Dell 3330 laptops.

Objective
Migrate the extension to use Transformers.js for browser-based inference with WebGL/WebGPU acceleration, eliminating the need for the Python backend entirely. This will leverage the browser's GPU acceleration capabilities and improve performance significantly on older hardware.
Current Architecture
The extension currently works as follows:
- content.js sends the captured screenshot to http://127.0.0.1:8000/api
- The backend (backend.py) uses Ollama to process the image with the moondream:1.8b model

Files involved:
- backend.py (88 lines): Python FastAPI server using Ollama
- content.js (162 lines): content script handling UI and API calls
- background.js: service worker for screenshot capture
- manifest.json: extension manifest

Requirements
1. Evaluate Alternatives to Transformers.js
Before implementation, research and document the best approach for browser-based vision-language inference:
Options to evaluate:
Evaluation criteria:
Document your findings in a new file: MIGRATION_EVALUATION.md

2. Implement Browser-Based Inference
Based on your evaluation, implement the best solution (likely Transformers.js unless you find a better alternative).
Key changes needed:
A. Remove Python Backend Dependency
- The backend.py file should be deprecated (keep for reference but don't require it)

B. Add Model Loading Script

Create a new file, model-worker.js or similar, that handles model loading and inference.

Suggested models (in order of priority):
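Since the model weights are downloaded on first run, the loading script should surface progress to the user. Transformers.js's `pipeline()` factory accepts a `progress_callback` option; the event shape below (`{ loaded, total, file }`) is an assumption based on its download events, and `formatModelProgress` is a hypothetical helper:

```javascript
// Hypothetical helper: turn a download-progress event into a status string
// suitable for the extension's loading indicator.
function formatModelProgress(event) {
  // Before sizes are known (or for non-download events), show a generic message.
  if (!event || !event.total) return "Loading model...";
  const pct = Math.min(100, Math.round((event.loaded / event.total) * 100));
  return `Downloading ${event.file ?? "model"}: ${pct}%`;
}
```

The worker could post this string back to the content script on each progress event so the panel updates live.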
C. Update content.js
Modify content.js to replace the fetch call to http://127.0.0.1:8000/api (lines 84-101) with local in-browser inference.

D. Update background.js
E. Update manifest.json
Add necessary permissions:
- storage (for model caching)
- webRequest if needed

3. Optimize for Performance
Critical optimizations:
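One likely optimization is preferring WebGPU where the browser supports it and falling back to the WebAssembly backend. The `"webgpu"`/`"wasm"` strings match the device option Transformers.js exposes; `pickDevice` and the navigator probing are assumptions for illustration:

```javascript
// Hypothetical helper: choose an inference backend based on browser support.
// Takes a navigator-like object so it can be tested outside the browser.
function pickDevice(nav) {
  // WebGPU is only exposed as navigator.gpu in supporting browsers.
  if (nav && "gpu" in nav && nav.gpu) return "webgpu";
  return "wasm"; // the WebAssembly backend works wherever MV3 does
}
```

In the worker this result could be passed as the `device` option when constructing the pipeline.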
4. Update Documentation
Update these files:
5. Maintain Backward Compatibility (Optional)
Consider adding a settings option to allow users to choose between:
This could be added to options.html and options.js.

Testing Requirements
Test on:
This pull request was created from Copilot chat.