Enhance Media Support, UI Improvements, and Pipeline Robustness with Gemini Integration and Video Processing Features by navidshad · Pull Request #47 · navidshad/frameflow

navidshad · 2026-04-06T14:09:45Z

📋 Summary

This PR introduces comprehensive enhancements and refactoring across the project, including support for multiple media formats such as images and videos. It implements advanced image processing features like upscaling, iteration, and refinement with Gemini creative re-rendering. The video processing pipeline is significantly improved with yt-dlp integration for video downloads, real-time progress tracking, resolution selection, and playback support within the UI.
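For reviewers unfamiliar with the resolution selection step, here is a minimal sketch of how a format could be picked from a fetched list. The `VideoFormat` shape and the `pickFormat` helper are hypothetical and are not taken from this PR; real yt-dlp metadata carries many more fields.

```typescript
// Hypothetical shape of one downloadable format (illustration only).
interface VideoFormat {
  formatId: string;
  height: number; // vertical resolution in pixels, e.g. 720
}

// Pick the tallest format that does not exceed the requested height.
// If every format is taller than requested, fall back to the smallest one.
function pickFormat(formats: VideoFormat[], maxHeight: number): VideoFormat | undefined {
  if (formats.length === 0) return undefined;
  const sorted = [...formats].sort((a, b) => b.height - a.height);
  return sorted.find((f) => f.height <= maxHeight) ?? sorted[sorted.length - 1];
}
```

Falling back to the smallest format rather than failing is one possible design choice; the PR may handle this case differently.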

Additionally, the PR refactors various UI components for better maintainability and usability, including redesigning graph nodes with interactive previews, upgrading chat inputs to support images, and improving modal layouts. Background task management and cancellation support via AbortSignal are introduced, along with robust retry logic and error handling for Gemini API calls.
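The cancellation and retry behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `withRetry`, `sleep`, and `isTransient` are hypothetical names, and the status codes treated as transient are a guess.

```typescript
// Hypothetical transient-error check; the PR's actual criteria are not shown here.
function isTransient(err: unknown): boolean {
  const status = (err as { status?: number } | null)?.status;
  return status === 429 || status === 500 || status === 503;
}

// Sleep that rejects early when the signal is aborted.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) return reject(new Error("aborted"));
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("aborted"));
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

// Retry `fn` with exponential backoff, stopping on abort or non-transient errors.
async function withRetry<T>(
  fn: (signal?: AbortSignal) => Promise<T>,
  opts: { retries?: number; baseDelayMs?: number; signal?: AbortSignal } = {},
): Promise<T> {
  const { retries = 3, baseDelayMs = 500, signal } = opts;
  for (let attempt = 0; ; attempt++) {
    if (signal?.aborted) throw new Error("aborted");
    try {
      return await fn(signal);
    } catch (err) {
      if (attempt >= retries || !isTransient(err)) throw err;
      // Delay doubles each attempt: baseDelayMs, 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** attempt, signal);
    }
  }
}
```

Passing the same `AbortSignal` to both the wrapped call and the backoff sleep means a stop request takes effect even while the task is waiting between attempts.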

The repository documentation and project branding have been updated accordingly, and a global design system incorporating custom colors and ambient animations has been integrated.

🔗 Related Tasks

#86ex50815 - Implement multimodal intent recognition, image support, and intelligent reference supply controller
#86ex3gqx6 - Add video link support, yt-dlp integration, video resolution selection, and improved download reliability
#86ex2rmyn - Implement automated thumbnail generation pipeline with background scene enrichment and Gemini 3.1 Flash support
#86ex2bna2 - Add markdown rendering, persistent graph node positioning, and drag-and-drop support for ConversationNode and ResultNode
#86ewqdkht - Fix transcript parser and update transcript enrichment with visual metadata
#86ewqdkec - Implement background task management with UI status updates and cancellation support
#86ex41t5v - Add temporary directory safety checks, UI warnings, system instruction support, and professional thumbnail design prompts
#86ex4147w - Add copy-to-clipboard functionality and auto-resizing chat inputs with keyboard shortcuts
#86ex3ghzu - Integrate AI-generated release notes into GitHub release workflow
#86ex3gkk8 - Add video controller cost details and visual metadata display in timeline segments

📝 Additional Details

Centralized temporary directory path management with safety checks and subdirectory organization to prevent file deletion outside generated assets.
Lazy initialization of managers and improved artifact detection with UI feedback to enhance stability.
Enhanced thumbnail generation prompts and image processing logic for better output quality.
Introduced pipeline cancellation and retry mechanisms with exponential backoff for resilience against transient errors.
Improved download and video playback UI with real-time progress indicators and resolution-specific options.
Refactored UI components to standardize styles and disabled button states, and to incorporate a new global design system featuring glassmorphism and ambient animations.
Documentation updates include supported media formats, optimization details, and project rebranding to FrameFlow.
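As an illustration of the temporary directory safety check mentioned in the first item above, a deletion guard could look something like this. The `isInsideDir` helper and the example paths are assumptions made for the sketch; the PR's actual constants and checks are not reproduced here.

```typescript
import * as path from "node:path";

// Returns true only if `target` resolves to a location strictly inside `root`.
// This is a lexical check: it does not resolve symlinks.
function isInsideDir(root: string, target: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(target));
  if (rel === "" || path.isAbsolute(rel)) return false; // same dir, or different drive
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}
```

A caller would refuse to delete any path for which `isInsideDir(generatedAssetsRoot, candidate)` is false, where `generatedAssetsRoot` is a hypothetical name for the FrameFlow temp subdirectory.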

📜 Commit List

188f225 docs: add supported media formats and optimization details to README
aeabbc7 docs: simplify dashboard screenshot styling in README
572d04f docs: remove Brain pipeline diagram and reproducibility link from README
c28b12f style: update MediaNode metadata text colors to support light mode and refresh screenshot
0268bc2 docs: update project banner and description layout in README
f4049fc docs: reorder README header elements and reposition banner image
91e899b docs: update README with new interface screenshots and project attribution
16b2677 feat: enable thinking configuration with 8000 budget in image generation adapter
905cfce feat: implement image upscaling functionality with Gemini creative re-rendering and update settings UI to support new operation
a1086a7 refactor: centralize temporary directory path constants and restrict file deletion to generated assets
a47a399 feat: add support for image iteration and refinement by passing attached images as visual context to the intent and generation phases
726cf45 feat: skip image extraction task if image text data is already cached
ddf8a91 feat: enable Gemini thinking mode and implement robust response text extraction to filter out thought blocks
03bb720 feat: update prompt instructions to use generic descriptors instead of real-world names for privacy compliance
591d1d4 feat: extract model text output from Gemini adapter and display it in ThumbnailNode UI
5b8c23c refactor: implement dynamic node height calculation for graph layout and improve thumbnail generation prompt and image processing logic
09617f0 refactor: improve background task error handling, retry logic, and transient error detection for Gemini adapter
1c01678 refactor: implement lazy initialization for managers and add missing artifact detection with UI feedback
da3a464 refactor: include frame paths in scene descriptions to improve reference frame retrieval in the supply phase
ab07cc3 feat: implement automatic thread path repair and synchronization when changing temp directory
e541405 feat: implement real-time thread updates across windows and improve attachment modal layout
fe60fb7 style: standardize disabled button states across UI pages with consistent colors and transitions
bcc229c feat: update temporary directory path to include FrameFlow subdirectory and ensure its creation
08da113 refactor: update UploadPage UI text and improve code formatting
7da8681 chore: rebrand project to FrameFlow and update documentation accordingly
bc4cf92 Merge pull request #46 from navidshad/CU-86ex50815_Implement-image-support-as-initial-resource-apart-link-and-video-file_Navid-Shad
19bbec7 feat: implement multimodal intent recognition and intelligent reference supply controller
b2ba3ae style: increase grid column count in AttachmentModal for better layout density #86ex50815
17341c6 refactor: replace manual input implementations with BaseMessageInput component across all graph nodes to support image attachments #86ex50815
0ecfb61 refactor: redesign AttachmentModal using shared components and unify image source handling #86ex50815
33b697e feat: integrate multi-image processing pipeline and image-only graph threads (#86ex50815)
a5f55f0 refactor: decompose ResultNode into specialized SummaryNode, ThumbnailNode, and VideoNode components for better maintainability
71d2439 refactor: improve yt-dlp download reliability with path normalization, ffmpeg binary resolution, and robust thread directory management #86ex3gqx6
23a3b9d Merge remote-tracking branch 'origin/dev' into CU-86ex3gqx6_Implement-link-support-then-user-is-able-to-provide-video-link-and-start-a-project_somayeh-roohani
d546e4e Merge pull request #45 from navidshad/CU-86ex3gw92_Add-gemini-batch-support-for-scene-analysis_Navid-Shad
0a4910d feat: implement robust retry logic with exponential backoff for Gemini API calls and add batch support for scene description model and image-based structured generation
09a22da refactor: replace manual child_process spawning with ytdlp-nodejs wrapper for yt-dlp operations #86ex3gqx6
e5bcb86 refactor: make pipeline execution asynchronous and add loading state to summary creation UI #86ex3gqx6
e5d7b47 feat: add real-time download progress tracking and UI visualization for video downloads #86ex3gqx6
7ff8b36 feat: add video metadata retrieval and display in ResultNode component
57b6856 refactor: remove hover-based opacity transitions and update overlay z-indexing for media nodes
6f63783 feat: implement video resolution selection by adding format fetching and resolution-specific download support #86ex3gqx6
b1c2bcb feat: implement video URL download support using yt-dlp integration #86ex3gqx6
e06201b feat: add status field to background tasks and display it in MediaNode UI
e6cad9b feat: add AbortSignal to task context for cancellation support
96915c0 refactor: organize temporary files into subdirectories, implement immediate usage recording with abort checks, and add stop confirmation UI for pipeline tasks
33f5fe5 refactor: ensure usage is recorded immediately and add stop confirmation UI for pipeline tasks
69b59a4 feat: implement pipeline cancellation support using AbortSignal across FFmpeg tasks and processing phases
c84b74e feat: add system instruction support to Gemini adapter and integrate professional thumbnail design prompts #86ex41t5v
8a42dc1 feat: add temporary directory safety checks and UI warnings for unstable storage paths #86ex41t5v
4105e65 feat: add copy-to-clipboard functionality to conversation messages #86ex4147w
0026d25 feat: upgrade chat inputs to auto-resizing textareas with consistent focus styling and keyboard shortcuts #86ex4147w
e9cd917 docs: update PilotUI documentation with source URLs, improved navigation, and corrected Button component props
347ce54 feat: implement global design system with custom colors, glassmorphism components, and ambient animations
2341b3b feat: integrate AI-generated release notes into the GitHub release workflow #86ex3ghzu
54ac31a Merge pull request #38 from navidshad/CU-86ex3gkk8_Add-cost-detail-Video-controller-Video-detail-Image-detail_Navid-Shad
7ce8506 refactor: update GraphChatPage header layout with constrained title width and repositioned cost display #86ex3gkk8
54efd4b feat: upgrade timeline segments to include visual metadata and update UI to display segment details. #86ex3gkk8
eaf3698 refactor: rename videoUrl to mediaContentUrl and add image support to ResultNode component #86ex3gkk8
874788d refactor: redesign ResultNode UI with media-centric layout and enhanced control overlays #86ex3gkk8
9bd4d73 feat: introduce EnrichedTimelineSegment and update transcript enrichment to merge visual descriptions into every segment #86ewqdkht
12ce900 feat: add waitForEnrichTranscript pipeline phase and remove inline transcript enrichment logic #86ewqdkht
30b811b refactor: replace SRT format with line-based transcript format for improved segment indexing #86ewqdkht
6f3a146 feat: implement recursive message branch deletion and add UI controls for branching and node removal #86ex2rmyn
7b8a56c feat: implement automated thumbnail generation pipeline, add Gemini 3.1 Flash support, and update UI to display generated thumbnails. #86ex2rmyn
07ba7d6 chore: release 1.1.6 [skip ci]
ed0c467 feat: add markdown rendering support to ConversationNode and ResultNode components #86ex2bna2
104eee1 feat: add version and file type badges to ResultNode and include version in graph message data #86ex2bna2
193c978 feat: implement persistent graph node positioning with drag-and-drop support #86ex2bna2
57087e0 feat: add draggable handle to ConversationNode and restrict drag interaction to specific UI elements #86ex2bna2
a87994b feat: redesign graph nodes with interactive video previews and add ConversationNode component #86ex2bna2
5439720 feat: Implement video playback and download functionality in result nodes and pass the user message ID to the video processing pipeline.
ad82624 feat: Add extensive debug logging and improve asynchronous handling within the video processing pipeline.
4f2595e feat: Implement retry functionality for message processing and preprocessing tasks with enhanced pipeline context and UI feedback.
d717e4c feat: Implement robust cross-platform scenedetect binary and module path resolution.
5926744 feat: introduce Vue Flow graph-based chat interface for parallel tasks #86ex2bna2
033beb8 feat: Implement background task management for preprocessing and update UI to reflect task status. #86ewqdkec
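Commit ddf8a91 in the list above mentions filtering thought blocks out of the Gemini response text. A minimal sketch of that idea follows, assuming each response part carries an optional `text` and a boolean `thought` flag. The `ResponsePart` type and `extractAnswerText` name are hypothetical and simplified; they are not the adapter's real types.

```typescript
// Assumed, simplified shape of a response part (illustration only).
interface ResponsePart {
  text?: string;
  thought?: boolean; // assumed flag marking reasoning ("thought") parts
}

// Join the text of all non-thought parts into the final answer string.
function extractAnswerText(parts: ResponsePart[]): string {
  return parts
    .filter((p) => !p.thought && typeof p.text === "string")
    .map((p) => p.text as string)
    .join("")
    .trim();
}
```

Parts without text (for example, inline image data) are skipped, so the helper can run on mixed image and text responses.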

…ce supply controller

- Multimodal Context: Updated GeminiAdapter and thread context to aggregate and send user-selected images to the intent recognizer.
- Visual-First Intent: Optimized determineIntent to prioritize Enriched Timeline Segments (scene descriptions) instead of raw transcripts for visual tasks.
- Intelligent Supply Controller: Introduced a new pipeline phase to manage reference images. It strictly uses user attachments if provided, or intelligently selects a subset of video frames based on AI intent to avoid token overflows.
- Reliability Fixes: Added automatic directory creation in GeminiAdapter to prevent ENOENT errors during image generation.
- Performance: Limited intent image history to the last 8 images to maintain low latency and context relevance.

Task ID: #86ex50815
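The supply rule and the 8-image history cap described in this commit message can be sketched as below. The function names are hypothetical, and `candidateFrames` is assumed to be already ranked by the intent phase; how that ranking is produced is not shown in this PR page.

```typescript
const MAX_INTENT_IMAGES = 8; // from the commit message: "last 8 images"

// Keep only the most recent images for the intent recognizer.
function recentImages(history: string[], limit: number = MAX_INTENT_IMAGES): string[] {
  return limit > 0 ? history.slice(-limit) : [];
}

// User attachments win; otherwise use a capped subset of candidate frames
// (assumed to be ordered by relevance already).
function supplyReferences(
  attachments: string[],
  candidateFrames: string[],
  maxFrames: number,
): string[] {
  if (attachments.length > 0) return attachments;
  return candidateFrames.slice(0, Math.max(0, maxFrames));
}
```

Capping both lists keeps the request size bounded, which matches the stated goal of avoiding token overflows.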

navidshad · 2026-04-06T14:11:02Z

navidshad and others added 30 commits February 25, 2026 01:11

4e4fde7 Merge branch 'main' into dev
033beb8 feat: Implement background task management for preprocessing and update UI to reflect task status. #86ewqdkec
092948e Merge pull request #34 from navidshad/CU-86ewqdkec_Implement-Background-Tasks_Navid-Shad feat: Implement background task management for preprocessing and update UI to reflect task status #86ewqdkec
5926744 feat: introduce Vue Flow graph-based chat interface for parallel tasks #86ex2bna2
d717e4c feat: Implement robust cross-platform scenedetect binary and module path resolution.
4f2595e feat: Implement retry functionality for message processing and preprocessing tasks with enhanced pipeline context and UI feedback.
ad82624 feat: Add extensive debug logging and improve asynchronous handling within the video processing pipeline.
5439720 feat: Implement video playback and download functionality in result nodes and pass the user message ID to the video processing pipeline.
a87994b feat: redesign graph nodes with interactive video previews and add ConversationNode component #86ex2bna2
57087e0 feat: add draggable handle to ConversationNode and restrict drag interaction to specific UI elements #86ex2bna2
193c978 feat: implement persistent graph node positioning with drag-and-drop support #86ex2bna2
104eee1 feat: add version and file type badges to ResultNode and include version in graph message data #86ex2bna2
ed0c467 feat: add markdown rendering support to ConversationNode and ResultNode components #86ex2bna2
063aa69 Merge pull request #35 from navidshad/CU-86ex2bna2_Support-parallel-tasks_Navid-Shad Support parallel tasks #86ex2bna2
07ba7d6 chore: release 1.1.6 [skip ci]
1bc2dbd Merge remote-tracking branch 'origin/main' into dev
7b8a56c feat: implement automated thumbnail generation pipeline, add Gemini 3.1 Flash support, and update UI to display generated thumbnails. #86ex2rmyn
6f3a146 feat: implement recursive message branch deletion and add UI controls for branching and node removal #86ex2rmyn
719b7c7 Merge pull request #36 from navidshad/CU-86ex2rmyn_Implement-Thumbnail-Generation-Pipeline-with-Background-Scene-Enrichment_Navid-Shad Implement Recursive Message Branch Deletion, Automated Thumbnail Generation, and UI Enhancements #86ex2rmyn
30b811b refactor: replace SRT format with line-based transcript format for improved segment indexing #86ewqdkht
12ce900 feat: add waitForEnrichTranscript pipeline phase and remove inline transcript enrichment logic #86ewqdkht
9bd4d73 feat: introduce EnrichedTimelineSegment and update transcript enrichment to merge visual descriptions into every segment #86ewqdkht
b80559b Merge pull request #37 from navidshad/CU-86ewqdkht_Fix-transcript-parser-in-correction-stage_Navid-Shad Enhance Transcript Processing with EnrichedTimelineSegment, Pipeline Phase, and Improved Format #86ewqdkht
874788d refactor: redesign ResultNode UI with media-centric layout and enhanced control overlays #86ex3gkk8
eaf3698 refactor: rename videoUrl to mediaContentUrl and add image support to ResultNode component #86ex3gkk8
54efd4b feat: upgrade timeline segments to include visual metadata and update UI to display segment details. #86ex3gkk8
7ce8506 refactor: update GraphChatPage header layout with constrained title width and repositioned cost display #86ex3gkk8
54ac31a Merge pull request #38 from navidshad/CU-86ex3gkk8_Add-cost-detail-Video-controller-Video-detail-Image-detail_Navid-Shad Refactor header layout and ResultNode UI; enhance timeline segments with visual metadata #86ex3gkk8
2341b3b feat: integrate AI-generated release notes into the GitHub release workflow #86ex3ghzu
347ce54 feat: implement global design system with custom colors, glassmorphism components, and ambient animations

navidshad added 2 commits April 6, 2026 16:59

bc4cf92 Merge pull request #46 from navidshad/CU-86ex50815_Implement-image-support-as-initial-resource-apart-link-and-video-file_Navid-Shad Implement multimodal intent recognition, image attachments, and UI improvements for AttachmentModal #86ex50815

navidshad added 25 commits April 6, 2026 18:53

7da8681 chore: rebrand project to FrameFlow and update documentation accordingly
08da113 refactor: update UploadPage UI text and improve code formatting
bcc229c feat: update temporary directory path to include FrameFlow subdirectory and ensure its creation
fe60fb7 style: standardize disabled button states across UI pages with consistent colors and transitions
e541405 feat: implement real-time thread updates across windows and improve attachment modal layout
ab07cc3 feat: implement automatic thread path repair and synchronization when changing temp directory
da3a464 refactor: include frame paths in scene descriptions to improve reference frame retrieval in the supply phase
1c01678 refactor: implement lazy initialization for managers and add missing artifact detection with UI feedback
09617f0 refactor: improve background task error handling, retry logic, and transient error detection for Gemini adapter
5b8c23c refactor: implement dynamic node height calculation for graph layout and improve thumbnail generation prompt and image processing logic
591d1d4 feat: extract model text output from Gemini adapter and display it in ThumbnailNode UI
03bb720 feat: update prompt instructions to use generic descriptors instead of real-world names for privacy compliance
ddf8a91 feat: enable Gemini thinking mode and implement robust response text extraction to filter out thought blocks
726cf45 feat: skip image extraction task if image text data is already cached
a47a399 feat: add support for image iteration and refinement by passing attached images as visual context to the intent and generation phases
a1086a7 refactor: centralize temporary directory path constants and restrict file deletion to generated assets
905cfce feat: implement image upscaling functionality with Gemini creative re-rendering and update settings UI to support new operation.
16b2677 feat: enable thinking configuration with 8000 budget in image generation adapter
91e899b docs: update README with new interface screenshots and project attribution
f4049fc docs: reorder README header elements and reposition banner image
0268bc2 docs: update project banner and description layout in README
c28b12f style: update MediaNode metadata text colors to support light mode and refresh screenshot
572d04f docs: remove Brain pipeline diagram and reproducibility link from README
aeabbc7 docs: simplify dashboard screenshot styling in README
188f225 docs: add supported media formats and optimization details to README

navidshad changed the title ~~Dev~~ Enhance Media Support, UI Improvements, and Pipeline Robustness with Gemini Integration and Video Processing Features Apr 7, 2026

navidshad merged commit 8af65d3 into main Apr 7, 2026
1 check passed

navidshad merged 89 commits into main from dev

2 participants
