Enhance Media Support, UI Improvements, and Pipeline Robustness with Gemini Integration and Video Processing Features #47
Merged
…te UI to reflect task status. #86ewqdkec
…nd-Tasks_Navid-Shad feat: Implement background task management for preprocessing and update UI to reflect task status #86ewqdkec
…cessing tasks with enhanced pipeline context and UI feedback.
…ithin the video processing pipeline.
…odes and pass the user message ID to the video processing pipeline.
…nversationNode component #86ex2bna2
…raction to specific UI elements #86ex2bna2
…support #86ex2bna2
…ion in graph message data #86ex2bna2
…de components #86ex2bna2
…asks_Navid-Shad Support parallel tasks #86ex2bna2
….1 Flash support, and update UI to display generated thumbnails. #86ex2rmyn
… for branching and node removal #86ex2rmyn
…l-Generation-Pipeline-with-Background-Scene-Enrichment_Navid-Shad Implement Recursive Message Branch Deletion, Automated Thumbnail Generation, and UI Enhancements #86ex2rmyn
…proved segment indexing #86ewqdkht
…anscript enrichment logic #86ewqdkht
…ent to merge visual descriptions into every segment #86ewqdkht
…ser-in-correction-stage_Navid-Shad Enhance Transcript Processing with EnrichedTimelineSegment, Pipeline Phase, and Improved Format #86ewqdkht
…ed control overlays #86ex3gkk8
… ResultNode component #86ex3gkk8
… UI to display segment details. #86ex3gkk8
…idth and repositioned cost display #86ex3gkk8
…deo-controller-Video-detail-Image-detail_Navid-Shad Refactor header layout and ResultNode UI; enhance timeline segments with visual metadata #86ex3gkk8
…rkflow #86ex3ghzu
…m components, and ambient animations
…ce supply controller
- Multimodal Context: Updated GeminiAdapter and thread context to aggregate and send user-selected images to the intent recognizer.
- Visual-First Intent: Optimized determineIntent to prioritize Enriched Timeline Segments (scene descriptions) instead of raw transcripts for visual tasks.
- Intelligent Supply Controller: Introduced a new pipeline phase to manage reference images. It strictly uses user attachments if provided, or intelligently selects a subset of video frames based on AI intent to avoid token overflows.
- Reliability Fixes: Added automatic directory creation in GeminiAdapter to prevent ENOENT errors during image generation.
- Performance: Limited intent image history to the last 8 images to maintain low latency and context relevance.
Task ID: #86ex50815
…pport-as-initial-resource-apart-link-and-video-file_Navid-Shad Implement multimodal intent recognition, image attachments, and UI improvements for AttachmentModal #86ex50815
…ry and ensure its creation
…tent colors and transitions
…ttachment modal layout
… changing temp directory
…nce frame retrieval in the supply phase
…artifact detection with UI feedback
…ansient error detection for Gemini adapter
…and improve thumbnail generation prompt and image processing logic
… ThumbnailNode UI
…f real-world names for privacy compliance
…extraction to filter out thought blocks
…hed images as visual context to the intent and generation phases
…file deletion to generated assets
…-rendering and update settings UI to support new operation.
…d refresh screenshot
📋 Summary
This PR adds support for multiple media formats, including images and videos, alongside refactoring across the project. Image processing gains upscaling, iteration, and refinement through Gemini creative re-rendering. The video processing pipeline now uses yt-dlp for video downloads and adds real-time progress tracking, resolution selection, and playback support within the UI.
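As a rough illustration of how resolution selection and progress tracking can sit on top of yt-dlp, the sketch below builds a command-line argument list and parses percentage values from yt-dlp's `[download]` progress lines. The function names `buildYtDlpArgs` and `parseProgress` are hypothetical and do not come from this PR; the sketch assumes the pipeline reads yt-dlp's standard output line by line, which the PR description does not confirm.

```typescript
// Hypothetical helpers for illustration only; the PR's actual integration may differ.

/** Build yt-dlp arguments that cap the video height at the chosen resolution. */
function buildYtDlpArgs(url: string, maxHeight: number, outputTemplate: string): string[] {
  // Prefer separate video+audio streams up to maxHeight, else the best combined stream.
  const selector = `bestvideo[height<=${maxHeight}]+bestaudio/best[height<=${maxHeight}]`;
  return [
    "-f", selector,
    "--newline", // emit each progress update on its own line, which is easier to parse
    "-o", outputTemplate,
    url,
  ];
}

/**
 * Extract the percentage from a yt-dlp progress line such as
 * "[download]  42.3% of 10.00MiB at 1.00MiB/s ETA 00:05".
 * Returns null for lines that carry no percentage.
 */
function parseProgress(line: string): number | null {
  const match = /\[download\]\s+(\d+(?:\.\d+)?)%/.exec(line);
  return match ? Number(match[1]) : null;
}
```

The argument list can be passed to a child process, and each stdout line fed to `parseProgress` to drive a progress indicator. Lines that return `null` (metadata, merge messages) can simply be skipped.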
The PR also refactors UI components for better maintainability and usability: graph nodes are redesigned with interactive previews, chat inputs are upgraded to support images, and modal layouts are improved. It introduces background task management with cancellation support via AbortSignal, along with retry logic and error handling for Gemini API calls.
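The combination of AbortSignal cancellation and retry logic can be sketched as follows. This is a minimal example under stated assumptions, not the PR's implementation: `withRetry`, `sleep`, `RetryOptions`, the exponential backoff schedule, and the `isTransient` predicate are all hypothetical names and choices introduced for illustration.

```typescript
// Hypothetical retry helper; names, defaults, and backoff policy are assumptions.
interface RetryOptions {
  maxAttempts: number;                     // total attempts, including the first
  baseDelayMs: number;                     // delay before the second attempt
  signal?: AbortSignal;                    // lets the caller cancel the whole operation
  isTransient: (err: unknown) => boolean;  // decides whether an error is worth retrying
}

/** Wait for ms milliseconds, rejecting early if the signal aborts. */
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("aborted"));
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

/** Run task, retrying transient failures with exponential backoff until cancelled. */
async function withRetry<T>(
  task: (signal?: AbortSignal) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    if (opts.signal?.aborted) throw new Error("aborted");
    try {
      return await task(opts.signal);
    } catch (err) {
      lastError = err;
      const isLast = attempt === opts.maxAttempts;
      // Give up immediately on non-transient errors or when attempts are exhausted.
      if (isLast || !opts.isTransient(err)) throw err;
      await sleep(opts.baseDelayMs * 2 ** (attempt - 1), opts.signal);
    }
  }
  throw lastError;
}
```

Passing the same signal to both the task and the backoff delay means a cancelled background task stops promptly, whether it is mid-request or waiting between attempts.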
The repository documentation and project branding have been updated accordingly, and a global design system incorporating custom colors and ambient animations has been integrated.
🔗 Related Tasks
📝 Additional Details
📜 Commit List