feat: Work on 'all' file format #553

LomikXP · 2025-05-05T05:44:06Z

Related issues / PRs. Summarize issues.

nothing

Summarize Changes

Extends the 'all' file format option so that all formats can be used when transcribing with Whisper
Extends the txt file format to split the text by speaker (if diarization is enabled)

gemini-code-assist

Hello @LomikXP, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request focuses on enhancing the application's file format handling, specifically for the 'all' format and improving speaker diarization in TXT output. The primary goal is to extend the functionality of the application to support outputting all available subtitle formats simultaneously and to better handle speaker-separated text in TXT files when diarization is enabled. This involves modifications to the UI, translation modules, subtitle management, and the base transcription pipeline.

Highlights

File Format 'all': The pull request introduces the 'all' option for file format selection, allowing users to generate all supported subtitle formats (SRT, WebVTT, TXT, LRC, JSON, TSV) in a single operation.
Speaker Diarization in TXT: The pull request enhances the TXT file format output to include speaker diarization, splitting the text by speaker if diarization is enabled. It uses regex to detect speaker labels in the text and formats the output accordingly.
Translation Module Adaptations: The translation modules (both generic and DeepL) have been adapted to handle the 'all' file format option, ensuring that the correct output paths are returned when multiple files are generated.
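The speaker-splitting behaviour described for TXT output could look roughly like the sketch below. It is not the PR's code: the label format `SPEAKER_00|text`, the regex, and the function name `format_txt_by_speaker` are all assumptions made for illustration, since the summary only says that a regex detects speaker labels.

```python
import re

# Assumed label format: "SPEAKER_00|some text". The actual format produced by
# the project's diarizer is not shown in this PR summary.
SPEAKER_RE = re.compile(r"^\s*(SPEAKER_\d+)\|\s*(.*)$")


def format_txt_by_speaker(segments):
    """Merge consecutive segments from the same speaker into one paragraph."""
    blocks = []  # list of (speaker or None, [texts])
    for text in segments:
        match = SPEAKER_RE.match(text)
        if match:
            speaker, content = match.group(1), match.group(2).strip()
        else:
            # No label found: keep the text, but without a speaker prefix.
            speaker, content = None, text.strip()
        if not content:
            continue
        if blocks and blocks[-1][0] == speaker:
            blocks[-1][1].append(content)
        else:
            blocks.append((speaker, [content]))

    paragraphs = []
    for speaker, texts in blocks:
        joined = " ".join(texts)
        paragraphs.append(f"{speaker}: {joined}" if speaker else joined)
    return "\n\n".join(paragraphs)
```

Unlabelled segments fall through unchanged, so the same function would also work when diarization is disabled.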

Changelog

app.py
- Added 'all' as a choice in the file format dropdown menu.
modules/translation/deepl_api.py
- Modified the translate_deepl function to handle multiple subtitle files when the 'all' format is selected.
- The files_info dictionary now stores a list of dictionaries, each containing subtitle and path information for a specific format.
- The output_file_paths list comprehension was updated to iterate through all items in files_info.
modules/translation/translation_base.py
- Modified the translate_file function to handle multiple subtitle files when the 'all' format is selected.
- The files_info dictionary now stores a list of dictionaries, each containing subtitle and path information for a specific format.
- The output_file_paths list comprehension was updated to iterate through all items in files_info.
modules/utils/subtitle_manager.py
- Implemented speaker diarization in the WriteTXT class, formatting the output to separate text by speaker.
- Modified the get_writer function to handle the 'all' output format, creating a writer for each available format.
- Modified the generate_file function to generate all subtitle formats when 'all' is selected, returning a list of file paths.
modules/whisper/base_transcription_pipeline.py
- Modified the transcribe_file function to handle multiple subtitle files when the 'all' format is selected.
- The files_info dictionary now stores a list of dictionaries, each containing subtitle, time_for_task and path information for a specific format.
- The result_file_path list comprehension was updated to iterate through all items in files_info.
- The return value of generate_file is now indexed at zero when calling it from transcribe_mic and transcribe_youtube.
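To make the `files_info` change concrete, here is a minimal sketch of a dictionary that maps each input file to a list of per-format entries, plus the flattening list comprehension. `generate_file_stub`, `collect_files_info`, `output_paths`, and the extension mapping are hypothetical stand-ins; the real `generate_file` signature is not shown in this changelog.

```python
import os

# Format names come from the PR summary; the file extensions are assumptions.
EXTENSIONS = {
    "srt": "srt", "webvtt": "vtt", "txt": "txt",
    "lrc": "lrc", "json": "json", "tsv": "tsv",
}


def generate_file_stub(output_format, output_dir, base_name):
    """Hypothetical stand-in for generate_file: returns a list of (content, path)."""
    fmt = output_format.lower()
    formats = list(EXTENSIONS) if fmt == "all" else [fmt]
    return [
        (f"<{f} content>", os.path.join(output_dir, f"{base_name}.{EXTENSIONS[f]}"))
        for f in formats
    ]


def collect_files_info(file_names, output_format, output_dir):
    """Map each input file to a list of {'subtitle', 'path'} dicts, one per format."""
    files_info = {}
    for name in file_names:
        base = os.path.splitext(os.path.basename(name))[0]
        files_info[name] = [
            {"subtitle": subtitle, "path": path}
            for subtitle, path in generate_file_stub(output_format, output_dir, base)
        ]
    return files_info


def output_paths(files_info):
    """Flatten every per-format entry of every input file into one path list."""
    return [item["path"] for items in files_info.values() for item in items]
```

A caller that only ever produces one format, as the changelog describes for `transcribe_mic` and `transcribe_youtube`, would take the first element, for example `generate_file_stub("srt", "out", "clip")[0]`.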

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Formats align,
'All' files in their prime,
Diarization speaks.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

The pull request introduces the 'all' file format option, allowing users to generate all available subtitle formats simultaneously. Additionally, it extends the 'txt' file format to support splitting by speaker when diarization is enabled. The changes seem well-structured, but there are a few areas that could be improved for clarity and efficiency.

Summary of Findings

Inconsistent return types: The generate_file function now returns a list of tuples, but in some cases, it only returns a single-element list. This inconsistency could lead to unexpected behavior and should be addressed.
Duplicated code: The writers dictionary is defined twice in modules/utils/subtitle_manager.py. It should be defined only once to avoid redundancy.
Potential performance issue: When the output format is 'all', the code iterates through all writers and generates files. However, the write_all function is not actually used. This could lead to unnecessary file generation.
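One possible way to address the first two findings is sketched below: the writer registry is defined once at module level, and the generating function always returns a list, whether one format or 'all' is requested. This is only an illustration under assumptions; `render_txt`, `render_tsv`, `WRITERS`, `get_writers`, and `generate` are hypothetical names and do not reflect the actual classes in modules/utils/subtitle_manager.py.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for the writer classes; each returns rendered text.
def render_txt(text: str) -> str:
    return text + "\n"


def render_tsv(text: str) -> str:
    return "start\tend\ttext\n0\t1000\t" + text + "\n"


# Defined once at module level so every function shares the same registry.
WRITERS: Dict[str, Callable[[str], str]] = {
    "txt": render_txt,
    "tsv": render_tsv,
}


def get_writers(output_format: str) -> Dict[str, Callable[[str], str]]:
    """Return the selected writers; 'all' selects every registered writer."""
    fmt = output_format.strip().lower()
    if fmt == "all":
        return dict(WRITERS)
    if fmt not in WRITERS:
        raise ValueError(f"Unsupported format: {output_format!r}")
    return {fmt: WRITERS[fmt]}


def generate(text: str, output_format: str) -> List[Tuple[str, str]]:
    """Always return a list of (format, content), even for a single format."""
    return [
        (fmt, render(text))
        for fmt, render in get_writers(output_format).items()
    ]
```

With a uniform list return, callers can iterate without checking which format was requested, and single-format callers can take element zero explicitly.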

Merge Readiness

The pull request is not quite ready for merging. The inconsistent return types and duplicated code should be addressed before merging. Additionally, the potential performance issue with the 'all' output format should be investigated and resolved. I am unable to approve this pull request, and recommend that others review and approve this code before merging.

Lomik_XP and others added 2 commits April 30, 2025 12:48

feat: Work on 'all' file format

292a224

Merge branch 'jhj0517:master' into feat/work-on-all-file-format

b40c59b

gemini-code-assist bot reviewed May 5, 2025

gemini-code-assist bot suggested changes May 5, 2025
