Video2Article

About

Video2Article demonstrates the use of a Large Multimodal Model (LMM) to generate a full-length article from a video tutorial.

Using the vision capabilities of GPT-4o, you can turn a video tutorial into a technical article, complete with relevant code snippets and screenshots extracted from the video, generated without manual intervention.

The following diagram gives a high-level overview of Video2Article's inner workings:

For implementation specifics, see my detailed write-up.

Note

While Video2Article works well to a certain extent, its output still requires manual proofreading and editing to fix inaccuracies and inconsistencies in content and formatting.

Getting Started

Setting Up Environment

This project uses uv for dependency management. To install uv, use one of the following commands (see the uv installation guide for details):

# On macOS and Linux.
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows.
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# With pip.
pip install uv

# With pipx.
pipx install uv

# With Homebrew.
brew install uv

# With Pacman.
pacman -S uv

To set up the project and install the required dependencies:

# clone the repo along with its submodules, then enter the project directory
git clone --recurse-submodules https://github.com/wtlow003/video2article.git
cd video2article

# create a virtual env
uv venv

# install dependencies
uv pip install -r requirements.txt  # Install from a requirements.txt file.

Usage

The following options are available for running the article generation workflow:

source .venv/bin/activate
python3 main.py --help

>>> usage: main.py [-h] [--api-key API_KEY] [--transcript-path TRANSCRIPT_PATH] [--segments-path SEGMENTS_PATH] [--url URL]
               [--semantic-chunking]

Convert video to article.

optional arguments:
  -h, --help            show this help message and exit
  --transcript-path TRANSCRIPT_PATH
                        [OPTIONAL] Path to video transcript (in SRT) format.
  --segments-path SEGMENTS_PATH
                        [OPTIONAL] Path to transcript segments (in JSON) format.
  --url URL             Video url.
  --semantic-chunking   Enable semantic chunking of images with Semantic Router.
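As a rough illustration of how these flags fit together, the sketch below rebuilds an equivalent command-line interface with Python's standard `argparse` module. It is a reconstruction from the help text above, not the actual contents of `main.py`; the function name `build_parser` is hypothetical, and the help string for `--api-key` is an assumption, since that flag appears only in the usage line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags listed in the help output (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Convert video to article."
    )
    # Shown in the usage line only; this help string is an assumption.
    parser.add_argument("--api-key", help="API key (assumed description).")
    parser.add_argument(
        "--transcript-path",
        help="[OPTIONAL] Path to video transcript (in SRT) format.",
    )
    parser.add_argument(
        "--segments-path",
        help="[OPTIONAL] Path to transcript segments (in JSON) format.",
    )
    parser.add_argument("--url", help="Video url.")
    # A boolean switch: absent means False, present means True.
    parser.add_argument(
        "--semantic-chunking",
        action="store_true",
        help="Enable semantic chunking of images with Semantic Router.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["--url", "https://www.youtube.com/watch?v=TCH_1BHY58I", "--semantic-chunking"]
)
print(args.url, args.semantic_chunking, args.transcript_path)
```

Options that are not passed default to `None`, and `--semantic-chunking` defaults to `False`, so a caller can check which inputs were actually supplied.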

For example, to generate an article directly from a YouTube URL:

# api keys for openai + langsmith tracing
source .env

python3 main.py --url "https://www.youtube.com/watch?v=TCH_1BHY58I" --semantic-chunking
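If you already have a transcript, the `--transcript-path` option accepts a file in SRT format. For readers unfamiliar with that format, the following is a minimal sketch of how an SRT transcript can be read into timed segments. The function `parse_srt` and the `start`/`end`/`text` dictionary layout are hypothetical and chosen for illustration only; they are not taken from Video2Article and do not describe the JSON layout expected by `--segments-path`.

```python
import re
from datetime import timedelta
from typing import Dict, List

# An SRT time range looks like "00:00:01,500 --> 00:00:04,000".
TIME_RANGE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    """Convert the four timestamp fields to seconds."""
    return timedelta(
        hours=int(hours),
        minutes=int(minutes),
        seconds=int(seconds),
        milliseconds=int(millis),
    ).total_seconds()


def parse_srt(text: str) -> List[Dict[str, object]]:
    """Parse SRT text into a list of {start, end, text} dicts (illustrative layout)."""
    segments = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue
        # lines[0] is the cue index, lines[1] the time range, the rest is the caption.
        match = TIME_RANGE.match(lines[1].strip())
        if match is None:
            continue
        fields = match.groups()
        segments.append(
            {
                "start": _to_seconds(*fields[:4]),
                "end": _to_seconds(*fields[4:]),
                "text": " ".join(line.strip() for line in lines[2:]),
            }
        )
    return segments


SAMPLE = """1
00:00:00,000 --> 00:00:01,500
Hello and welcome.

2
00:00:01,500 --> 00:00:04,000
In this tutorial we build
a small web server.
"""

for segment in parse_srt(SAMPLE):
    print(segment)
```

Multi-line captions are joined with a single space, and blocks without a valid time range are skipped rather than raising an error.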
