LFM2-VL-1.6B Vision Language Model

This folder contains scripts and tools for working with the LFM2-VL-1.6B vision-language model from Liquid AI. The model processes text and images at variable resolutions and is optimized for low-latency, edge AI applications.

🚀 Features

  • Fast: up to 2× faster inference on GPUs than comparable VLMs
  • Flexible architecture with user-tunable speed-quality tradeoffs
  • Native resolution processing up to 512×512 pixels
  • Lightweight: Only 1.6B parameters
  • Multimodal: Processes both text and images

📁 Files

  • save_lfm2_vl_model.py - Downloads and saves the model locally
  • instagram_caption_generator.py - Generates Instagram-like captions for images
  • test_setup.py - Tests the setup and dependencies
  • requirements.txt - Required Python packages
  • run_preprocessing.sh - Launcher for Instagram dataset preprocessing
  • preprocessing/ - Folder containing all preprocessing scripts
  • README.md - This file

🛠️ Setup

1. Install Dependencies

pip install -r requirements.txt

2. Download the Model

python save_lfm2_vl_model.py

This downloads the model (~1.6 GB) to ./lfm2_vl_1_6b_model/.

3. Test the Setup

python test_setup.py
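A setup check of this kind usually boils down to verifying imports and the model directory. The sketch below is a hedged illustration of what test_setup.py might do; the actual script's checks may differ, and the package list here is an assumption based on the dependencies mentioned later in this README.

```python
# Hypothetical setup check; test_setup.py may perform different checks.
import importlib.util
from pathlib import Path

REQUIRED_PACKAGES = ["torch", "transformers", "PIL"]  # assumed requirements
MODEL_DIR = Path("./lfm2_vl_1_6b_model")              # default model path

def missing_packages(packages):
    """Return the subset of packages that cannot be imported."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

def check_setup():
    """Collect human-readable problems; an empty list means the setup looks OK."""
    problems = missing_packages(REQUIRED_PACKAGES)
    if not MODEL_DIR.exists():
        problems.append(f"model not found at {MODEL_DIR} (run save_lfm2_vl_model.py)")
    return problems
```

If `check_setup()` returns an empty list, both the dependencies and the downloaded model are in place.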

📸 Usage

Preprocess Instagram Dataset for Training

# Quick preprocessing (recommended)
./run_preprocessing.sh InstaDataset.zip

# This will create a training-ready dataset in ./processed_dataset/instagram_dataset/
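The preprocessing step presumably unpacks the zip and pairs each image with its caption into a training-ready layout. A minimal sketch of that idea, assuming captions sit in `.txt` files next to same-named `.jpg` images (the actual scripts in preprocessing/ may use a different layout and output format):

```python
# Hypothetical sketch of dataset preprocessing; the layout assumed here
# (img.jpg alongside img.txt) is illustrative, not taken from the scripts.
import json
import zipfile
from pathlib import Path

def build_dataset(zip_path, out_dir):
    """Extract a dataset zip and emit a metadata.jsonl pairing images with captions."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out)
    records = []
    for txt in sorted(out.rglob("*.txt")):
        img = txt.with_suffix(".jpg")
        if img.exists():
            records.append({"image": str(img.relative_to(out)),
                            "caption": txt.read_text().strip()})
    # One JSON record per line, a common format for image-caption training sets.
    (out / "metadata.jsonl").write_text("\n".join(json.dumps(r) for r in records))
    return records
```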

Train the Model

# Start training on the processed dataset
python3 train_lfm2_instagram_trainer.py \
    --data-dir ./processed_dataset/instagram_dataset \
    --output-dir ./trained_model \
    --num-epochs 5 \
    --batch-size 1 \
    --learning-rate 5e-5

# Training will create checkpoints and a final model in ./trained_model/
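The flags in the training command map naturally onto an argparse interface. The parser below is a hedged sketch of how the script presumably reads them; the defaults shown are the values from the example command above, not necessarily the script's actual defaults.

```python
# Hypothetical CLI parser mirroring the training flags shown above;
# train_lfm2_instagram_trainer.py may define different defaults.
import argparse

def build_parser():
    p = argparse.ArgumentParser(description="Fine-tune LFM2-VL on an Instagram dataset")
    p.add_argument("--data-dir", required=True, help="processed dataset directory")
    p.add_argument("--output-dir", default="./trained_model")
    p.add_argument("--num-epochs", type=int, default=5)
    p.add_argument("--batch-size", type=int, default=1)
    p.add_argument("--learning-rate", type=float, default=5e-5)
    return p
```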

Generate Instagram Captions

# Basic usage with the provided image
python instagram_caption_generator.py --image ../img1.jpg

# Generate multiple captions with different styles
python instagram_caption_generator.py --image ../img1.jpg --style creative --num-captions 5

# Use a different output file
python instagram_caption_generator.py --image ../img1.jpg --output my_captions.txt

Available Styles

  • instagram - Trendy, relatable captions with hashtags
  • professional - Business-appropriate descriptions
  • casual - Friendly, conversational tone
  • creative - Artistic, mood-capturing captions
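One plausible way the generator turns these styles into prompts is a simple lookup table. The strings below are illustrative assumptions based on the style descriptions above, not the script's actual prompts:

```python
# Hypothetical style-to-prompt mapping; instagram_caption_generator.py
# may word its prompts differently.
STYLE_PROMPTS = {
    "instagram":    "Write a trendy, relatable Instagram caption with hashtags for this image.",
    "professional": "Write a concise, business-appropriate description of this image.",
    "casual":       "Describe this image in a friendly, conversational tone.",
    "creative":     "Write an artistic caption that captures the mood of this image.",
}

def build_prompt(style="instagram"):
    """Return the prompt for a style, falling back to the default 'instagram' style."""
    return STYLE_PROMPTS.get(style, STYLE_PROMPTS["instagram"])
```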

Command Line Options

python instagram_caption_generator.py --help

Options:

  • --image, -i - Path to image file (required)
  • --style, -s - Caption style (default: instagram)
  • --num-captions, -n - Number of captions to generate (default: 3)
  • --output, -o - Output file for captions (default: generated_captions.txt)
  • --model-path, -m - Path to local model (default: ./lfm2_vl_1_6b_model)

🔧 Technical Details

Model Architecture

  • Language Model: LFM2-1.2B backbone
  • Vision Encoder: SigLIP2 NaFlex shape-optimized (400M parameters)
  • Hybrid Backbone: Combines convolution and attention layers
  • Context: 32,768 text tokens
  • Image Tokens: Dynamic, user-tunable
  • Precision: bfloat16
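Multimodal input to a model like this is typically supplied through the processor's chat template as interleaved image and text parts. The sketch below shows the common Hugging Face message structure; the exact schema is defined by the model's processor, so treat this as an assumption to verify against the processor's documentation.

```python
# Sketch of the common Hugging Face multimodal chat message format;
# the processor's apply_chat_template would turn this into model inputs.
def make_conversation(image_path, prompt):
    """Build a single-turn user message with one image part and one text part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},  # image part
                {"type": "text", "text": prompt},        # text part
            ],
        }
    ]
```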

Performance

The model achieves competitive performance on various benchmarks:

  • RealWorldQA: 65.23
  • MM-IFEval: 37.66
  • InfoVQA: 58.68
  • OCRBench: 742
  • MMStar: 49.53

Memory Requirements

  • GPU Memory: ~3-4GB for inference
  • Model Size: ~1.6GB on disk
  • RAM: ~2-3GB additional

💡 Example Output

For the provided img1.jpg image, the model might generate captions like:

📸 Caption 1:
Beautiful sunset vibes! 🌅 Nature never fails to amaze me. 
#sunset #nature #photography #beautiful #peaceful

📸 Caption 2:
When the sky paints itself in golden hour magic ✨ 
#goldenhour #sky #photography #nature #beauty

📸 Caption 3:
Sunset serenity - the perfect way to end the day 🌅
#serenity #sunset #peace #nature #photography
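The numbered caption blocks above are easy to reproduce with a small formatter. A sketch, assuming the generator writes its output in this shape (the helper name is hypothetical):

```python
# Hypothetical formatter producing the numbered caption blocks shown above.
def format_captions(captions):
    """Join captions into '📸 Caption N:' blocks separated by blank lines."""
    blocks = []
    for i, caption in enumerate(captions, start=1):
        blocks.append(f"📸 Caption {i}:\n{caption}")
    return "\n\n".join(blocks)
```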

🚨 Troubleshooting

Common Issues

  1. CUDA Out of Memory

    • Reduce max_tokens in the generation parameters
    • Use CPU if GPU memory is insufficient
  2. Model Not Found

    • Ensure you've run save_lfm2_vl_model.py first
    • Check the model path in the script
  3. Dependencies Missing

    • Install requirements: pip install -r requirements.txt
    • Ensure you have Python 3.8+ and PyTorch 2.0+

Performance Tips

  • Use device_map="auto" for automatic device placement
  • Set torch_dtype="bfloat16" for memory efficiency
  • Adjust max_image_tokens for speed/quality tradeoff
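Taken together, these tips amount to a handful of keyword arguments at load time. The fragment below is a sketch only: whether each parameter is accepted depends on your installed transformers version, and `max_image_tokens` is assumed to be a processor-side setting with an illustrative value.

```python
# Sketch only: kwargs collecting the tips above. Verify against your
# installed transformers version before relying on them.
model_kwargs = {
    "device_map": "auto",       # automatic device placement
    "torch_dtype": "bfloat16",  # memory-efficient precision
}
processor_kwargs = {
    "max_image_tokens": 256,    # assumed speed/quality knob; value illustrative
}
# Usage (not executed here):
# model = AutoModelForImageTextToText.from_pretrained(
#     "./lfm2_vl_1_6b_model", **model_kwargs)
```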

📄 License

This model is released under the LFM Open License v1.0. Please review the license terms on the Hugging Face model page before use.

🤝 Support

For issues or questions:

  1. Check the troubleshooting section above
  2. Review the model documentation on Hugging Face
  3. Test with python test_setup.py to verify your setup
