This application analyzes research papers using a fine-tuned Llama 3.2 model, extracting hypotheses and identifying limitations. It was developed for the Natural Language Processing course (UE22CS342AB).
- PDF Processing: Upload and extract text from research papers in PDF format
- RAG Pipeline: Retrieval-Augmented Generation for context-aware analysis
- Multi-Agent System:
- Agent 1: Generates null and alternate hypotheses from the abstract
- Agent 2: Identifies limitations and weaknesses in the paper's research methodology
- User-friendly Streamlit Interface: Easy-to-use GUI for interacting with the system
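The retrieval step of the RAG pipeline can be sketched in plain Python. This is an illustrative sketch, not the app's actual implementation: it splits the paper into overlapping word-based chunks and ranks them by keyword overlap with a query, whereas a production pipeline would typically rank by embedding similarity. The function names and chunk sizes are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-based chunks."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunk = words[start:start + chunk_size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + chunk_size >= len(words):
            break
    return chunks

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by word overlap with the query and return the top k."""
    query_words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(query_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]
```

The retrieved chunks are then passed to the agents as context, so each agent sees only the parts of the paper most relevant to its task.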
- Clone this repository
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```
- Set up your Hugging Face token:
  - Create a `.streamlit/secrets.toml` file with the content:

    ```
    HF_TOKEN = "your_huggingface_token_here"
    ```
  - Or set it as an environment variable:

    ```
    export HF_TOKEN="your_huggingface_token_here"
    ```
Note: The fine-tuned model is available at: https://huggingface.co/NLP-team-6/llama-3-hypothesis-qlora
- Run the Streamlit app:

  ```
  streamlit run app.py
  ```
- Access the web interface at http://localhost:8501
- Load the model using the sidebar button
- Upload a research paper PDF
- Process the PDF and analyze it using the provided buttons
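Behind the analysis buttons, each agent combines its instructions with text from the paper. The templates below are illustrative examples of how that could look, not the prompts actually used to fine-tune or query the model.

```python
# Hypothetical prompt templates for the two agents.
HYPOTHESIS_PROMPT = (
    "You are Agent 1. Given the abstract below, state a null hypothesis (H0) "
    "and an alternate hypothesis (H1) for the study.\n\nAbstract:\n{abstract}"
)

LIMITATIONS_PROMPT = (
    "You are Agent 2. Using the excerpts below, identify limitations and "
    "weaknesses in the research methodology.\n\nExcerpts:\n{context}"
)

def build_prompt(template: str, **fields: str) -> str:
    """Fill an agent template with text extracted from the paper."""
    return template.format(**fields)
```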
- At least 8GB of GPU VRAM (for the 4-bit quantized model)
- 16GB+ RAM recommended
- Hugging Face account with access to Llama 3.2 models
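The 8GB VRAM figure corresponds to loading the model with 4-bit quantization. A configuration sketch using `transformers` and `bitsandbytes` (assumed dependencies) is shown below; the app's actual loading code may differ, and since the repository hosts a QLoRA adapter, `peft` must be installed for `from_pretrained` to resolve it on top of the base model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "NLP-team-6/llama-3-hypothesis-qlora"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4 bits
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # run compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                     # place layers on the available GPU(s)
)
```

This requires a CUDA GPU; the quantization settings shown are common QLoRA defaults, not values confirmed by this repository.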