Welcome to PDF-Conv, a Streamlit-based application that allows you to upload multiple PDF files and interact with them through a conversational interface.
- Upload multiple PDF documents
- Extract text from PDFs and split it into manageable chunks 🧩
- Create a vector store from the text chunks using Google PaLM embeddings
- Interact with PDF content through a conversational retrieval chain 🤖
- Beautiful and user-friendly interface 🎨
Before you begin, ensure you have met the following requirements:
- Python 3.8 or later
- Streamlit
- PyPDF2
- LangChain
- Google Generative AI
- FAISS
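The requirements above can be checked before installation with a short script. This is a minimal sketch, not part of the project: the helper names `python_is_supported` and `missing_modules` are hypothetical, and the import names in `REQUIRED_MODULES` are assumptions about how these packages are usually imported, not names taken from this project's requirements.txt.

```python
import importlib.util
import sys

# Assumed import names for the listed requirements (not confirmed by the repo).
REQUIRED_MODULES = ["streamlit", "PyPDF2", "langchain", "google.generativeai", "faiss"]


def python_is_supported(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def is_installed(name):
    """Return True if a module with this name can be located, without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "google") raises instead of returning None.
        return False


def missing_modules(names):
    """Return the names that cannot be located, in their original order."""
    return [name for name in names if not is_installed(name)]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.8 or later is required")
    for name in missing_modules(REQUIRED_MODULES):
        print(f"Missing module: {name}")
```

If the script prints nothing, the interpreter version is sufficient and every assumed module name was found.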
- Clone the repo
  git clone https://github.com/silentkiller18/pdf-conv.git
- Navigate to the project directory
  cd pdf-conv
- Install required packages
  pip install -r requirements.txt
- Set up your Google API key
  os.environ['GOOGLE_API_KEY'] = 'YOUR_GOOGLE_API_KEY'
- Run the Streamlit application
  streamlit run app.py
- Open your web browser and go to
  http://localhost:8501
- Upload your PDF files and start interacting with the content!
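For the API key step, reading the key from the environment and failing early when it is absent can save a confusing error later. The following is a minimal sketch under that assumption; `get_google_api_key` is a hypothetical helper, not a function from this project or from the Google client library.

```python
import os


def get_google_api_key(env=None):
    """Return the GOOGLE_API_KEY value, or raise if it is unset or still the placeholder."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key or key == "YOUR_GOOGLE_API_KEY":
        raise RuntimeError("GOOGLE_API_KEY is not set; export it before starting the app")
    return key
```

Calling `get_google_api_key()` at startup would surface a missing key before any PDF is processed. Passing a plain dictionary as `env` makes the helper easy to test.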
- Launch an EC2 Instance
  - Go to the EC2 Dashboard and click "Launch Instance".
  - Choose an Amazon Machine Image (AMI), such as the latest Ubuntu Server.
  - Select an instance type, like t2.micro for free tier.
  - Configure instance details, add storage, and configure security groups to allow HTTP (port 80) and HTTPS (port 443).
  - Review and launch the instance.
- Connect to Your Instance
  - Use SSH to connect to your instance.
    ssh -i /path/to/your-key-pair.pem ubuntu@your-ec2-public-dns
- Install Dependencies
  - Update package lists and install Python, pip, and git.
    sudo apt-get update
    sudo apt-get install python3-pip git
- Clone the Repository and Install Python Packages
  git clone https://github.com/silentkiller18/pdf-conv.git
  cd pdf-conv
  pip3 install -r requirements.txt
- Set Up Environment Variable
  - Add your Google API key to the environment.
    echo "export GOOGLE_API_KEY='YOUR_GOOGLE_API_KEY'" >> ~/.bashrc
    source ~/.bashrc
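- Run the Streamlit Application
  streamlit run app.py --server.port 80
  Note: binding to port 80 usually requires root privileges on Linux. If the command fails with a permission error, use a port above 1024 (such as Streamlit's default, 8501) and allow that port in the security group.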
- Access Your Application
  - Open your web browser and navigate to
    http://your-ec2-public-dns
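If the page does not load after deployment, it can help to check whether the port is reachable at all before debugging the application. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and the host shown in the usage comment is the same placeholder used in the steps above.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example (placeholder host): is_port_open("your-ec2-public-dns", 80)
```

A `False` result usually points to a security group rule, a stopped instance, or the app not listening on that port, rather than to a problem inside the app itself.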
- Upload PDFs: Users can upload multiple PDF files through the sidebar.
- Text Extraction: Extracts text from each PDF file.
- Text Splitting: Splits the extracted text into chunks for better processing.
- Vector Store Creation: Creates a vector store using Google PaLM embeddings.
- Conversational Interface: Users can ask questions about the PDF content and receive responses.
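The splitting and retrieval steps above can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the app uses Google PaLM embeddings and a FAISS vector store, whereas this example substitutes a toy bag-of-words vector and a linear cosine-similarity search so that it runs anywhere. The function names, chunk size, and overlap are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=1000, overlap=200):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in one of them.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question, best first."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, embed(chunk)), reverse=True)
    return ranked[:k]
```

In a retrieval chain, the chunks returned by a step like `top_chunks` would be passed to the language model together with the user's question, so the answer is grounded in the uploaded PDFs.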
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
Distributed under the MIT License. See LICENSE for more information.
Project Link: https://github.com/silentkiller18/pdf-conv
⭐️ Don't forget to give the project a star if you found it useful! ⭐️
