MoneyMatic is an AI-powered web application that extracts text from scanned financial documents and classifies them using OCR, NLP, and deep learning. It also includes a secure user authentication system built with Node.js and MongoDB. Documents are classified into categories such as:
- Balance Sheets
- Cash Flow Statements
- Income Statements
- Notes
- Others
- OCR Processing: Extract text from images of financial documents (e.g., .jpg, .jpeg, .png).
- Data Cleaning: Preprocess extracted text to remove noise and standardize the format.
- Classification: Classify financial documents into categories using a trained machine learning model.
- Web Interface: Upload documents via a user-friendly dashboard.
- Secure: User accounts are protected by the authentication system built with Node.js and MongoDB.
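The OCR and data-cleaning features above can be sketched roughly as follows. This is a minimal illustration, not the code in `backend/utils/`: the function names `extract_text` and `clean_text` are hypothetical, the use of the `pytesseract` and `Pillow` packages is an assumption, and the specific cleaning rules (dropping control characters, collapsing whitespace, lowercasing) are only one plausible reading of "remove noise and standardize the format".

```python
import re

# Control characters other than tab and newline; Tesseract output often
# contains a form feed (\x0c) at page boundaries.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def extract_text(image_path: str) -> str:
    """Run Tesseract OCR on one image and return the raw text.

    Assumes the third-party packages pytesseract and Pillow are installed
    and that the Tesseract binary is on PATH. They are imported lazily so
    that clean_text() can be used without them.
    """
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def clean_text(raw: str) -> str:
    """Normalize OCR output (hypothetical rules, see the note above)."""
    text = _CONTROL_CHARS.sub(" ", raw)   # replace control characters with spaces
    text = re.sub(r"\s+", " ", text)      # collapse runs of whitespace, including newlines
    return text.strip().lower()           # trim the ends and lowercase
```

For example, `clean_text(extract_text("statement.png"))` would return a single lowercase line of text ready to pass to a classifier; `statement.png` is a placeholder file name.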
MoneyMatic/
├── backend/
│ ├── app.py
│ ├── utils/
│ │ ├── extract_and_prepare.py
│ │ └── other_utils.py
│ ├── model/
│ │ └── financial_text_classifier.joblib
│ └── uploads/
│
├── frontend/
│ ├── index.html
│ ├── login/
│ │ ├── dashboard.html
│ │ └── signup.html
│ ├── CSS/
│ │ └── styles.css
│ └── img/
│
├── moneymatic-backend/
│ ├── server.js
│ ├── config/
│ │ └── db.js
│ ├── routes/
│ │ └── auth.js
│ └── middleware/
│ └── auth.js
│
└── README.md
- Python: Version 3.8 or higher.
- Node.js: Version 14 or higher.
- Tesseract OCR: Install from Tesseract OCR GitHub.
- MongoDB: For user authentication and data storage.
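Before installing, it can help to confirm which of these prerequisites are already present. The script below is a small sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, it only checks that the executables are on `PATH`, and it does not verify the Node.js version. A missing `mongod` is not necessarily a problem if the project is configured to use a remote MongoDB instance.

```python
import shutil
import sys


def check_prerequisites() -> dict:
    """Report which prerequisites appear to be available on this machine."""
    return {
        # The interpreter running this script must be 3.8 or newer.
        "python>=3.8": sys.version_info >= (3, 8),
        # shutil.which returns None when the executable is not on PATH.
        "node": shutil.which("node") is not None,
        "tesseract": shutil.which("tesseract") is not None,
        "mongod": shutil.which("mongod") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```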
- Clone the repository:
    git clone https://github.com/mushxoxo/MoneyMatic.git
    cd MoneyMatic
- Create and activate a virtual environment:
    python -m venv venv1
    source venv1/bin/activate # On Windows: venv1\Scripts\activate
- Install the required packages:
    pip install -r requirements.txt
- Install Tesseract OCR:
  - Ubuntu:
      sudo apt update
      sudo apt install tesseract-ocr
  - macOS (using Homebrew):
      brew install tesseract
  - Windows: Download and install from Tesseract OCR GitHub.
- Navigate to the backend directory:
    cd backend
- Start the Flask application:
    python app.py
- In a second terminal, from the repository root, navigate to the moneymatic-backend directory:
    cd moneymatic-backend
- Start the Node.js server:
    node server.js
- Access the web interface:
    Open your browser and go to http://localhost:5000
- Log in or sign up using an email address.
- Upload and classify documents:
  - Click the upload button to select a .jpg or .png file.
  - View the predicted category, confidence score, and extracted text.
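The result shown after an upload pairs a predicted category with a confidence score. One way such a result could be derived is sketched below. It assumes that `financial_text_classifier.joblib` holds a scikit-learn-style pipeline that accepts raw text and exposes `predict_proba` and `classes_`; the source does not confirm this, and the function names `load_model` and `classify` are hypothetical.

```python
from typing import Any, Dict

# Path taken from the project tree; adjust if the app is run from another directory.
MODEL_PATH = "backend/model/financial_text_classifier.joblib"


def load_model(path: str = MODEL_PATH) -> Any:
    """Load the saved classifier (assumes the joblib package is installed)."""
    import joblib

    return joblib.load(path)


def classify(model: Any, text: str) -> Dict[str, Any]:
    """Return the most likely category and its probability for one document.

    Assumes the model follows the scikit-learn convention: predict_proba
    takes a list of inputs and returns one row of probabilities per input,
    ordered like model.classes_.
    """
    probabilities = model.predict_proba([text])[0]
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return {
        "category": str(model.classes_[best]),
        "confidence": float(probabilities[best]),
        "text": text,
    }
```

Returning the extracted text alongside the prediction mirrors the three items the dashboard displays.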
If you wish to retrain the model:
- Prepare your dataset:
  - Organize images into subdirectories named after their respective categories.
- Run all the scripts in the utils directory:
  Ensure that the script paths and parameters are correctly set according to your dataset.
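The dataset layout described above (one subdirectory per category) can be read into labeled samples with a short helper. This is an illustrative sketch rather than one of the scripts in `utils`: the function name `list_labeled_images` is hypothetical, and the accepted file extensions are taken from the image formats listed under OCR Processing.

```python
from pathlib import Path
from typing import List, Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(dataset_root: str) -> List[Tuple[str, str]]:
    """Return (image_path, category) pairs, using each subdirectory name as the label."""
    samples = []
    for category_dir in sorted(Path(dataset_root).iterdir()):
        if not category_dir.is_dir():
            continue  # ignore stray files at the top level
        for image_path in sorted(category_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((str(image_path), category_dir.name))
    return samples
```

A quick count of samples per category from this list is a cheap way to spot an empty or misnamed folder before retraining.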
This project is licensed under the MIT License.