An ML-based advisory system that recommends suitable crops for Indian farmers by state and district, land size, and region-specific soil/climate. It shows the top 5 crops by advisory score (suitability, risk, and regional potential), with estimated production in kg, market price in ₹/kg, risk and disease information, and prevention measures. No profit figures are shown—advisory only. Built for final-year / academic use.
- Region-first flow: Select state, district, and land size (bigha). No manual soil/climate input—the system uses state-specific agro-climatic defaults so recommendations vary by region (e.g. Rajasthan vs Kerala vs Himachal Pradesh).
- Advisory-only output: Top 5 crops ranked by a balanced advisory score (suitability + regional potential − risk). No net profit, ROI, or profit charts.
- Indian units: Production and sale quantity in kg; prices in ₹/kg; land in bigha (with acres shown). No quintals or tons.
- Per-crop details: Suitability %, estimated production (kg), market price (₹/kg), estimated sale quantity (kg), risk score, disease/pest risks, prevention measures, and soil-based growing tips.
- Soil nutrient view: After analysis, a soil nutrient distribution (N, P, K) chart is shown as a crop-average reference for the selected region.
- Data used for analysis: Sidebar shows total records, number of states, and total crops used by the engine. Optional refresh via data.gov.in API.
- Dark theme UI; optional lighter theme in code.
Choosing the wrong crop for a given region and land leads to lower yield and wasted effort. This system acts as a decision-support tool: given state, district, and land size, it uses region-specific soil/climate profiles and an ML model to recommend the most suitable crops, with explainability, risk, and preventive advice—without showing direct profit to avoid misleading estimates.
## Data
- Crop recommendation dataset: N, P, K, temperature, humidity, ph, rainfall → crop label (e.g. Kaggle – Crop Recommendation Dataset).
- Adding more training data: Put any extra CSVs with the same columns (N, P, K, temperature, humidity, ph, rainfall, label) in `data/raw/`. The pipeline merges all compatible CSVs when you run `python run_pipeline.py`, so you can keep `Crop_Recommendation.csv` and add e.g. `crop_extra.csv`.
- Regional data (optional): `state_wise_yield.csv`, `market_prices.csv`, `cost_of_cultivation.csv`, `climate_vulnerability.csv` in `data/raw/` for state/district-aware yield, price, and risk. If absent, embedded national averages are used.
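The CSV-merging behaviour described above can be sketched as follows. This is a minimal illustration, not the project's actual loader; `load_raw_data` and `EXPECTED` are hypothetical names, and the real pipeline may apply additional validation.

```python
import glob
import os

import pandas as pd

# Columns a CSV must have to be merged into the training set.
EXPECTED = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall", "label"]


def load_raw_data(raw_dir: str = "data/raw") -> pd.DataFrame:
    """Merge every CSV in raw_dir that has the expected columns; skip the rest."""
    frames = []
    for path in glob.glob(os.path.join(raw_dir, "*.csv")):
        df = pd.read_csv(path)
        if set(EXPECTED).issubset(df.columns):  # compatible schema only
            frames.append(df[EXPECTED])
    if not frames:
        raise FileNotFoundError(f"No compatible CSVs found in {raw_dir}")
    return pd.concat(frames, ignore_index=True).drop_duplicates()
```

Incompatible files (such as the optional regional CSVs) are simply skipped, so they can live in the same directory without breaking training.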
## Region-specific inputs
- States are mapped to agro-climatic zones (arid NW, eastern humid, southern, west coast, central, Himalayan, western dry). Each zone has distinct default N, P, K, temperature, humidity, ph, rainfall (aligned with training data). State + district offsets ensure different regions get meaningfully different inputs so recommendations vary by location and land size.
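The zone-default idea can be sketched like this. The numbers and the district-offset rule below are purely illustrative assumptions (the project's calibrated values live in `src/zone_soil`); only three example states are shown.

```python
# Illustrative zone defaults; NOT the project's calibrated values.
ZONE_DEFAULTS = {
    "arid_nw":    {"N": 40, "P": 35, "K": 30, "temperature": 32.0, "humidity": 35.0, "ph": 7.8, "rainfall": 35.0},
    "west_coast": {"N": 85, "P": 45, "K": 45, "temperature": 27.0, "humidity": 85.0, "ph": 6.0, "rainfall": 220.0},
    "himalayan":  {"N": 60, "P": 50, "K": 40, "temperature": 18.0, "humidity": 60.0, "ph": 6.3, "rainfall": 110.0},
}
STATE_ZONE = {"Rajasthan": "arid_nw", "Kerala": "west_coast", "Himachal Pradesh": "himalayan"}


def region_inputs(state: str, district: str) -> dict:
    """Zone defaults for the state, nudged by a deterministic district offset."""
    base = dict(ZONE_DEFAULTS[STATE_ZONE[state]])
    # Small, repeatable offset so districts within a state differ slightly.
    offset = (sum(ord(c) for c in district) % 7) - 3
    base["N"] += offset
    base["rainfall"] += offset * 2
    return base
```

Because the offset is derived from the district name, the same state/district pair always yields the same model inputs, which keeps recommendations reproducible.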
## ML pipeline
- Preprocessing: label encoding, `StandardScaler` fitted on the training set only, stratified train–test split.
- Models: Decision Tree, Random Forest, KNN, SVM, Logistic Regression, tuned via GridSearchCV; the best model is selected by test F1-macro (e.g. SVM).
- Prediction: `predict_crop(N, P, K, temperature, humidity, ph, rainfall, land_size_bigha, state, district, scoring_mode="balanced")` returns the top 5 crops with suitability, production (kg), price (₹/kg), risk, disease risks, prevention measures, and explanations.
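The train-compare-select step can be sketched with scikit-learn as below. `train_best` is a hypothetical helper, and the candidate grids are deliberately tiny; the project tunes all five model families with larger grids.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC


def train_best(X, y):
    """Tune candidate models and keep the one with the best test F1-macro."""
    le = LabelEncoder()
    y_enc = le.fit_transform(y)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y_enc, test_size=0.2, stratify=y_enc, random_state=42
    )
    scaler = StandardScaler().fit(X_tr)  # fit on training data only
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    candidates = {
        "svm": (SVC(), {"C": [1, 10], "kernel": ["rbf"]}),
        "rf": (RandomForestClassifier(random_state=42), {"n_estimators": [100]}),
    }
    best_name, best_model, best_f1 = None, None, -1.0
    for name, (est, grid) in candidates.items():
        gs = GridSearchCV(est, grid, scoring="f1_macro", cv=3).fit(X_tr, y_tr)
        f1 = f1_score(y_te, gs.predict(X_te), average="macro")
        if f1 > best_f1:
            best_name, best_model, best_f1 = name, gs.best_estimator_, f1
    return best_name, best_model, scaler, le, best_f1
```

Fitting the scaler on the training split alone (and only transforming the test split) avoids data leakage into the held-out evaluation.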
## Explainability & soil health
- Feature importance (stored in `models/metadata.json`), explanation text, and rule-based soil health messages and crop-specific suggestions.
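Reading the stored importances back for display might look like this. The `feature_importance` key is an assumption about the metadata layout; check `models/metadata.json` after a pipeline run for the actual schema.

```python
import json


def top_features(metadata_path: str = "models/metadata.json", k: int = 3) -> list[str]:
    """Return the k most important feature names recorded by the pipeline.

    Assumes metadata.json contains a 'feature_importance' name->score mapping.
    """
    with open(metadata_path) as f:
        meta = json.load(f)
    imp = meta["feature_importance"]
    return sorted(imp, key=imp.get, reverse=True)[:k]
```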
| Component | Role |
|---|---|
| StandardScaler | Feature scaling |
| LabelEncoder | Crop labels |
| SVM / KNN / RF / etc. | Classification (best model saved) |
| GridSearchCV | Hyperparameter tuning |
| Stratified K-Fold | Cross-validation |
| Region data loader | State/district yield, price, cost, vulnerability |
| Balanced scoring | Suitability + regional potential − risk (no profit in UI) |
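The balanced-scoring row above (suitability + regional potential − risk) reduces to a weighted combination. The weights here are illustrative defaults, not the values shipped in the project.

```python
def advisory_score(
    suitability: float,
    regional_potential: float,
    risk: float,
    w_suit: float = 0.5,
    w_pot: float = 0.3,
    w_risk: float = 0.2,
) -> float:
    """Balanced advisory score: suitability + regional potential - risk.

    All inputs are assumed normalised to 0..1; weights are illustrative.
    """
    return w_suit * suitability + w_pot * regional_potential - w_risk * risk
```

Because risk enters with a negative weight, two crops with equal suitability are separated by their regional potential and risk rather than by any profit figure, matching the advisory-only design.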
```
SMART CROP REC/
├── data/raw/          # Crop_Recommendation.csv (or sample); optional: state_wise_yield, market_prices, cost_of_cultivation, climate_vulnerability
├── models/            # model.joblib, scaler.joblib, label_encoder.joblib, metadata.json (after run_pipeline.py)
├── reports/figures/   # EDA and evaluation plots
├── src/               # config, data_loader, zone_soil, preprocess, train, evaluate, predictor, region_data_loader, profit_engine, risk_engine, soil_health, explainer, market_price_fetcher
├── app.py             # Streamlit UI — Smart Agriculture Advisory System
├── run_pipeline.py    # One-command ML pipeline
├── tests/             # Crop variety tests (state, district, land size)
├── requirements.txt
├── README.md
├── REPORT.md
└── docs/ARCHITECTURE.md
```
```
cd "SMART CROP REC"
pip install -r requirements.txt
```
- Place `Crop_Recommendation.csv` in `data/raw/` (e.g. from Kaggle), or use `Crop_Recommendation_sample.csv` for quick testing.
- Optional: add `state_wise_yield.csv`, `market_prices.csv`, `cost_of_cultivation.csv`, `climate_vulnerability.csv` for better region-aware results.
```
python run_pipeline.py
```
This loads the data, runs EDA, preprocesses, trains and compares models, selects the best one, and saves artifacts to `models/`.
```
streamlit run app.py
```
Or:
```
python -m streamlit run app.py
```
Then open the URL (e.g. http://localhost:8501). Select state, district, and land size (bigha), click Proceed to Analysis, and view the top 5 crops with production (kg), price (₹/kg), risk, diseases, and prevention. Use Start new analysis to run again.
```
python -m pytest tests/test_crop_variety.py -v
```
The tests verify that crop recommendations vary by state, district, and land size (i.e. not the same 5 crops for all regions).
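The core assertion those tests make can be sketched like this. `assert_regional_variety` is a hypothetical helper written against a `predict_crop`-style callable; the actual tests in `tests/test_crop_variety.py` may be structured differently.

```python
def assert_regional_variety(predict, cases):
    """Fail if every (state, district, land_size) case yields the same top-5 crops.

    `predict` is any callable returning a ranked list of {"crop": ...} dicts.
    """
    top5_sets = [tuple(rec["crop"] for rec in predict(*case)[:5]) for case in cases]
    assert len(set(top5_sets)) > 1, "recommendations identical across all regions"
```

Running it against, say, a Rajasthan case and a Kerala case catches the failure mode the project guards against: a model that ignores region and always returns the same five crops.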
- Technical design: `docs/ARCHITECTURE.md`
- Academic report: `REPORT.md`
- Code: comments in `src/` and `app.py`
Use the dataset in accordance with its source (e.g. Kaggle) and cite it in your report. This project is for academic and educational use.