Nano-Material Adsorption Prediction for CCU

This project uses an XGBoost machine learning model to predict the adsorption capacity (nads) of nano-materials for Carbon Capture and Utilization (CCU). The goal is to identify optimal nano-material configurations that efficiently capture CO₂ and H₂O molecules, supporting the development of advanced membranes, sorbents, and catalysts as described in the research paper "Nano-enabled Membranes, Sorbents, and Catalysts for Addressing the Challenges of Carbon Capture and Utilization (CCU)". The model is trained on simulation data (e.g., is2r_train_optimized.csv), saved for reuse, and can predict adsorption capacity for new nano-material designs.

Project Overview

Objective: Predict nads (number of adsorbed molecules) to rank nano-material configurations by their CO₂ capture efficiency.
Model: XGBoost regressor with hyperparameters tuned for accuracy and evaluated by root-mean-square error (RMSE).
Dataset: Based on is2r_train_optimized.csv, a simulation dataset of nano-material properties.
Output: Predicted nads values, identifying top-performing configurations for CCU applications.

Installation

Prerequisites

Python 3.7+

Libraries: Install via pip:

pip install pandas numpy xgboost scikit-learn matplotlib

Setup

Clone or download this repository:

git clone https://github.com/thearpankumar/CCU-Prediction-Nano-Enabled-Membrane-ML.git
cd CCU-Prediction-Nano-Enabled-Membrane-ML

Place your training data (is2r_train_optimized.csv) in the project directory, or update the file path in train_xgboost.py.
Ensure new input data follows the required format (see Required Features and Example CSV).

2. Making Predictions

Use the saved model to predict nads for new data:

python predict_xgboost.py

What It Does:
- Predicts nads for your input data (e.g., your_new_data.csv).
- Saves results to predictions_output.csv.
Steps:
1. Prepare your input data (see Required Features and Example CSV).
2. Update the input file path in predict_xgboost.py (e.g., "your_new_data.csv").
3. Run the script.

Output Example:

Model loaded successfully from 'xgboost_nads_model.json'
Predictions for the first 5 rows:
   nco2  nh2o  y_init  natoms  defective  ...  predicted_nads
0     1     2  0.6126      82          1  ...        3.0318
1     1     0 -0.4206      95          0  ...        2.9876
Results saved to 'predictions_output.csv'
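Once predictions_output.csv exists, ranking configurations by predicted capacity is a short pandas operation. The helper below is a hypothetical convenience, not part of the repository; it only assumes the `predicted_nads` column shown in the output above.

```python
import pandas as pd


def top_configurations(predictions, k=10):
    """Return the k rows with the highest predicted_nads, best first."""
    return predictions.nlargest(k, "predicted_nads").reset_index(drop=True)
```

For example, `top_configurations(pd.read_csv("predictions_output.csv"), k=5)` would list the five configurations with the highest predicted nads.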

Required Features

Feature	Description	Data Type	Example
`nco2`	Number of CO₂ molecules exposed	Integer/Float	1
`nh2o`	Number of H₂O molecules exposed	Integer/Float	2
`y_init`	Initial energy (eV)	Float	0.612635
`natoms`	Number of atoms in material	Integer	82
`defective`	Has defects? (True/False or 1/0)	Boolean/Int	True
`cell_mean`	Average cell size (Å)	Float	5.643653
`cell_std`	Std dev of cell size (Å)	Float	4.755599
`pos_relaxed_mean`	Mean relaxed position (Å)	Float	8.48211
`pos_relaxed_std`	Std dev of relaxed position (Å)	Float	4.277549
`atomic_numbers_mean`	Mean atomic number	Float	13.573171
`atomic_numbers_std`	Std dev of atomic numbers	Float	12.533885

Example CSV

nco2,nh2o,y_init,natoms,defective,cell_mean,cell_std,pos_relaxed_mean,pos_relaxed_std,atomic_numbers_mean,atomic_numbers_std
1,2,0.612635,82,True,5.643653,4.755599,8.48211,4.277549,13.573171,12.533885
1,0,-0.420629,95,False,5.373723,5.147071,7.654403,3.760834,6.905263,12.555620

Notes:
- Use the same units (e.g., Å for distances, eV for energy) as the training data.
- defective can be True/False or 1/0; the script converts it to 0/1.
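The format rules above can be checked before running predictions. The following is a minimal validation sketch, assuming pandas is used to load the file; `validate_input` is a hypothetical helper and the error messages are illustrative. The sample rows are the two from Example CSV.

```python
import io

import pandas as pd

REQUIRED_FEATURES = [
    "nco2", "nh2o", "y_init", "natoms", "defective",
    "cell_mean", "cell_std", "pos_relaxed_mean", "pos_relaxed_std",
    "atomic_numbers_mean", "atomic_numbers_std",
]

# Accepted spellings of the defective flag, mapped to 0/1.
_FLAG_MAP = {"true": 1, "false": 0, "1": 1, "0": 0, "1.0": 1, "0.0": 0}


def validate_input(df):
    """Check required columns, convert defective to 0/1, and return numeric features."""
    missing = [c for c in REQUIRED_FEATURES if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    out = df[REQUIRED_FEATURES].copy()
    flags = out["defective"].astype(str).str.strip().str.lower().map(_FLAG_MAP)
    if flags.isna().any():
        raise ValueError("defective must be True/False or 1/0")
    out["defective"] = flags.astype(int)

    # Any value that cannot be parsed as a number becomes NaN and is reported.
    numeric = out.apply(pd.to_numeric, errors="coerce")
    bad = numeric.columns[numeric.isna().any()].tolist()
    if bad:
        raise ValueError(f"Non-numeric or missing values in columns: {bad}")
    return numeric


sample = io.StringIO(
    "nco2,nh2o,y_init,natoms,defective,cell_mean,cell_std,"
    "pos_relaxed_mean,pos_relaxed_std,atomic_numbers_mean,atomic_numbers_std\n"
    "1,2,0.612635,82,True,5.643653,4.755599,8.48211,4.277549,13.573171,12.533885\n"
    "1,0,-0.420629,95,False,5.373723,5.147071,7.654403,3.760834,6.905263,12.555620\n"
)
clean = validate_input(pd.read_csv(sample))
print(clean["defective"].tolist())  # [1, 0]
```

Raising early on a missing or non-numeric column gives a clearer message than a failure inside the model call.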

Project Goals

Identify high-performing nano-materials for CCU by predicting nads.
Support research into efficient CO₂ capture, aligning with nano-enabled sorbent/membrane advancements.
Provide a reusable model for testing new material designs.

Future Improvements

Add CO₂-specific selectivity metrics (current nads includes H₂O).
Integrate with neural networks for deeper pattern analysis.
Expand to predict stability (raw_y) alongside capacity.

Contributing

Feel free to fork this repo, submit issues, or suggest enhancements via pull requests!

Notes

Data: is2r_train_optimized.csv is not included in this repository; supply your own copy of the training data.
File paths: Adjust the file paths in the scripts as needed for your setup.
