Andrewrgarcia #37

Open: wants to merge 27 commits into base: master
Commits
1f685bd
changes README: selected rewote
Jan 1, 2023
3e206d5
setup pipenv as a package manager tohelp with the dependency / compat…
Jan 2, 2023
e576f89
git mv rewotes
Jan 2, 2023
9ba0fdd
pass notes.txt to readme in mlbands dir; diff-patterns example; creat…
Jan 2, 2023
7b58b8c
thermo: make sure it works (single mat get_data does not work, but co…
Jan 3, 2023
868103a
wget on mp-1103503; figure out a way to export cif / structural info …
Jan 3, 2023
5d9dff8
this is how you get crystal structures (a b c data)
Jan 3, 2023
25de7f6
cif and json for example
Jan 3, 2023
95772a2
pymatgen.core.sites.PeriodicSites features
Jan 3, 2023
f05190f
pass all these initial files to new draft folder
Jan 3, 2023
fa98388
start transcribing material so far into modular format for package ge…
Jan 3, 2023
b1a44b7
MPRester takes a while to retrieve ElectronicStructureDoc [single] do…
Jan 3, 2023
13741b0
develop bulk band_gap download algorithm; consider parallel processin…
Jan 3, 2023
2082e8d
script for extra functions; current functions make axes on all 3D of …
Jan 4, 2023
2c493cb
transformation of coordinate crystal data to xyz / box arrays
Jan 4, 2023
326c5ef
make material_ID instance variable as direct input to Material constr…
Jan 4, 2023
50343fa
simplify input calls for Material methods; develop method to load ban…
Jan 4, 2023
1b39dd3
integrate a neural network to package and test (LeNet5)
Jan 5, 2023
2fe1b74
add lenet3d --> from fork
Jan 5, 2023
f4099b8
fix neuralnetwork import error: rename nn to neuralnets dir
Jan 5, 2023
7b19bab
introduce load/save modules for data and option to visualize loaded d…
Jan 5, 2023
9b2090a
add machinelearning code (3d neural network)
Jan 5, 2023
8a4db2c
add option to extend data to other properties (characteristics)
Jan 5, 2023
d6c2594
add comments / descriptions to code
Jan 5, 2023
7a41b14
add final descriptions (text)
Jan 5, 2023
7b85113
deploy v1.0 with poetry
Jan 5, 2023
ad3a52e
deploy patch for v1.0 and update README with a user-friendly Jupyter …
Jan 5, 2023
12 changes: 12 additions & 0 deletions .gitignore
@@ -0,0 +1,12 @@
__pycache__
*.egg-info/
*.eggs/
*.data

#neuralnetwork-generated
data

old
deploy-notes
venv
secret.py
71 changes: 27 additions & 44 deletions README.md
@@ -1,66 +1,49 @@
# ReWoTes
# mlbands

REal WOrld TEstS
A Python package that automatically predicts electronic band gaps for a set of materials from training data.

## Overview

This repository contains example test assignments used during our hiring process.
## Installation

We find that regular job interview questions can often be misleading and so use more engaged "real-world" examples instead.
```shell
pip install --upgrade mlbands
```

Each file represents an assignment similar to what one would get when hired.
## Documentation and Usage on Google Colab (click below)

| Focus | ReWote | Keywords |
| ---------------| --------------------------| ------------------------------- |
| Comp. Science | [Convergence Tracker](Convergence-Tracker.md) | Python, OOD, DFT, Planewaves |
| Comp. Science | [Basis Set Selector](Basis-Set-Selector.md) | Python, OOD, DFT, Local-orbital |
| Data. Science | [ML Property Predict](ML-Band-Gaps.md) | Python, ML Models, Scikit, Featurization |
| Front-End / UX | [Materials Designer](Materials-Designer.md) | ReactJS / UX Design, ThreeJS |
| Front-End / UX | [Flowchart Designer](Flowchart-Designer.md) | ReactJS / UX Design, DAG |
| Back-End / Ops | [Parallel Uploader](Parallel-File-Uploader.md) | Python, OOD, Threading, Objectstore |
| CI/CD, DevOps | [End-to-End Tests](End-To-End-Tests.md) | BDD tests, CI/CD workflows, Cypress |
| HPC, Cloud Inf | [Cloud HPC Bench.](Cloud-Infrastructure.md) | HPC Cluster, Linpack, Benchmarks |
| HPC, Containers| [Containerized HPC](Containerization-HPC.md) | HPC Cluster, Containers, Benchmarks |
<a href="https://colab.research.google.com/drive/14GS_jUo_B6ojDip-Ak2VsGdrZL1Fgx5a?usp=sharing">
<img src="https://github.com/andrewrgarcia/powerxrd/blob/main/img/colab.png?raw=true" width="300" ></a>

## Usage

We suggest the following flow:

1. [Fork](https://docs.github.com/en/free-pro-team@latest/github/getting-started-with-github/fork-a-repo) this repository on GitHub
2. Create a branch using your GitHub username as a branch name
3. Create a subfolder with your GitHub username
4. Copy one of the ReWoTe suggestions (`.md` files) to `README.md` in that subfolder and modify the content of the ReWoTe as necessary
5. Introduce any changes under the subfolder
6. Submit a [pull request](https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) into the `dev` branch of this repository
# ML Band Gaps (Materials)

See [dev branch](https://github.com/Exabyte-io/rewotes/tree/dev) also.
> Ideal candidate: skilled ML data scientist with solid knowledge of materials science.

## Notes
# Overview

Examples listed here are only meant as guidelines and do not necessarily reflect on the type of work to be performed at the company. Modifications to the individual assignments with an advance notice are encouraged.
The aim of this task is to create a Python package that implements automatic prediction of electronic band gaps for a set of materials based on training data.

We will screen for the ability to (1) pick up new concepts quickly, (2) implement a working proof-of-concept solution, and (3) outline how the PoC can become more mature. We value attention to details and modularity.
# User story

As a user of this software I can predict the value of an electronic band gap after passing training data and structural information about the target material.

## Hiring process
# Requirements

Our hiring process in more details:
- suggest the bandgap values for a set of materials designated by their crystallographic and stoichiometric properties
- the code shall be written in a way that can facilitate easy addition of other characteristics extracted from simulations (forces, pressures, phonon frequencies etc)

| Stage | Target Duration | Topic |
| ----------------- | ----------------- | ------------------------------ |
| 0. Email screen | | why mat3ra.com / exabyte.io |
| 1. Phone screen | 15-20 min | career goals, basic skillset |
| 2. ReWoTe | 1-2h x 2-5 days | real-world work/thought process|
| 3. On-site meet | 3-4 x 30 min | personality fit |
| 4. Discuss offer | 30 min | cash/equity/benefits |
| 5. References | 2 x 15 min | sanity check |
| 6. Decision | | when to start |
# Expectations

TOTAL: ~2 weeks tentative.
- the code shall be able to suggest realistic values for slightly modified geometry sets, e.g. when trained on Si and Ge it should suggest a band gap for Si49Ge51 that lies between those of Si and Ge
- modular and object-oriented implementation
- commit early and often - at least once per 24 hours
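
The Si/Ge expectation in the list above can be phrased as a simple bounds check on a prediction. The following is a minimal sketch, not part of the mlbands API: `linear_mix` and `is_between` are hypothetical helpers, the composition-weighted average is only a naive baseline, and the two gap values are approximate room-temperature experimental figures used purely for illustration (Materials Project values will differ).

```python
def linear_mix(gap_a, gap_b, frac_a):
    """Composition-weighted average of two end-member band gaps (eV).

    Naive baseline only; a trained model would replace this function.
    """
    if not 0.0 <= frac_a <= 1.0:
        raise ValueError("frac_a must lie in [0, 1]")
    return frac_a * gap_a + (1.0 - frac_a) * gap_b


def is_between(value, end_a, end_b, tol=1e-9):
    """Return True if value lies within the closed interval spanned by the end members."""
    low, high = sorted((end_a, end_b))
    return low - tol <= value <= high + tol


# Illustrative values (eV), assumed for this sketch only.
GAP_SI = 1.12
GAP_GE = 0.66

# Si49Ge51: 49% Si, 51% Ge.
baseline = linear_mix(GAP_SI, GAP_GE, frac_a=0.49)
print(round(baseline, 4), is_between(baseline, GAP_SI, GAP_GE))
```

The same `is_between` check could be applied to the output of any model to test the "between Si and Ge" expectation automatically.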

# Timeline

## Contact info
We leave the exact timing to the candidate, but the work must fit within 5 days total.

With any questions about this repository or our hiring process please contact us at [email protected].
# Notes

© 2022 Exabyte Inc.
- use a designated github repository for version control
- suggested source of training data: materialsproject.org
9 files renamed without changes.
Binary file added dist/mlbands-1.0.0-py3-none-any.whl
Binary file not shown.
Binary file added dist/mlbands-1.0.0.tar.gz
Binary file not shown.
Binary file added dist/mlbands-1.0.1-py3-none-any.whl
Binary file not shown.
Binary file added dist/mlbands-1.0.1.tar.gz
Binary file not shown.
38 changes: 38 additions & 0 deletions draft/GdSnPd.cif
@@ -0,0 +1,38 @@
# generated using pymatgen
data_GdSnPd
_symmetry_space_group_name_H-M 'P 1'
_cell_length_a 4.65008947
_cell_length_b 7.31311105
_cell_length_c 7.95465153
_cell_angle_alpha 90.00000000
_cell_angle_beta 90.00000000
_cell_angle_gamma 90.00000000
_symmetry_Int_Tables_number 1
_chemical_formula_structural GdSnPd
_chemical_formula_sum 'Gd4 Sn4 Pd4'
_cell_volume 270.51081729
_cell_formula_units_Z 4
loop_
_symmetry_equiv_pos_site_id
_symmetry_equiv_pos_as_xyz
1 'x, y, z'
loop_
_atom_site_type_symbol
_atom_site_label
_atom_site_symmetry_multiplicity
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
Gd Gd0 1 0.25000000 0.51062534 0.20448969 1
Gd Gd1 1 0.25000000 0.01062534 0.29551031 1
Gd Gd2 1 0.75000000 0.48937466 0.79551031 1
Gd Gd3 1 0.75000000 0.98937466 0.70448969 1
Sn Sn4 1 0.25000000 0.69150951 0.58737792 1
Sn Sn5 1 0.25000000 0.19150951 0.91262208 1
Sn Sn6 1 0.75000000 0.30849049 0.41262208 1
Sn Sn7 1 0.75000000 0.80849049 0.08737792 1
Pd Pd8 1 0.25000000 0.79706757 0.91570644 1
Pd Pd9 1 0.25000000 0.29706757 0.58429356 1
Pd Pd10 1 0.75000000 0.20293243 0.08429356 1
Pd Pd11 1 0.75000000 0.70293243 0.41570644 1
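
The cell parameters in the CIF above can be cross-checked against its `_cell_volume` entry. This is a minimal standard-library sketch under stated assumptions: `parse_cell` and `cell_volume` are hypothetical helpers, `parse_cell` only reads simple `_cell_*` key/value lines, and it is not a general CIF parser.

```python
import math

# Cell block copied from the CIF above.
CIF_CELL = """\
_cell_length_a   4.65008947
_cell_length_b   7.31311105
_cell_length_c   7.95465153
_cell_angle_alpha   90.00000000
_cell_angle_beta   90.00000000
_cell_angle_gamma   90.00000000
_cell_volume   270.51081729
"""


def parse_cell(text):
    """Collect numeric `_cell_*` entries from two-column CIF lines."""
    cell = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("_cell_"):
            cell[parts[0]] = float(parts[1])
    return cell


def cell_volume(a, b, c, alpha, beta, gamma):
    """General triclinic cell volume; angles are given in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


cell = parse_cell(CIF_CELL)
volume = cell_volume(
    cell["_cell_length_a"], cell["_cell_length_b"], cell["_cell_length_c"],
    cell["_cell_angle_alpha"], cell["_cell_angle_beta"], cell["_cell_angle_gamma"],
)
# Agreement is expected only up to the rounding of the printed CIF values.
print(abs(volume - cell["_cell_volume"]) < 1e-4)
```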
129 changes: 129 additions & 0 deletions draft/GdSnPd.json
@@ -0,0 +1,129 @@
{
"@module": "pymatgen.core.structure",
"@class": "Structure",
"charge": 0,
"lattice": {
"matrix": [
[4.65008947, 0.0, 2.847358592592156e-16],
[1.1760379519337024e-15, 7.31311105014398, 4.477989019683982e-16],
[0.0, 0.0, 7.954651530160617]
],
"pbc": [true, true, true],
"a": 4.65008947,
"b": 7.31311105014398,
"c": 7.954651530160617,
"alpha": 90.0,
"beta": 90.0,
"gamma": 90.0,
"volume": 270.51081728514777
},
"sites": [{
"species": [{
"element": "Gd",
"occu": 1
}],
"abc": [0.25, 0.51062534, 0.2044896850000001],
"xyz": [1.1625223675000007, 3.734259816437527, 1.6266441856873135],
"label": "Gd",
"properties": {}
}, {
"species": [{
"element": "Gd",
"occu": 1
}],
"abc": [0.25, 0.010625340000000039, 0.2955103149999999],
"xyz": [1.1625223675, 0.07770429136553712, 2.350681579392995],
"label": "Gd",
"properties": {}
}, {
"species": [{
"element": "Gd",
"occu": 1
}],
"abc": [0.75, 0.48937465999999996, 0.7955103149999999],
"xyz": [3.4875671025000003, 3.5788512337064526, 6.328007344473304],
"label": "Gd",
"properties": {}
}, {
"species": [{
"element": "Gd",
"occu": 1
}],
"abc": [0.75, 0.98937466, 0.7044896850000001],
"xyz": [3.4875671025000012, 7.235406758778443, 5.603969950767622],
"label": "Gd",
"properties": {}
}, {
"species": [{
"element": "Sn",
"occu": 1
}],
"abc": [0.25, 0.69150951, 0.5873779150000001],
"xyz": [1.162522367500001, 5.057085838860649, 4.672386630337304],
"label": "Sn",
"properties": {}
}, {
"species": [{
"element": "Sn",
"occu": 1
}],
"abc": [0.25, 0.19150950999999994, 0.9126220849999999],
"xyz": [1.1625223675000003, 1.4005303137886584, 7.259590664903621],
"label": "Sn",
"properties": {}
}, {
"species": [{
"element": "Sn",
"occu": 1
}],
"abc": [0.75, 0.30849048999999995, 0.41262208499999986],
"xyz": [3.4875671025000003, 2.2560252112833306, 3.2822648998233133],
"label": "Sn",
"properties": {}
}, {
"species": [{
"element": "Sn",
"occu": 1
}],
"abc": [0.75, 0.80849049, 0.08737791500000025],
"xyz": [3.487567102500001, 5.91258073635532, 0.6950608652569968],
"label": "Sn",
"properties": {}
}, {
"species": [{
"element": "Pd",
"occu": 1
}],
"abc": [0.25, 0.79706757, 0.915706435],
"xyz": [1.162522367500001, 5.82904365387841, 7.284125594350674],
"label": "Pd",
"properties": {}
}, {
"species": [{
"element": "Pd",
"occu": 1
}],
"abc": [0.25, 0.29706756999999984, 0.584293565],
"xyz": [1.1625223675000005, 2.172488128806419, 4.647851700890252],
"label": "Pd",
"properties": {}
}, {
"species": [{
"element": "Pd",
"occu": 1
}],
"abc": [0.75, 0.20293243000000005, 0.08429356499999996],
"xyz": [3.4875671025000003, 1.4840673962655702, 0.6705259358099435],
"label": "Pd",
"properties": {}
}, {
"species": [{
"element": "Pd",
"occu": 1
}],
"abc": [0.75, 0.70293243, 0.41570643500000015],
"xyz": [3.487567102500001, 5.14062292133756, 3.306799829270367],
"label": "Pd",
"properties": {}
}]
}
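
Each site in the JSON above stores both fractional (`abc`) and Cartesian (`xyz`) coordinates, related through the lattice matrix. The sketch below, using only the standard library, recomputes `xyz` for the first Gd site as `abc` times the row-vector lattice matrix. `frac_to_cart` is a hypothetical helper and the embedded JSON is a trimmed excerpt of the file, not the full structure.

```python
import json

# Trimmed excerpt of draft/GdSnPd.json: lattice matrix plus the first site.
STRUCTURE_JSON = """
{
  "lattice": {
    "matrix": [
      [4.65008947, 0.0, 2.847358592592156e-16],
      [1.1760379519337024e-15, 7.31311105014398, 4.477989019683982e-16],
      [0.0, 0.0, 7.954651530160617]
    ]
  },
  "sites": [{
    "label": "Gd",
    "abc": [0.25, 0.51062534, 0.2044896850000001],
    "xyz": [1.1625223675000007, 3.734259816437527, 1.6266441856873135]
  }]
}
"""


def frac_to_cart(abc, matrix):
    """Cartesian coordinates from fractional ones; matrix rows are the lattice vectors."""
    return [sum(abc[i] * matrix[i][j] for i in range(3)) for j in range(3)]


data = json.loads(STRUCTURE_JSON)
matrix = data["lattice"]["matrix"]

for site in data["sites"]:
    computed = frac_to_cart(site["abc"], matrix)
    # Compare against the stored Cartesian coordinates.
    matches = all(abs(c - s) < 1e-9 for c, s in zip(computed, site["xyz"]))
    print(site["label"], matches)
```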