Note
This is the English 🇬🇧🇺🇸 version of the README. If you want to see the French 🇫🇷 version, you can click on the link below:
This GitHub repository stores the source files used to build the site https://pythonds.linogaliana.fr/.
It contains the entire Python for Data Science course that I teach in the second year (Master 1) at ENSAE.
The syllabus is available on the ENSAE website and on the course website.
Overall, the course is comprehensive enough to suit both beginners in data science and those looking for more advanced material:
- Data Manipulation: standard data manipulation (Pandas), geographical data (Geopandas), data retrieval (web scraping, API)...
- Data Visualization: classic visualizations (Matplotlib, Seaborn), cartography, interactive visualizations (Plotly, Folium)
- Modeling: machine learning (Scikit), econometrics
- Text Data Processing (NLP): introduction to tokenization with NLTK and SpaCy, modeling...
- Introduction to Modern Data Science: cloud computing, ElasticSearch, continuous integration...
The content of this site is based on open data, either French (mainly from the central platform data.gouv or from the Insee website) or American.
A good complement to the website's content is the course I teach with Romain Avouac (@avouacr) in the final year at ENSAE, which focuses more on putting data science projects into production: https://ensae-reproductibilite.github.io/website/
