PranayMehta/apache-spark

apache-spark

All my research about Apache Spark and the associate-level certification I am working toward through this process.

We will be reading the book "Spark: The Definitive Guide" for this certification. Here is a high-level overview:

Table of Contents

Part I. Gentle Overview of Big Data and Spark
Chapter 1. What Is Apache Spark?
Chapter 2. A Gentle Introduction to Spark
Chapter 3. A Tour of Spark's Toolset

Part II. Structured APIs: DataFrames, SQL, and Datasets
Chapter 4. Structured API Overview
Chapter 5. Basic Structured Operations
Chapter 6. Working with Different Types of Data
Chapter 7. Aggregations
Chapter 8. Joins
Chapter 9. Data Sources
Chapter 10. Spark SQL
Chapter 11. Datasets

Part III. Low-Level APIs
Chapter 12. Resilient Distributed Datasets (RDDs)
Chapter 13. Advanced RDDs
Chapter 14. Distributed Shared Variables

Part IV. Production Applications
Chapter 15. How Spark Runs on a Cluster
Chapter 16. Developing Spark Applications
Chapter 17. Deploying Spark
Chapter 18. Monitoring and Debugging
Chapter 19. Performance Tuning

Part V. Streaming
Chapter 20. Stream Processing Fundamentals
Chapter 21. Structured Streaming Basics
Chapter 22. Event-Time and Stateful Processing
Chapter 23. Structured Streaming in Production

Part VI. Advanced Analytics and Machine Learning
Chapter 24. Advanced Analytics and Machine Learning Overview
Chapter 25. Preprocessing and Feature Engineering
Chapter 26. Classification
Chapter 27. Regression
Chapter 28. Recommendation
Chapter 29. Unsupervised Learning
Chapter 30. Graph Analytics
Chapter 31. Deep Learning

Part VII. Ecosystem
Chapter 32. Language Specifics: Python (PySpark) and R (SparkR and sparklyr)
Chapter 33. Ecosystem and Community
