# PySpark & Databricks

- [Spark vs Hadoop](/pyspark-and-databricks/spark-vs-hadoop.md): Why Spark is better than Hadoop
- [Cluster Computing](/pyspark-and-databricks/cluster-computing.md): What cluster computing is
- [PySpark](/pyspark-and-databricks/pyspark.md): An introduction to PySpark
- [Databricks Introduction](/pyspark-and-databricks/databricks-introduction.md): A brief introduction to Databricks
- [PySpark in Databricks](/pyspark-and-databricks/pyspark-in-databricks.md): How to get started with Databricks
- [Reading Data with PySpark](/pyspark-and-databricks/reading-data-with-pyspark.md): How to read data in PySpark
- [PySpark Transformation Methods](/pyspark-and-databricks/pyspark-transformation-methods.md): Common data transformation methods in PySpark
- [Handling Duplicate Data](/pyspark-and-databricks/handling-duplicate-data.md): Common methods for handling duplicate data
- [PySpark Action Methods](/pyspark-and-databricks/pyspark-action-methods.md): Common action methods in PySpark
- [PySpark Native Functions](/pyspark-and-databricks/pyspark-native-functions.md): Common PySpark native functions
- [Partitioning](/pyspark-and-databricks/partitioning.md): The partitioning strategy in PySpark
- [Bucketing](/pyspark-and-databricks/bucketing.md): The bucketing strategy in PySpark
- [Partitioning vs Bucketing](/pyspark-and-databricks/partitioning-vs-bucketing.md): The difference between partitioning and bucketing
