# Pandas and SQL

### Reading Pandas DataFrame using SQL select statements

```python
# import sqldf from pandasql
from pandasql import sqldf
# create a short alternative for sqldf using lambda function to avoid passing globals() or locals() every time while calling sqldf
# this command below will give as alternative to sqldf and automatically pass globals() or locals() we just need to pass the query
pysqldf = lambda q: sqldf(q, globals())
```

```python
# import pandas
import pandas as pd
```

```python
# read from csv file and use 'header' attribute if column headers are not in 1st line of csv file
customer_df=pd.read_csv("D:\\PowerBI\\Datasets\\World Wide Importers\\DimCustomer.csv", header=1)
```

```python
customer_df.head(1)
```

|   | Customer Key | WWI Customer ID | Customer | Bill To Customer | Category | Buying Group | Primary Contact | Postal Code | Credit Limit | Valid From | Valid To                                          | Lineage Key |
| - | ------------ | --------------- | -------- | ---------------- | -------- | ------------ | --------------- | ----------- | ------------ | ---------- | ------------------------------------------------- | ----------- |
| 0 | 0            | 0               | Unknown  | NaN              | NaN      | NaN          | NaN             | NaN         | ? -          | 00:00.0    | ##############################################... | 0           |

```python
# create your SQL query that you want to excute on the DataFrame
q='''
SELECT "Customer Data" 
FROM customer_df LIMIT 4
'''
```

```python
# use the custom sqldf method created above
pysqldf(q)
```

|   | Customer Key | WWI Customer ID | WWI Customer ID |
| - | ------------ | --------------- | --------------- |
| 0 | 0            | 0               | Unknown         |

### Read from SQL to DF

#### install `sqlalchemy` to us pandas's `read_sql` and `to_sql` functions

alternatively you can use sqlalchemy with a little extra code

```python
!pip install sqlalchemy
```

```
Requirement already satisfied: sqlalchemy in c:\python3\lib\site-packages (1.4.31)
Requirement already satisfied: greenlet!=0.4.17 in c:\python3\lib\site-packages (from sqlalchemy) (1.1.2)


WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available.
You should consider upgrading via the 'C:\Python3\python.exe -m pip install --upgrade pip' command.
```

#### install pymysql (it is an extension for sqlalchemy to enable it to connect to mysql sources)

```python
!pip install pymysql
```

```python
# import sqlalchemy
import sqlalchemy
```

```python
# import pymysql
import pymysql
```

#### create a connection for pandas using slqalchemy function

sqlalchemy.create\_engine('mysql+pymysql://`username`:@`host_name`/`db_name`')

```python
conection_engine = sqlalchemy.create_engine('mysql+pymysql://root:@localhost/pandas_sql')
```

```python
df_customer=pd.read_sql_query("SELECT * FROM customer", conection_engine)
```

```python
df_customer.head()
```

|   | Id | Name   | City      |
| - | -- | ------ | --------- |
| 0 | 1  | Ajeet  | Ghaziabad |
| 1 | 2  | Rashid | Noida     |
| 2 | 3  | Ajay   | Gurgaon   |
| 3 | 4  | Ramu   | Hyderbad  |
| 4 | 5  | Rajesh | Bangalore |

#### Once you read the data from SQL and store into a DataFrame you can do anything you want with it

```python
# let's say after doing some transformation and modificaton you have a new DataFrame (df_customer_modified)
df_customer_modified=df_customer
```

#### Now if you want you can save this DataFrame in you SQL Database

```python
# write DataFrame into SQL database (if the mentioned table name already exists then this qury will fail)
df_customer_modified.to_sql("customer_modified", conection_engine)
```

## Congratulation !! 🎉

* You can now access/read DataFrames using SQL Select statements instead of using the traditional Datafram syntax.
* You can also read data from SQL and save the result as DataFrame.
* You can save an existing dataframe directly into sql table.
