# Pandas and SQL

### Reading Pandas DataFrame using SQL select statements

```python
# import sqldf from pandasql
from pandasql import sqldf
# create a short alternative for sqldf using lambda function to avoid passing globals() or locals() every time while calling sqldf
# this command below will give as alternative to sqldf and automatically pass globals() or locals() we just need to pass the query
pysqldf = lambda q: sqldf(q, globals())
```

```python
# import pandas
import pandas as pd
```

```python
# read from csv file and use 'header' attribute if column headers are not in 1st line of csv file
customer_df=pd.read_csv("D:\\PowerBI\\Datasets\\World Wide Importers\\DimCustomer.csv", header=1)
```

```python
customer_df.head(1)
```

|   | Customer Key | WWI Customer ID | Customer | Bill To Customer | Category | Buying Group | Primary Contact | Postal Code | Credit Limit | Valid From | Valid To                                          | Lineage Key |
| - | ------------ | --------------- | -------- | ---------------- | -------- | ------------ | --------------- | ----------- | ------------ | ---------- | ------------------------------------------------- | ----------- |
| 0 | 0            | 0               | Unknown  | NaN              | NaN      | NaN          | NaN             | NaN         | ? -          | 00:00.0    | ##############################################... | 0           |

```python
# create your SQL query that you want to excute on the DataFrame
q='''
SELECT "Customer Data" 
FROM customer_df LIMIT 4
'''
```

```python
# use the custom sqldf method created above
pysqldf(q)
```

|   | Customer Key | WWI Customer ID | WWI Customer ID |
| - | ------------ | --------------- | --------------- |
| 0 | 0            | 0               | Unknown         |

### Read from SQL to DF

#### install `sqlalchemy` to us pandas's `read_sql` and `to_sql` functions

alternatively you can use sqlalchemy with a little extra code

```python
!pip install sqlalchemy
```

```
Requirement already satisfied: sqlalchemy in c:\python3\lib\site-packages (1.4.31)
Requirement already satisfied: greenlet!=0.4.17 in c:\python3\lib\site-packages (from sqlalchemy) (1.1.2)


WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available.
You should consider upgrading via the 'C:\Python3\python.exe -m pip install --upgrade pip' command.
```

#### install pymysql (it is an extension for sqlalchemy to enable it to connect to mysql sources)

```python
!pip install pymysql
```

```python
# import sqlalchemy
import sqlalchemy
```

```python
# import pymysql
import pymysql
```

#### create a connection for pandas using slqalchemy function

sqlalchemy.create\_engine('mysql+pymysql://`username`:@`host_name`/`db_name`')

```python
conection_engine = sqlalchemy.create_engine('mysql+pymysql://root:@localhost/pandas_sql')
```

```python
df_customer=pd.read_sql_query("SELECT * FROM customer", conection_engine)
```

```python
df_customer.head()
```

|   | Id | Name   | City      |
| - | -- | ------ | --------- |
| 0 | 1  | Ajeet  | Ghaziabad |
| 1 | 2  | Rashid | Noida     |
| 2 | 3  | Ajay   | Gurgaon   |
| 3 | 4  | Ramu   | Hyderbad  |
| 4 | 5  | Rajesh | Bangalore |

#### Once you read the data from SQL and store into a DataFrame you can do anything you want with it

```python
# let's say after doing some transformation and modificaton you have a new DataFrame (df_customer_modified)
df_customer_modified=df_customer
```

#### Now if you want you can save this DataFrame in you SQL Database

```python
# write DataFrame into SQL database (if the mentioned table name already exists then this qury will fail)
df_customer_modified.to_sql("customer_modified", conection_engine)
```

## Congratulation !! 🎉

* You can now access/read DataFrames using SQL Select statements instead of using the traditional Datafram syntax.
* You can also read data from SQL and save the result as DataFrame.
* You can save an existing dataframe directly into sql table.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.consoleflare.com/python/pandas-and-sql.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
