Pandas and SQL

Reading Pandas DataFrame using SQL select statements

# import sqldf from pandasql
from pandasql import sqldf
# create a short alternative for sqldf using lambda function to avoid passing globals() or locals() every time while calling sqldf
# this command below will give as alternative to sqldf and automatically pass globals() or locals() we just need to pass the query
pysqldf = lambda q: sqldf(q, globals())

# import pandas
import pandas as pd

# read from csv file and use 'header' attribute if column headers are not in 1st line of csv file
customer_df=pd.read_csv("D:\\PowerBI\\Datasets\\World Wide Importers\\DimCustomer.csv", header=1)

customer_df.head(1)

Customer Key

WWI Customer ID

Customer

Bill To Customer

Read from SQL to DF

install `sqlalchemy` to us pandas's `read_sql` and `to_sql` functions

alternatively you can use sqlalchemy with a little extra code

!pip install sqlalchemy

Requirement already satisfied: sqlalchemy in c:\python3\lib\site-packages (1.4.31)
Requirement already satisfied: greenlet!=0.4.17 in c:\python3\lib\site-packages (from sqlalchemy) (1.1.2)


WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available.
You should consider upgrading via the 'C:\Python3\python.exe -m pip install --upgrade pip' command.

install pymysql (it is an extension for sqlalchemy to enable it to connect to mysql sources)

!pip install pymysql

# import sqlalchemy
import sqlalchemy

# import pymysql
import pymysql

create a connection for pandas using slqalchemy function

sqlalchemy.create_engine('mysql+pymysql://username:@host_name/db_name')

conection_engine = sqlalchemy.create_engine('mysql+pymysql://root:@localhost/pandas_sql')

df_customer=pd.read_sql_query("SELECT * FROM customer", conection_engine)

df_customer.head()

Name

City

Ajeet

Ghaziabad

Rashid

Noida

Ajay

Gurgaon

Ramu

Hyderbad

Rajesh

Bangalore

Once you read the data from SQL and store into a DataFrame you can do anything you want with it

# let's say after doing some transformation and modificaton you have a new DataFrame (df_customer_modified)
df_customer_modified=df_customer

Now if you want you can save this DataFrame in you SQL Database

# write DataFrame into SQL database (if the mentioned table name already exists then this qury will fail)
df_customer_modified.to_sql("customer_modified", conection_engine)

Congratulation !! 🎉

You can now access/read DataFrames using SQL Select statements instead of using the traditional Datafram syntax.
You can also read data from SQL and save the result as DataFrame.
You can save an existing dataframe directly into sql table.

PreviousPython Installation NextData Analytics Project

Last updated 4 months ago

Was this helpful?