Pandas and SQL
Reading Pandas DataFrame using SQL select statements
# import sqldf from pandasql
from pandasql import sqldf
# create a short alternative for sqldf using lambda function to avoid passing globals() or locals() every time while calling sqldf
# this command below will give as alternative to sqldf and automatically pass globals() or locals() we just need to pass the query
pysqldf = lambda q: sqldf(q, globals())# import pandas
import pandas as pd# read from csv file and use 'header' attribute if column headers are not in 1st line of csv file
customer_df=pd.read_csv("D:\\PowerBI\\Datasets\\World Wide Importers\\DimCustomer.csv", header=1)customer_df.head(1)Customer Key
WWI Customer ID
Customer
Bill To Customer
Category
Buying Group
Primary Contact
Postal Code
Credit Limit
Valid From
Valid To
Lineage Key
# create your SQL query that you want to excute on the DataFrame
q='''
SELECT "Customer Data"
FROM customer_df LIMIT 4
'''# use the custom sqldf method created above
pysqldf(q)Customer Key
WWI Customer ID
WWI Customer ID
Read from SQL to DF
install sqlalchemy to us pandas's read_sql and to_sql functions
sqlalchemy to us pandas's read_sql and to_sql functionsinstall pymysql (it is an extension for sqlalchemy to enable it to connect to mysql sources)
create a connection for pandas using slqalchemy function
Id
Name
City
Once you read the data from SQL and store into a DataFrame you can do anything you want with it
Now if you want you can save this DataFrame in you SQL Database
Congratulation !! 🎉
Last updated