Reading Pandas DataFrame using SQL select statements
# import sqldf from pandasqlfrom pandasql import sqldf# create a short alternative for sqldf using lambda function to avoid passing globals() or locals() every time while calling sqldf
# this command below will give as alternative to sqldf and automatically pass globals() or locals() we just need to pass the query
pysqldf =lambdaq: sqldf(q, globals())
# import pandasimport pandas as pd
# read from csv file and use 'header' attribute if column headers are not in 1st line of csv filecustomer_df=pd.read_csv("D:\\PowerBI\\Datasets\\World Wide Importers\\DimCustomer.csv", header=1)
customer_df.head(1)
Customer Key
WWI Customer ID
Customer
Bill To Customer
Category
Buying Group
Primary Contact
Postal Code
Credit Limit
Valid From
Valid To
Lineage Key
# create your SQL query that you want to excute on the DataFrameq='''SELECT "Customer Data" FROM customer_df LIMIT 4'''
# use the custom sqldf method created abovepysqldf(q)
Customer Key
WWI Customer ID
WWI Customer ID
Read from SQL to DF
install sqlalchemy to us pandas's read_sql and to_sql functions
alternatively you can use sqlalchemy with a little extra code
!pip install sqlalchemy
Requirement already satisfied: sqlalchemy in c:\python3\lib\site-packages (1.4.31)
Requirement already satisfied: greenlet!=0.4.17 in c:\python3\lib\site-packages (from sqlalchemy) (1.1.2)
WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available.
You should consider upgrading via the 'C:\Python3\python.exe -m pip install --upgrade pip' command.
install pymysql (it is an extension for sqlalchemy to enable it to connect to mysql sources)
!pip install pymysql
# import sqlalchemyimport sqlalchemy
# import pymysqlimport pymysql
create a connection for pandas using slqalchemy function
df_customer=pd.read_sql_query("SELECT * FROM customer", conection_engine)
df_customer.head()
Once you read the data from SQL and store into a DataFrame you can do anything you want with it
# let's say after doing some transformation and modificaton you have a new DataFrame (df_customer_modified)df_customer_modified=df_customer
Now if you want you can save this DataFrame in you SQL Database
# write DataFrame into SQL database (if the mentioned table name already exists then this qury will fail)df_customer_modified.to_sql("customer_modified", conection_engine)
Congratulation !! 🎉
You can now access/read DataFrames using SQL Select statements instead of using the traditional Datafram syntax.
You can also read data from SQL and save the result as DataFrame.
You can save an existing dataframe directly into sql table.