The SQLAlchemy is a python library for efficient database access. It abstracts away many of the common differences between different SQL databases.
pandas has read_sql and DataFrame.to_sql functions which enable easy read/ write of SQL tables from SQLAlchemy connection.
import sqlalchemy as db
import pandas as pdConnect with SQL database
# PyMySQL is a python connection driver for MySQL
!pip3 install PyMySQLcreate_engine returns an instance of Engine, and it represents the core interface to the database, adapted through a dialect. On first time call of Engine.execute() or Engine.connect() Engine establishes a real connection to the database.
# The typical form of a database URL is: dialect+driver://username:password@host:port/database
engine = db.create_engine('mysql+pymysql://username:password@localhost:3306/mydb')
engine.connect()For now our database has single table.
# Print the table names
engine.table_names()['users']Pandas DataFrame
Create a dataframe with simple data
columns = ['country', 'continent', 'area', 'population', 'gdp']
data = [('Afghanistan', 'Asia', 647500, 31056997, 700),
('India', 'Asia', 3287590, 1095351995, 2900)]df = pd.DataFrame(data=data, columns=columns)
df| country | continent | area | population | gdp | |
|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 647500 | 31056997 | 700 |
| 1 | India | Asia | 3287590 | 1095351995 | 2900 |
Creata table in SQL database
Write all records stored in dataframe to SQL database table named country
df.to_sql('country', con=engine,
index=False # Do not want DataFrame index as a table column
)We can check if table is created in database
engine.table_names()['country', 'users']Read SQL table
We can read the SQL database table conveniently with read_sql function. It returns the table in DataFrame object.
pd.read_sql("SELECT * FROM country", engine)| country | continent | area | population | gdp | |
|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 647500 | 31056997 | 700 |
| 1 | India | Asia | 3287590 | 1095351995 | 2900 |
Append to SQL table
# create another dataframe with some more nation's entry
more_data = [('China', 'Asia', 9596960, 1313973713, 5000),
('United States', 'North America', 9631420, 298444215, 37800),
('Brazil', 'South America', 8511965, 188078227, 7600),
('South Africa', 'Africa', 1219912, 44187637, 10700),
('France', 'Europe', 547030, 60876136, 27600)]
df1 = pd.DataFrame(data=more_data, columns=columns)Append the data in database table
df1.to_sql('country', con=engine,
index=False,
if_exists='append' # the SQL table gets appended
)pd.read_sql("SELECT * FROM country", engine)| country | continent | area | population | gdp | |
|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 647500 | 31056997 | 700 |
| 1 | India | Asia | 3287590 | 1095351995 | 2900 |
| 2 | China | Asia | 9596960 | 1313973713 | 5000 |
| 3 | United States | North America | 9631420 | 298444215 | 37800 |
| 4 | Brazil | South America | 8511965 | 188078227 | 7600 |
| 5 | South Africa | Africa | 1219912 | 44187637 | 10700 |
| 6 | France | Europe | 547030 | 60876136 | 27600 |
Delete records from SQL table
drop_query = """DELETE FROM country
WHERE country = 'China'
"""Execute a SQL statement construct.
engine.execute(drop_query)pd.read_sql("SELECT * FROM country", engine)| country | continent | area | population | gdp | |
|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 647500 | 31056997 | 700 |
| 1 | India | Asia | 3287590 | 1095351995 | 2900 |
| 2 | United States | North America | 9631420 | 298444215 | 37800 |
| 3 | Brazil | South America | 8511965 | 188078227 | 7600 |
| 4 | South Africa | Africa | 1219912 | 44187637 | 10700 |
| 5 | France | Europe | 547030 | 60876136 | 27600 |
Drop table
You can completely drop the table.
drop_query = "DROP TABLE country"
engine.execute(drop_query);engine.table_names()['users']