Pandas DataFrames - Reading from and writing to PostgreSQL table

Overview:

Reading from a PostgreSQL table to a pandas DataFrame:

  • The data to be analyzed is often from a data store like PostgreSQL table.
  • Data from a PostgreSQL table can be read and loaded into a pandas DataFrame by calling the method DataFrame.read_sql() and passing the database connection obtained from the SQLAlchemy Engine as a parameter.

Example:

# Example python program to read data from a PostgreSQL table

# and load into a pandas DataFrame

import psycopg2

import pandas as pds

from sqlalchemy import create_engine

 

# Create an engine instance

alchemyEngine   = create_engine('postgresql+psycopg2://test:@127.0.0.1', pool_recycle=3600);

 

# Connect to PostgreSQL server

dbConnection    = alchemyEngine.connect();

 

# Read data from PostgreSQL database table and load into a DataFrame instance

dataFrame       = pds.read_sql("select * from \"StudentScores\"", dbConnection);

 

pds.set_option('display.expand_frame_repr', False);

 

# Print the DataFrame

print(dataFrame);

 

# Close the database connection

dbConnection.close();

 

Output:

   index  Physics  Chemistry  Biology  Mathematics  Language

0   1211       57         61       76           56        67

1   1212       77         67       65           78        62

2   1213       65         71       56           63        70

Writing a pandas DataFrame to a PostgreSQL table:

  • The following Python example, loads student scores from a list of tuples into a pandas DataFrame.
  • It creates an SQLAlchemy Engine instance which will connect to the PostgreSQL on a subsequent call to the connect() method.
  • Once a connection is made to the PostgreSQL server, the method to_sql() is called on the DataFrame instance , which persists the contents to a PostgreSQL table.

Example:

# Example Python program to serialize a pandas DataFrame

# into a PostgreSQL table

from sqlalchemy import create_engine

import psycopg2

import pandas as pds

 

 

# Data - Marks scored

studentScores = [(57, 61, 76, 56, 67),

                 (77, 67, 65, 78, 62),

                 (65, 71, 56, 63, 70)

                ];

# Create a DataFrame

dataFrame   = pds.DataFrame(studentScores,

              index=(1211,1212,1213), # Student ids as index

              columns=("Physics", "Chemistry", "Biology", "Mathematics", "Language")

              );

         

alchemyEngine           = create_engine('postgresql+psycopg2://test:test@127.0.0.1/test', pool_recycle=3600);

postgreSQLConnection    = alchemyEngine.connect();

postgreSQLTable         = "StudentScores";

 

try:

    frame           = dataFrame.to_sql(postgreSQLTable, postgreSQLConnection, if_exists='fail');

except ValueError as vx:

    print(vx)

except Exception as ex:  

    print(ex)

else:

    print("PostgreSQL Table %s has been created successfully."%postgreSQLTable);

finally:

    postgreSQLConnection.close();

 

 

Output:

PostgreSQL Table StudentScores has been created successfully.

 

 


Copyright 2024 © pythontic.com