Reading comma-separated values (CSV) into a pandas DataFrame

Overview:

  • Data ranging in size from a few lines to millions of lines is generated daily as CSV (or other delimiter-separated) streams and files. Examples of such data sources include stock exchanges, weather stations, sensors in factory settings, and computer programs that monitor for various events.
  • In a comma-separated file, commas separate the fields and every record starts on a new line.
  • pandas provides extensive support for serializing and de-serializing CSV: the DataFrame.to_csv() method writes a DataFrame out as CSV, and the top-level pandas.read_csv() function reads CSV back in.
  • The pandas.read_csv() function reads a CSV file and returns a DataFrame in which each record is loaded as a row.
  • In a similar way, pandas supports reading DataFrame contents from, and writing them to, databases such as MySQL and PostgreSQL.
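
The serializing and de-serializing described above can be sketched as a round trip through an in-memory text stream. This is a minimal illustration, not a complete treatment: the sample records and the names original, csvText and restored are assumptions made for the example.

```python
import io
import pandas as pds

# Hypothetical sample records, used only for illustration
original = pds.DataFrame({
    "Server Name": ["Host1", "Host2"],
    "Response Time(ms)": [12, 25],
})

# Serialize: to_csv() returns the CSV as a string when no path or buffer is given
csvText = original.to_csv(index=False)

# De-serialize: read_csv() accepts any file-like object, here a text stream
restored = pds.read_csv(io.StringIO(csvText))

print(restored)
```

Passing index=False keeps the row labels out of the CSV, so read_csv() does not pick them up as an extra unnamed column.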

Reading CSV from a disk file or a binary stream and loading it into a pandas DataFrame

Example–To load data from a CSV file into a pandas DataFrame:

# Example Python program to read a CSV file
# and load the contents into a pandas DataFrame object
import pandas as pds

# Name of the CSV file
fileName = "./ServerResponseTimes.csv"

# Read CSV file into a pandas DataFrame instance
dataFrame = pds.read_csv(fileName)

# Print the contents of the DataFrame onto the console
print("Contents of the DataFrame as read from the .csv file:")
print(dataFrame)

# Print the index of the DataFrame
print("Index of the DataFrame:")
print(dataFrame.index)

Output:

Contents of the DataFrame as read from the .csv file:
  Server Name  Response Time(ms)
0       Host1                 12
1       Host2                 25
2       Host3                 10
3       Host4                 40
4       Host5                 71
Index of the DataFrame:
RangeIndex(start=0, stop=5, step=1)
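
The example above depends on ServerResponseTimes.csv already existing on disk. The sketch below is a self-contained variant that first writes the same records to a temporary file, so it can be run as-is. It also shows one optional read_csv() parameter, index_col, which uses a column as the row index instead of the default RangeIndex. The temporary directory and the names csvText, defaultFrame and indexedFrame are assumptions made for illustration.

```python
import os
import tempfile
import pandas as pds

# Sample records mirroring the output shown above
csvText = (
    "Server Name,Response Time(ms)\n"
    "Host1,12\n"
    "Host2,25\n"
    "Host3,10\n"
    "Host4,40\n"
    "Host5,71\n"
)

with tempfile.TemporaryDirectory() as tmpDir:
    # Write the sample CSV to a temporary file
    fileName = os.path.join(tmpDir, "ServerResponseTimes.csv")
    with open(fileName, "w", newline="") as csvFile:
        csvFile.write(csvText)

    # Default behaviour: rows are labelled by a RangeIndex
    defaultFrame = pds.read_csv(fileName)

    # Optional: use the "Server Name" column as the row index
    indexedFrame = pds.read_csv(fileName, index_col="Server Name")

print(defaultFrame.index)
print(indexedFrame.loc["Host3", "Response Time(ms)"])
```

With the server name as the index, a single response time can be looked up by host name through loc, as in the last line.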

Example–To load a binary stream of CSV records into a pandas DataFrame:

read_csv() can read from a binary stream as well. The Python example below constructs a bytes literal and wraps it in a BytesIO stream. The stream is passed to read_csv(), which parses the bytes and loads the records into a DataFrame.

# Example Python program to load data from a binary stream of CSV
# into a pandas DataFrame
import pandas as pds
import io

# Create a CSV bytes literal
csvData = b"Symbol, Price \r\n Example1, 10 \r\n Example2, 20"

# Make a binary stream
binaryStream = io.BytesIO(csvData)

# Have pandas read the CSV from the binary stream and load it into a DataFrame
dataFrame = pds.read_csv(binaryStream)

print("Contents of the DataFrame loaded from a binary stream of CSV:")
print(dataFrame)

Output:

Contents of the DataFrame loaded from a binary stream of CSV:
      Symbol   Price
0   Example1       10
1   Example2       20
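
The bytes literal in this example has spaces around its delimiters, so the column labels and text values are loaded with that whitespace attached (the second column is labelled " Price ", not "Price"). One possible clean-up is sketched below. It assumes the skipinitialspace parameter of read_csv() for spaces that follow a delimiter, and the pandas string method str.strip() for whatever remains; other approaches are equally valid.

```python
import io
import pandas as pds

# Same CSV bytes literal as in the example above
csvData = b"Symbol, Price \r\n Example1, 10 \r\n Example2, 20"

# skipinitialspace=True asks the parser to skip spaces that follow a delimiter
dataFrame = pds.read_csv(io.BytesIO(csvData), skipinitialspace=True)

# Trailing spaces are not covered by skipinitialspace, so strip the labels
dataFrame.columns = dataFrame.columns.str.strip()

# Strip any whitespace still attached to the text column
dataFrame["Symbol"] = dataFrame["Symbol"].str.strip()

print(dataFrame.columns.tolist())
print(dataFrame["Symbol"].tolist())
```

After this clean-up the columns can be addressed as dataFrame["Symbol"] and dataFrame["Price"] without the surrounding spaces.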

 


Copyright 2024 © pythontic.com