Drawing A Scatter Plot Using Pandas DataFrame

Overview:

  • A scatter plot is a diagram that plots the paired values of two variables, X and Y, as points on a two-dimensional plane.
  • A scatter plot serves as an initial screening tool when checking two variables for any relationship (linear, non-linear, or inverse) that may exist between them.
  • A scatter plot is only a first step in that process. Even if it reveals a relationship between two variables, this does not mean that one variable influences the other. Measures such as the correlation coefficient can quantify the strength of the relationship, although correlation alone does not establish causation either.
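
As a rough illustration of the last point, the strength of a linear relationship can be quantified with the Pearson correlation coefficient. The sketch below assumes four illustrative (A, B) pairs and uses the pandas Series.corr() method; the column names and values are examples only.

```python
import pandas as pd

# Illustrative data only: four (A, B) pairs
df = pd.DataFrame([(2, 4), (23, 28), (7, 2), (9, 10)], columns=['A', 'B'])

# Pearson correlation coefficient between columns A and B
# (Series.corr() uses the Pearson method by default)
r = df['A'].corr(df['B'])
print(f"Pearson r = {r:.3f}")
```

A value close to +1 or -1 suggests a strong linear relationship, and a value near 0 a weak one. Neither says anything about cause and effect.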

 

Plotting a scatter plot using Pandas DataFrame:

  • The pandas DataFrame class has a plot member, an accessor object that groups the plotting methods.
  • Calling the scatter() method on the plot member draws a scatter plot of two given columns of the DataFrame.
  • A pandas DataFrame can have several columns. Any two of them can be passed as the x and y parameters of the scatter() method.

Example 1:

# Example Python program to draw a scatter plot
# for two columns of a pandas DataFrame
import pandas as pd
import matplotlib.pyplot as plt

# List of tuples
data = [(2, 4),
        (23, 28),
        (7, 2),
        (9, 10)]

# Load data into pandas DataFrame
dataFrame = pd.DataFrame(data=data, columns=['A', 'B'])

# Draw a scatter plot
dataFrame.plot.scatter(x='A', y='B', title="Scatter plot between two variables X and Y")
plt.show(block=True)

 

Output:

[Figure: scatter plot between two columns of a pandas DataFrame]
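
As a variation on Example 1, the sketch below relies on the fact that scatter() returns the Matplotlib Axes it draws on, so the plot can be customised and saved to a file instead of being shown in a window. The Agg backend, the axis label texts, and the file name scatter_ab.png are assumptions made for this illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import pandas as pd

data = [(2, 4), (23, 28), (7, 2), (9, 10)]
df = pd.DataFrame(data, columns=['A', 'B'])

# scatter() returns the Matplotlib Axes object holding the plot
ax = df.plot.scatter(x='A', y='B', title="Scatter plot of B against A")

# Customise the returned Axes (label texts are illustrative)
ax.set_xlabel("A (X variable)")
ax.set_ylabel("B (Y variable)")

# Save the figure to a file (hypothetical file name in a temporary directory)
out_path = os.path.join(tempfile.gettempdir(), "scatter_ab.png")
ax.get_figure().savefig(out_path)
print("Saved to", out_path)
```

Saving to a file this way can be useful on machines without a display, where opening a plot window is not possible.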

Example 2:

# Example Python program to draw a scatter plot
# for two columns of a multi-column DataFrame
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Create an ndarray with four columns and 20 rows
data = np.random.randn(20, 4)

# Load data into pandas DataFrame
dataFrame = pd.DataFrame(data=data, columns=['A', 'B', 'C', 'D'])

# Draw a scatter plot
dataFrame.plot.scatter(x='C', y='D', title="Scatter plot between two columns of a multi-column DataFrame")
plt.show(block=True)

Output:

[Figure: scatter plot for two selected columns of a multi-column pandas DataFrame]
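
Since any pair of columns can be plotted, it can be handy to look at all pairs at once. The following is a minimal sketch using pandas.plotting.scatter_matrix(); the fixed seed, the Agg backend, and the figure size are assumptions chosen for this illustration and are not part of the examples above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Seeded generator so the random data is reproducible (seed value is arbitrary)
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.standard_normal((20, 4)), columns=['A', 'B', 'C', 'D'])

# Draw a grid of scatter plots, one for every pair of columns;
# the diagonal shows a histogram of each column by default
axes = scatter_matrix(df, figsize=(8, 8))
print(axes.shape)
```

The function returns a two-dimensional array of Matplotlib Axes, one per column pair. In an interactive session, calling matplotlib.pyplot.show() afterwards displays the grid.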


Copyright 2023 © pythontic.com