Using the loadtxt() function to read data from a text file into an ndarray

Overview:

  • The loadtxt() function of the NumPy module reads numeric data from a text file and returns it as an ndarray. By default the elements are floating point numbers; a different target data type can be specified through the dtype parameter.
  • Besides file names and file objects, loadtxt() also accepts a generator or a list of strings as its input source.
  • If a field cannot be converted to the target data type (for example, non-numeric text when reading floats), loadtxt() raises a ValueError.
  • Fields may be enclosed in a quote character, which is specified through the quotechar parameter of loadtxt() (available from NumPy 1.23).
  • Column values can be transformed while loading by passing a dictionary to the converters parameter. Each entry has the form <column_number>:<callable>, where the callable (often a lambda function) converts the text of a field into the desired value. If a single callable is provided instead of a dictionary, it is applied to every column.
  • The NumPy module also offers genfromtxt() to load data from a text file into an ndarray. genfromtxt() is more sophisticated: it can be told how to handle missing fields and nan values in the text file.
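
The missing-field handling of genfromtxt() mentioned in the last point can be sketched as follows. The in-memory rows and the fill value of 0 are assumptions made for illustration; they are not part of the examples on this page.

```python
import numpy

# Hypothetical comma-separated rows; the second row has an empty middle field
rows = ["1,2,3", "4,,6"]

# With default settings the missing field is read as nan
with_nan = numpy.genfromtxt(rows, delimiter=",")

# filling_values supplies a replacement value for missing fields
filled = numpy.genfromtxt(rows, delimiter=",", filling_values=0)

print(with_nan)
print(filled)
```

Whether nan or a fill value is the better choice depends on how the array is used afterwards; nan keeps the gap visible, while a fill value keeps later arithmetic free of nan.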

Example:

# Example Python program that reads numeric data from
# a file. Fields are separated by whitespace.
import numpy

# File path
txtFile = "NumData.txt"

# Read into an ndarray; lines starting with '#' are skipped as comments
matrix = numpy.loadtxt(txtFile)

# Print floats in fixed-point notation
numpy.set_printoptions(formatter={'float_kind':'{:f}'.format})
print(matrix)
print(type(matrix))

Input File:

# Sample data
10.1 12 15
20 23 45
12 14 25.4

Output:

[[10.100000 12.000000 15.000000]
 [20.000000 23.000000 45.000000]
 [12.000000 14.000000 25.400000]]
<class 'numpy.ndarray'>
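
The overview notes that loadtxt() also accepts a list of strings or a generator, that dtype selects the element type, and that unparsable fields raise a ValueError. A minimal sketch of those points, reusing the sample data of this example in memory; the generator function line_source() and the small integer and error inputs are hypothetical and introduced only for illustration.

```python
import numpy

# The sample data of this example, held in memory instead of a file
lines = ["# Sample data", "10.1 12 15", "20 23 45", "12 14 25.4"]

def line_source():
    # Hypothetical generator function that yields one line at a time
    for line in lines:
        yield line

from_list = numpy.loadtxt(lines)               # input: list of strings
from_generator = numpy.loadtxt(line_source())  # input: generator

# dtype selects the element type; the default is float
as_int = numpy.loadtxt(["1 2", "3 4"], dtype=int)

# A field that cannot be parsed as a number raises ValueError
try:
    numpy.loadtxt(["10.1 twelve 15"])
except ValueError as err:
    print("ValueError:", err)

print(numpy.array_equal(from_list, from_generator))
print(as_int.dtype.kind)
```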

Example 2 - Using a converter that multiplies the interest rate column of the input file by a frequency:

# Example Python program that loads numeric data
# from text file and transforms specific columns 
# by applying a lambda function

import numpy 

# Data file path
dataPath = "ir.txt"

# Multiply the interest rate column (column 1) by the
# frequency to get the actual interest rate
frequency = 2
converters = {
    1: lambda ir: float(ir) * frequency
}

# Load numeric data from the text file
numData = numpy.loadtxt(dataPath, delimiter=",", converters=converters)

# Print floats in fixed-point notation
numpy.set_printoptions(formatter={'float_kind':'{:f}'.format})
print(numData)

Input File:

# Bond value, Interest rate, Duration(in years)
100000,6.4,4
200000,6.4,4
400000,6.1,6
500000,6.5,4
700000,6.6,3

Output:

[[100000.000000 12.800000 4.000000]
 [200000.000000 12.800000 4.000000]
 [400000.000000 12.200000 6.000000]
 [500000.000000 13.000000 4.000000]
 [700000.000000 13.200000 3.000000]]
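
The overview also mentions quoted fields and passing a single callable as the converter. The sketch below illustrates both, assuming NumPy 1.23 or later, since the quotechar parameter and single-callable converters were introduced in that release. The in-memory rows are hypothetical and only loosely modelled on the input file above.

```python
import numpy

# Hypothetical rows in which every field is enclosed in double quotes
quoted = ['"100000","6.4","4"', '"200000","6.1","6"']

# quotechar removes the surrounding quotes before conversion
# (assumes NumPy 1.23 or later)
plain = numpy.loadtxt(quoted, delimiter=",", quotechar='"')

# A single callable is applied to every column, not just one
# (assumes NumPy 1.23 or later)
rates = ["6.4,6.1", "6.5,6.6"]
doubled = numpy.loadtxt(rates, delimiter=",",
                        converters=lambda value: float(value) * 2)

print(plain)
print(doubled)
```

A dictionary remains the better fit when only some columns need conversion, as in the interest rate example; the single callable is convenient when every field is transformed the same way.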

 


Copyright 2024 © pythontic.com