Create a simple HDF5 dataset using h5py

Overview:

HDF5 is a hierarchical data model and file format for storing and describing large volumes of data. Typical sources of such data include sensors in a laboratory or factory, particle-physics experiments, and terrestrial and extraterrestrial observations.
An HDF5 file can contain as many datasets as needed and is organized like a UNIX file system. The root group can contain many groups, and each group can hold several datasets as well as further groups. Both datasets and groups can carry attributes that describe them.
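The file-system analogy can be made concrete with a minimal sketch. The file, group, dataset, and attribute names below are hypothetical and chosen only for illustration; the file is written to a temporary directory.

```python
import os
import tempfile

import h5py

# Hypothetical file, group, and dataset names, chosen for illustration
path = os.path.join(tempfile.mkdtemp(), "layout_demo.hdf5")

with h5py.File(path, "w") as f:
    # Groups nest like directories; intermediate groups are created as needed
    run = f.create_group("experiment/run1")

    # A dataset lives inside a group, like a file inside a directory
    temps = run.create_dataset("temperature", shape=(4,), dtype="f4")

    # Attributes attach small pieces of metadata to groups and datasets
    run.attrs["operator"] = "lab-a"
    temps.attrs["units"] = "kelvin"

    dataset_path = temps.name        # absolute path of the dataset in the file
    root_children = list(f.keys())   # names directly under the root group
    units = temps.attrs["units"]

print(dataset_path)
print(root_children)
print(units)
```

Every object is addressed by a slash-separated path starting at the root group, so the same dataset can also be reached as f["experiment/run1/temperature"].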
h5py is the Python module that wraps the HDF5 library (written in the C programming language) and makes the HDF5 data model available to Python programs.
The File class of the h5py module is used to create an HDF5 file.
When a file is created, the following can be specified: its name, the mode, the driver, the version of the HDF5 library format to use, the size of the user block added to the beginning of the file, single-writer/multiple-reader (SWMR) mode, chunk-cache properties (cache size, preemption policy, and number of chunk slots), and any keywords specific to the driver being used.
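As a rough sketch, several of these options map onto keyword arguments of h5py.File. The file name and the specific values below are assumptions picked for illustration, not recommendations, and the rdcc_* chunk-cache arguments assume a reasonably recent h5py release (2.9 or later).

```python
import os
import tempfile

import h5py

# Hypothetical file name; the option values are illustrative only
path = os.path.join(tempfile.mkdtemp(), "options_demo.hdf5")

f = h5py.File(
    path,
    "w",                          # mode: create or overwrite
    libver="latest",              # use the newest file-format features available
    userblock_size=512,           # user block: a power of two, at least 512 bytes
    rdcc_nbytes=4 * 1024 * 1024,  # chunk cache size in bytes
    rdcc_nslots=10007,            # number of chunk slots in the chunk cache
    rdcc_w0=0.75,                 # chunk preemption policy, between 0 and 1
)

user_block = f.userblock_size     # size of the user block, in bytes
f.close()

print(user_block)
```

The user block is a region at the start of the file that HDF5 leaves untouched, so an application can store its own bytes there.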
The file mode specifies the operations for which the file is opened. The modes are: read only on an existing file (r), read and write on an existing file (r+), create or overwrite (w), create only if no file with the same name exists (w- or x), and read and write on an existing file or create it otherwise (a).
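The modes can be tried one after another on a single file. This is a minimal sketch with a hypothetical file name and dataset names; it catches OSError for the exclusive-create case because the exact exception subclass can vary between h5py versions.

```python
import os
import tempfile

import h5py

# Hypothetical file name, created in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "modes_demo.hdf5")

# "w": create the file, truncating it if it already exists
with h5py.File(path, "w") as f:
    f.create_dataset("a", shape=(2,))

# "w-" (also spelled "x"): create, but fail if the file already exists
try:
    h5py.File(path, "w-")
    exclusive_create_failed = False
except OSError:
    exclusive_create_failed = True

# "r+": read and write; the file must already exist
with h5py.File(path, "r+") as f:
    f.create_dataset("b", shape=(2,))

# "a": read and write if the file exists, create it otherwise
with h5py.File(path, "a") as f:
    f.create_dataset("c", shape=(2,))

# "r": read only; the file must already exist
with h5py.File(path, "r") as f:
    names = sorted(f.keys())
    mode_reported = f.mode

print(exclusive_create_failed, names, mode_reported)
```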
A driver can also be specified when opening an HDF5 file. The driver determines the kind of storage used for the file. The storage can be held entirely in memory with an optional write-back to a disk file on closing (core), a regular disk file accessed with unbuffered I/O (sec2) or buffered I/O (stdio), or a series of fixed-length chunks on disk that together form one file (family).
sec2 is the default driver on UNIX-like systems. It stores the file on disk and uses unbuffered POSIX I/O.
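A small sketch of the in-memory case, assuming the core driver together with its backing_store keyword; the file name is hypothetical and only serves as an identifier, since nothing should be written to disk here.

```python
import os
import tempfile

import h5py

# Hypothetical path; with backing_store=False it is never written
path = os.path.join(tempfile.mkdtemp(), "in_memory.hdf5")

# driver="core" keeps the whole file in memory; backing_store=False
# discards the contents instead of saving them when the file is closed
f = h5py.File(path, "w", driver="core", backing_store=False)
f.create_dataset("scratch", shape=(3,), dtype="i4")
driver_name = f.driver
f.close()

exists_on_disk = os.path.exists(path)

print(driver_name)
print(exists_on_disk)
```

Leaving backing_store at its default of True would instead save the in-memory image to the given path on close, which suits files that are built quickly in memory and then kept.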
Once an HDF5 file has been created, datasets can be added to it with the create_dataset() method of the h5py.File class.
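A few common ways of calling create_dataset() are sketched here, assuming NumPy is installed alongside h5py. The dataset names, shapes, and chunk sizes are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, for illustration only
path = os.path.join(tempfile.mkdtemp(), "datasets_demo.hdf5")

with h5py.File(path, "w") as f:
    # Shape and dtype given explicitly; elements start at the fill value 0
    empty = f.create_dataset("empty", shape=(4, 4, 4), dtype="f4")

    # Shape and dtype taken from an existing NumPy array
    ramp = f.create_dataset("ramp", data=np.arange(8, dtype="i8"))

    # Chunked storage: the dataset is stored in 16 x 16 blocks
    chunked = f.create_dataset("chunked", shape=(64, 64), dtype="u1",
                               chunks=(16, 16))

    # NumPy-style slicing writes a whole block in one operation
    empty[0, :, :] = np.ones((4, 4), dtype="f4")

    total = float(empty[...].sum())
    ramp_values = ramp[2:5].tolist()
    chunk_shape = chunked.chunks

print(total, ramp_values, chunk_shape)
```

Writing with slices is usually much faster than assigning one element at a time, because each assignment to a dataset is a separate write through the HDF5 library.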

Example:

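# Example Python program that creates a volume dataset in HDF5 format
# using the Python module h5py
import random

import h5py


def getVal():
    # Return a random voxel value, either 0 or 1
    return random.randint(0, 1)


volumeFileName = "VolumeData.hdf5"

# "w" creates the file, overwriting any existing file with the same name
volumeFile = h5py.File(volumeFileName, "w")

# Create the first volume; the default dtype is 32-bit float
volumeShape = (10, 10, 10)
v1 = volumeFile.create_dataset("s1", shape=volumeShape)

# A second, larger volume could be created the same way
# volumeShape = (512, 512, 512)
# v2 = volumeFile.create_dataset("s2", shape=volumeShape)

# Fill volume 1 one element at a time
for x in range(v1.shape[0]):
    for y in range(v1.shape[1]):
        for z in range(v1.shape[2]):
            v1[x, y, z] = getVal()

print("Volume filling complete")
print("Driver:")
print(volumeFile.driver)

# Close the file so that all data is flushed to disk
volumeFile.close()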

Output:

Volume filling complete
Driver:
sec2
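To inspect a file after writing it, the file can be reopened in read-only mode. The sketch below is self-contained: it first writes a small hypothetical file with a dataset named s1, then reads back the file name, the name of the root group, the dataset shape, and one element.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file, written to a temporary directory for this sketch
path = os.path.join(tempfile.mkdtemp(), "readback_demo.hdf5")

# Write a 10 x 10 x 10 volume in which every element is 10.0
with h5py.File(path, "w") as f:
    f.create_dataset("s1", data=np.full((10, 10, 10), 10.0, dtype="f4"))

# Reopen read-only and inspect the contents
with h5py.File(path, "r") as f:
    file_name = f.filename       # path the file was opened with
    root_name = f.name           # the root group is always "/"
    s1 = f["s1"]
    shape = s1.shape
    corner = float(s1[0, 0, 0])  # read a single element

print("Name of the HDF5 file:", file_name)
print("Name of the root group:", root_name)
print("Shape of the dataset s1:", shape)
print(corner)
```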