Overview:
- NumPy is used in scientific applications where data sets are very large. Storing huge ndarray objects from scientific experiments or industrial applications involves several objectives: compact storage size; the ability to load only a portion of the data into memory (memory mapping); and a simple file format for which readers and writers can be written not only in Python but in any other programming language, including ones that emerge in the future. The save() function of the NumPy module addresses these considerations when writing an ndarray to a .npy file.
- The save() function of the NumPy module writes an ndarray into a file in binary format.
- The file consists of a header followed by the ndarray data. The header records the data type, the memory order (C or Fortran) and the shape of the array.
- If only a portion of a .npy file has to be loaded into memory, the load() function can be invoked with the mmap_mode parameter set to a mode string such as 'r', which memory-maps the file instead of reading all of it.
- Every .npy file has a single ndarray written to it. To write multiple ndarray objects the savez() function is used.
- The savez() and savez_compressed() functions write multiple ndarrays into a .npz file. savez() writes them uncompressed and savez_compressed() writes them compressed.
- When the ndarray consists of Python objects, the save() function uses Python's pickling functionality. For all other NumPy data types, the data written to the .npy file is the contiguous bytes of the array.
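The memory-mapping and multi-array points above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the array contents, the file names (a.npy, both.npz, both_compressed.npz) and the keyword names first and second are assumptions chosen for the example, and the files are written to a temporary directory.

```python
import os
import tempfile

import numpy as np

# Illustrative arrays; the values are arbitrary.
a = np.arange(12, dtype=np.int64).reshape(3, 4)
b = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    # One ndarray per .npy file.
    npy_path = os.path.join(tmp, "a.npy")
    np.save(npy_path, a)

    # mmap_mode takes a mode string ('r', 'r+', 'w+' or 'c'), not a boolean.
    mapped = np.load(npy_path, mmap_mode="r")
    is_memmap = isinstance(mapped, np.memmap)
    second_row = np.array(mapped[1])   # copy just this slice into memory
    del mapped                         # release the mapping before cleanup

    # Several ndarrays in one .npz archive, uncompressed and compressed.
    npz_path = os.path.join(tmp, "both.npz")
    np.savez(npz_path, first=a, second=b)
    npz_compressed_path = os.path.join(tmp, "both_compressed.npz")
    np.savez_compressed(npz_compressed_path, first=a, second=b)

    # Arrays in an archive are looked up by the keyword used when saving.
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        first_back = archive["first"]
    with np.load(npz_compressed_path) as archive:
        second_back = archive["second"]
```

With mmap_mode set, load() returns a numpy.memmap, so only the parts of the file that are actually indexed are read from disk.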
Example:
# Example Python program that writes an ndarray
# Create a 2-d array
# Write to file - the .npy extension is added by numpy.save()
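A minimal sketch of the steps these comments describe. The array values are taken from the output shown in this example; the variable name array2d, the explicit int64 dtype (chosen to match the '<i8' header) and the file name example are assumptions.

```python
import numpy as np

# Create a 2-d array (values taken from the output of this example)
array2d = np.array([[10, 11, 12],
                    [13, 14, 15],
                    [10, 11, 12]], dtype=np.int64)

# Write to file - numpy.save() appends the .npy extension if it is missing
np.save("example", array2d)   # hypothetical file name; creates example.npy

print(array2d)
```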
Output:
[[10 11 12]
 [13 14 15]
 [10 11 12]]
Contents of the .npy file:
934e 554d 5059 0100 7600 7b27 6465 7363
7227 3a20 273c 6938 272c 2027 666f 7274
7261 6e5f 6f72 6465 7227 3a20 4661 6c73
652c 2027 7368 6170 6527 3a20 2833 2c20
3329 2c20 7d20 2020 2020 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 200a
0a00 0000 0000 0000 0b00 0000 0000 0000
0c00 0000 0000 0000 0d00 0000 0000 0000
0e00 0000 0000 0000 0f00 0000 0000 0000
0a00 0000 0000 0000 0b00 0000 0000 0000
0c00 0000 0000 0000
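The .npy file contains a NumPy header followed by the array data. The first ten bytes are the magic string \x93NUMPY, the format version (01 00, that is 1.0) and the little-endian header length (76 00, that is 118 bytes). The rest of the header in the hex dump above translates to
{'descr': '<i8', 'fortran_order': False, 'shape': (3, 3), }
padded with spaces and terminated by a newline, followed by the data, nine little-endian 64-bit integers: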
0a00 0000 0000 0000 0b00 0000 0000 0000
0c00 0000 0000 0000 0d00 0000 0000 0000
0e00 0000 0000 0000 0f00 0000 0000 0000
0a00 0000 0000 0000 0b00 0000 0000 0000
0c00 0000 0000 0000
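The header-then-data layout can be checked by parsing a .npy file by hand. The sketch below assumes the version 1.0 layout seen in the dump (a 6-byte magic string, two version bytes and a 2-byte little-endian header length); the file name example.npy and all variable names are illustrative.

```python
import ast
import os
import struct
import tempfile

import numpy as np

arr = np.array([[10, 11, 12], [13, 14, 15], [10, 11, 12]], dtype="<i8")

# Write the array to a temporary file and read the raw bytes back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npy")
    np.save(path, arr)
    with open(path, "rb") as fp:
        raw = fp.read()

magic = raw[:6]                                   # b"\x93NUMPY"
major, minor = raw[6], raw[7]                     # format version
(header_len,) = struct.unpack("<H", raw[8:10])    # 2-byte length in version 1.0

# The header is a Python dict literal padded with spaces and a newline.
header_text = raw[10:10 + header_len].decode("latin1")
header = ast.literal_eval(header_text)

# Everything after the header is the raw, contiguous array data.
data = np.frombuffer(raw[10 + header_len:], dtype=header["descr"])
data = data.reshape(header["shape"])
```

Because the header is plain text and the payload is raw bytes, a reader in any other language only needs the same three steps: skip the magic and version, read the header length, then interpret the remaining bytes using descr and shape.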
Example 2 - Write a NumPy ndarray of Python objects into a file:
To store Python objects, numpy.save() uses Python's pickling mechanism. The allow_pickle parameter of save() defaults to True. Reading such a file back with numpy.load() requires passing allow_pickle=True explicitly, because load() defaults to False.
# Example Python program that writes an ndarray of Python objects
# Define a Python class
# Create an ndarray of one thousand Python objects (shape 10 x 10 x 10)
# Write the ndarray to file using numpy
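A minimal sketch of the steps these comments describe. The class name Point3D, the attribute names myX, myY and myZ, and the shape (10, 10, 10) come from the file contents shown in this example; the constructor signature, the coordinate values and the file name points are assumptions.

```python
import numpy as np

# Define a Python class (name and attributes as they appear in the file contents)
class Point3D:
    def __init__(self, x, y, z):
        self.myX = x
        self.myY = y
        self.myZ = z

# Create a 10 x 10 x 10 ndarray holding one thousand Python objects
points = np.empty((10, 10, 10), dtype=object)
for index in np.ndindex(points.shape):
    points[index] = Point3D(0, 32, 100)   # coordinate values are assumptions

# Write the ndarray to file using numpy; object arrays are pickled
np.save("points", points, allow_pickle=True)   # hypothetical file name

# Reading the file back requires opting in to unpickling
restored = np.load("points.npy", allow_pickle=True)
```

Unpickling needs the Point3D class to be importable under the same module name it had when the file was written, so a file like this is far less portable than a .npy file of plain numeric data.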
File Contents:
<93>NUMPY^A^@v^@{'descr': '|O', 'fortran_order': False, 'shape': (10, 10, 10), } <80>^D<95><9f>V^@^@^@^@^@^@<8c>^Vnumpy._core.multiarray<94><8c>^L_reconstruct<94><93><94><8c>^Enumpy<94><8c>^Gndarray<94><93><94>K^@<85><94>C^Ab<94><87><94>R<94>(K^AK K K <87><94>h^C<8c>^Edtype<94><93><94><8c>^BO8<94><89><88><87><94>R<94>(K^C<8c>^A|<94>NNNJÿÿÿÿJÿÿÿÿK?t<94>b<89>]<94>(<8c>^H__main__<94><8c>^GPoint3D<94><93><94>)<81><94>}<94>(<8c>^CmyX<94>K^@<8c>^CmyY<94>K <8c>^CmyZ<94>Kdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK . . .
h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubh^U)<81><94>}<94>(h^XK^@h^YK h^ZKdubet<94>b