
PyTables, HDF5, and bzip2

21 Apr

If you created an HDF5 file full of tables using PyTables with a compression filter where complib='bzip2', and you now want to open it from a language other than Python, forget it (or at least be prepared to do a LOT of work). You’ll see something like this:

HDF5-DIAG: Error detected in HDF5 (1.8.4-patch1) thread 139647180633920:
  #000: ../../../src/H5Dio.c line 174 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: ../../../src/H5Dio.c line 404 in H5D_read(): can't read data
    major: Dataset
    minor: Read failed
  #002: ../../../src/H5Dchunk.c line 1733 in H5D_chunk_read(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #003: ../../../src/H5Dchunk.c line 2742 in H5D_chunk_lock(): data pipeline read failed
    major: Data filters
    minor: Filter operation failed
  #004: ../../../src/H5Z.c line 996 in H5Z_pipeline(): required filter is not registered
    major: Data filters
    minor: Read failed
Failed table read.

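The last frame of the trace ("required filter is not registered") is the real cause: the HDF5 library reading the file has no bzip2 filter. This is documented on the PyTables web site, under Optimization Tips (http://pytables.github.com/usersguide/optimization.html):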

Be aware that the LZO and bzip2 support in PyTables is not standard on HDF5, so if you are going to use your PyTables files in other contexts different from PyTables you will not be able to read them. Still, see the ptrepack (where the ptrepack utility is described) to find a way to free your files from LZO or bzip2 dependencies, so that you can use these compressors locally with the warranty that you can replace them with Zlib (or even remove compression completely) if you want to use these files with other HDF5 tools or platforms afterwards.
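As a rough illustration of the ptrepack route the documentation mentions, here is a minimal Python sketch that builds and runs a ptrepack command to copy a whole file with zlib in place of bzip2. It assumes ptrepack is on the PATH of an environment whose PyTables can still read bzip2 (otherwise the source file cannot be decoded). The helper names, the file names, and the compression level of 5 are illustrative assumptions, not anything the documentation prescribes.

```python
import shlex
import subprocess


def ptrepack_command(src, dst, complib="zlib", complevel=5):
    """Build the argv that repacks every node of `src` into `dst`.

    `src` and `dst` are file paths; the trailing ":/" asks ptrepack to
    copy from the root group of the source to the root group of the
    destination. The defaults here are illustrative, not recommendations.
    """
    return [
        "ptrepack",
        f"--complib={complib}",
        f"--complevel={complevel}",
        f"{src}:/",
        f"{dst}:/",
    ]


def repack(src, dst, **kwargs):
    """Run ptrepack and raise CalledProcessError if it fails."""
    cmd = ptrepack_command(src, dst, **kwargs)
    print("running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    repack("data_bzip2.h5", "data_zlib.h5")
```

The copy has to be made on the Python side, since that is the only place the bzip2 filter is available; the resulting zlib file should then be readable by other HDF5 tools.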

So if you’re like me, and you were thrilled to get a bonus on data compression, rushed to use bzip2, and are now wondering why you can’t do research (in a language other than Python) on your precious data, now you know.

(I simply reverted to zlib and the problems went away, but that meant rebuilding my datasets.)
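For anyone rebuilding datasets the same way, the following is a minimal sketch of writing a table with zlib compression, assuming PyTables 3.x and NumPy are installed. The helper name write_table, the node name "data", and the compression level are hypothetical choices made for the example; this is not the code originally used for these datasets.

```python
import numpy as np
import tables

# zlib is the compressor stock HDF5 builds normally include, so files
# written with it should not need a PyTables-specific filter to read.
# complevel=5 is an arbitrary middle-of-the-road choice.
ZLIB = tables.Filters(complib="zlib", complevel=5, shuffle=True)


def write_table(path, records, name="data"):
    """Write a NumPy structured array to `path` as a zlib-compressed table.

    Returns the name of the compression library recorded on the table,
    so the caller can confirm which filter was applied.
    """
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", name, obj=records, filters=ZLIB)
        return table.filters.complib


if __name__ == "__main__":
    # Hypothetical two-column record layout, for illustration only.
    records = np.zeros(1000, dtype=[("timestamp", "f8"), ("value", "i4")])
    print(write_table("example_zlib.h5", records))
```

Passing the filters explicitly to create_table, rather than relying on a file-wide default, keeps the compression choice visible at the point where each table is created.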

 

Posted on April 21, 2012 in Uncategorized

 
