John Papp

Reputation: 1

HDF5: Compound Data Type with Variable Length String to Packet Table

I've successfully exported a fixed-length compound data type to a packet table using the following code:

typedef struct moddata_t {
    char cLog[4096];
} moddata_t;
//  
//  Attempt to open the revision history packet table
hid_t hidPTableID = H5PTopen(hidID, "Revision History");
//
//  Create copy of native character type
hid_t hidCharLen4096TypeID = H5Tcopy(H5T_C_S1);
//
//  Set size of character type
H5Tset_size(hidCharLen4096TypeID, 4096);
//
//  Create memory data type for compound data
hid_t hidModDataTypeID = H5Tcreate(H5T_COMPOUND, sizeof(moddata_t));
H5Tinsert(hidModDataTypeID, "log", HOFFSET(moddata_t, cLog), hidCharLen4096TypeID);
//
//  Create fixed length packet table
hidPTableID = H5PTcreate(hidID, "Revision History", hidModDataTypeID, 1, H5P_DEFAULT);
//
//  Free resources
H5Tclose(hidModDataTypeID);
H5Tclose(hidCharLen4096TypeID);
//
//  Fill data type
//    NOTE:  get() function returns a string with miscellaneous info to be exported to packet table
moddata_t modDat;
strcpy(modDat.cLog, get().c_str());
//
//  Append data to packet table
herr_t herrErr = H5PTappend(hidPTableID, 1, &modDat);
//
//  Close packet table
H5PTclose(hidPTableID);

However, if I switch to a variable-length string, I get a segfault somewhere inside HDF5 when H5PTappend is called. Unfortunately, there aren't many examples that use a packet table. Here is the modified code that fails:

typedef struct moddata_t {
    hvl_t cLogHandle;
} moddata_t;
//  
//  Attempt to open the revision history packet table
hid_t hidPTableID = H5PTopen(hidID, "Revision History");
//
//  Create copy of native character type
hid_t hidCharLenVarTypeID = H5Tcopy(H5T_C_S1);
//
//  Set size of character type
H5Tset_size(hidCharLenVarTypeID, H5T_VARIABLE);
//
//  Create memory data type for compound data
hid_t hidModDataTypeID = H5Tcreate(H5T_COMPOUND, sizeof(moddata_t));
H5Tinsert(hidModDataTypeID, "log", HOFFSET(moddata_t, cLogHandle), hidCharLenVarTypeID);
//
//  Create fixed length packet table
hidPTableID = H5PTcreate(hidID, "Revision History", hidModDataTypeID, 1, H5P_DEFAULT);
//
//  Free resources
H5Tclose(hidModDataTypeID);
H5Tclose(hidCharLenVarTypeID);
//
//  Fill data type
//    NOTE:  get() function returns a string with miscellaneous info to be exported to packet table
moddata_t modDat;
modDat.cLogHandle.len = get().length() + 1;   // Added one for \0 character;
modDat.cLogHandle.p = new char [get().length()+1];
strcpy((char *) modDat.cLogHandle.p, get().c_str());
//
//  Append data to packet table
herr_t herrErr = H5PTappend(hidPTableID, 1, &modDat);
//
//  Close packet table
H5PTclose(hidPTableID);

Piecing together what others have done to declare compound datatypes with variable-length elements, I think I'm creating the datatype correctly. However, I can find little documentation on the hvl_t structure, so I'm not sure I'm setting its len and p fields correctly, and that may be why it segfaults.
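For readers unsure about the same thing: the HDF5 headers declare hvl_t as a two-field struct, a size_t element count (len) followed by a void * to the first element (p), and it is the in-memory form of types built with H5Tvlen_create. The following is a minimal sketch of filling and releasing such a record. It uses a local stand-in struct so that it compiles without HDF5. The names vl_sketch_t, moddata_sketch_t, make_record, as_text and free_record are hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in with the same two fields as hvl_t, declared locally so this
// sketch compiles without hdf5.h. Real code should use the library's hvl_t.
struct vl_sketch_t {
    std::size_t len;  // number of base-type elements (here: chars)
    void *p;          // pointer to the first element
};

// Hypothetical record, shaped like moddata_t in the question.
struct moddata_sketch_t {
    vl_sketch_t cLogHandle;
};

// Copy `text` plus its terminating '\0' into a freshly allocated buffer.
// The caller owns the buffer and must release it with free_record().
moddata_sketch_t make_record(const std::string &text) {
    moddata_sketch_t rec;
    rec.cLogHandle.len = text.size() + 1;
    char *buf = new char[rec.cLogHandle.len];
    std::memcpy(buf, text.c_str(), rec.cLogHandle.len);
    rec.cLogHandle.p = buf;
    return rec;
}

// Read the stored characters back, dropping the trailing '\0'.
std::string as_text(const moddata_sketch_t &rec) {
    assert(rec.cLogHandle.len > 0 && rec.cLogHandle.p != nullptr);
    return std::string(static_cast<const char *>(rec.cLogHandle.p),
                       rec.cLogHandle.len - 1);
}

// Release the buffer allocated by make_record() and reset the handle.
void free_record(moddata_sketch_t &rec) {
    delete[] static_cast<char *>(rec.cLogHandle.p);
    rec.cLogHandle.p = nullptr;
    rec.cLogHandle.len = 0;
}
```

Building the record from a single std::string also avoids calling get() three times, which matters if its result can change between calls.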

Any help is appreciated.

Thanks

Upvotes: 0

Views: 718

Answers (2)

John Papp

Reputation: 1

Answering my own question.

If I remove:

 //
 //  Create copy of native character type
 hid_t hidCharLenVarTypeID = H5Tcopy(H5T_C_S1);
 //
 //  Set size of character type
 H5Tset_size(hidCharLenVarTypeID, H5T_VARIABLE);

and replace with:

 hid_t hidCharLenVarTypeID = H5Tvlen_create(H5T_C_S1);

Then it seems to work. However, the reference manual at https://support.hdfgroup.org/HDF5/doc/RM/RM_H5T.html#CreateVLString says:

Creating variable-length string datatypes

As the term implies, variable-length strings are strings of varying lengths; they can be arbitrarily long, anywhere from 1 character to thousands of characters.

HDF5 provides the ability to create a variable-length string datatype. Like all string datatypes, this type is based on the atomic string datatype: H5T_C_S1 in C or H5T_FORTRAN_S1 in Fortran. While these datatypes default to one character in size, they can be resized to specific fixed lengths or to variable length.

Variable-length strings will transparently accommodate ASCII strings or UTF-8 strings. This characteristic is set with H5Tset_cset in the process of creating the datatype.

The following HDF5 calls create a C-style variable-length string datatype, vls_type_c_id:

vls_type_c_id = H5Tcopy(H5T_C_S1)  
status = H5Tset_size(vls_type_c_id, H5T_VARIABLE)  

In a C environment, variable-length strings will always be NULL-terminated, so the buffer to hold such a string must be one byte larger than the string itself to accommodate the NULL terminator.

So there seems to be some confusion about how to declare a variable-length string.
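One reading that would reconcile the two, offered as an interpretation rather than something confirmed in this thread: both declarations are valid, but they describe different types. A type resized with H5T_VARIABLE is a variable-length string, whose in-memory element is a single char * to a NUL-terminated buffer (consistent with the manual's remark about the NULL terminator). H5Tvlen_create(H5T_C_S1) instead builds a variable-length sequence of one-byte characters, whose in-memory element is an hvl_t. If that is right, the failing code most likely crashed because the library treated the leading bytes of the hvl_t (the length) as a string pointer. The sketch below uses hypothetical names and a local stand-in for hvl_t so that it compiles without HDF5. It shows the record shape that would match the H5T_VARIABLE declaration and what a reader expecting a char * would find in an hvl_t-shaped record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical record for a "log" member declared with
// H5Tcopy(H5T_C_S1) + H5Tset_size(..., H5T_VARIABLE).
// Assumption of this sketch: the in-memory element is one char pointer.
struct moddata_str_sketch_t {
    char *cLog;
};

// Local stand-in with the same two fields as hvl_t.
struct vl_shape_sketch_t {
    std::size_t len;
    void *p;
};

// Build a record that owns a NUL-terminated copy of `text`.
moddata_str_sketch_t make_string_record(const std::string &text) {
    // A NUL-terminated string cannot carry embedded '\0' characters.
    assert(text.find('\0') == std::string::npos);
    moddata_str_sketch_t rec;
    rec.cLog = new char[text.size() + 1];
    std::memcpy(rec.cLog, text.c_str(), text.size() + 1);
    return rec;
}

// Release the buffer allocated by make_string_record().
void free_string_record(moddata_str_sketch_t &rec) {
    delete[] rec.cLog;
    rec.cLog = nullptr;
}

// Return the leading pointer-sized bytes of an hvl_t-shaped value as an
// integer: the "address" that code expecting a char * at offset 0 would use.
std::uintptr_t leading_bytes_as_address(const vl_shape_sketch_t &v) {
    static_assert(sizeof(std::size_t) == sizeof(std::uintptr_t),
                  "sketch assumes size_t and pointers have the same width");
    std::uintptr_t bits = 0;
    std::memcpy(&bits, &v.len, sizeof(bits));
    return bits;
}
```

With a record of this shape, the compound member would be inserted at HOFFSET(moddata_str_sketch_t, cLog) and the address of the record passed to H5PTappend as before. For a length of 6, leading_bytes_as_address returns 6, which is not a usable string address.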

John

Upvotes: 0

SOG

Reputation: 912

You are probably better off using a higher-level API such as HDFql, which handles HDF5 compound datasets and attributes in a simple and intuitive way (from a developer's point of view). Take a look at the HDFql reference manual, which has many examples of how to create, read and write compound datasets and attributes in C (as well as in other languages such as C++, Java, Python, C#, R and Fortran).

Upvotes: 0
