
3.5 Writing Data to an SDS

An SDS can be written partially or entirely. Partial writing includes writing to a contiguous region of the SDS and writing to selected locations in the SDS according to patterns defined by the user. This section describes the routine SDwritedata and how it can write data to part of an SDS or to an entire SDS. The section also illustrates the concepts of compressing SDSs and using external files to store scientific data.

3.5.1 Writing Data to an SDS Array: SDwritedata

SDwritedata can completely or partially fill an SDS array or append data along the dimension that is defined to be of unlimited length (see Section 3.5.1.3 on page 31 for a discussion of unlimited-length dimensions). It can also skip a specified number of SDS array elements between write operations along each dimension.

To write to an existing SDS, the calling program must contain the following sequence of routine calls:

C:		sds_id = SDselect(sd_id, sds_index);
		status = SDwritedata(sds_id, start, stride, edges, data);
FORTRAN:	sds_id = sfselect(sd_id, sds_index)
		status = sfwdata(sds_id, start, stride, edges, data)	
	OR	status = sfwcdata(sds_id, start, stride, edges, data)
To write to a new SDS, simply replace the call SDselect with the call SDcreate, which is described in Section 3.4.1 on page 25.

SDwritedata takes five arguments: sds_id, start, stride, edges, and data. The argument sds_id is the data set identifier returned by SDcreate or SDselect.

Before proceeding with the description of the remaining arguments, an explanation of the term hyperslab (or slab, as it will be used in this chapter) is in order. A slab is a group of SDS array elements that are stored in consecutive locations. It can be of any size and dimensionality as long as it is a subset of the array, which means that a single array element and the entire array can both be considered slabs. A slab is defined by the multidimensional coordinate of its initial vertex and the length of each of its dimensions.

Given this description of the slab concept, the usage of the remaining arguments should become apparent. The argument start is a one-dimensional array specifying the location in the SDS array at which the write operation will begin. The elements of start are zero-based in both the C and FORTRAN-77 interfaces. The size of start must be the same as the number of dimensions in the SDS array. In addition, each value in start must be smaller than its corresponding SDS array dimension unless the dimension is unlimited. Violating any of these conditions causes SDwritedata to return FAIL.

The argument stride is a one-dimensional array specifying, for each dimension, the interval between values to be written. For example, setting the first element of the array stride equal to 1 writes data to every location along the first dimension. Setting the first element of the array stride to 2 writes data to every other location along the first dimension. Figure 3b illustrates this example, where the shaded elements are written and the white elements are skipped. If the argument stride is set to NULL in C (or either 0 or 1 in FORTRAN-77), SDwritedata operates as if every element of stride contains a value of 1, and a contiguous write is performed. For better performance, it is recommended that stride be passed as NULL (0 or 1 in FORTRAN-77) rather than as an array whose elements are all set to 1.

The size of the array stride must be the same as the number of dimensions in the SDS array. Also, each value in stride must be smaller than or equal to its corresponding SDS array dimension unless the dimension is unlimited. Violating any of these conditions causes SDwritedata to return FAIL.
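The effect of stride along a single dimension can be sketched without the HDF library. The helper below, strided_write_1d, is hypothetical and only models the access pattern on an in-memory array; int32_t stands in for the HDF int32 type, and the bounds check is an assumption of this sketch rather than the library's own rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the HDF library. Copies `count` values
 * from `src` into `dst`, beginning at index `start` and advancing by
 * `stride` positions between writes. Returns 0 on success, or -1 if the
 * arguments are invalid or the pattern would run past the end of `dst`. */
int strided_write_1d(int32_t *dst, int32_t dst_len,
                     int32_t start, int32_t stride, int32_t count,
                     const int32_t *src)
{
    assert(dst != NULL && src != NULL);
    if (start < 0 || stride < 1 || count < 0)
        return -1;
    /* Index of the last element touched must stay inside dst. */
    if (count > 0 && start + (int64_t)(count - 1) * stride >= dst_len)
        return -1;
    for (int32_t i = 0; i < count; i++)
        dst[start + i * stride] = src[i];
    return 0;
}
```

With a stride of 2 and a start of 0, positions 0, 2, 4, and so on are written while the odd positions are left untouched, which is the pattern the figure describes.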

FIGURE 3b - An Example of Access Pattern ("Strides")

The argument edges is a one-dimensional array specifying the length of each dimension of the slab to be written. If the slab has fewer dimensions than the SDS data set has, the size of edges must still be equal to the number of dimensions in the SDS array and all the elements corresponding to the additional dimensions must be set to 1.

Each value in the array edges must not be larger than the length of the corresponding dimension in the SDS data set unless the dimension is unlimited. Attempting to write slabs larger than the size of the SDS data set will result in an error condition.

In addition, the sum of each value in the array edges and the corresponding value in the start array must be smaller than or equal to its corresponding SDS array dimension unless the dimension is unlimited. Violating any of these conditions causes SDwritedata to return FAIL.
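The conditions on start, stride, and edges can be collected into a single check. The following is a minimal sketch that applies the rules literally as they are stated in this section; check_slab and UNLIMITED are hypothetical names, the rejection of negative values is an added assumption, and the library's own validation may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNLIMITED 0  /* stand-in for SD_UNLIMITED, which the text gives as 0 */

/* Hypothetical checker, not an HDF routine. Returns 0 if the arguments
 * satisfy the stated rules and -1 otherwise, mirroring SUCCEED and FAIL.
 * A NULL stride is treated as a contiguous write. */
int check_slab(int32_t rank, const int32_t *dims, const int32_t *start,
               const int32_t *stride, const int32_t *edges)
{
    assert(dims != NULL && start != NULL && edges != NULL);
    for (int32_t i = 0; i < rank; i++) {
        /* Assumption of this sketch: negative or empty requests are invalid. */
        if (start[i] < 0 || edges[i] < 1)
            return -1;
        if (stride != NULL && stride[i] < 1)
            return -1;
        if (dims[i] == UNLIMITED)
            continue;  /* size limits are waived for an unlimited dimension */
        if (start[i] >= dims[i])
            return -1;  /* start must be smaller than the dimension */
        if (stride != NULL && stride[i] > dims[i])
            return -1;  /* stride must not exceed the dimension */
        if (edges[i] > dims[i])
            return -1;  /* slab must not be longer than the dimension */
        if (start[i] + edges[i] > dims[i])
            return -1;  /* slab must end inside the dimension */
    }
    return 0;
}
```

Running such a check before calling SDwritedata makes it easier to report which argument is at fault, since the library itself only returns FAIL.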

The parameter data contains the SDS data to be written. If the SDS array is smaller than the buffer data, the amount of data written will be limited to the maximum size of the SDS array.

Be aware that the mapping between the dimensions of a slab and the order in which the slab values are stored in memory is different between C and FORTRAN-77. In C, the values are stored with the assumption that the last dimension of the slab varies fastest (or "row-major order" storage), but in FORTRAN-77 the first dimension varies fastest (or "column-major order" storage). These storage order conventions can cause some confusion when data written by a C program is read by a FORTRAN-77 program or vice versa.
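The two storage conventions can be compared by computing the linear offset of an element under each. The functions below are an illustrative sketch with hypothetical names; they use no HDF calls and assume zero-based indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row-major (C) order: the last index varies fastest. */
int64_t offset_row_major(int32_t rank, const int32_t *dims, const int32_t *idx)
{
    assert(dims != NULL && idx != NULL);
    int64_t off = 0;
    for (int32_t i = 0; i < rank; i++)
        off = off * dims[i] + idx[i];
    return off;
}

/* Column-major (FORTRAN-77) order: the first index varies fastest. */
int64_t offset_col_major(int32_t rank, const int32_t *dims, const int32_t *idx)
{
    assert(dims != NULL && idx != NULL);
    int64_t off = 0;
    for (int32_t i = rank - 1; i >= 0; i--)
        off = off * dims[i] + idx[i];
    return off;
}
```

An element at index (1, 2) of a 3 x 4 array in C order has the same linear offset as the element at index (2, 1) of a 4 x 3 array in FORTRAN-77 order, which is why the dimension lists appear reversed between the two interfaces.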

There are two FORTRAN-77 versions of this routine: sfwdata and sfwcdata. The routine sfwdata writes numeric scientific data and sfwcdata writes character scientific data.

SDwritedata returns either a value of SUCCEED (or 0) or FAIL (or -1). The parameters of this routine are described in Table 3D.

TABLE 3D - SDwritedata Parameter List

Routine Name [Return Type] (FORTRAN-77)  Parameter   C Type    FORTRAN-77 Type                Description
---------------------------------------  ----------  --------  -----------------------------  -----------
SDwritedata [intn] (sfwdata/sfwcdata)    sds_id      int32     integer                        Data set identifier
                                         start       int32 []  integer(*)                     Array containing the position at which the write will start for each dimension
                                         stride      int32 []  integer(*)                     Array specifying the interval between the values that will be written along each dimension
                                         edges       int32 []  integer(*)                     Array containing the number of data elements that will be written along each dimension
                                         data        VOIDP     <valid numeric data type>(*)/  Buffer for the data to be written
                                                               character*(*)

3.5.1.1 Filling an Entire Array

Filling an array is a simple slab operation in which the slab begins at the origin of the SDS array and covers every location in the array. SDwritedata fills an entire SDS array when all elements of the array start are set to 0, each element of the array edges is equal to the length of the corresponding dimension, and a contiguous write is requested: either the argument stride is set to NULL in C, or each element of the array stride is set to 1 (in C or FORTRAN-77).
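As a small illustration, the start and edges arrays for a whole-array write can be derived directly from the dimension sizes. The helper name whole_array_slab is hypothetical, and the sketch assumes no dimension is unlimited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an HDF routine. Fills `start` and `edges` so
 * that they describe a slab covering the entire array, and returns the
 * number of elements the data buffer must hold. The result would be
 * passed as SDwritedata(sds_id, start, NULL, edges, data). */
int64_t whole_array_slab(int32_t rank, const int32_t *dims,
                         int32_t *start, int32_t *edges)
{
    assert(dims != NULL && start != NULL && edges != NULL);
    int64_t count = 1;
    for (int32_t i = 0; i < rank; i++) {
        start[i] = 0;        /* begin at the origin */
        edges[i] = dims[i];  /* cover the full length of each dimension */
        count *= dims[i];
    }
    return count;
}
```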

EXAMPLE 2. Writing to an SDS.

This example illustrates the use of the routines SDselect/sfselect and SDwritedata/sfwdata to select the first SDS in the file SDS.hdf created in Example 1 and to write actual data to it.

C version

FORTRAN-77 version

3.5.1.2 Writing Slabs to an SDS Array

To allow preexisting data to be modified, the HDF library does not prevent SDwritedata from overwriting one slab with another. As a result, the calling program is responsible for managing any overlap when writing slabs. The HDF library will issue an error if a slab extends past the valid boundaries of the SDS array. However, appending data along an unlimited dimension is allowed.

EXAMPLE 3. Writing a Slab of Data to an SDS.

This example shows how to fill a 3-dimensional SDS array with data by writing a series of 2-dimensional slabs to it.
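The slab parameters involved can be sketched on an in-memory array, independently of the HDF library. Both helpers below are hypothetical: plane_slab builds the start and edges arrays for one 2-dimensional slab, and write_slab_3d is a stand-in for a contiguous slab write in C storage order, not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Describes the slab for plane number `plane` of a 3-D array whose first
 * dimension counts the planes: one element thick, full extent elsewhere. */
void plane_slab(int32_t plane, const int32_t dims[3],
                int32_t start[3], int32_t edges[3])
{
    assert(dims != NULL && start != NULL && edges != NULL);
    start[0] = plane;  start[1] = 0;        start[2] = 0;
    edges[0] = 1;      edges[1] = dims[1];  edges[2] = dims[2];
}

/* In-memory stand-in for a contiguous slab write (row-major order). */
void write_slab_3d(int32_t *array, const int32_t dims[3],
                   const int32_t start[3], const int32_t edges[3],
                   const int32_t *data)
{
    assert(array != NULL && data != NULL);
    int64_t n = 0;
    for (int32_t i = 0; i < edges[0]; i++)
        for (int32_t j = 0; j < edges[1]; j++)
            for (int32_t k = 0; k < edges[2]; k++) {
                int64_t off = ((int64_t)(start[0] + i) * dims[1]
                               + (start[1] + j)) * dims[2] + (start[2] + k);
                array[off] = data[n++];
            }
}
```

Calling plane_slab once per plane and passing the resulting start and edges to the write routine fills the 3-dimensional array one 2-dimensional slab at a time.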

C version

FORTRAN-77 version

EXAMPLE 4. Altering Values within an SDS Array.

This example demonstrates how the routine SDwritedata can be used to alter the values of the elements in the 10th and 11th rows, at the 2nd column, of the SDS array created in Example 1 and written in Example 2. The FORTRAN-77 routine sfwdata is used to alter the elements in the 2nd row, 10th and 11th columns, to reflect the difference between C and FORTRAN-77 internal storage order.
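The relationship between the C and FORTRAN-77 arguments in this example amounts to reversing the order of the dimensions. The sketch below uses a hypothetical helper, reverse_dims, and assumes the zero-based start values implied by the description: start = {9, 1} and edges = {2, 1} on the C side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an HDF routine. Reverses a per-dimension array
 * so that values given in C dimension order can be used in FORTRAN-77
 * dimension order, or the other way around. */
void reverse_dims(int32_t rank, const int32_t *in, int32_t *out)
{
    assert(in != NULL && out != NULL && in != out);
    for (int32_t i = 0; i < rank; i++)
        out[i] = in[rank - 1 - i];
}
```

Applied to the C values above, this yields start = (1, 9) and edges = (1, 2), that is, the 2nd row and the 10th and 11th columns on the FORTRAN-77 side.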

C version

FORTRAN-77 version

3.5.1.3 Appending Data to an SDS Array along an Unlimited Dimension

An SDS array can be made appendable, but only along one dimension. This dimension must be specified as appendable when the data set is created.

In C, only the first element of the SDcreate parameter dim_sizes (i.e., the dimension of the lowest rank or the slowest-changing dimension) can be assigned the value SD_UNLIMITED (or 0) to make the first dimension unlimited. In FORTRAN-77, only the last dimension (i.e., the dimension of the highest rank or the slowest-changing dimension) can be unlimited. In other words, in FORTRAN-77 dim_sizes(rank) must be set to the value SD_UNLIMITED to make the last dimension appendable.

To append data to a data set without overwriting previously-written data, the user must specify the appropriate coordinates in the start parameter of the SDwritedata routine. For example, if 15 data elements have been written to an unlimited dimension, appending data to the array requires a start coordinate of 15. Specifying a starting coordinate less than the current number of elements written to the unlimited dimension will result in data being overwritten. In either case, all of the coordinates in the start array except the one corresponding to the unlimited dimension must be less than the lengths of their corresponding dimensions, as required for any write.

Any time an unlimited dimension is appended to, the HDF library will automatically adjust the dimension record to the new length. If the newly-appended data begins beyond the previous length of the dimension, the locations between the old data and the beginning of the newly-appended data are initialized to the assigned fill value if there is one defined by the user, or the default fill value if none is defined. Refer to Section 3.10.5 on page 58 for a discussion of fill value.
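The bookkeeping described above can be modeled with a few lines of arithmetic. This is a sketch of the stated behavior only; append_result and append_along_unlimited are hypothetical names, and the counts are measured in positions along the unlimited dimension.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t new_length;  /* length of the unlimited dimension after the write */
    int32_t fill_count;  /* positions between old data and new data that
                            receive the fill value */
} append_result;

/* Hypothetical model, not an HDF routine. `cur_len` is the number of
 * positions already written along the unlimited dimension; the write
 * covers `count` positions beginning at `start`. */
append_result append_along_unlimited(int32_t cur_len, int32_t start,
                                     int32_t count)
{
    assert(cur_len >= 0 && start >= 0 && count >= 0);
    append_result r;
    int32_t end = start + count;
    r.fill_count = (start > cur_len) ? start - cur_len : 0;
    r.new_length = (end > cur_len) ? end : cur_len;
    return r;
}
```

With 15 positions already written, a write of 5 positions at start 15 extends the dimension to 20 with nothing filled, while a write of 2 positions at start 18 also extends it to 20 but leaves 3 positions to be filled.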

3.5.1.4 Determining whether an SDS Array is Appendable: SDisrecord

SDisrecord determines whether the data set identified by the parameter sds_id is appendable, which means that the slowest-changing dimension of the SDS array is declared unlimited when the data set is created. The syntax of SDisrecord is as follows:

C:		status = SDisrecord(sds_id);
FORTRAN:	status = sfisrcrd(sds_id)
SDisrecord returns TRUE (or 1) when the data set specified by sds_id is appendable and FALSE (or 0) otherwise. The parameter of this routine is defined in Table 3E.

TABLE 3E - SDisrecord Parameter List

Routine Name [Return Type] (FORTRAN-77)  Parameter   C Type  FORTRAN-77 Type  Description
---------------------------------------  ----------  ------  ---------------  -----------
SDisrecord [int32] (sfisrcrd)            sds_id      int32   integer          Data set identifier

3.5.1.5 Setting the Block Size: SDsetblocksize

SDsetblocksize sets the size of the blocks used for storing the data for unlimited dimension data sets. This is used only when creating new data sets; it does not have any effect on existing data sets. The syntax of this routine is as follows:

C:		status = SDsetblocksize(sds_id, block_size);
FORTRAN:	status = sfsblsz(sds_id, block_size)
SDsetblocksize must be called after SDcreate or SDselect and before SDwritedata. The parameter block_size should be set to a multiple of the desired buffer size.

SDsetblocksize returns a value of SUCCEED (or 0) or FAIL (or -1). Its parameters are further described in Table 3F.

TABLE 3F - SDsetblocksize Parameter List

Routine Name [Return Type] (FORTRAN-77)  Parameter   C Type  FORTRAN-77 Type  Description
---------------------------------------  ----------  ------  ---------------  -----------
SDsetblocksize [intn] (sfsblsz)          sds_id      int32   integer          Data set identifier
                                         block_size  int32   integer          Block size

EXAMPLE 5. Appending Data to an SDS Array with an Unlimited Dimension.

This example creates a 10x10 SDS array with one unlimited dimension and writes data to it. The file is reopened and the routine SDisrecord/sfisrcrd is used to determine whether the selected SDS array is appendable. Then new data is appended, starting at the 11th row.

C version

FORTRAN-77 version

3.5.2 Compressing SDS Data: SDsetcompress

The SDsetcompress routine compresses an existing data set or creates a new compressed data set. It is a simplified interface to the HCcreate routine, and should be used instead of HCcreate unless the user is familiar with the lower-level routines.

The compression algorithms currently supported by SDsetcompress are run-length encoding (RLE), Skipping Huffman, and GZIP deflation.

In the future, the following algorithms may be included: Lempel/Ziv-78 dictionary coding, an arithmetic coder, and a faster Huffman algorithm.

The syntax of the routine SDsetcompress is as follows:

C:		status = SDsetcompress(sds_id, comp_type, &c_info);
FORTRAN:	status = sfscompress(sds_id, comp_type, comp_prm)
The parameter comp_type specifies the compression type definition and is set to COMP_CODE_RLE (or 1) for run-length encoding (RLE), COMP_CODE_SKPHUFF (or 3) for Skipping Huffman, COMP_CODE_DEFLATE (or 4) for GZIP compression, or COMP_CODE_NONE (or 0) for no compression.

Compression information is specified by the parameter c_info in C, and by the parameter comp_prm in FORTRAN-77. The parameter c_info is a pointer to a union structure of type comp_info. (Refer to the SDsetcompress entry in the HDF Reference Manual for the description of the comp_info structure.) If comp_type is set to COMP_CODE_NONE or COMP_CODE_RLE, the parameters c_info and comp_prm are not used; c_info can be set to NULL and comp_prm can be undefined. If comp_type is set to COMP_CODE_SKPHUFF, then the structure skphuff in the union comp_info in C (comp_prm(1) in FORTRAN-77) must be provided with the size, in bytes, of the data elements. If it is set to COMP_CODE_DEFLATE, the deflate structure in the union comp_info in C (comp_prm(1) in FORTRAN-77) must be provided with the information about the compression effort.

For example, to compress signed 16-bit integer data using the adaptive Huffman algorithm, the following definition and SDsetcompress call are used.

C:		comp_info c_info;
		c_info.skphuff.skp_size = sizeof(int16);
		status = SDsetcompress(sds_id, COMP_CODE_SKPHUFF, &c_info);
FORTRAN:	comp_prm(1) = 2
		COMP_CODE_SKPHUFF = 3
		status = sfscompress(sds_id, COMP_CODE_SKPHUFF, comp_prm)
To compress a data set using the gzip deflation algorithm with the maximum effort specified, the following definition and SDsetcompress call are used.

C:		comp_info c_info;
		c_info.deflate.level = 9;
		status = SDsetcompress(sds_id, COMP_CODE_DEFLATE, &c_info);
FORTRAN:	comp_prm(1) = 9
		COMP_CODE_DEFLATE = 4
		status = sfscompress(sds_id, COMP_CODE_DEFLATE, comp_prm)
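The parameter requirements described above for each compression method can be summarized as a small lookup. The sketch below is illustrative only: comp_param_needed is a hypothetical helper, and the CODE_* names are local stand-ins carrying the numeric codes quoted in this section rather than the constants defined by the HDF headers.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the codes quoted in the text; a real program would
 * use the COMP_CODE_* constants from the HDF headers instead. */
enum { CODE_NONE = 0, CODE_RLE = 1, CODE_SKPHUFF = 3, CODE_DEFLATE = 4 };

/* Hypothetical helper, not an HDF routine. Returns 1 and stores the value
 * in *param when the method needs a parameter, 0 when it needs none, and
 * -1 for a code that is not listed in the text. */
int comp_param_needed(int comp_type, int elem_size, int effort, int *param)
{
    assert(param != NULL);
    switch (comp_type) {
    case CODE_NONE:
    case CODE_RLE:
        return 0;            /* compression information is not used */
    case CODE_SKPHUFF:
        *param = elem_size;  /* size, in bytes, of one data element */
        return 1;
    case CODE_DEFLATE:
        *param = effort;     /* compression effort */
        return 1;
    default:
        return -1;
    }
}
```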
SDsetcompress functionality is currently limited to the following:

The existing compression algorithms supported by HDF do not allow partial modification to a compressed datastream. Overwriting the contents of existing data sets may be supported in the future. Note also that SDsetcompress performs the compression of the data, not SDwritedata.

SDsetcompress returns a value of SUCCEED (or 0) or FAIL (or -1). The C version parameters are further described in Table 3G and the FORTRAN-77 version parameters are further described in Table 3H.

TABLE 3G - SDsetcompress Parameter List

Routine Name [Return Type]  Parameter   C Type      Description
--------------------------  ----------  ----------  -----------
SDsetcompress [intn]        sds_id      int32       Data set identifier
                            comp_type   int32       Compression method
                            c_info      comp_info*  Pointer to compression information structure

TABLE 3H - sfscompress Parameter List

Routine Name  Parameter   FORTRAN-77 Type  Description
------------  ----------  ---------------  -----------
sfscompress   sds_id      integer          Data set identifier
              comp_type   integer          Compression method
              comp_prm    integer(*)       Compression parameters array

EXAMPLE 6. Compressing SDS Data.

This example uses the routine SDsetcompress/sfscompress to compress SDS data with the GZIP compression method. See comments in the program regarding the use of the Skipping Huffman or RLE compression methods.

C version

FORTRAN-77 version

3.5.3 External File Operations

The HDF library provides routines to store SDS arrays in an external file that is separate from the primary file containing the metadata for the array. Such an SDS array is called an external SDS array. With external arrays, it is possible to link data sets in the same HDF file to multiple external files or data sets in different HDF files to the same external file.

External arrays are functionally identical to arrays in the primary data file. The HDF library keeps track of the beginning of the data set and adds data at the appropriate position in the external file. When data is written or appended along a specified dimension, the HDF library writes along that dimension in the external file and updates the appropriate dimension record in the primary file.

There are two methods for creating external SDS arrays. The user can create a new data set in an external file or move data from an existing internal data set to an external file. In either case, only the array values are stored externally; all metadata remains in the primary HDF file.

When an external array is created, a sufficient amount of space is reserved in the external file for the entire data set. The data set will begin at the specified byte offset and extend the length of the data set. The write operation will overwrite the target locations in the external file. The external file may be of any format, provided the data types, byte ordering, and dimension ordering are supported by HDF. However, the primary file must be an HDF file.
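Because the library writes at the given offset without checking what is already there, it can help to compute the byte range a data set will occupy in the external file before writing. The following sketch uses hypothetical helper names and assumes a fixed-size data set with a known element size in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an HDF routine. Returns the offset of the byte
 * just past the end of a data set that begins at `offset` in the external
 * file and holds the product of `dims` elements of `elem_size` bytes. */
int64_t external_end_offset(int64_t offset, int32_t rank,
                            const int32_t *dims, int32_t elem_size)
{
    assert(dims != NULL && offset >= 0 && elem_size > 0);
    int64_t count = 1;
    for (int32_t i = 0; i < rank; i++)
        count *= dims[i];
    return offset + count * elem_size;
}

/* Returns 1 if the half-open byte ranges [a_begin, a_end) and
 * [b_begin, b_end) share at least one byte, 0 otherwise. */
int ranges_overlap(int64_t a_begin, int64_t a_end,
                   int64_t b_begin, int64_t b_end)
{
    return a_begin < b_end && b_begin < a_end;
}
```

For example, a 10 x 10 array of 4-byte values placed at offset 100 occupies bytes 100 through 499, so a second data set in the same external file could safely begin at offset 500.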

Routines for manipulating external SDS arrays can only be used with HDF files. Unidata-formatted netCDF files are not supported by these routines.

3.5.3.1 Specifying the Directory Search Path of an External File: HXsetdir

There are three filesystem locations the HDF external file routines check when determining the location of an external file. They are, in order of search precedence:

  1. The directory path specified by the last call to the HXsetdir routine.
  2. The directory path specified by the $HDFEXTDIR shell environment variable.
  3. The file system locations searched by the standard open(3) routine.

The syntax of HXsetdir is as follows:

C:		status = HXsetdir(dir_list);
FORTRAN:	status = hxisdir(dir_list, dir_length)
HXsetdir has one argument, a string specifying the directory list to be searched. This list can consist of one directory name or a set of directory names separated by colons. The FORTRAN-77 version of this routine takes an additional argument, dir_length, which specifies the length of the directory list string.
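The colon-separated form of the directory list can be illustrated with a short parsing sketch. The helper nth_dir is hypothetical and only shows how such a list breaks into entries; it makes no claim about how the library itself parses the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an HDF routine. Copies entry number `n`
 * (zero-based) of a colon-separated directory list into `out`. Returns 0
 * on success, or -1 if the list has no such entry or `out` is too small. */
int nth_dir(const char *dir_list, int n, char *out, size_t out_size)
{
    assert(dir_list != NULL && out != NULL);
    const char *p = dir_list;
    for (int i = 0; i < n; i++) {
        p = strchr(p, ':');  /* advance to the next separator */
        if (p == NULL)
            return -1;
        p++;                 /* step past the colon */
    }
    size_t len = strcspn(p, ":");  /* length of this entry */
    if (len + 1 > out_size)
        return -1;
    memcpy(out, p, len);
    out[len] = '\0';
    return 0;
}
```

Given the list "/data/ext:/tmp/ext", entry 0 is "/data/ext" and entry 1 is "/tmp/ext"; both paths are made up for illustration.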

If an error condition is encountered, HXsetdir leaves the directory search path unchanged. The directory search path specified by HXsetdir remains in effect throughout the scope of the calling program.

HXsetdir returns a value of SUCCEED (or 0) or FAIL (or -1). The parameters of HXsetdir are described in Table 3I on page 35.

3.5.3.2 Specifying the Location of the Next External File to be Created: HXsetcreatedir

HXsetcreatedir specifies the directory location of the next external file to be created. It overrides the directory location specified by $HDFEXTCREATEDIR and the locations searched by the open(3) call in the same manner as HXsetdir. Specifically, the search precedence is:

  1. The directory specified by the last call to the HXsetcreatedir routine.
  2. The directory specified by the $HDFEXTCREATEDIR shell environment variable.
  3. The locations searched by the standard open(3) routine.

The syntax of HXsetcreatedir is as follows:

C:		status = HXsetcreatedir(dir);
FORTRAN:	status = hxiscdir(dir, dir_length)
HXsetcreatedir has one argument, the directory location of the next external file to be created. The FORTRAN-77 version of this routine takes an additional argument, dir_length, which specifies the length of the directory string. If an error is encountered, the directory location is left unchanged.

HXsetcreatedir returns a value of SUCCEED (or 0) or FAIL (or -1). The parameters of HXsetcreatedir are described in Table 3I.

TABLE 3I - HXsetdir and HXsetcreatedir Parameter Lists

Routine Name [Return Type] (FORTRAN-77)  Parameter   C Type          FORTRAN-77 Type  Description
---------------------------------------  ----------  --------------  ---------------  -----------
HXsetdir [intn] (hxisdir)                dir_list    char *          character*(*)    Directory list to be searched
                                         dir_length  Not applicable  integer          Length of the dir_list string
HXsetcreatedir [intn] (hxiscdir)         dir         char *          character*(*)    Directory location of the next external file to be created
                                         dir_length  Not applicable  integer          Length of the dir string

3.5.3.3 Creating a Data Set with Data Stored in an External File: SDsetexternalfile

Creating a data set in an external file involves the following steps:

  1. Create the data set.
  2. Specify that an external data file is to be used.
  3. Write data to the data set.
  4. Terminate access to the data set.

To create a data set with data stored in an external file, the calling program must make the following calls.

C:		sds_id = SDcreate(sd_id, name, data_type, rank, dim_sizes);
		status = SDsetexternalfile(sds_id, filename, offset);
		status = SDwritedata(sds_id, start, stride, edges, data);
		status = SDendaccess(sds_id);
FORTRAN:	sds_id = sfcreate(sd_id, name, data_type, rank, dim_sizes)
		status = sfsextf(sds_id, filename, offset)
		status = sfwdata(sds_id, start, stride, edges, data)
	OR	status = sfwcdata(sds_id, start, stride, edges, data)
		status = sfendacc(sds_id)
For a newly-created data set, SDsetexternalfile marks the SDS identified by sds_id as one whose data is to be written to an external file. It does not actually write data to an external file; it marks the data set as an external data set for all subsequent SDwritedata operations.

Note that data can only be moved once for any given data set, i.e., SDsetexternalfile can only be called once after a data set has been created. It is the user's responsibility to make sure that the external data file is kept with the primary HDF file.

The parameter filename is the name of the external data file and offset is the number of bytes from the beginning of the external file to the location where the first byte of data should be written. If a file with the name specified by filename exists in the current directory search path, HDF will access it as the external file. If the file does not exist, HDF will create one in the directory named in the last call to HXsetcreatedir. If an absolute pathname is specified, the external file will be created at the location specified by the pathname, overriding the location specified by the last call to HXsetcreatedir. Use caution when writing to existing external or primary files since the HDF library starts the write operation at the specified offset without determining whether data is being overwritten.

Once the name of an external file is established, it cannot be changed without breaking the association between the data set's metadata and the data it describes.

SDsetexternalfile returns a value of SUCCEED (or 0) or FAIL (or -1). The parameters of SDsetexternalfile are described in Table 3J.

TABLE 3J - SDsetexternalfile Parameter List

Routine Name [Return Type] (FORTRAN-77)  Parameter   C Type  FORTRAN-77 Type  Description
---------------------------------------  ----------  ------  ---------------  -----------
SDsetexternalfile [intn] (sfsextf)       sds_id      int32   integer          Data set identifier
                                         filename    char *  character*(*)    Name of the file to contain the external data set
                                         offset      int32   integer          Offset in bytes from the beginning of the external file to where the SDS data will be written

3.5.3.4 Moving Existing Data to an External File

Data can be moved from a primary file to an external file. The following steps perform this task:

  1. Select the data set.
  2. Specify the external data file.
  3. Terminate access to the data set.

To move data set data to an external file, the calling program must make the following calls:

C:		sds_id = SDselect(sd_id, sds_index);
		status = SDsetexternalfile(sds_id, filename, offset);
		status = SDendaccess(sds_id);
FORTRAN:	sds_id = sfselect(sd_id, sds_index)
		status = sfsextf(sds_id, filename, offset)
		status = sfendacc(sds_id)
For an existing data set, SDsetexternalfile moves the data to the external file. Any data in the external file that occupies the space reserved for the external array will be overwritten as a result of this operation. Data of an existing data set in the primary file can only be moved to the external file once. During the operation, the data is written to the external file as a contiguous stream regardless of how it is stored in the primary file. Because data is moved as is, any unwritten locations in the data set are preserved in the external file. Subsequent read and write operations performed on the data set will access the external file.

EXAMPLE 7. Moving Data to the External File.

This example illustrates the use of the routine SDsetexternalfile/sfsextf to move the SDS data written in Example 2 to an external file.

C version

FORTRAN-77 version




hdfhelp@ncsa.uiuc.edu
HDF User's Guide - 07/21/98, NCSA HDF Development Group.