Madagascar Code Patterns

Revision as of 01:16, 29 October 2011 by Nick (→ Loop over ensembles: vulnerability)

All patterns are in C, unless otherwise noted.

Loop over ensembles

This is a simple loop that reads the dataset one ensemble at a time: 1-D arrays ("traces"), 2-D arrays ("gathers"), etc.

It is needed for algorithms that act on one ensemble at a time. It can be parallelized with OpenMP.

This pattern has a potential bug: the ensemble array is allocated using dimensions of type int (usually 4 bytes), so the product of the dimensions can overflow for very large ensembles. Either the dimensions should be declared as an 8-byte integer type, or they should be read into an 8-byte integer type and checked against the range of int before allocation. This requires either a new function alongside sf_histint, or a check implemented inside sf_histint itself.
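The check described above could look roughly like the following. This is only a sketch: ensemble_size_fits_int is a hypothetical helper, not part of the Madagascar API, and it assumes the dimensions have already been read into 64-bit integers by some means.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Madagascar API): returns true if
 * the product of the first ndim dimensions fits in an int.
 * Dimensions are passed as 64-bit integers so that the product can be
 * checked before it is narrowed. */
bool ensemble_size_fits_int(const int64_t *dims, int ndim)
{
    int64_t total = 1;
    int i;

    for (i = 0; i < ndim; i++) {
        if (dims[i] <= 0) return false;              /* invalid dimension */
        if (total > INT_MAX / dims[i]) return false; /* product would exceed INT_MAX */
        total *= dims[i];
    }
    return true;
}
```

Note that this checks the element count only. The size in bytes is larger by a factor of sizeof(float) (or whatever the data type is), so an allocation routine may need a stricter bound.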

I/O-optimized loop over samples

Description and usage

This pattern loops over an entire dataset and applies a given procedure to every sample, regardless of which "trace", "frame" or "volume" the sample belongs to. Examples: computing the sum of all elements of a dataset, computing a histogram, or performing a clip operation. The buffer size is derived from the BUFSIZ macro defined in stdio.h to ensure efficient stream I/O. Occurrences of the pattern are easy to find by grepping for BUFSIZ in the codebase.

Example

<c>
int i;           /* Sample index within the I/O buffer */
int nbuf;        /* Number of elements in I/O buffer */
off_t n;         /* Total number of elements in dataset (off_t avoids int overflow) */
float *fbuf;     /* I/O array */
sf_file in=NULL; /* Input file. Here it is stdin, but this is not compulsory */

in = sf_input("in");

n = sf_filesize(in);

/* This example uses float as data type. Any other data type
   (int, sf_complex, etc.) can be used, as appropriate */
nbuf = BUFSIZ/sizeof(float);

fbuf = sf_floatalloc(nbuf);

for (; n > 0; n -= nbuf) {
    if (nbuf > n) nbuf = (int) n; /* The last buffer may be only partly filled */
    sf_floatread(fbuf, nbuf, in);
    for (i=0; i < nbuf; i++) {
        /* Do computations here */
    }
}
</c>

Potential for improvement

This pattern should be parallelized using OpenMP.
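One possible shape for such a parallelization is sketched below, using the sum of all samples as the computation. To keep the sketch independent of the Madagascar library, a plain FILE* with fread() stands in for sf_file and sf_floatread(); the function name sum_samples is made up for illustration. Reading stays sequential and only the inner loop over the buffer is shared among threads.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sum all float samples in a binary stream, one buffer at a time.
 * FILE* and fread() stand in for sf_file and sf_floatread().
 * Returns 0.0 if the buffer cannot be allocated. */
double sum_samples(FILE *fp)
{
    size_t nbuf = BUFSIZ / sizeof(float); /* elements per I/O buffer */
    float *fbuf = malloc(nbuf * sizeof(float));
    double sum = 0.0;
    size_t nread;
    long i;

    if (fbuf == NULL) return 0.0;

    /* I/O is sequential; only the per-sample work is split among threads */
    while ((nread = fread(fbuf, sizeof(float), nbuf, fp)) > 0) {
#pragma omp parallel for reduction(+:sum)
        for (i = 0; i < (long) nread; i++) {
            sum += fbuf[i];
        }
    }

    free(fbuf);
    return sum;
}
```

With a buffer of only BUFSIZ bytes, the cost of starting a parallel region for every buffer may outweigh the gain, so a larger buffer (a multiple of BUFSIZ) is probably worth testing before adopting this form.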

The GNU C Library documentation states that when doing I/O on a file (as opposed to a stream), the st_blksize field of the file attributes is a better choice than BUFSIZ.
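A minimal sketch of how st_blksize could be queried on a POSIX system follows. The helper preferred_bufsize is hypothetical, and the sketch assumes the file is available as a FILE*; how to reach the underlying descriptor from an sf_file is not covered here.

```c
#define _POSIX_C_SOURCE 200809L /* for fileno() */
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: preferred I/O block size in bytes for the file
 * behind fp, falling back to BUFSIZ if the size cannot be obtained. */
size_t preferred_bufsize(FILE *fp)
{
    struct stat st;

    if (fp != NULL && fstat(fileno(fp), &st) == 0 && st.st_blksize > 0) {
        return (size_t) st.st_blksize;
    }
    return (size_t) BUFSIZ;
}
```

The number of elements per buffer would then be preferred_bufsize(fp)/sizeof(float) instead of BUFSIZ/sizeof(float).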

OMP parallelized loop

Description and usage

Shared-memory parallelization using the OpenMP library.
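As an illustration, a clip operation over an array already held in memory can be parallelized with a single directive, because each iteration touches only its own sample. The function name clip_samples is made up for this sketch and is not taken from the Madagascar codebase.

```c
/* Clip every sample to the range [-clip, clip]. Iterations are
 * independent, so the loop can be split among threads without
 * any synchronization. */
void clip_samples(float *data, long n, float clip)
{
    long i;

#pragma omp parallel for
    for (i = 0; i < n; i++) {
        if (data[i] > clip) data[i] = clip;
        else if (data[i] < -clip) data[i] = -clip;
    }
}
```

With GCC the directive takes effect when compiling with -fopenmp. Without that flag the pragma is ignored and the loop runs serially with the same result.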

Example

Potential for improvement

Parallelized, I/O-optimized loop over samples

Description and usage

Example

Potential for improvement