SCons

SCons (from Software Construction) is a superior alternative to the classic make utility.
SCons is implemented as a Python script, and its "configuration files" (SConstruct files) are also Python scripts. Madagascar uses SCons to compile software, to manage data processing flows, and to assemble reproducible documents.
Useful SCons options
- scons -h (help) displays a help message.
- scons -Q (quiet) suppresses progress messages.
- scons -n (no exec) outputs the commands required for building the specified target (or the default targets if no target is specified) without actually executing them. It can be used to generate a shell script from an SConstruct script, as follows:
<bash> scons -nQ [target] > script.sh </bash>
Compilation
SCons was designed primarily for compiling software code. An SConstruct file for compilation may look like
<python>
env = Environment()
env.Append(CPPFLAGS=['-Wall','-g'])
env.Program('hello',['hello.c', 'main.c'])
</python>
and produce something like
bash$ scons -Q
gcc -o hello.o -c -Wall -g hello.c
gcc -o main.o -c -Wall -g main.c
gcc -o hello hello.o main.o
to compile the hello program from the source files hello.c and main.c.
Madagascar uses SCons to compile its programs from source. Its more frequent use, however, is managing data processing flows.
Data processing flows with rsf.proj
The rsf.proj module provides SCons rules for Madagascar data processing workflows. An example SConstruct file is shown below and can be found in bei/sg/denmark
<python>
from rsf.proj import *

Fetch('wz.35.H','wz')

Flow('wind','wz.35.H','dd form=native | window n1=400 j1=2 | smooth rect1=3')
Plot('wind','pow pow1=2 | grey')

Flow('mute','wind','mutter v0=0.31 half=n')
Plot('mute','pow pow1=2 | grey')

Result('denmark','wind mute','SideBySideAniso')

End()
</python>
Note that the SConstruct file by itself does nothing other than set rules for building different targets. The targets get built when one executes scons on the command line. Running scons produces
bash$ scons
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
retrieve(["wz.35.H"], [])
< wz.35.H /RSF/bin/sfdd form=native | /RSF/bin/sfwindow n1=400 j1=2 | /RSF/bin/sfsmooth rect1=3 > wind.rsf
< wind.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > wind.vpl
< wind.rsf /RSF/bin/sfmutter v0=0.31 half=n > mute.rsf
< mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
/RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl
scons: done building targets.
Obviously, one could also run similar commands with a shell script. What makes SCons convenient is the way it behaves when we make changes in the input files or in the script. Let us change, for example, the mute velocity parameter in the second Flow command. You can do that with an editor or on the command line as
sed -i s/v0=0.31/v0=0.32/ SConstruct
Now let us run scons again:
bash$ scons -Q
< wind.rsf /RSF/bin/sfmutter v0=0.32 half=n > mute.rsf
< mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
/RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl
We can see that scons executes only the parts of the data processing flow that were affected by the change. By keeping track of dependencies, SCons makes it easier to modify existing workflows without the need to rerun everything after each change.
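The selective rebuild described above can be illustrated with a small self-contained Python sketch. It is not SCons code: the TinyBuild class, its methods, and the shortened command strings are hypothetical, and file contents are kept in a dictionary instead of on disk. The sketch assumes a simple rule, namely that a target is rebuilt whenever a signature computed from its command string and the contents of its sources differs from the signature recorded at the previous build.

```python
import hashlib


def signature(command, source_contents):
    """Hash the command string together with the contents of every source."""
    digest = hashlib.sha256(command.encode())
    for content in source_contents:
        digest.update(hashlib.sha256(content).digest())
    return digest.hexdigest()


class TinyBuild:
    """Toy dependency tracker (hypothetical, for illustration only)."""

    def __init__(self):
        self.rules = {}   # target -> (list of sources, command string)
        self.data = {}    # file name -> bytes, stands in for files on disk
        self.stored = {}  # target -> signature recorded at the last build
        self.log = []     # targets rebuilt during the current run

    def flow(self, target, sources, command):
        """Register (or replace) the rule for building a target."""
        self.rules[target] = (sources, command)

    def _build(self, target):
        if target not in self.rules:
            return  # plain input file, nothing to build
        sources, command = self.rules[target]
        for source in sources:
            self._build(source)  # bring dependencies up to date first
        sig = signature(command, [self.data[s] for s in sources])
        if self.stored.get(target) != sig:
            self.log.append(target)
            # Stand-in for running the command: the "output" is a
            # deterministic function of the command and its inputs.
            self.data[target] = sig.encode()
            self.stored[target] = sig

    def run(self, target):
        """Build a target and return the list of targets that were rebuilt."""
        self.log = []
        self._build(target)
        return self.log


build = TinyBuild()
build.data['wz.35.H'] = b'raw seismic data'
build.flow('wind.rsf', ['wz.35.H'], 'window n1=400 j1=2')
build.flow('wind.vpl', ['wind.rsf'], 'grey')
build.flow('mute.rsf', ['wind.rsf'], 'mutter v0=0.31')
build.flow('mute.vpl', ['mute.rsf'], 'grey')
build.flow('Fig/denmark.vpl', ['wind.vpl', 'mute.vpl'], 'vppen')

first = build.run('Fig/denmark.vpl')

# Change the mute velocity, as in the sed command above, and rebuild.
build.flow('mute.rsf', ['wind.rsf'], 'mutter v0=0.32')
second = build.run('Fig/denmark.vpl')

print(first)
print(second)
```

In this sketch the first run rebuilds all five targets, while the second run rebuilds only mute.rsf, mute.vpl, and Fig/denmark.vpl, mirroring the scons output shown above.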
SConstruct commands
- Fetch(<file[s]>,<directory>,[options]) defines a rule for downloading data files from the specified directory on an external data server (by default) or from another directory on disk. The optional parameters that control its behavior are summarized below.
Fetch options

Name | Default | Meaning |
---|---|---|
private | None | if the data file is private |
server | $RSF_DATASERVER or http://www.reproducibility.org | remote data server (or local for local files) |
top | data | name of the top data directory on the data server |
In the example above, Fetch specifies the rule for getting the file wz.35.H: connect to the default data server and download the file from the data/wz directory.
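As a rough illustration of how the options in the table combine, the sketch below composes a download address from the server, top, directory, and file name. The function fetch_url is hypothetical and is not part of rsf.proj; the exact way rsf.proj joins these pieces is an assumption here.

```python
import os

# Default taken from the Fetch options table above.
DEFAULT_SERVER = 'http://www.reproducibility.org'


def fetch_url(filename, directory, server=None, top='data'):
    """Compose a download URL (hypothetical helper, not the rsf.proj API)."""
    if server is None:
        # Fall back to $RSF_DATASERVER, then to the documented default.
        server = os.environ.get('RSF_DATASERVER', DEFAULT_SERVER)
    return '/'.join([server.rstrip('/'), top, directory, filename])


print(fetch_url('wz.35.H', 'wz', server=DEFAULT_SERVER))
```

For Fetch('wz.35.H','wz') with the default server, this sketch points at the file wz.35.H in the data/wz directory, consistent with the description above.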
- Flow(<target[s]>,<source[s]>,<command>,[options]) defines a rule for creating targets from sources by running the specified command through the Unix shell. For example...
Flow options

Name | Default | Meaning |
---|---|---|
stdout | 1 | whether to write output to standard output (0 for output to /dev/null, -1 for no output) |
stdin | 1 | whether to take input from standard input (0 for no input) |
rsfflow | 1 | whether the flow uses Madagascar commands |
suffix | '.rsf' | default suffix for output files |
prefix | 'sf' | default prefix for programs |
src_suffix | '.rsf' | default suffix for input files |
split | [] | split the flow for data-parallel processing |
reduce | 'cat' | how to reduce the output from data-parallel processing |
local | 0 | whether to execute on the local node when using data-parallel processing on a cluster |
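To make the stdin, stdout, prefix, suffix, and src_suffix options more concrete, the following sketch reproduces how a Flow command string could turn into the shell command lines seen in the scons output earlier. The function expand_flow and its bindir parameter are hypothetical, and the rule that a name containing a dot keeps its own extension is an assumption; the actual rsf.proj logic handles more cases (multiple sources and targets, split, reduce, and so on).

```python
def expand_flow(target, source, command, bindir='/RSF/bin',
                prefix='sf', suffix='.rsf', src_suffix='.rsf',
                stdin=1, stdout=1):
    """Return a shell command line for one Flow rule (illustrative only)."""

    def add_suffix(name, suf):
        # Assumption: names that already carry an extension (wz.35.H)
        # are left alone; bare names get the default suffix.
        return name if '.' in name else name + suf

    steps = []
    for step in command.split('|'):
        words = step.split()
        # Prepend the installation directory and the program prefix.
        words[0] = f'{bindir}/{prefix}{words[0]}'
        steps.append(' '.join(words))
    line = ' | '.join(steps)

    if stdin == 1:
        line = f'< {add_suffix(source, src_suffix)} {line}'
    if stdout == 1:
        line = f'{line} > {add_suffix(target, suffix)}'
    elif stdout == 0:
        line = f'{line} > /dev/null'
    # stdout == -1: no output redirection is added
    return line


print(expand_flow('wind', 'wz.35.H',
                  'dd form=native | window n1=400 j1=2 | smooth rect1=3'))
print(expand_flow('mute', 'wind', 'mutter v0=0.31 half=n'))
```

With the default options, the two calls print the same command lines that scons reported for wind.rsf and mute.rsf in the example run.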
- Plot
- Result
- End
Default targets
Command-line options
Command-line options

Name | Meaning |
---|---|
TIMER | Whether to time execution |
CHECKPAR | Whether to check parameters |
ENVIRON | Additional environment settings |
CLUSTER | Nodes available on a cluster |
MPIRUN | mpirun command |
Seismic Unix data processing flows with rsf.suproj
Document creation with rsf.tex
SConstruct commands
- Paper
- End
Default targets
Book and report creation with rsf.book
SConstruct commands
- Book
- Papers
- End