== Data processing flows with <tt>rsf.proj</tt> ==

The [https://github.com/ahay/src/blob/master/framework/rsf/proj.py rsf.proj] module provides SCons rules for Madagascar data processing workflows. An example <tt>SConstruct</tt> file is shown below and can be found in [http://ahay.org/RSF/book/bei/sg/denmark.html bei/sg/denmark].

<syntaxhighlight lang="python">
from rsf.proj import *

Fetch('wz.35.H','wz')

Flow('wind','wz.35.H','dd form=native | window n1=400 j1=2 | smooth rect1=3')
Plot('wind','pow pow1=2 | grey')
Flow('mute','wind','mutter v0=0.31 half=n')
Plot('mute','pow pow1=2 | grey')

Result('denmark','wind mute','SideBySideAniso')

End()
</syntaxhighlight>

Note that <tt>SConstruct</tt> by itself does nothing other than define rules for building different targets. The targets get built when one executes <tt>scons</tt> on the command line. Running <tt>scons</tt> produces

<pre>
bash$ scons
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
retrieve(["wz.35.H"], [])
< wz.35.H /RSF/bin/sfdd form=native | /RSF/bin/sfwindow n1=400 j1=2 | /RSF/bin/sfsmooth rect1=3 > wind.rsf
< wind.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > wind.vpl
< wind.rsf /RSF/bin/sfmutter v0=0.31 half=n > mute.rsf
< mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
/RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl
scons: done building targets.
</pre>

Obviously, one could also run similar commands with a shell script. What makes SCons convenient is the way it behaves when we make changes in the input files or in the script. Let us change, for example, the mute velocity parameter in the second <tt>Flow</tt> command.
You can do that with an editor or on the command line as

<pre>
bash$ sed -i s/v0=0.31/v0=0.32/ SConstruct
</pre>

Now let us run <tt>scons</tt> again

<pre>
bash$ scons -Q
< wind.rsf /RSF/bin/sfmutter v0=0.32 half=n > mute.rsf
< mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
/RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl
</pre>

We can see that <tt>scons</tt> executes only the parts of the data processing flow that were affected by the change. By keeping track of dependencies, SCons makes it easier to modify existing workflows without the need to rerun everything after each change.

=== SConstruct commands ===

====Fetch(<file[s]>,<directory>,[options])====

defines a rule for downloading data files from the specified directory on an external data server (by default) or from another directory on disk. The optional parameters that control its behavior are summarized below.

{|class="wikitable" align="center" cellspacing="0" border="1"
|-
! colspan="3" style="background:#ffdead;"|Fetch options
|-
| '''Name''' || '''Default''' || '''Meaning'''
|-
| private || None || if the data file is private
|-
| server || $RSF_DATASERVER or http://www.reproducibility.org || remote data server (or '''local''' for local files)
|-
| top || <tt>data</tt> || name of the top data directory on the data server
|-
| dir || None || name of directory after top
|-
| usedatapath || 1 || usedatapath=1: download to $DATAPATH with a symbolic link; usedatapath=0: download to the current working directory
|}

In the example above, '''Fetch''' specifies the rule for getting the file <tt>wz.35.H</tt>: connect to the default data server and download the file from the [http://www.reproducibility.org/data/wz data/wz] directory.
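As a rough illustration of how the options in the table combine, the sketch below assembles a download location from <tt>server</tt>, <tt>top</tt>, the directory, and the file name. The helper <tt>fetch_url</tt> is hypothetical and is not part of <tt>rsf.proj</tt>; it only assumes that the pieces are joined with slashes, which is consistent with the <tt>data/wz</tt> location mentioned above.

```python
# Hypothetical helper, not part of rsf.proj: shows how the Fetch options
# might combine into a download location, assuming simple slash joining.
DEFAULT_SERVER = 'http://www.reproducibility.org'  # default listed in the table above


def fetch_url(filename, directory, server=DEFAULT_SERVER, top='data'):
    """Join server, top directory, directory, and file name with slashes."""
    parts = [server, top, directory, filename]
    # Strip stray leading/trailing slashes so the pieces join cleanly.
    return '/'.join(part.strip('/') for part in parts)


print(fetch_url('wz.35.H', 'wz'))
```

Under that assumption, the call for <tt>wz.35.H</tt> points inside the <tt>data/wz</tt> directory on the default server.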
An example of '''Fetch''' with more parameters is:

<syntaxhighlight lang="python">
Fetch('KAHU-3D-PR3177-FM.3D.Final_Migration.sgy',
      dir='newzealand/Taranaiki_Basin/KAHU-3D',
      server='http://s3.amazonaws.com',
      top='open.source.geoscience/open_data',
      usedatapath=1)
</syntaxhighlight>

====Flow(<target[s]>,<source[s]>,<command>,[options])====

defines a rule for creating targets from sources by running the specified command through the Unix shell. The optional parameters that control its behavior are summarized below.

{|class="wikitable" align="center" cellspacing="0" border="1"
|-
! colspan="3" style="background:#ffdead;"|Flow options
|-
| '''Name''' || '''Default''' || '''Meaning'''
|-
| stdout || 1 || whether to write output to standard output (0 for output to <tt>/dev/null</tt>, -1 for no output)
|-
| stdin || 1 || whether to take input from standard input (0 for no input)
|-
| rsfflow || 1 || whether the flow uses <tt>Madagascar</tt> commands
|-
| suffix || '.rsf' || default suffix for output files
|-
| prefix || 'sf' || default prefix for programs
|-
| src_suffix || '.rsf' || default suffix for input files
|-
| split || [] || split the flow for data parallel processing
|-
| reduce || 'cat' || how to reduce the output from data parallel processing
|-
| local || 0 || whether to execute on the local node when using data parallel processing on a cluster
|}

In the example above, there are two '''Flow''' commands. The first one involves a Unix pipe in the command definition. On the use of parallel computing options, see [[Parallel Computing]].

====Plot(<target>,[<source[s]>],<command>,[options])====

is similar to '''Flow''' but generates a graphics file (Vplot file) instead of an RSF file. If the source file is not specified, it is assumed that the name of the output file (without the <tt>.vpl</tt> suffix) is the same as the name of the input file (without the <tt>.rsf</tt> suffix).

{|class="wikitable" align="center" cellspacing="0" border="1"
|-
! colspan="3" style="background:#ffdead;"|Plot options
|-
| '''Name''' || '''Default''' || '''Meaning'''
|-
| suffix || '.vpl' || default suffix for the output file
|-
| vppen || None || additional options to pass to <tt>vppen</tt>
|-
| view || None || if set, show the output on the screen instead of saving it in a file
|}

In the example above, there are two '''Plot''' commands.

====Result(<target>,[<source[s]>],<command>,[options])====

is similar to '''Plot''', except that the output graphics file is placed not in the current directory but in a separate directory (<tt>./Fig</tt> by default). The output is intended for inclusion in papers and reports.

{|class="wikitable" align="center" cellspacing="0" border="1"
|-
! colspan="3" style="background:#ffdead;"|Result options
|-
| '''Name''' || '''Default''' || '''Meaning'''
|-
| suffix || '.vpl' || default suffix for the output file
|}

In the example above, <tt>Result</tt> defines a rule that combines the results of two <tt>Plot</tt> rules into one plot by arranging them side by side. The rules for combining different figures together (which apply to both <tt>Plot</tt> and <tt>Result</tt> commands) include:

* SideBySideAniso
* OverUnderAniso
* SideBySideIso
* OverUnderIso
* TwoRows
* TwoColumns
* Overlay
* Movie

====End()====

takes no arguments and signals the end of data processing rules. It provides the following targets, which operate on all previously specified '''Result''' figures:

* '''scons view''' displays the results on the screen.
* '''scons print''' sends the results to the printer (specified with the '''PSPRINTER''' environment variable).
* '''scons lock''' copies the results to a location inside the '''DATAPATH''' tree.
* '''scons test''' compares the previously "locked" results with the current results and aborts with an error in case of mismatch.

The default target is set to be the collection of all '''Result''' figures.

=== Command-line options ===

{|class="wikitable" align="center" cellspacing="0" border="1"
|-
! colspan="2" style="background:#ffdead;"|Command-line options
|-
| '''Name''' || '''Meaning'''
|-
| TIMER || Whether to time execution
|-
| CHECKPAR || Whether to check parameters
|-
| ENVIRON || Additional environment settings
|-
| CLUSTER || Nodes available on a cluster
|-
| MPIRUN || mpirun command
|}

Running the example above with <tt>TIMER=y</tt> produces

<syntaxhighlight lang="bash">
bash$ scons -Q TIMER=y
/usr/bin/time < wind.rsf /RSF/bin/sfmutter v0=0.32 half=n > mute.rsf
0.09user 0.03system 0:00.13elapsed 94%CPU (0avgtext+0avgdata 383744maxresident)k
0inputs+0outputs (1513major+0minor)pagefaults 0swaps
/usr/bin/time < mute.rsf /RSF/bin/sfpow pow1=2 | /RSF/bin/sfgrey > mute.vpl
0.10user 0.00system 0:00.18elapsed 59%CPU (0avgtext+0avgdata 384256maxresident)k
0inputs+0outputs (1515major+0minor)pagefaults 0swaps
/usr/bin/time /RSF/bin/vppen yscale=2 vpstyle=n gridnum=2,1 wind.vpl mute.vpl > Fig/denmark.vpl
0.06user 0.03system 0:00.06elapsed 135%CPU (0avgtext+0avgdata 444416maxresident)k
0inputs+0outputs (1739major+0minor)pagefaults 0swaps
</syntaxhighlight>

In other words, every shell command is preceded by the Unix [http://en.wikipedia.org/wiki/Time_%28Unix%29 time] utility to measure the CPU time of the process.

Running the example above with <tt>CHECKPAR=y</tt> makes no visible difference. Suppose, however, that we made a typo in specifying one of the parameters, for example, by using <tt>v1=</tt> instead of <tt>v0=</tt> in the arguments to <tt>sfmutter</tt>.

<syntaxhighlight lang="bash">
bash$ sed -i s/v0=0.31/v1=0.31/ SConstruct
bash$ scons -Q CHECKPAR=y
No parameter "v1" in sfmutter
Failed on "mutter v1=0.31 half=n"
</syntaxhighlight>

The parameter error gets detected by <tt>scons</tt> before anything is executed.
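The kind of check performed with <tt>CHECKPAR=y</tt> can be pictured with a short sketch. The function <tt>check_parameters</tt> and the <tt>KNOWN_PARAMETERS</tt> table are hypothetical and list only the two <tt>sfmutter</tt> parameters used on this page; the actual check in <tt>rsf.proj</tt> may work differently.

```python
# Hypothetical sketch of a CHECKPAR-style check; not the rsf.proj implementation.
# KNOWN_PARAMETERS is an assumed, incomplete table covering only the
# parameters that appear in the example on this page.
KNOWN_PARAMETERS = {'sfmutter': {'v0', 'half'}}


def check_parameters(command, prefix='sf'):
    """Return error messages for unknown name=value arguments in a command."""
    words = command.split()
    program = prefix + words[0]
    if program not in KNOWN_PARAMETERS:
        return []  # nothing is known about this program, so nothing is checked
    errors = []
    for word in words[1:]:
        if '=' not in word:
            continue  # skip anything that is not a name=value argument
        name = word.split('=', 1)[0]
        if name not in KNOWN_PARAMETERS[program]:
            errors.append('No parameter "%s" in %s' % (name, program))
    return errors


print(check_parameters('mutter v0=0.31 half=n'))  # correct command
print(check_parameters('mutter v1=0.31 half=n'))  # mistyped parameter name
```

Because the check only inspects the command string, it can run before any program is started, which is why the error shows up before anything is executed.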