Reproducible computational experiments using SCons
==Example experiments==

The main <tt>SConstruct</tt> commands defined in our reproducible research environment are collected in the table.

<center>
{| class="wikitable"
|+Basic methods of an <tt>rsf.proj</tt> object.
|-
|style="background-color:#ffdead;"| '''<tt>Fetch(data_file,dir[,ftp_server_info])</tt>'''
|-
| A rule to download <tt><math><</math>data_file<math>></math></tt> from a specific directory <tt><math><</math>dir<math>></math></tt> of an FTP server.
|-
|style="background-color:#ffdead;"| '''<tt>Flow(target[s],source[s],command[s][,stdin][,stdout])</tt>'''
|-
| A rule to generate <tt><math><</math>target[s]<math>></math></tt> from <tt><math><</math>source[s]<math>></math></tt> using <tt><math><</math>command[s]<math>></math></tt>.
|-
|style="background-color:#ffdead;"| '''<tt>Plot(intermediate_plot[,source],plot_command)</tt>''' or '''<tt>Plot(intermediate_plot,intermediate_plots,combination)</tt>'''
|-
| A rule to generate <tt><math><</math>intermediate_plot<math>></math></tt> in the working directory.
|-
|style="background-color:#ffdead;"| '''<tt>Result(plot[,source],plot_command)</tt>''' or '''<tt>Result(plot,intermediate_plots,combination)</tt>'''
|-
| A rule to generate a final <tt><math><</math>plot<math>></math></tt> in the special <tt>Fig</tt> folder of the working directory.
|-
|style="background-color:#ffdead;"| '''<tt>End()</tt>'''
|-
| A rule to collect default targets.
|}
</center>

These commands are defined in <tt>$PYTHONPATH/rsf/proj.py</tt>, which is installed under <tt>$RSFROOT</tt>, where <tt>RSFROOT</tt> is the environment variable that points to the Madagascar installation directory. The source of this file is in [http://sourceforge.net/p/rsf/code/HEAD/tree/trunk/framework/rsf/proj.py framework/rsf/proj.py].
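As a rough mental model of what a <tt>Flow</tt> rule stands for, the sketch below assembles a shell command line from a target, a source and a command string. It is not the code in <tt>rsf/proj.py</tt>. The function name <tt>flow_command</tt>, the <tt>KNOWN_MODULES</tt> set and the <tt>/RSF/bin</tt> default are assumptions made for illustration, and the conventions it imitates (default <tt>.rsf</tt> suffix, first source on standard input, first target on standard output, short module names expanded) are the ones that the first example below walks through.

```python
# Toy model of how a Flow rule might be turned into a shell command line.
# Not the rsf/proj.py implementation; KNOWN_MODULES and bindir are assumptions.
KNOWN_MODULES = {'dd', 'window', 'grey', 'spectra'}


def flow_command(target, source, command, stdin=1, stdout=1, bindir='/RSF/bin'):
    """Assemble a command line from a target, a source and a command string."""
    # Names without a suffix are assumed to be RSF files
    if '.' not in target:
        target += '.rsf'
    if '.' not in source:
        source += '.rsf'
    stages = []
    for stage in command.split('|'):
        words = stage.split()
        # Short module names are expanded to full program paths
        if words[0] in KNOWN_MODULES:
            words[0] = '%s/sf%s' % (bindir, words[0])
        stages.append(' '.join(words))
    line = ' | '.join(stages).replace('$SOURCE', source)
    if stdin:
        line = '< %s %s' % (source, line)   # first source feeds standard input
    if stdout:
        line = '%s > %s' % (line, target)   # first target receives standard output
    return line


print(flow_command('lena.hdr', 'lena.img',
                   'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar',
                   stdin=0))
print(flow_command('lena', 'lena.hdr', 'dd type=float | window f2=1'))
```

The two calls mirror the two <tt>Flow</tt> rules of the first example, so the printed lines can be compared with what <tt>scons -Q</tt> reports there.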
===Example 1===

To follow the first example, select a working project directory and copy the following code to a file named <tt>SConstruct</tt><ref>The source of this file is also accessible at [http://sourceforge.net/p/rsf/code/HEAD/tree/trunk/book/rsf/scons/easystart/SConstruct $RSFSRC/book/rsf/scons/easystart/SConstruct].</ref>.

<syntaxhighlight lang="python">
from rsf.proj import *

# Download the input data file
Fetch('lena.img','imgs')

# Create RSF header
Flow('lena.hdr','lena.img',
     'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar',
     stdin=0)

# Convert to floating point and window out first trace
Flow('lena','lena.hdr','dd type=float | window f2=1')

# Display
Result('lena',
       '''
       sfgrey title="Hello, World!" transp=n color=b bias=128
       clip=100 screenratio=1
       ''')

# Wrap up
End()
</syntaxhighlight>

This is our "hello world" example that illustrates the basic use of some of the commands presented in the table above. The plan for this experiment is simply to download data from a public data server, to convert it to an appropriate file format, and to generate a figure for publication. Let us take a closer look at the <tt>SConstruct</tt> script and dissect it.

<syntaxhighlight lang="python">
from rsf.proj import *
</syntaxhighlight>

is a standard Python command that loads the Madagascar project management module <tt>rsf/proj.py</tt>, which provides our extension to SCons.

<syntaxhighlight lang="python">
Fetch('lena.img','imgs')
</syntaxhighlight>

instructs SCons to connect to a public data server (the default server if no FTP server information is provided) and to fetch the data file <tt>lena.img</tt> from the <tt>data/imgs</tt> directory.
<!-- Note that Madagascar expects a <tt>data</tt> folder on top of the specified directory (i.e. <tt>imgs</tt>). In the directory where you have your SConstruct, running <tt>scons lena.img</tt> on the command line will download the file <tt>lena.img</tt>.
The equivalent command line is
<pre>
bash$ wget http://www.ahay.org/data/imgs/lena.img
</pre>
-->

Try running "<tt>scons lena.img</tt>" on the command line. The successful output should look like

<pre>
bash$ scons lena.img
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
retrieve(["lena.img"], [])
scons: done building targets.
</pre>

with the target file <tt>lena.img</tt> appearing in your directory. In the following examples, we will use the <tt>-Q</tt> (quiet) option of <tt>scons</tt> to suppress the verbose output.

<syntaxhighlight lang="python">
Flow('lena.hdr','lena.img',
     'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar',
     stdin=0)
</syntaxhighlight>

prepares the Madagascar header file <tt>lena.hdr</tt> using the standard Unix command <tt>echo</tt>.

<pre>
bash$ scons -Q lena.hdr
echo n1=512 n2=513 in=lena.img data_format=native_uchar > lena.hdr
</pre>

Since <tt>echo</tt> does not read from standard input, <tt>stdin</tt> is set to 0 in the <tt>Flow</tt> command; otherwise, the first source would be used as the standard input. Likewise, the first target is the standard output unless otherwise specified. Note that <tt>lena.img</tt> is referred to as <tt>$SOURCE</tt> in the command. This allows us to change the name of the source file without changing the command.

The data format of the <tt>lena.img</tt> image file is <tt>uchar</tt> (unsigned character). The image consists of 513 traces with 512 samples per trace. Our next step is to convert the image representation to floating point numbers and to window out the first trace so that the final image is a 512 by 512 square. The two transformations are conveniently combined into one with the help of a Unix pipe.

<syntaxhighlight lang="python">
Flow('lena','lena.hdr','dd type=float | window f2=1')
</syntaxhighlight>

<pre>
bash$ scons -Q lena
scons: *** Do not know how to make target `lena'.  Stop.
</pre>

What happened?
In the absence of the file suffix, the <tt>Flow</tt> command assumes that the target file suffix is "<tt>.rsf</tt>". Let us try again.

<pre>
bash$ scons -Q lena.rsf
< lena.hdr /RSF/bin/sfdd type=float | /RSF/bin/sfwindow f2=1 > lena.rsf
</pre>

Notice that the Madagascar modules <tt>sfdd</tt> and <tt>sfwindow</tt> get substituted for the corresponding short names in the <tt>SConstruct</tt> file. The file <tt>lena.rsf</tt> is in a regularly sampled format<ref>See [[Guide to RSF file format]]</ref> and can be examined, for example, with <tt>sfin lena.rsf</tt><ref>See [[Guide_to_madagascar_programs#sfin]].</ref>.

<pre>
bash$ sfin lena.rsf
lena.rsf:
    in="/datapath/lena.rsf@"
    esize=4 type=float form=native
    n1=512         d1=1           o1=0
    n2=512         d2=1           o2=1
        262144 elements 1048576 bytes
</pre>

In the last step, we will create a plot file for displaying the image on the screen and for including it in the publication.

<syntaxhighlight lang="python">
Result('lena',
       '''
       sfgrey title="Hello, World!" transp=n color=b bias=128
       clip=100 screenratio=1
       ''')
</syntaxhighlight>

Notice that we broke the long command string into multiple lines by using Python's triple quote syntax. All the extra white space will be ignored when the multi-line string gets translated into the command line. The <tt>Result</tt> command has special targets associated with it. Try, for example, "<tt>scons lena.view</tt>" to observe the figure <tt>Fig/lena.vpl</tt> generated in a specially created <tt>Fig</tt> directory and displayed on the screen. The output should look like this figure.

[[Image:lena.png|frame|center|The output of the first numerical experiment.]]

The reproducible script ends with

<syntaxhighlight lang="python">
End()
</syntaxhighlight>

Ready to experiment? Try some of the following:

#Run <tt>scons -c</tt>. The <tt>-c</tt> (clean) option tells SCons to remove all default targets (the <tt>Fig/lena.vpl</tt> image file in our case) and also all intermediate targets that it generated.
<pre>
bash$ scons -c -Q
Removed lena.img
Removed lena.hdr
Removed lena.rsf
Removed /datapath/lena.rsf@
Removed Fig/lena.vpl
</pre>
Run <tt>scons</tt> again, and the default target will be regenerated.
<pre>
bash$ scons -Q
retrieve(["lena.img"], [])
echo n1=512 n2=513 in=lena.img data_format=native_uchar > lena.hdr
< lena.hdr /RSF/bin/sfdd type=float | /RSF/bin/sfwindow f2=1 > lena.rsf
< lena.rsf /RSF/bin/sfgrey title="Hello, World!" transp=n color=b bias=128 clip=100 screenratio=1 > Fig/lena.vpl
</pre>
#Edit your <tt>SConstruct</tt> file and change some of the plotting parameters. For example, change the value of <tt>clip</tt> from <tt>clip=100</tt> to <tt>clip=50</tt>. Run <tt>scons</tt> again and observe that only the last part of the processing flow (precisely, the part affected by the parameter change) is being run:
<pre>
bash$ scons -Q view
< lena.rsf /RSF/bin/sfgrey title="Hello, World!" transp=n color=b bias=128 clip=50 screenratio=1 > Fig/lena.vpl
sfpen Fig/lena.vpl
</pre>
SCons is smart enough to recognize that your editing did not affect any of the previous results in the data flow chain! Keeping track of dependencies is the main feature that separates data processing and computational experimenting with SCons from using linear shell scripts. For computationally demanding data processing, this feature can save you a lot of time and can make your experiments more interactive and enjoyable.
#A special parameter to SCons (defined in <tt>rsf/proj.py</tt>) can time the execution of each step in the processing flow. Try running <tt>scons TIMER=y</tt>.
#The <tt>rsf.proj</tt> module has direct access to the database that stores parameters of all Madagascar modules. Try running <tt>scons CHECKPAR=y</tt> to see parameter checking enforced before computations<ref>This feature is new and experimental and may not work properly yet.</ref>.

The summary of our SCons commands is given in the table.
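Before turning to the summary, the dependency tracking described above can be illustrated with a toy model. The sketch below is not how SCons is implemented: it assumes that each target gets a fingerprint computed from its command string and from the fingerprint of its source, and that a step is rerun only when that fingerprint differs from the stored one. Raw input files are identified by name only, whereas SCons itself can fingerprint file contents. The names <tt>signature</tt> and <tt>steps_to_run</tt> are hypothetical, and the plot command is abbreviated.

```python
import hashlib


def signature(text):
    """Short fingerprint of a string."""
    return hashlib.sha256(text.encode('utf-8')).hexdigest()[:12]


def steps_to_run(flows, stored):
    """Return the targets whose fingerprint changed, updating `stored` in place.

    `flows` is a list of (target, source, command) tuples in dependency order.
    """
    current = {}
    rerun = []
    for target, source, command in flows:
        # A target depends on its command and on the state of its source
        upstream = current.get(source, source)
        current[target] = signature(upstream + '|' + command)
        if stored.get(target) != current[target]:
            rerun.append(target)
            stored[target] = current[target]
    return rerun


flows = [
    ('lena.hdr', 'lena.img', 'echo n1=512 n2=513 in=$SOURCE data_format=native_uchar'),
    ('lena.rsf', 'lena.hdr', 'dd type=float | window f2=1'),
    ('Fig/lena.vpl', 'lena.rsf', 'sfgrey clip=100'),
]
stored = {}
first = steps_to_run(flows, stored)    # nothing stored yet: every step runs

# Change only the plotting parameter, as in the experiment above
flows[-1] = ('Fig/lena.vpl', 'lena.rsf', 'sfgrey clip=50')
second = steps_to_run(flows, stored)   # only the plot step is out of date

print(first)
print(second)
```

Changing the command of an earlier step instead, such as <tt>f2=1</tt> in the second flow, would mark both <tt>lena.rsf</tt> and <tt>Fig/lena.vpl</tt> for rerunning in this model, because the plot depends on the fingerprint of its source.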
{| class="wikitable"
|+SCons commands and options defined in <tt>rsf.proj</tt>.
|-
|style="background-color:#ffdead;"| '''<tt>scons <math><</math>file<math>></math></tt>'''
|-
| Generate <tt><math><</math>file<math>></math></tt> (usually requires the <tt>.rsf</tt> suffix for <tt>Flow</tt> targets and the <tt>.vpl</tt> suffix for <tt>Plot</tt> targets).
|-
|style="background-color:#ffdead;"| '''<tt>scons</tt>'''
|-
| Generate default targets (usually figures specified in <tt>Result</tt>).
|-
|style="background-color:#ffdead;"| '''<tt>scons view</tt>''' or '''<tt>scons <math><</math>result<math>></math>.view</tt>'''
|-
| Generate <tt>Result</tt> figures and display them on the screen.
|-
|style="background-color:#ffdead;"| '''<tt>scons print</tt>''' or '''<tt>scons <math><</math>result<math>></math>.print</tt>'''
|-
| Generate <tt>Result</tt> figures and print them.
|-
|style="background-color:#ffdead;"| '''<tt>scons lock</tt>''' or '''<tt>scons <math><</math>result<math>></math>.lock</tt>'''
|-
| Generate <tt>Result</tt> figures and install them in a separate location.
|-
|style="background-color:#ffdead;"| '''<tt>scons test</tt>''' or '''<tt>scons <math><</math>result<math>></math>.test</tt>'''
|-
| Generate <tt>Result</tt> figures and compare them with the corresponding "locked" figures stored in a separate location (regression testing).
|-
|style="background-color:#ffdead;"| '''<tt>scons <math><</math>result<math>></math>.flip</tt>'''
|-
| Generate the <tt><math><</math>result<math>></math></tt> figure and compare it with the corresponding "locked" figure stored in a separate location by flipping between the two figures on the screen.
|-
|style="background-color:#ffdead;"| '''<tt>scons TIMER=y ...</tt>'''
|-
| Time the execution of each step in the processing flow (using the Unix <tt>time</tt> utility).
|-
|style="background-color:#ffdead;"| '''<tt>scons CHECKPAR=y ...</tt>'''
|-
| Check the names and values of all parameters supplied to Madagascar modules in the processing flows before executing anything (guards against incorrect input). This option is new and experimental.
|}

===Example 2===

The plan for this experiment is to add random noise to the test "Lena" image and then to attempt removing it by low-pass filtering and by hard thresholding of coefficients in the Fourier domain. The resulting images are shown in the figures.

[[Image:panel1.png|frame|center|Top left: original image. Top right: random noise added. Bottom left: original image spectrum in the Fourier (<math>F</math>-<math>X</math>) domain. Bottom right: noisy image spectrum in the Fourier (<math>F</math>-<math>X</math>) domain.]]

[[Image:panel2.png|frame|center|Left: denoising by low-pass filtering. Right: denoising by hard thresholding in the Fourier domain.]]

Since the <tt>SConstruct</tt> file is a Python script, we can also use all the flexibility and power of the Python language in our Madagascar reproducible scripts. A demo script is available in the <tt>rsf/scons/rsfpy</tt> subdirectory of the Madagascar <tt>book</tt> directory. Rather than commenting on it line by line, we select some parts of interest.

In the <tt>SConstruct</tt> script, we can declare Python variables

<syntaxhighlight lang="python">
bias = 128
</syntaxhighlight>

and use them later, for example, to define our customized plot command as a Python function

<syntaxhighlight lang="python">
def grey(title,transp='n',bias=bias):
    return '''
    sfgrey title="%s" transp=%s bias=%g clip=100
    screenht=10 screenwd=10 crowd2=0.85 crowd1=0.8
    label1= label2=
    ''' % (title,transp,bias)
</syntaxhighlight>

This Python function, named <tt>grey()</tt>, can then be called in <tt>Plot</tt> or <tt>Result</tt> commands, e.g.

<syntaxhighlight lang="python">
Plot('lplena',grey('Noisy Lena LP filtered'))
</syntaxhighlight>

We can define a Python dictionary, e.g.
<syntaxhighlight lang="python">
titles = {'lena':'Lena',
          'nlena':'Noisy Lena'}
</syntaxhighlight>

and loop over its entries, e.g.

<syntaxhighlight lang="python">
for name in titles.keys():
    Plot(name,grey(titles[name]))
    cftitle = titles[name]+' in FX domain'
    Flow('fx'+name,name,'sfspectra')
    Plot('fx'+name,grey(cftitle,'y',100))
</syntaxhighlight>

Note that the titles of the plots are obtained by concatenating Python strings. Python strings can also be used to define sequences of commands used in several <tt>Flow</tt> rules, e.g.

<syntaxhighlight lang="python">
# 2-D FFT
fft2 = 'sffft1 sym=y | sffft3 sym=y'

Flow('fnlena','nlena',fft2)
</syntaxhighlight>

Finally, in our Madagascar reproducible script, we may want the option to pass command line arguments when running SCons and to use default values otherwise, e.g.

<syntaxhighlight lang="python">
# denoising using thresholding in the Fourier domain
fthr = float(ARGUMENTS.get('fthr', 70))
Flow('fthrlena','fnlena','sfthr thr=%f mode="hard"' % fthr)
</syntaxhighlight>

When running <tt>scons</tt> alone, the default value of <tt>fthr</tt> (i.e. 70) is used, whereas running <tt>scons fthr=68</tt> sets <tt>fthr</tt> to the value specified on the command line.

This is by no means an exhaustive list of options but, hopefully, it gives you a flavor of the powerful tool you have in your hands. Enjoy!

<!--
===Useful SCons commands for reproducible scripts===

On top of SCons standard options (<tt>scons --help</tt> for more details), Madagascar has its own SCons options. We already saw <tt>scons plot.view</tt> that displays <tt>plot.vpl</tt> (in the <tt>Fig</tt> folder) obtained in a Result command. <tt>scons view</tt> displays the result plots one after the other.

It is also possible to check parameters for Madagascar programs in SCons Flow commands using the CHECKPAR option (<tt>scons CHECKPAR=y target</tt>). Note that CHECKPAR is an experimental option and will be enhanced in the future to include parameter ranges and other safety checks.
To time the execution of processing flows in a SConstruct, use the TIMER option (<tt>scons TIMER=y target</tt>).

<tt>scons lock</tt> is used to secure result plots and copy them from the <tt>Fig</tt> folder of your working directory to your <tt>$RSFFIGS</tt> folder, where <tt>RSFFIGS</tt> is the environment variable that points to the directory where you want Madagascar to put your key result plots. Note that this is a necessary step before creating reproducible documentation.

<tt>scons plot.flip</tt> runs <tt>xtpen Fig/plot.vpl /locked/figures/plot.vpl</tt> to flip between the new and locked figures. This is useful for detecting changes.
-->
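The command-line override at the end of Example 2 can be tried outside of SCons with a plain dictionary standing in for <tt>ARGUMENTS</tt>. This is a minimal sketch under that assumption: the helper name <tt>threshold_command</tt> is hypothetical, and the dictionary values are strings because SCons passes <tt>key=value</tt> arguments from the command line as strings.

```python
def threshold_command(arguments, default=70):
    """Build the thresholding command, honoring an optional fthr override."""
    # float() accepts both the numeric default and a string from the command line
    fthr = float(arguments.get('fthr', default))
    return 'sfthr thr=%f mode="hard"' % fthr


# Equivalent of running "scons" with no extra arguments
print(threshold_command({}))

# Equivalent of running "scons fthr=68"
print(threshold_command({'fthr': '68'}))
```

Converting with <tt>float()</tt> right where the argument is read keeps the rest of the script independent of whether the value came from the default or from the command line.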