Systems

The place of numerical experiments in the research workflow

May 28, 2007

Since infinite numbers of monkeys hitting keys on typewriters for an infinite amount of time are not usually available for employment, researchers do not proceed at random, but instead come up with a hypothesis, followed by experiments. Ideas may be stimulated by close acquaintance with the subjects and tools of the experiment, but by the very nature of things theory precedes experiment, for otherwise one would not know how to design that experiment.
Many theories are sequences of logical statements that can be divided into a number of discrete falsifiable fragments. One option is to postpone experiments until the whole theory is complete, with proofs of convergence, existence, and whatnot. It may be reasonable to argue that until the whole theory is formally complete in all its majesty, one cannot properly perceive the significance of the fragment in context, and experiments need to be designed with the whole theory in mind.
However, a researcher who proceeded in the aforementioned way would soon discover that he is significantly less productive than his peers! What happened?
Well, his peers probably did something else. Namely, they tried to test individual, even incomplete, pieces of the theory. If the experiment does not work, they know quite early, and find workarounds or just give up the dead end and try another hypothesis, until finally they hit on something that works. During all this time, the theory-minded fellow has been poring over equations for a single theory, without knowing whether the inevitable approximations will doom its application to real data. On top of that, his fellows had more contact with real data and more inspiration for their other theories!
The strategy of his peers closely resembles the “release early, release often” saying of the open-source world. It also resembles Agile development strategies, in contrast to the Waterfall model.
The “implement early, implement often” approach also brings to mind a saying from a completely different field: the popular “Cut the losers early, let the winners run” advice to investors. Researchers are much like investors, except that they invest time instead of money. This analogy is not merely interesting: it is useful. Much statistical work has been done on expected returns for investors, which are described better by Pareto distributions/power laws (the 80/20 rule, etc.) than by symmetric Gaussian pdfs, so culling “losers” when they reach a threshold while keeping “winners” biases the expectation towards winners. It would be quite straightforward (but tedious) to map investor gains to the number of references in peer-reviewed journals, count references, interview researchers about the number of hypotheses tried and discarded, and carry the finance-domain portfolio statistics over to the research domain. Large enough venture capital firms, which need to evaluate and invest in researchers, may have already done that.
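To make the bias towards winners concrete, here is a minimal sketch in Python. It rests on strong simplifying assumptions that are not in the argument above: project payoffs follow a Pareto distribution with minimum payoff 1 and shape alpha, a cheap early experiment reveals the payoff exactly, and a project is abandoned when its payoff falls below a threshold. The function name, the costs, and the parameter values are all hypothetical.

```python
def payoff_per_unit_time(alpha, threshold, test_cost, full_cost):
    """Expected payoff per unit of time when projects below `threshold` are cut.

    Toy model (all assumptions): payoffs are Pareto distributed with minimum 1
    and shape alpha > 1; an early test costing `test_cost` reveals the payoff;
    finishing a project costs `full_cost` in total (test included).
    """
    if alpha <= 1 or threshold < 1:
        raise ValueError("need alpha > 1 and threshold >= 1")
    mean_payoff = alpha / (alpha - 1.0)
    # Probability that a project survives the early test: P(X > threshold).
    p_keep = threshold ** (-alpha)
    # Expected payoff collected from surviving projects: E[X; X > threshold].
    kept_payoff = mean_payoff * threshold ** (1.0 - alpha)
    # Every project pays for the test; only survivors pay for the rest.
    expected_cost = test_cost + p_keep * (full_cost - test_cost)
    return kept_payoff / expected_cost


# alpha = 1.16 roughly corresponds to the 80/20 rule.
finish_everything = payoff_per_unit_time(1.16, 1.0, 1.0, 10.0)  # never cut
cut_losers_early = payoff_per_unit_time(1.16, 4.0, 1.0, 10.0)   # cut below 4
print(finish_everything, cut_losers_early)
```

In this toy model, cutting at a threshold of 4 gives a clearly higher expected payoff per unit of time than finishing every project. With a noisy early test the gain would be smaller, so the numbers should be read as an illustration only.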
What is necessary in order to “implement early, implement often”? Most importantly, software should be quite usable (being learnable helps as well; yes, these are different things). In the case of a development platform like Madagascar, where users may freely adapt existing programs, code needs to be clean and easy to understand as well; for this kind of software, code readability and implementation descriptions are not only good practice for future maintenance, but productivity enhancers for current users as well! Madagascar, known for these attributes, is already off to a good start. Bottom line: usability multiplies productivity much more than first comes to mind (i.e. time saved in not looking things up), since hard-to-use software decreases the probability that the researcher will put an idea to the test at all, and instead leads him to postpone the moment of coding without realising the aggregate productivity loss!
In conclusion: let us get to know Madagascar well, implement early, and implement often!

Madagascar now compatible with Numpy for the Python API

October 23, 2006

Madagascar has been updated to use Numpy (http://numpy.scipy.org/) rather than its predecessor Numarray for the Python API (more details at the Madagascar API wiki). If both Numpy and Numarray are installed, Madagascar uses Numpy; if only Numarray is installed, Madagascar continues to use Numarray.
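The preference order described above can be sketched in Python as a generic import fallback. This is only an illustration of the idea, not the code used in Madagascar; the function name pick_array_module is hypothetical.

```python
import importlib


def pick_array_module(candidates=("numpy", "numarray")):
    """Return (name, module) for the first candidate that can be imported.

    The default order prefers numpy and falls back to numarray,
    mirroring the behaviour described in the post (hypothetical helper).
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # not installed, try the next candidate
    raise ImportError("none of %s is installed" % ", ".join(candidates))
```

Calling pick_array_module() with the default arguments returns Numpy when it is available, Numarray when only that is installed, and raises ImportError when neither can be found.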

./configure; make; make install

September 13, 2006

If you are used to installing open-source software from source, running

./configure 
make 
make install

is almost a matter of habit. Now you can do that with Madagascar as well. A simple configure script checks the installation of Python and SCons (installing the latter if necessary), sets environment variables, runs scons config, and creates a Makefile. This setup is experimental and will be tested before appearing in the next stable release.
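The first of those steps, checking that the prerequisites are installed, might look roughly like the Python sketch below. It is not the actual configure script: the function name and the executable names "python" and "scons" are assumptions made for illustration.

```python
import shutil


def missing_prerequisites(tools=("python", "scons"), which=shutil.which):
    """Return the tools that cannot be found on the PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment (hypothetical helper).
    """
    return [tool for tool in tools if which(tool) is None]
```

A wrapper script could call missing_prerequisites() and install or report whatever comes back before going on to run scons config.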

Beardsley’s principles

July 15, 2006

Reginald Beardsley formulates seven design principles for a modern seismic processing system:

A modern production seismic processing system must satisfy several conflicting goals:

  • It must make efficient use of an ever changing, heterogeneous computing environment.
  • It must be easy to locate the appropriate modules from a large number of choices.
  • It must be easy to modify and test existing modules or write new ones without adversely impacting other users.
  • It must be impossible to run the system with invalid or incomplete input and easy to identify the parameters required by a module, what allowable values are and the significance of the parameter.
  • It must be possible to reproduce within roundoff error all operations on any data, at any time, even years later.
  • Unwanted or unexpected interactions between system components must not take place.
  • It must be possible to construct the system with limited resources.

To read more or to participate in the discussion, please subscribe to the RSF-user mailing list.

madagascar-0.9.3 released

June 23, 2006

New release madagascar-0.9.3 is out. The changes are mostly in configuration to enable compilation on different platforms. No need to upgrade unless you have had installation problems. Thanks to Naoki Saito (UC Davis) for feedback.
The installation was tested with the help of Sourceforge’s Compile Farm service. This release was verified to compile successfully on the following Compile Farm platforms:

  1. x86 – Linux 2.6 – Debian 3.1
  2. x86 – Linux 2.6 – Fedora FC2
  3. x86 – FreeBSD 5.4
  4. AMD64 – Linux 2.6 – Fedora Core 3
  5. Power 5 – Open Power 720 – SuSE Enterprise 9
  6. Sparc R 220 – Sun Solaris 9
  7. x86 – OpenBSD 3.8
  8. x86 – Solaris 9 (both with gcc and with Sun’s cc)

RSF/Madagascar on Windows using SFU

April 27, 2006

RSF has been successfully ported to Windows NT using Microsoft’s Services for UNIX:

screen shot.

Many thanks to Dave Hinkley for help and for sharing his expertise.

Nick Vlad on RSF and XML

March 2, 2006

Thoughts from Nick Vlad on RSF and XML

Provocative idea for the long term that you may post if it seems interesting: making the history files XML documents. Their “written by machine, sometimes read by a human” paradigm is exactly that of XML. This will not impede reading with existing programs, but will instead open intriguing possibilities. The machine would be able to construct a diagram of the processing history. One does not need to write C++ code to do graphical interfaces. Just transform XML into XHTML 1.0 Strict through an XSLT filter, or through a basic Python script (e.g. <sfwindow> into <div class="sfwindow">). Format with CSS. The history of the file all of a sudden becomes a nice suite of boxes on a webpage. One can do Fahrner Image Replacements in the CSS stylesheet to use specific icons for various (classes of) programs. We’ve just got ourselves a neat little “Processing History Explorer”!
But this is not all. Assume that each RSF program, when appropriately queried, would return a range of valid values for its parameters and a list of the file tags it needs, with intent (in, out, inout). Then it would be possible to automatically transform the XML information about a program into SVG drawings (again, with an XSLT stylesheet), with arrows connecting the processes/files. It is possible to do hyperlinks from SVG elements, so the user can open a window with an XHTML form with menus, radio buttons, etc., holding the settings of the program. Now, all of a sudden, we have more than a “Processing History Explorer”. Change the parameters, press submit, modify the flow, run it: the XML description of the processing flow gets transformed into an SCons file and runs on the remote machine!
The great advantages of doing graphical interfaces this way would be that: (A) they would be entirely platform-independent; (B) they would run as well on remote user machines as on the local host; (C) the number of people who can do web development is much greater than the number of people who know how to do GUI development, so it will be easier to get people to contribute.
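As a rough illustration of the “basic Python script” route mentioned above, the sketch below turns a made-up XML history into nested div elements using only the standard library. The XML layout (one element per program, parameters as attributes) and the function name are assumptions; RSF history files are not actually stored this way.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML history: one element per program, parameters as attributes.
EXAMPLE_HISTORY = """
<history>
  <sfspike n1="100" k1="50"/>
  <sfwindow n1="60" f1="20"/>
</history>
"""


def history_to_xhtml(history_xml):
    """Map each program element to <div class="programname"> with its parameters."""
    root = ET.fromstring(history_xml.strip())
    page = ET.Element("div", {"class": "history"})
    for step in root:
        # e.g. <sfwindow .../> becomes <div class="sfwindow">...</div>
        box = ET.SubElement(page, "div", {"class": step.tag})
        title = ET.SubElement(box, "h3")
        title.text = step.tag
        if step.attrib:
            params = ET.SubElement(box, "ul")
            for name, value in step.attrib.items():
                item = ET.SubElement(params, "li")
                item.text = "%s=%s" % (name, value)
    return ET.tostring(page, encoding="unicode")


print(history_to_xhtml(EXAMPLE_HISTORY))
```

Each program ends up in its own box whose class is the program name, so a CSS stylesheet can style or decorate the boxes per program without any change to the script.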

Scons tools

August 21, 2005

Several new SCons features have been added to help with development:

  • In the project SConstruct files, it is now possible to use the full name of a program (e.g. “sfwindow”) as well as the abbreviated name (e.g. “window”). Using the full name is handy when copying and pasting from or to the command line. Using the abbreviated (no-prefix) name saves some typing. Thanks to LIPS for the suggestion!
  • A new rule “scons result.flip” runs “xtpen Fig/result.vpl /locked/figures/result.vpl” to flip between the new and the locked figure. This is useful for detecting changes.
  • If you run “scons program.test” from the RSF/book directory (e.g. “scons sfwindow.test”), it will visit all project directories where the program is used and run “scons test” in them. This is useful for “unit testing”: making sure that the program passes all tests when modified.
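Accepting both the full and the abbreviated program name, as in the first item above, comes down to normalizing the “sf” prefix. A minimal Python sketch of that idea follows; the helper names are hypothetical and this is not the code used in the RSF SCons tools.

```python
def abbreviated_name(program, prefix="sf"):
    """Strip the prefix so that 'sfwindow' and 'window' compare equal.

    Limitation of this sketch: an abbreviated name that itself starts
    with the prefix would be stripped as well.
    """
    if program.startswith(prefix) and len(program) > len(prefix):
        return program[len(prefix):]
    return program


def full_name(program, prefix="sf"):
    """Return the prefixed name whichever form was given."""
    return prefix + abbreviated_name(program, prefix)
```

With these helpers, full_name("window") and full_name("sfwindow") both give "sfwindow", so a rule can be looked up the same way whichever spelling appears in the SConstruct file.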

Matlab interface

August 5, 2005

The Matlab interface to RSF is now complete. See an example in the RSF API guide.

rsfbook is dead, long live rsffigs

July 29, 2005

The rsfbook repository was getting too big with large figures and movies and that created a lot of difficulties. In a major reorganization effort, we have moved the human-edited part of the books (LaTeX and SConstruct files) under /rsf/book and created a separate repository rsffigs just for figures.
Check out the figure repository as follows:
svn co http://egl.beg.utexas.edu/svn/rsffigs $RSFROOT/figs
You can also use a different path and set the RSFFIGS environment variable. The usual scons test and scons lock commands should work.
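The lookup order implied above (RSFFIGS if set, otherwise the figs directory under RSFROOT) can be sketched in Python as follows. The function name is hypothetical and the actual RSF scripts may resolve the path differently.

```python
import os


def figure_root(environ=None):
    """Return the figure directory: $RSFFIGS if set, else $RSFROOT/figs."""
    env = os.environ if environ is None else environ
    if env.get("RSFFIGS"):
        return env["RSFFIGS"]
    rsfroot = env.get("RSFROOT")
    if not rsfroot:
        raise KeyError("set RSFFIGS or RSFROOT")
    return os.path.join(rsfroot, "figs")
```

Passing a plain dictionary instead of the real environment makes it easy to try both cases without changing any shell settings.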