
Published as SIAM Journal on Scientific Computing, 35, C194-C212, (2013)

A parallel sweeping preconditioner for heterogeneous 3D Helmholtz equations

Jack Poulson*, Björn Engquist†, Siwei Li‡, and Lexing Ying§

*ICES, University of Texas at Austin, 1 University Station C0200, Austin, TX, 78712 (jack.poulson@gmail.com). This author was also supported by a CAM fellowship.

†Department of Mathematics and ICES, University of Texas at Austin, 1 University Station C1200, Austin, TX, 78712 (engquist@ices.utexas.edu). This author was also supported by NSF grant DMS-1016577.

‡Jackson School of Geosciences, University of Texas at Austin, 1 University Station C1160, Austin, TX, 78712 (siwei.li@utexas.edu).

§Department of Mathematics and ICES, University of Texas at Austin, 1 University Station C1200, Austin, TX, 78712 (lexing@math.utexas.edu). This author was supported by NSF CAREER grant DMS-0846501, NSF grant DMS-1016577, and funding from KAUST.


Abstract:

A parallelization of a sweeping preconditioner for 3D Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be $O(\gamma^2 N^{4/3})$ and $O(\gamma N \log N)$, respectively, where $\gamma(\omega)$ denotes the modestly frequency-dependent number of grid points per Perfectly Matched Layer. Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores. Two open-source packages are released along with this paper: Parallel Sweeping Preconditioner (PSP) and the underlying distributed multifrontal solver, Clique.
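
To give a feel for the two cost estimates quoted in the abstract, the following minimal sketch evaluates them with all hidden constants dropped. It assumes, for illustration only, that $N = n^3$ is the total number of grid points of an $n \times n \times n$ grid; the function name `sweeping_cost_estimates` and the sample values of `n` and `gamma` are hypothetical and are not taken from PSP or Clique.

```python
import math


def sweeping_cost_estimates(n, gamma):
    """Evaluate the asymptotic cost expressions with constants dropped.

    Assumption (not from the paper's code): N = n**3 grid points.
    The returned numbers are only meaningful as ratios, not as flop counts.
    """
    N = n ** 3
    setup = gamma ** 2 * N ** (4.0 / 3.0)  # O(gamma^2 N^{4/3})
    # The base of the logarithm only changes the hidden constant.
    apply = gamma * N * math.log(N)        # O(gamma N log N)
    return setup, apply


if __name__ == "__main__":
    gamma = 5  # hypothetical number of grid points per PML
    for n in (100, 200, 400):
        setup, apply = sweeping_cost_estimates(n, gamma)
        print(f"n={n:4d}  setup~{setup:.3e}  apply~{apply:.3e}")
```

Under these assumptions, doubling `n` with `gamma` held fixed multiplies the setup estimate by 16, since $N^{4/3} = n^4$, while the application estimate grows by a factor slightly larger than 8.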





2014-08-20