Viscoacoustic modeling on a layered model

In the first synthetic example, we perform viscoacoustic modeling on a multi-scale layered model, using a single GeForce GTX760 GPU and a single core of an Intel Core i5-4460 CPU to compare speed. As shown in Figure 3, the layered models range in size from $128\times 128$ to $2048\times 2048$ grid points. At each model scale, we record the mean runtime per time step on the single CPU core and on the single GPU, together with the corresponding speedup ratio (CPU runtime divided by GPU runtime); the results are presented in Table 2. The CPU-based simulation is compiled with the GNU C++ compiler (g++ 4.8.4) and uses FFTW 3.3.2. The GPU-based simulation is implemented in CUDA C with the cuFFT library API. Figure 4 plots the mean runtime per time step and the corresponding speedup ratio against model scale. The results indicate that the presented cu-RTM package running on a single GPU card is roughly 50 to 80 times faster than the conventional implementation on a single CPU core. Furthermore, the speedup ratio grows with model scale, from about 53 at $128\times 128$ to about 83 at $2048\times 2048$.
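The text does not describe how the mean runtime per time step was measured. As a minimal sketch of one common approach, the helper below times a fixed number of calls to a step function and divides by the call count. The names `mean_runtime_per_step_ms`, `dummy_step`, `n_steps`, and `n_warmup` are hypothetical and are not part of cu-RTM; the placeholder step only stands in for one time step of a real solver.

```python
import time


def mean_runtime_per_step_ms(step, n_steps, n_warmup=5):
    """Return the mean wall-clock time of one call to `step`, in milliseconds.

    `step` is any zero-argument callable that advances the simulation by one
    time step (hypothetical interface, assumed for illustration).
    """
    # Warm-up calls are excluded from the measurement so that one-off costs
    # (allocation, plan creation, cache effects) do not skew the mean.
    for _ in range(n_warmup):
        step()

    start = time.perf_counter()
    for _ in range(n_steps):
        step()
    elapsed_s = time.perf_counter() - start

    return 1000.0 * elapsed_s / n_steps


def dummy_step():
    """Placeholder workload standing in for one modeling time step."""
    return sum(i * i for i in range(1000))


mean_ms = mean_runtime_per_step_ms(dummy_step, n_steps=100)
print(f"mean runtime per step: {mean_ms:.4f} ms")
```

When the step launches GPU work, kernel launches return to the host asynchronously, so the device would need to be synchronized before the final clock read for the measurement to be meaningful.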


Figure 3. Velocity models for the multi-scale layered model.

**Table 2:** Mean runtime per time step of viscoacoustic modeling on a single GTX760 GPU and on a single core of the four-core Intel Core i5-4460 CPU, with the corresponding speedup ratio, at each model scale.

| Model scale (grid points) | $128\times 128$ | $256\times 256$ | $512\times 512$ | $1024\times 1024$ | $2048\times 2048$ |
| --- | --- | --- | --- | --- | --- |
| CPU runtime (ms) | 9.7170 | 43.5925 | 101.3938 | 359.0682 | 1855.8382 |
| GPU runtime (ms) | 0.1839 | 0.8195 | 1.8262 | 6.3267 | 22.3263 |
| Speedup ratio | 52.8385 | 53.1940 | 55.5217 | 56.7544 | 83.1234 |
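
The speedup ratios in Table 2 can be checked by dividing each CPU runtime by the matching GPU runtime. The short sketch below does this with the values copied from the table; the variable and function names are illustrative only. Because the tabulated runtimes are rounded to four decimals, the recomputed ratios may differ from the table in the last digit.

```python
# Mean runtime per time step in milliseconds, copied from Table 2.
grid_sizes = [128, 256, 512, 1024, 2048]
cpu_ms = [9.7170, 43.5925, 101.3938, 359.0682, 1855.8382]
gpu_ms = [0.1839, 0.8195, 1.8262, 6.3267, 22.3263]


def speedup_ratios(cpu, gpu):
    """Return the element-wise ratio of CPU runtime to GPU runtime."""
    return [c / g for c, g in zip(cpu, gpu)]


ratios = speedup_ratios(cpu_ms, gpu_ms)

for n, ratio in zip(grid_sizes, ratios):
    print(f"{n} x {n}: speedup = {ratio:.2f}")

# The ratio increases with every doubling of the grid size in this data set.
is_increasing = all(a < b for a, b in zip(ratios, ratios[1:]))
print("speedup grows with model scale:", is_increasing)
```

The largest jump occurs between $1024\times 1024$ and $2048\times 2048$, where the CPU runtime grows by a factor of about 5.2 while the GPU runtime grows by a factor of about 3.5.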


Figure 4. Mean runtime per time step of viscoacoustic modeling on a single GTX760 GPU and on a single core of the four-core Intel Core i5-4460 CPU, with the corresponding speedup ratio, plotted against model scale.

2020-04-03