In the first synthetic example, we perform viscoacoustic modeling on a multi-scale layered model, using a single GeForce GTX760 GPU and a single core of an Intel Core i5-4460 CPU for speedup comparison. As shown in Figure 3, the scale of these layered models varies from 128 × 128 to 2048 × 2048 grids. At each model scale, we record the mean runtime per time step of viscoacoustic modeling on a single CPU core and on a single GPU, together with the corresponding speedup ratio; the results are presented in Table 2. The CPU-based simulation is compiled with the GNU C++ compiler (g++ 4.8.4) and uses FFTW 3.3.2. The GPU-based simulation is implemented in CUDA C with the CUFFT library API. Figure 4 plots the mean runtime per time step and the corresponding speedup ratio against model scale. It indicates that the presented cu-RTM package running on a single GPU card is roughly 50 to 80 times faster than the conventional implementation on a single CPU core. Furthermore, the speedup ratio grows with model scale, from about 53 at 128 × 128 grids to about 83 at 2048 × 2048 grids.
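As an illustration of how a mean runtime per time step can be measured, the following is a minimal C++ sketch using `std::chrono`. It is not taken from the cu-RTM source: the helper name `mean_runtime_ms` and the `step` callback are hypothetical, and the authors' actual timing method is not stated in the text.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of cu-RTM): runs `step` for `n_steps`
// iterations and returns the mean wall-clock time per iteration in ms.
template <typename Step>
double mean_runtime_ms(Step&& step, std::size_t n_steps) {
    assert(n_steps > 0);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals

    const auto t0 = clock::now();
    for (std::size_t it = 0; it < n_steps; ++it) {
        step(it);  // one time step of the simulation
    }
    const auto t1 = clock::now();

    // Convert the elapsed interval to milliseconds as a double.
    const std::chrono::duration<double, std::milli> elapsed = t1 - t0;
    return elapsed.count() / static_cast<double>(n_steps);
}
```

When the callback launches GPU kernels, the launches return asynchronously, so a host-side timer like this one is only meaningful if the device is synchronized (for example with `cudaDeviceSynchronize()`) before the second clock reading.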
Figure 3. Velocity models for the multi-scale layered model.
Model Scale (grids) | 128 × 128 | 256 × 256 | 512 × 512 | 1024 × 1024 | 2048 × 2048 |
---|---|---|---|---|---|
CPU Runtime (ms) | 9.7170 | 43.5925 | 101.3938 | 359.0682 | 1855.8382 |
GPU Runtime (ms) | 0.1839 | 0.8195 | 1.8262 | 6.3267 | 22.3263 |
Speedup Ratio | 52.8385 | 53.1940 | 55.5217 | 56.7544 | 83.1234 |
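The speedup ratio in the last row is the CPU runtime divided by the GPU runtime at the same model scale. A minimal C++ sketch of that arithmetic follows; the function names `speedup_ratio` and `speedup_ratios` are illustrative and are not part of the cu-RTM package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Speedup ratio for one model scale: CPU runtime over GPU runtime.
// Both runtimes must be in the same unit (milliseconds in the table).
double speedup_ratio(double cpu_ms, double gpu_ms) {
    assert(gpu_ms > 0.0);
    return cpu_ms / gpu_ms;
}

// Element-wise speedup ratios for a series of model scales.
std::vector<double> speedup_ratios(const std::vector<double>& cpu_ms,
                                   const std::vector<double>& gpu_ms) {
    assert(cpu_ms.size() == gpu_ms.size());
    std::vector<double> ratios;
    ratios.reserve(cpu_ms.size());
    for (std::size_t i = 0; i < cpu_ms.size(); ++i) {
        ratios.push_back(speedup_ratio(cpu_ms[i], gpu_ms[i]));
    }
    return ratios;
}
```

Applied to the tabulated runtimes, for instance 9.7170 ms and 0.1839 ms at 128 × 128 grids, this reproduces the listed ratio of about 52.84.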
Figure 4. The mean runtime per time step of viscoacoustic modeling using a single GTX760 GPU relative to a single core of a four-core Intel Core i5-4460 CPU, and the corresponding speedup ratio against model scale.