Visualization Symposium, IEEE Pacific (2014)
Yokohama, Japan
Mar. 4, 2014 to Mar. 7, 2014
pp: 281-285
Chongke Bi , RIKEN, Wako, Japan
Kenji Ono , RIKEN, Wako, Japan
The development of supercomputers has made it possible to carry out complicated simulations that produce rapidly growing datasets. For visualizing such large-scale datasets, reducing the data size with compression methods is one of the most useful approaches. Moreover, parallelizing the compression algorithm can greatly improve efficiency and overcome the limitation of memory size. However, a parallel compression algorithm cannot avoid interprocessor communication, which is also a bottleneck, especially in the general case where the number of processors is not a power of two. The parallel POD (proper orthogonal decomposition) compression algorithm is one such example: the number of time steps must be a power of two for the binary swap scheme. A method that fully resolves this problem at low computational cost would be highly desirable. In this paper, we propose such an approach, called the 2-3-4 combination approach, which is simple to implement and achieves high performance for parallel computing algorithms. Furthermore, our method obtains the best load balance among all parallel computing processors. This is achieved by transforming the non-power-of-two problem into a power-of-two problem, so that the balanced nature of the binary swap method can be fully exploited. We evaluate our approach by applying it to the parallel POD compression algorithm on the K computer.
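The abstract does not spell out how the 2-3-4 combination is constructed, so the following is only a minimal sketch of one way a non-power-of-two count could be mapped onto a power-of-two number of groups. It assumes that the n items (processors or time steps) are partitioned into groups of size 2, 3, or 4, and that the groups then act as the leaves of a binary swap tree. The function name `group_sizes` and the specific partitioning rule are hypothetical and are not taken from the paper.

```python
def group_sizes(n: int) -> list[int]:
    """Split n items into a power-of-two number of groups of size 2, 3, or 4.

    Hypothetical illustration only; the paper's actual construction may differ.
    """
    if n < 2:
        raise ValueError("need at least 2 items to form a group")
    # Largest power of two p such that 2 * p <= n, hence 2p <= n < 4p.
    p = 1 << (n.bit_length() - 2)
    # Spread the items as evenly as possible: every group gets `base` items
    # (2 or 3) and the first `extra` groups get one more (3 or 4).
    base, extra = divmod(n, p)
    return [base + 1] * extra + [base] * (p - extra)


if __name__ == "__main__":
    for n in (5, 7, 10, 15):
        print(n, group_sizes(n))
```

Under this assumption, group sizes differ by at most one item, and the number of groups is always a power of two, so a standard binary swap could be run across the groups.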
Program processors, Computers, Vectors, Compression algorithms, Parallel processing, Binary trees, Computational modeling, Parallelism and concurrency, 2-3-4 combination, non-power-of-two, compression, proper orthogonal decomposition, K computer, Approximate methods
Chongke Bi, Kenji Ono, "2-3-4 Combination for Parallel Compression on the K Computer", Visualization Symposium, IEEE Pacific, vol. 00, pp. 281-285, 2014, doi:10.1109/PacificVis.2014.28