We have proposed the "P-scheme" for solving the recurrence equations of a tridiagonal linear system on distributed-memory parallel computers, but its effectiveness is limited to sufficiently large problems. This limitation is mainly due to the communication cost of the propagation phase of the P-scheme.
To overcome this difficulty, we apply "message vectorization", which aggregates several communication messages into one, to reduce the communication cost of the P-scheme, and we evaluate its effectiveness for a tridiagonal matrix solver. Our experiments show that the improved P-scheme works well for smaller problems in distributed environments such as PC clusters, and that linear and super-linear speedups can be achieved for 8194 × 8194 and 16386 × 16386 problems, respectively.
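
As a rough illustration of why aggregating messages can help, the sketch below compares the cost of sending many small messages separately with the cost of sending them as one packed message. It assumes a simple latency-plus-bandwidth cost model; the function names and all parameter values are hypothetical and are not taken from the experiments reported here.

```python
def separate_cost(num_messages, words_per_message, latency, per_word):
    """Time to send each message on its own: the startup latency is paid every time."""
    return num_messages * (latency + per_word * words_per_message)


def vectorized_cost(num_messages, words_per_message, latency, per_word):
    """Time to send the same payload packed into one message: latency is paid once."""
    return latency + per_word * num_messages * words_per_message


# Hypothetical parameters (assumptions for illustration only).
latency = 50e-6    # startup cost per message, in seconds
per_word = 0.1e-6  # transfer cost per word, in seconds
k = 64             # number of small messages to aggregate
w = 2              # words carried by each small message

separate = separate_cost(k, w, latency, per_word)
packed = vectorized_cost(k, w, latency, per_word)

print(f"separate messages: {separate:.6e} s")
print(f"one packed message: {packed:.6e} s")
print(f"saved: {separate - packed:.6e} s")
```

Under this model the saving is (k - 1) times the per-message latency, which suggests why aggregation matters most when messages are small and latency dominates, as on a PC cluster.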