A sparse nonsymmetric eigensolver for distributed memory architectures
Authors: Mario R. Guarracino (a); Francesca Perla (a); Paolo Zanetti (b)
Affiliations:
(a) Institute for High Performance Computing and Networking, Italian National Research Council, Naples, Italy
(b) University of Naples Parthenope, Naples, Italy
DOI: 10.1080/17445760701640324
Publication Frequency: 6 issues per year
Published in: International Journal of Parallel, Emergent and Distributed Systems, Volume 23, Issue 3, June 2008, pages 259-270
First Published: June 2008
Subjects: Algorithms & Complexity; Computer Engineering; Computer Science (General); Distributed Network Systems; Distributed Systems; Internet & Multimedia; Neural Networks; Parallel Algorithms; Parallel Systems; Programming & Programming Languages; Quantum Information; Systems & Computer Architecture
Previously published as: Parallel Algorithms and Applications (1063-7192) until 2005
Abstract
In this work, we propose an efficient parallel implementation of the nonsymmetric block Lanczos algorithm for computing a few extreme eigenvalues, and the corresponding eigenvectors, of real nonhermitian matrices on distributed memory multicomputers. The reorganisation of the block Lanczos algorithm that we implement makes it possible to exploit coarse-grained parallelism and to harness the computational power of the target architectures. The computational kernels of the algorithm are matrix-matrix multiplications, with dense and sparse factors, QR factorisation and singular value decomposition. To reduce the total amount of communication involved in the matrix-matrix multiplication with a sparse factor, we substitute each matrix appearing in the algorithm with its transpose. Then, we develop an efficient parallelisation of the matrix-matrix multiplication when the second factor is sparse. Some other linear algebra operations are performed using the ScaLAPACK library. The parallel eigensolver has been tested on a cluster of PCs. All reported results show that the proposed algorithm is efficient on the target architectures for problems of adequate dimension.
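The transposition step described in the abstract rests on the identity (AX)^T = X^T A^T: a product with the sparse matrix as first factor becomes a product with a sparse second factor. The following is a minimal serial sketch of that identity in Python with NumPy, not the authors' implementation. The helper names `dense_to_csr` and `dense_times_csr`, the compressed sparse row (CSR) storage and the random test matrix are all assumptions made for illustration, and the sketch does not model the data distribution or communication of the parallel algorithm.

```python
import numpy as np

def dense_to_csr(S):
    """Convert a dense 2-D array to CSR arrays (data, indices, indptr).

    Hypothetical helper for this sketch only.
    """
    data, indices, indptr = [], [], [0]
    for row in S:
        nz = np.nonzero(row)[0]        # column indices of the nonzeros in this row
        data.extend(row[nz])
        indices.extend(nz)
        indptr.append(len(indices))    # end of this row in the data/indices arrays
    return (np.array(data, dtype=float),
            np.array(indices, dtype=int),
            np.array(indptr, dtype=int))

def dense_times_csr(D, data, indices, indptr, n_cols):
    """Compute D @ S, where the second factor S is sparse and stored in CSR form."""
    out = np.zeros((D.shape[0], n_cols))
    for k in range(len(indptr) - 1):                 # row k of S
        for pos in range(indptr[k], indptr[k + 1]):  # nonzeros S[k, j]
            j = indices[pos]
            out[:, j] += D[:, k] * data[pos]         # add S[k, j] times column k of D
    return out

rng = np.random.default_rng(0)
n, p = 8, 3                                          # assumed sizes: matrix order, block width
A = rng.standard_normal((n, n)) * (rng.random((n, n)) < 0.3)  # sparse nonsymmetric test matrix
X = rng.standard_normal((n, p))                      # dense block of p vectors

At_csr = dense_to_csr(A.T)                           # store the transpose of A in CSR form
Yt = dense_times_csr(X.T, *At_csr, n_cols=n)         # Y^T = X^T A^T, sparse second factor
```

Here `Yt` holds the transpose of the block AX, so a Lanczos-type iteration written in terms of transposed blocks only ever needs the dense-times-sparse kernel.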
Keywords: 65F15; 65F50; 68W10