Systolic SVD and QR Decomposition by Householder Reflections
Authors:
D. J. Evans a;
M. Gusev b
| Affiliations: | a Parallel Algorithms Research Centre University of Technology Loughborough, Leics., U.K.. |
| b University "Kiril i Metodij" of Skopje, PMF Institut za Informatika, Arhimedova 5, p.f. 162, YU-91000 Skopje, Macedonia, Yugoslavia. |
DOI:
10.1080/00207160210943
Publication Frequency:
15 issues per year
Published in:
International Journal of Computer Mathematics,
Volume
79,
Issue
4
2002
, pages 417
- 439
Subjects:
Analysis - Mathematics;
Bioinformatics;
Computer Mathematics;
Discrete Mathematics;
Mathematical Finance;
Mathematical Logic;
Mathematical Numerical Analysis;
Systems & Computer Architecture;
Number of References: 20
Formats available:
PDF
(English)
View Article:
View Article (PDF)
Abstract
SVD and QR Decomposition have attracted much attention recently for solving linear algebra and digital signal processing problems. The House-holder QR Decomposition requires a smaller number of operations then the Givens QR Decomposition and therefore is strongly recommended for sequential computer implementations (as found in nearly all known libraries like NAG, LINPACK, EISPACK etc.) However for parallel implementation it has not been widely used, because the Givens Rotations possess inherent parallelism and the Householder Reflections do not.
This is true if one analyzes the original algorithm. Four independent loops and three bottlenecks between the loops are constraint for pipelining the computations. This leads to an inefficient solution, i.e. 4n time moments for each iteration and since there are n iterations then the total time to execute the algorithm is O(4n 2 ). Compared to 3n -2 time steps for the Givens QR Decomposition it is uneconomic. In this paper we have used some recent results for the elimination of the computational and data broadcast, and data synchronization to derive a fully localized form of the Householder QR Decomposition algorithm. We have succeeded in reorganizing and transforming the algorithm from three bottlenecks to one and four loops to one. A linear and a double pipeline array of n+1 processors are presented to solve the problem in O ( n 2 /2) time steps. It is also shown that the bottlenecks for the bidiagonalization and SVD computation cannot be eliminated. |
| Keywords: Singular Value Decomposition; Qr Decomposition; Householder Reflections; Computational Broadcast Elimination; Data Broadcast Elimination; Data Dependence; Algorithm Transformation |
| view references (20) |

Download Citation

CiteULike
Del.icio.us
BibSonomy
Connotea