grid point
eight Z
it surface
ont with
ity value G
els, and the
ted as there
1s of the fol-
(1)
ect surface
inding grid
he image
ed by intro-
dard devia-
linear in Z
ent is found
IDWARE
sthod within
reconstruc-
ively paral-
computing
from paral-
acting with
design the
al notation
| Technique
Nis purpose
'al-purpose
for design
1 has been
by coding
using only
al Machine
| for paral-
ed to mini-
munication
between the nodes of the parallel computer, which will also
strengthen the portability of the application.
The application is currently being developed and tested on
a TN330 parallel computer manufactured by Telmat Multin-
ode, France. The TN330 being used by the project has
68 nodes with 32 Mbytes of memory. Each node consist
of a superscalar T9000 processor capable of peak perfor-
mance of 200 MIPS and 25 MFLOPS. Each T9000 proces-
sor has four communication links which provide a total of 80
Mbytes/second bidirectional bandwith.
In the least squares adjustment the size and structure of the
normal equations matrix depend on the unknown quantities.
Even if the brightness values of each point on the surface are
reduced from the normal equations there still are millions of
unknowns in the system. The magnitude of the system im-
plies a demand for much memory and fast computing. This
problem is tackled in four ways:
1. the unknown intensity values G in (1) are eliminated
in the matching,
2. instead of using a regular DEM grid a variable sized
grid network following breaklines will be used to cut
down the number of unknowns (c.f. section 6),
3. optimizing the solving of the normal equations (c.f.
section 5), and
4. the system is divided into smaller geographical parts
and processed in parallel.
5 THE SOLUTION OF THE NORMAL EQUATIONS
The most computation intensive part of the system is the so-
lution of the normal equations:
Nx=h
Here, the matrix N of the normal equations is obtained from
the design matrix A and the diagonal weight matrix W.
N=ATWA, h=ATW]
The system includes two models and their mutual depen-
dencies. One of the models represent the DEM grid points,
each dependent on all eight grid points around it. The other
one of the two models represent the orientation parameters
of each image. The matrix N can be dissected into blocks
corresponding to the two models:
Na Np
N-
( Na Np
Here, the submatrix N;; corresponds to the DEM, which im-
plies that its dimension is equal to the number of grid points
and that there are nine non-zero elements on each row (if
only regular DEM's are used). If the matrix N is formed by
sweeping through the DEM row by row, a diagonal band with
a semi-bandwidth of one and two bands further away from
the diagonal symmetrically around it are obtained.
The submatrix N,, corresponds to the image orientations,
which gives the submatrix a dimension of 6 - Nimages- Since
the orientation parameters of different pictures are not de-
pendent on each other, N5; consists of blocks of size 6 x 6.
In the calculations for storage and operation count, the sub-
matrix is assumed banded with a semi-bandwidth of 5.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
Since the system is overdetermined, the matrix A is a full
rank matrix and the normal equations’ matrix N is positive
definite, which implies that there are a number of methods
to solve the equation Nx = h.
5.1 Solution algorithms
Because the matrix N is large, solution methods which do
not require a great deal of memory are preferred. In gen-
eral, N is stored in a sparse matrix format, and whenever
possible, the symmetry of N is exploited by storing only the
upper triangular of N.
The operation counts of various solution methods can be es-
timated using formulae given in (Pissanetzky 1984). For the
factorization of a banded matrix, it gives:
!B(B+3)n-1B*-B?-2B multiplications and divisions
1B(B-Dn-ip?-19?- 1p additions
Here, n is the dimension and B is the semi-bandwidth of the
matrix. Similarly, the following formulae are given for the the
forward and back substitution of a banded matrix:
(@B+1)n-B?-B multiplications and divisions
2Bn-B?-B additions
Direct solution by Cholesky block factorization. Out of
the direct solution methods, Cholesky block factorization is
a very beneficial method in this case (George & Liu 1981,
Pissanetzky 1984). In that method, the matrix is dissected
into blocks and thus subsystems which are then factorized
independently. The mutual dependencies of the diagonal
blocks are moved to the far right in the submatrix Nj; and
thus joined to the block N;5, which is not actually factorized
but otherwise taken into account in the solution of the lower
part of the equation.
Several grid ordering schemes: the Reverse Cuthill - McKee
scheme, the Minimum Degree algorithm and the Nested Dis-
section algorithm have been suggested for the reduction of
matrix fill-in in the factorization phase. Fill-in reduction has
two positive effects: on one hand, memory is saved, and on
the other hand, both the factorization and the forward and
back substitutions demand fewer operations than if no spe-
cific ordering scheme were used.
The direct solution is parallelized by spawning two tasks on
every node and sending a slice of N; to one of them and
a slice of both Nj; and Nj, to the other one. For symme-
try reasons, the block N,; is not stored at all, and only the
upper triangular of the block Nj; is stored. The whole block
Ny; is stored, which makes it possible to perform an efficient
parallelized Gauss elimination on it. The blocks Nj; and Nj;
are stored in sets of linked lists to save memory in the fac-
torization process. In the block factorization, the block N;,
becomes essentially full, and it is thus stored in an array.
For a Gauss elimination on p processors, the following op-
eration counts can be obtained:
3 Le Hia iud
CAM In? +(3- z)n multiplications and divisions
n "ant n si
o I additions
In the following, the blocks Nj; and N5; are given dimen-
sions n; and nm, respectively. The block N;; is assumed
banded with a semi-bandwidth of B — m which is a worst-
case approximation. Due to the block formation, a growth
of 2(,/p—1),/n is expected in the dimension n,. The oper-
ation count for the direct solution is hence proportional to the
333