4.1. Fiducial detection
The interior orientation of an aerial image can be automated if
the fiducial marks can be detected without human interaction.
Camera manufacturers use specific figures as fiducials; their
geometry is known, so they can be described as graphs. The
rough skeleton of the fiducial mark was drawn as a graph
(Figure 2a).
In the first application a color Wild RC20 aerial camera image
was split into its RGB components, and the red channel was then
segmented by histogram thresholding. The pixel coordinates of
the resulting binary image were the data points for the test run.
Because the fiducials of this camera type are in the corners of
the images, only small corner patches of the image were cut out
and preprocessed. The result of the algorithm can be seen in
Figure 2b.
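The preprocessing described above can be illustrated with a minimal sketch. It assumes that the image patch is stored as rows of (r, g, b) tuples and that the fiducial appears brighter than the background in the red channel. The function names, the toy patch and the threshold value of 128 are hypothetical; the paper derives its threshold from the histogram and does not report the value.

```python
def red_channel(image):
    """Return the red component of an RGB image given as rows of (r, g, b) tuples."""
    return [[pixel[0] for pixel in row] for row in image]


def threshold_points(channel, threshold):
    """Return (row, col) coordinates of all pixels brighter than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(channel)
        for c, value in enumerate(row)
        if value > threshold
    ]


# Toy 3x3 corner patch (hypothetical values): a bright cross on a dark background.
patch = [
    [(10, 10, 10), (200, 40, 40), (10, 10, 10)],
    [(200, 40, 40), (220, 50, 50), (200, 40, 40)],
    [(10, 10, 10), (200, 40, 40), (10, 10, 10)],
]

# The coordinates of the "true" pixels are the data points fed to the graph.
points = threshold_points(red_channel(patch), threshold=128)
```

The resulting coordinate list plays the role of the data points of the binary image mentioned in the text.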
The ordering phase had 100 epochs, with the learning rate
decreasing from 0.9 to 0.0 and the adjacency distance
decreasing from 4 to 1 (direct neighborhood). The tuning phase
had 1000 epochs, with the learning rate decreasing from 0.1 to
zero and the neighborhood reduced from 2 to 1.
a. initial graph structure
b. final position
Figure 2. Search of fiducials by SONG.
4.2. Detecting building structure
The detection of man-made objects very often focuses on
buildings, so the SONG method was tested on such tasks. Two
experiments were executed: (1) the correct position of a
building with a given structure had to be found, and (2) the
structure of a given (positioned) building had to be detected.
The first test applied one of the first images taken of the
Pentagon building in Washington DC after the attack on 11
September 2001. The image was captured by the QuickBird
sensor with a ground resolution of 0.6 m. The initial neuron
graph was given, since the structure of the famous building is
known (Figure 3a). The input data points were produced by a
maximum likelihood classification of the color image pixels, for
which training areas of two roof types were marked and used.
The image classification gave a binary image in which the
true pixels were the elements of the roof (Figure 3b). The
coordinates of these pixels were read out and fed into the SONG
algorithm. The ordering phase of the algorithm had 300 epochs;
the starting learning rate and neighborhood were 0.01 and 6,
while the final values were 0 and 1 (direct neighborhood),
respectively. In the starting step (Figure 3c) the 20 neurons of
the graph were placed somewhere within the building; after the
10000-step tuning (with the learning rate decreasing from 0.003
to 0 and a strict direct neighborhood) the graph found the
building (Figure 3d). During the tuning phase the direct
neighborhood ensured that the neurons merely refined their
geometrical positions instead of making rough changes.
The iterative evaluation of the 3302 roof pixels took only a
couple of minutes on an Intel PIII machine (Barsi 2003b).
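The role of the neighborhood can be illustrated with a single update step. The sketch below assumes a Kohonen-style rule in which the winner neuron and all neurons within a given number of graph steps are moved toward the data point; this section does not spell out the update formula, so the rule, the function names and the 5-neuron ring graph are assumptions for illustration, not the 20-neuron Pentagon graph.

```python
from collections import deque


def graph_distances(adjacency, source):
    """Breadth-first hop counts from source to every reachable neuron."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def song_step(weights, adjacency, point, rate, max_hops):
    """Move the winner and its graph neighbors (<= max_hops) toward one point."""
    winner = min(
        range(len(weights)),
        key=lambda i: (weights[i][0] - point[0]) ** 2
        + (weights[i][1] - point[1]) ** 2,
    )
    for i, hops in graph_distances(adjacency, winner).items():
        if hops <= max_hops:
            x, y = weights[i]
            weights[i] = (x + rate * (point[0] - x), y + rate * (point[1] - y))
    return winner


# Hypothetical ring of 5 neurons; each neuron is linked to its two neighbors.
adjacency = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
weights = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]

# With max_hops=1 (direct neighborhood) only the winner and its two
# adjacent neurons move; the rest of the graph keeps its position.
winner = song_step(weights, adjacency, point=(1.0, 2.0), rate=0.5, max_hops=1)
```

With a small learning rate and `max_hops=1`, each step shifts only a local part of the graph by a small amount, which matches the refining behavior described for the tuning phase.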
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B3. Istanbul 2004
a. initial neuron graph
c. the starting neuron graph position
d. neuron graph after the finished iterations
Figure 3. The Pentagon test
The other building analysis test used another satellite
image: a 1 m IKONOS image of Singapore, taken in August
2000. The input data set was established by a rule-based RGB
pixel classification, again focusing on the roof. To obtain a
smaller training data set, the identified points were
resampled. The test applied four given graph structures with
11, 13, 19 and 21 neurons as nodes (Figure 4).
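The data preparation for this test can be sketched as follows. The paper specifies neither the classification rule nor the resampling method, so the rule in `classify_roof` (reddish pixels with limited blue), its limits and the every-n-th-point resampling are purely hypothetical placeholders.

```python
def classify_roof(pixel, min_red=150, max_blue=120):
    """Hypothetical rule: a roof pixel is reddish with limited blue."""
    r, g, b = pixel
    return r >= min_red and b <= max_blue and r > g


def roof_points(image):
    """Return (row, col) coordinates of all pixels accepted by the rule."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, pixel in enumerate(row)
        if classify_roof(pixel)
    ]


def resample(points, step):
    """Keep every step-th point to reduce the training data set."""
    return points[::step]


# Toy 2x4 image with made-up pixel values.
image = [
    [(200, 100, 50), (90, 90, 90), (180, 120, 100), (160, 170, 60)],
    [(210, 90, 40), (30, 30, 30), (155, 100, 119), (250, 10, 10)],
]

all_points = roof_points(image)
training_points = resample(all_points, step=2)
```

Thinning the point set reduces the number of distance evaluations per epoch, which is presumably why the points were resampled before training.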
Figure 4. Detecting building structure (Singapore) in IKONOS
image: an intermediate state with the 13-neuron graph
The four variations were controlled quite similarly: the ordering
phase had about 200-600 epochs and the tuning phase 500-3000.
The learning rate ranged from 0.01 (ordering) down to 0.00001
(tuning). In the ordering phase the starting neighborhood was
increased from 4 to 10 as the complexity of the neuron graph
grew; the opposite tendency was observed in the tuning phase,
where the neighborhood was decreased from 2 to 1 (Barsi 2003c).
b. the classified roof pixels (panel of Figure 3)
4.3. De!
The pre
a runni
structur
process
image
orthoph
off was
internat
The in
intensit
parame
in Figui
Figure
The rc
applica
previou