Figure 1. Object-based scene representation of an MSS image
2. MODELLING AND DEFINITIONS 
The scene (in this work assumed to be part of the Earth's surface) is the
target of the remote sensing system; it is under investigation, and the
interest is in extracting information about its structure and content (Tso
and Mather, 2001). The desired information is assumed to be contained in
the spectral, spatial, and temporal variation of the electromagnetic energy
coming from the scene, which is gathered by the sensors (Hapke, 1993).
Typically a complex scene is composed of relatively simple objects of
different sizes and shapes, each of which contains only one class of
surface cover type. The scene is often described by classifying the objects
and recording their relative positions and orientations in the scene in
terms of tabulated results and/or a thematic map.
In a remote sensing system, primary features of a scene are formed by
multispectral observations, which are accomplished by spatially and
spectrally sampling the scene. A multispectral sensor samples several
spectral dimensions and one spatial dimension from the scene at a given
instant of time. The second spatial dimension can be provided by the motion
of the platform which carries the scanner over the region of interest,
generating a raster scan; alternatively, the raster can be provided by an
area-array detector. Thus, through the data acquisition system, the scene
may be viewed in an image form taken at each of a number of
electromagnetic wavelengths. This image can be thought of as a multi-layer
matrix whose elements are called pixels (Tso and Mather, 2001). One of the
important characteristics of such data is the special nature of the
dependence of the feature at a lattice point on that of its neighbours. The
unconditional correlation between two pixels in spatial proximity to one
another is often high, and such correlation usually decreases as the
distance between the pixels increases.
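The fall-off of correlation with inter-pixel distance is easy to
demonstrate numerically. The following Python/NumPy sketch is illustrative
only: the synthetic image, the number of bands, and the box-filter width
are assumptions, not data from this paper. It builds the multi-layer
matrix described above and measures scan-line correlation at several lags.

import numpy as np

rng = np.random.default_rng(0)

# rows x columns x spectral bands: each element of the multi-layer
# matrix is a pixel vector of length n_bands (sizes assumed).
n_rows, n_cols, n_bands = 128, 128, 4

# White noise smoothed along the scan line with a 9-sample box filter,
# so nearby pixels share information (a stand-in for real scene data).
noise = rng.normal(size=(n_rows, n_cols + 8, n_bands))
kernel = np.ones(9) / 9.0
image = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"),
                            1, noise)

def scanline_correlation(img, lag, band=0):
    """Correlation between pixels `lag` samples apart along scan lines."""
    a = img[:, :-lag, band].ravel()
    b = img[:, lag:, band].ravel()
    return np.corrcoef(a, b)[0, 1]

for lag in (1, 2, 4, 8, 16):
    print(f"lag {lag:2d}: r = {scanline_correlation(image, lag):+.3f}")

Running this prints a correlation near 0.9 at lag 1 that decays towards
zero once the lag exceeds the smoothing width, mirroring the behaviour
described above.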
One of the distinctive characteristics of the spatial dependence in
multispectral data is that the spectral separation between two adjacent
pixels is less than that between two non-adjacent pixels, because the
sampling interval tends to be smaller than the size of an object; i.e., two
pixels in spatial proximity to one another are unconditionally correlated,
with the degree of correlation decreasing as the distance between them
increases. Studies measuring different-order statistical spatial dependency
in image data, especially measurements of first-, second-, and third-order
amplitude statistics along an image scan line, show considerable
correlation between adjacent pixels. Seyler concluded, from the measurement
of the distribution of the difference between adjacent pixels, that the
probability that two adjacent pixels have the same grey level is about 10
times the probability that they differ by the maximum possible amplitude
difference. Kettig (Kettig and Landgrebe, 2001), by measuring the spatial
correlation of multispectral data, showed that the correlation between
adjacent pixels is much lower when conditioned upon being within an object
than the unconditional correlation. High correlation among adjacent pixels
in the observation space represents redundancy in the scene data. When such
redundancy occurs, it should be possible to reduce the size of the
observation space without loss of information.
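Kettig's contrast between unconditional and conditional correlation can be
illustrated with a toy piecewise-constant scene. In this hedged sketch
(Python/NumPy; the object layout, grey levels, and noise level are all
invented for illustration, not taken from the paper), the unconditional
correlation of adjacent pixels is close to 1, while the correlation
conditioned on both pixels lying within the same object is close to 0.

import numpy as np

rng = np.random.default_rng(1)

# A 120x120 scene tiled with 20x20 single-class "objects", each with its
# own mean grey level plus independent pixel noise (all values assumed).
n_rows, n_cols, block = 120, 120, 20
labels = (np.arange(n_rows)[:, None] // block) * (n_cols // block) \
         + (np.arange(n_cols)[None, :] // block)
means = rng.uniform(0.0, 255.0, size=labels.max() + 1)
image = means[labels] + rng.normal(0.0, 5.0, labels.shape)

# Unconditional correlation: raw horizontally adjacent pixel pairs.
a, b = image[:, :-1].ravel(), image[:, 1:].ravel()
r_uncond = np.corrcoef(a, b)[0, 1]

# Conditional correlation: keep only pairs inside one object and remove
# each object's sample mean, leaving just the within-object variation.
counts = np.bincount(labels.ravel())
obj_mean = np.bincount(labels.ravel(), weights=image.ravel()) / counts
resid = image - obj_mean[labels]
same = labels[:, :-1].ravel() == labels[:, 1:].ravel()
ra, rb = resid[:, :-1].ravel(), resid[:, 1:].ravel()
r_within = np.corrcoef(ra[same], rb[same])[0, 1]

print(f"unconditional r = {r_uncond:.3f}, within-object r = {r_within:.3f}")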
As previously stated, the scene is assumed to consist of relatively simple
objects of different sizes and shapes (see Figure 1). The resolution of the
spatial representation depends on both the pixel size and the interval
between samples, which are usually equal. Under-sampling loses information,
while over-sampling increases redundancy. Typically, the size and shape of
the objects in the scene vary randomly (Figure 1), while the sampling rate,
and therefore the pixel size, is fixed; it is inherent in image data that
the data-dimensionality (the number of spatial-spectral observations used
for scene representation) increases faster than its intrinsic-dimensionality
(the size of the smallest set which can represent the same scene,
numerically, with no loss of information). Because the spatial sampling
interval is usually small compared to the object size, it follows that each
object is represented by an array of similar pixels. Therefore, scene
segmentation into pixels is not an efficient approach to scene
representation; a scene can instead be segmented into objects, and since
the shape and size of the objects match the scene variation, scene
representation by simple objects is more efficient.
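To make the dimensionality gap concrete, the sketch below (Python/NumPy;
the scene size, object size, band count, and the run-length object
encoding are assumptions for illustration, not the paper's method) counts
the values needed for a pixel-description against those of a crude
object-description of the same synthetic scene.

import numpy as np

# Synthetic label map: a 256x256 scene of 32x32 single-class objects.
n_rows, n_cols, n_bands, block = 256, 256, 7, 32
labels = (np.arange(n_rows)[:, None] // block) * (n_cols // block) \
         + (np.arange(n_cols)[None, :] // block)
n_objects = labels.max() + 1

def run_count(row):
    """Number of constant runs in one scan line of the label map."""
    return 1 + int(np.count_nonzero(row[1:] != row[:-1]))

runs = sum(run_count(labels[i]) for i in range(n_rows))

# Pixel-description: one spectral vector per pixel.
pixel_values = n_rows * n_cols * n_bands
# Object-description: a (label, length) pair per run of the label map
# plus one mean spectral vector per object.
object_values = 2 * runs + n_objects * n_bands

print(f"pixel description:  {pixel_values} values")   # 458752
print(f"object description: {object_values} values")  # 4544, ~100x smaller

The exact ratio depends entirely on the assumed object size; the point is
only that the object-description grows with scene complexity rather than
with the number of samples.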
Object detection refers to finding the natural groups among contiguous
pixels. In other words, the data are sorted into objects such that the
"Unity Relation" holds among members of the same object but not between
members of different adjacent objects. Object extraction and clustering are
similar in the sense that both are methods of grouping data; however,
spatial considerations make clustering and object extraction different.
Because an object can be textured, the pixels within an object might not
form a compact cluster in the measurement (observation) space. Also,
because there can be several instances of a particular class of entities in
a single image, non-adjacent objects might be nearly identical in the
observation space. Another difference is that in object extraction the
existence of a partition that completely separates objects is guaranteed,
whereas in clustering, if we allow underlying classes with overlapping
density functions, the classes can never be completely separated in the
observation space. Object extraction can thus be thought of as transforming
the original image, which is a pixel-description of a scene, into an
arrangement of object-descriptions.
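As an illustration of grouping contiguous pixels under a unity relation,
the sketch below (Python/NumPy) grows objects by flood fill, taking as the
unity relation an assumed Euclidean spectral-distance threshold between
4-neighbours. This is a minimal stand-in for the idea, not the algorithm
developed in this paper.

from collections import deque

import numpy as np

def extract_objects(image, threshold):
    """Label 4-connected groups whose neighbouring pixels satisfy the
    (assumed) unity relation ||x_p - x_q|| <= threshold."""
    n_rows, n_cols, _ = image.shape
    labels = np.full((n_rows, n_cols), -1, dtype=int)
    current = 0
    for sr in range(n_rows):
        for sc in range(n_cols):
            if labels[sr, sc] != -1:
                continue                      # already assigned to an object
            labels[sr, sc] = current
            queue = deque([(sr, sc)])
            while queue:                      # breadth-first region growing
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and labels[nr, nc] == -1
                            and np.linalg.norm(image[r, c] - image[nr, nc]) <= threshold):
                        labels[nr, nc] = current
                        queue.append((nr, nc))
            current += 1
    return labels, current

# Two flat spectral regions side by side: the relation holds within each
# region (distance 0) but fails across the boundary, giving two objects.
img = np.zeros((4, 6, 3))
img[:, 3:] = 10.0
lab, n = extract_objects(img, threshold=1.0)
print(n)     # -> 2

Note that, because the relation is chained along paths of neighbours, a
gradually varying (textured) object is kept whole even though its pixels
do not form a compact cluster in the observation space, which is exactly
the distinction from clustering drawn above.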
An object-description is often better than a pixel-description, 
for two basic reasons: 
1- More information about the scene entity is available from a 
collection of pixels associated with the object than from an 
individual pixel associated with the scene. This fact has 
been exploited by “object” classification algorithms that 
make a classification decision for each group of image 
points, for example by sequential classification (Tso and 
Mather, 2001). The potential advantages of object 
classification are especially great when class probability 