DECISION TREE CLASSIFIER WITH UNDETERMINED NODES
Masanobu Yoshikawa*, Sadao Fujimura**,
Shojiro Tanaka***, and Ryuei Nishii****
* Research Associate, Faculty of Engineering, Yamanashi University, Japan
** Professor, Faculty of Engineering, University of Tokyo, Japan
*** Associate Professor, Faculty of Engineering, Yamanashi University, Japan
**** Associate Professor, Faculty of Integrated Arts and Sciences, Hiroshima University, Japan
ISPRS Commission III
KEYWORDS: Land Cover, Classification, Design Algorithms, Multispectral Vector, Pattern Recognition
ABSTRACT
A new approach that preserves undetermined data during classification is proposed in this paper. The proposed
classifier includes a mechanism to suspend classification of indistinct data. The triplet decision tree has
two ‘determined nodes’ based on binary splitting of categories and one additional ‘undetermined node’ for
the uncertain part of the data. A design procedure for this type of triplet decision tree is proposed as an
extension of the design procedure for binary decision trees. This method retains the computational efficiency
of general tree classifiers. Effective and flexible classification is made possible by applying various data
segmentation methods in the feature space to the uncertain sample groups. Moreover, the classification tree
displays the hierarchical structure of similar categories and of uncertainly classified data groups.
1. INTRODUCTION
In general tree classifiers, samples in a category are
processed in one group, i.e. one tree node. While
classification with these usual methods is very
effective, the following three major drawbacks can
be pointed out.
(P1) Decision trees have only one terminal node for
each classification category. In these tree
classifiers, samples misclassified at one
non-terminal node division in the tree have no
chance of being correctly classified by the
succeeding steps.
(P2) Land cover categories can be vague in various
ways in how they are represented in actual data,
for example through indistinct distributions or
the existence of adjacent categories. It is true
that usual decision trees can still perform the
node division even with such ill-conditioned data
segmentation. If a multibranch tree structure is
selected, decision trees may suit the nature of
the data better, and classification accuracy may
improve. However, the design of general
multibranch trees becomes very complicated. The
problem is that a complex tree structure is
required for accurate classification but is not
desirable for an efficient design method.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
(P3) Data segmentation is executed by rigid
boundaries at each tree node. When a rigid boundary
is adopted, the node division is suitable for data
that lie far from the boundary in the feature
space. However, where the distributions overlap
each other, the node division is less suitable and
may include many misclassifications.
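The effect described in (P3) can be illustrated with a small synthetic experiment. The sketch below is not taken from the paper: the two 1-D Gaussian categories, the threshold value, and the band width `BAND` are all assumptions chosen only to show that a rigid boundary tends to concentrate its errors in the region where the distributions overlap.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

THRESHOLD = 1.0  # rigid boundary between two hypothetical categories
BAND = 0.5       # half-width of the "near the boundary" region (assumption)

# Two overlapping 1-D Gaussian categories (synthetic, not from the paper).
samples = [(random.gauss(0.0, 1.0), "A") for _ in range(5000)]
samples += [(random.gauss(2.0, 1.0), "B") for _ in range(5000)]


def rigid_classify(x):
    """Assign a category using a single rigid threshold."""
    return "A" if x < THRESHOLD else "B"


def error_rate(subset):
    """Fraction of samples in `subset` that the rigid rule misclassifies."""
    wrong = sum(1 for x, label in subset if rigid_classify(x) != label)
    return wrong / len(subset)


# Split the samples by their distance from the boundary.
near = [(x, c) for x, c in samples if abs(x - THRESHOLD) <= BAND]
far = [(x, c) for x, c in samples if abs(x - THRESHOLD) > BAND]

near_error = error_rate(near)
far_error = error_rate(far)
print(f"error rate near the boundary:     {near_error:.3f}")
print(f"error rate far from the boundary: {far_error:.3f}")
```

Under these assumptions the error rate inside the band is noticeably higher than outside it, which is the situation in which postponing the decision for the samples near the boundary becomes attractive.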
A design method for decision trees that takes these
problems into account is useful for the processing of
remotely sensed data. The following mechanisms are
required for dealing with these problems:
• Samples with indefinite character can be
detected and considered separately;
• Classification at each node is partially executed,
and the decision for the indefinite samples is
postponed to lower nodes;
• Each category can have multiple terminal
nodes depending on its nature in the hierarchical
structure.
In this paper, a triplet tree structure is proposed to
overcome these problems, taking both classification
accuracy and computing costs into account.
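As a rough illustration of a single triplet node division, the following sketch splits samples into two determined groups and one undetermined group. It is only a minimal sketch under stated assumptions: the use of a 1-D feature, a threshold with a symmetric `margin`, the names `TripletSplit` and `triplet_split`, and the category labels are all hypothetical, and the splitting criterion used in the paper's actual design procedure is not given in this section.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (feature value, true category); hypothetical 1-D data for illustration
Sample = Tuple[float, str]


@dataclass
class TripletSplit:
    """Result of one triplet node division (illustrative only)."""
    left: List[Sample] = field(default_factory=list)          # determined node 1
    right: List[Sample] = field(default_factory=list)         # determined node 2
    undetermined: List[Sample] = field(default_factory=list)  # decision postponed


def triplet_split(samples: List[Sample], threshold: float, margin: float) -> TripletSplit:
    """Send clear samples to a determined node and samples within
    `margin` of the threshold to the undetermined node.

    The threshold-plus-margin rule is an assumption of this sketch.
    """
    split = TripletSplit()
    for x, label in samples:
        if x < threshold - margin:
            split.left.append((x, label))
        elif x > threshold + margin:
            split.right.append((x, label))
        else:
            # Too close to the boundary: postpone the decision to a lower node.
            split.undetermined.append((x, label))
    return split


# Hypothetical land cover samples.
samples = [(0.1, "water"), (0.4, "water"), (0.9, "water"),
           (1.1, "forest"), (1.6, "forest"), (2.2, "forest")]
result = triplet_split(samples, threshold=1.0, margin=0.25)
print(len(result.left), len(result.undetermined), len(result.right))
```

In a full tree, the undetermined group would be passed to a lower node and divided again, so that a category such as "water" could end up with more than one terminal node.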