P_i^{t+1}(ω_k) = P_i^t(ω_k) Q_i^t(ω_k) / Σ_j P_i^t(ω_j) Q_i^t(ω_j)        (6)
where P_i^t(ω_k) = the probability that pixel i belongs to class ω_k at the t-th iteration;
Q_i^t(ω_k) = the neighbourhood function of pixel i belonging to class ω_k at the t-th iteration.
Relaxation is therefore an iterative technique in which the probabilities of neighbouring pixels are used repeatedly to update the probability of a given pixel, based on a relation between the pixel labels specified by a compatibility coefficient. This approach is computationally intensive but robust to image noise (zur Erlangung, 1999).
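As an illustrative sketch of equation 6 (not the paper's implementation), the update can be written in a few lines of NumPy. The 4-neighbour averaging used to build the neighbourhood function Q, the compatibility-matrix weighting, and the names `relaxation_step` and `compat` are assumptions of the sketch:

```python
import numpy as np

def relaxation_step(P, compat):
    """One probabilistic relaxation iteration (equation 6).

    P      : (rows, cols, K) array; P[i, j, k] is the current probability
             that pixel (i, j) belongs to class k.
    compat : (K, K) compatibility coefficients between class labels
             (assumed form of the neighbourhood support).
    """
    # Average the class probabilities over a 4-neighbourhood (wrap-around
    # borders via np.roll, for brevity), then weight them by the
    # compatibility matrix to obtain the neighbourhood function Q.
    neigh = (np.roll(P, 1, axis=0) + np.roll(P, -1, axis=0) +
             np.roll(P, 1, axis=1) + np.roll(P, -1, axis=1)) / 4.0
    Q = neigh @ compat.T    # Q[i, j, k] = sum_l compat[k, l] * neigh[i, j, l]

    # Equation 6: elementwise product P * Q, renormalised over the classes.
    PQ = P * Q
    return PQ / (PQ.sum(axis=2, keepdims=True) + 1e-12)
```

Iterating `relaxation_step` until the probabilities stabilise, and then taking the most probable class per pixel, yields the relaxed labelling.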
3. LEARNING AUTOMATA AND ENVIRONMENT
The goal of many intelligent problem-solving systems is to be
able to make decisions without a complete knowledge of the
consequences of the various choices available. In order for a
system to perform well under conditions of uncertainty, it has to
be able to acquire some knowledge about the consequences of
different choices. This acquisition of the relevant knowledge
can be expressed as a learning problem.
Learning automata are a model of computer learning which has been used to model biological learning systems and to find the optimal action offered by a random environment. Learning automata have found applications in systems that process incomplete knowledge about the environment in which they operate. These applications include parameter optimization, statistical decision making, telephone routing, pattern recognition, game playing, natural language processing, modeling biological learning systems, and object partitioning (Oommen, 2003).

The learning loop involves two entities: the environment and the learning automaton; the actual process of learning is represented as a set of interactions between them. The learning automaton is limited to choosing only one action at any given time from a set of actions {α_1, ..., α_r} offered by the environment. Once the learning automaton decides on an action α_i, this action serves as input to the environment. The environment then responds to the input by giving either a reward or a penalty, based on the penalty probability c_i associated with α_i. This response in turn serves as the input to the automaton. Based upon the response from the environment and the information accumulated so far, the learning automaton decides on its next action, and the process repeats. The intention is that the learning automaton gradually converges toward an ultimate goal.
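The interaction loop can be sketched as follows. The environment is the P-model just described, penalising action α_i with probability c_i; the `choose_action`/`update` interface of the automaton is an assumed convention, filled in by the schemes of sections 3.1 and 3.2:

```python
import random

class RandomEnvironment:
    """P-model environment: action i is penalised with probability c[i]."""
    def __init__(self, penalty_probs):
        self.c = penalty_probs

    def respond(self, action):
        # Returns 0 for a reward and 1 for a penalty.
        return 1 if random.random() < self.c[action] else 0

def interact(automaton, environment, steps):
    """The learning loop of Figure 2: action out, response back, update."""
    for _ in range(steps):
        action = automaton.choose_action()    # pick one of {α_1, ..., α_r}
        beta = environment.respond(action)    # reward (0) or penalty (1)
        automaton.update(action, beta)        # adapt using the response
```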
Figure 2: Interaction between environment and automata
3.1 Fixed Structure Learning Automata
Fixed structure automata exhibit transition and output matrices which are time invariant. A fixed structure automaton is a quintuple A = {α, β, F, G, q}, where α = {α_1, ..., α_r} is the set of r actions offered by the environment from which the learning automaton must choose; β = {0, 1} is the set of inputs from the environment; q is the set of inner states of the automaton; F is the state-transition function, which updates the inner state from the existing state and the reward or penalty received from the environment; and G is the output function, which chooses an action based on the new state of the automaton.
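As a concrete illustration (not taken from the paper), the classic two-action Tsetlin automaton is a fixed structure automaton of exactly this form: its transition map F and output map G are fixed tables, and learning consists only of the inner state q moving among 2N memory states:

```python
class TsetlinAutomaton:
    """Two-action Tsetlin automaton with N memory states per action.

    States 0..N-1 output action 0; states N..2N-1 output action 1.
    Both F (update) and G (choose_action) are time invariant.
    """
    def __init__(self, n_states_per_action):
        self.N = n_states_per_action
        self.q = 0                          # deepest state of action 0

    def choose_action(self):                # output function G
        return 0 if self.q < self.N else 1

    def update(self, action, beta):         # transition function F
        if beta == 0:                       # reward: commit deeper
            self.q = max(self.q - 1, 0) if action == 0 \
                else min(self.q + 1, 2 * self.N - 1)
        else:                               # penalty: drift toward the
            self.q += 1 if action == 0 else -1   # other action's states
```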
3.2 Variable Structure Learning Automata
Variable structure automata exhibit transition and output matrices which change with time. A variable structure learning automaton can be formally defined as a quadruple (Oommen, 2003):
{α, P, β, T}        (7)
where α = {α_1, ..., α_r} is the set of r actions offered by the environment that the LA must choose from;
P = [p_1(n), ..., p_r(n)] is the action probability vector, where p_i represents the probability of choosing action α_i at the nth time instant;
β = {0, 1} is the set of inputs from the environment, where '0' represents a reward and '1' a penalty;
T: P × β → P is the updating scheme, which defines the method of updating the action probabilities on receiving an input from the environment.
If (β = 0 and α_i is chosen) then P_i(n+1) = P_i(n) + a[1 − P_i(n)]
If (β = 0 and α_i is chosen) then P_j(n+1) = (1 − a) P_j(n), ∀ j ≠ i        (8)
If (β = 1 and α_i is chosen) then P_i(n+1) = (1 − b) P_i(n)
If (β = 1 and α_i is chosen) then P_j(n+1) = b/(r − 1) + (1 − b) P_j(n), ∀ j ≠ i
According to equation 8, if a and b are equal the learning algorithm is known as linear reward-penalty; if b << a it is known as linear reward-epsilon-penalty; and if b = 0 it is a linear reward-inaction scheme.
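A minimal sketch of the updating scheme T in equation 8 (the class name and interface are illustrative; β = 0 is a reward, as defined above). Setting a = b gives the reward-penalty scheme, b << a the reward-epsilon-penalty scheme, and b = 0 the reward-inaction scheme:

```python
import random

class VariableStructureLA:
    """Variable structure LA with the linear updating scheme of equation 8."""
    def __init__(self, r, a, b):
        self.r = r                    # number of actions
        self.a, self.b = a, b         # reward and penalty step sizes
        self.p = [1.0 / r] * r        # action probability vector P

    def choose_action(self):
        # Sample an action according to the current probability vector.
        return random.choices(range(self.r), weights=self.p)[0]

    def update(self, i, beta):
        if beta == 0:                 # reward: shift mass toward action i
            self.p = [q + self.a * (1.0 - q) if j == i
                      else (1.0 - self.a) * q
                      for j, q in enumerate(self.p)]
        else:                         # penalty: shift mass away from action i
            self.p = [(1.0 - self.b) * q if j == i
                      else self.b / (self.r - 1) + (1.0 - self.b) * q
                      for j, q in enumerate(self.p)]
```

For example, interact(VariableStructureLA(r=5, a=0.1, b=0.01), RandomEnvironment([0.2, 0.4, 0.6, 0.7, 0.8]), 10000) from the sketch above drives almost all of the probability mass toward the first action, which has the lowest penalty probability.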
4. LEARNING CELLULAR AUTOMATA
A learning cellular automaton A and its environment E are defined as follows (Fei Qian, 2001):

A = {U, X, Y, Q, N, ξ, F, G, T}        (9)
E = {Y, C, r}        (10)
where U = {u_j, j = 1, 2, ...} is the cellular space;
X = {x_j, 0 ≤ j < ∞} is the set of inputs;
Y = {y_j, 0 ≤ j < ∞} is the set of outputs;
N = {n_1, ..., n_|N|} is the list of neighborhood relations;
Q = {q_j, 0 ≤ j < ∞} is the set of internal states;
ξ : U → Q^|N| is the neighborhood state configuration function