4. Choose a model describing the important
relationships seen or hypothesized in the data (data
modeling).
5. Fit the model using the appropriate modeling
techniques (interpolation / approximation).
6. Examine the fit using model summaries and
diagnostic plots, and test its estimates (statistical
inference).
7. Repeat steps 4-6 until the model looks satisfactory.
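The loop formed by steps 4-7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values, the two candidate models, the use of R² as the single diagnostic, and the 0.99 threshold are all hypothetical and do not come from the present work.

```python
# Hypothetical data (not from the paper): y grows roughly like x squared.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 3.9, 9.1, 15.8, 25.3, 35.9]


def fit_line(u, v):
    """Least-squares fit of v = a + b*u; returns (a, b)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    b = s_uv / s_uu
    return mean_v - b * mean_u, b


def r_squared(u, v, a, b):
    """Share of the variance of v explained by the fitted line."""
    mean_v = sum(v) / len(v)
    ss_res = sum((vi - (a + b * ui)) ** 2 for ui, vi in zip(u, v))
    ss_tot = sum((vi - mean_v) ** 2 for vi in v)
    return 1.0 - ss_res / ss_tot


# Step 4: candidate models, each a transformation of the predictor.
candidates = {
    "y ~ x": lambda t: t,
    "y ~ x^2": lambda t: t * t,
}
THRESHOLD = 0.99  # assumed acceptance level for the fit

results = {}
chosen = None
for name, transform in candidates.items():
    u = [transform(t) for t in x]
    a, b = fit_line(u, y)                    # step 5: fit the model
    results[name] = r_squared(u, y, a, b)    # step 6: examine the fit
    print(f"{name}: R^2 = {results[name]:.4f}")
    if results[name] >= THRESHOLD:           # step 7: stop when satisfactory
        chosen = name
        break

print("chosen model:", chosen)
```

In practice the examination in step 6 would also rely on residual plots and inference on the estimates; a single summary statistic is used here only to keep the sketch short.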
Because there is usually more than one way to model the
data, it is useful to learn which types of model are best
suited to the various types of response and predictor data.
Whether a method is appropriate depends on whether the
response and the predictors are continuous, factors, or a
combination of both.
2. VARIABLES AND DATA
The data are analyzed with the program S-PLUS. Its
power as a statistical modeling language lies in:
• its convenient and useful way of organizing data;
• its wide variety of classical and modern modeling
techniques;
• its way of specifying models.
In the implemented GIS, many entities are defined together
with their attributes. Multivariate analysis is used to
study the relations among the attributes within each class
of entity. Furthermore, the analytical expressions that
link the variables involved are derived. For each entity
there is a table like the following one.
[Figure: S-PLUS data sheet for one entity, with one row per record and one column per attribute (among them DCE, DST and DCG).]
Estimation, hypothesis testing and statistical inference
in general are based on the data. A conjectured model
may be defined implicitly or explicitly. Many types of
models can be specified in S-PLUS using formulas,
which express the conjectured relationships among
observed variables in a natural way.
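To make the idea of a model formula concrete, the following Python sketch splits a formula string of the form "response ~ predictor + predictor" into its parts. It is not S-PLUS code and not part of the present work: the function name parse_formula and the example formula "FLU ~ SUP + DCG" are hypothetical, and only purely additive terms are handled.

```python
def parse_formula(formula):
    """Split 'y ~ x1 + x2' into (response, [predictors]).

    Hypothetical helper: it supports additive terms only and ignores
    interactions, transformations and other richer formula syntax.
    """
    lhs, sep, rhs = formula.partition("~")
    if not sep or not lhs.strip() or not rhs.strip():
        raise ValueError("expected a formula of the form 'y ~ x1 + x2'")
    predictors = [term.strip() for term in rhs.split("+")]
    if any(not term for term in predictors):
        raise ValueError("empty term in the right-hand side")
    return lhs.strip(), predictors


# Assumed example: the variable names appear in the paper, but this
# particular formula is only an illustration.
response, predictors = parse_formula("FLU ~ SUP + DCG")
print("response:", response)
print("predictors:", predictors)
```

The left-hand side names the response and the right-hand side lists the predictors, which is what lets a conjectured relationship be written down before any fitting is done.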
In the present work, many variables were considered at a
preliminary stage. However, given the data available
and the evident difficulties of acquisition, only a few
have actually been analyzed.
One of the principal entities considered is the ‘business
activity’, with its attributes. Some of the attributes come
from archives (code, surface, type, etc.), others are
surveyed (number of clients for each activity), and others
come from GIS analyses (distances, surfaces, etc.).
The aim of the work is to find a model that shows the
relation among all the variables (attributes), starting from
the available data. Before using any type of model, it is
useful to explore the data by plotting and summarizing
them. What follows is a plot that shows the pairwise
relations among the variables used, and a correlation
table of the examined data:
[Figure: scatterplot matrix of the variables FLU, DCE, DST, DCG and SUP.]
*** Correlation for the examined data:
         SUP     FLU     DCE     DST     DCG
SUP    1.000
FLU    0.632   1.000
DCE   -0.348  -0.495   1.000
DST   -0.198  -0.269   0.944   1.000
DCG   -0.466  -0.700   0.883   0.726   1.000
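A lower-triangular correlation table of this kind can be computed as in the following Python sketch. The numeric values are invented placeholders, not the data of the present work, and only three of the variable names are reused for readability; the Pearson coefficient is assumed to be the correlation measure.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


# Invented placeholder values (NOT the paper's data), one list per attribute.
data = {
    "SUP": [40.0, 75.0, 30.0, 60.0, 100.0],
    "FLU": [5.0, 9.0, 2.0, 8.0, 12.0],
    "DCG": [900.0, 400.0, 1100.0, 500.0, 150.0],
}

names = list(data)
# Header row, then one row per variable up to and including the diagonal.
print(" " * 4 + "".join(f"{name:>8}" for name in names))
for i, row_name in enumerate(names):
    cells = "".join(
        f"{pearson(data[row_name], data[col_name]):8.3f}"
        for col_name in names[: i + 1]
    )
    print(f"{row_name:<4}{cells}")
```

Only the lower triangle is printed because the matrix is symmetric and its diagonal is always 1.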