XXII ISPRS Congress 2012: Technical Commission IV

   
  
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B4, 2012
XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia
time point. Moreover, since sensor readings are pushed to a 
sensor web service for users to retrieve, some additional 
parameters are required to locate the sensor readings, such as 
service location on the Internet (i.e., service URL) and the 
observation offering ID in the OGC context. 
Therefore, when users register a query for sensor data in
OGC SOS, they need to specify the service location, an
observation offering ID, an observed property URI (the
identifier of the physical phenomenon), a geographical
coverage (i.e., a bounding box), and a temporal coverage (i.e., a
time period). In addition, since the objective of the proposed
system is to retrieve “new” data in a timely manner, the
temporal coverage can move forward as time goes by; this
is called a sliding window. Besides the sliding window, there
are two other types of temporal window, namely the fixed window
(the temporal coverage does not change) and the landmark window
(the start time point is fixed while the end time point keeps
moving). Therefore, in our system, users also need to specify the
type of temporal window they want to use.
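The query elements and the three temporal window types described above can be sketched as a small data structure. This is only an illustrative sketch, not the system's actual implementation: the names `SensorQuery`, `WindowType`, and `coverage_at` are hypothetical, and the sketch assumes that the moving end of a sliding or landmark window simply tracks the current time.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class WindowType(Enum):
    FIXED = "fixed"        # temporal coverage never changes
    SLIDING = "sliding"    # both start and end move forward
    LANDMARK = "landmark"  # start is fixed, end moves forward


@dataclass(frozen=True)
class SensorQuery:
    """Hypothetical container for the query elements listed in the text."""
    service_url: str        # service location on the Internet
    offering_id: str        # observation offering ID
    observed_property: str  # observed property URI
    bbox: tuple             # (min_lon, min_lat, max_lon, max_lat)
    start: datetime         # start of the registered temporal coverage
    end: datetime           # end of the registered temporal coverage
    window: WindowType

    def coverage_at(self, now: datetime):
        """Return the (start, end) temporal coverage in effect at `now`.

        Assumption: the moving end point follows `now`, and never moves
        backwards past the registered end time.
        """
        if self.window is WindowType.FIXED:
            return self.start, self.end
        moving_end = max(self.end, now)
        if self.window is WindowType.LANDMARK:
            return self.start, moving_end
        # Sliding window: keep the registered width, shift it forward.
        width = self.end - self.start
        return moving_end - width, moving_end
```

With a one-hour coverage registered from 00:00 to 01:00 and evaluated at 03:00, a fixed window stays at 00:00 to 01:00, a landmark window grows to 00:00 to 03:00, and a sliding window moves to 02:00 to 03:00.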
After defining what a query is in the sensor web context, we now
present the functionality of the query aggregator. Since most
sensor web services are based on a pull-based interaction model,
the input adaptor needs to proactively request data from services.
However, since queries from users can have different but
overlapping geographical and temporal coverages, pulling data
from sensor web services separately for each query would transmit
the overlapping spatio-temporal coverage redundantly. These
redundant transmissions could place a large and unnecessary
burden on both the service side and the client side as the amount
of sensor data grows rapidly. Therefore, we propose the query
aggregator, which aggregates user queries and filters out
unnecessary requests so that data can be pulled from sensor web
services efficiently. We consider this query aggregator one of the
major contributions of this paper.
In the query aggregator, we utilize the LOading Spatio-Temporal
Indexing Tree (LOST-Tree) (Huang et al. 2011) as the data loading
management component to aggregate user queries and avoid
redundant data transmission. LOST-Tree uses two key ideas to
aggregate requests and specify the loaded portions.
First, LOST-Tree applies predefined hierarchical spatial and
temporal frameworks, so that both the spatial and temporal
extents of requests can be indexed for loading management.
Since the frameworks are predefined, LOST-Tree can simply
compare spatial and temporal indices between requests to filter
out redundant transmissions. Also, because the frameworks are
hierarchical, LOST-Tree can aggregate several indices to attain
a smaller tree size, which consequently results in a smaller
memory footprint and lower query latency. In this paper, we use
a quadtree as the spatial framework and the Gregorian calendar
as the temporal framework. Second, LOST-Tree uses only the
spatio-temporal extent of requests to specify the loaded portions.
Since LOST-Tree manages only the spatio-temporal extent of
requests, it does not grow with the sensor data volume, which
also allows it to maintain a small memory footprint and low
query latency.
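The two ideas above (comparing predefined hierarchical indices, and merging indices to keep the structure small) can be illustrated with a deliberately simplified sketch. It is not the LOST-Tree of Huang et al. (2011): the class `LoadedIndex` and its methods are hypothetical, quadtree cells are assumed to be encoded as strings over the digits 0 to 3 (one digit per level), calendar units as tuples such as `(2012, 8, 25)`, and only spatial merging of four sibling cells is shown (temporal merging is omitted).

```python
class LoadedIndex:
    """Tracks which (quadtree cell, calendar unit) pairs are already loaded.

    Only spatio-temporal extents are stored, never the sensor data itself.
    """

    def __init__(self):
        self.loaded = set()  # {(quadkey: str, timekey: tuple)}

    def is_loaded(self, quadkey, timekey):
        # A request is covered if some ancestor-or-self cell was loaded
        # for some ancestor-or-self calendar unit (year, month, day, ...).
        for i in range(len(quadkey) + 1):
            for j in range(1, len(timekey) + 1):
                if (quadkey[:i], timekey[:j]) in self.loaded:
                    return True
        return False

    def filter_request(self, cells):
        """Return only the (quadkey, timekey) pairs that still need pulling."""
        return [c for c in cells if not self.is_loaded(*c)]

    def mark_loaded(self, quadkey, timekey):
        if self.is_loaded(quadkey, timekey):
            return
        self.loaded.add((quadkey, timekey))
        # Merge four loaded sibling cells into their parent cell, and
        # repeat upwards, so the index stays small.
        while quadkey:
            parent = quadkey[:-1]
            siblings = [(parent + d, timekey) for d in "0123"]
            if not all(s in self.loaded for s in siblings):
                break
            self.loaded.difference_update(siblings)
            self.loaded.add((parent, timekey))
            quadkey = parent
```

In this sketch, once cells `00`, `01`, `02`, and `03` are marked as loaded for the same day, they collapse into the single entry for cell `0`, and any later request for a sub-cell or sub-period of that entry is filtered out.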
2.3 Adaptive Feeder
After the query aggregator aggregates and filters out
unnecessary requests, the aggregated requests are forwarded to
the adaptive feeder. The major problem in retrieving sensor data
from a pull-based data source is that we do not know when
new data will be available in the service. A naïve solution is to
send requests to the SOS servers frequently and periodically.
However, this approach could generate many unnecessary
requests with empty-hit responses (i.e., responses that contain
no data).
Therefore, to address this issue, the adaptive feeder
attempts to predict when new data will be available in SOS
servers. By detecting the sensor sampling frequency (i.e., the
frequency at which a sensor measures a phenomenon), the adaptive
feeder adjusts the requesting frequency accordingly. Although
the sampling time (the time at which the data was measured) and
the valid time (the time at which the data becomes available
online) are different, a client can only estimate the valid time
from the sampling time, as the valid time is not available to the
client.
In our current adaptive feeder design, the best scenario is that
a new sensor reading becomes available right after it is
measured (i.e., there is a small difference between sampling time
and valid time). The adaptive feeder is then able to retrieve the
data in a timely manner, as the prediction is close to reality.
However, sensor readings sometimes need to be buffered or
calibrated before being inserted into the web service. In this
case, even though the valid time could be very different from the
prediction, the adaptive feeder can still retrieve the data within
one sampling period after it becomes available online.
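The prediction step described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the feeder's actual algorithm: the functions `estimate_period` and `next_request_time` are hypothetical, the sampling period is assumed to be the median gap between the sampling times of recent readings, and an optional `buffer` delay is assumed to be added to the prediction.

```python
from datetime import datetime, timedelta
from statistics import median


def estimate_period(sample_times):
    """Estimate the sampling period as the median gap between readings."""
    if len(sample_times) < 2:
        raise ValueError("need at least two readings to estimate a period")
    ordered = sorted(sample_times)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    period = timedelta(seconds=median(gaps))
    if period <= timedelta(0):
        raise ValueError("readings must have distinct sampling times")
    return period


def next_request_time(sample_times, now: datetime, buffer=timedelta(0)):
    """Predict when the next request should be sent.

    The next reading is expected one period after the latest sampling
    time. If that moment has already passed (for example, after an
    empty-hit response from a service that delays publication), keep
    stepping forward one period at a time.
    """
    period = estimate_period(sample_times)
    due = max(sample_times) + period + buffer
    while due <= now:
        due += period
    return due
```

For readings sampled at 10:00, 10:15, and 10:30, the sketch schedules the next request for 10:45 when called at 10:31, and for 11:15 when called at 11:00 without any newer reading having appeared.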
3. EXPERIMENTAL RESULTS 
In this section, we present the preliminary experimental results
of the proposed system. We tested the proposed solution on two
existing sensor web services (referred to here as service A
and service B). While both services have the same sampling
interval (around 15 minutes), they have different data update
behaviour. Service A makes sensor data available as soon as it
receives the data from sensors, which corresponds to our best
scenario. Service B first buffers or calibrates sensor data
before making them available online, so the sampling time is far
from the valid time.
It is worth noting that, in addition to the aforementioned
predicted time, we also add a buffer time (30 seconds) to
accommodate possible delays when services make data
available online. As a result, our results are 30 seconds
worse than the best scenario. This buffer time will be shortened
once we obtain more testing results.
We record the difference between the time point at which we
receive the new data and the time point at which the latest
reading was measured. This time difference indicates how close
to "real time" the proposed system operates. Table 1 shows the
preliminary experimental results, including the average and
standard deviation of the time difference, the number of
unnecessary requests (i.e., requests that do not retrieve any new
data), and the total number of feedings performed in this
experiment.
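The measures reported in Table 1 could be computed from a request log along the following lines. This is a hedged sketch only: the function `summarize` and the log format are hypothetical, each log entry is assumed to be a pair of the time the response was received and the sampling time of the newest reading it contained (or `None` for an empty hit), and every request is assumed to count as one feeding.

```python
from datetime import datetime
from statistics import mean, stdev


def summarize(requests):
    """Summarize a log of (received_at, latest_sample_time_or_None) pairs."""
    # Time difference between receiving new data and its sampling time.
    delays = [
        (received - measured).total_seconds()
        for received, measured in requests
        if measured is not None
    ]
    return {
        "mean_delay_s": mean(delays) if delays else None,
        "std_delay_s": stdev(delays) if len(delays) > 1 else None,
        # Requests that did not retrieve any new data.
        "unnecessary_requests": sum(1 for _, m in requests if m is None),
        "total_feedings": len(requests),
    }
```

The sample standard deviation (`statistics.stdev`) is used here; the paper does not state which variant was used, so that choice is also an assumption.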
As the column for service A (i.e., the best scenario) shows,
we can retrieve new data in slightly more than 30 seconds,
which is the buffer time. In addition, all 21 feedings retrieve
new data, which means there are no unnecessary requests in the
case of service A.
On the other hand, as the column for service B shows,
since service B does not make data available online as soon as it
is measured, the adaptive feeder sends a request once per
detected sampling period, which consequently causes many
unnecessary requests. As we can see from Table 1, there is a 90