Technical Commission IV (B4)

more time. It cannot even load more than 100,000 metadata records at a time, because doing so may cause an "out of memory" error. Consequently, the efficiency of the query and search service decreases. GeoNetwork uses the Lucene index engine for querying and searching, but its optimization and acceleration strategies are not sufficient: when the amount of metadata is large enough, queries become slow. Meanwhile, stability and robustness both suffer under huge amounts of metadata. The final flaw is that the requirement of concurrent multi-user access to massive data over the Internet is not effectively satisfied. 
The original GeoNetwork 2.1 cannot satisfy these requirements, so an optimization solution is needed. 
2. HIERARCHICAL OPTIMIZATION MODEL 
The hierarchical optimization model (HOM) consists of a software level and a deploy level. The software level covers methods that can be applied within the software itself, here GeoNetwork; it requires modifying the GeoNetwork project source code, so it works from the inside. The deploy level covers methods that can be applied when deploying the metadata service system; it requires constructing a smart web deployment solution and using specific software to obtain capabilities such as disaster recovery, so it works from the outside. 
Figure 1. HOM: the software level (batch processing, cache) and the deploy level (server cluster, disaster recovery) 
2.1 Software Level 
1. Batch Processing 
GeoNetwork's data and index operations are performed record by record. This has almost no impact for small amounts of metadata, but when the amount grows to tens of thousands, hundreds of thousands or even millions of records, the impact is very large and the system becomes surprisingly slow. When the amount of metadata is large, the index also becomes large; for each operation on a single metadata record, such as inserting, updating or deleting, the system modifies the index library and then optimizes it for effective management and high-speed index search. Completing an operation on a single metadata record can therefore take several minutes. 
Batch processing is an effective and time-saving solution for this kind of repeated operation. Each operation first writes the modified metadata to the database and records the metadata id; when all metadata writes are complete, the system rebuilds the index for these metadata records once. This saves a lot of time. 
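The idea can be sketched as follows. This is a minimal Python illustration, not GeoNetwork's actual Lucene code; the database and index here are simple in-memory dictionaries, and the record fields are invented for the example:

```python
# Batch-processing sketch: instead of rebuilding the search index
# after every single metadata write, collect the ids of all modified
# records and rebuild their index entries once at the end.

def naive_import(records, db, index):
    """One index rebuild per record: slow for large batches."""
    for rec in records:
        db[rec["id"]] = rec
        rebuild_index(index, db, [rec["id"]])   # repeated, expensive

def batch_import(records, db, index):
    """Write everything first, then rebuild the index once."""
    modified_ids = []
    for rec in records:
        db[rec["id"]] = rec
        modified_ids.append(rec["id"])
    rebuild_index(index, db, modified_ids)      # single, amortized

def rebuild_index(index, db, ids):
    """Stand-in for the expensive index-update-and-optimize step."""
    for mid in ids:
        index[mid] = db[mid]["title"].lower().split()

db, index = {}, {}
batch_import([{"id": 1, "title": "Topographic Map"},
              {"id": 2, "title": "Aerial Survey"}], db, index)
print(index[1])  # → ['topographic', 'map']
```

The saving comes from amortizing the index rebuild: one optimize pass over all modified ids instead of one pass per record.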
2. Cache 
Cache technology has been considered one of the effective ways to reduce server load, network congestion and client access delay (HE Chen, 2004). In the field of geo-information services, web cache technology is also widely used. The major electronic map websites use tile-based cache technology for their map services: large numbers of cached tiles on the client and server sides avoid redrawing maps on the map server, which reduces the processing time of requests and improves client response. OGC has also released the WMTS 1.0.0 implementation standard, which can be used to develop scalable, high-performance services in ways that WMS cannot. 
Cache technology can also be used in a metadata service system. On one hand, user queries occur many times more often than updates to the system's metadata and index. On the other hand, users usually compare query results, and even repeat queries, so the results are repeatable. 
We design a result cache based on the database. When the system receives a query request for the first time, it applies a coding algorithm (such as the MD5 algorithm) to encode the query string into a unique value, then writes the query string, the coded value and the query result into the database. When the server receives the same request again, it encodes the query string into a value, finds that value in the database, and returns the stored result as the response. An index can be built on the coded value, which is unique, to speed up query and select efficiency. 
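A minimal sketch of this result cache, using Python's hashlib and an in-memory SQLite table. The table and column names are our own, not the system's actual schema, and the search function is a stand-in for the real metadata query:

```python
import hashlib
import sqlite3

# Database-backed result cache sketch: the query string is encoded
# to a unique MD5 value; the value, query string and result are
# stored, and a unique index on the coded value speeds up lookups.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE query_cache (
                    code   TEXT,
                    query  TEXT,
                    result TEXT)""")
conn.execute("CREATE UNIQUE INDEX idx_code ON query_cache(code)")

def cached_query(query_string, run_query):
    code = hashlib.md5(query_string.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT result FROM query_cache WHERE code = ?", (code,)).fetchone()
    if row is not None:                  # cache hit: skip the real query
        return row[0]
    result = run_query(query_string)     # first request: run and store
    conn.execute("INSERT INTO query_cache VALUES (?, ?, ?)",
                 (code, query_string, result))
    return result

calls = []
def slow_search(q):                      # stand-in for the real query
    calls.append(q)
    return "results for " + q

print(cached_query("title:map", slow_search))  # runs the real query
print(cached_query("title:map", slow_search))  # served from the cache
print(len(calls))                              # → 1
```

Note that MD5 is used here only as a compact fixed-length key for the query string, as the text describes, not for any security purpose.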
2.2 Deploy Level 
1. Web Cluster 
Web cluster technology is an important method for solving the capacity and scalability problems of web server systems (Li Shuangqing, 2002). A dispatcher-based request dispatching mechanism serves as the load balancing mechanism of our metadata service system. 
The metadata service system for surveying and mapping results runs on a "4+1" service cluster, shown in Figure 2. The system is deployed on one hardware server, on which we build 5 virtual machines: 4 for normal use and 1 as a backup. When any of the 4 normal machines crashes, the backup takes its place. 
The system uses a dispatcher-based request dispatching mechanism. The front-end node runs Nginx, a reverse proxy server, as the request dispatcher. Because the service system and portal rely on sessions, Nginx uses IP hash as its load balancing mechanism: each request is dispatched to a fixed server according to the hash of the client's IP address, which effectively solves the session problem. 
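The dispatching rule can be sketched as follows. This is an illustration of the IP-hash idea only, not Nginx's internal algorithm; the server names are placeholders for the 4 normal virtual machines:

```python
import hashlib

# IP-hash dispatching sketch: the client IP is hashed to pick a
# fixed back-end server, so all requests from one IP (and hence one
# session) land on the same machine. Server names are placeholders.
SERVERS = ["vm1", "vm2", "vm3", "vm4"]

def dispatch(client_ip):
    h = int(hashlib.md5(client_ip.encode("utf-8")).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]

# The same client always reaches the same server:
print(dispatch("192.168.1.7") == dispatch("192.168.1.7"))  # → True
```

Because the mapping from IP to server is deterministic, session state kept on one back-end server is always reachable by the client that created it, without any shared session store.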
The advantages of this technology are that it ensures system performance and service capability, it is extensible, and it overcomes the limits of Java on a single machine; the service capability scales with the number of machines in the cluster. The disadvantage is that background data synchronization becomes more complex, since the data must be synchronized several times. 
Thank you.