
Right now, there is no clear way to cluster and scale GeoNetwork. There is a "read-only instance" implementation that can be used, but there is no efficient way to keep a read-only instance synchronized with the other GeoNetwork instances, which means that there will be inconsistencies between instances.

We have three proposals for achieving good scalability in GeoNetwork.

## Master server and slaves with daily synchronization

This is the solution currently used in some portals. There is one master server and several read-only instances; each slave's database is restored daily from the master to keep it synchronized, so changes can take up to a day to propagate to all instances. This solution requires no development in GeoNetwork's core, but an external tool must be implemented to automate the synchronization.
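As an illustration, the external tool could be as simple as a scheduled job that dumps the master database and restores it on each slave. The following is a minimal sketch assuming a PostgreSQL database; the host names, user and database name are illustrative assumptions, not values taken from GeoNetwork.

```java
import java.io.IOException;

/**
 * Minimal sketch of a daily master-to-slave synchronization tool.
 * Host names, credentials and database name are illustrative assumptions.
 * After the restore, each slave would still need its Lucene index rebuilt.
 */
public class DailySync {

    static int run(String... command) throws IOException, InterruptedException {
        // Inherit stdout/stderr so pg_dump/pg_restore output lands in the job log.
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // 1. Dump the master database in PostgreSQL's custom format.
        run("pg_dump", "-h", "master.example.org", "-U", "geonetwork",
            "-Fc", "-f", "/tmp/geonetwork.dump", "geonetwork");

        // 2. Restore the dump on a read-only slave, dropping existing objects first.
        run("pg_restore", "--clean", "-h", "slave1.example.org", "-U", "geonetwork",
            "-d", "geonetwork", "/tmp/geonetwork.dump");
    }
}
```

Scheduled from cron or any job scheduler, this keeps each slave at most one day behind the master, which matches the propagation delay described above.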

## Master server and slaves with daily harvesting

This option is similar to the previous one, but it uses harvesting between the master and the slaves instead of removing and rebuilding each slave's database and index. It is not as widely used because harvesting is sometimes slower than rebuilding the database and indexes. See the harvesting proposal.
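For completeness, here is a sketch of how an external scheduler might ask each slave to run its harvester against the master. The endpoint path below is a hypothetical placeholder, not a confirmed GeoNetwork service, and the slave host names are illustrative; in practice the trigger would go through whatever harvesting services the GeoNetwork instances expose.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

/**
 * Sketch of an external scheduler that asks each slave to run its harvester.
 * The endpoint path is a hypothetical placeholder, not a confirmed GeoNetwork
 * service; slave host names are illustrative.
 */
public class TriggerHarvest {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        for (String slave : List.of("slave1.example.org", "slave2.example.org")) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://" + slave + ":8080/geonetwork/run-harvester"))
                    .POST(HttpRequest.BodyPublishers.noBody())
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(slave + " -> HTTP " + response.statusCode());
        }
    }
}
```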

## Balanced servers with JMS

This is an old proposal that was never merged. In this case, there are several GeoNetwork instances, all of them writable and all of them connected to the same database. This way, every instance offers the same features and data, since all of them use the same storage (the database).
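To coordinate the instances, each one can publish a message when it changes a record so that its peers update their local state. A minimal sketch using the standard javax.jms API with ActiveMQ follows; the broker URL, topic name and message payload are assumptions for illustration, not part of the original proposal.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import org.apache.activemq.ActiveMQConnectionFactory;

/**
 * Sketch of change notification between balanced GeoNetwork instances.
 * Broker URL, topic name and message payload are illustrative assumptions.
 */
public class MetadataChangePublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://broker.example.org:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // A topic, so every instance (not just one consumer) sees the event.
        Topic topic = session.createTopic("geonetwork.metadata.changed");
        MessageProducer producer = session.createProducer(topic);

        // Peers receiving this would refresh metadata record 42 locally.
        TextMessage message = session.createTextMessage("metadata-changed:42");
        producer.send(message);

        connection.close();
    }
}
```

Each instance would also subscribe to the same topic with a MessageListener and, for example, re-index the affected record in its local Lucene index when a notification arrives.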

In the current proposal, every instance has its own Lucene index, but if we used a Solr index instead, all GeoNetwork instances could share the same index too, which means that all instances would be synchronized in real time.
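With a shared index, a write from any instance is immediately visible to all the others. A minimal SolrJ sketch, assuming a Solr core named geonetwork and illustrative field names (a real deployment would mirror GeoNetwork's actual index schema):

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

/**
 * Sketch of indexing into a shared Solr core. The Solr URL, core name and
 * field names are illustrative assumptions.
 */
public class SharedIndexExample {
    public static void main(String[] args) throws Exception {
        SolrClient solr = new HttpSolrClient.Builder(
                "http://solr.example.org:8983/solr/geonetwork").build();

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "42");
        doc.addField("title", "Sample metadata record");

        solr.add(doc);
        solr.commit(); // visible to every GeoNetwork instance querying this core
        solr.close();
    }
}
```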

## Comparison table

| Proposal | Writable instances | Synchronization delay | GeoNetwork core changes |
|---|---|---|---|
| Daily database synchronization | Master only | Up to a day | None (external tool required) |
| Daily harvesting | Master only | Up to a day; harvesting can be slower than a restore | None |
| Balanced servers with JMS | All instances | Real time (with a shared Solr index) | Yes (JMS support, not yet merged) |

The JMS solution seems to be the most complete approach and the right way to cluster and scale GeoNetwork.
