Configuration
For the most part, the Bulk Import Tool is self-configuring and shouldn't need much, if any, configuration or tuning to perform well. In the 9 years the tool has existed, the vast majority of performance bottlenecks have been unrelated to the tool or to Alfresco. Far more common causes are poorly tuned or under-capacity database servers, a saturated network (especially when Alfresco's contentstore and the source content directory are on remote devices), a failing hard drive in a RAID array, and so on.
That said, the tool does provide a small number of tunables, all of which can be added to alfresco-global.properties to override their default values. They are:
# The maximum "weight" of a batch. Each file in a node (whether content,
# metadata or version) counts towards this total, as does (a fraction of)
# content file size.
alfresco-bulk-import.batch.weight=100
# The size of the thread pool (during the file import phase only)
# <= 0 means autosize based on the number of CPU cores in the server
alfresco-bulk-import.file.threadpool.size=-1
# The maximum size (number of batches) allowed in the queue, before scanning
# receives back-pressure (i.e. gets blocked)
alfresco-bulk-import.batch.queue.size=100
# How long to keep inactive threads alive - see https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/TimeUnit.html for available units
alfresco-bulk-import.threadpool.keepAlive.time=10
alfresco-bulk-import.threadpool.keepAlive.units=MINUTES
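To make the meaning of these tunables more concrete, here is a minimal Java sketch of how settings like these could map onto a standard java.util.concurrent thread pool. It is not the tool's actual code: the class name ImportPoolSketch, its methods, and the autosizing rule (one thread per CPU core) are assumptions made for illustration. Only the property names and default values come from the list above.

```java
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: NOT the Bulk Import Tool's implementation.
final class ImportPoolSketch {

    // <= 0 means "autosize". The tool's real sizing formula is not documented
    // here, so this sketch assumes one thread per available CPU core.
    static int resolvePoolSize(int configured) {
        return configured > 0 ? configured : Runtime.getRuntime().availableProcessors();
    }

    static ThreadPoolExecutor buildPool(Properties props) {
        int poolSize = resolvePoolSize(Integer.parseInt(
            props.getProperty("alfresco-bulk-import.file.threadpool.size", "-1")));
        int queueSize = Integer.parseInt(
            props.getProperty("alfresco-bulk-import.batch.queue.size", "100"));
        long keepAlive = Long.parseLong(
            props.getProperty("alfresco-bulk-import.threadpool.keepAlive.time", "10"));
        // The units value must be the name of a java.util.concurrent.TimeUnit constant.
        TimeUnit unit = TimeUnit.valueOf(
            props.getProperty("alfresco-bulk-import.threadpool.keepAlive.units", "MINUTES"));

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            poolSize, poolSize, keepAlive, unit,
            // Bounded queue: at most queueSize batches may wait for a worker.
            new ArrayBlockingQueue<Runnable>(queueSize),
            // Back-pressure: when the queue is full, block the submitting
            // (scanning) thread until a slot frees up, instead of failing.
            (task, executor) -> {
                try {
                    executor.getQueue().put(task);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RejectedExecutionException(e);
                }
            });
        // Let idle threads exit once the keep-alive period has elapsed.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

In this sketch, a larger queue size lets scanning run further ahead of the import threads at the cost of memory, while a smaller one makes scanning block sooner.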
Tuning Alfresco itself is also worthwhile, although in general I recommend focusing on the database as the first priority.
Alfresco versions 6.2 and above include a new class, ContentPropertyRestrictionInterceptor, that prevents the NodeService from being used to add or update the content property, which disables the tool's ability to set metadata.
This class reads two properties: one disables the interceptor completely, and the other supplies a whitelist of package/class names that are exempt from it.
The properties are:
- contentPropertyRestrictions.enabled - Setting contentPropertyRestrictions.enabled=false disables the interceptor for the whole application, allowing the import tool to function properly.
- contentPropertyRestrictions.whitelist - Takes a package or class name (or a comma-delimited list, if adding more than one). When using the out-of-the-box tool, set it to contentPropertyRestrictions.whitelist=org.alfresco.extension.bulkimport.impl.BatchImporterImpl. This property allows any class in the whitelist to set the content data of a node without being intercepted.
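As a rough illustration of what a comma-delimited whitelist of package and class names implies, the following Java sketch parses such a value and checks a caller's class name against it. This is a hypothetical model, not Alfresco's code: the class WhitelistSketch, its methods, and the matching rule (exact class name, or a package prefix followed by a dot) are all assumptions, and the real interceptor may match differently. The package com.example.custom is a made-up entry.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; Alfresco's real matching rules may differ.
final class WhitelistSketch {

    // Split the comma-delimited property value into trimmed, non-empty entries.
    static List<String> parse(String propertyValue) {
        return Arrays.stream(propertyValue.split(","))
            .map(String::trim)
            .filter(entry -> !entry.isEmpty())
            .collect(Collectors.toList());
    }

    // Assumed rule: an entry matches either the exact fully qualified class
    // name, or any class located inside the named package.
    static boolean isAllowed(String className, List<String> whitelist) {
        for (String entry : whitelist) {
            if (className.equals(entry) || className.startsWith(entry + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> whitelist = parse(
            "org.alfresco.extension.bulkimport.impl.BatchImporterImpl, com.example.custom");
        System.out.println(isAllowed(
            "org.alfresco.extension.bulkimport.impl.BatchImporterImpl", whitelist));
        System.out.println(isAllowed("com.example.other.Foo", whitelist));
    }
}
```

Whitelisting the single class, rather than disabling the interceptor for the whole application, keeps the restriction in force for all other code.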
Back to wiki home.
Copyright © Peter Monks. Licensed under the Apache 2.0 License.