This repository has been archived by the owner on Sep 25, 2022. It is now read-only.

Configuration

Sean edited this page Jan 24, 2020 · 6 revisions

For the most part the Bulk Import Tool is self-configuring and should need little, if any, configuration or tuning to perform well. In the 9 years the tool has existed, the vast majority of performance bottlenecks have been unrelated to the tool or Alfresco. Far more common causes are poorly tuned or under-capacity database servers, a saturated network (especially when Alfresco's contentstore and the source content directory are on remote devices), or a failing hard drive in a RAID array.

That said, the tool does provide a small number of tunables, all of which can be added to alfresco-global.properties to override their default values. They are:

# The maximum "weight" of a batch.  Each file in a node (whether content,
# metadata or version) counts towards this total, as does (a fraction of)
# content file size.
alfresco-bulk-import.batch.weight=100

# The size of the thread pool (during the file import phase only)
# <= 0 means autosize based on the number of CPU cores in the server
alfresco-bulk-import.file.threadpool.size=-1

# The maximum size (number of batches) allowed in the queue, before scanning
# receives back-pressure (i.e. gets blocked)
alfresco-bulk-import.batch.queue.size=100

# How long to keep inactive threads alive - see https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/TimeUnit.html for available units
alfresco-bulk-import.threadpool.keepAlive.time=10
alfresco-bulk-import.threadpool.keepAlive.units=MINUTES
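As an illustration only, here is a sketch of what a set of overrides in alfresco-global.properties might look like on a server where a fixed-size import thread pool and a shorter keep-alive are wanted. The property names are the ones listed above; the values (8 threads, 5 minutes) are assumptions chosen for the example, not recommendations.

```properties
# Illustrative overrides only; the values below are assumptions, not recommendations.

# Pin the file import thread pool to 8 threads instead of autosizing
alfresco-bulk-import.file.threadpool.size=8

# Let idle threads expire after 5 minutes
# (units must be the name of a java.util.concurrent.TimeUnit constant, e.g. SECONDS, MINUTES, HOURS)
alfresco-bulk-import.threadpool.keepAlive.time=5
alfresco-bulk-import.threadpool.keepAlive.units=MINUTES
```

Any tunable left out of alfresco-global.properties keeps its default value.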

Tuning Alfresco itself is also worthwhile, although in general I recommend focusing on the database as the first priority.

Note on configuration for Alfresco versions 6.2+

Alfresco versions 6.2 and above include a new class, ContentPropertyRestrictionInterceptor, that prevents callers from using the NodeService to add or update the content property, which disables the tool's ability to set metadata.

This class reads two properties: one disables the interceptor completely, and the other supplies a whitelist of package/class names that are exempt from it.

The properties are:

  • contentPropertyRestrictions.enabled - Setting contentPropertyRestrictions.enabled=false disables the interceptor for the whole application, allowing the import tool to function properly.
  • contentPropertyRestrictions.whitelist - A package or class name (or a comma-delimited list, if adding more than one). Any class in the whitelist can set the content data of a node without being intercepted. When using the out-of-the-box tool, set the property as follows: contentPropertyRestrictions.whitelist=org.alfresco.extension.bulkimport.impl.BatchImporterImpl.
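The two options above could be written in alfresco-global.properties roughly as follows. This is a sketch rather than a tested configuration: only one option is needed, and the second class name in the commented-out multi-entry line (com.example.MyCustomImporter) is a hypothetical placeholder, not a class that ships with the tool.

```properties
# Option 1: whitelist only the bulk import tool's importer class,
# leaving the interceptor active for everything else
contentPropertyRestrictions.whitelist=org.alfresco.extension.bulkimport.impl.BatchImporterImpl

# Whitelisting more than one class or package uses a comma-delimited list.
# com.example.MyCustomImporter is a hypothetical name used for illustration only.
#contentPropertyRestrictions.whitelist=org.alfresco.extension.bulkimport.impl.BatchImporterImpl,com.example.MyCustomImporter

# Option 2: disable the interceptor for the whole application
#contentPropertyRestrictions.enabled=false
```

The whitelist is the narrower of the two changes, since it leaves the restriction in place for all other code.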

Back to wiki home.
