Updating documentation
mmolimar committed Jul 5, 2020
1 parent 8a65ec4 commit 44288dc
Showing 4 changed files with 17 additions and 1 deletion.
2 changes: 2 additions & 0 deletions config/kafka-connect-fs.properties
@@ -6,4 +6,6 @@ topic=mytopic
policy.class=com.github.mmolimar.kafka.connect.fs.policy.SimplePolicy
policy.recursive=true
policy.regexp=^.*\.txt$
policy.batch_size=0
file_reader.class=com.github.mmolimar.kafka.connect.fs.file.reader.TextFileReader
file_reader.batch_size=0
7 changes: 7 additions & 0 deletions docs/source/config_options.rst
@@ -117,6 +117,13 @@ General config properties for this connector.
* Type: string
* Importance: high

``file_reader.batch_size``
Number of records to process at a time. Non-positive values disable batching.

* Type: int
* Default: ``0``
* Importance: medium

``file_reader.<file_reader_name>.<file_reader_property>``
This represents custom properties you can include based on the file reader class specified.
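
The batching rule documented above for ``file_reader.batch_size`` ("Non-positive values disable batching") can be illustrated with a minimal sketch. This is not the connector's Java implementation; the helper name `in_batches` is hypothetical and only models the documented semantics.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield groups of at most `batch_size` items (hypothetical helper).

    Assumption: a non-positive batch_size means "no batching", so all
    items come back in a single group, mirroring the documented rule.
    """
    iterator = iter(items)
    if batch_size <= 0:
        # Batching disabled: hand everything back at once.
        everything = list(iterator)
        if everything:
            yield everything
        return
    while True:
        # Take the next `batch_size` items; stop when nothing is left.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


records = ["line-1", "line-2", "line-3", "line-4", "line-5"]
batched = list(in_batches(records, 2))
unbatched = list(in_batches(records, 0))
```

With a batch size of 2, the five records are split into groups of 2, 2 and 1; with the default of ``0`` they arrive as one group.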

6 changes: 5 additions & 1 deletion docs/source/connector.rst
@@ -15,7 +15,7 @@ Among others, these are some file systems it supports:
* S3.
* Google Cloud Storage.
* Azure Blob Storage & Azure Data Lake Store.
* FTP.
* FTP & SFTP.
* WebHDFS.
* Local File System.
* Hadoop Archive File System.
@@ -52,7 +52,9 @@ The ``kafka-connect-fs.properties`` file defines the following properties as req
policy.class=<Policy class>
policy.recursive=true
policy.regexp=.*
policy.batch_size=0
file_reader.class=<File reader class>
file_reader.batch_size=0

#. The connector name.
#. Class indicating the connector.
@@ -65,8 +67,10 @@ The ``kafka-connect-fs.properties`` file defines the following properties as req
``com.github.mmolimar.kafka.connect.fs.policy.Policy`` interface).
#. Flag to enable recursive traversal of subdirectories when listing files.
#. Regular expression to filter files from the FS.
#. Number of files that should be handled at a time. Non-positive values disable batching.
#. File reader class to read files from the FS
(must implement ``com.github.mmolimar.kafka.connect.fs.file.reader.FileReader`` interface).
#. Number of records to process at a time. Non-positive values disable batching.

More detailed information about these properties can be found :ref:`here<config_options-general>`.
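
As a rough sketch of how the two new batch-size entries could be read from a properties file like the one above: the parser below is a simplified stand-in (Kafka Connect does its own configuration loading), and the names `parse_properties` and `batch_size` are hypothetical. The ``0`` default for ``file_reader.batch_size`` comes from the documentation in this commit; using the same default for ``policy.batch_size`` is an assumption.

```python
from typing import Dict

# Sample content modeled on config/kafka-connect-fs.properties in this commit.
SAMPLE = r"""
policy.recursive=true
policy.regexp=^.*\.txt$
policy.batch_size=0
file_reader.batch_size=0
"""


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def batch_size(props: Dict[str, str], key: str) -> int:
    """Return the configured batch size, defaulting to 0 (batching off).

    Assumption: a missing key falls back to 0 for both properties.
    """
    return int(props.get(key, "0"))


props = parse_properties(SAMPLE)
files_per_batch = batch_size(props, "policy.batch_size")
records_per_batch = batch_size(props, "file_reader.batch_size")
```

Both values are ``0`` in the sample configuration, so neither file batching nor record batching is active until a positive value is set.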

3 changes: 3 additions & 0 deletions docs/source/faq.rst
@@ -42,6 +42,9 @@ Obviously, this depends on the files in the FS(s) but having several URIs in
the connector might be a good idea to adjust the number of tasks
to process those URIs in parallel (``tasks.max`` connector property).

Also, setting the properties ``policy.batch_size`` and/or ``file_reader.batch_size``
might help when you have a very large number of files or very large files.

**I removed a file from the FS but the connector is still sending messages
with the contents of that file.**
