Commands
These are the command line commands of Annif, with REST API equivalents where applicable.
Most of these commands take a projectid parameter. Projects are identified by alphanumeric strings (A-Za-z0-9_-).
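The character set above can be checked before a project ID is passed to a command. The following is a minimal sketch, not part of Annif itself: the function name is hypothetical, and rejecting the empty string is an assumption, since the text above only names the allowed characters.

```python
import re

# Character set quoted in the text above: A-Z, a-z, 0-9, "_" and "-".
# Requiring at least one character is an assumption of this sketch.
PROJECT_ID_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_project_id(project_id: str) -> bool:
    """Return True if the whole string uses only the allowed characters."""
    return PROJECT_ID_RE.fullmatch(project_id) is not None
```

For example, is_valid_project_id("my-project_1") returns True, while an ID containing a space or a slash is rejected.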
annif list-projects
REST equivalent:
GET /projects/
Show a list of currently defined projects. Projects are defined in a configuration file, normally called projects.cfg. See Project configuration for details.
annif show-project <projectid>
REST equivalent:
GET /projects/<projectid>
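The two REST equivalents above can be prepared from Python with the standard library. This is a sketch under assumptions: the base URL is taken to be that of the development server described under annif run, with the API served at its root, and the helper names are hypothetical. A real deployment may use a different host or path prefix. The response format is not described on this page, so the sketch only builds the requests and does not parse anything.

```python
from urllib.parse import quote, urljoin
from urllib.request import Request

# Assumption: the API is served at the root of the development server.
BASE_URL = "http://localhost:5000/"


def list_projects_request(base_url: str = BASE_URL) -> Request:
    """Build the request for GET /projects/."""
    return Request(urljoin(base_url, "projects/"), method="GET")


def show_project_request(project_id: str, base_url: str = BASE_URL) -> Request:
    """Build the request for GET /projects/<projectid>."""
    path = "projects/" + quote(project_id, safe="")
    return Request(urljoin(base_url, path), method="GET")
```

Passing either request to urllib.request.urlopen would send it, provided a server is actually running at the assumed address.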
annif train <projectid> <path> [<path2> ...]
Parameters:
- path: path(s) to a directory containing text files in the corpus format, or a TSV file (possibly gzipped)
This will train the project using all the documents from the given directory or TSV file in a single batch operation.
REST equivalent: N/A
annif analyze <projectid> [--limit=MAX] [--threshold=THRESHOLD] <document.txt
This will read a text document from standard input and suggest subjects for it.
Parameters:
- limit: maximum number of subjects to return
- threshold: minimum score threshold, below which results will not be returned
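The effect of the two parameters can be illustrated with a small stand-alone function. This is only a sketch of the behaviour described above, not Annif's implementation: the (subject, score) pair representation and the function name are assumptions, as is keeping results whose score is exactly equal to the threshold.

```python
def select_subjects(scored, limit=None, threshold=0.0):
    """Apply a score threshold, then keep at most `limit` top-scoring subjects.

    `scored` is a list of (subject, score) pairs (an assumed representation).
    """
    # Drop results whose score falls below the threshold.
    kept = [(subject, score) for subject, score in scored if score >= threshold]
    # Highest scores first, so the limit keeps the best suggestions.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept if limit is None else kept[:limit]
```

With the hypothetical input [("cats", 0.9), ("dogs", 0.4), ("birds", 0.1)], a threshold of 0.2 removes "birds", and a limit of 1 keeps only "cats".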
REST equivalent:
POST /projects/<projectid>/analyze
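Because the command-line form reads the document from standard input, it can also be driven from a script. The sketch below assumes the annif executable is on the PATH; the helper names are hypothetical, and only the options documented above are used. The output format is not described on this page, so the raw bytes are returned unparsed.

```python
import subprocess


def build_analyze_command(project_id, limit=None, threshold=None):
    """Assemble the argument list for `annif analyze`, following the synopsis above."""
    command = ["annif", "analyze", project_id]
    if limit is not None:
        command.append(f"--limit={limit}")
    if threshold is not None:
        command.append(f"--threshold={threshold}")
    return command


def analyze_file(project_id, path, **options):
    """Feed the file at `path` to the command on standard input."""
    with open(path, "rb") as document:
        result = subprocess.run(
            build_analyze_command(project_id, **options),
            stdin=document,        # same effect as `<document.txt` in a shell
            capture_output=True,
            check=True,            # raise if the command exits with an error
        )
    return result.stdout
```

Passing the file object as stdin mirrors the shell redirection in the synopsis without invoking a shell.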
annif eval <projectid> [--limit=MAX] [--threshold=THRESHOLD] <path> [<path2> ...]
You need to supply the documents in one of the supported Document corpus formats, i.e. either as a directory or as a TSV file. It is possible to give multiple corpora (even mixing corpus formats), in which case they will all be processed in the same run.
The output is a list of statistical measures.
Parameters:
- limit: maximum number of subjects to return
- threshold: minimum score threshold, below which results will not be returned
- path: path(s) to a directory containing text files in the corpus format or a TSV file (possibly gzipped)
REST equivalent: N/A
annif optimize <projectid> <path> [<path2> ...]
As with eval, you need to supply the documents in one of the supported Document corpus formats.
This command will read each document, assign subjects to it using different limit and threshold values, and compare the results with the gold standard subjects.
The output is a list of parameter combinations and their scores. From the output, you can determine the optimum limit and threshold parameters depending on which measure you want to target.
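The idea of trying different limit and threshold combinations against gold standard subjects can be sketched as a plain grid search. This illustrates the general technique only and is not Annif's code: the data layout and function names are hypothetical, and the F1 score is an assumed stand-in for whichever measure you want to target.

```python
from itertools import product


def f1_score(suggested: set, gold: set) -> float:
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    hits = len(suggested & gold)
    if hits == 0:
        return 0.0
    precision = hits / len(suggested)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)


def grid_search(documents, limits, thresholds):
    """Score every (limit, threshold) combination.

    `documents` is a non-empty list of (scored_suggestions, gold_subjects)
    pairs, where scored_suggestions is a list of (subject, score) tuples.
    Returns a list of (limit, threshold, mean F1) tuples.
    """
    results = []
    for limit, threshold in product(limits, thresholds):
        scores = []
        for scored, gold in documents:
            ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
            kept = [subject for subject, score in ranked if score >= threshold]
            scores.append(f1_score(set(kept[:limit]), gold))
        results.append((limit, threshold, sum(scores) / len(scores)))
    return results
```

The best combination for the chosen measure can then be picked with max(results, key=lambda row: row[2]).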
Parameters:
- path: path(s) to a directory containing text files in the corpus format or a TSV file (possibly gzipped)
REST equivalent: N/A
annif run
This will start a development web server on http://localhost:5000/.
REST equivalent: N/A