
Bump nextflow docs 23.12.0-edge
Signed-off-by: Paolo Di Tommaso <[email protected]>
pditommaso committed Dec 26, 2023
1 parent c038f2b commit b741e64
Showing 70 changed files with 476 additions and 259 deletions.
2 changes: 1 addition & 1 deletion assets/docs/edge/.buildinfo
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 7c40cf005ba0e6b4ce91c56d3c010b82
config: 5d9195a7465e948284a07f84b8df5dee
tags: 645f666f9bcd5a90fca523b33c5a78b7
12 changes: 6 additions & 6 deletions assets/docs/edge/_sources/amazons3.md.txt
@@ -1,8 +1,8 @@
(amazons3-page)=

# Amazon S3 storage
# AWS S3 storage

Nextflow includes support for Amazon S3 storage. Files stored in an S3 bucket can be accessed transparently in your pipeline script like any other file in the local file system.
Nextflow includes support for AWS S3 storage. Files stored in an S3 bucket can be accessed transparently in your pipeline script like any other file in the local file system.
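The transparent access described above can be sketched in a pipeline script; the bucket and object names below are hypothetical, for illustration only:

```groovy
// Hypothetical S3 bucket and object names
params.reads = 's3://my-bucket/data/sample.fastq'

workflow {
    // The S3 object is staged transparently, like any local file
    Channel.fromPath(params.reads).view()
}
```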

## S3 path

@@ -24,10 +24,10 @@ See the {ref}`script-file-io` section to learn more about available file operati

## Security credentials

Amazon access credentials can be provided in two ways:
AWS access credentials can be provided in two ways:

1. Using AWS access and secret keys in your pipeline configuration.
2. Using IAM roles to grant access to S3 storage on Amazon EC2 instances.
2. Using IAM roles to grant access to S3 storage on AWS EC2 instances.
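The first option can be sketched in `nextflow.config`; the key values and region below are placeholders:

```groovy
aws {
    accessKey = '<YOUR ACCESS KEY>'  // placeholder
    secretKey = '<YOUR SECRET KEY>'  // placeholder
    region    = 'eu-west-1'          // example region
}
```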

### AWS access and secret keys

@@ -52,13 +52,13 @@ If the access credentials are not found in the above file, Nextflow looks for AW

More information regarding [AWS Security Credentials](http://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html) is available in the AWS documentation.

### IAM roles with Amazon EC2 instances
### IAM roles with AWS EC2 instances

When running your pipeline in an EC2 instance, IAM roles can be used to grant access to AWS resources.

In this scenario, you only need to launch the EC2 instance with an IAM role which includes the `AmazonS3FullAccess` policy. Nextflow will detect and automatically acquire the permission to access S3 storage, without any further configuration.

Learn more about [Using IAM Roles to Delegate Permissions to Applications that Run on Amazon EC2](http://docs.aws.amazon.com/IAM/latest/UserGuide/roles-usingrole-ec2instance.html) in the Amazon documentation.
Learn more about [Using IAM Roles to Delegate Permissions to Applications that Run on AWS EC2](http://docs.aws.amazon.com/IAM/latest/UserGuide/roles-usingrole-ec2instance.html) in the AWS documentation.

## China regions

33 changes: 31 additions & 2 deletions assets/docs/edge/_sources/aws.md.txt
@@ -1,6 +1,6 @@
(aws-page)=

# Amazon Web Services
# AWS Cloud

## AWS security credentials

@@ -132,7 +132,7 @@ See the [bucket policy documentation](https://docs.aws.amazon.com/config/latest/

## AWS Batch

[AWS Batch](https://aws.amazon.com/batch/) is a managed computing service that allows the execution of containerised workloads in the Amazon cloud infrastructure. It dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized compute resources) based on the volume and specific resource requirements of the jobs submitted.
[AWS Batch](https://aws.amazon.com/batch/) is a managed computing service that allows the execution of containerised workloads in the AWS cloud infrastructure. It dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized compute resources) based on the volume and specific resource requirements of the jobs submitted.

Nextflow provides built-in support for AWS Batch, allowing the seamless deployment of Nextflow pipelines in the cloud, in which tasks are offloaded as Batch jobs.

@@ -496,6 +496,35 @@ There are multiple reasons why this can happen. They are mainly related to the C

This [AWS page](https://aws.amazon.com/premiumsupport/knowledge-center/batch-job-stuck-runnable-status/) provides several resolutions and tips to investigate and work around the issue.

## AWS Fargate

:::{versionadded} 23.12.0-edge
:::

Nextflow provides experimental support for the execution of [AWS Batch jobs with Fargate resources](https://docs.aws.amazon.com/batch/latest/userguide/fargate.html).

AWS Fargate is a technology that you can use with AWS Batch to run containers without having to manage servers or EC2 instances.
With AWS Fargate, you no longer have to provision, configure, or scale clusters of virtual machines to run containers.

To enable the use of AWS Fargate in your pipeline, use the following settings in your `nextflow.config` file:

```groovy
process.executor = 'awsbatch'
process.queue = '<AWS BATCH QUEUE>'
aws.region = '<AWS REGION>'
aws.batch.platformType = 'fargate'
aws.batch.jobRole = 'JOB ROLE ARN'
aws.batch.executionRole = 'EXECUTION ROLE ARN'
wave.enabled = true
```

See the AWS documentation for details on how to create the required AWS Batch queue for Fargate, the Batch Job Role,
and the Batch Execution Role.

:::{note}
This feature requires the use of the {ref}`Wave <wave-page>` container provisioning service.
:::
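With the Fargate platform type, per-task resources are declared as usual; a hypothetical process sketch, noting that the `arch` and `disk` directives are honoured only with Fargate:

```groovy
process myTask {
    cpus 2
    memory '4 GB'
    disk '30 GB'            // Fargate ephemeral storage (Fargate only)
    arch 'linux/arm64'      // CPU architecture (Fargate only)

    """
    your_command --here
    """
}
```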

## Advanced configuration

Read the {ref}`AWS configuration<config-aws>` section to learn more about advanced configuration options.
62 changes: 56 additions & 6 deletions assets/docs/edge/_sources/config.md.txt
@@ -104,6 +104,11 @@ The following settings are available:
`apptainer.noHttps`
: Pull the Apptainer image with http protocol (default: `false`).

`apptainer.ociAutoPull`
: :::{versionadded} 23.12.0-edge
:::
: When enabled, OCI (and Docker) container images are pulled and converted to the SIF format by the Apptainer run command, instead of Nextflow (default: `false`).

`apptainer.pullTimeout`
: The amount of time the Apptainer pull can last, exceeding which the process is terminated (default: `20 min`).

@@ -169,8 +174,13 @@ The following settings are available:
`aws.batch.delayBetweenAttempts`
: Delay between download attempts from S3 (default: `10 sec`).

`aws.batch.executionRole`
: :::{versionadded} 23.12.0-edge
:::
: The AWS Batch Execution Role ARN that needs to be used to execute the Batch Job. This is mandatory when using the AWS Fargate platform type. See the [AWS documentation](https://docs.aws.amazon.com/batch/latest/userguide/execution-IAM-role.html) for more details.

`aws.batch.jobRole`
: The AWS Job Role ARN that needs to be used to execute the Batch Job.
: The AWS Batch Job Role ARN that needs to be used to execute the Batch Job.

`aws.batch.logsGroup`
: :::{versionadded} 22.09.0-edge
@@ -188,6 +198,11 @@ The following settings are available:
`aws.batch.maxTransferAttempts`
: Max number of download attempts from S3 (default: `1`).

`aws.batch.platformType`
: :::{versionadded} 23.12.0-edge
:::
: The compute platform type used by AWS Batch, which can be either `ec2` or `fargate`. See the AWS documentation to learn more about the [AWS Fargate platform type](https://docs.aws.amazon.com/batch/latest/userguide/fargate.html) for AWS Batch.

`aws.batch.retryMode`
: The retry mode configuration setting, to accommodate rate-limiting on [AWS services](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-retries.html) (default: `standard`, other options: `legacy`, `adaptive`); this handling is delegated to AWS. To have Nextflow handle retries instead, use `built-in`.

@@ -865,6 +880,37 @@ The following settings are available for Cloud Life Sciences:
`google.zone`
: The Google Cloud zone where jobs are executed. Multiple zones can be provided as a comma-separated list. Cannot be used with the `google.region` option. See the [Google Cloud documentation](https://cloud.google.com/compute/docs/regions-zones/) for a list of available regions and zones.

`google.batch.allowedLocations`
: :::{versionadded} 22.12.0-edge
:::
: Define the set of allowed locations for VMs to be provisioned. See [Google documentation](https://cloud.google.com/batch/docs/reference/rest/v1/projects.locations.jobs#locationpolicy) for details (default: no restriction).

`google.batch.bootDiskSize`
: Set the size of the virtual machine boot disk, e.g. `50.GB` (default: none).

`google.batch.cpuPlatform`
: Set the minimum CPU Platform, e.g. `'Intel Skylake'`. See [Specifying a minimum CPU Platform for VM instances](https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform#specifications) (default: none).

`google.batch.installGpuDrivers`
: :::{versionadded} 23.08.0-edge
:::
: When `true`, automatically installs the appropriate GPU drivers on the VM when a GPU is requested (default: `false`). Only needed when using an instance template.

`google.batch.network`
: Set the network name to attach the VM's network interface to. The value will be prefixed with `global/networks/` unless it contains a `/`, in which case it is assumed to be a fully specified network resource URL. If unspecified, the global default network is used.

`google.batch.serviceAccountEmail`
: Define the Google service account email to use for the pipeline execution. If not specified, the default Compute Engine service account for the project will be used.

`google.batch.spot`
: When `true`, enables the use of *spot* virtual machines (default: `false`).

`google.batch.subnetwork`
: The name of the subnetwork to attach the instance to, which must be specified when the network is configured for custom subnet creation. The value is prefixed with `regions/subnetworks/` unless it contains a `/`, in which case it is assumed to be a fully specified subnetwork resource URL.

`google.batch.usePrivateAddress`
: When `true`, the VM will NOT be provided with a public IP address, and will only have an internal IP. If this option is enabled, the associated job can only load Docker images from Google Container Registry, and the job executable cannot use external services other than Google APIs (default: `false`).

`google.lifeSciences.bootDiskSize`
: Set the size of the virtual machine boot disk, e.g. `50.GB` (default: none).

@@ -1139,8 +1185,8 @@ Read the {ref}`sharing-page` page to learn how to publish your pipeline to GitHu

The `notification` scope allows you to define the automatic sending of a notification email message when the workflow execution terminates.

`notification.binding`
: A map modelling the variables in the template file.
`notification.attributes`
: A map object modelling the variables that can be used in the template file.

`notification.enabled`
: Enables the sending of a notification message when the workflow execution completes.
@@ -1397,11 +1443,15 @@ The following settings are available:
`singularity.noHttps`
: Pull the Singularity image with http protocol (default: `false`).

`singularity.oci`
: :::{versionadded} 23.11.0-edge
`singularity.ociAutoPull`
: :::{versionadded} 23.12.0-edge
:::
: Enable OCI-mode the allows the use of native OCI-compatible containers with Singularity. See [Singularity documentation](https://docs.sylabs.io/guides/4.0/user-guide/oci_runtime.html#oci-mode) for more details and requirements (default: `false`).
: When enabled, OCI (and Docker) container images are pulled and converted to the SIF image file format implicitly by the Singularity run command, instead of by Nextflow. Requires Singularity 3.11 or later (default: `false`).

`singularity.ociMode`
: :::{versionadded} 23.12.0-edge
:::
: Enable OCI-mode, which allows running native OCI-compliant container images with Singularity using `crun` or `runc` as the low-level runtime. Note: it requires Singularity 4 or later. See the `--oci` flag in the [Singularity documentation](https://docs.sylabs.io/guides/4.0/user-guide/oci_runtime.html#oci-mode) for more details and requirements (default: `false`).

`singularity.pullTimeout`
: The amount of time the Singularity pull can last, exceeding which the process is terminated (default: `20 min`).
2 changes: 2 additions & 0 deletions assets/docs/edge/_sources/executor.md.txt
@@ -23,9 +23,11 @@ The pipeline can be launched either in a local computer, or an EC2 instance. EC2
Resource requests and other job characteristics can be controlled via the following process directives:

- {ref}`process-accelerator`
- {ref}`process-arch` (only when using Fargate platform type for AWS Batch)
- {ref}`process-container`
- {ref}`process-containerOptions`
- {ref}`process-cpus`
- {ref}`process-disk` (only when using Fargate platform type for AWS Batch)
- {ref}`process-memory`
- {ref}`process-queue`
- {ref}`process-resourcelabels`
8 changes: 3 additions & 5 deletions assets/docs/edge/_sources/fusion.md.txt
@@ -212,8 +212,6 @@ The following configuration options are available:
executor, which requires the [k8s-fuse-plugin](https://github.com/nextflow-io/k8s-fuse-plugin) to be installed
in the target cluster (default: `true`).

`fusion.tagsEnabled`
: Enable/disable the tagging of files created in the underlying object storage via the Fusion client (default: `true`).

`fusion.tagsPattern`
: The pattern that determines how tags are applied to files created via the Fusion client (default: `[.command.*|.exitcode|.fusion.*](nextflow.io/metadata=true),[*](nextflow.io/temporary=true)`)
`fusion.tags`
: The pattern that determines how tags are applied to files created via the Fusion client. To disable tags,
set it to `false` (default: `[.command.*|.exitcode|.fusion.*](nextflow.io/metadata=true),[*](nextflow.io/temporary=true)`).
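For example, to disable object tagging via the Fusion client, a `nextflow.config` fragment might read:

```groovy
fusion {
    enabled = true
    tags    = false   // disable object tagging by the Fusion client
}
```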
63 changes: 48 additions & 15 deletions assets/docs/edge/_sources/google.md.txt
@@ -90,45 +90,78 @@ Read the {ref}`Google configuration<config-google>` section to learn more about

### Process definition

Processes can be defined as usual and by default the `cpus` and `memory` directives are used to find the cheapest machine type available at current location that fits the requested resources. If `memory` is not specified, 1GB of memory is allocated per cpu.
By default, the `cpus` and `memory` directives are used to find the cheapest machine type that is available at the current
location and that fits the requested resources. If `memory` is not specified, 1 GB of memory is allocated per CPU.

:::{versionadded} 23.02.0-edge
The `machineType` directive can be a list of patterns separated by comma. The pattern can contain a `*` to match any number of characters and `?` to match any single character. Examples of valid patterns: `c2-*`, `m?-standard*`, `n*`.

Alternatively it can also be used to define a specific predefined Google Compute Platform [machine type](https://cloud.google.com/compute/docs/machine-types) or a custom machine type.
:::

Examples:
The `machineType` directive can be used to request a specific VM instance type. It can be any predefined Google Compute
Platform [machine type](https://cloud.google.com/compute/docs/machine-types) or [custom machine type](https://cloud.google.com/compute/docs/instances/creating-instance-with-custom-machine-type).

```groovy
process automatic_resources_task {
process myTask {
cpus 8
memory '40 GB'

"""
<Your script here>
your_command --here
"""
}

process anotherTask {
machineType 'n1-highmem-8'

"""
your_command --here
"""
}
```

process allowing_some_series {
:::{versionadded} 23.02.0-edge
:::

The `machineType` directive can also be a comma-separated list of patterns. The pattern can contain a `*` to match any
number of characters and `?` to match any single character. Examples of valid patterns: `c2-*`, `m?-standard*`, `n*`.

```groovy
process myTask {
cpus 8
memory '20 GB'
machineType 'n2-*,c2-*,m3-*'

"""
<Your script here>
your_command --here
"""
}
```

process predefined_resources_task {
machineType 'n1-highmem-8'
:::{versionadded} 23.12.0-edge
:::

The `machineType` directive can also be an [Instance Template](https://cloud.google.com/compute/docs/instance-templates),
specified as `template://<instance-template>`. For example:

```groovy
process myTask {
cpus 8
memory '20 GB'
machineType 'template://my-template'

"""
<Your script here>
your_command --here
"""
}
```

:::{note}
Using an instance template will override the `accelerator` and `disk` directives, as well as the following Google Batch
config options: `cpuPlatform`, `preemptible`, and `spot`.
:::

To use an instance template with GPUs, you must also set the `google.batch.installGpuDrivers` config option to `true`.

To use an instance template with Fusion, the instance template must include a `local-ssd` disk named `fusion` with 375 GB.
See the [Google Batch documentation](https://cloud.google.com/compute/docs/disks/local-ssd) for more details about local SSDs.


:::{versionadded} 23.06.0-edge
:::

2 changes: 1 addition & 1 deletion assets/docs/edge/_static/documentation_options.js
@@ -1,6 +1,6 @@
var DOCUMENTATION_OPTIONS = {
URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
VERSION: '23.11.0-edge',
VERSION: '23.12.0-edge',
LANGUAGE: 'en',
COLLAPSE_INDEX: false,
BUILDER: 'html',
Binary file modified assets/docs/edge/_static/favicon.ico
