From 59da86a31f682cddace31cf32c551ec46bb3cc4f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Franti=C5=A1ek=20Nesveda?= Date: Mon, 10 Jul 2023 11:55:27 +0200 Subject: [PATCH 1/2] feat: Update base Docker images, improve Dockerfile section, fix 404s --- .../actor_definition/dockerfile.md | 38 +++++++--- .../development/deployment/source_types.md | 8 +- .../actors/development/performance.md | 3 +- sources/platform/api_v2/api_v2_reference.apib | 76 +++++++++---------- 4 files changed, 69 insertions(+), 56 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dockerfile.md b/sources/platform/actors/development/actor_definition/dockerfile.md index 42af4ae36..b5eb74389 100644 --- a/sources/platform/actors/development/actor_definition/dockerfile.md +++ b/sources/platform/actors/development/actor_definition/dockerfile.md @@ -9,34 +9,48 @@ sidebar_position: 4 --- +## Base Docker images + Apify provides several Docker images that can be used as a base for user actors. All images come in two versions: the **latest** tag corresponds to the stable version and **beta** to images where we test new features. Use the beta version at your own risk. Note that all Apify Docker images are pre-cached on Apify servers in order to speed up the actor builds and runs. The source code used to generate the images is available in the [apify-actor-docker](https://github.com/apify/apify-actor-docker) GitHub repository. -## Images with Apify SDK and Crawlee preinstalled {#apify-sdk-actor-images} - -The [Apify SDK for JavaScript](/sdk/js) and [Crawlee](https://crawlee.dev/) are preinstalled on these images. You can read more about them in the [Apify SDK Docker image guide](/sdk/js/docs/guides/docker-images). +### Node.js base images -- **Node.js 16 on Alpine Linux** ([`apify/actor-node`](https://hub.docker.com/r/apify/actor-node/)) - slim and efficient image, contains only the most elementary tools. 
Note that headless browsers (Puppeteer, Playwright) are not available in this image.
+Apify provides several Docker images with Node.js, the [Apify SDK for JavaScript](/sdk/js) and [Crawlee](https://crawlee.dev/) preinstalled.
+These images come with Node.js 16, 18 or 20; choose the version you want with the `16`, `18` or `20` tag. The `latest` tag corresponds to the latest LTS version of Node.js.
-- **Node.js 16 + Puppeteer + Chrome on Debian** ([`apify/actor-node-puppeteer-chrome`](https://hub.docker.com/r/apify/actor-node-puppeteer-chrome/)) - larger image with the Chromium and Google Chrome browsers and the [`puppeteer`](https://github.com/puppeteer/puppeteer) library bundled. With this image, you can use the [`launchPuppeteer()`](https://crawlee.dev/api/puppeteer-crawler/function/launchPuppeteer) function and [`PuppeteerCrawler`](https://crawlee.dev/api/puppeteer-crawler/class/PuppeteerCrawler). Note that Chrome requires quite a lot of resources, therefore the actor should run with at least 2048 MB of memory.
+| Image | Description |
+| ----- | ----------- |
+| Node.js on Alpine Linux ([`actor-node`](https://hub.docker.com/r/apify/actor-node/)) | Slim and efficient image, contains only the most elementary tools. Note that headless browsers (Puppeteer, Playwright) are not available in this image. |
+| Node.js + Puppeteer + Chrome on Debian ([`actor-node-puppeteer-chrome`](https://hub.docker.com/r/apify/actor-node-puppeteer-chrome/)) | Larger image with the Chromium and Google Chrome browsers and the [`puppeteer`](https://github.com/puppeteer/puppeteer) library bundled. |
+| Node.js + Playwright + Chrome on Debian ([`actor-node-playwright-chrome`](https://hub.docker.com/r/apify/actor-node-playwright-chrome/)) | Larger image with the Chromium and Google Chrome browsers and the [`playwright`](https://github.com/microsoft/playwright) library bundled. |
+| Node.js + Playwright + Firefox on Debian ([`actor-node-playwright-firefox`](https://hub.docker.com/r/apify/actor-node-playwright-firefox/)) | Larger image with the Firefox browser and the [`playwright`](https://github.com/microsoft/playwright) library bundled. |
+| Node.js + Playwright + WebKit on Debian ([`actor-node-playwright-webkit`](https://hub.docker.com/r/apify/actor-node-playwright-webkit/)) | Larger image with the WebKit browser engine and the [`playwright`](https://github.com/microsoft/playwright) library bundled. |
+| Node.js + Playwright + all browsers on Debian ([`actor-node-playwright`](https://hub.docker.com/r/apify/actor-node-playwright/)) | A very large and slow image with the [`playwright`](https://github.com/microsoft/playwright) library and all Playwright browsers (Chromium, Chrome, Firefox, WebKit) bundled. |
-- **Node.js 16 + Playwright + Chrome on Debian** ([`apify/actor-node-playwright-chrome`](https://hub.docker.com/r/apify/actor-node-playwright-chrome/)) - similar to the `apify/actor-node-puppeteer-chrome` image, but it comes preinstalled the [`playwright`](https://github.com/microsoft/playwright) automation library instead of Puppeteer. With this image, you can use the [`launchPlaywright()`](https://crawlee.dev/api/playwright-crawler/function/launchPlaywright) function and [`PlaywrightCrawler`](https://crawlee.dev/api/playwright-crawler/class/PlaywrightCrawler). This image also comes with a `firefox` and `webkit` version.
+You can read more about each of the images in the [Apify SDK Docker image guide](/sdk/js/docs/guides/docker-images).
-For a full list of available images, [see the Apify SDK Docker image guide](/sdk/js/docs/guides/docker-images).
+### Python base images
-## Images with Apify Client for Python preinstalled {#python-actor-images}
+Apify provides several Docker images with Python 3 and the [Apify SDK for Python](/sdk/python) preinstalled.
+These images come with Python 3.8, 3.9, 3.10 or 3.11; choose the version you want with the `3.8`, `3.9`, `3.10` or `3.11` tag. The `latest` tag corresponds to the latest version of Python 3 supported by the Apify SDK.
-The [Apify API client for Python](/api/client/python) is preinstalled on these images.
+These images are all based on Debian Bullseye.
-- **Python 3 on Alpine Linux** ([`apify/actor-python`](https://hub.docker.com/r/apify/actor-python/)) - a slim image with Python 3 and the [Apify API client for Python](/api/client/python) preinstalled. Comes in multiple versions containing Python 3.7, 3.8, 3.9 or 3.10.
+| Image | Description |
+| ----- | ----------- |
+| Python ([`actor-python`](https://hub.docker.com/r/apify/actor-python)) | Slim and efficient image, containing just the Apify SDK for Python. Headless browsers (Playwright, Selenium) are not available in this image. |
+| Python + Playwright ([`actor-python-playwright`](https://hub.docker.com/r/apify/actor-python-playwright)) | Larger image with the [`playwright`](https://github.com/microsoft/playwright) library and all its browsers bundled. |
+| Python + Selenium + Chrome ([`actor-python-selenium`](https://hub.docker.com/r/apify/actor-python-selenium)) | Larger image with the [`selenium`](https://github.com/seleniumhq/selenium) library, Google Chrome and [ChromeDriver](https://chromedriver.chromium.org/) bundled. |
+## Custom Dockerfile
-## [](#custom-dockerfile)Custom Dockerfile
+Internally, Apify uses Docker to build and run Actors. If you create an Actor from a template, the Actor already contains an optimized Dockerfile for the given use case.
-Internally, Apify uses Docker to build and run Actors. To control the build of the Actor, you can create a custom **Dockerfile** and either reference from the `dockerfile` field in the Actor's config in **.actor/actor.json**, or store it in **.actor/Dockerfile** or **Dockerfile** in its root directory.
These three sites are searched for in this order of preference. If the **Dockerfile** is missing, the system uses the following default: +To control the build of the Actor, you can create a custom **Dockerfile** and either reference from the `dockerfile` field in the Actor's config in **.actor/actor.json**, or store it in **.actor/Dockerfile** or **Dockerfile** in its root directory. These three sites are searched for in this order of preference. If the **Dockerfile** is missing, the system uses the following default: ```dockerfile FROM apify/actor-node:16 diff --git a/sources/platform/actors/development/deployment/source_types.md b/sources/platform/actors/development/deployment/source_types.md index af7e40fe1..ac0072980 100644 --- a/sources/platform/actors/development/deployment/source_types.md +++ b/sources/platform/actors/development/deployment/source_types.md @@ -17,7 +17,7 @@ This option is used by default when your actor's source code is hosted on Apify The only required file is **Dockerfile**, and all other files depend on your Dockerfile settings. By default, Apify's custom NodeJS Dockerfile is used, which requires a **main.js** file containing your source code and a **package.json** file containing package configurations for [NPM](https://www.npmjs.com/). -See [Custom Dockerfile](./source_types.md) and [base Docker images](../actor_definition/dockerfile.md) for more information about creating your own Dockerfile and using Apify's prepared base images. +See [Dockerfile](../actor_definition/dockerfile.md#custom-dockerfile) and [base Docker images](../actor_definition/dockerfile.md#base-docker-images) for more information about creating your own Dockerfile and using Apify's prepared base images. ## [](#git-repository)Git repository @@ -32,7 +32,7 @@ To specify a Git branch or tag to check out, add a URL fragment to the URL. 
For Optionally, the second part of the fragment in the Git URL (separated by a colon) specifies the directory from which the Actor will be built (and where the `.actor`) folder is located. For example, `https://github.com/jancurn/some-actor.git#develop:some/dir` will check out the **develop** branch and set **some/dir** as the root directory of the Actor. -Note that you can easily set up an integration where the Actor is automatically rebuilt on every commit to the Git repository. For more details, see [GitHub integration](./source_types.md). +Note that you can easily set up an integration where the Actor is automatically rebuilt on every commit to the Git repository. For more details, see [GitHub integration](../../../integrations/github.md). ### [](#private-repositories)Private repositories @@ -53,7 +53,7 @@ An example Actor monorepo is shown in the [`apify/actor-monorepo-example`](https ## [](#zip-file)Zip file -The source code for the Actor can also be located in a Zip archive hosted on an external URL. This option enables integration with arbitrary source code or continuous integration systems. Similarly, as with the [Git repository](#git-repository), the source code can consist of multiple files and directories, can contain a custom **Dockerfile**, and the actor description is taken from README.md. If you don't use a [custom Dockerfile](#custom-dockerfile), the root file of your application must be named `main.js`. +The source code for the Actor can also be located in a Zip archive hosted on an external URL. This option enables integration with arbitrary source code or continuous integration systems. Similarly, as with the [Git repository](#git-repository), the source code can consist of multiple files and directories, can contain a custom **Dockerfile**, and the actor description is taken from README.md. If you don't use a [custom Dockerfile](../actor_definition/dockerfile.md#custom-dockerfile), the root file of your application must be named `main.js`. 
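
The Git URL fragment convention described in the Git repository section (`<repo URL>#<branch or tag>:<directory>`) can be illustrated with a short sketch. This is only a reading of the convention as stated in the text, not the platform's actual parser; the function name `parse_git_source` and the returned keys are hypothetical.

```python
from urllib.parse import urldefrag


def parse_git_source(url: str) -> dict:
    """Split a Git URL of the form <repo>#<ref>:<dir> into its parts.

    Hypothetical helper: it mirrors the convention described in the docs,
    where the first part of the fragment is a branch or tag and the
    optional second part (after a colon) is the Actor's root directory.
    """
    repo_url, fragment = urldefrag(url)
    # "develop:some/dir" -> ref "develop", directory "some/dir"
    ref, _, directory = fragment.partition(":")
    return {
        "repo_url": repo_url,
        "ref": ref or None,
        "directory": directory or None,
    }


parts = parse_git_source("https://github.com/jancurn/some-actor.git#develop:some/dir")
print(parts)
```

With the example URL from the text, the sketch yields the **develop** branch and **some/dir** as the directory; a URL without a fragment yields `None` for both.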
## [](#github-gist)GitHub Gist @@ -68,5 +68,5 @@ Then set the **Source Type** to **GitHub Gist** and paste the Gist URL as follow Note that the example Actor is available in the Apify Store as [apify/example-github-gist](https://apify.com/apify/example-github-gist). -Similarly, as with the [Git repository](./source_types.md), the source code can consist of multiple files and directories, it can contain a custom **Dockerfile** and the actor description is taken from README.md. If you don't use a [custom Dockerfile](#custom-dockerfile), the root file of your application must be named `main.js`. +Similarly, as with the [Git repository](#git-repository), the source code can consist of multiple files and directories, it can contain a custom **Dockerfile** and the actor description is taken from README.md. If you don't use a [custom Dockerfile](../actor_definition/dockerfile.md#custom-dockerfile), the root file of your application must be named `main.js`. diff --git a/sources/platform/actors/development/performance.md b/sources/platform/actors/development/performance.md index 7ff331c2d..fb5e5548f 100644 --- a/sources/platform/actors/development/performance.md +++ b/sources/platform/actors/development/performance.md @@ -49,6 +49,5 @@ We first copy the `package.json`, `package-lock.json` , and install the dependen ### Speedup the Actor startup times by using standardised images -If you use one of [Apify's standardized images](https://github.com/apify/apify-actor-docker), the startup time will be faster. This is because the images are cached at each worker machine, and so only the layers you added in your Actor's [Dockefile](./actor_definition/dockerfile.md) need to be pulled. +If you use one of [Apify's standardized images](https://github.com/apify/apify-actor-docker), the startup time will be faster. 
This is because the images are cached at each worker machine, and so only the layers you added in your Actor's [Dockerfile](./actor_definition/dockerfile.md) need to be pulled. - \ No newline at end of file diff --git a/sources/platform/api_v2/api_v2_reference.apib b/sources/platform/api_v2/api_v2_reference.apib index 173b36dd6..b7a5d7517 100644 --- a/sources/platform/api_v2/api_v2_reference.apib +++ b/sources/platform/api_v2/api_v2_reference.apib @@ -10,9 +10,9 @@ All requests and responses (including errors) are encoded in [JSON](http://www.j with a few exceptions that are explicitly described in the reference. To access the API using [Node.js](https://nodejs.org/en/), we recommend the -[`apify-client`](/api/client/js) [NPM package](https://www.npmjs.com/package/apify-client). +[`apify-client`](https://docs.apify.com/api/client/js) [NPM package](https://www.npmjs.com/package/apify-client). To access the API using [Python](https://www.python.org/), we recommend the -[`apify-client`](/api/api/client/python) [PyPI package](https://pypi.org/project/apify-client/). +[`apify-client`](https://docs.apify.com/api/client/python) [PyPI package](https://pypi.org/project/apify-client/). The clients' functions correspond to the API endpoints and have the same parameters. This simplifies development of apps that depend on the Apify platform. **Note:** All requests with JSON payloads need to specify the `Content-Type: application/json` HTTP header! @@ -40,7 +40,7 @@ your API token. **Do not share your API token or password with untrusted parties.** -For more information, see our [integrations](/integrations) documentation. +For more information, see our [integrations](https://docs.apify.com/platform/integrations) documentation. 
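
As a minimal sketch of the note above about JSON payloads, the following builds (but does not send) a request that carries the `Content-Type: application/json` header and passes the token in a header rather than in the URL. The base URL, the `Bearer` scheme, the placeholder payload, and the helper name `build_json_request` are assumptions made for illustration; check the authentication section of the reference for the exact form.

```python
import json
import urllib.request

# Assumed base URL; the reference itself only shows paths such as /v2/webhooks.
API_BASE_URL = "https://api.apify.com/v2"


def build_json_request(path: str, payload: dict, token: str) -> urllib.request.Request:
    """Build a POST request with a JSON body (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE_URL}{path}",
        data=body,
        method="POST",
        headers={
            # Required for every request with a JSON payload.
            "Content-Type": "application/json",
            # Assumption: the token is sent as a Bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


# The payload is a placeholder, not a real webhook definition.
request = build_json_request("/webhooks", {"example": "value"}, "my-token")
print(request.get_method(), request.full_url)
```

Keeping the token in a header means it does not end up in URLs that may be logged or shared.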
## Basic usage @@ -195,7 +195,7 @@ The following table describes the meaning of the response properties: ### Using key -The records in the [key-value store](/storage/key-value-store) +The records in the [key-value store](https://docs.apify.com/platform/storage/key-value-store) are not ordered based on numerical indexes, but rather by their keys in the UTF-8 binary order. Therefore the [Get list of keys](#reference/key-value-stores/key-collection/get-list-of-keys) @@ -370,8 +370,8 @@ and it can be described using the following pseudo-code: If all requests sent by the client implement the above steps, the client will automatically use the maximum available bandwidth for its requests. -Note that the Apify API clients [for JavaScript](/api/client/js) -and [for Python](/api/api/client/python) +Note that the Apify API clients [for JavaScript](https://docs.apify.com/api/client/js) +and [for Python](https://docs.apify.com/api/client/python) use the exponential backoff algorithm transparently, so that you do not need to worry about it. ## Referring to resources @@ -385,7 +385,7 @@ There are three main ways to refer to a resource you're accessing via API. # Group Actors The API endpoints described in this section enable you to manage, build and run Apify actors. -For more information, see the Actor documentation. +For more information, see the Actor documentation. Note that for all the API endpoints that accept the `actorId` parameter to specify an actor, you can pass either the actor ID (e.g. `HG7ML7M8z78YcAPEB`) or a tilde-separated @@ -440,7 +440,7 @@ The HTTP request must have the `Content-Type: application/json` HTTP header! The actor needs to define at least one version of the source code. For more information, see [Version object](#reference/actors/version-object). 
-If you want to make your actor public using `isRequired: true`, you will need to provide the actor's [`title`](/actors/publishing#title) and the `categories` under which that actor will be classified in Apify Store. For this, it's best to use the [constants from our `apify-shared-js` package](https://github.com/apify/apify-shared-js/blob/2d43ebc41ece9ad31cd6525bd523fb86939bf860/packages/consts/src/consts.ts#L452-L471). +If you want to make your actor [public](https://docs.apify.com/platform/actors/publishing) using `isPublic: true`, you will need to provide the actor's `title` and the `categories` under which that actor will be classified in Apify Store. For this, it's best to use the [constants from our `apify-shared-js` package](https://github.com/apify/apify-shared-js/blob/2d43ebc41ece9ad31cd6525bd523fb86939bf860/packages/consts/src/consts.ts#L452-L471). + Request (application/json) @@ -483,7 +483,7 @@ The request needs to specify the `Content-Type: application/json` HTTP header! When providing your API authentication token, we recommend using the request's `Authorization` header, rather than the URL. ([More info](#introduction/authentication)). -If you want to make your actor public using `isRequired: true`, you will need to provide the actor's [`title`](/actors/publishing#title) and the `categories` under which that actor will be classified in Apify Store. For this, it's best to use the [constants from our `apify-shared-js` package](https://github.com/apify/apify-shared-js/blob/2d43ebc41ece9ad31cd6525bd523fb86939bf860/packages/consts/src/consts.ts#L452-L471). +If you want to make your actor [public](https://docs.apify.com/platform/actors/publishing) using `isPublic: true`, you will need to provide the actor's `title` and the `categories` under which that actor will be classified in Apify Store. 
For this, it's best to use the [constants from our `apify-shared-js` package](https://github.com/apify/apify-shared-js/blob/2d43ebc41ece9ad31cd6525bd523fb86939bf860/packages/consts/src/consts.ts#L452-L471). + Request (application/json) @@ -528,9 +528,9 @@ Creates a version of an actor using values specified in a [Version object](#refe The request must specify `versionNumber` and `sourceType` parameters (as strings) in the JSON payload and a `Content-Type: application/json` HTTP header. -Each `sourceType` requires its own additional properties to be passed to the JSON payload object. These are outlined in the [Version object](#reference/actors/version-object) table below and in more detail in the [Apify documentation](/actors/development/actor-definition/source-code). +Each `sourceType` requires its own additional properties to be passed to the JSON payload object. These are outlined in the [Version object](#reference/actors/version-object) table below and in more detail in the [Apify documentation](https://docs.apify.com/platform/actors/development/deployment/source-types). -For example, if an Actor's source code is stored in a [GitHub repository](/actors/development/actor-definition/source-code#git-repository), you will set the `sourceType` to `GIT_REPO` and pass the repository's URL in the `gitRepoUrl` property. +For example, if an Actor's source code is stored in a [GitHub repository](https://docs.apify.com/platform/actors/development/deployment/source-types#git-repository), you will set the `sourceType` to `GIT_REPO` and pass the repository's URL in the `gitRepoUrl` property. ``` { @@ -601,7 +601,7 @@ on its value the Version object has the following additional property: -For more information about source code and actor versions, see [Source code](/actor/source-code) in Actors documentation. 
+For more information about source code and actor versions, see [Source code](https://docs.apify.com/platform/actors/development/actor-definition/source-code) in Actors documentation. + Parameters @@ -913,7 +913,7 @@ array elements. By default, the records are sorted by the `startedAt` field in ascending order, therefore you can use pagination to incrementally fetch all records while new ones are still being created. To sort the records in descending order, use `desc=1` -parameter. You can also filter runs by status ([available statuses](/actors/running#lifecycle)). +parameter. You can also filter runs by status ([available statuses](https://docs.apify.com/platform/actors/running/runs-and-builds#lifecycle)). + Parameters @@ -921,7 +921,7 @@ parameter. You can also filter runs by status ([available statuses](/actors/runn + offset: 10 (number, optional) - Number of array elements that should be skipped at the start. The default value is `0`. + limit: 99 (number, optional) - Maximum number of array elements to return. The default value as well as the maximum is `1000`. + desc: true (boolean, optional) - If `true` or `1` then the objects are sorted by the `startedAt` field in descending order. By default, they are sorted in ascending order. - + status: SUCCEEDED (string, optional) - Return only runs with the provided status ([available statuses](/actors/running#lifecycle)) + + status: SUCCEEDED (string, optional) - Return only runs with the provided status ([available statuses](https://docs.apify.com/platform/actors/running/runs-and-builds#lifecycle)) + Response 200 (application/json) @@ -962,7 +962,7 @@ received in the response JSON to the [Get items](#reference/datasets/item-collec otherwise it will have a transitional status (e.g. `RUNNING`). + webhooks: `dGhpcyBpcyBqdXN0IGV4YW1wbGUK...` (string, optional) - Specifies optional webhooks associated with the actor run, which can be used to receive a notification e.g. when the actor finished or failed. 
The value is a Base64-encoded JSON array of objects defining the webhooks. For more information, see - [Webhooks documenation](/webhooks). + [Webhooks documentation](https://docs.apify.com/platform/integrations/webhooks). + Request @@ -996,7 +996,7 @@ received in the response JSON to the [Get items](#reference/datasets/item-collec + build: `0.1.234` (string, optional) - Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the default run configuration for the actor (typically `latest`). + webhooks: `dGhpcyBpcyBqdXN0IGV4YW1wbGUK...` (string, optional) - Specifies optional webhooks associated with the actor run, which can be used to receive a notification e.g. when the actor finished or failed. The value is a Base64-encoded JSON array of objects defining the webhooks. For more information, see - [Webhooks documenation](/webhooks). + [Webhooks documentation](https://docs.apify.com/platform/integrations/webhooks). ### With input [POST] @@ -1114,7 +1114,7 @@ To run the actor asynchronously, use the [Run actor](#reference/actors/run-colle + build: `0.1.234` (string, optional) - Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the default run configuration for the actor (typically `latest`). + webhooks: `dGhpcyBpcyBqdXN0IGV4YW1wbGUK...` (string, optional) - Specifies optional webhooks associated with the actor run, which can be used to receive a notification e.g. when the actor finished or failed. The value is a Base64-encoded JSON array of objects defining the webhooks. For more information, see - [Webhooks documenation](/webhooks). + [Webhooks documentation](https://docs.apify.com/platform/integrations/webhooks). + format: `json` (string, optional) - Format of the results, possible values are: `json`, `jsonl`, `csv`, `html`, `xlsx`, `xml` and `rss`. The default value is `json`.
+ clean: `false` (boolean, optional) - If `true` or `1` then the API endpoint returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). @@ -1338,13 +1338,13 @@ This is useful if you want to use another actor to finish the work of your current actor run, without the need to create a completely new run and waiting for its finish. For the users of your actors, the metamorph operation is transparent, they will just see your actor got the work done. -There is a limit on how many times you can metamorph a single run. You can check the limit in [the Actor runtime limits](/platform/limits#actor-limits). +There is a limit on how many times you can metamorph a single run. You can check the limit in [the Actor runtime limits](https://docs.apify.com/platform/limits#actor-limits). Internally, the system stops the Docker container corresponding to the actor run and starts a new container using a different Docker image. All the default storages are preserved and the new input is stored under the `INPUT-METAMORPH-1` key in the same default key-value store. -For more information, see the [Actor docs](/actors/development/actor-definition/source-code#metamorph). +For more information, see the [Actor docs](https://docs.apify.com/platform/actors/development/programming-interface/metamorph). + Parameters @@ -1369,7 +1369,7 @@ Only finished runs, i.e. runs with status `FINISHED`, `FAILED`, `ABORTED` and `T Run status will be updated to RUNNING and its container will be restarted with the same storages (the same behaviour as when the run gets migrated to the new server). -For more information, see the [Actor docs](/actor/run#resurrection-of-finished-run). +For more information, see the [Actor docs](https://docs.apify.com/platform/actors/running/runs-and-builds#resurrection-of-finished-run). 
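
Several run endpoints in this reference accept a `webhooks` query parameter whose value is a Base64-encoded JSON array of objects. The sketch below shows one way to produce and check such a value. The fields of the example webhook object (`eventTypes`, `requestUrl`) and the helper names are illustrative assumptions; the Webhooks documentation defines the actual object shape.

```python
import base64
import json
from urllib.parse import urlencode


def encode_webhooks(webhooks: list) -> str:
    """Serialize the webhook definitions to JSON and Base64-encode them."""
    raw = json.dumps(webhooks).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_webhooks(value: str) -> list:
    """Inverse of encode_webhooks, handy for checking what will be sent."""
    return json.loads(base64.b64decode(value).decode("utf-8"))


# Illustrative definition only; the field names are assumptions.
EXAMPLE_WEBHOOKS = [
    {
        "eventTypes": ["ACTOR.RUN.SUCCEEDED"],
        "requestUrl": "https://example.com/run-finished",
    }
]

encoded = encode_webhooks(EXAMPLE_WEBHOOKS)
# Base64 output can contain "+", "/" and "=", so URL-encode the query value.
query = urlencode({"webhooks": encoded})
print(query)
```

URL-encoding the Base64 string matters because an unescaped `+` in a query string is commonly interpreted as a space.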
+ Parameters @@ -1457,7 +1457,7 @@ In order to save new items to the dataset, send HTTP POST request with JSON payl # Group Actor tasks The API endpoints described in this section enable you to manage and run Apify actor tasks. -For more information, see the Actor tasks documentation. +For more information, see the Actor tasks documentation. Note that for all the API endpoints that accept the `actorTaskId` parameter to specify a task, you can pass either the task ID (e.g. `HG7ML7M8z78YcAPEB`) or a tilde-separated @@ -1661,7 +1661,7 @@ array elements. By default, the records are sorted by the `startedAt` field in ascending order; therefore you can use pagination to incrementally fetch all records while new ones are still being created. To sort the records in descending order, use the `desc=1` -parameter. You can also filter runs by status ([available statuses](/actors/running#lifecycle)). +parameter. You can also filter runs by status ([available statuses](https://docs.apify.com/platform/actors/running/runs-and-builds#lifecycle)). + Parameters @@ -1670,7 +1670,7 @@ parameter. You can also filter runs by status ([available statuses](/actors/runn + offset: 10 (number, optional) - Number of array elements that should be skipped at the start. The default value is `0`. + limit: 99 (number, optional) - Maximum number of array elements to return. The default value as well as the maximum is `1000`. + desc: true (boolean, optional) - If `true` or `1` then the objects are sorted by the `startedAt` field in descending order. By default, they are sorted in ascending order. 
- + status: SUCCEEDED (string, optional) - Return only runs with the provided status ([available statuses](/actors/running#lifecycle)) + + status: SUCCEEDED (string, optional) - Return only runs with the provided status ([available statuses](https://docs.apify.com/platform/actors/running/runs-and-builds#lifecycle)) + Response 200 (application/json) @@ -1715,7 +1715,7 @@ received in the response JSON to the [Get items](#reference/datasets/item-collec e.g. when the actor finished or failed. The value is a Base64-encoded JSON array of objects defining the webhooks. **Note**: if you already have a webhook set up for the actor or task, you do not have to add it again here. For more information, see - [Webhooks documenation](/webhooks). + [Webhooks documentation](https://docs.apify.com/platform/integrations/webhooks). + Request @@ -1749,7 +1749,7 @@ received in the response JSON to the [Get items](#reference/datasets/item-collec in the response. By default, it is `OUTPUT`. + webhooks: `dGhpcyBpcyBqdXN0IGV4YW1wbGUK...` (string, optional) - Specifies optional webhooks associated with the actor run, which can be used to receive a notification e.g. when the actor finished or failed. The value is a Base64-encoded JSON array of objects defining the webhooks. For more information, see - [Webhooks documenation](/webhooks). + [Webhooks documentation](https://docs.apify.com/platform/integrations/webhooks). ### Run task synchronously (POST) [POST] @@ -1855,7 +1855,7 @@ To run the Task asynchronously, use the [Run task asynchronously](#reference/act + build: `0.1.234` (string, optional) - Specifies the actor build to run. It can be either a build tag or build number. By default, the run uses the build specified in the task settings (typically `latest`). + webhooks: `dGhpcyBpcyBqdXN0IGV4YW1wbGUK...` (string, optional) - Specifies optional webhooks associated with the actor run, which can be used to receive a notification e.g. when the actor finished or failed.
The value is a Base64-encoded JSON array of objects defining the webhooks. For more information, see - [Webhooks documenation](/webhooks). + [Webhooks documentation](https://docs.apify.com/platform/integrations/webhooks). + format: `json` (string, optional) - Format of the results, possible values are: `json`, `jsonl`, `csv`, `html`, `xlsx`, `xml` and `rss`. The default value is `json`. + clean: `false` (boolean, optional) - If `true` or `1` then the API endpoint returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). @@ -2098,7 +2098,7 @@ array elements. By default, the records are sorted by the `startedAt` field in ascending order. Therefore, you can use pagination to incrementally fetch all records while new ones are still being created. To sort the records in descending order, use `desc=1` -parameter. You can also filter runs by status ([available statuses](/actors/running#lifecycle)). +parameter. You can also filter runs by status ([available statuses](https://docs.apify.com/platform/actors/running/runs-and-builds#lifecycle)). + Parameters @@ -2106,7 +2106,7 @@ parameter. You can also filter runs by status ([available statuses](/actors/runn + offset: 10 (number, optional) - Number of array elements that should be skipped at the start. The default value is `0`. + limit: 99 (number, optional) - Maximum number of array elements to return. The default value (as well as the maximum) is `1000`. + desc: true (boolean, optional) - If `true` or `1` then the objects are sorted by the `startedAt` field in descending order. By default, they are sorted in ascending order.
- + status: SUCCEEDED (string, optional) - Return only runs with the provided status ([available statuses](/actors/running#lifecycle)) + + status: SUCCEEDED (string, optional) - Return only runs with the provided status ([available statuses](https://docs.apify.com/platform/actors/running/runs-and-builds#lifecycle)) + Response 200 (application/json) @@ -2232,7 +2232,7 @@ Internally, the system stops the Docker container corresponding to the actor run and starts a new container using a different Docker image. All the default storages are preserved and the new input is stored under the `INPUT-METAMORPH-1` key in the same default key-value store. -For more information, see the [Actor docs](/actors/development/actor-definition/source-code#metamorph). +For more information, see the [Actor docs](https://docs.apify.com/platform/actors/development/programming-interface/metamorph). + Parameters @@ -2256,7 +2256,7 @@ Only finished runs, i.e. runs with status `FINISHED`, `FAILED`, `ABORTED` and `T Run status will be updated to RUNNING and its container will be restarted with the same storages (the same behaviour as when the run gets migrated to the new server). -For more information, see the [Actor docs](/actor/run#resurrection-of-finished-run). +For more information, see the [Actor docs](https://docs.apify.com/platform/actors/running/runs-and-builds#resurrection-of-finished-run). + Parameters @@ -2378,7 +2378,7 @@ Key-value store is a simple storage for saving and reading data records or files Each data record is represented by a unique key and associated with a MIME content type. Key-value stores are ideal for saving screenshots, actor inputs and outputs, web pages, PDFs or to persist the state of crawlers. -For more information, see the Key-value store documentation. +For more information, see the Key-value store documentation. 
Note that some of the endpoints do not require the authentication token, the calls are authenticated using a hard-to-guess ID of the key-value store. @@ -2424,7 +2424,7 @@ parameter. Creates a key-value store and returns its object. The response is the same object as returned by the [Get store](#reference/key-value-stores/store-object/get-store) endpoint. -Keep in mind that data stored under unnamed store follows [data retention period](/storage#data-retention). +Keep in mind that data stored under unnamed store follows [data retention period](https://docs.apify.com/platform/storage#data-retention). It creates a store with the given name if the parameter name is used. If there is another store with the same name, the endpoint does not create a new one and returns the existing object instead. @@ -2621,7 +2621,7 @@ such as online store products or real estate offers. You can imagine it as a tab where each object is a row and its attributes are columns. Dataset is an append-only storage - you can only add new records to it but you cannot modify or remove existing records. Typically it is used to store crawling results. -For more information, see the Datasets documentation. +For more information, see the Datasets documentation. Note that some of the endpoints do not require the authentication token, the calls are authenticated using the hard-to-guess ID of the dataset. @@ -2661,7 +2661,7 @@ array elements. ### Create dataset [POST] Creates a dataset and returns its object. -Keep in mind that data stored under unnamed dataset follows [data retention period](/storage#data-retention). +Keep in mind that data stored under unnamed dataset follows [data retention period](https://docs.apify.com/platform/storage#data-retention). It creates a dataset with the given name if the parameter name is used. If a dataset with the given name already exists then returns its object. @@ -2993,7 +2993,7 @@ This section describes API endpoints to manage request queues. 
Request queue is a storage for a queue of HTTP URLs to crawl, which is typically used for deep crawling of websites where you start with several URLs and then recursively follow links to other pages. The storage supports both breadth-first and depth-first crawling orders. -For more information, see the Request queue documentation. +For more information, see the Request queue documentation. Note that some of the endpoints do not require the authentication token, the calls are authenticated using the hard-to-guess ID of the queue. @@ -3033,7 +3033,7 @@ array elements. ### Create request queue [POST] Creates a request queue and returns its object. -Keep in mind that requests stored under unnamed queue follows [data retention period](/storage#data-retention). +Keep in mind that requests stored under unnamed queue follows [data retention period](https://docs.apify.com/platform/storage#data-retention). It creates a queue of given name if the parameter name is used. If a queue with the given name already exists then the endpoint returns its object. @@ -3403,7 +3403,7 @@ This section describes API endpoints to manage webhooks. Webhooks provide an easy and reliable way to configure the Apify platform to carry out an action (e.g. a HTTP request to another service) when a certain system event occurs. For example, you can use webhooks to start another actor when an actor run finishes or fails. -For more information see Webhooks documentation. +For more information see Webhooks documentation. ## Webhook collection [/v2/webhooks{?token,limit,offset,desc}] @@ -3622,7 +3622,7 @@ This section describes API endpoints for managing schedules. Schedules are used to automatically start your actors at certain times. Each schedule can be associated with a number of actors and actor tasks. It is also possible to override the settings of each actor (task) similarly to when invoking the actor (task) using the API. -For more information, see Schedules documentation. 
+For more information, see Schedules documentation. Each schedule is assigned actions for it to perform. Actions can be of two types - `RUN_ACTOR` and `RUN_ACTOR_TASK`. For details, see the documentation of the [Get schedule](#reference/schedules/schedule-object/get-schedule) endpoint. From e3f04516546fa47456aa0745324fe98d90585770 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Franti=C5=A1ek=20Nesveda?= Date: Tue, 11 Jul 2023 14:50:57 +0200 Subject: [PATCH 2/2] Fix OS for Docker images with Webkit --- .../actors/development/actor_definition/dockerfile.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dockerfile.md b/sources/platform/actors/development/actor_definition/dockerfile.md index b5eb74389..2ede039fc 100644 --- a/sources/platform/actors/development/actor_definition/dockerfile.md +++ b/sources/platform/actors/development/actor_definition/dockerfile.md @@ -28,8 +28,8 @@ These images come with either Node.js 16, 18 or 20, you can choose which one you | Node.js + Puppeteer + Chrome on Debian ([`actor-node-puppeteer-chrome`](https://hub.docker.com/r/apify/actor-node-puppeteer-chrome/)) | Larger image with the Chromium and Google Chrome browsers and the [`puppeteer`](https://github.com/puppeteer/puppeteer) library bundled. | | Node.js + Playwright + Chrome on Debian ([`actor-node-playwright-chrome`](https://hub.docker.com/r/apify/actor-node-playwright-chrome/)) | Larger image with the Chromium and Google Chrome browsers and the [`playwright`](https://github.com/microsoft/playwright) library bundled. | | Node.js + Playwright + Firefox on Debian ([`actor-node-playwright-firefox`](https://hub.docker.com/r/apify/actor-node-playwright-firefox/)) | Larger image with the Firefox browser and the [`playwright`](https://github.com/microsoft/playwright) library bundled. 
| -| Node.js + Playwright + WebKit on Debian ([`actor-node-playwright-webkit`](https://hub.docker.com/r/apify/actor-node-playwright-webkit/)) | Larger image with the Webkit browser engine and the [`playwright`](https://github.com/microsoft/playwright) library bundled. | -| Node.js + Playwright + all browsers on Debian ([`actor-node-playwright`](https://hub.docker.com/r/apify/actor-node-playwright/)) | A very large and slow image with the [`playwright`](https://github.com/microsoft/playwright) library and all Playwright browsers (Chromium, Chrome, Firefox, WebKit) bundled. | +| Node.js + Playwright + WebKit on Ubuntu ([`actor-node-playwright-webkit`](https://hub.docker.com/r/apify/actor-node-playwright-webkit/)) | Larger image with the Webkit browser engine and the [`playwright`](https://github.com/microsoft/playwright) library bundled. | +| Node.js + Playwright + all browsers on Ubuntu ([`actor-node-playwright`](https://hub.docker.com/r/apify/actor-node-playwright/)) | A very large and slow image with the [`playwright`](https://github.com/microsoft/playwright) library and all Playwright browsers (Chromium, Chrome, Firefox, WebKit) bundled. | You can read more about each of the images in the [Apify SDK Docker image guide](/sdk/js/docs/guides/docker-images).
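
A minimal sketch of the two API conventions touched by the `api_v2_reference.apib` hunks in patch 1/2: the `webhooks` parameter, whose value is a Base64-encoded JSON array of objects, and the run-list query parameters `offset`, `limit`, `desc` and `status`. The helper names `encode_webhooks` and `runs_query` are hypothetical, and the webhook field names `eventTypes` and `requestUrl` are assumptions used only for illustration; check the linked Webhooks documentation for the actual object shape.

```python
import base64
import json
from urllib.parse import urlencode

# The reference text states that 1000 is both the default and the maximum.
MAX_LIMIT = 1000


def encode_webhooks(webhooks):
    """Serialize a list of webhook objects to a Base64-encoded JSON array."""
    raw = json.dumps(webhooks, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def runs_query(offset=0, limit=MAX_LIMIT, desc=False, status=None):
    """Build a query string for paginated run listing (hypothetical helper)."""
    if limit > MAX_LIMIT:
        raise ValueError(f"limit must not exceed {MAX_LIMIT}")
    params = {"offset": offset, "limit": limit}
    if desc:
        # The reference accepts `true` or `1` for descending order.
        params["desc"] = 1
    if status is not None:
        params["status"] = status
    return urlencode(params)


# Assumed webhook shape, for illustration only.
example_webhooks = [
    {"eventTypes": ["ACTOR.RUN.SUCCEEDED"], "requestUrl": "https://example.com/hook"}
]
encoded = encode_webhooks(example_webhooks)
query = runs_query(offset=10, limit=99, desc=True, status="SUCCEEDED")
```

Because standard Base64 output can contain `+`, `/` and `=`, the encoded value would itself need URL-encoding (for example via `urlencode`) before being appended to a request URL.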