From 45c9a91c9e4e3eba6d4c26def43687f5f5a0712d Mon Sep 17 00:00:00 2001 From: Henry Wilkinson Date: Wed, 7 Feb 2024 14:33:57 -0500 Subject: [PATCH] Docs: Improve relative links (#1476) ### Changes - Fixes one broken link (["Ansible Playbooks" here](https://docs.browsertrix.cloud/deploy/remote/)) - Formats relative links better to conform with [mkdocs 1.5 link validation improvements](https://www.mkdocs.org/about/release-notes/#expanded-validation-of-links) --- docs/deploy/ansible/digitalocean.md | 2 +- docs/deploy/index.md | 2 +- docs/deploy/remote.md | 2 +- docs/develop/frontend-dev.md | 2 +- docs/develop/index.md | 2 +- docs/index.md | 7 +++---- docs/user-guide/archived-items.md | 2 +- docs/user-guide/collections.md | 2 +- docs/user-guide/crawl-workflows.md | 10 +++++----- docs/user-guide/index.md | 16 ++++++++-------- docs/user-guide/signup.md | 2 +- docs/user-guide/workflow-setup.md | 6 +++--- 12 files changed, 27 insertions(+), 28 deletions(-) diff --git a/docs/deploy/ansible/digitalocean.md b/docs/deploy/ansible/digitalocean.md index 8fab27901c..1ffd987b80 100644 --- a/docs/deploy/ansible/digitalocean.md +++ b/docs/deploy/ansible/digitalocean.md @@ -13,7 +13,7 @@ To run this ansible playbook, you need to: - `doctl` command line client configured (run `doctl auth init`) - Create a [DigitalOcean Spaces](https://docs.digitalocean.com/reference/api/spaces-api/) API Key which will also need to be set in your terminal sessions environment variables, which should be set as `DO_AWS_ACCESS_KEY` and `DO_AWS_SECRET_KEY` - Configure a DNS A Record and CNAME record. -- Have a working python and pip configuration through your OS Package Manager +- Have a working Python and pip configuration through your OS Package Manager #### Install diff --git a/docs/deploy/index.md b/docs/deploy/index.md index dff84c4cc8..dc396b2915 100644 --- a/docs/deploy/index.md +++ b/docs/deploy/index.md @@ -10,4 +10,4 @@ The main requirements for Browsertrix Cloud are: - [Helm 3](https://helm.sh/) (package manager for Kubernetes) -We have prepared a [Local Deployment Guide](./local) which covers several options for testing Browsertrix Cloud locally on a single machine, as well as a [Production (Self-Hosted and Cloud) Deployment](./production) guides to help with setting up Browsertrix Cloud for different production scenarios. +We have prepared a [Local Deployment Guide](local.md) which covers several options for testing Browsertrix Cloud locally on a single machine, as well as a [Production (Self-Hosted and Cloud) Deployment](remote.md) guide to help with setting up Browsertrix Cloud in different production scenarios. diff --git a/docs/deploy/remote.md b/docs/deploy/remote.md index 651072c295..2ec0df5dd8 100644 --- a/docs/deploy/remote.md +++ b/docs/deploy/remote.md @@ -2,7 +2,7 @@ For remote and hosted deployments (both on a single machine or in the cloud), the only requirement is to have a designed domain and (strongly recommended, but not required) second domain for signing web archives. -We are also experimenting with [Ansible playbooks](../deploy/ansible) for cloud deployment setups. +We are also experimenting with [Ansible playbooks](ansible/digitalocean.md) for cloud deployment setups. The production deployments also allow using an external mongodb server, and/or external S3-compatible storage instead of the bundled minio. 
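For readers following the DigitalOcean playbook section touched above, the prerequisites boil down to two environment variables plus an authenticated `doctl`. A minimal sketch, assuming a Bash-like shell: the variable names and `doctl auth init` come from `docs/deploy/ansible/digitalocean.md` itself, while the placeholder values and the `doctl account get` sanity check are illustrative additions, not part of the documented setup.

```bash
# Hypothetical shell setup for the DigitalOcean playbook prerequisites.
# Variable names come from the doc; the values are placeholders for a real
# DigitalOcean Spaces key pair.
export DO_AWS_ACCESS_KEY="<your-spaces-access-key>"
export DO_AWS_SECRET_KEY="<your-spaces-secret-key>"

# Authenticate the doctl CLI (prompts for a DigitalOcean API token)...
doctl auth init

# ...and confirm the credentials work before running the playbook.
doctl account get
```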
diff --git a/docs/develop/frontend-dev.md b/docs/develop/frontend-dev.md index 5bacf20933..b362affa2a 100644 --- a/docs/develop/frontend-dev.md +++ b/docs/develop/frontend-dev.md @@ -8,7 +8,7 @@ Instead of rebuilding the entire frontend image to view your UI changes, you can ### 1. Browsertrix Cloud API backend already in a Kubernetes cluster -The frontend development server requires an existing backend that has been deployed locally or is in production. See [Deploying Browsertrix Cloud](../../deploy/). +The frontend development server requires an existing backend that has been deployed locally or is in production. See [Deploying Browsertrix Cloud](../deploy/index.md). ### 2. Node.js ≥16 and Yarn 1 diff --git a/docs/develop/index.md b/docs/develop/index.md index 47a54c26cf..01107c77c8 100644 --- a/docs/develop/index.md +++ b/docs/develop/index.md @@ -25,6 +25,6 @@ The frontend UI is implemented in TypeScript, using the Lit framework and Shoela The static build of the frontend is bundled with nginx, but the frontend can be deployed locally in dev mode against an existing backend. -See [Running Frontend](./frontend-dev) for more details. +See [Developing the Frontend UI](frontend-dev.md) for more details. diff --git a/docs/index.md b/docs/index.md index 8306f33aee..d6c89633e9 100644 --- a/docs/index.md +++ b/docs/index.md @@ -8,11 +8,10 @@ hide: Welcome to the Browsertrix Cloud official user guide and developer docs. These docs will contain the following sections. -- [Deployment Guide](./deploy) — How to install and deploy Browsertrix Cloud on your local machine, or in the cloud. -- [Developer Docs](./develop) — Information on developing Browsertrix Cloud itself. -- [User Guide](./user-guide) — Instructions and reference for using Browsertrix Cloud. +- [Deployment Guide](deploy/index.md) — How to install and deploy Browsertrix Cloud on your local machine, or in the cloud. +- [Developer Docs](develop/index.md) — Information on developing Browsertrix Cloud itself. +- [User Guide](user-guide/index.md) — Instructions and reference for using Browsertrix Cloud. If you are unfamiliar with Browsertrix Cloud, please check out [our website](https://browsertrix.cloud), or the main repository at [https://github.com/webrecorder/browsertrix-cloud](https://github.com/webrecorder/browsertrix-cloud) Our docs are still under construction. If you find something missing, chances are we haven't gotten around to writing that part yet. If you find typos or something isn't clear or seems incorrect, please open an [issue](https://github.com/webrecorder/browsertrix-cloud/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) and we'll try to make sure that your questions get answered here in the future! - diff --git a/docs/user-guide/archived-items.md b/docs/user-guide/archived-items.md index 483503b5d4..97d93d83e1 100644 --- a/docs/user-guide/archived-items.md +++ b/docs/user-guide/archived-items.md @@ -12,7 +12,7 @@ The status of an archived item depends on its type. Uploads will always have the | Status | Description | | ---- | ---- | -| :bootstrap-check-circle: Complete | The crawl completed according to the workflow's settings. Workflows with [limits](../workflow-setup/#limits) set may stop running before they capture every queued page, but the resulting archived item will still be marked as "Complete". | +| :bootstrap-check-circle: Complete | The crawl completed according to the workflow's settings. 
Workflows with [limits](workflow-setup.md#limits) set may stop running before they capture every queued page, but the resulting archived item will still be marked as "Complete". |
| :bootstrap-dash-circle: Stopped | The crawl workflow was _stopped_ gracefully by a user and data is saved. |
| :bootstrap-x-octagon: Canceled | The crawl workflow was _canceled_ by a user, no data is saved. |
| :bootstrap-exclamation-triangle: Failed | A serious error occurred while crawling, no data is saved.|
diff --git a/docs/user-guide/collections.md b/docs/user-guide/collections.md
index 4178a5b3aa..2b9e163939 100644
--- a/docs/user-guide/collections.md
+++ b/docs/user-guide/collections.md
@@ -11,7 +11,7 @@ Collections are the primary way of organizing and combining archived items into

Crawls and uploads can be added to a collection after creation by selecting _Select Archived Items_ from the collection's actions menu.

-A crawl workflow can also be set to [automatically add any completed archived items to a collection](../workflow-setup/#collection-auto-add) in the workflow's settings.
+A crawl workflow can also be set to [automatically add any completed archived items to a collection](workflow-setup.md#collection-auto-add) in the workflow's settings.

## Sharing Collections

diff --git a/docs/user-guide/crawl-workflows.md b/docs/user-guide/crawl-workflows.md
index 4a148d3906..2b319f9f23 100644
--- a/docs/user-guide/crawl-workflows.md
+++ b/docs/user-guide/crawl-workflows.md
@@ -4,11 +4,11 @@ Crawl Workflows consist of a list of configuration options that instruct the cra

## Creating and Editing Crawl Workflows

-New Crawl Workflows can be created from the Crawling page. A detailed breakdown of available settings can be found [here](../workflow-setup).
+New Crawl Workflows can be created from the Crawling page. A detailed breakdown of available settings can be found [here](workflow-setup.md).

## Status

-Crawl Workflows inherit the [status of the last item they created](../archived-items/#status). When a workflow has been instructed to run it can have have five possible states:
+Crawl Workflows inherit the [status of the last item they created](archived-items.md#status). When a workflow has been instructed to run it can have five possible states:

| Status | Description |
| ---- | ---- |
@@ -25,11 +25,11 @@ Crawl workflows can be run from the actions menu of the workflow in the crawl wo

While crawling, the Watch Crawl page displays a list of queued URLs that will be visited, and streams the current state of the browser windows as they visit pages from the queue.

-Running a crawl workflow that has successfully run previously can be useful to capture content as it changes over time, or to run with an updated [Crawl Scope](../workflow-setup/#scope).
+Running a crawl workflow that has successfully run previously can be useful to capture content as it changes over time, or to run with an updated [Crawl Scope](workflow-setup.md#scope).

### Live Exclusion Editing

-While [exclusions](../workflow-setup/#exclusions) can be set before running a crawl workflow, sometimes while crawling the crawler may find new parts of the site that weren't previously known about and shouldn't be crawled, or get stuck browsing parts of a website that automatically generate URLs known as ["crawler traps"](https://en.wikipedia.org/wiki/Spider_trap).
+While [exclusions](workflow-setup.md#exclusions) can be set before running a crawl workflow, sometimes while crawling the crawler may find new parts of the site that weren't previously known about and shouldn't be crawled, or get stuck browsing parts of a website that automatically generate URLs known as ["crawler traps"](https://en.wikipedia.org/wiki/Spider_trap). If the crawl queue is filled with URLs that should not be crawled, use the _Edit Exclusions_ button on the Watch Crawl page to instruct the crawler what pages should be excluded from the queue. @@ -37,7 +37,7 @@ Exclusions added while crawling are applied to the same exclusion table saved in ### Changing the Amount of Crawler Instances -Like exclusions, the [crawler instance](../workflow-setup/#crawler-instances) scale can also be adjusted while crawling. On the Watch Crawl page, press the _Edit Crawler Instances_ button, and set the desired value. +Like exclusions, the [crawler instance](workflow-setup.md#crawler-instances) scale can also be adjusted while crawling. On the Watch Crawl page, press the _Edit Crawler Instances_ button, and set the desired value. Unlike exclusions, this change will not be applied to future workflow runs. diff --git a/docs/user-guide/index.md b/docs/user-guide/index.md index 2ece94a8b9..3ca87273bf 100644 --- a/docs/user-guide/index.md +++ b/docs/user-guide/index.md @@ -6,8 +6,8 @@ Welcome to the Browsertrix User Guide. This page covers the basics of using Brow To get started crawling with Browsertrix: -1. Create an account and join an organization [as described here](signup). -2. After being redirected to the organization's [overview page](overview), click the _Create New_ button in the top right and select _[Crawl Workflow](crawl-workflows)_ to begin configuring your first crawl! +1. Create an account and join an organization [as described here](signup.md). +2. After being redirected to the organization's [overview page](overview.md), click the _Create New_ button in the top right and select _[Crawl Workflow](crawl-workflows.md)_ to begin configuring your first crawl! 3. For a simple crawl, choose the _Seeded Crawl_ option, and enter a page url in the _Crawl Start URL_ field. By default, the crawler will archive all pages under the starting path. 4. Next, click _Review & Save_, and ensure the _Run on Save_ option is selected. Then click _Save Workflow_. 5. Wait a moment for the crawler to start and watch as it archives the website! @@ -16,12 +16,12 @@ To get started crawling with Browsertrix: After running your first crawl, check out the following to learn more about Browsertrix's features: -- A detailed list of [crawl workflow setup](workflow-setup) options. -- Adding [exclusions](workflow-setup/#exclusions) to limit your crawl's scope and evading crawler traps by [editing exclusion rules while crawling](crawl-workflows/#live-exclusion-editing). -- Best practices for crawling with [browser profiles](browser-profiles) to capture content only available when logged in to a website. -- Managing archived items, including [uploading previously archived content](archived-items/#uploading-web-archives). -- Organizing and combining archived items with [collections](collections) for sharing and export. -- If you're an admin: [Inviting collaborators to your org](org-settings/#members). +- A detailed list of [crawl workflow setup](workflow-setup.md) options. 
+- Adding [exclusions](workflow-setup.md#exclusions) to limit your crawl's scope and evading crawler traps by [editing exclusion rules while crawling](crawl-workflows.md#live-exclusion-editing). +- Best practices for crawling with [browser profiles](browser-profiles.md) to capture content only available when logged in to a website. +- Managing archived items, including [uploading previously archived content](archived-items.md#uploading-web-archives). +- Organizing and combining archived items with [collections](collections.md) for sharing and export. +- If you're an admin: [Inviting collaborators to your org](org-settings.md#members). ### Have more questions? diff --git a/docs/user-guide/signup.md b/docs/user-guide/signup.md index be4dc28636..22c98f3b2b 100644 --- a/docs/user-guide/signup.md +++ b/docs/user-guide/signup.md @@ -2,7 +2,7 @@ ## Invite Link -If you have been sent an [invite](../org-settings/#members), enter a name and password to create a new account. Your account will be added to the organization you were invited to by an organization admin. +If you have been sent an [invite](org-settings.md#members), enter a name and password to create a new account. Your account will be added to the organization you were invited to by an organization admin. ## Open Registration diff --git a/docs/user-guide/workflow-setup.md b/docs/user-guide/workflow-setup.md index be0fc6ef6b..0b51fa1c29 100644 --- a/docs/user-guide/workflow-setup.md +++ b/docs/user-guide/workflow-setup.md @@ -2,7 +2,7 @@ ## Crawl Type -The first step in creating a new [crawl workflow](../crawl-workflows) is to choose what type of crawl you want to run. Crawl types are fixed and cannot be converted or changed later. +The first step in creating a new [crawl workflow](crawl-workflows.md) is to choose what type of crawl you want to run. Crawl types are fixed and cannot be converted or changed later. `URL List`{ .badge-blue } : The crawler visits every URL specified in a list, and optionally every URL linked on those pages. @@ -150,7 +150,7 @@ Waits on the page for a set period of elapsed time after any behaviors have fini ### Browser Profile -Sets the [_Browser Profile_](../browser-profiles) to be used for this crawl. +Sets the [_Browser Profile_](browser-profiles.md) to be used for this crawl. ### Crawler Release Channel @@ -219,4 +219,4 @@ Apply tags to the workflow. Tags applied to the workflow will propagate to every ### Collection Auto-Add -Search for and specify [collections](../collections) that this crawl workflow should automatically add content to as soon as crawling finishes. Canceled and Failed crawls will not be automatically added to collections. +Search for and specify [collections](collections.md) that this crawl workflow should automatically add content to as soon as crawling finishes. Canceled and Failed crawls will not be automatically added to collections.
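To confirm that relative links such as `workflow-setup.md#limits` resolve under the MkDocs 1.5 link validation mentioned in the description, a quick local check along these lines should work. This is a sketch that assumes `mkdocs.yml` sits at the repository root and the docs dependencies (MkDocs plus whatever theme and plugins the site uses) are already installed; adjust paths to match the actual project layout.

```bash
# Fail the build if MkDocs reports missing or unrecognized link targets.
mkdocs build --strict

# Preview the rendered site locally and click through the updated links.
mkdocs serve
```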