
# Docs: Improve relative links (#1476)
### Changes

- Fixes one broken link (["Ansible Playbooks"
here](https://docs.browsertrix.cloud/deploy/remote/))
- Formats relative links better to conform with [mkdocs 1.5 link
validation
improvements](https://www.mkdocs.org/about/release-notes/#expanded-validation-of-links)
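
For context, here is the shape of the link change this commit applies throughout the docs, shown as a minimal before/after sketch assembled from lines in the diff below. Per the MkDocs 1.5 release notes linked above, the validator resolves relative links against Markdown source files, so directory-style URLs are flagged while explicit `.md` paths (plus `#anchors` where needed) can be checked at build time:

```markdown
<!-- Before: directory-style relative links, which MkDocs 1.5 reports as unrecognized -->
See [Running Frontend](./frontend-dev) for more details.
A detailed breakdown of available settings can be found [here](../workflow-setup).

<!-- After: explicit .md source paths that MkDocs 1.5 validates at build time -->
See [Developing the Frontend UI](frontend-dev.md) for more details.
A detailed breakdown of available settings can be found [here](workflow-setup.md).
```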
Shrinks99 authored Feb 7, 2024
1 parent f853fcd commit 45c9a91
Showing 12 changed files with 27 additions and 28 deletions.
2 changes: 1 addition & 1 deletion docs/deploy/ansible/digitalocean.md
@@ -13,7 +13,7 @@ To run this ansible playbook, you need to:
- `doctl` command line client configured (run `doctl auth init`)
- Create a [DigitalOcean Spaces](https://docs.digitalocean.com/reference/api/spaces-api/) API Key, which must also be set in your terminal session's environment variables as `DO_AWS_ACCESS_KEY` and `DO_AWS_SECRET_KEY`
- Configure a DNS A Record and CNAME record.
-- Have a working python and pip configuration through your OS Package Manager
+- Have a working Python and pip configuration through your OS Package Manager

#### Install

2 changes: 1 addition & 1 deletion docs/deploy/index.md
@@ -10,4 +10,4 @@ The main requirements for Browsertrix Cloud are:
- [Helm 3](https://helm.sh/) (package manager for Kubernetes)


-We have prepared a [Local Deployment Guide](./local) which covers several options for testing Browsertrix Cloud locally on a single machine, as well as a [Production (Self-Hosted and Cloud) Deployment](./production) guides to help with setting up Browsertrix Cloud for different production scenarios.
+We have prepared a [Local Deployment Guide](local.md) which covers several options for testing Browsertrix Cloud locally on a single machine, as well as a [Production (Self-Hosted and Cloud) Deployment](remote.md) guide to help with setting up Browsertrix Cloud in different production scenarios.
2 changes: 1 addition & 1 deletion docs/deploy/remote.md
@@ -2,7 +2,7 @@

For remote and hosted deployments, whether on a single machine or in the cloud, the only requirement is a designated domain and (strongly recommended, but not required) a second domain for signing web archives.

-We are also experimenting with [Ansible playbooks](../deploy/ansible) for cloud deployment setups.
+We are also experimenting with [Ansible playbooks](ansible/digitalocean.md) for cloud deployment setups.

Production deployments also allow using an external MongoDB server and/or external S3-compatible storage instead of the bundled MinIO.

2 changes: 1 addition & 1 deletion docs/develop/frontend-dev.md
@@ -8,7 +8,7 @@ Instead of rebuilding the entire frontend image to view your UI changes, you can

### 1. Browsertrix Cloud API backend already in a Kubernetes cluster

-The frontend development server requires an existing backend that has been deployed locally or is in production. See [Deploying Browsertrix Cloud](../../deploy/).
+The frontend development server requires an existing backend that has been deployed locally or is in production. See [Deploying Browsertrix Cloud](../deploy/index.md).

### 2. Node.js ≥16 and Yarn 1

2 changes: 1 addition & 1 deletion docs/develop/index.md
@@ -25,6 +25,6 @@ The frontend UI is implemented in TypeScript, using the Lit framework and Shoela

The static build of the frontend is bundled with nginx, but the frontend can be deployed locally in dev mode against an existing backend.

-See [Running Frontend](./frontend-dev) for more details.
+See [Developing the Frontend UI](frontend-dev.md) for more details.

<!-- *TODO Add additional info here* -->
7 changes: 3 additions & 4 deletions docs/index.md
@@ -8,11 +8,10 @@ hide:

Welcome to the Browsertrix Cloud official user guide and developer docs. These docs will contain the following sections.

-- [Deployment Guide](./deploy) — How to install and deploy Browsertrix Cloud on your local machine, or in the cloud.
-- [Developer Docs](./develop) — Information on developing Browsertrix Cloud itself.
-- [User Guide](./user-guide) — Instructions and reference for using Browsertrix Cloud.
+- [Deployment Guide](deploy/index.md) — How to install and deploy Browsertrix Cloud on your local machine, or in the cloud.
+- [Developer Docs](develop/index.md) — Information on developing Browsertrix Cloud itself.
+- [User Guide](user-guide/index.md) — Instructions and reference for using Browsertrix Cloud.

If you are unfamiliar with Browsertrix Cloud, please check out [our website](https://browsertrix.cloud) or the main repository at [https://github.com/webrecorder/browsertrix-cloud](https://github.com/webrecorder/browsertrix-cloud).

Our docs are still under construction. If you find something missing, chances are we haven't gotten around to writing that part yet. If you find typos or something isn't clear or seems incorrect, please open an [issue](https://github.com/webrecorder/browsertrix-cloud/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) and we'll try to make sure that your questions get answered here in the future!

2 changes: 1 addition & 1 deletion docs/user-guide/archived-items.md
@@ -12,7 +12,7 @@ The status of an archived item depends on its type. Uploads will always have the

| Status | Description |
| ---- | ---- |
| <span class="status-success">:bootstrap-check-circle: Complete</span> | The crawl completed according to the workflow's settings. Workflows with [limits](../workflow-setup/#limits) set may stop running before they capture every queued page, but the resulting archived item will still be marked as "Complete". |
| <span class="status-success">:bootstrap-check-circle: Complete</span> | The crawl completed according to the workflow's settings. Workflows with [limits](workflow-setup.md#limits) set may stop running before they capture every queued page, but the resulting archived item will still be marked as "Complete". |
| <span class="status-warning">:bootstrap-dash-circle: Stopped</span> | The crawl workflow was _stopped_ gracefully by a user and data is saved. |
| <span class="status-danger">:bootstrap-x-octagon: Canceled</span> | The crawl workflow was _canceled_ by a user, no data is saved. |
| <span class="status-danger">:bootstrap-exclamation-triangle: Failed</span> | A serious error occurred while crawling, no data is saved.|
2 changes: 1 addition & 1 deletion docs/user-guide/collections.md
@@ -11,7 +11,7 @@ Collections are the primary way of organizing and combining archived items into

Crawls and uploads can be added to a collection after creation by selecting _Select Archived Items_ from the collection's actions menu.

-A crawl workflow can also be set to [automatically add any completed archived items to a collection](../workflow-setup/#collection-auto-add) in the workflow's settings.
+A crawl workflow can also be set to [automatically add any completed archived items to a collection](workflow-setup.md#collection-auto-add) in the workflow's settings.

## Sharing Collections

10 changes: 5 additions & 5 deletions docs/user-guide/crawl-workflows.md
@@ -4,11 +4,11 @@ Crawl Workflows consist of a list of configuration options that instruct the cra

## Creating and Editing Crawl Workflows

-New Crawl Workflows can be created from the Crawling page. A detailed breakdown of available settings can be found [here](../workflow-setup).
+New Crawl Workflows can be created from the Crawling page. A detailed breakdown of available settings can be found [here](workflow-setup.md).

## Status

-Crawl Workflows inherit the [status of the last item they created](../archived-items/#status). When a workflow has been instructed to run it can have five possible states:
+Crawl Workflows inherit the [status of the last item they created](archived-items.md#status). When a workflow has been instructed to run it can have five possible states:

| Status | Description |
| ---- | ---- |
@@ -25,19 +25,19 @@ Crawl workflows can be run from the actions menu of the workflow in the crawl wo

While crawling, the Watch Crawl page displays a list of queued URLs that will be visited, and streams the current state of the browser windows as they visit pages from the queue.

-Running a crawl workflow that has successfully run previously can be useful to capture content as it changes over time, or to run with an updated [Crawl Scope](../workflow-setup/#scope).
+Running a crawl workflow that has successfully run previously can be useful to capture content as it changes over time, or to run with an updated [Crawl Scope](workflow-setup.md#scope).

### Live Exclusion Editing

-While [exclusions](../workflow-setup/#exclusions) can be set before running a crawl workflow, sometimes while crawling the crawler may find new parts of the site that weren't previously known about and shouldn't be crawled, or get stuck browsing parts of a website that automatically generate URLs known as ["crawler traps"](https://en.wikipedia.org/wiki/Spider_trap).
+While [exclusions](workflow-setup.md#exclusions) can be set before running a crawl workflow, sometimes while crawling the crawler may find new parts of the site that weren't previously known about and shouldn't be crawled, or get stuck browsing parts of a website that automatically generate URLs known as ["crawler traps"](https://en.wikipedia.org/wiki/Spider_trap).

If the crawl queue is filled with URLs that should not be crawled, use the _Edit Exclusions_ button on the Watch Crawl page to instruct the crawler what pages should be excluded from the queue.

Exclusions added while crawling are applied to the same exclusion table saved in the workflow's settings and will be used the next time the crawl workflow is run unless they are manually removed.

### Changing the Amount of Crawler Instances

-Like exclusions, the [crawler instance](../workflow-setup/#crawler-instances) scale can also be adjusted while crawling. On the Watch Crawl page, press the _Edit Crawler Instances_ button, and set the desired value.
+Like exclusions, the [crawler instance](workflow-setup.md#crawler-instances) scale can also be adjusted while crawling. On the Watch Crawl page, press the _Edit Crawler Instances_ button, and set the desired value.

Unlike exclusions, this change will not be applied to future workflow runs.

16 changes: 8 additions & 8 deletions docs/user-guide/index.md
@@ -6,8 +6,8 @@ Welcome to the Browsertrix User Guide. This page covers the basics of using Brow

To get started crawling with Browsertrix:

-1. Create an account and join an organization [as described here](signup).
-2. After being redirected to the organization's [overview page](overview), click the _Create New_ button in the top right and select _[Crawl Workflow](crawl-workflows)_ to begin configuring your first crawl!
+1. Create an account and join an organization [as described here](signup.md).
+2. After being redirected to the organization's [overview page](overview.md), click the _Create New_ button in the top right and select _[Crawl Workflow](crawl-workflows.md)_ to begin configuring your first crawl!
3. For a simple crawl, choose the _Seeded Crawl_ option, and enter a page URL in the _Crawl Start URL_ field. By default, the crawler will archive all pages under the starting path.
4. Next, click _Review & Save_, and ensure the _Run on Save_ option is selected. Then click _Save Workflow_.
5. Wait a moment for the crawler to start and watch as it archives the website!
@@ -16,12 +16,12 @@ To get started crawling with Browsertrix:

After running your first crawl, check out the following to learn more about Browsertrix's features:

-- A detailed list of [crawl workflow setup](workflow-setup) options.
-- Adding [exclusions](workflow-setup/#exclusions) to limit your crawl's scope and evading crawler traps by [editing exclusion rules while crawling](crawl-workflows/#live-exclusion-editing).
-- Best practices for crawling with [browser profiles](browser-profiles) to capture content only available when logged in to a website.
-- Managing archived items, including [uploading previously archived content](archived-items/#uploading-web-archives).
-- Organizing and combining archived items with [collections](collections) for sharing and export.
-- If you're an admin: [Inviting collaborators to your org](org-settings/#members).
+- A detailed list of [crawl workflow setup](workflow-setup.md) options.
+- Adding [exclusions](workflow-setup.md#exclusions) to limit your crawl's scope and evading crawler traps by [editing exclusion rules while crawling](crawl-workflows.md#live-exclusion-editing).
+- Best practices for crawling with [browser profiles](browser-profiles.md) to capture content only available when logged in to a website.
+- Managing archived items, including [uploading previously archived content](archived-items.md#uploading-web-archives).
+- Organizing and combining archived items with [collections](collections.md) for sharing and export.
+- If you're an admin: [Inviting collaborators to your org](org-settings.md#members).


### Have more questions?
2 changes: 1 addition & 1 deletion docs/user-guide/signup.md
@@ -2,7 +2,7 @@

## Invite Link

-If you have been sent an [invite](../org-settings/#members), enter a name and password to create a new account. Your account will be added to the organization you were invited to by an organization admin.
+If you have been sent an [invite](org-settings.md#members), enter a name and password to create a new account. Your account will be added to the organization you were invited to by an organization admin.

## Open Registration

6 changes: 3 additions & 3 deletions docs/user-guide/workflow-setup.md
@@ -2,7 +2,7 @@

## Crawl Type

-The first step in creating a new [crawl workflow](../crawl-workflows) is to choose what type of crawl you want to run. Crawl types are fixed and cannot be converted or changed later.
+The first step in creating a new [crawl workflow](crawl-workflows.md) is to choose what type of crawl you want to run. Crawl types are fixed and cannot be converted or changed later.

`URL List`{ .badge-blue }
: The crawler visits every URL specified in a list, and optionally every URL linked on those pages.
@@ -150,7 +150,7 @@ Waits on the page for a set period of elapsed time after any behaviors have fini

### Browser Profile

-Sets the [_Browser Profile_](../browser-profiles) to be used for this crawl.
+Sets the [_Browser Profile_](browser-profiles.md) to be used for this crawl.

### Crawler Release Channel

@@ -219,4 +219,4 @@ Apply tags to the workflow. Tags applied to the workflow will propagate to every

### Collection Auto-Add

-Search for and specify [collections](../collections) that this crawl workflow should automatically add content to as soon as crawling finishes. Canceled and Failed crawls will not be automatically added to collections.
+Search for and specify [collections](collections.md) that this crawl workflow should automatically add content to as soon as crawling finishes. Canceled and Failed crawls will not be automatically added to collections.
