Commit 201f42f

committed
update docs and info text
1 parent cbee597 commit 201f42f

2 files changed

+26
-5
lines changed

frontend/docs/docs/user-guide/workflow-setup.md

Lines changed: 7 additions & 3 deletions
@@ -78,7 +78,9 @@ Crawl scopes are categorized as a **Page Crawl** or **Site Crawl**:

 ### Page URL(s)

-One or more URLs of the page to crawl. URLs must follow [valid URL syntax](https://www.w3.org/Addressing/URL/url-spec.html). For example, if you're crawling a page that can be accessed on the public internet, your URL should start with `http://` or `https://`.
+One or more URLs of the pages to crawl, visible when using a crawl scope of _Single Page_ or _List of Pages_. URLs will be crawled in the order that they are specified.
+
+URLs must follow [valid URL syntax](https://www.w3.org/Addressing/URL/url-spec.html). For example, if you're crawling a page that can be accessed on the public internet, your URL should start with `http://` or `https://`.

 See [List Of Pages](#list-of-pages) for additional info when providing a list of URLs.

@@ -90,7 +92,7 @@ See [List Of Pages](#list-of-pages) for additional info when providing a list of

 ### Crawl Start URL

-This is the first page that the crawler will visit. _Site Crawl_ scopes are based on this URL.
+This is the first page that the crawler will visit. When using a crawl scope of _In-Page Links_, _Pages in Same Directory_, _Pages on Same Domain_, or _Pages on Same Domain + Subdomains_, this URL is the basis for determining whether a linked URL is within scope and should be crawled.

 ### Include Any Linked Page

@@ -349,7 +351,9 @@ Describe and organize your crawl workflow and the resulting archived items.

 ### Name

-Allows a custom name to be set for the workflow. If no name is set, the workflow's name will be set to the _Crawl Start URL_. For Page List crawls, the workflow's name will be set to the first URL present in the _Crawl URL(s)_ field, with an added `(+x)` where `x` represents the total number of URLs in the list.
+Allows a custom name to be set for the workflow.
+
+If no name is set, the workflow's name will be set to the first page URL specified in _Scope_ (also referred to as the crawl start URL.) For _Single Page_ and _List of Pages_ crawls, the workflow's name will be suffixed by `+ N` where `N` represents the number of page URLs in addition to the crawl start URL.

 ### Description
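
Note on the naming rule documented above: the default name described in the new _Name_ text can be sketched roughly as follows. This is an illustration only, not code from this commit or the repository. The function name `defaultWorkflowName`, the handling of an empty list, and the exact spacing of the `+ N` suffix are assumptions made for the example.

```typescript
// Hypothetical helper: derives a default workflow name from the page URLs
// entered in Scope, following the rule described in the docs change.
function defaultWorkflowName(pageUrls: string[]): string {
  // Ignore blank entries so they do not count toward N.
  const urls = pageUrls.map((u) => u.trim()).filter((u) => u.length > 0);

  // Assumption: with no URLs there is nothing to derive a name from.
  if (urls.length === 0) return "";

  // The first URL is the crawl start URL; the rest contribute to N.
  const [first, ...rest] = urls;

  // Assumption: the suffix is only added when there are additional URLs.
  return rest.length > 0 ? `${first} + ${rest.length}` : first;
}
```

Under these assumptions, a list of three URLs yields the first URL followed by `+ 2`, and a single URL yields that URL with no suffix.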

frontend/src/features/crawl-workflows/workflow-editor.ts

Lines changed: 19 additions & 2 deletions
@@ -2165,6 +2165,20 @@ https://archiveweb.page/images/${"logo.svg"}`}
   };

   private renderJobMetadata() {
+    const link_to_scope = html`<button
+      type="button"
+      class="text-blue-600 hover:text-blue-500"
+      @click=${async () => {
+        this.updateProgressState({ activeTab: "scope" });
+
+        await this.updateComplete;
+
+        void this.scrollToActivePanel();
+      }}
+    >
+      ${msg("Scope")}
+    </button>`;
+
     return html`
       ${inputCol(html`
         <sl-input
@@ -2179,8 +2193,11 @@ https://archiveweb.page/images/${"logo.svg"}`}
         ></sl-input>
       `)}
       ${this.renderHelpTextCol(
-        msg(`Customize this Workflow's name. Workflows are named after
-        the first Crawl URL by default.`),
+        html`${msg(`Customize the name of this workflow.`)}
+        ${msg(
+          html`If omitted, the workflow will be named after the first page URL
+          specified in ${link_to_scope}.`,
+        )} `,
       )}
       ${inputCol(html`
         <sl-textarea
