Parser fails to parse links which are not valid regex #753

@haynesgt

Description

Expected Behavior

Parser.parse("https://ahrefs.com/jobs/clickhouse-c++-developer") should succeed instead of throwing.

Current Behavior

SyntaxError: Invalid regular expression: /^https://ahrefs.com/jobs/clickhouse-c++-developer/i: Nothing to repeat
    at new RegExp (<anonymous>)
    at makeBaseRegex (node_modules/@postlight/parser/dist/mercury.js:7403:10)
    at scoreLinks (node_modules/@postlight/parser/dist/mercury.js:7419:19)

Steps to Reproduce

await (Parser = require("@postlight/parser")).parse("https://ahrefs.com/jobs/clickhouse-c++-developer")

Detailed Description

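Seen in 2.2.3. makeBaseRegex interpolates the URL into the pattern without escaping regex metacharacters, so the "++" in the path is parsed as a quantifier with nothing to repeat: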

return new RegExp(`^${baseUrl}`, 'i');
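The failure can be reproduced without the parser or a network request. This is a minimal sketch that only mirrors the line above; the variable names are illustrative and not taken from the library.

```javascript
// Minimal repro: the URL is used as a regex source with no escaping.
const unescapedUrl = 'https://ahrefs.com/jobs/clickhouse-c++-developer';

let caughtError = null;
try {
  // "c++" is read as "c+" followed by a second "+" with nothing to repeat.
  new RegExp(`^${unescapedUrl}`, 'i');
} catch (err) {
  caughtError = err;
}

console.log(caughtError instanceof SyntaxError);
```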

Possible Solution

function makeBaseRegex(baseUrl) {
  // Escape regex metacharacters so the URL is matched literally.
  var escapedUrl = baseUrl.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escapedUrl}`, 'i');
}
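A quick standalone check of the proposed fix, as a sketch rather than the library's actual code. The escapeRegExp helper name is hypothetical, and the second URL is made up only to show that "+" is now matched literally.

```javascript
// Hypothetical helper: escape every regex metacharacter in a string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Same shape as the proposed makeBaseRegex, built on the helper above.
function makeBaseRegex(baseUrl) {
  return new RegExp(`^${escapeRegExp(baseUrl)}`, 'i');
}

const jobUrl = 'https://ahrefs.com/jobs/clickhouse-c++-developer';
const baseRegex = makeBaseRegex(jobUrl);

// The original URL matches itself.
const matchesSelf = baseRegex.test(jobUrl);
// A URL without the literal "++" does not match.
const matchesOther = baseRegex.test('https://ahrefs.com/jobs/clickhouse-c--developer');

console.log(matchesSelf, matchesOther);
```

With the escape in place, constructing the regex no longer throws, and the prefix match treats the URL as literal text.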
