Importing browser user_data_dir generated with Selenium for crawl #846

@msramalho

I've been experimenting with Selenium and SeleniumBase to access auth-walled content, with good results. One of the strategies is to pass user-data-dir/user_data_dir to set the directory where the default Chrome profile is stored, which allows sessions to be reused for quite some time. Some websites need more than just cookies to accept the login (I'm thinking of local storage, and maybe more).
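For context, here is a minimal sketch of the kind of setup described above, assuming Selenium 4 with Chrome. The helper names `chrome_profile_args` and `launch_with_profile` and the `./chrome-profile` path are hypothetical, chosen only for illustration; the one piece taken from this issue is the `user-data-dir` flag itself.

```python
from pathlib import Path


def chrome_profile_args(profile_dir: str) -> list[str]:
    """Build the Chrome flag that stores the profile in profile_dir.

    Hypothetical helper: it only formats the --user-data-dir flag.
    """
    path = Path(profile_dir).expanduser().resolve()
    return [f"--user-data-dir={path}"]


def launch_with_profile(profile_dir: str):
    """Start Chrome through Selenium, reusing the profile in profile_dir."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_profile_args(profile_dir):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    # Assumed location; log in once, then later runs reuse the session.
    driver = launch_with_profile("./chrome-profile")
    driver.get("https://example.com")
    driver.quit()
```

After a first run with a manual login, the directory should hold cookies, local storage and the other profile state that later runs pick up.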

Mainly: as it stands, is there any way to import this same profile into browsertrix-crawler? Is the --profile option in browsertrix compatible with this directory, or are the two interchangeable somehow?

Extra:

  • I would think Chrome (from Selenium) -> Brave profiles may be compatible, but I have not tested it.
  • I'd also be curious whether this could be done at the create-login-profile stage, and whether it is even possible to connect Selenium/SeleniumBase to the VNC server.