openwpm.browser_manager module

class openwpm.browser_manager.BrowserManager(command_queue: Queue, status_queue: Queue, browser_params: BrowserParamsInternal, manager_params: ManagerParamsInternal, crash_recovery: bool)

Bases: Process

The BrowserManager class runs in each new browser process. It listens for commands from the TaskManager and passes them to the command module, which executes them and interfaces with Selenium. The command execution status is sent back to the TaskManager.
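The command and status queue pattern described above can be illustrated with a small, self-contained sketch. This is not OpenWPM's implementation: it swaps the browser process for a standard-library thread so that it runs anywhere, and the names `worker_loop`, `run_commands`, the `"SHUTDOWN"` sentinel, and the `"OK"`/`"FAILED"` status strings are all hypothetical.

```python
import queue
import threading


def worker_loop(command_queue, status_queue, handlers):
    """Execute (name, args) commands until a "SHUTDOWN" sentinel arrives.

    Hypothetical stand-in for the listening loop; one status tuple is
    reported on status_queue for every command received.
    """
    while True:
        name, args = command_queue.get()
        if name == "SHUTDOWN":
            status_queue.put(("OK", name))
            break
        try:
            # Look up and run the handler; any failure is reported, not raised.
            handlers[name](*args)
            status_queue.put(("OK", name))
        except Exception as exc:
            status_queue.put(("FAILED", f"{name}: {exc!r}"))


def run_commands(commands, handlers):
    """Send commands one at a time and collect the status of each."""
    command_queue = queue.Queue()
    status_queue = queue.Queue()
    worker = threading.Thread(
        target=worker_loop, args=(command_queue, status_queue, handlers)
    )
    worker.start()
    statuses = []
    for command in list(commands) + [("SHUTDOWN", ())]:
        command_queue.put(command)
        # Wait for the status of this command before sending the next one.
        statuses.append(status_queue.get(timeout=5))
    worker.join(timeout=5)
    return statuses
```

Waiting for each status before sending the next command keeps the controller and the worker in lockstep, so a failure can be attributed to exactly one command.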

run_impl() -> None

Override this method in subclasses instead of run().

The base implementation calls the target function (if provided), matching the default multiprocess.Process behavior.

class openwpm.browser_manager.BrowserManagerHandle(manager_params: ManagerParamsInternal, browser_params: BrowserParamsInternal)

Bases: object

The BrowserManagerHandle class holds all the configuration and status information for the BrowserManager process it corresponds to. It also provides a set of methods for managing the BrowserManager process and its child processes and threads.

Parameters:
  • manager_params – the TaskManager configuration settings.

  • browser_params – the per-browser parameter settings (e.g. whether this browser is headless).

browser_manager: Process | None

process that controls browser

close_browser_manager(force: bool = False) -> None

Attempt to close the webdriver and browser manager processes from this thread. If the browser manager process is unresponsive, the process is killed.

command_queue: Queue | None

queue for passing command objects to BrowserManager

command_thread: Thread | None

thread to run commands issued from TaskManager

current_timeout: int | None

timeout of the current command

display_pid: int | None

the pid of the Xvfb display (if it exists)

display_port: int | None

the port of the Xvfb display (if it exists)

execute_command_sequence(task_manager: TaskManager, command_sequence: CommandSequence) -> None

Sends CommandSequence to the BrowserManager one command at a time

geckodriver_pid: int | None

pid for browser instance controlled by BrowserManager

is_fresh: bool

indicates if the BrowserManager is new (to optimize restarts)

kill_browser_manager()

Kill the BrowserManager process and all of its children

launch_browser_manager() -> bool

Sets up the BrowserManager and gets the process id, the browser pid and, if applicable, the screen pid. Loads the associated user profile if necessary.

ready()

Return whether the browser is ready to accept a command.

restart_browser_manager(clear_profile=False)

Kill and restart the two worker processes. clear_profile marks whether we want to wipe the old profile.

restart_required: bool

indicates if the browser should be restarted

set_visit_id(visit_id)

shutdown_browser(during_init: bool, force: bool = False) -> None

Runs the closing tasks for this Browser/BrowserManager

status_queue: Queue | None

queue for receiving command execution status from BrowserManager

openwpm.browser_manager.is_dns_error(command_status: str, error_text: str | None) -> bool

Return True when the failure is a DNS resolution error (NXDOMAIN).

DNS resolution errors are expected when crawling large domain lists (e.g. Tranco top-100k) and don’t indicate a browser or instrumentation failure. Only NXDOMAIN is excluded; DNS timeouts and SERVFAIL intentionally still count.
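As an illustration of the rule above, here is a minimal sketch of such a check. It is not the OpenWPM implementation: the function name `looks_like_nxdomain`, the `"neterror"` status value, and the `"dnsNotFound"` marker (the error code Firefox is assumed to report for an unresolvable host) are all assumptions made for this example.

```python
from typing import Optional

# Assumed values for illustration only; not taken from the OpenWPM source.
NETERROR_STATUS = "neterror"
NXDOMAIN_MARKER = "dnsNotFound"


def looks_like_nxdomain(command_status: str, error_text: Optional[str]) -> bool:
    """Return True only for network errors whose text carries the NXDOMAIN marker.

    Other network failures (timeouts, SERVFAIL-style errors) and commands
    without error text are deliberately not matched, so they still count
    as failures.
    """
    if command_status != NETERROR_STATUS or error_text is None:
        return False
    return NXDOMAIN_MARKER in error_text
```

A crawler could use a check like this to skip NXDOMAIN results when counting consecutive failures, while still treating every other error as a sign of a possible browser problem.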