HeadlessBrowserGet
The HeadlessBrowserGet API retrieves the rendered HTML content of a web page using a headless browser. This tool is particularly useful for pages that require JavaScript execution to display dynamic content.
Parameters
url: The URL of the web page to fetch and render using a headless browser.
Type: string
Configuration
Upon initialization, the tool accepts a provider parameter that selects the headless browser backend to use. Currently, only "selenium-chrome" is supported. The tool automatically sets up a Chrome WebDriver in headless mode and returns the full page source after JavaScript has executed.
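To make the "selenium-chrome" behavior concrete, the following is a minimal sketch of how a headless Chrome fetch can be done with Selenium. It is not the tool's actual implementation: the function names make_headless_chrome and fetch_rendered_html and the driver_factory parameter are hypothetical, introduced here only for illustration. Running the default path assumes the selenium package and a local Chrome installation.

```python
from typing import Callable, Optional


def make_headless_chrome():
    """Create a headless Chrome driver (hypothetical helper, not part of gofannon)."""
    # Imported lazily so this module loads even when selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome without a visible window
    return webdriver.Chrome(options=options)


def fetch_rendered_html(url: str, driver_factory: Optional[Callable] = None) -> str:
    """Load `url` in a browser and return the page source after rendering."""
    # Assumption: any object with get(), page_source and quit() can act as the driver.
    driver = (driver_factory or make_headless_chrome)()
    try:
        driver.get(url)            # navigate; the browser runs the page's scripts
        return driver.page_source  # serialized DOM at the moment of the call
    finally:
        driver.quit()              # always release the browser process
```

Calling fetch_rendered_html("https://example.com") with no factory would start a real headless Chrome. Note that driver.get returns once the page has loaded, so content injected by scripts that finish later may need an explicit wait before page_source is read.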
Example Usage
from gofannon.headless_browser.headless_browser_get import HeadlessBrowserGet
# Initialize the tool with the default provider (selenium-chrome)
browser_get = HeadlessBrowserGet()
# Get the rendered HTML content of the page
page_content = browser_get.fn("https://example.com")
print(page_content)
Background
Dynamic web pages often rely on JavaScript to render their content. Traditional HTTP requests (e.g., via requests.get) return only the HTML sent by the server and do not execute JavaScript, so headless browsers are commonly used in such scenarios. HeadlessBrowserGet uses Selenium WebDriver (configured through Chrome options) to fetch pages with JavaScript rendered, so the returned HTML reflects what a browser would display rather than the unrendered source.
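The gap between raw and rendered HTML can be illustrated without a network call. The sketch below uses a hypothetical page (RAW_HTML, invented for this example) whose visible text is written by a script, and parses it with Python's standard html.parser, which, like a plain HTTP client, never runs scripts.

```python
from html.parser import HTMLParser

# Hypothetical page: the <div id="app"> is empty until the script runs.
RAW_HTML = (
    '<html><body><div id="app"></div>'
    '<script>document.getElementById("app").textContent = "Hello";</script>'
    "</body></html>"
)


class AppTextParser(HTMLParser):
    """Collect the text inside <div id="app"> without executing any script."""

    def __init__(self):
        super().__init__()
        self.in_app = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "app":
            self.in_app = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_app = False

    def handle_data(self, data):
        # Script source also arrives here, but only text inside the div is kept.
        if self.in_app:
            self.text += data


parser = AppTextParser()
parser.feed(RAW_HTML)
app_text = parser.text  # empty: the script that would fill the div never ran
```

The word "Hello" exists only inside the script source, so a client that does not execute JavaScript sees an empty div. A headless browser runs the script first, and its page source would contain the filled-in element.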