Coding environment - IDE Interaction code

The following commands are available in IDE interaction code.

  • input - Global object available to the interaction code. Provided by trigger input or next_stage() calls
    navigate(input.url);
  • navigate - Navigate the browser session to a URL
    url: A URL to navigate to
    navigate([url]);
    navigate(input.url);
    navigate('https://example.com');
  • navigate options
    navigate([url], {wait_until: 'domcontentloaded'}); // waits until DOM content loaded event is fired in the browser
    navigate([url], {referer: [url]}); // adds a referer to the navigation
    navigate([url], {timeout: 45000}); // the number of milliseconds to wait for. Default is 30000 ms
    navigate([url], {header: 'accept: text/html'}); // add headers to the navigation
    navigate([url], {fingerprint: {screen: {width: 400, height: 400}}}); // specify browser width/height
  • parse - Parse the page data
    let page_data = parse();
    collect({ title: page_data.title, price: page_data.price });
  • collect - Adds a line of data to the dataset created by the crawler
    data_line: An object with the fields you want to collect
    validate_fn: Optional function to validate that the line data is valid
    collect(<data_line>[, <validate_fn>]);
    collect({price: data.price});
    collect(line, l => { if (!l) throw new Error('Empty line'); });
  • next_stage - Run the next stage of the crawler with the specified input
    input: Input object to pass to the next browser session
    next_stage({url: 'http://example.com', page: 1});
  • rerun_stage - Run this stage of the crawler again with new input
    input: Input object to pass to the next browser session
    rerun_stage({url: 'http://example.com/other-page'});
  • run_stage - Run a specific stage of the crawler with a new browser session
    stage: Which stage to run (1 is the first stage)
    input: Input object to pass to the next browser session
    run_stage(2, {url: 'http://example.com', page: 1});
  • country - Configure your crawl to run from a specific country
    code: 2-character ISO country code
    country(<code>);
    country('us');
  • wait - Wait for an element to appear on the page
    selector: Element selector
    opt: wait options (see examples)
    wait(<selector>);
    wait('#welcome-splash');
    wait('.search-results .product');
    wait('[href^="/product"]');
    wait(<selector>, {timeout: 5000});
    wait(<selector>, {hidden: true});
  • wait_for_text - Wait for an element on the page to include some text
    selector: Element selector
    text: The text to wait for
    wait_for_text(<selector>, <text>);
    wait_for_text('.location', 'New York');
  • click - Click on an element (will wait for the element to appear before clicking on it)
    selector: Element selector
    click(<selector>);
    click('#show-more');
  • type - Enter text into an input (will wait for the input to appear before typing)
    selector: Element selector
    text: The text to enter
    type(<selector>, <text>);
    type('#location', 'New York');
    type(<selector>, ['Enter']);
    type(<selector>, ['Backspace']);
  • select - Pick a value from a select element
    selector: Element selector
    select(<selector>, <value>);
    select('#country', 'Canada');
  • URL - URL class from the Node.js standard "url" module
    url: URL string
    let u = new URL('https://example.com');
  • location - Object with info about current location. Available fields: href
    href: URL string of the current page
    navigate('https://example.com');
    location.href;
  • tag_response - Save the response data from a browser request
    name: The name of the tagged field
    pattern: The URL pattern to match
    tag_response(<field>, <pattern>);
    tag_response('teams', /\/api\/teams/);
    navigate('https://example.com/sports');
    let teams = parse().teams;
    for (let team of teams) collect(team);
  • response_headers - Returns the response headers of the last page load
    let headers = response_headers();
    console.log('content-type', headers['content-type']);
  • console - Log messages from the interaction code
    console.log(1, 'luminati', [1, 2], {key: 'value'});
  • load_more - Scroll to the bottom of a list to trigger loading more items. Useful for lazy-loaded infinite-scroll sites
    selector: Element selector
    load_more(<selector>);
    load_more('.search-results');
  • scroll_to - Scroll the page so that an element is visible
    scroll_to(<selector>);
    scroll_to('.author-profile');
  • $ - Helper for jQuery-like expressions
    selector: Element selector
    $(<selector>);
    wait($('.store-card'))
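
The collect validator and the input objects passed to next_stage, rerun_stage and run_stage are ordinary JavaScript, so they can be written as small helpers. The sketch below is one possible shape, not part of the IDE itself. The names validateLine and nextPageInput are hypothetical, and it assumes the target site paginates with a "page" query parameter.

```javascript
// Sketch only: plain JavaScript helpers that also run outside the IDE.
// validateLine and nextPageInput are hypothetical names, not IDE built-ins.

// Validator for collect(): throws when a line is empty or lacks a title.
function validateLine(line) {
  if (!line || typeof line !== 'object')
    throw new Error('Empty line');
  if (!line.title)
    throw new Error('Missing title');
}

// Builds the input object for the following page of a paginated listing.
// Assumption: the site uses a "page" query parameter for pagination.
function nextPageInput(currentUrl, currentPage) {
  const u = new URL(currentUrl);
  const page = currentPage + 1;
  u.searchParams.set('page', String(page));
  return {url: u.href, page};
}

// Inside interaction code these might be used roughly as:
//   navigate(input.url);
//   let data = parse();
//   collect({title: data.title, price: data.price}, validateLine);
//   rerun_stage(nextPageInput(location.href, input.page));
```

Keeping the helpers free of IDE globals means they can be checked in plain Node.js before being pasted into the interaction code.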
