Use Puppeteer Core for simple scraping on MacOsX

Posted on October 6, 2021

For quick scraping activities, puppeteer-core is simply great and very fast to set up and use.

You need to configure the Chrome path since it requires an already installed version of it:

const puppeteer = require('puppeteer-core')

let launchOptions = { 
  executablePath: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  headless: true
}

const browser = await puppeteer.launch(launchOptions)

Then, the usual Puppeteer functions:

const page = await browser.newPage()
await page.goto(url)

And finally, the scraping part, this example gets all the p tag in the page:

const res = await page.evaluate(() => {
  const ob = {}

  ob.images = [...document.querySelectorAll('p')]
  ob.images = ob.images.map(p => p.innerText)
    
  return ob
})

And of course, the closing session:

await browser.close()

console.log(res)