Puppeteer - NodeJS Scraping: Difference between revisions
From Wikizone
Revision as of 17 August 2022, 14:47
Quickstart
https://www.youtube.com/watch?v=Sag-Hz9jJNg
Prerequisites: Visual Studio Code and Node.js installed
Create a folder and initialize a Node.js project
Terminal
npm init -y
npm install puppeteer
This also installs Chromium. Take a look in the
Create index.js. Load Puppeteer and wrap the code in an asynchronous function. This function
const puppeteer = require("puppeteer");
(async () => {
}) ();
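The empty wrapper above can be extended so that the browser is closed even when a step throws. A minimal sketch, assuming a hypothetical helper named withBrowser (not part of Puppeteer) that receives the puppeteer module and an async task function:

```javascript
// Hypothetical helper (not part of Puppeteer): launch a browser, run the
// given async task with it, and close the browser even if the task throws.
async function withBrowser(puppeteer, task) {
  const browser = await puppeteer.launch();
  try {
    // Whatever the task resolves to is passed back to the caller.
    return await task(browser);
  } finally {
    // Runs on success and on failure, so no browser process is left behind.
    await browser.close();
  }
}
```

In index.js it could be called as withBrowser(require("puppeteer"), async (browser) => { /* work with browser.newPage() here */ }).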
Example: take a screenshot of a page:
const puppeteer = require("puppeteer");
(async () => {
  // headless: false opens a visible browser window; the default runs without one
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage(); // open a new tab in the browser
  await page.goto("https://schlegel.media");
  await page.screenshot({ path: "screenshot.png" }); // saved relative to the working directory
  await browser.close();
})();
Run it with
node index.js
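By default, the screenshot above captures only the visible viewport. A sketch of a variant that waits for network activity to settle and then captures the whole page. The name captureFullPage is a hypothetical helper, and the choice of waitUntil: "networkidle2" is an assumption that may not suit every site:

```javascript
// Hypothetical helper: navigate to a URL and save a full-page screenshot.
// `page` is a Puppeteer Page as returned by browser.newPage().
async function captureFullPage(page, url, path) {
  // networkidle2: treat navigation as finished once there have been no more
  // than two network connections for at least 500 ms.
  await page.goto(url, { waitUntil: "networkidle2" });
  // fullPage: true captures the full scrollable page, not just the viewport.
  await page.screenshot({ path, fullPage: true });
  return path;
}
```

Inside the async wrapper this might be used as await captureFullPage(page, "https://schlegel.media", "screenshot.png").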
Example scripts
Read a DOM element
const puppeteer = require("puppeteer");
(async () => {
  // headless: false opens a visible browser window; the default runs without one
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage(); // open a new tab in the browser
  await page.goto("https://schlegel.media");
  // The callback runs inside the page, so it can use document directly
  const grabSlogan = await page.evaluate(() => {
    const slogan = document.querySelector(".uk-text-lead");
    // querySelector returns null when nothing matches; avoid a TypeError in that case
    return slogan ? slogan.innerHTML : null;
  });
  console.log(grabSlogan);
  await browser.close();
})();
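The script above reads only the first element that matches the selector. To collect every match, Puppeteer's page.$$eval passes all matching elements to a callback that runs inside the page. A minimal sketch, where grabAllTexts is a hypothetical helper name and reading textContent instead of innerHTML is an assumption about what is wanted:

```javascript
// Hypothetical helper: return the trimmed text of every element matching
// `selector`. `page` is a Puppeteer Page.
async function grabAllTexts(page, selector) {
  // $$eval runs the callback inside the page with an array of all matches
  // and resolves to the callback's (serializable) return value.
  return page.$$eval(selector, (elements) =>
    elements.map((el) => el.textContent.trim())
  );
}
```

Inside the async wrapper this might be used as console.log(await grabAllTexts(page, ".uk-text-lead")).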