Scrapy - Web crawler

From Wikizone
Revision as of 17 August 2022, 11:59, by 134.3.84.225

Links

Headless Browser Scraping

Scraping Single-Page Apps, JavaScript Frameworks (React etc.)

Scrapy does not execute JavaScript, so it will probably not see what your browser renders if the site you are scraping relies heavily on JavaScript (for example, single-page apps). There is no immediate plan to have Scrapy interpret JavaScript or render pages the way a browser does. Splash is one way to render JavaScript; Selenium is another.
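As a rough illustration of the Splash route, the sketch below asks a Splash instance for the rendered HTML of a page through its `render.html` HTTP endpoint, using only the Python standard library. The endpoint address `http://localhost:8050` (a locally running Splash instance), the helper names `splash_render_url` and `fetch_rendered_html`, and the default `wait` value are assumptions made for this example; they are not taken from this page.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance is running locally on port 8050.
SPLASH_ENDPOINT = "http://localhost:8050/render.html"


def splash_render_url(page_url, wait=0.5, endpoint=SPLASH_ENDPOINT):
    """Build the render.html request URL for page_url (hypothetical helper).

    wait is the number of seconds Splash waits after page load,
    giving the page's JavaScript time to run.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{endpoint}?{query}"


def fetch_rendered_html(page_url, wait=0.5, timeout=30):
    """Return the HTML of page_url after Splash has rendered it (hypothetical helper)."""
    with urlopen(splash_render_url(page_url, wait), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds and prints the request URL; no Splash instance is needed for this.
    print(splash_render_url("https://example.com/app", wait=2))
```

The returned HTML can then be parsed like any static page. Inside a Scrapy project, the same idea is usually wired in through a downloader middleware rather than called by hand.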

Possible Solutions

https://github.com/scrapy/scrapy/issues/4484
https://github.com/joelgriffith/navalia
Splash
Google Puppeteer - The best option, fully customizable
Selenium - I would rate it second: not bad, but not the best.