Scrapy - Web crawler

From Wikizone

Current revision as of 17 August 2022, 11:59

Links

Headless Browser Scraping

Scraping Single-Page Apps, JavaScript Frameworks (React etc.)

Scrapy does not interpret JavaScript, so it will probably not show what your browser renders if the website you are scraping relies heavily on JavaScript (for example, single-page apps). There is no immediate plan to have Scrapy interpret JavaScript or render pages like a browser does. Splash is one solution for rendering JavaScript; there are others, for example Selenium.
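To make the problem concrete: the raw HTML of a single-page app often contains little more than an empty mount point and a script tag, which is all a non-rendering crawler gets to see. The following is a rough heuristic sketch, not part of Scrapy, that flags such responses using only the Python standard library. The names `VisibleTextCounter` and `probably_needs_javascript` and the threshold of 50 characters are assumptions made for illustration.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts visible text characters and <script> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.script_tags = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script/style bodies; count everything else as visible text.
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def probably_needs_javascript(html, min_text_chars=50):
    """Heuristic: scripts are present but almost no text is in the raw HTML.

    The threshold is an arbitrary assumption; tune it for the target site.
    """
    counter = VisibleTextCounter()
    counter.feed(html)
    return counter.script_tags > 0 and counter.text_chars < min_text_chars


if __name__ == "__main__":
    # Hypothetical single-page app shell: an empty mount point plus a bundle.
    spa_shell = (
        '<html><head><title>App</title></head><body>'
        '<div id="root"></div><script src="/bundle.js"></script>'
        '</body></html>'
    )
    print(probably_needs_javascript(spa_shell))
```

If a check like this returns true for a response body, the page most likely has to be rendered in a real browser engine before it can be scraped.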

Possible Solutions

https://github.com/scrapy/scrapy/issues/4484
https://github.com/joelgriffith/navalia
Splash
Puppeteer (Google) - in my opinion the best option; fully customizable
Selenium - I would rate it second: not bad, but not the best
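As an illustration of the Selenium route from the list above, here is a minimal sketch that loads a page in headless Chrome and returns the HTML after the browser has executed its JavaScript. It assumes the third-party `selenium` package and a local Chrome installation. The function name `fetch_rendered_html` and its optional `driver` parameter are hypothetical helpers introduced here, not part of Scrapy or Selenium.

```python
def fetch_rendered_html(url, driver=None):
    """Return the HTML of `url` after the browser has run the page's scripts.

    If no driver is passed in, a headless Chrome driver is created and
    closed again afterwards (assumes `pip install selenium` and Chrome).
    """
    own_driver = driver is None
    if own_driver:
        # Imported lazily so the module loads even without selenium installed.
        from selenium import webdriver

        options = webdriver.ChromeOptions()
        options.add_argument("--headless")
        driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # page_source reflects the DOM at the time of the call; pages that
        # load data asynchronously may need an explicit wait before this.
        return driver.page_source
    finally:
        # Only close drivers created here; callers keep control of their own.
        if own_driver:
            driver.quit()


if __name__ == "__main__":
    # Placeholder URL; replace with the page you actually want to scrape.
    html = fetch_rendered_html("https://example.com")
    print(len(html))
```

The returned string can then be parsed with Scrapy's own selectors, for example `scrapy.Selector(text=html)`, so the extraction code stays the same as for static pages.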