The Scrapy tutorial part III: How To Scrape Data From Multiple Web Pages

Developed by Pablo Hoffman and Shane Evans, Scrapy is an open-source Python framework built specifically for web data extraction. With Scrapy spiders, you can download HTML, parse and process the data, and save it in CSV, JSON, or XML format.

So far you've learned how to build a Scrapy spider to extract an item from a page. Now you will learn how to scrape multiple items from the same page.

After watching this, you will know:

  • How to iterate over page elements
  • How to extract data from repeating elements

If you haven't yet, we recommend watching part I and part II of our tutorial series first.

FAQs

What is Scrapy?

Scrapy is an open-source Python framework designed for web data extraction.

What file formats can Scrapy save data in?

Scrapy can save data in CSV, JSON, or XML file formats.

What are the key skills covered in this part of the tutorial?

Iterating over page elements and extracting data from repeating elements.

What should you do before starting this tutorial?

Watch parts I and II of the tutorial series for foundational knowledge.