Introducing Scrapy Cloud with Python 3 support

It’s the end of an era. Python 2 is on its way out with only a few security and bug fixes forthcoming from now until its official retirement in 2020. Given this withdrawal of support and the fact that Python 3 has snazzier features, we are thrilled to announce that Scrapy Cloud now officially supports Python 3.

If you are new to Zyte, Scrapy Cloud is our production platform that allows you to deploy, monitor, and scale your web scraping projects. It pairs with Scrapy, the open-source web scraping framework, and Portia, our open-source visual web scraper.

Scrapy + Scrapy Cloud with Python 3

I’m sure you Scrapy users are breathing a huge sigh of relief! While Scrapy has officially supported Python 3 since May, you can now deploy spiders that use Python 3’s fancy new features to Scrapy Cloud. You’ll have the beloved extended tuple unpacking, function annotations, keyword-only arguments, and much more at your fingertips.
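
For instance, here’s a minimal sketch of that syntax in a spider (a hypothetical example against the books.toscrape.com sandbox, not code from the original post):

import scrapy


class BooksSpider(scrapy.Spider):
    name = 'books'
    start_urls = ['http://books.toscrape.com/']

    def parse(self, response):
        # Extended tuple unpacking (Python 3 only): take the first
        # listed price and keep the rest in a list.
        first, *rest = response.css('p.price_color::text').extract()
        yield self.make_item(first, currency='GBP')

    def make_item(self, price: str, *, currency: str) -> dict:
        # Function annotations and a keyword-only argument;
        # both of these are syntax errors on Python 2.
        return {'price': price, 'currency': currency}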

Fear not if you are a Python 2 developer who can't port your spiders' codebase to Python 3: Scrapy Cloud will continue supporting Python 2. In fact, Python 2 remains the default unless you explicitly set your environment to Python 3.

Deploying your Python 3 spiders

Docker support was one of the new features that came along with the Scrapy Cloud 2.0 release in May. It brings more flexibility to your spiders, allowing you to define the kind of runtime environment (AKA stack) they will be executed in.

This configuration is done in your local project's scrapinghub.yml. There, you have to include a stacks section with scrapy:1.1-py3 as the stack for your Scrapy Cloud project:

projects:
    default: 99999
stacks:
    default: scrapy:1.1-py3

After doing that, you just have to deploy your project using shub:

$ shub deploy

Note: make sure you are using shub 2.3+ by upgrading it:

$ pip install shub --upgrade

And you're all done! The next time you run your spiders on Scrapy Cloud, they will run on Scrapy 1.1 + Python 3.
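
To confirm which interpreter a job actually got, one quick sanity check (a hypothetical throwaway spider, not from the original post) is to log the Python version and read it back in the job's log:

import sys

import scrapy


class VersionCheckSpider(scrapy.Spider):
    # Throwaway spider: logs the interpreter version so you can
    # verify the job is running on Python 3 after deploying.
    name = 'version_check'
    start_urls = ['http://example.com/']

    def parse(self, response):
        self.logger.info('Running on Python %s', sys.version)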

Multi-target deployment file

If you have a multi-target deployment file, you can define a separate stack for each project ID:

projects:
    default:
        id: 55555
        stack: scrapy:1.1
    py3:
        id: 99999
        stack: scrapy:1.1-py3

This allows you to deploy your local project to whichever Scrapy Cloud project you want, using a different stack for each one:

$ shub deploy py3

This deploys your crawler to project 99999 and uses Scrapy 1.1 + Python 3 as the execution environment.

You can find different versions of the Scrapy stack here.

Wrap up

We hope that you’re as excited as we are for this newest upgrade to Python 3. If you have further questions or are interested in learning more about the souped-up Scrapy Cloud, take a look at our Knowledge Base article.

For those new to our platform, Scrapy Cloud has a forever free subscription, so sign up and give us a try.