best scrapy practices #390

@haowens

Overview

Scrapy is a Python web scraping framework, but it also provides a good deal of encapsulated asynchronous data processing functionality that can be used independently of actual web scraping. I have now implemented the same data processing both with a Scrapy pipeline and without one, and we want to standardize the role of Scrapy in our data work, so I want to reflect on each implementation option and its strengths and tradeoffs.
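
To make the two options concrete, here is a minimal sketch of the same cleaning step written as a plain function and as a Scrapy-style item pipeline. The names `normalize_record` and `NormalizePipeline` and the sample record are hypothetical and not taken from our codebase. It assumes the long-standing pipeline convention of a plain class exposing `process_item(self, item, spider)`, which is why the sketch runs without importing Scrapy.

```python
# Hypothetical illustration only: function, class, and field names are made up.

def normalize_record(record: dict) -> dict:
    """Plain-Python option: strip surrounding whitespace from string fields."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


class NormalizePipeline:
    """Scrapy option: an item pipeline is a plain class with process_item.

    Assumption: in a real project this class would be enabled through the
    ITEM_PIPELINES setting so that Scrapy calls process_item for each item.
    """

    def process_item(self, item, spider):
        # Reuse the same logic so both options stay comparable.
        return normalize_record(dict(item))


if __name__ == "__main__":
    raw = {"name": "  Ada  ", "visits": 3}
    print(normalize_record(raw))
    print(NormalizePipeline().process_item(raw, spider=None))
```

Keeping the core logic in a plain function and wrapping it in a pipeline class is one possible way to compare the two approaches without duplicating code; whether that split suits our work is part of what the notes should assess.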

Proposal

I plan to spend an afternoon writing up a document of notes comparing the two approaches.
