Scrapy 1.4 allows remote attackers to cause a denial of service (memory consumption) via large files, because an arbitrary number of files can be read into memory at once. This is especially problematic when the files are then written individually, in a separate thread, to a slow storage resource, as demonstrated by the interaction between dataReceived (in core/downloader/handlers/http11.py) and S3FilesStore.
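The failure mode described above is a producer that buffers complete bodies faster than a slow writer can drain them. The following is a minimal, standard-library sketch of the general countermeasure (a bounded hand-off queue that applies backpressure). It is not Scrapy code and not the project's fix; `store_with_backpressure`, `max_pending`, and `write_delay` are hypothetical names chosen for illustration, and the `time.sleep` call merely stands in for a slow storage backend.

```python
import queue
import threading
import time


def store_with_backpressure(bodies, max_pending=2, write_delay=0.01):
    """Hand bodies to a slow writer thread through a bounded queue.

    Hypothetical helper for illustration only: at most `max_pending`
    bodies wait in the queue, so memory use stays bounded.
    """
    pending = queue.Queue(maxsize=max_pending)  # bound on buffered bodies
    written = []  # sizes of the bodies the writer has stored
    peak = 0      # largest queue depth observed by the producer

    def writer():
        while True:
            body = pending.get()
            if body is None:  # sentinel: no more bodies
                break
            time.sleep(write_delay)  # stand-in for a slow upload
            written.append(len(body))

    thread = threading.Thread(target=writer)
    thread.start()
    for body in bodies:
        pending.put(body)  # blocks while the queue is full
        peak = max(peak, pending.qsize())
    pending.put(None)
    thread.join()
    return written, peak


# Bodies come from a generator, so each one is created only when the
# producer is allowed to enqueue it.
bodies = (bytes(1024) for _ in range(20))
written, peak = store_with_backpressure(bodies)
```

With an unbounded queue the same loop would enqueue all bodies immediately and hold them in memory until the writer caught up. Note that a blocking `put` suits a plain threaded producer; an event-loop based downloader would need a non-blocking equivalent, which this sketch does not attempt to show.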
References
- https://nvd.nist.gov/vuln/detail/CVE-2017-14158
- https://github.com/scrapy/scrapy/issues/482
- http://blog.csdn.net/wangtua/article/details/75228728
- https://github.com/pypa/advisory-database/blob/8b7a4d62a95e8f605e5dfb4e0b4f299e6403dc12/vulns/scrapy/PYSEC-2017-83.yaml
- https://github.com/advisories/GHSA-h7wm-ph43-c39p
- https://github.com/pypa/advisory-database/tree/main/vulns/scrapy/PYSEC-2017-83.yaml