Aiohttp unification #397
|
This one is buggy.
The runtime change is that you increase potential concurrency in
|
Headers like user-agent and settings like rate limiting and raise_for_status are currently applied inconsistently across scripts. This removes that duplication and simplifies http client usage.
By default, aiohttp keeps track of cookies for a session. There is no need for this, as we're only making API calls.
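For illustration, a shared session factory along these lines might look like the sketch below. It is not the code from this PR: `create_session`, the `USER_AGENT` value and the default limit are hypothetical names and numbers. `TCPConnector(limit_per_host=...)`, `raise_for_status=True` and `DummyCookieJar` are regular aiohttp features.

```python
import asyncio

import aiohttp

# Hypothetical value; the real user-agent string is not shown in this thread.
USER_AGENT = "example-script/0.1"


def create_session(limit_per_host: int = 10) -> aiohttp.ClientSession:
    """Build a session with shared defaults. Call this from inside a coroutine."""
    # Cap simultaneous connections to any single host.
    connector = aiohttp.TCPConnector(limit_per_host=limit_per_host)
    return aiohttp.ClientSession(
        connector=connector,
        headers={"User-Agent": USER_AGENT},
        # Raise ClientResponseError for 4xx/5xx responses.
        raise_for_status=True,
        # DummyCookieJar ignores cookies, which plain API calls do not need.
        cookie_jar=aiohttp.DummyCookieJar(),
    )


async def main() -> None:
    async with create_session(limit_per_host=5) as session:
        # Requests made through `session` now share the settings above.
        print(session.headers["User-Agent"])


if __name__ == "__main__":
    asyncio.run(main())
```

Creating the session inside a running event loop avoids the warnings or errors aiohttp raises when a connector or cookie jar is built without one.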
|
Thanks for the review. I've fixed the missing calls in compress_channel.py and crawl.py and removed the unused MAX_CONCURRENCY.
Only across different domains though, otherwise the same limit is applied in
|
c972458 to b362c23
Thanks for looking that up.
Generally, the sketch of this file is copy/paste from some other project. The limit is here because of GIL churn, or at least that is very, very likely. It is/was faster with it. But because of how the data is spread in this case, with basically everything coming from "raw.githubusercontent.com", there is no difference. In the original source I obviously had different data, and likely a heavier computational cost. This is all pretty light here.
You do apply the limit to
Otherwise, looks good.
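For context, a global concurrency cap of the kind discussed here is often implemented with a semaphore around each task. The sketch below assumes that approach; `gather_limited` is a hypothetical helper, and the thread does not show how the removed MAX_CONCURRENCY was actually used.

```python
import asyncio


async def gather_limited(coros, max_concurrency: int):
    """Await all coroutines, running at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(coro):
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await coro

    # gather() returns results in the same order as the input coroutines.
    return await asyncio.gather(*(run(c) for c in coros))
```

A cap like this is independent of the target host, whereas a per-host connection limit only matters once many requests go to the same domain.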
|
A follow-up can be a simple env variable, e.g.
The MAX_CONNECTIONS_PER_HOST environment variable can now be set to override the per-host connection limit, and scripts can set their own optimized default settings.
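A minimal sketch of how such an override could be resolved is shown below. Only the MAX_CONNECTIONS_PER_HOST name comes from this thread. The helper `connections_per_host`, its default value and the validation are assumptions for illustration, not the merged code.

```python
import os


def connections_per_host(script_default: int = 10) -> int:
    """Return the per-host limit, preferring the environment over the script default."""
    raw = os.environ.get("MAX_CONNECTIONS_PER_HOST")
    if raw is None or not raw.strip():
        # Variable unset or empty: fall back to the script's own default.
        return script_default
    value = int(raw)  # raises ValueError for non-numeric input
    if value < 0:
        raise ValueError("MAX_CONNECTIONS_PER_HOST must be non-negative")
    return value
```

A script could then pass the result to its connector, for example `limit_per_host=connections_per_host(script_default=4)`. In aiohttp, a value of 0 means no per-host limit.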
|
I've added a |
Unify aiohttp client creation
Headers like user-agent and settings like rate limiting and
raise_for_status are currently applied inconsistently across scripts.
This removes that duplication and simplifies http client usage.