Commits · master · Platform / Development / swh-web-client

Feb 17, 2025

Apply swh-py-template v0.3.3 with copier · 4a1b0da2

Antoine Lambert authored 1 month ago

Bump development tools: mypy, codespell, isort, ...

Move all tools configuration in pyproject.toml.

Remove no longer needed mypy overrides.

4a1b0da2

Sep 09, 2024

rate-limit: avoid a crash in some rare occasion · abb995db

Pierre-Yves David authored 6 months ago

If the _RateLimitEvent instances pushed on the heap are similar enough,
their info will be compared, and we need it to be comparable; otherwise
we get a traceback.
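
The failure mode described here is the usual `heapq` tie-breaking issue:
when two entries compare equal on their leading fields, Python falls back
to comparing the remaining ones. A minimal sketch of one common way to
avoid this, assuming a hypothetical `Event` class (the field names are
illustrative and are not taken from the real `_RateLimitEvent`):

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Any

_counter = itertools.count()  # monotonically increasing tie-breaker


@dataclass(order=True)
class Event:
    """Hypothetical stand-in for a rate-limit event stored on a heap."""

    date: float
    # The sequence number breaks ties between events with equal dates.
    seq: int = field(default_factory=lambda: next(_counter))
    # Excluded from ordering, so it never needs to be comparable.
    info: Any = field(default=None, compare=False)


heap: list = []
heapq.heappush(heap, Event(1.0, info={"remaining": 10}))
heapq.heappush(heap, Event(1.0, info={"remaining": 9}))  # same date, no TypeError
first = heapq.heappop(heap)  # the event pushed first wins the tie
```

Without the tie-breaker, two entries with the same date would end up
comparing their `info` dictionaries, which do not support `<`.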

abb995db

Aug 30, 2024
- client: Fix typo spotted after codespell upgrade · 13d7a001
  Antoine Lambert authored 6 months ago
  
  13d7a001
Aug 27, 2024
- Apply swh-py-template v0.2.3 · 2e68b7e9
  David Douard authored 6 months ago
  
  2e68b7e9
May 15, 2024

concurrent-queries: enable the feature by default · 93914d20

Pierre-Yves David authored 10 months ago

This seems a useful feature.

93914d20

concurrent-queries: add the option to issue some request concurrently · 60abb446

Pierre-Yves David authored 10 months ago

Some large calls might be sliced into multiple HTTP requests. Since we
know in advance that we will send all these requests, issuing them in
parallel will run faster.

The new rate limiting machinery will ensure that we do not issue
requests faster than the server intends us to.

The "known" method uses such chunking and uses this new approach when
the parameter is set.

The default concurrency of 20 has been picked arbitrarily; it seems
large enough to provide a significant speedup and small enough not to
hammer the server too much.
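
The idea of issuing pre-sliced requests in parallel can be sketched with
a thread pool. This is only an illustration: `query_chunks` and
`fake_fetch` are hypothetical names, and `fake_fetch` stands in for one
HTTP request; only the concurrency value of 20 comes from the message
above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

MAX_CONCURRENCY = 20  # value quoted in the commit message


def query_chunks(
    chunks: Sequence[List[str]],
    fetch: Callable[[List[str]], Dict[str, bool]],
    max_workers: int = MAX_CONCURRENCY,
) -> Dict[str, bool]:
    """Run `fetch` on every chunk in parallel and merge the results."""
    merged: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in chunk order, whatever the completion order.
        for partial in pool.map(fetch, chunks):
            merged.update(partial)
    return merged


def fake_fetch(ids: List[str]) -> Dict[str, bool]:
    """Hypothetical stand-in for one HTTP request."""
    return {i: i.endswith("0") for i in ids}


result = query_chunks([["a0", "a1"], ["b0"], ["c1"]], fake_fetch)
```

In a real client each `fetch` call would also have to go through the
rate limiter, so that parallelism does not exceed the server's budget.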

60abb446

Apr 18, 2024

rate-limit: grant an initial portion of "free token" · 07c872e0

Pierre-Yves David authored 1 year ago

To avoid slowing down clients that only need to do a few requests, we
grant them a small percentage of free initial requests and rate limit
the remaining ones.

See inline documentation for details.
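
A rough sketch of the "free token" idea, under stated assumptions: the
10% share, the function names and the budget numbers are all
hypothetical, and the real computation lives in the inline
documentation mentioned above.

```python
import threading

FREE_PERCENT = 10  # hypothetical share of the budget granted up front


def initial_tokens(remaining: int, free_percent: int = FREE_PERCENT) -> int:
    """Number of requests allowed to go out immediately."""
    return remaining * free_percent // 100


def paced_interval(
    remaining: int, seconds_to_reset: float, free_percent: int = FREE_PERCENT
) -> float:
    """Delay between tokens for the part of the budget that is paced."""
    paced = remaining - initial_tokens(remaining, free_percent)
    return seconds_to_reset / paced if paced > 0 else seconds_to_reset


# With a budget of 120 requests, the semaphore starts with 12 free tokens;
# the other 108 are spread evenly over the time left before the reset.
tokens = threading.Semaphore(initial_tokens(120))
interval = paced_interval(120, 3600.0)
```

A client that only needs a handful of requests never blocks, while a
long-running one is paced over the rest of the rate limit window.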

07c872e0

client: more accurate and thread safe rate limiting · 6d78a444

Pierre-Yves David authored 1 year ago

This commit reworks the rate limiting logic to use a more accurate and
thread-compatible method.

In short, we now have a daemon thread that processes rate limit
information from completed requests and slowly issues
"available-request" tokens to a `threading.Semaphore` instance at the
appropriate rate. These "available-request" tokens are consumed by
requests, effectively reducing the rate of requests.

See inline documentation for more details.

Note that the new rate limiting implementation is no longer
"progressive". The previous implementation added no delay between
requests initially and gradually increased the delay between requests
as the rate limit budget got low. However, this implementation had
multiple issues:

- The rate limit was enforced through an explicit delay before each
  request, slowing down operation regardless of the actual time between
  requests.

- The approach was oblivious to threading, so having multiple threads
  issue requests in parallel would increase the pace as much. To
  counterbalance this, the "progressive" curve of the delay was "heavy
  handed", adding exponentially more delay than necessary to
  counterbalance the threading effect.

The new approach behaves much better in both regards:

- Requests are not delayed unless there is no "available-request"
  token. If some tokens have been accumulated, a request can proceed
  without any delay. If no token is available, the
  `Semaphore.acquire()` call will simply block until a "token" is
  generated. Since tokens are generated in the background at a stable
  rate, any time spent in other code between requests will not combine
  with the rate limit delay.

- The logic is fully compatible with threads. Tokens are produced at a
  stable pace regardless of the number of threads using a WebAPIClient.
  In the same way, each token can only be consumed by a single thread,
  so requests will be issued at the intended rate regardless of the
  number of threads trying to issue them.
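
The mechanism described above can be sketched as follows. This is a
simplified illustration, not the actual implementation: `TokenIssuer` is
a hypothetical name, and it releases tokens at a fixed interval, whereas
the commit derives the rate from the rate limit information of
completed requests.

```python
import threading
import time


class TokenIssuer:
    """Hypothetical sketch: release one token every `interval` seconds."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self.tokens = threading.Semaphore(0)
        # A daemon thread does not keep the interpreter alive at exit.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            time.sleep(self.interval)
            self.tokens.release()  # one more request may proceed

    def wait_for_token(self) -> None:
        """Block until a token is available, then consume it."""
        self.tokens.acquire()


issuer = TokenIssuer(interval=0.01)
start = time.monotonic()
for _ in range(3):
    issuer.wait_for_token()  # each request consumes exactly one token
elapsed = time.monotonic() - start
```

Because the semaphore is shared, any number of threads can call
`wait_for_token()` and the overall request rate stays the same.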

Regardless of the advantages of the new method, the lost ability to
initially issue requests at a faster rate is still something useful. We
will re-introduce a solution for that in the next commit.

6d78a444

Mar 29, 2024
- Apply swh-py-template v0.2.0 · acfa29e3
  David Douard authored 11 months ago
  
  acfa29e3
Feb 14, 2024
- pytest: Remove use of --import-mode=importlib option · 6d1eb989
  Antoine Lambert authored 1 year ago
  
  It makes the tests fail when executed in a local venv outside tox.
  6d1eb989
Feb 05, 2024
- tox: Bump mypy to 1.8.0 · e09ad26d
  Antoine Lambert authored 1 year ago
  
  Related to swh/meta#5075.
  e09ad26d
Jan 15, 2024
- Update code for coding conventions · cac419e2
  charly reux authored 1 year ago
  
  v0.8.0
  
  cac419e2
Jan 12, 2024
- Fixed conflicts with latest merge · 699de30a
  charly reux authored 1 year ago
  
  699de30a
Jan 09, 2024

WebAPIClient: add some debug logging · 0ea158fe

Pierre-Yves David authored 1 year ago

We add some basic debug output to help monitor the HTTP exchange, rate
limit information and associated request delay.

This comes from the SWH scanner Client too.

0ea158fe

WebAPIClient: slow down request according to rate limit · 182a2bb9

Pierre-Yves David authored 1 year ago

This is mostly a port of the basic logic that exists in the SWH scanner
Client.

We can improve and adjust that logic in the future, but this is out of scope for
this series.

182a2bb9

WebAPIClient: gather rate limit information · 5770837f

Pierre-Yves David authored 1 year ago

We gather rate limiting information from the response headers and keep
the most useful parts. We do not do anything with it yet, but this will
come soon.
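
As an illustration of what gathering this information can look like,
here is a small sketch. The `X-RateLimit-*` header names are an
assumption (they are a common convention and are not quoted in this
commit message), and `parse_rate_limit` is a hypothetical helper.

```python
from typing import Dict, Mapping, Optional

# Assumed header names; adjust to what the server actually sends.
HEADER_NAMES = {
    "limit": "X-RateLimit-Limit",
    "remaining": "X-RateLimit-Remaining",
    "reset": "X-RateLimit-Reset",
}


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[Dict[str, int]]:
    """Extract the rate-limit fields, or return None if any is unusable."""
    try:
        return {key: int(headers[name]) for key, name in HEADER_NAMES.items()}
    except (KeyError, ValueError):
        # Missing or non-numeric header: no usable rate limit information.
        return None


info = parse_rate_limit(
    {
        "X-RateLimit-Limit": "120",
        "X-RateLimit-Remaining": "119",
        "X-RateLimit-Reset": "1700000000",
    }
)
```

With a plain `dict` the lookup is case-sensitive; the headers mapping
exposed by `requests` responses is case-insensitive.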

5770837f

WebAPIClient: create a session to optimize request · 5dca0b14

Pierre-Yves David authored 1 year ago

The `requests` package offers a simple way to reuse connections across
multiple HTTP requests, so let's use it for free.

5dca0b14

WebAPIClient: retry on failed request · ba8a9190

Pierre-Yves David authored 1 year ago

Detect bad 429 replies and retry on them. This is useful to avoid
aborting in the middle of a large series of requests.

This is also a behavior imported from the SWH scanner version of the
Client.
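
A minimal sketch of such a retry loop, assuming a hypothetical `fetch`
callable that returns a `(status, body)` pair. The retry count and
delay are illustrative values, not the ones used by the client.

```python
import time
from typing import Callable, Tuple

Reply = Tuple[int, str]


def get_with_retry(
    fetch: Callable[[], Reply], max_retries: int = 3, delay: float = 0.01
) -> Reply:
    """Call `fetch` again while the server answers 429 (Too Many Requests)."""
    reply = fetch()
    for _ in range(max_retries):
        if reply[0] != 429:
            break
        time.sleep(delay)  # give the server some room before retrying
        reply = fetch()
    return reply


# Simulated server: two 429 replies, then a success.
replies = iter([(429, ""), (429, ""), (200, "ok")])
result = get_with_retry(lambda: next(replies))
```

If every attempt returns 429, the last reply is handed back to the
caller, who can then decide whether to abort.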

ba8a9190

known: comply with maximum request size · 957a00b1

Pierre-Yves David authored 1 year ago

The maximum number of swhids included in a single `known` API call is
limited, so we introduce a way to automatically slice larger calls into
smaller requests.

We also make sure the constant is publicly available to help clients
adjust their strategy.

Such automatic slicing was first introduced in the SWH Scanner version
of the Web API Client. It is both useful and required for feature
parity.

We take this as an opportunity to automate part of the tests for the
`known` query, doing larger queries based on a common set of generated
ids.
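
The automatic slicing can be sketched as a small generator. The names
and the limit value are assumptions for illustration: `MAX_IDS_PER_CALL`
is hypothetical and is not the public constant mentioned above.

```python
from itertools import islice
from typing import Iterable, Iterator, List

MAX_IDS_PER_CALL = 1000  # hypothetical limit, for illustration only


def chunked(ids: Iterable[str], size: int = MAX_IDS_PER_CALL) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` ids."""
    iterator = iter(ids)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Seven generated ids sliced into requests of at most three ids each.
chunks = list(chunked((f"id{i}" for i in range(7)), size=3))
```

Each chunk would then be sent as its own `known` request and the
partial answers merged into a single result.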

957a00b1

WebAPIClient: add a `get_origin` method · a9b96fc3

Pierre-Yves David authored 1 year ago

This method is weird and will probably not survive long as is. However,
it is copied from the SWH scanner version of the web client. Since
having two versions of the web API client seems silly, I am adding the
missing pieces (whatever value these pieces have) to the more generic
version.

With this addition the scanner is now ready to switch to the
`swh.web.client` version of the web client.

Further work is needed to add parallel requests and rate limiting
compliance to this code. However, this work will not affect the public
API of the object.

a9b96fc3

typing: set known input at Iterable · 8279a8df

Pierre-Yves David authored 1 year ago

The previous value was Iterator, which is much more restrictive and
prevents passing a list as an argument. Since lists are useful, we
update the function signature.
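
The difference between the two annotations can be checked directly. In
this sketch `count_ids` is a hypothetical function and the id strings
are placeholders:

```python
from typing import Iterable, Iterator, List


def count_ids(swhids: Iterable[str]) -> int:
    """Accept any iterable: list, tuple, set, generator, iterator."""
    return sum(1 for _ in swhids)


ids: List[str] = ["id-a", "id-b", "id-c"]  # placeholder values

list_is_iterable = isinstance(ids, Iterable)  # a list can be iterated over
list_is_iterator = isinstance(ids, Iterator)  # but it has no __next__ method
from_list = count_ids(ids)
from_iterator = count_ids(iter(ids))  # iterators are iterables too
```

Annotating the parameter as `Iterable` therefore accepts everything
`Iterator` did, plus lists and other containers.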

8279a8df

Dec 06, 2023
- fix trailing whitespace · 005bfa5c
  charly reux authored 1 year ago
  
  005bfa5c
- Fix fetch_url formatting in API_DATA_STATIC · 6be87da3
  charly reux authored 1 year ago
  
  6be87da3
- Added optional email in cooking_request · 034840a5
  charly reux authored 1 year ago
  
  034840a5
- Refactor for flake · 5d459baf
  charly reux authored 1 year ago
  
  5d459baf
- Refactor for flake · 9faede9a
  charly reux authored 1 year ago
  
  9faede9a
- Refactor for flake · a47de590
  charly reux authored 1 year ago
  
  a47de590
- Refactor for flake · 32175149
  charly reux authored 1 year ago
  
  32175149
- Refactor for flake · 39d950a8
  charly reux authored 1 year ago
  
  39d950a8
- Add cooking request, cooking check, and cooking fetch tests · 9fbc842f
  charly reux authored 1 year ago
  
  9fbc842f
- Add static mock API data for post and get requests · c923317f
  charly reux authored 1 year ago
  
  c923317f
- Add cooking_request, cooking_check, and cooking_fetch methods to WebAPIClient · 80332517
  charly reux authored 1 year ago
  
  80332517
- Add .venv/ to .gitignore · a3644eaa
  charly reux authored 1 year ago
  
  a3644eaa
- Add mock for GET requests · ace439f2
  charly reux authored 1 year ago
  
  ace439f2
Dec 04, 2023
- python: Fix isort formatting · 0d673a21
  David Douard authored 1 year ago
  
  0d673a21
Dec 03, 2023
- Apply swh-py-template 0.1.6 · 3ab3654d
  David Douard authored 1 year ago
  
  3ab3654d
Nov 29, 2023
- Migrate to copier-bases swh-py-template · 7391edc0
  David Douard authored 1 year ago
  
  7391edc0
Nov 24, 2023
- Fix WebAPICLient.iter() method · 19bef61f
  David Douard authored 2 years ago
  
  And add a simple test for it.
  19bef61f
Jun 07, 2023
- cli: Actually fix the second command · 40ae50f9
  vlorentz authored 1 year ago
  
  v0.6.0
  
  40ae50f9
- cli: Fix submit-request's docstring · 8776529a
  vlorentz authored 1 year ago
  
  \n needs to be escaped, and click rewrapped the line
  8776529a