- Sep 16, 2019
-
-
Antoine Lambert authored
Turns out all newly listed repositories were filtered out because of that. Consequently, no entries in the listers database and no scheduler loading tasks were created when listing a Phabricator instance. Closes T1999
-
- Sep 12, 2019
-
-
Antoine Lambert authored
-
- Sep 09, 2019
-
-
Antoine R. Dumont authored
-
- Sep 06, 2019
-
-
Antoine R. Dumont authored
Related T1984
-
- Sep 05, 2019
-
-
David Douard authored
it is not used anymore.
-
- Sep 04, 2019
-
-
David Douard authored
Since all the listing tasks accept a URL as first argument (whatever the argument name is), it makes sense to use a single common argument name for it. I've chosen 'url' instead of api_baseurl/forge_url/url. Also remove the now-useless `new_lister()` functions.
-
David Douard authored
This is needed to fix the db-init implementation so that the debian loader (which does use the SQLBase from swh.storage) has its models declared in the MetaData used by the initialize() function.
-
David Douard authored
Since the CGit lister now performs an HTTP query for each git repo listed in the main index, it is significantly slower, so reducing the time between database commits makes sense and won't overload the database. Together with a bit of logging, this makes it easier to follow/debug the progress of a listing.
-
David Douard authored
Add a new register-task-types cli that creates missing task-type entries in the scheduler, as follows:
- only create missing task-types (do not update existing ones), but check that the backend_name field is consistent,
- each SWHTask-based task declared in a module listed in the 'task_modules' plugin registry field is checked and added if needed; tasks whose name starts with an underscore are not added,
- an added task-type has:
  - a 'type' field derived from the task's function name (with underscores replaced with dashes),
  - a description field set to the first line of that function's docstring,
  - default values as provided by swh.lister.cli.DEFAULT_TASK_TYPE (with simple pattern matching to get decent default values for full/incremental tasks),
  - these default values can be overloaded via the 'task_type' plugin registry entry.

For this, we had to rename all task names (e.g. `cran_lister` -> `list_cran`). Comes with some tests.
-
- Sep 03, 2019
-
-
David Douard authored
Listers are declared as plugins via the `swh.workers` entry_point. As such, the registry function is expected to return a dict with the `task_modules` field (as for generic worker plugins), plus:
- `lister`: the lister class,
- `models`: list of SQLAlchemy models used by this lister,
- `init` (optional): hook (callable) used to initialize the lister's state (typically, create/initialize the database for this lister). If not set, the default implementation creates database tables (after optionally having deleted existing ones) according to the models declared in the `models` registry field.

There is no need to explicitly add lister task modules in the main `conftest` module, but any new/extra lister to be tested must be registered (the tested lister module must be properly installed in the test environment).

Also refactor the cli tools a bit:
- add support for the standard --config-file option at the 'lister' group level,
- move the --db-url option to the 'lister' group,
- drop the --lister option of the `swh lister db-init` cli tool: initializing (especially with --drop-tables) the database for a single lister is unreliable, since all tables are created using a single MetaData (in the same namespace).
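A registry function of the kind described above might look like the sketch below. Everything named here is hypothetical: `ExampleLister`, `ExampleModel`, the module path `example_lister.tasks`, the `default_init` stand-in and the call signature of the `init` hook are assumptions made for illustration, and the stand-in classes do not use SQLAlchemy.

```python
class ExampleModel:
    """Stand-in for a SQLAlchemy model (hypothetical)."""
    __tablename__ = "example_repo"


class ExampleLister:
    """Stand-in for a lister class (hypothetical)."""
    def __init__(self, url=None):
        self.url = url


def register():
    """Registry function, to be exposed through the `swh.workers` entry point."""
    return {
        "task_modules": ["example_lister.tasks"],  # assumed module path
        "lister": ExampleLister,
        "models": [ExampleModel],
        # "init": optional callable; omitted here, so the default applies
    }


def default_init(models, drop_tables=False):
    """Stand-in for the default behaviour: report the tables to (re)create."""
    return [model.__tablename__ for model in models]


def initialize(entry, drop_tables=False):
    """Use the plugin's `init` hook if present, else the default (assumed logic)."""
    init = entry.get("init")
    if init is not None:
        return init(drop_tables=drop_tables)
    return default_init(entry["models"], drop_tables=drop_tables)


tables = initialize(register())
```

Keeping `init` optional means most listers only need to declare their models; a custom hook is needed only when initializing the lister requires more than creating tables.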
-
David Douard authored
This is needed by the (refactored) db init mechanism, since the latter uses the main declarative base class (and thus the main MetaData instance) to gather the tables to be created/dropped.
-
David Douard authored
forgot the forge_url -> api_baseurl renaming in there.
-
David Douard authored
forgot some `url_prefix` there.
-
- Sep 02, 2019
-
-
David Douard authored
- use the 'standard' api_baseurl as init argument,
- make it optional, with default to forge.softwareheritage.org,
- use origin_url as id.
-
David Douard authored
and simplify a bit the code of the constructor.
-
David Douard authored
-
David Douard authored
-
David Douard authored
This is required to make lister class instantiation easier and more reliable, especially in the context of cli tools like 'swh lister run', for which we want to be able to specify any lister init argument as an extra parameter of the command.
-
David Douard authored
Simplify the code:
- only inherit from ListerBase,
- implement HTTP queries directly using requests,
- get rid of convoluted code.

Take the origin_url from the git repo's "project" page instead of building it with the 'url_prefix' hack. The lister now makes substantially more requests, since it makes one request per listed git repo, but the provided origin_url should be much more reliable. When several URLs are provided as clonable URLs, choose the http/https one first; otherwise, choose the first one in the list.

Add proper tests for the cgit lister. Also, get rid of the 'time_updated' column in the model.
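The clone URL selection rule described above can be illustrated with a short sketch. The function name `pick_origin_url` and the sample URLs are hypothetical; this is not the lister's actual code.

```python
def pick_origin_url(clone_urls):
    """Prefer an http/https clone URL, else fall back to the first one listed."""
    for url in clone_urls:
        if url.startswith(("http://", "https://")):
            return url
    # No http(s) URL available: take the first entry, if any
    return clone_urls[0] if clone_urls else None


urls = [
    "git://git.example.org/project.git",
    "https://git.example.org/project.git",
    "ssh://git@git.example.org/project.git",
]
origin_url = pick_origin_url(urls)
```

With the sample list, the https URL is selected even though it is not first; a list without any http(s) entry yields its first element.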
-
- Aug 30, 2019
-
-
David Douard authored
get rid of the "smart" flush_packet_db computation.
-
David Douard authored
instead of picking the first one, so this behavior is consistent with ListerHttpTransport's one.
-
David Douard authored
and get rid of the unneeded _build_query_params method.
-
David Douard authored
stick to the existing credentials mechanism provided by ListerHttpTransport.
-
David Douard authored
and fix empty values returned by the latter (empty list instead of empty dict).
-
- Aug 29, 2019
-
-
Antoine R. Dumont authored
-
- Aug 28, 2019
-
-
Antoine R. Dumont authored
-
Antoine R. Dumont authored
-
Antoine R. Dumont authored
Example use case:

    swh lister run --lister gitlab \
        --priority high \
        --policy oneshot \
        --db-url postgresql://postgres@localhost:5432/swh-listers \
        api_baseurl=https://gitlab.ow2.org/api/v4/

Related T1919
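The trailing `api_baseurl=...` argument in the example is a key=value pair forwarded to the lister. One way such extra arguments could be turned into keyword arguments is sketched below; `parse_extra_args` is a hypothetical helper, not the function used by the swh cli.

```python
def parse_extra_args(args):
    """Turn ['key=value', ...] into a dict of keyword arguments."""
    kwargs = {}
    for arg in args:
        # Split on the first '=' only, so values may themselves contain '='
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % arg)
        kwargs[key] = value
    return kwargs


lister_kwargs = parse_extra_args(["api_baseurl=https://gitlab.ow2.org/api/v4/"])
```

Splitting on the first `=` keeps URLs with query strings intact, which matters when the value is an API base URL.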
-
Antoine R. Dumont authored
Prior to this commit, the policy and priority were hard-coded. The default values are now the old hard-coded values. This will make it possible to develop a cli that triggers forge listings with a oneshot policy and a given task priority, so those forges are ingested faster and without the manual intervention we currently need.
-
- Jul 19, 2019
-
-
Archit Agrawal authored
Implement a packagist lister to list the names and metadata url of all the packages. Closes 1776
-
- Jul 18, 2019
-
-
Archit Agrawal authored
Add tests for pypi lister Closes T1890
-
Archit Agrawal authored
There were previously no tests for the listers that use the SimpleLister class (like pypi). Refactored test_lister.py of the lister core to accommodate tests for SimpleLister, leaving the tests for the other listers undisturbed.
-
- Jul 11, 2019
-
-
Stefano Zacchiroli authored
-
- Jul 04, 2019
-
-
Stefano Zacchiroli authored
-
Stefano Zacchiroli authored
-
Stefano Zacchiroli authored
-
- Jun 28, 2019
-
-
Antoine R. Dumont authored
-
Antoine R. Dumont authored
-
Antoine R. Dumont authored
-
Antoine R. Dumont authored
-