launchpad: Allow bzr origins listing

Related to swh-loader-bzr#3945 (closed)

Test Plan

tox

and docker is happy too:

Lister log (with a pprint on the collection it lists)

swh-lister_1                        | [2022-02-16 16:33:50,268: INFO/MainProcess] Task swh.lister.launchpad.tasks.FullLaunchpadLister[f3e3f3aa-8f4a-4e2c-8821-facd4952e53e] received
swh-lister_1                        | ('git', <lazr.restfulclient.resource.Collection object at 0x7ff52ebef290>)
swh-lister_1                        | ('bzr', <lazr.restfulclient.resource.Collection object at 0x7ff52d1a9450>)
...
  • In the docker scheduler db, the lister is adding new bzr origins (there were none prior to the run):
17:58:57 swh-scheduler@localhost:5433=# select now(), count(*) from listed_origins where visit_type='bzr';
+-------------------------------+-------+
|              now              | count |
+-------------------------------+-------+
| 2022-02-16 16:59:07.267496+00 | 21000 |
+-------------------------------+-------+
(1 row)

Time: 5.236 ms
17:59:07 swh-scheduler@localhost:5433=# select now(), count(*) from listed_origins where visit_type='bzr';
+-------------------------------+-------+
|              now              | count |
+-------------------------------+-------+
| 2022-02-16 16:59:46.291344+00 | 22000 |
+-------------------------------+-------+
(1 row)

Time: 5.584 ms
18:00:23 swh-scheduler@localhost:5433=# select now(), * from listed_origins where visit_type='bzr' order by last_update desc limit 1;
+-[ RECORD 1 ]-----------+-------------------------------------------------------------------------------+
| now                    | 2022-02-16 17:00:34.779024+00                                                 |
| lister_id              | 9290a3f8-6896-47ea-81b3-e3adc9df21be                                          |
| url                    | https://code.launchpad.net/~ubuntu-branches/ubuntu/karmic/libvncserver/karmic |
| visit_type             | bzr                                                                           |
| extra_loader_arguments | {}                                                                            |
| enabled                | t                                                                             |
| first_seen             | 2022-02-16 16:59:53.309055+00                                                 |
| last_seen              | 2022-02-16 16:59:53.309055+00                                                 |
| last_update            | 2009-06-27 00:56:06.928908+00                                                 |
+------------------------+-------------------------------------------------------------------------------+

Time: 11.362 ms

After an incremental run:

19:59:26 swh-scheduler@localhost:5433=# select now(), count(*) from listed_origins where visit_type='bzr';
+-------------------------------+--------+
|              now              | count  |
+-------------------------------+--------+
| 2022-02-17 08:18:45.201575+00 | 168000 |
+-------------------------------+--------+
(1 row)

Time: 20.536 ms
09:18:45 swh-scheduler@localhost:5433=# select * from listers where name='launchpad';
+-[ RECORD 1 ]--+-----------------------------------------------------------------------------------------------------------------------+
| id            | 9290a3f8-6896-47ea-81b3-e3adc9df21be                                                                                  |
| name          | launchpad                                                                                                             |
| instance_name | launchpad                                                                                                             |
| created       | 2022-02-16 16:24:45.466527+00                                                                                         |
| current_state | {"bzr_date_last_modified": "2009-09-10T10:21:25+00:00", "git_date_last_modified": "2022-02-16T19:07:16.970183+00:00"} |
| updated       | 2022-02-16 21:25:33.628123+00                                                                                         |
+---------------+-----------------------------------------------------------------------------------------------------------------------+

Time: 0.414 ms

Migrated from D7193 (view on Phabricator)

Activity

  logger = logging.getLogger(__name__)

- LaunchpadPageType = Iterator[Collection]
+ VcsType = str
+ LaunchpadPageType = Tuple[VcsType, Collection]


  @dataclass
  class LaunchpadListerState:
      """State of Launchpad lister"""

-     date_last_modified: Optional[datetime] = None
-     """modification date of last updated repository since last listing"""
+     git_date_last_modified: Optional[datetime] = None
+     """modification date of last updated git repository since last listing"""
  • That means I'll either reset the state in the scheduler db or alter the data when deploying this.

  • I think altering the JSON data in the scheduler db should be a good move as we already listed plenty of git repos.

    21:54 $ psql service=swh-scheduler
    psql (12.10 (Debian 12.10-1.pgdg110+1), server 12.9 (Debian 12.9-1.pgdg110+1))
    SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
    Type "help" for help.
    
    softwareheritage-scheduler=> select current_state from listers where name = 'launchpad';
                           current_state                        
    ------------------------------------------------------------
     {"date_last_modified": "2022-02-16T19:32:09.400561+00:00"}
    (1 row)
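
The JSON rewrite discussed in this thread could look roughly like the following. This is only a minimal sketch: it assumes the legacy `date_last_modified` value should carry over to `git_date_last_modified` (only git repositories had been listed so far) and that the bzr date starts unset. `migrate_lister_state` is a hypothetical helper, not part of swh-lister or swh-scheduler.

```python
from typing import Any, Dict


def migrate_lister_state(old_state: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the legacy key to its git-specific equivalent (hypothetical helper)."""
    new_state = dict(old_state)  # copy so the caller's dict is left untouched
    if "date_last_modified" in new_state:
        # Assumption: everything listed so far was git, so the old date is the git date.
        new_state["git_date_last_modified"] = new_state.pop("date_last_modified")
    # Assumption: bzr was never listed before, so its date starts unset.
    new_state.setdefault("bzr_date_last_modified", None)
    return new_state


legacy = {"date_last_modified": "2022-02-16T19:32:09.400561+00:00"}
migrated = migrate_lister_state(legacy)
```

Applying the helper a second time leaves the state unchanged, so re-running the deployment step would be harmless under these assumptions.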
  • Raphaël Gomès mentioned in merge request !266 (closed)


  • Raphaël Gomès mentioned in merge request !412 (closed)


  • This looks good, considering the... interesting launchpad API.

  • Merge request was accepted

  • Raphaël Gomès approved this merge request


  •               credentials=credentials,
              )
              self.incremental = incremental
    -         self.date_last_modified = None
    +         self.date_last_modified: Dict[str, Optional[datetime]] = {
    +             "git": None,
    +             "bzr": None,
    +         }

          def state_from_dict(self, d: Dict[str, Any]) -> LaunchpadListerState:
    -         date_last_modified = d.get("date_last_modified")
    -         if date_last_modified is not None:
    -             d["date_last_modified"] = iso8601.parse_date(date_last_modified)
    +         for vcs_type in ["git", "bzr"]:
    +             key = f"{vcs_type}_date_last_modified"
    +             date_last_modified = d.get(key)
    +             if date_last_modified is not None:
    +                 d[key] = iso8601.parse_date(date_last_modified)
    +
              return LaunchpadListerState(**d)

          def state_to_dict(self, state: LaunchpadListerState) -> Dict[str, Any]:
    -         d: Dict[str, Optional[str]] = {"date_last_modified": None}
    -         date_last_modified = state.date_last_modified
    -         if date_last_modified is not None:
    -             d["date_last_modified"] = date_last_modified.isoformat()
    +         d: Dict[str, Optional[str]] = {}
    +         for vcs_type in ["git", "bzr"]:
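
The diff is cut off inside `state_to_dict`, so here is a standalone sketch of how a per-VCS state round trip might work. Several parts are assumptions rather than the merged code: `ListerState` is a stand-in for `LaunchpadListerState`, the `bzr_date_last_modified` field is inferred from the lister state shown in the test plan, the rest of the `state_to_dict` loop is guessed, and `datetime.fromisoformat` is swapped in for `iso8601.parse_date` to keep the example stdlib-only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

VCS_TYPES = ["git", "bzr"]


@dataclass
class ListerState:
    """Stand-in for LaunchpadListerState, with one date per VCS type."""

    git_date_last_modified: Optional[datetime] = None
    bzr_date_last_modified: Optional[datetime] = None


def state_from_dict(d: Dict[str, Any]) -> ListerState:
    parsed = dict(d)  # avoid mutating the caller's dict
    for vcs_type in VCS_TYPES:
        key = f"{vcs_type}_date_last_modified"
        value = parsed.get(key)
        if value is not None:
            # Swapped in for iso8601.parse_date to avoid a third-party import.
            parsed[key] = datetime.fromisoformat(value)
    return ListerState(**parsed)


def state_to_dict(state: ListerState) -> Dict[str, Optional[str]]:
    d: Dict[str, Optional[str]] = {}
    for vcs_type in VCS_TYPES:
        key = f"{vcs_type}_date_last_modified"
        value = getattr(state, key)
        # Assumption: an unset date is stored as JSON null.
        d[key] = value.isoformat() if value is not None else None
    return d


stored = {
    "bzr_date_last_modified": "2009-09-10T10:21:25+00:00",
    "git_date_last_modified": "2022-02-16T19:07:16.970183+00:00",
}
state = state_from_dict(stored)
round_tripped = state_to_dict(state)
```

Deriving the key from the VCS type keeps both directions symmetric, so supporting another VCS would mean extending one list and adding one dataclass field.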
  •           """
              assert self.lister_obj.id is not None

    -         prev_origin_url = None
    +         prev_origin_url: Dict[str, Optional[str]] = {"git": None, "bzr": None}
  • Looks good to me, I added some nitpick comments.

  • Thanks for the reviews

    Looks good to me, I added some nitpick comments.

    Good points, I'll adapt in another commit to avoid rebasing a gazillion diffs.

  • Merge request was merged
