- Apr 26, 2022
-
-
vlorentz authored
-
- Feb 07, 2022
-
-
Antoine R. Dumont authored
Related to T3916
-
- Jan 11, 2022
- Dec 16, 2021
-
-
Antoine R. Dumont authored
This also drops spurious copyright headers from those files, if present. Related to T3812
-
- Oct 07, 2021
-
-
vlorentz authored
-
- Jun 09, 2021
-
-
Antoine Lambert authored
-
Antoine R. Dumont authored
-
- Apr 26, 2021
-
-
Antoine Lambert authored
Make it possible to check that the package documentation can be built without producing Sphinx warnings. The sphinx environment is designed to be used in continuous integration, to prevent committed changes from breaking the documentation build. The sphinx-dev environment is designed to be used inside a full swh development environment. Related to T3258
-
- Mar 08, 2021
-
-
Tushar Goel authored
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
-
- Mar 05, 2021
- Mar 02, 2021
-
-
Tushar Goel authored
Currently a boolean is used to record the mapping status of a row. Instead of a boolean, use a named class with three values: mapped, unmapped and ignore.
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
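As an illustration of the change described above, a three-state status can be expressed with Python's `enum.Enum`. This is a minimal sketch, not the code from this commit: the class name `MappingStatus`, the member names, the meaning attached to each member, and the helper `should_retry` are all assumptions made for the example.

```python
from enum import Enum


class MappingStatus(Enum):
    """Hypothetical three-state replacement for a True/False mapping flag."""

    MAPPED = "mapped"      # assumed meaning: the row was mapped completely
    UNMAPPED = "unmapped"  # assumed meaning: mapping failed, retry later
    IGNORE = "ignore"      # assumed meaning: the row is deliberately skipped


def should_retry(status: MappingStatus) -> bool:
    """Only unmapped rows are worth retrying in a later run."""
    return status is MappingStatus.UNMAPPED
```

Compared with a boolean, the third member lets callers tell "could not be mapped yet" apart from "should never be mapped".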
-
- Feb 23, 2021
-
-
Tushar Goel authored
Build a mechanism that writes the data from the clearcode database that has been mapped with swh storage into swh RawExtrinsicMetadata, and writes the data that has not been mapped to a table, unmapped_data. This orchestration process will run periodically and will only try to map new data entered after the last orchestration run, together with the data that was not mapped during the last run.
Initialize the tables if they don't exist in the database. Initialize swh storage and add the MetadataAuthority and MetadataFetcher. Then map the previously unmapped data, get the date of the last orchestration run, read the data from clearcode and orchestrate the rows from the clearcode DB: if the whole row is mapped, store it in the metadata storage; if only part of the data, or none of it, is matched, store the row in the unmapped data table for future mapping; if the tool of the row is fossology, skip the row. Add tests and docstrings.
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
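The control flow of one orchestration run, as described above, can be sketched in memory without any database. This is an illustration under stated assumptions, not the commit's implementation: the function name `orchestrate`, the row shape (a dict with `path`, `tool` and `last_modified_date` keys), and the `map_row` callable returning `(fully_mapped, metadata)` are hypothetical.

```python
from datetime import datetime
from typing import Callable, Iterable, List, Optional, Tuple

Row = dict  # hypothetical row shape: {"path", "tool", "last_modified_date"}


def orchestrate(
    rows: Iterable[Row],
    previously_unmapped: Iterable[Row],
    last_run: Optional[datetime],
    map_row: Callable[[Row], Tuple[bool, list]],
) -> Tuple[list, List[Row]]:
    """Return (mapped metadata, rows still unmapped) for one run."""
    # Only rows entered after the last run count as new; None means first run.
    new_rows = [
        row for row in rows
        if last_run is None or row["last_modified_date"] > last_run
    ]
    mapped: list = []
    still_unmapped: List[Row] = []
    # Retry the leftovers from the previous run first, then the new rows.
    for row in [*previously_unmapped, *new_rows]:
        if row.get("tool") == "fossology":
            continue  # rows from this tool are skipped entirely
        fully_mapped, metadata = map_row(row)
        if fully_mapped:
            mapped.extend(metadata)
        else:
            still_unmapped.append(row)  # kept for a future run
    return mapped, still_unmapped
```

In a real run the two returned collections would go to the metadata storage and the unmapped data table respectively; here they are plain lists so the logic can be checked in isolation.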
-
- Feb 19, 2021
-
-
vlorentz authored
-
- Feb 16, 2021
-
-
vlorentz authored
Instead, make these functions only process what is specific to each tool type (i.e. the layout of the metadata listing the files), and let map_harvest deal with what they have in common (mapping_status + calling map_sha1_and_add_in_data)
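The split described above, where tool-specific functions only know the metadata layout and a shared function does the rest, might look like the following sketch. The layouts, the extractor names, the `FILE_LISTERS` table and the `known_sha1s` set are invented for illustration; the in-line membership check merely stands in for the real call to map_sha1_and_add_in_data.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


# Hypothetical per-tool extractors: each one only knows where its tool
# puts the list of files. The layouts below are made up for the sketch.
def scancode_files(metadata: dict) -> Iterable[dict]:
    return metadata.get("content", {}).get("files", [])


def licensee_files(metadata: dict) -> Iterable[dict]:
    return metadata.get("files", [])


FILE_LISTERS: Dict[str, Callable[[dict], Iterable[dict]]] = {
    "scancode": scancode_files,
    "licensee": licensee_files,
}


def map_harvest(
    tool: str, metadata: dict, known_sha1s: Set[str]
) -> Tuple[bool, List[Tuple[str, dict]]]:
    """Common part: track the mapping status and collect mapped files."""
    mapping_status = True
    data: List[Tuple[str, dict]] = []
    for file in FILE_LISTERS[tool](metadata):
        sha1 = file.get("sha1")
        if sha1 in known_sha1s:  # stand-in for the archive lookup
            data.append((sha1, file))
        else:
            mapping_status = False  # at least one hash could not be mapped
    return mapping_status, data
```

Adding support for another tool then only requires a new extractor and one entry in the table.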
-
Tushar Goel authored
Currently map_row returns a complex tuple, which damages code readability; instead, it should return a named class, RawExtrinsicMetadata. Add the fields, such as file and format, that map_row, map_scancode, map_clearlydefined and map_licensee need in order to map the previous tuple data to RawExtrinsicMetadata. Add docstrings and tests.
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
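To show why a named class reads better than a positional tuple, here is a small sketch using a frozen dataclass. It does not reproduce the real RawExtrinsicMetadata class: `MappedEntry`, its three fields, and `from_legacy_tuple` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MappedEntry:
    """Hypothetical stand-in for the named class; field names are assumptions."""

    target: str      # identifier of the archived object the entry describes
    format: str      # name of the metadata format, e.g. "json"
    metadata: bytes  # raw payload


def from_legacy_tuple(legacy: Tuple[str, str, bytes]) -> MappedEntry:
    """Convert an old positional tuple into the named form."""
    target, fmt, metadata = legacy
    return MappedEntry(target=target, format=fmt, metadata=metadata)
```

Callers can then write `entry.format` instead of remembering that the format sits at index 1 of a tuple.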
-
- Feb 02, 2021
-
-
Tushar Goel authored
Build a mechanism to map a row [path (primary key), content (binary data), last_modified_date (timestamp with time zone), map_error (error message while mapping), uuid] from the clearcode toolkit database to the Software Heritage archive, using the content table for sha1 and the revision table for sha1_git, and to extract the required information from that row. Return the list of data that has been mapped together with the mapping status (True if every hash of the row could be mapped, False otherwise), so that a row that cannot be mapped for now can be stored and mapped in the future.
Add various exception classes in error.py that can be raised while mapping a row. Check whether the row is a definition or a harvest, and raise an exception if its path is invalid. If the row is a definition, map the data using map_definition; if it is a harvest, map it using map_harvest. Use storage instead of SQL queries when mapping against the data inside the archive. Add tests to cover all the cases, and add docstrings to explain how every function works.
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
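The dispatch described above (validate the path, decide between definition and harvest, call the matching mapper) can be sketched as follows. The path rule used in `row_kind`, the exception name `InvalidPathError`, and the mapper signature are assumptions for illustration only; the real path format and the exception classes in error.py are not shown in this log.

```python
from typing import Callable, Tuple

Mapper = Callable[[dict], Tuple[bool, list]]


class InvalidPathError(Exception):
    """Raised when a row's path is not a usable definition or harvest path."""


def row_kind(path: str) -> str:
    """Classify a path; the rule used here is an assumption for the sketch."""
    parts = path.split("/")
    if len(parts) < 2 or not all(parts):
        raise InvalidPathError(path)
    return "harvest" if "tool" in parts else "definition"


def map_row(row: dict, map_definition: Mapper, map_harvest: Mapper) -> Tuple[bool, list]:
    """Dispatch a row to the matching mapper and return (status, mapped data)."""
    kind = row_kind(row["path"])
    mapper = map_harvest if kind == "harvest" else map_definition
    return mapper(row)
```

Passing the two mappers in as arguments keeps the sketch self-contained and makes the dispatch easy to test with fakes.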
-
- Jan 20, 2021
-
-
Tushar Goel authored
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
-
- Dec 18, 2020
-
-
vlorentz authored
-
- Dec 17, 2020
-
-
Tushar Goel authored
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
-
- Dec 15, 2020
-
-
vlorentz authored
-