swh-loader-svn
==============

Documents are in the ./docs folder:
- Specification: ./docs/swh-loader-svn.txt
- Performance comparison with git-svn: ./docs/comparison-git-svn-swh-svn.org

# Configuration file

## Location

One of:
- /etc/softwareheritage/loader/svn.ini
- ~/.config/swh/loader/svn.ini
- ~/.swh/loader/svn.ini
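
As a minimal sketch of how such a lookup could work, the following picks the first of the three paths that exists. The search order and the `find_config` helper are assumptions made for illustration; the loader's actual precedence is not documented here.

```python
import os

# Candidate paths copied from the list above.
# The search order is an assumption, not the loader's documented behaviour.
CANDIDATE_PATHS = [
    "/etc/softwareheritage/loader/svn.ini",
    "~/.config/swh/loader/svn.ini",
    "~/.swh/loader/svn.ini",
]


def find_config(paths=CANDIDATE_PATHS, exists=os.path.isfile):
    """Return the first existing path (with ~ expanded), or None."""
    for path in paths:
        expanded = os.path.expanduser(path)
        if exists(expanded):
            return expanded
    return None


if __name__ == "__main__":
    print(find_config() or "no configuration file found")
```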

## Configuration sample

```
storage:
  cls: remote
  args:
    url: http://localhost:5002/

send_contents: true
send_directories: true
send_revisions: true
send_releases: true
send_occurrences: true
# maximum number of contents to send to storage in one packet
content_packet_size: 10000
# 100 MiB of content data
content_packet_block_size_bytes: 104857600
# size limit for a single blob, 1 GiB (beyond that limit, the
# content's data is not sent to storage)
content_packet_size_bytes: 1073741824
directory_packet_size: 2500
revision_packet_size: 10
release_packet_size: 1000
occurrence_packet_size: 1000

check_revision: 10
```
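
To make the packet settings more concrete, here is a rough sketch of count-and-size based batching. It assumes a packet is flushed as soon as it reaches `content_packet_size` items or `content_packet_block_size_bytes` bytes; the loader's real flushing logic may differ, and `split_into_packets` is a hypothetical helper.

```python
def split_into_packets(sizes, max_count, max_bytes):
    """Group blob sizes into packets.

    A packet is closed once it holds max_count items or at least
    max_bytes bytes (assumed semantics, for illustration only).
    """
    packets, current, current_bytes = [], [], 0
    for size in sizes:
        current.append(size)
        current_bytes += size
        if len(current) >= max_count or current_bytes >= max_bytes:
            packets.append(current)
            current, current_bytes = [], 0
    if current:
        packets.append(current)
    return packets


# The byte values used in the sample are plain powers of two.
CONTENT_PACKET_BLOCK_SIZE_BYTES = 100 * 1024 ** 2  # 104857600, 100 MiB
CONTENT_PACKET_SIZE_BYTES = 1024 ** 3              # 1073741824, 1 GiB

if __name__ == "__main__":
    blobs = [60 * 1024 ** 2, 50 * 1024 ** 2, 10]
    print(split_into_packets(blobs, 10000, CONTENT_PACKET_BLOCK_SIZE_BYTES))
```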

## Configuration content

The configuration must declare at least the following module
(swh.loader.svn.tasks) and queue (swh_loader_svn):


```
[main]
task_broker = amqp://guest@localhost//
task_modules = swh.loader.svn.tasks
task_queues = swh_loader_svn
task_soft_time_limit = 0
```

swh.loader.svn.tasks and swh_loader_svn are the important entries here.
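
A quick sanity check of these two entries could look like the sketch below, which parses the INI text with the standard `configparser` module. The `missing_entries` helper is hypothetical, and treating the values as comma-separated lists is an assumption.

```python
import configparser

SAMPLE = """
[main]
task_broker = amqp://guest@localhost//
task_modules = swh.loader.svn.tasks
task_queues = swh_loader_svn
task_soft_time_limit = 0
"""


def missing_entries(text):
    """Return the names of the required options that are absent or wrong."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("main"):
        return ["task_modules", "task_queues"]
    # Assumption: several modules or queues would be comma-separated.
    modules = [m.strip() for m in
               parser.get("main", "task_modules", fallback="").split(",")]
    queues = [q.strip() for q in
              parser.get("main", "task_queues", fallback="").split(",")]
    missing = []
    if "swh.loader.svn.tasks" not in modules:
        missing.append("task_modules")
    if "swh_loader_svn" not in queues:
        missing.append("task_queues")
    return missing


if __name__ == "__main__":
    print(missing_entries(SAMPLE) or "configuration looks complete")
```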

## Start a worker instance

To start a worker instance:

```sh
python3 -m celery worker --app=swh.scheduler.celery_backend.config.app \
                --pool=prefork \
                --concurrency=10 \
                -Ofair \
                --loglevel=debug 2>&1
```

## Produce a repository to load

For the available options, see:

`python3 -m swh.loader.svn.producer svn --help`

### One repository

```sh
python3 -u -m swh.loader.svn.producer svn --svn-url file:///home/storage/svn/repos/pkg-fox --visit-date 'Wed, 3 May 2017 17:16:32 +0200'
```

Note:
- `--visit-date` overrides the visit date, which defaults to the current time.
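
The date in the command above follows the RFC 2822 style used in email headers. As a small sketch, a value in that style can be generated with Python's standard library; whether the producer accepts other date formats is not covered here, and `visit_date_string` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def visit_date_string(moment):
    """Format a timezone-aware datetime in RFC 2822 style."""
    return format_datetime(moment)


if __name__ == "__main__":
    # Same instant as in the example command, in the +0200 timezone.
    tz = timezone(timedelta(hours=2))
    print(visit_date_string(datetime(2017, 5, 3, 17, 16, 32, tzinfo=tz)))
```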

### Multiple repositories

```sh
cat ~/svn-repository-list | python3 -m swh.loader.svn.producer svn
```

The file svn-repository-list contains svn repository URLs, one per
line, each optionally followed by a second URL (shown as
`optional-url` below). For example:

```txt
svn://svn.debian.org/svn/pkg-fox/ optional-url
svn://svn.debian.org/svn/glibc-bsd/ optional-url
svn://svn.debian.org/svn/pkg-voip/ optional-url
svn://svn.debian.org/svn/python-modules/ optional-url
svn://svn.debian.org/svn/pkg-gnome/ optional-url
```
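
To illustrate the line format, the sketch below splits each line into the repository URL and the optional second field. It is not the producer's actual parsing code; `parse_repository_list` is a hypothetical helper, and whitespace separation is an assumption based on the sample.

```python
import sys


def parse_repository_list(lines):
    """Return (svn_url, optional_second_field) pairs, skipping blank lines."""
    entries = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        svn_url = fields[0]
        second = fields[1] if len(fields) > 1 else None
        entries.append((svn_url, second))
    return entries


if __name__ == "__main__":
    # Reads the list from stdin, mirroring the `cat ... |` usage above.
    for svn_url, second in parse_repository_list(sys.stdin):
        print(svn_url, second)
```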

## Produce a list of svndump archives to load

See `python3 -m swh.loader.svn.producer svn-archive --help`