Account: https://api.grid5000.fr/stable/users/ldachary
Laboratory: Software Heritage Special Task Force Unit Detached
Project: Software Heritage
In the past months a novel object storage architecture was designed[0] and experimented with on the grid5000 Grenoble cluster[1]. It allows for the efficient storage of 100 billion small immutable objects (median size of 4KB). It will be used by the Software Heritage project to keep accumulating publicly available source code, a corpus that is constantly growing. Software Heritage has already published articles[2][3] and more are expected in the future. Their work would not be possible without this novel object storage architecture, because the existing solutions are either not efficient enough or too costly.
Date of the reservation: August 6th, 14th, or 21st, at 7pm
Duration: 9 days
The goal is to run a benchmark demonstrating that the object storage architecture delivers the expected results in an experimental environment at scale. Running it over a week-end (60 hours) shows that it behaves as expected, but it does not exhaust the resources of the cluster (using only 20% of the disk capacity). Running the benchmark for 9 days would allow approximately 100TB of storage to be used instead of 20TB. That is still only a fraction of the target volume (10PB), but it may reveal issues that could not be observed at a smaller scale.
The usual grid5000 contact is on vacation, so his replacement was asked to resolve this.
I was about to make the reservation and ran into the following problem:
$ oarsub -t exotic -l "{cluster='dahu'}/host=30+{cluster='yeti'}/host=3,walltime=216" --reservation '2021-08-06 19:00:00' -t deploy
[ADMISSION RULE] Include exotic resources in the set of reservable resources (this does NOT exclude non-exotic resources).
[ADMISSION RULE] Error: Walltime too big for this job, it is limited to 168 hours
Would you be so kind as to let me know how I can work around it? In the meantime I made a 163-hour reservation (job 2019935) just to make sure the time slot is not inadvertently taken by another request.
Thanks again for your help and have a wonderful day!
We have a procedure for this kind of case: I added you to the
"oar-unrestricted-adv-reservations" group, which should lift all the
restrictions on advance reservations of resources. You should therefore
be able to redo your reservation with the right walltime.
I set an expiration date of September 12th on this group membership to be
sure it covers your needs, but do remember to file a new special usage
request if you have another out-of-charter need after the one in August.
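With the restriction lifted, the 163-hour placeholder (job 2019935 above) can be released before resubmitting with the full walltime; a minimal sketch, assuming the standard OAR client on the frontend:
$ oardel 2019935   # release the placeholder so the slot is free for the 216-hour reservation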
$ oarsub -t exotic -l "{cluster='dahu'}/host=30+{cluster='yeti'}/host=3,walltime=216" --reservation '2021-08-06 19:00:00' -t deploy
[ADMISSION RULE] Include exotic resources in the set of reservable resources (this does NOT exclude non-exotic resources).
[ADMISSION RULE] ldachary is granted the privilege to do unlimited reservations
[ADMISSION RULE] Computed global resource filter: -p "(deploy = 'YES') AND maintenance = 'NO'"
[ADMISSION_RULE] Computed resource request: -l {"(cluster='dahu') AND type = 'default'"}/host=30+{"(cluster='yeti') AND type = 'default'"}/host=3
Generate a job key...
OAR_JOB_ID=2019986
Reservation mode: waiting validation...
Reservation valid --> OK
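As a sanity check, the validated reservation can be inspected from the frontend ahead of the start date; a small example, assuming the standard oarstat client and the job id returned above:
$ oarstat -f -j 2019986   # show the full details and state of the advance reservation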
The run terminated August 11th @ 15:21 because of what appears to be a rare race condition, but it was mostly finished by then. The results show an unexpected degradation of the read performance, which keeps getting worse over time and deserves further investigation. The write performance is however stable, which suggests the benchmark code itself may be responsible for the degradation: if the Ceph cluster were slowing down globally, both reads and writes would degrade, since previous benchmark results showed a correlation between the two.
Bytes write     106.4 MB/s
Objects write   5.2 Kobject/s
Bytes read      94.6 MB/s
Objects read    23.1 Kobject/s
1014323 random reads take longer than 100ms (2.1987787007491675%)
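One way to confirm whether the slowdown is benchmark-side or cluster-wide is to sample the client I/O rates reported by Ceph itself, independently of the benchmark code. A minimal sketch, assuming a node with an admin keyring and a hypothetical pool-stats.log output file:
# Log the per-pool client read/write rates reported by the OSDs every 60 seconds;
# a cluster-wide degradation would show up in both directions over time.
while true; do
    date --iso-8601=seconds >> pool-stats.log
    ceph osd pool stats >> pool-stats.log
    sleep 60
done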
The benchmarks were modified to (i) use a fixed number of random / sequential readers instead of a random choice, for better predictability, and (ii) introduce throttling to cap the sequential read speed at approximately 200MB/s. A read-only run was started:
and at the same time rbd bench was run to write continuously to a single image at ~200MB/s. The rbd bench started a few minutes after the reads. It will run for the next 24 hours to verify that: