cvsclient: Optimize use of temporary files
I encountered the following issue while testing the loading of a CVS repository submitted to Save Code Now:
$ swh -l DEBUG loader run cvs pserver://anonymous@cvs.openacs.org/cvsroot/openacs-4
DEBUG:swh.loader.cvs.loader.CvsLoader:Fetching CVS rlog from cvs.openacs.org:/cvsroot/openacs-4
ERROR:swh.loader.cvs.loader.CvsLoader:Loading failure, updating to `failed` status
Traceback (most recent call last):
File "/home/anlambert/.virtualenvs/swh/lib/python3.11/site-packages/swh/loader/core/loader.py", line 441, in load
File "/home/anlambert/swh/swh-environment/swh-loader-cvs/build/__editable__.swh.loader.cvs-0.8.1-cp311-cp311-linux_x86_64/swh/loader/cvs/loader.py", line 585, in prepare
File "/home/anlambert/swh/swh-environment/swh-loader-cvs/build/__editable__.swh.loader.cvs-0.8.1-cp311-cp311-linux_x86_64/swh/loader/cvs/cvsclient.py", line 356, in fetch_rlog
File "/home/anlambert/swh/swh-environment/swh-loader-cvs/build/__editable__.swh.loader.cvs-0.8.1-cp311-cp311-linux_x86_64/swh/loader/cvs/cvsclient.py", line 294, in _parse_rlog_response
File "/usr/lib/python3.11/tempfile.py", line 796, in TemporaryFile
File "/usr/lib/python3.11/tempfile.py", line 789, in opener
File "/usr/lib/python3.11/tempfile.py", line 395, in _mkstemp_inner
OSError: [Errno 24] Too many open files: '/tmp/tmpy5h2de_1'
When using the pserver protocol, the CVS loader can create a large number of temporary files, which makes loading fail with a "Too many open files" error for large repositories. To mitigate that issue, use SpooledTemporaryFile instead of TemporaryFile: data is kept in memory and only written to a file on disk once its size exceeds a cutoff value.
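As an illustration of the idea only, here is a minimal sketch of buffering response data in a SpooledTemporaryFile. The helper name `buffer_response` and the `SPOOL_MAX_SIZE` cutoff are hypothetical and are not taken from the loader code; only `tempfile.SpooledTemporaryFile` and its `max_size` parameter come from the Python standard library.

```python
import tempfile

# Hypothetical cutoff (1 MiB): smaller payloads never touch the disk.
SPOOL_MAX_SIZE = 1024 * 1024


def buffer_response(chunks, max_size=SPOOL_MAX_SIZE):
    """Collect byte chunks into a spooled temporary file and rewind it.

    The data stays in an in-memory buffer until its size exceeds
    max_size; only then is it rolled over to a real file on disk,
    which is what consumes a file descriptor.
    """
    fp = tempfile.SpooledTemporaryFile(max_size=max_size)
    for chunk in chunks:
        fp.write(chunk)
    fp.seek(0)
    return fp


# Small payloads such as short RCS files are served from memory.
with buffer_response([b"head 1.1;\n", b"access;\n"]) as small:
    data = small.read()
```

With such a helper, many small buffers can be open at once without each of them holding a file descriptor, while large ones still spill to disk instead of growing in memory.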
The checkout implementation of the loader when using the pserver protocol was also simplified.
Note that these are quick fixes: we urgently need to archive the CVS repositories hosted on OSDN.net (which use the pserver protocol) before the upcoming takedown. The loader implementation should be improved later to better follow Python best practices.
Edited by Antoine Lambert