swh-journal: Add tests to the journal publisher using kafka instance

Build is green See https://jenkins.softwareheritage.org/job/DJNL/job/tox/28/ for more details.

Ensure this now rather than later ;)

Build is green

\m/

Thanks for looking into pytest-kafka!

Before you go too much further with this approach, I'll just say that it makes me very uncomfortable...

We don't install external daemon dependencies (e.g. PostgreSQL, RabbitMQ, statsd, ...) in setup.py in any other module, and I don't see a reason for this module to be any different.

If the CI needs Kafka to be installed to be able to run, then we should make sure the image that the tests run on have Kafka available, the same way we currently install PostgreSQL so pytest-postgresql can work.

For local development, i feel we should just do the same thing and document how to install Kafka on the developer's system, rather than have setup.py unconditionally download stuff from the Internet (although the way it's done here looks fine at a glance).

! In !242 (closed), @olasd wrote: Thanks for looking into pytest-kafka!

Before you go too much further with this approach, I'll just say that it makes me very uncomfortable...

yes, that sounds off to my ears as well but i did not see any other way so far.

We don't install external daemon dependencies (e.g. PostgreSQL, RabbitMQ, statsd, ...) in setup.py in any other module, and I don't see a reason for this module to be any different.

indeed.

If the CI needs Kafka to be installed to be able to run, then we should make sure the image that the tests run on have Kafka available, the same way we currently install PostgreSQL so pytest-postgresql can work.

thanks for reminding me of this, i'll go check the details ;)

For local development, i feel we should just do the same thing and document how to install Kafka on the developer's system, rather than have setup.py unconditionally download stuff from the Internet (although the way it's done here looks fine at a glance).

As a trade-off, wouldn't it be ok that we keep the code for local development?

That is, dev can do, to ease setup:

./setup.py develop  # <- the name should probably change, i just kept what upstream did ;)

And simply not use that part for other automatic stuff?

Thanks for the feedback

mentioned in commit swh-environment@c5a48242

Will remove the 'develop' step later

In the mean time:

tox.ini: Remove unused dependency
publisher: separate pytest test from mocked ones
test_publisher: Reorganize code
publisher: Initiate the journal cli to start running the publisher
test_publisher2: Start trying to test with kafka
test_publisher: Use direct instantiation of the in-memory storage
test_publisher2: Progress towards actual publisher test with kafka
test_publisher2: Make the test pass
test_publisher2: Explain why we need the transformation

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tox/35/ See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tox/35/console

Rebase on latest master (and removes the cli adaptation part, !9 (closed))

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tox/38/ See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tox/38/console

Make test work for major object types

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tox/41/ See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tox/41/console

Build has FAILED

Because flake8 is missing on the job node somehow.

I think i'm done for the main part of the tests.

Now i need to concentrate on removing the part about the target ./setup.py develop. Well not necessarily removing that part (because that could prove useful for others) but at least not make the ci job rely on it (through tox in the current diff's state).

Rebase on latest master

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tox/42/ See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tox/42/console

Remove unused import

Build is green See https://jenkins.softwareheritage.org/job/DJNL/job/tox/43/ for more details.

mentioned in commit swh/infra/ci-cd/swh-jenkins-dockerfiles@cc483aa7

So, if we keep the setup.py snippet to install kafka, I think it should be a completely separate subcommand of setup.py, rather than overriding the default behavior of the develop subcommand, which has a somewhat standard meaning of "install the current module for a development setup".

But I really think those 50 lines of python should just be turned into 5-10 lines of shell script that would be more readable and do the exact same thing.

(disclaimer: I still haven't looked too closely at the actual tests)

So, if we keep the setup.py snippet to install kafka, I think it should be a completely separate subcommand of setup.py, rather than overriding the default behavior of the develop subcommand, which has a somewhat standard meaning of "install the current module for a development setup".

well, it was for that indeed, for the tests to work so for development.

(I don't know yet how to make a subcommand of setup.py, i just started from the pytest-kafka's intial setup.py which does that)

But I really think those 50 lines of python should just be turned into 5-10 lines of shell script that would be more readable and do the exact same thing.

fine by me, i had it in swh/infra/ci-cd/swh-jenkins-dockerfiles!42 (closed) (prior to removing it) so i'll just take it back for here.

(disclaimer: I still haven't looked too closely at the actual tests)

ack, that's fine ;)

For information, it's just:

write to the publisher (as our swh.storage.listener does)
publisher reacts, reifies objects and publish them
a consumer of the publisher asserts the objects are the expected reified ones

swh-journal: Add tests to the journal publisher using kafka instance

Test Plan

Closed by Phabricator Migration user 6 years ago (Mar 25, 2019 2:53pm UTC) 6 years ago

Activity

swh-journal: Add tests to the journal publisher using kafka instance

Test Plan

Merge request reports

Closed by Phabricator Migration user 6 years ago (Mar 25, 2019 2:53pm UTC) 6 years ago

Activity