Bug #14898
Closed
test/lib/test/unit/parallel.rb: TestSocket#test_timestamp stuck sometimes
Description
With parallel tests (make test-all TESTS=-j4, i.e. with a parallelism of 4), TestSocket#test_timestamp sometimes gets stuck.
http://ci.rvm.jp/results/trunk-test@ruby-sky3/1087178
We can see this hang even with very old revisions, but we are not sure how to solve it...
Can anyone help us?
Updated by normalperson (Eric Wong) over 6 years ago
ko1@atdot.net wrote:
With parallel tests (make test-all TESTS=-j4, i.e. with a parallelism of 4), TestSocket#test_timestamp sometimes gets stuck.
http://ci.rvm.jp/results/trunk-test@ruby-sky3/1087178
We can see this hang even with very old revisions, but we are not sure how to solve it...
I've never seen it stuck myself.
Is UDP over loopback supposed to be reliable?
I would not expect it to be (but am not sure), I think it's
possible the kernel could drop packets if under memory pressure.
Updated by ko1 (Koichi Sasada) over 6 years ago
On 2018/07/06 18:47, Eric Wong wrote:
I've never seen it stuck myself.
Only a few times per thousand trials. I have also never seen it in manual trials.
Is UDP over loopback supposed to be reliable?
Maybe yes because other tests passed.
I would not expect it to be (but am not sure), I think it's
possible the kernel could drop packets if under memory pressure.
mmm. can we rewrite tests with this concern?
--
// SASADA Koichi at atdot dot net
Updated by normalperson (Eric Wong) over 6 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r63872.
test/socket/test_socket.rb (test_timestamp): retry send
I theorize there can be UDP packet loss even over loopback if
the kernel is under memory pressure. Retry sending periodically
until recvmsg succeeds.
[ruby-core:87842] [Bug #14898]
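The retry idea described in this changeset can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code committed in r63872: the helper name send_until_received, the 0.1 second resend interval, and the 10 second deadline are all made up for the example.

```ruby
require "socket"

# Hypothetical helper (not from r63872): resend +payload+ from +sender+ to
# +receiver+ until the receiver becomes readable, then read the datagram.
# Returns the message and the number of sends that were needed.
def send_until_received(sender, receiver, payload, interval: 0.1, deadline: 10)
  dest = receiver.local_address
  give_up_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + deadline
  tries = 0
  loop do
    sender.send(payload, 0, dest)
    tries += 1
    # IO.select returns nil on timeout; treat that as a lost datagram.
    break if IO.select([receiver], nil, nil, interval)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > give_up_at
      raise "no datagram received after #{tries} tries"
    end
  end
  msg, = receiver.recvmsg
  [msg, tries]
end

# Usage: two UDP sockets bound to ephemeral loopback ports.
Addrinfo.udp("127.0.0.1", 0).bind do |receiver|
  Addrinfo.udp("127.0.0.1", 0).bind do |sender|
    msg, tries = send_until_received(sender, receiver, "a")
    puts "received #{msg.inspect} after #{tries} send(s)"
  end
end
```

With a fixed deadline the caller fails with an error instead of blocking forever in recvmsg, which is the symptom reported in this ticket.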
Updated by normalperson (Eric Wong) over 6 years ago
Koichi Sasada ko1@atdot.net wrote:
On 2018/07/06 18:47, Eric Wong wrote:
I would not expect it to be (but am not sure), I think it's
possible the kernel could drop packets if under memory pressure.
mmm. can we rewrite tests with this concern?
Maybe r63872 can help by retrying send.
Updated by ko1 (Koichi Sasada) over 6 years ago
On 2018/07/07 14:36, Eric Wong wrote:
Maybe r63872 can help by retrying send.
Great! Thank you.
--
// SASADA Koichi at atdot dot net
Updated by ko1 (Koichi Sasada) over 6 years ago
http://ci.rvm.jp/results/trunk-test@frontier/1153126
not fixed yet :(
Updated by ko1 (Koichi Sasada) over 6 years ago
- Status changed from Closed to Open
Updated by normalperson (Eric Wong) over 6 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r64157.
test/socket/test_socket.rb (test_timestampns): retry send
It looks like we need to retry test_timestampns in addition
to test_timestamp; so share some code while we're at it.
cf. http://ci.rvm.jp/results/trunk-test@frontier/1153126
[ruby-core:88104] [Bug #14898]
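Sharing the retry logic between the two tests might look something like the sketch below. It is an assumption-laden illustration, not the code from r64157: the helper recv_timestamp_cmsg, its two wrapper methods, the resend interval, and the retry limit are hypothetical. SO_TIMESTAMPNS is not available on every platform, so only the SO_TIMESTAMP path is exercised here.

```ruby
require "socket"

# Hypothetical shared helper (not the code from r64157): enable the given
# timestamp socket option on a UDP receiver, resend a datagram until one
# arrives, and return the matching timestamp control message.
def recv_timestamp_cmsg(type, interval: 0.1, max_tries: 50)
  Addrinfo.udp("127.0.0.1", 0).bind do |receiver|
    receiver.setsockopt(:SOCKET, type, true)
    Addrinfo.udp("127.0.0.1", 0).bind do |sender|
      max_tries.times do
        sender.send("a", 0, receiver.local_address)
        # Wait briefly; if nothing arrived, assume loss and send again.
        next unless IO.select([receiver], nil, nil, interval)
        _msg, _from, _flags, *controls = receiver.recvmsg
        stamp = controls.find { |c| c.cmsg_is?(:SOCKET, type) }
        return stamp if stamp
      end
    end
  end
  raise "no #{type} control message after #{max_tries} tries"
end

# One caller per test; only the option name differs.
def timestamp_via_so_timestamp
  recv_timestamp_cmsg(:TIMESTAMP).timestamp
end

def timestamp_via_so_timestampns
  recv_timestamp_cmsg(:TIMESTAMPNS).timestamp
end

puts timestamp_via_so_timestamp.inspect if Socket.const_defined?(:SO_TIMESTAMP)
```

Passing the option name as a parameter keeps the send/retry loop in one place, so a fix to the retry behaviour applies to both tests at once.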
Updated by normalperson (Eric Wong) over 6 years ago
ko1@atdot.net wrote:
Oh, different test, that is test_timestampns getting stuck.
Trying r64157:
test/socket/test_socket.rb (test_timestampns): retry send
It looks like we need to retry test_timestampns in addition
to test_timestamp; so share some code while we're at it.
Updated by normalperson (Eric Wong) over 6 years ago
http://ci.rvm.jp/results/trunk_clang_38@silicon-docker/1185552
:<
ko1@atdot.net wrote:
ko1: is frontier also on Docker? I seem to remember hearing of
some UDP problems in containers several years ago, but maybe it
was only UDP multicast... This was years ago, and I never tried
containers myself.
Updated by ko1 (Koichi Sasada) over 6 years ago
Oh, different test, that is test_timestampns getting stuck.
sorry.
ko1: is frontier also on Docker?
No. It is a raw Linux machine.
Updated by normalperson (Eric Wong) about 6 years ago
ko1@atdot.net wrote:
Bug #14898: test/lib/test/unit/parallel.rb: TestSocket#test_timestamp stuck sometimes
https://bugs.ruby-lang.org/issues/14898#change-73373
Still not solved. This might be a similar issue to r64478 with
too many pipes...