https://redmine.ruby-lang.org/https://redmine.ruby-lang.org/favicon.ico?17113305112011-06-24T04:42:23ZRuby Issue Tracking SystemRuby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=181992011-06-24T04:42:23ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>3</i></li></ul><p>Strange. I tried with 0.9.8r, couldn't reproduce it. Then I tried with 1.0.0d, again the tests succeeded. It is only the customized version of 1.0.0d that ships with Fedora 15 that causes the error. I know that they disabled ECC support in their version of OpenSSL 1.0.0d, so they possibly changed other things, too, that could cause the failure.</p>
<p>I'll investigate this further.</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=182092011-06-24T22:04:06ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul><li><strong>Assignee</strong> set to <i>nahi (Hiroshi Nakamura)</i></li></ul><p>I played around a bit and it seems like this could be some kind of race condition.<br>
At first I thought that the custom Fedora OpenSSL would probably reject some<br>
algorithm and so I wanted to see what cipher is negotiated in the tests and added<br>
this:</p>
<a name="Index-testopensslutilsrb"></a>
<h1 >Index: test/openssl/utils.rb<a href="#Index-testopensslutilsrb" class="wiki-anchor">¶</a></h1>
<p>--- test/openssl/utils.rb (revision 32218)<br>
+++ test/openssl/utils.rb (working copy)<br>
@@ -226,6 +226,7 @@<br>
ssl = nil<br>
begin<br>
ssl = ssls.accept</p>
<ul>
<li>
<pre><code> puts ssl.cipher
rescue OpenSSL::SSL::SSLError
retry
end
</code></pre>
</li>
</ul>
<p>This actually made the test failure disappear.</p>
<p>It also disappears if I leave some of the openssl tests out, e.g.<br>
if I leave out the first test test_asn.rb or test_cipher.rb - this will<br>
also cause test_ssl.rb to succeed. What bugs me is that these tests<br>
aren't related to test_ssl.rb except maybe for the fact that they all<br>
"require openssl" and utils.rb. Could it be that something persists in<br>
the buffer used for the SSL tests that is introduced by any of the other tests?</p>
<p>It's really weird... Hiroshi, do you have an idea what the cause might<br>
be?</p>
<p>I would tend to ignore this since it only appears with a customized<br>
OpenSSL version in a very specific scenario if it weren't for the<br>
fact that I'm afraid if really a race condition is the problem then<br>
this could also come up again in other situations...</p>
<p>Thanks in advance,<br>
Martin</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=184872011-06-26T19:21:55Znahi (Hiroshi Nakamura)nakahiro@gmail.com
<ul><li><strong>Target version</strong> set to <i>1.9.3</i></li></ul> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=185732011-06-28T03:40:11ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul></ul><p>Not sure if it is related:</p>
<p>I tested again using Ubuntu 10.04 with OpenSSL 0.9.8k and make test-all TESTS="openssl/test_pair.rb" (at r32258) and I get this (reproducible all the time):</p>
<a name="Running-tests"></a>
<h1 >Running tests:<a href="#Running-tests" class="wiki-anchor">¶</a></h1>
<p>..................E.</p>
<p>Finished tests in 1.201981s, 16.6392 tests/s, 84.0280 assertions/s.</p>
<ol>
<li>Error:<br>
test_write_nonblock(OpenSSL::TestPair):<br>
OpenSSL::SSL::SSLError: SSL_read:: bad write retry<br>
/home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:53:in <code>sysread' /home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:53:in </code>fill_rbuff'<br>
/home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:94:in <code>read' /home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:176:in </code>block in test_write_nonblock'<br>
/home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:44:in <code>ssl_pair' /home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:164:in </code>test_write_nonblock'</li>
</ol>
<p>20 tests, 101 assertions, 0 failures, 1 errors, 0 skips</p>
<p>However, when running the test with OpenSSL 1.0.0d on the same machine, it succeeds.</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=185742011-06-28T04:40:29ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul></ul><p>Martin Bosslet wrote:</p>
<blockquote>
<ol>
<li>Error:<br>
test_write_nonblock(OpenSSL::TestPair):<br>
OpenSSL::SSL::SSLError: SSL_read:: bad write retry<br>
/home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:53:in <code>sysread' /home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:53:in </code>fill_rbuff'<br>
/home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:94:in <code>read' /home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:176:in </code>block in test_write_nonblock'<br>
/home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:44:in <code>ssl_pair' /home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:164:in </code>test_write_nonblock'</li>
</ol>
<p>20 tests, 101 assertions, 0 failures, 1 errors, 0 skips</p>
</blockquote>
<p>I tested further and the test fails deterministically at the same spot each time. If I rewrite test_write_nonblock as follows:</p>
<pre><code>def test_write_nonblock
ssl_pair {|s1, s2|
n = 0
begin
n += s1.write_nonblock("a" * 100000)
n += s1.write_nonblock("b" * 100000)
n += s1.write_nonblock("c" * 100000)
100000.times do |i|
n += 1
begin
s1.write_nonblock("d")
rescue OpenSSL::SSL::SSLError => e
p e
puts "Iteration: #{i}"
raise e
end
end
puts "Da5"
n += s1.write_nonblock("e" * 100000)
puts "Da6"
n += s1.write_nonblock("f" * 100000)
puts "Da7"
rescue IO::WaitWritable
end
s1.close
assert_equal(n, s2.read.length)
}
</code></pre>
<p>end</p>
<p>it will always yield the same result (only with Ubuntu's 0.9.8k though) when running make test-all TESTS="openssl/test_pair.rb":</p>
<p>..................#<OpenSSL::SSL::SSLError: write would block><br>
Iteration: 441<br>
E.</p>
<p>Finished tests in 1.213241s, 16.4848 tests/s, 83.2481 assertions/s.</p>
<ol>
<li>Error:<br>
test_write_nonblock(OpenSSL::TestPair):<br>
OpenSSL::SSL::SSLError: SSL_read:: bad write retry<br>
/home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:53:in <code>sysread' /home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:53:in </code>fill_rbuff'<br>
/home/martin/Projekte/Ruby/ruby-svn/build/.ext/common/openssl/buffering.rb:94:in <code>read' /home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:188:in </code>block in test_write_nonblock'<br>
/home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:44:in <code>ssl_pair' /home/martin/Projekte/Ruby/ruby-svn/ruby/test/openssl/test_pair.rb:164:in </code>test_write_nonblock'</li>
</ol>
<p>20 tests, 101 assertions, 0 failures, 1 errors, 0 skips</p>
<p>It always flunks at iteration 441 with an "OpenSSL::SSL:SSLError: write would block".</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=186662011-06-29T16:30:19Znahi (Hiroshi Nakamura)nakahiro@gmail.com
<ul><li><strong>Priority</strong> changed from <i>3</i> to <i>Normal</i></li></ul><p>Great. Thanks. I changed the priority since Martin found the way to replicate. What makes the story more interesting is that the test 'test_write_nonblock' does not give me a SSLError but an assertion failure at the last assert.</p>
<p>I'll investigate it.</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=186672011-06-29T17:52:16Znahi (Hiroshi Nakamura)nakahiro@gmail.com
<ul></ul><p>The failure I was getting is from the wrong place of 'n += 1'. So it just means 1.0.0d is safe. And I can replicate the error with 0.9.8o on Ubuntu 11.04</p>
<p>Martin, can you add a SSLSocket#close call for s1 and s2 in 'test_connect_accept_nonblock' ensure block, then try the test again? This looks hide the error for me (run all tests normally.)</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=186892011-06-30T11:21:15ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul></ul><p>Oops, the n += 1 was misplaced indeed :)</p>
<p>I tried on Fedora 15 with the official 0.9.8k and was able to reproduce the error in 'test_write_non_block'. Closing sockets s1 and s2 in the ensure block in 'test_connect_accept_nonblock' unfortunately didn't help, still the same error (running the tests normally).</p>
<p>Next, I tested with 0.9.8o (official download) and <em>wasn't</em> able to reproduce it (again Fedora 15 32 bit) - in contrast to your experience.</p>
<p>0.9.8k: Fails<br>
0.9.8l: Fails<br>
0.9.8m: Succeeds<br>
0.9.8n: Succeeds<br>
0.9.8o: Succeeds<br>
0.9.8r: Succeeds<br>
1.0.0d: Succeeds</p>
<p>So it seems that on my side I'm able to reproduce it deterministically with 0.9.8k and 0.9.8l, but the failure vanishes in versions >= 0.9.8m.</p>
<p>I scanned through OpenSSL's change log looking for changes made in 0.9.8m and found this:</p>
<p>Handle non-blocking I/O properly in SSL_shutdown() call.<br>
[Darryl Miles <a href="mailto:darryl-mailinglists@netbauds.net" class="email">darryl-mailinglists@netbauds.net</a>]</p>
<p>The behavior on my machine would underline the assumption that this is probably the cause. What do you think? Should I try to find out exactly what Darryl Miles changed to see whether we could handle this in ext/openssl, too?</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=190662011-07-11T15:08:18Zkosaki (Motohiro KOSAKI)kosaki.motohiro@gmail.com
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Assigned</i></li></ul> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=191072011-07-12T18:18:36Znahi (Hiroshi Nakamura)nakahiro@gmail.com
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>3</i></li></ul><p>I'm on 64bit Ubuntu 11.04 box, and I can replicate the error with 0.9.8o...</p>
<p>Though we still don't know the culprit of this failure, I mark this Low-priority since it's not a report from user.</p>
<p>Martin, you can change Assignee and take this as you like :)</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=191412011-07-13T19:21:16ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul><li><strong>Assignee</strong> changed from <i>nahi (Hiroshi Nakamura)</i> to <i>MartinBosslet (Martin Bosslet)</i></li></ul><blockquote>
<p>Martin, you can change Assignee and take this as you like :)</p>
</blockquote>
<p>Thanks, Hiroshi!</p>
<p>If I get <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: [ext/openssl] SSLSession#initialize fails with OpenSSL 0.9.7 (Closed)" href="https://redmine.ruby-lang.org/issues/4961">#4961</a> done in time I will also try to have this fixed for the 1.9.3 release.</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=196362011-07-25T12:43:26Znahi (Hiroshi Nakamura)nakahiro@gmail.com
<ul></ul><p>While investigating <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Re: socket related errors on chkbuild (Closed)" href="https://redmine.ruby-lang.org/issues/5039">#5039</a>, I found that SSL_shutdown() could add something to global error stack in OpenSSL < 0.9.8m, possibly related the OpenSSL 0.9.8m fix Martin posted above. Based on my investigation, that's the reason why 'bad write retry' (error for SSL_write) is raised from SSL_read. I'll handle 'bad write retry' for SSL_read issue at <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Re: socket related errors on chkbuild (Closed)" href="https://redmine.ruby-lang.org/issues/5039">#5039</a>.</p>
<p>It explains the error 'bad write retry' we're getting well but it doesn't for the error 'SSL_read:: sslv3 alert bad record mac' so I keep this ticket for the issue 'SSL_read:: sslv3 alert bad record mac'.</p>
<p>Martin, do you have any thought about the original error? Are you still getting the error? (not 'bad write retry')</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=197422011-07-27T07:39:13ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul></ul><p>Hiroshi Nakamura wrote:</p>
<blockquote>
<p>While investigating <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Re: socket related errors on chkbuild (Closed)" href="https://redmine.ruby-lang.org/issues/5039">#5039</a>, I found that SSL_shutdown() could add something to global error stack in OpenSSL < 0.9.8m, possibly related the OpenSSL 0.9.8m fix Martin posted above. Based on my investigation, that's the reason why 'bad write retry' (error for SSL_write) is raised from SSL_read. I'll handle 'bad write retry' for SSL_read issue at <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Re: socket related errors on chkbuild (Closed)" href="https://redmine.ruby-lang.org/issues/5039">#5039</a>.</p>
</blockquote>
<p>Runs smooth now with all OpenSSL versions that previously produced the 'bad write retry'. Great work!</p>
<blockquote>
<p>It explains the error 'bad write retry' we're getting well but it doesn't for the error 'SSL_read:: sslv3 alert bad record mac' so I keep this ticket for the issue 'SSL_read:: sslv3 alert bad record mac'.</p>
<p>Martin, do you have any thought about the original error? Are you still getting the error? (not 'bad write retry')</p>
</blockquote>
<p>I don't know what I did to produce that thing. That night I was able to reproduce it just fine, but I never got it again since then. I just ran make test-all for openssl with the custom Fedora version 50 times, not a single error.</p>
<p>What should we do, close it? We could still reopen the issue if I somehow manage to reproduce it...</p>
<p>Since I didn't (and don't) know where to look for it, debugging hasn't led me any further for now - I'm still clueless concerning the origin of the 'bad record mac'.</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=198012011-07-28T16:41:53Znahi (Hiroshi Nakamura)nakahiro@gmail.com
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Feedback</i></li><li><strong>Assignee</strong> deleted (<del><i>MartinBosslet (Martin Bosslet)</i></del>)</li><li><strong>Target version</strong> deleted (<del><i>1.9.3</i></del>)</li></ul><p>Martin Bosslet wrote:</p>
<blockquote>
<p>I don't know what I did to produce that thing. That night I was able to reproduce it just fine, but I never got it again since then. I just ran make test-all for openssl with the custom Fedora version 50 times, not a single error.</p>
<p>What should we do, close it? We could still reopen the issue if I somehow manage to reproduce it...</p>
</blockquote>
<p>Thanks for confirmation. Let's mark this ticket state as 'Feedback'.</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=320162012-10-31T00:57:20Zmame (Yusuke Endoh)mame@ruby-lang.org
<ul></ul><p>Martin and NaHi,</p>
<p>Is there any concrete plan to address this issue?<br>
If not, I'd like to close this ticket now.<br>
You can reopen it whenever you find any clue to make a progres.</p>
<p>--<br>
Yusuke Endoh <a href="mailto:mame@tsg.ne.jp" class="email">mame@tsg.ne.jp</a></p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=324252012-11-05T22:57:35Zmame (Yusuke Endoh)mame@ruby-lang.org
<ul><li><strong>Assignee</strong> set to <i>MartinBosslet (Martin Bosslet)</i></li></ul><p>Emboss and NaHi, ping?</p>
<p>--<br>
Yusuke Endoh <a href="mailto:mame@tsg.ne.jp" class="email">mame@tsg.ne.jp</a></p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=333692012-11-21T11:02:26ZMartinBosslet (Martin Bosslet)Martin.Bosslet@gmail.com
<ul></ul><p>mame (Yusuke Endoh) wrote:</p>
<blockquote>
<p>Emboss and NaHi, ping?</p>
<p>--</p>
</blockquote>
<p>If nahi agrees, I would say "close" since no feedback arrived in the meantime?</p> Ruby master - Bug #4923: [ext/openssl] test_ssl.rb: test_client_auth failshttps://redmine.ruby-lang.org/issues/4923?journal_id=356522013-01-26T06:51:05Zdrbrain (Eric Hodel)drbrain@segment7.net
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Closed</i></li></ul><p>I am closing this due to lack of feedback and the note of a possible fix above.</p>