=begin
My ruby application is making many connections to imap servers and after about a day or so of running, all 200 worker threads deadlock on @response_arrival.wait within imap.rb:
def get_tagged_response(tag)
until @tagged_responses.key?(tag)
@response_arrival.wait
end
return pick_up_tagged_response(tag)
end
I looked at the code and the following lines look suspicious to me:
def receive_responses
while true
begin
resp = get_response
rescue Exception
@sock.close
@client_thread.raise($!)
break
end
break unless resp
begin
synchronize do
case resp
when TaggedResponse
@tagged_responses[resp.tag] = resp
@response_arrival.broadcast
if resp.tag == @logout_command_tag
return
end
when UntaggedResponse
record_response(resp.name, resp.data)
if resp.data.instance_of?(ResponseText) &&
(code = resp.data.code)
record_response(code.name, code.data)
end
if resp.name == "BYE" && @logout_command_tag.nil?
@sock.close
raise ByeResponseError, resp.raw_data
end
when ContinuationRequest
@continuation_request = resp
@response_arrival.broadcast
end
@response_handlers.each do |handler|
handler.call(resp)
end
end
rescue Exception
@client_thread.raise($!)
end
end
end
It looks like in the conditions of "when UntaggedResponse", the @response_arrival.broadcast is not invoked. Would this cause the waiting thread to wait forever?
=begin
Update to this. I did some thread dumps and it looks like the code above is stuck waiting on this thread:
"Thread-9802" daemon prio=10 tid=0x0000000003e71000 nid=0x5e78 runnable [0x00007f3f6e9c6000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:83)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x00007f3ff95fbe48> (a sun.nio.ch.Util$1)
- locked <0x00007f3ff95fbe30> (a java.util.Collections$UnmodifiableSet)
- locked <0x00007f3ff95fba70> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
at org.jruby.ext.openssl.SSLSocket.waitSelect(SSLSocket.java:240)
at org.jruby.ext.openssl.SSLSocket.sysread(SSLSocket.java:448)
at org.jruby.ext.openssl.SSLSocket$i_method_0_1$RUBYINVOKER$sysread.call(org/jruby/ext/openssl/SSLSocket$i_method_0_1$RUBYINVOKER$sysread.gen:65535)
at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:630)
at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:186)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:146)
at ruby.jit.fill_rbuff_0CEA65EC1C909A5CF6CB7E8100C08A3F334C6007.rescue_1$RUBY$__rescue___0(buffering.rb:35)
at ruby.jit.fill_rbuff_0CEA65EC1C909A5CF6CB7E8100C08A3F334C6007.file(buffering.rb:34)
at ruby.jit.fill_rbuff_0CEA65EC1C909A5CF6CB7E8100C08A3F334C6007.file(buffering.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:119)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:106)
at ruby.jit.gets_78EF17A1CC25A7841D76040F4F007ECC23A22E41.file(buffering.rb:106)
at ruby.jit.gets_78EF17A1CC25A7841D76040F4F007ECC23A22E41.file(buffering.rb)
at org.jruby.ast.executable.AbstractScript.file(AbstractScript.java:39)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:153)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:309)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:148)
at ruby.jit.get_response_9C1FCDC0EA0182083B570911F4E3FECA09F08611.file(imap.rb:994)
at ruby.jit.get_response_9C1FCDC0EA0182083B570911F4E3FECA09F08611.file(imap.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:119)
at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:289)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:108)
at org.jruby.ast.VCallNode.interpret(VCallNode.java:85)
at org.jruby.ast.LocalAsgnNode.interpret(LocalAsgnNode.java:123)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.RescueNode.executeBody(RescueNode.java:199)
at org.jruby.ast.RescueNode.interpretWithJavaExceptions(RescueNode.java:118)
at org.jruby.ast.RescueNode.interpret(RescueNode.java:110)
at org.jruby.ast.BeginNode.interpret(BeginNode.java:83)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at org.jruby.ast.WhileNode.interpret(WhileNode.java:131)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
at org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:139)
at org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:159)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:106)
at org.jruby.ast.VCallNode.interpret(VCallNode.java:85)
at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
at org.jruby.runtime.InterpretedBlock.evalBlockBody(InterpretedBlock.java:373)
at org.jruby.runtime.InterpretedBlock.yield(InterpretedBlock.java:327)
at org.jruby.runtime.BlockBody.call(BlockBody.java:78)
at org.jruby.runtime.Block.call(Block.java:89)
at org.jruby.RubyProc.call(RubyProc.java:224)
at org.jruby.RubyProc.call(RubyProc.java:207)
at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:94)
at java.lang.Thread.run(Thread.java:636)
Half of my threads are hanging on "@response_arrival.wait" and the rest is waiting on imap.rb:994 which is s = @sock.gets(CRLF) within the get_response method.
=begin
Hi Yui. Just so i know what to tell the jruby team, why do you think it's a jruby issue? Reason i came here is because it looks like jruby uses the exact same code as MRI.