Project

General

Profile

Actions

Bug #13856

closed

MinGW / mswin intermittent failure in test/socket/test_socket.rb

Added by MSP-Greg (Greg L) over 6 years ago. Updated about 6 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 2.5.0dev (2017-09-11 trunk 59829) [x64-mingw32]
[ruby-core:82602]

Description

Recently, there have been two Appveyor fails, first and second.

Both listed the following as the error:

running file: C:/projects/ruby/test/socket/test_socket.rb

Some worker was crashed. It seems ruby interpreter's bug
or, a bug of test/unit/parallel.rb. try again without -j
option.

I have a similar intermittent error as a silent segfault in MinGW builds. I've attached the patch file I use, it patches the TestSocket#test_closed_read method, and simply adds sock.autoclose = false after sock is created.

I haven't done much Ruby socket coding, so I've never looked into the issue, but the test passes and is stable with the patch. For all I know, the patch may 'end-around' the whole point of the test.

I don't know if this helps identify the real issue or not. If not, feel free to close.


Files

Updated by h.shirosaki (Hiroshi Shirosaki) over 6 years ago

  • ruby -v set to ruby 2.5.0dev (2017-09-11 trunk 59829) [x64-mingw32]

Segmentation fault is caused by RBASIC_CLASS(err) access in hook_before_rewind() in vm.c:1667.
err is not a valid pointer.

Here is gdb output.

Run options: "--ruby=./miniruby.exe -I../snapshot/lib -I. -I.ext/common  ../snapshot/tool/runruby.rb --extout=.ext --debugger -- --disable-gems" --excludes-dir=../snapshot/test/excludes --name=!/memory_leak/ -v -ntest_closed_read

# Running tests:

[1/1] TestSocket#test_closed_read[New Thread 16292.0x331c]
[New Thread 16292.0x57f4]
[Thread 16292.0x331c exited with code 0]

Thread 9 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16292.0x57f4]
0x00000000680aad1a in hook_before_rewind (th=th@entry=0x33671d0, will_finish_vm_exec=will_finish_vm_exec@entry=0, state=6, err=err@entry=0xc0000241,
    cfp=<optimized out>) at ../snapshot/vm.c:1667
1667        if (state == TAG_RAISE && RBASIC_CLASS(err) == rb_eSysStackError) {
(gdb) bt
#0  0x00000000680aad1a in hook_before_rewind (th=th@entry=0x33671d0, will_finish_vm_exec=will_finish_vm_exec@entry=0, state=6, err=err@entry=0xc0000241,
    cfp=<optimized out>) at ../snapshot/vm.c:1667
#1  0x00000000680b6ffd in vm_exec (th=0x31e000, th@entry=0x33671d0) at ../snapshot/vm.c:2003
...

err seems to come from rb_threadptr_execute_interrupts() in thread.c.
Adding volatile as the following suppresses segfault on my test.
But this patch sometimes causes another error (TypeError: exception class/object expected).
I don't know the reason.

diff --git a/thread.c b/thread.c
index d706ee469b..fd1ee78933 100644
--- a/thread.c
+++ b/thread.c
@@ -2058,7 +2058,7 @@ rb_threadptr_execute_interrupts(rb_thread_t *th, int blocking_timing)
 
 	/* exception from another thread */
 	if (pending_interrupt && rb_threadptr_pending_interrupt_active_p(th)) {
-	    VALUE err = rb_threadptr_pending_interrupt_deque(th, blocking_timing ? INTERRUPT_ON_BLOCKING : INTERRUPT_NONE);
+	    volatile VALUE err = rb_threadptr_pending_interrupt_deque(th, blocking_timing ? INTERRUPT_ON_BLOCKING : INTERRUPT_NONE);
 	    thread_debug("rb_thread_execute_interrupts: %"PRIdVALUE"\n", err);
 
 	    if (err == Qundef) {
  1) Failure:
TestSocket#test_closed_read [C:/Users/h.shirosaki/work/rubyinstaller2-packages/mingw-w64-ruby25/src/snapshot/test/socket/test_socket.rb:543]:
[ruby-core:35203].
[IOError] exception expected, not.
Class: <TypeError>
Message: <"exception class/object expected">
---Backtrace---
C:/Users/h.shirosaki/work/rubyinstaller2-packages/mingw-w64-ruby25/src/snapshot/test/socket/test_socket.rb:537:in `readline'
C:/Users/h.shirosaki/work/rubyinstaller2-packages/mingw-w64-ruby25/src/snapshot/test/socket/test_socket.rb:537:in `block in test_closed_read'
---------------

Updated by h.shirosaki (Hiroshi Shirosaki) over 6 years ago

I found that another exception raises while executing rb_exc_raise() in rb_threadptr_execute_interrupts().
This reentering exception would cause segfault for some reason.
Socket is closed releasing GVL before rb_exc_raise() in rb_threadptr_execute_interrupts().
If keeping GVL with close, reentering exception and segfault is not raised on my test. The patch is attached.
This may be related to #4558?

Updated by usa (Usaku NAKAMURA) over 6 years ago

Shirosaki-san, your patch seems ok.
Could you check in?

Updated by MSP-Greg (Greg L) over 6 years ago

Shirosaki-san,

'Breakfast build' is finished, passed test-all, and the following also passed:

ruby runner.rb -I../lib -I. --repeat-count=50 --show-skip socket/test_socket.rb -ntest_closed_read

This has been an intermittent failure, but I've never had it complete 50 before.

Hence, along with usa, I believe you've fixed the issue. Thanks again for your work.

I'm not sure how everyone would prefer to be addressed. If you'd like to address me, please call me Greg. The first code I wrote was on a teletype...

EDIT: A little rushed this morning.

I applied the patch to:

ruby 2.5.0dev (2017-09-21 trunk 59985) [x64-mingw32]

I checked previous builds (without the patch), the following builds failed:

ruby 2.5.0dev (2017-09-20 trunk 59974) [x64-mingw32]  (previous build without patch)

ruby 2.4.2p198 (2017-09-14 revision 59899) [x64-mingw32]

ruby 2.3.5p376 (2017-09-14 revision 59905) [x64-mingw32]

I don't know if this could be backported or not...

Actions #5

Updated by Anonymous over 6 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r60055.


io.c: fix segfault with closing socket on Windows

  • io.c (fptr_finalize_flush): add an argument to keep GVL.
  • io.c (fptr_finalize): adjust for above change.
  • io.c (io_close_fptr): closing without GVL causes another
    exception while raising exception in another thread. This causes
    segfault on Windows. Keep GVL while closing when another thread
    raises.
    [Bug #13856] [ruby-core:82602]
Actions #6

Updated by nagachika (Tomoyuki Chikanaga) over 6 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN to 2.2: UNKNOWN, 2.3: REQUIRED, 2.4: REQUIRED

Updated by nagachika (Tomoyuki Chikanaga) about 6 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: REQUIRED, 2.4: REQUIRED to 2.2: UNKNOWN, 2.3: REQUIRED, 2.4: DONE

ruby_2_4 r62690 merged revision(s) 60055.

Updated by usa (Usaku NAKAMURA) about 6 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: REQUIRED, 2.4: DONE to 2.2: UNKNOWN, 2.3: DONE, 2.4: DONE

ruby_2_3 r62821 merged revision(s) 60055.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0