Bug #13288 (closed)
mingw issues with 57789
Description
I have been building trunk with mingw/MSYS2 for a few months. I also created packages using OpenSSL 1.1.0e and gdbm 1.10.
Yesterday, I built 57782 with no major issues. Today, when I built 57789 (this commit), I got a segfault in the test-all section. The build.log was fine, and the binary responds to ruby -v, but I haven't tested further.
I built 57782 and 57788 with no major issues. I built 57789 again, and again, the segfault occurred.
The full log is at x86_64-check.log, also attached.
I'm not a C type, so if there's anything else you need or would like me to check, please let me know.
        
Updated by shyouhei (Shyouhei Urabe) over 8 years ago
Thank you for the report. According to the log, your compiler is GCC 6.3, so this is exactly the target environment I wanted to optimize. The weird thing is that the stack trace shows the SEGV occurring inside Init_fiddle(), which was not in the changeset.
I'm looking into it. Thank you for your patience.
        
Updated by shyouhei (Shyouhei Urabe) over 8 years ago
- Status changed from Open to Assigned
- Assignee set to shyouhei (Shyouhei Urabe)
        
Updated by MSP-Greg (Greg L) over 8 years ago
Thank you for looking into the problem. As mentioned, I'm very C challenged. Would it help if I did an i686 build?
Looking around MSYS2, the only thing I saw that might affect things was fix_return_size.patch. You're probably already aware of it. The last revision of the libffi package was in July 2016.
        
Updated by shyouhei (Shyouhei Urabe) over 8 years ago
While I was investigating this issue, naruse reverted some part of the commit nonetheless. That might have changed your situation. Can you pull the latest trunk and see if the problem still exists?
        
Updated by MSP-Greg (Greg L) over 8 years ago
Shyouhei Urabe wrote:
While I was investigating this issue, naruse reverted some part of the commit nonetheless.
I hate it when that happens...
Can you pull the latest trunk and see if the problem still exists?
Started. About that previous question: does having info on both x86_64 and i686 help, or does it depend on the issue?
        
Updated by shyouhei (Shyouhei Urabe) over 8 years ago
Greg L wrote:
About that previous question: does having info on both x86_64 and i686 help, or does it depend on the issue?
Yes, that may narrow down the root cause of your problem. If the SEGV disappears on 32-bit platforms, we can say it is something related to the 64-bit data model (LLP64 on Windows). So that info would greatly help us.
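To make the 32-bit versus 64-bit comparison concrete, here is a minimal Ruby sketch that reports the C data model the running interpreter was built for. The helper name `data_model` is hypothetical and not part of Ruby; it relies only on `Array#pack` with native-size directives. On x64-mingw32 one would expect LLP64 (4-byte `long`, 8-byte pointers), and ILP32 on i386-mingw32.

```ruby
# Hypothetical helper: classify the C data model from the byte sizes
# of int, long, and a pointer.
def data_model(int_size, long_size, pointer_size)
  case [int_size, long_size, pointer_size]
  when [4, 4, 4] then "ILP32"
  when [4, 4, 8] then "LLP64" # 64-bit Windows, including x64-mingw32
  when [4, 8, 8] then "LP64"  # most 64-bit Unix-like systems
  when [8, 8, 8] then "ILP64"
  else "unknown"
  end
end

# Native sizes as seen by the running interpreter:
# "i!" = native int, "l!" = native long, "J" = pointer-width unsigned integer.
int_size     = [0].pack("i!").bytesize
long_size    = [0].pack("l!").bytesize
pointer_size = [0].pack("J").bytesize

puts "#{RUBY_PLATFORM}: #{data_model(int_size, long_size, pointer_size)}"
```

Running this with both the x64 and the i686 build shows whether the two interpreters really differ in pointer width, which is the difference the comparison is meant to isolate.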
        
Updated by MSP-Greg (Greg L) over 8 years ago
Shyouhei,
x64 failed (SEGV) on 57794 with the same log:
C:\Windows\system32\ntdll.dll(KiUserExceptionDispatcher+0x2e) [0x0000000077726818]
 [0xffffffffff7aad10]
D:\msys64\mingw64\bin\libffi-6.dll(ffi_call_win64+0x97) [0x00000000005b4797]
D:\msys64\mingw64\bin\libffi-6.dll(ffi_call+0x47) [0x00000000005b43a7]
D:\GitHub\ruby-makepkg-mingw\mingw-w64-ruby\src\build-x86_64-w64-mingw32\.ext\x64-mingw32\fiddle.so(Init_fiddle+0xa58) [0x0000000063d82e38]
I'll start i686 tomorrow morning. Thanks again.
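A backtrace like the one above is easier to compare across builds once each frame is split into library, symbol, and offset. The following is a minimal Ruby sketch, assuming the `module(symbol+0xOFFSET) [0xADDRESS]` frame layout shown in the log; `FRAME` and `parse_frame` are hypothetical names used only for illustration. Note that the printed symbol is whatever the resolver found nearest to the address, so `Init_fiddle+0xa58` may simply be the closest exported symbol in fiddle.so rather than proof that the call originated in `Init_fiddle()` itself.

```ruby
# Matches frames such as
#   path\to\module.dll(symbol+0x97) [0x00000000005b4797]
# as well as bare frames such as
#   [0xffffffffff7aad10]
FRAME = /\A\s*(?:(?<path>.+?)\((?<symbol>[^()+]+)\+0x(?<offset>\h+)\)\s*)?\[0x(?<address>\h+)\]\s*\z/

# Hypothetical helper: returns a hash for a frame line, or nil if it does not match.
def parse_frame(line)
  m = FRAME.match(line)
  return nil unless m
  {
    library: m[:path] && m[:path].split(/[\\\/]/).last, # file name without directories
    symbol:  m[:symbol],
    offset:  m[:offset] && m[:offset].hex,
    address: m[:address].hex
  }
end

# Frames copied from the log above (single quotes keep the backslashes literal).
trace = [
  'C:\Windows\system32\ntdll.dll(KiUserExceptionDispatcher+0x2e) [0x0000000077726818]',
  ' [0xffffffffff7aad10]',
  'D:\msys64\mingw64\bin\libffi-6.dll(ffi_call_win64+0x97) [0x00000000005b4797]',
  'D:\msys64\mingw64\bin\libffi-6.dll(ffi_call+0x47) [0x00000000005b43a7]',
  'D:\GitHub\ruby-makepkg-mingw\mingw-w64-ruby\src\build-x86_64-w64-mingw32\.ext\x64-mingw32\fiddle.so(Init_fiddle+0xa58) [0x0000000063d82e38]'
]

trace.each do |line|
  frame = parse_frame(line)
  puts format("%-14s %s", frame[:library] || "(unknown)", frame[:symbol] || "?")
end
```

The frame with no module name is the interesting one: the address does not belong to any loaded library the resolver could name.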
        
Updated by MSP-Greg (Greg L) over 8 years ago
Shyouhei,
Not a good day so far. A couple of i686 revisions built and responded to -v, but test-all is a mess. I spent some time revising my build code and looking around libffi.
Testing 57788 did not yield results similar to what I recall seeing with i686 in the past. MSYS2 had some updates recently, so I will try to check those soon. I'm mostly working on i686 today.
As a reminder, yesterday, after the MSYS2 updates, I got the following on x64 (which is a good result):
16615 tests, 2227473 assertions, 10 failures, 4 errors, 150 skips
ruby -v: ruby 2.5.0dev (2017-03-06 trunk 57788) [x64-mingw32]
Hence, with the current MSYS2/mingw system:
x64: 57788 and prior build fine
i686: builds have issues compared to the last good data I have:
17074 tests, 4982778 assertions, 12 failures, 9 errors, 147 skips
ruby -v: ruby 2.5.0dev (2017-03-01 trunk 57712) [i386-mingw32]
Question: do the results I posted in #7 indicate that the SEGV is in libffi-6.dll? I assume it could also be caused by fiddle.so.
Back when I was working on seeing if gdbm 1.12 would work with ruby (don't think so), I was comparing patches between MSYS2, cygwin, and ruby. With libffi, I found the following:
mingw fix_return_size.patch
cygwin 3.2.1-win64-rewrite.patch
ruby libffi-3.2.1-mswin.patch
No idea whether it's applicable...
        
Updated by MSP-Greg (Greg L) over 8 years ago
Shyouhei,
Not sure which commit did it, but x64 builds fine now. test-all result:
16616 tests, 2233236 assertions, 10 failures, 5 errors, 150 skips
ruby -v: ruby 2.5.0dev (2017-03-08 trunk 57806) [x64-mingw32]
I'll try i686 later.
Thanks to everyone for their work...
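Comparing test-all runs by eye gets tedious, so here is a minimal Ruby sketch that parses two of the x64 summary lines posted in this thread and reports the difference. `parse_summary` and `summary_delta` are hypothetical helper names, and the regular expression assumes the summary format shown above.

```ruby
SUMMARY_KEYS = %i[tests assertions failures errors skips].freeze
SUMMARY_RE = /(\d+) tests, (\d+) assertions, (\d+) failures, (\d+) errors, (\d+) skips/

# Hypothetical helper: turn a test-all summary line into a hash of counts.
def parse_summary(line)
  m = SUMMARY_RE.match(line)
  return nil unless m
  SUMMARY_KEYS.zip(m.captures.map(&:to_i)).to_h
end

# Hypothetical helper: per-counter difference between two parsed summaries.
def summary_delta(before, after)
  SUMMARY_KEYS.map { |key| [key, after[key] - before[key]] }.to_h
end

# Summary lines copied from the x64 results for 57788 and 57806.
r57788 = parse_summary("16615 tests, 2227473 assertions, 10 failures, 4 errors, 150 skips")
r57806 = parse_summary("16616 tests, 2233236 assertions, 10 failures, 5 errors, 150 skips")

summary_delta(r57788, r57806).each do |key, diff|
  puts format("%-10s %+d", key, diff)
end
```

Between 57788 and 57806 the x64 runs differ by one extra test, more assertions, and one extra error; failures and skips are unchanged.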
        
Updated by shevegen (Robert A. Heiler) over 8 years ago
Have we found the code-culprit though or was it a Heisenbug fix? :)
I mean, it may be obvious now that one commit by naruse fixed it, but if anyone could confirm it, that would be great, just out of (my) curiosity.
        
Updated by MSP-Greg (Greg L) over 8 years ago
Robert A. Heiler wrote:
Have we found the code-culprit though or was it a Heisenbug fix? :)
Given that a perfect test system would need to test every conditional and every iteration, I doubt one exists. At some point, I prefer a mathematical stability viewpoint: wide-ranging changes may expose instability elsewhere in code that previously appeared stable.
Building
ruby 2.5.0dev (2017-03-08 trunk 57807) [x64-mingw32]
I received a different SEGV error than the one that started this. The original error may have involved a mingw DLL; the new one is in x64-msvcrt-ruby250.dll. The original occurred during test-all but before testing even started; the new one occurs during a test.
[ 3187/16621] TestArray#test_sum
D:/GitHub/ruby/test/ruby/test_array.rb:2797: [BUG] Segmentation fault
After a bit more work, I'll post it and another testing SEGV error that has existed for quite a while, which I've patched around in my build script...
        
Updated by shyouhei (Shyouhei Urabe) over 8 years ago
- Status changed from Assigned to Closed