Bug #16136
closedString corruption in 2.6.4
Description
When trying to upgrade activeadmin's tests to use ruby 2.6.4, I got some very weird failures where some strings would end up containing random content. See https://circleci.com/gh/activeadmin/activeadmin/20329 for an example of those failures. Failures are somewhat random but the test suite seems to consistently fail on 2.6.4 and pass on 2.6.3. I also managed to reproduce the failures locally, sometimes very consistently, but now it's hardly ever happening... :/
I looked at the changes in 2.6.4 and it looks like https://bugs.ruby-lang.org/issues/16105 and/or https://bugs.ruby-lang.org/issues/15946 could be related, because the region of the Rails code where the corrupted string is created seems to be doing something similar to the reported issues: https://github.com/rails/rails/blob/b7e591a55f4de5f1511c3b9255b3b25159b8ba41/actionpack/lib/action_dispatch/routing/mapper.rb#L404-L412.
Sorry I'm not really sure how to create a reproducible example, or how to debug this issue. Just opening this in case it could help noticing something obviously wrong with the mentioned patches.
Updated by vcsjones (Kevin Jones) about 5 years ago
Hello,
We started running in to this issue as well with Ruby 2.6.4. We are seeing consistent but non-deterministic string corruption in one of our applications. As an example, we see something like:
undefined method `"defense_profile_extension_id\x00eq"'
Note the \x00
in the string does not belong there, it should be a _
. However these results are not consistent. Another run of the application with zero code changes produced this error:
undefined method `"\x00\x00\x00\x00\x00\x00\x00\x00f_profile_extension_id_eq"
This string is being passed in to public_send
elsewhere in another library, hence the "undefined method" error.
Unfortunately, we have yet to be able to create a small reproduction of this issue. We can make it happen consistently in a very large code base though. I did manage to run git bisect
on a locally compiled ruby and determined that the change that introduced is https://github.com/ruby/ruby/commit/da36d5700d9e0e66411d93595b6f654c85853fa1.
When we revert that change, the issue can no longer be reproduced.
Updated by alanwu (Alan Wu) about 5 years ago
I wonder if these bugs exist on master or are specific to 2.6.4.
Updated by naruse (Yui NARUSE) about 5 years ago
- Related to Bug #15916: Memory leak in Regexp literal interpolation added
Updated by naruse (Yui NARUSE) about 5 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN to 2.5: DONTNEED, 2.6: REQUIRED
Updated by deivid (David Rodríguez) about 5 years ago
@vcsjones (Kevin Jones) You're right, my raw guess at the culprit was wrong since I just got the issue again with the ruby I built excluding the commits I mentioned. I'm going to build the same ruby you built now to confirm that the culprit is https://github.com/ruby/ruby/commit/da36d5700d9e0e66411d93595b6f654c85853fa1 from my side too.
If this is the culprit, I'm guessing this is happening on master too, but I'll run my tests against master as well to confirm.
Updated by hsbt (Hiroshi SHIBATA) about 5 years ago
I faced the same error maybe.
Our company blog generated by middleman was failed with Ruby 2.6.4 and 2.7.0 built by master branch. This issue was completely reproduced with them. We could build with Ruby 2.5.6.
I will investigate the root causes of this issue.
Updated by mame (Yusuke Endoh) about 5 years ago
- Status changed from Open to Closed
Applied in changeset git|ade1283ca276f7d589ffd3539fbc7b9817f682d5.
Fix a use-after-free bug by avoiding rb_str_new_frozen
str2 = rb_str_new_frozen(str1)
seems to make str1 a shared string that
refers to str2, but str2 is not marked as STR_IS_SHARED_M nor
STR_NOFREE.
rb_fstring(str2)
frees str2's ptr because it is not marked, and the
free'ed pointer is the same as str1's ptr.
After that, accessing str1 may cause use-after-free memory corruption.
I guess this is a bug of rb_str_new_frozen, but I'm completely unsure
what it should be; the string states and flags are not documented.
So, this is a workaround for [Bug #16136]. I confirmed that rspec of
activeadmin runs gracefully.
Updated by mame (Yusuke Endoh) about 5 years ago
I've committed ade1283ca276f7d589ffd3539fbc7b9817f682d5, which is a workaround for the issue. I confirm that activeadmin test runs successfully with this patch.
I think that this issue is related to the fstring mechanism, but I have no idea what function is wrong. The internal state of a String object is too complicated and undocumented, so I cannot see the design, representation, and invariants. The fstring mechanism is too difficult for human (except @normalperson (Eric Wong)).
Updated by vcsjones (Kevin Jones) about 5 years ago
I applied the patch to ruby 2.6.4 and I cannot reproduce the issue any more. This patch appears to resolve the issue for 2.6.4. Thank you very much for taking the time to investigate the issue.
diff --git a/symbol.c b/symbol.c
index f3506a02dd..a408ee052e 100644
--- a/symbol.c
+++ b/symbol.c
@@ -743,7 +743,8 @@ rb_str_intern(VALUE str)
enc = ascii;
}
else {
- str = rb_str_new_frozen(str);
+ str = rb_str_dup(str);
+ OBJ_FREEZE(str);
}
str = rb_fstring(str);
type = rb_str_symname_type(str, IDSET_ATTRSET_FOR_INTERN);
Updated by deivid (David Rodríguez) about 5 years ago
Wow! Thanks so much for providing this solution in such a timely manner! <3
Updated by mame (Yusuke Endoh) about 5 years ago
@vcsjones (Kevin Jones) Good to hear! Thank you for confirming. And your investigation by bisect was very helpful.
@deivid (David Rodríguez) The most difficult part for me was to read .circleci/config.yml to reproduce the issue on my machine :-)
Updated by deivid (David Rodríguez) about 5 years ago
@mame (Yusuke Endoh) I'm sorry you had to do extra work to figure that out! I'll post proper reproduction steps next time :)
Updated by graywolf (Gray Wolf) about 5 years ago
Out of curiosity, why is this issue closed if it was not yet backported to 2.6 branch?
Updated by mame (Yusuke Endoh) about 5 years ago
graywolf (Gray Wolf) wrote:
Out of curiosity, why is this issue closed if it was not yet backported to 2.6 branch?
Don't worry. The branch maintainers, @nagachika (Tomoyuki Chikanaga) and @usa (Usaku NAKAMURA), are going to check all tickets whose "Backport" field is "REQUIRED", regardless of their status.
Updated by hsbt (Hiroshi SHIBATA) about 5 years ago
Our blog system was fixed at ade1283ca276f7d589ffd3539fbc7b9817f682d5
Updated by nagachika (Tomoyuki Chikanaga) about 5 years ago
- Backport changed from 2.5: DONTNEED, 2.6: REQUIRED to 2.5: DONTNEED, 2.6: DONE
ruby_2_6 r67803 merged revision(s) ade1283ca276f7d589ffd3539fbc7b9817f682d5.
Updated by mame (Yusuke Endoh) about 5 years ago
FYI: Ruby 2.6.5 has been released, which includes the fix for this issue.
https://www.ruby-lang.org/en/news/2019/10/01/ruby-2-6-5-released/
Updated by deivid (David Rodríguez) about 5 years ago
Thanks so much! <3