I was working on https://bugs.ruby-lang.org/issues/20876 and was investigating some problems with zlib.c and GVL, and noticed that zstream_run_func is executed without the GVL, but can invoke various rb_ string functions. Those functions in turn can invoke rb_raise and generally look problematic. However, maybe by luck, such code path does not appear to be invoked in typical usage.
I would like to modify zlib.c to only release the GVL around the CPU intensive compression/decompression operation: https://github.com/ruby/zlib/pull/88
In addition, I identified several more improvements to prevent segfaults and other related failures:
Use rb_str_locktemp to prevent the z->buf changing size while in use by the rb_nogvl code.
Expand the mutex to protect #deflate and #inflate completely, not just the internal operation.
@ko1 (Koichi Sasada) Do we have a proper description of what is safe and what it unsafe to do with the GVL released?
Because obviously it's OK to use ruby_xmalloc / ruby_xfree with the GVL released, so methods which allocate aren't necessarily problematic?\
In this case I'm unclear on why rb_str_set_len / rb_str_modify_expand shouldn't be called with the GVL released, assuming the objects on which they operate aren't visible to any other thread.
I think it would be helpful to have more clear guidelines on these things (unless of course I missed some existing documentation).
* NOTE: You can not execute most of Ruby C API and touch Ruby
* objects in `func()' and `ubf()', including raising an
* exception, because current thread doesn't acquire GVL
* (it causes synchronization problems). If you need to
* call ruby functions either use rb_thread_call_with_gvl()
* or read source code of C APIs and confirm safety by
* yourself.
*
* NOTE: In short, this API is difficult to use safely. I recommend you
* use other ways if you have. We lack experiences to use this API.
* Please report your problem related on it.
*
* NOTE: Releasing GVL and re-acquiring GVL may be expensive operations
* for a short running `func()'. Be sure to benchmark and use this
* mechanism when `func()' consumes enough time.
*
* Safe C API:
* * rb_thread_interrupted() - check interrupt flag
* * ruby_xmalloc(), ruby_xrealloc(), ruby_xfree() -
* they will work without GVL, and may acquire GVL when GC is needed.
Again:
* NOTE: In short, this API is difficult to use safely. I recommend you
* use other ways if you have. We lack experiences to use this API.
* Please report your problem related on it.
@ko1 (Koichi Sasada) Not sure how I didn't think to check that, thank you. So indeed allocations are fine. From what I understand, the issue is mostly exceptions and of course using an object concurrently.
Even if today the implementation follows a "safe" code path, in the future, it may not.
This is a good point.
I think we should consider all C API functions unsafe to be called without the GVL, except the functions listed in Safe C API.
So I think we should update the docs to remove or read source code of C APIs and confirm safety by yourself. as it's not a good idea as it may change and it's very hard to assess if safe.
There would be quite a lot of value in having some nogvl save APIs though. e.g. if database clients could allocate Hash/Array/String to build the response while the GVL is still released, it could really help with throughput of threaded servers like Puma.
There would be quite a lot of value in having some nogvl save APIs though. e.g. if database clients could allocate Hash/Array/String to build the response while the GVL is still released, it could really help with throughput of threaded servers like Puma.
I think it's a great idea (seriously great), but out of scope for this issue. Do you want to create a new issue to start that discussion?