Bug #18911
closedProcess._fork hook point is not called when Process.daemon is used
Description
Hello there! I'm working at Datadog on the ddtrace
gem, and we need to hook into fork operations to make sure that our products work correctly/automatically even in environments that fork.
As part as #17795 a new Process._fork
method was added to allow libraries and frameworks to easily hook into fork operations. I was investigating its use in ddtrace
and noticed the following gap: the Process.daemon
API internally makes use of fork
, but the new hook point is not called for that API.
Testcase:
puts RUBY_DESCRIPTION
module ForkHook
def _fork(*args)
puts " #{Process.pid} Before fork!"
res = super
puts " #{Process.pid} After fork!"
res
end
end
Process.singleton_class.prepend(ForkHook)
puts "#{Process.pid} Regular fork:"
fork { exit }
Process.wait
puts "#{Process.pid} Process.daemon:"
Process.daemon(nil, true)
puts "#{Process.pid} Finishing!"
Testcase output:
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-darwin20]
48136 Regular fork: # <-- original process
48136 Before fork!
48136 After fork! # <-- original process
48137 After fork! # <-- child process
48136 Process.daemon: # <-- original process
48139 Finishing! # <-- forks and pid changes, but the hook isn't called
This was surprising to me since the advantage of this hook point would not not needing to hook into the many other places where fork
can get called from.
Thanks a lot :)
Updated by mame (Yusuke Endoh) over 2 years ago
- Assignee set to akr (Akira Tanaka)
@akr (Akira Tanaka) suggested special treatment for Process.daemon
, and I followed him when implementing it. So this is by design. However, I forgot the reason why Process._fork
should ignore daemon
. I'll ask him at the next dev meeting.
Updated by mame (Yusuke Endoh) over 2 years ago
@ivoanjo (Ivo Anjo) Do you have difficulties due to this behavior in terms of ddtrace? Or you were just "surprised"? The motivation is very important to discuss the issue. Process.daemon
stops threads, so I guess you have any difficulties, but I'd like to confirm it before the dev meeting.
Updated by mame (Yusuke Endoh) over 2 years ago
@nobu (Nobuyoshi Nakada) said that this is because Process.daemon
does not call fork(2) but daemon(3). We happen to know that daemon(3) calls fork(2) (on some environments), but other unknown C functions calling fork internally cannot be handled. So Process._fork
hooks only events that Ruby calls fork(2) directly.
Updated by ivoanjo (Ivo Anjo) over 2 years ago
Thanks @mame (Yusuke Endoh) for the awesomely quick reply :)
@ivoanjo (Ivo Anjo) (Ivo Anjo) Do you have difficulties due to this behavior in terms of ddtrace? Or you were just "surprised"? The motivation is very important to discuss the issue. Process.daemon stops threads, so I guess you have any difficulties, but I'd like to confirm it before the dev meeting.
Yeah, it surprised me because it was a situation where there's a fork (albeit indirectly) and threads die, so I needed to do cleanups/restart stuff, but was not covered by the _fork
.
@nobu (Nobuyoshi Nakada) (Nobuyoshi Nakada) said that this is because Process.daemon does not call fork(2) but daemon(3). We happen to know that daemon(3) calls fork(2) (on some environments), but other unknown C functions calling fork internally cannot be handled. So Process._fork hooks only events that Ruby calls fork(2) directly.
I guess this is perhaps more of a discoverability/documentation issue. My understanding was "hooking _fork
was all you needed", but actually daemon needs to be as well. But in practical terms, if I'm hooking Process._fork
, it's not hard to also hook Process.daemon
.
Next to the code implementing _fork
we have the following comment:
/*
...
* This method is not for casual code but for application monitoring
* libraries. You can add custom code before and after fork events
* by overriding this method.
*/
VALUE
rb_proc__fork(VALUE _obj)
...would it be reasonable to add a "Note: Process#daemon is similar to fork, but does not go through this method." or something similar?
Updated by mame (Yusuke Endoh) over 2 years ago
...would it be reasonable to add a "Note: Process#daemon is similar to fork, but does not go through this method." or something similar?
Let's go with this! Could you please send a PR?
At the dev meeting, I briefly talked with @akr (Akira Tanaka) about this issue.
A main motivation of Process._fork
discussed in #17795 was to disconnect the database connections before fork. This was to prevent corrupted communication if the parent and child processes accessed the database connection in parallel after fork. Process.daemon
uses fork(2) but does not cause this problem because the parent process exits immediately after fork. Thus, for those who override Process._fork
for the original purpose, it would be more natural for Process._fork
not to be invoked in Process.daemon
. Unfortunately, this design does not fit with your use case to restart a monitoring thread after fork.
ivoanjo (Ivo Anjo) wrote in #note-4:
Yeah, it surprised me because it was a situation where there's a fork (albeit indirectly) and threads die, so I needed to do cleanups/restart stuff, but was not covered by the
_fork
.
Just FYI, according to @akr (Akira Tanaka), some OSes supports daemon(3) more directly; daemon(3) does not use "double fork" hack internally on a such OS.
Updated by ivoanjo (Ivo Anjo) over 2 years ago
I've just opened https://github.com/ruby/ruby/pull/6170 to add the comment.
Thanks @mame (Yusuke Endoh) for going above and beyond in helping me with this :)
Updated by ivoanjo (Ivo Anjo) over 2 years ago
- Status changed from Open to Closed
Applied in changeset git|74817f3d37bb7153385f682f75e37713c4c8009d.
[DOC] Process._fork does not get called by Process.daemon
As discussed in Bug #18911, I'm adding some documentation to
Process._fork
to clarify that it is not expected to cover
calls to Process.daemon
.
Co-authored-by: Yusuke Endoh mame@ruby-lang.org
Updated by Dan0042 (Daniel DeLorme) about 1 year ago
Note for future readers coming here from the Process._fork
documentation.
If you need to restart threads, you should override both _fork
and daemon
module RestartWatcherThread
def _fork
pid = super
restart_thread if pid == 0
pid
end
def daemon(...)
super.tap{ restart_thread }
end
prepend_features(Process.singleton_class)
end