Feature #6648

open

Provide a standard API for retrieving all command-line flags passed to Ruby

Added by headius (Charles Nutter) over 12 years ago. Updated about 1 month ago.

Status:
Assigned
Target version:
-
[ruby-core:45867]

Description

Currently there are no standard mechanisms to get the flags passed to the currently running Ruby implementation. The available mechanisms are not ideal:

  • Scanning globals and hoping they have not been tweaked to new settings
  • Using external wrappers to launch Ruby
  • ???

Inability to get the full set of command-line flags, including flags passed to the VM itself (and probably VM-specific) makes it impossible to launch subprocess Ruby instances with the same settings.

A real-world example of this is "bundle exec" when called with a command line that sets various flags, a la jruby -Xsome.vm.setting --1.9 -S bundle exec. None of these flags can propagate to the subprocess, so odd behaviors result. The only option is to put the flags into an env var (JRUBY_OPTS or RUBYOPT), but this breaks the flow of calling a simple command line.

JRuby provides mechanisms to get all its command-line options, but they require calling Java APIs from Ruby's API set. Rubinius provides its own API for accessing command-line options, but I do not know if it includes VM-level flags as well as standard Ruby flags.

I know there is a RubyVM namespace in the 2.0 line. If that namespace is intended to be general-purpose for VM-level features, it would be a good host for this API. Something like...

  class << RubyVM
    def vm_args; end # returns array of command line args *not* passed to the target script

    def script; end # returns the script being executed...though this overlaps with $0

    def script_args; end # returns args passed to the script...though this overlaps with ARGV, but that is perhaps warranted since ARGV can be modified (i.e. you probably want the original args)
  end

Related issues 2 (1 open, 1 closed)

Related to Ruby master - Misc #20739: Test suite does not carry over CLI options (Open)
Is duplicate of Ruby master - Feature #4046: Saving C's **argv and cwd allows Ruby programs to reliably restart themselves (Feedback)

Updated by headius (Charles Nutter) over 12 years ago

Oops, this should be a feature request.

Updated by nobu (Nobuyoshi Nakada) over 12 years ago

  • Tracker changed from Bug to Feature

I'm positive about the feature, but RubyVM wouldn't be the right place.
It is CRuby-specific and would not be expected to exist in other implementations.

Updated by nobu (Nobuyoshi Nakada) over 12 years ago

  • Description updated (diff)

Updated by headius (Charles Nutter) over 12 years ago

ARGV is a special class; perhaps ARGV could have the methods?

Updated by headius (Charles Nutter) over 12 years ago

I was mistaken...it is ARGF, not ARGV that is a special class. ARGV is a normal array.

Another option: ENV, which is a special Hash-like class. ENV.vm_args, ENV.script, and ENV.script_args aren't bad.

Updated by headius (Charles Nutter) over 12 years ago

Ping...we'd still like to have this to be able to build a unifying benchmark tool, which needs to be able to report the actual command-line arguments passed to the runtime. Current tricks are too ugly (parsing ps output, for example), and this would not be difficult to add.
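
For illustration, one of the platform-specific tricks alluded to above can be sketched as follows. This is a Linux-only sketch that assumes /proc is mounted; the helper name raw_process_argv is hypothetical and not part of any Ruby API. The result may also stop reflecting the original command line once a program assigns to $0.

```ruby
# Hypothetical helper: returns the raw C-level argv of the current process
# on Linux, or nil where /proc/self/cmdline is not available.
def raw_process_argv
  path = "/proc/self/cmdline"
  return nil unless File.readable?(path)

  # The kernel exposes argv as a sequence of NUL-separated strings.
  File.binread(path).split("\0")
end

p raw_process_argv
```

The fact that this only works on one family of platforms is part of why a standard API is being requested.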

I'm leaning toward ENV having a few special methods for this, but I'm open to other ideas.

Updated by mame (Yusuke Endoh) about 12 years ago

  • Status changed from Open to Assigned
  • Assignee set to matz (Yukihiro Matsumoto)
  • Target version set to 2.6

I'm sorry that matz didn't notice this.

--
Yusuke Endoh

Updated by naruse (Yui NARUSE) about 7 years ago

  • Target version deleted (2.6)

Updated by hsbt (Hiroshi SHIBATA) 10 months ago

  • Description updated (diff)

Updated by Dan0042 (Daniel DeLorme) 9 months ago · Edited

I'd like to revive this proposal.

The OP mentions calling a subcommand with the same options/flags as the current interpreter, and that's a fine use case. As for me I'm also interested in re-executing the current script while keeping ruby options/flags.

Some time ago I tried writing a rbenv alternative based on the idea of adding "-r versionchecker" to RUBYOPT and then re-executing the current script with a different interpreter if we find a .ruby-version file that specifies a different version. No bash, no shims! But it was not to be; the lack of this proposed API made it infeasible. In particular if ruby is executed with the -e argument it appears impossible to get back the value.

I imagine this feature would also be very useful for web servers that need to re-execute upon receiving USR2. Currently they need to have all their options in RUBYOPT.

Since the path to the current interpreter is already in RbConfig.ruby I would suggest RbConfig.ruby_args for this API.

Then we could have a copy of the original $0 in RbConfig.script and a copy of the original ARGV in RbConfig.script_args, and to re-execute we can do

exec(RbConfig.ruby, *RbConfig.ruby_args, *RbConfig.script, *RbConfig.script_args)

Extra features I'd like, if possible:

  1. if ruby is invoked with -e argument(s), $0 is "-e" but RbConfig.script should be an array of the arguments:
ruby -e 'p 42' -e 'p RbConfig.script'
42
["-e", "p 42", "-e", "p RbConfig.script"]
  2. if ruby is invoked with script on stdin, $0 is "-" but RbConfig.script should be an array with "-e":
echo 'p RbConfig.script' | ruby
["-e", "p RbConfig.script"]

If either of those extra features is impossible/undesirable, RbConfig.script should be false so that exec/system fails with TypeError rather than executing random things.
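
A minimal sketch of how the pieces in this proposal might compose is shown below. None of RbConfig.ruby_args, RbConfig.script, or RbConfig.script_args exists today, so the values are passed in as plain parameters, and the helper name reexec_command is purely hypothetical. Here the check for a false script is made explicit instead of relying on exec to raise.

```ruby
# Hypothetical helper: builds the argv for re-executing a script.
# `script` may be a String ("foo.rb"), an Array (["-e", "p 42"]),
# or false when the original script cannot be reproduced.
def reexec_command(ruby, ruby_args, script, script_args)
  raise TypeError, "original script is not available" unless script

  # The splat turns a String into a one-element list and leaves an Array as-is.
  [ruby, *ruby_args, *script, *script_args]
end

cmd = reexec_command("/usr/bin/ruby", ["--yjit"], ["-e", "p 42"], ["arg"])
p cmd
# The result could then be handed to exec(*cmd).
```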

Updated by Eregon (Benoit Daloze) 9 months ago · Edited

I fully agree with the proposal of @Dan0042.
This is also needed for MSpec, which currently works around the lack of it by requiring every ruby option to be passed through -T-option (which is awkward and error-prone). It would be much nicer if we could have RbConfig.ruby_args.

In fact MSpec is also forced to create an extra process due to the lack of this API (which is a noticeable overhead, even more so on Ruby implementations with a slower startup than CRuby), because that's currently the only way to ensure the main process specs and subprocesses created by specs have the same VM options.
We cannot know if the initial process from the mspec executable has the same ruby options as the options passed through -T (typically not), hence the extra process.

Updated by kddnewton (Kevin Newton) 9 months ago

As another note, this would be useful within CRuby itself. Right now there are lots of tests that run assert_in_out_err, which in turn calls EnvUtil.invoke_ruby. EnvUtil.invoke_ruby does not pass along some command-line options like RJIT, YJIT, Prism, etc. So there appear to be some tests that are being run in the CRuby CI that aren't testing what they should be testing.

Updated by Eregon (Benoit Daloze) 8 months ago

@matz (Yukihiro Matsumoto) Do you agree with RbConfig.ruby_args, is it OK to add it?

Updated by matz (Yukihiro Matsumoto) 8 months ago

If RbConfig is a convenient place for you, it is OK to add ruby_args.

Matz.

Updated by nobu (Nobuyoshi Nakada) 8 months ago

RbConfig is for build-time information, and does not look like the right place for runtime information.

Updated by headius (Charles Nutter) 8 months ago

Note that for this to be most effective it would be the arguments unprocessed as they appear on the command line, but that may not be possible to do if the shell removes quoting.

I don't think that should be a reason not to implement this issue, but if there are quoted arguments on the command line they might have to be re-quoted by the user if they are passed through another shell to launch. They should work fine if passed as direct arguments to spawn.

Updated by Dan0042 (Daniel DeLorme) 8 months ago

nobu (Nobuyoshi Nakada) wrote in #note-15:

RbConfig is for build-time information, and does not look like the right place for runtime information.

Isn't it ok to relax the semantics a little bit? RbConfig seems to me the most logical place for "ruby configuration", both run time and build time.

But this does bring the excellent point that RbConfig.ruby is not necessarily the location of the ruby interpreter as I previously thought:

$ ruby -e 'p RbConfig.ruby'
"/opt/ruby/3.2/bin/ruby"
$ cp /opt/ruby/3.2/bin/ruby rubyyyy
$ ./rubyyyy -e 'p RbConfig.ruby'
"/opt/ruby/3.2/bin/ruby"

So it's not quite suitable for re-executing. So we could either

  • change RbConfig.ruby to be the current ruby interpreter (because TBH I'm not sure what's the use of this current RbConfig.ruby)
  • add a new method like RbConfig.ruby_executable
  • use a different namespace like Process.ruby and Process.ruby_args

headius (Charles Nutter) wrote in #note-16:

if there are quoted arguments on the command line they might have to be re-quoted by the user if they are passed through another shell to launch.

Wouldn't you normally use Shellwords for this? The original quoting is not available to ruby anyway.
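
As a small illustration of the Shellwords approach, the sketch below re-quotes an argument array for a POSIX shell and checks that splitting it again recovers the same arguments. The sample flags are arbitrary, and this says nothing about Windows command-line quoting.

```ruby
require "shellwords"

# Arguments as the program would have received them, already unquoted by the shell.
args = ["--enable=jit", "-e", "puts 'hello world'"]

# Shellwords.join escapes each argument so the result is safe to pass
# through a POSIX shell as a single command string.
command_line = Shellwords.join(["ruby", *args])
puts command_line

# Splitting the escaped string recovers the original argument array.
puts Shellwords.split(command_line) == ["ruby", *args]
```

When the arguments are passed directly as an array to spawn or exec, no re-quoting is needed at all.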

Updated by Dan0042 (Daniel DeLorme) 8 months ago · Edited

  • change RbConfig.ruby to be the current ruby interpreter (because TBH I'm not sure what's the use of this current RbConfig.ruby)

@nobu (Nobuyoshi Nakada) what are your thoughts on the above?
For example in the test suite, in test/set/test_sorted_set.rb we can see r = system(RbConfig.ruby, *options, '-e', ruby)
and it seems to me like that's wrong; the system method is executing the installed ruby rather than the compiled ruby that is supposedly under test.

If you think this is correct and it's fine that RbConfig.ruby returns a static path, we need a different place for ruby_args
Or if you're not ok with relaxing the semantics of RbConfig then we also need a different place, maybe Process.ruby_args

Updated by Eregon (Benoit Daloze) 8 months ago

@Dan0042
I think it's everyone's understanding that RbConfig.ruby should always be the path of the currently-running ruby.
In fact it is already the case e.g. on TruffleRuby.
And I suspect it's also already the case on CRuby with --enable-load-relative (but it would be nice if someone can check, if it's not we should fix that).
cp /opt/ruby/3.2/bin/ruby rubyyyy is simply unsupported on non---enable-load-relative CRuby.
Finding the path of the current executable is something that is not available on every platform yet it is supported on all major platforms.

Given the existence of RbConfig.ruby, I think RbConfig.ruby_args is the best fit.

(BTW there is Process.argv0 which is about (Ruby) ARGV[0] and not (C) argv[0], so it seems better to me to put the method somewhere else than Process, to avoid mixing levels there)

Updated by Dan0042 (Daniel DeLorme) 8 months ago

I think it's everyone's understanding that RbConfig.ruby should always be the path of the currently-running ruby.

Yes I believe that is everyone's understanding. At least it was mine. And it turns out to be incorrect. Sure in the vast majority of cases the static install path and the currently-running ruby are going to be the same thing, so one might say it's too small a detail to care about. But I happen to care about small details.

And I suspect it's also already the case on CRuby with --enable-load-relative (but it would be nice if someone can check, if it's not we should fix that).

I tried, and --enable-load-relative doesn't appear to be a supported option in any version of ruby.

Given the existence of RbConfig.ruby, I think RbConfig.ruby_args is the best fit.

I agree.

(BTW there is Process.argv0 which is about (Ruby) ARGV[0] and not (C) argv[0]

I'm afraid not; Process.argv0 is about ruby $0 which is very different from ARGV[0]

Updated by Eregon (Benoit Daloze) 8 months ago

Dan0042 (Daniel DeLorme) wrote in #note-20:

I tried, and --enable-load-relative doesn't appear to be a supported option in any version of ruby.

It's a ./configure option: ./configure --enable-load-relative.
Sorry I should have made that clear.

(BTW there is Process.argv0 which is about (Ruby) ARGV[0] and not (C) argv[0]

I'm afraid not; Process.argv0 is about ruby $0 which is very different from ARGV[0]

Ah right, the name of that method is so confusing (IMO it shouldn't exist, redundant with $0).
It's mostly like argv[0] in C but it returns the Ruby script path being run (vs path of the current executable) and yet it's not ARGV[0].
So sort of related to this issue, but so awfully confusing I don't think we want to follow that unfortunate naming.

I like your proposed naming in https://bugs.ruby-lang.org/issues/6648#note-10 but I think we should add RbConfig.ruby_args before the rest and file a new ticket for the rest.
(re-executing the same script with the same arguments is a special case, there are more use cases for RbConfig.ruby_args)

Updated by Dan0042 (Daniel DeLorme) 8 months ago

IMO it shouldn't exist, redundant with $0

Keep in mind that $0 can be set as process name, so Process.argv0 is not redundant (despite the unfortunate naming).
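
The difference can be seen with a short sketch. The helper name argv0_while_renamed is made up for this example; it temporarily assigns to $0 and records what Process.argv0 reports meanwhile.

```ruby
# Hypothetical helper: temporarily renames the process via $0 and reports
# what Process.argv0 and $0 return before and during the rename.
def argv0_while_renamed(new_name)
  saved = $0
  before = Process.argv0
  $0 = new_name
  { before: before, during: Process.argv0, dollar_zero: $0 }
ensure
  $0 = saved # restore the original program name
end

p argv0_while_renamed("renamed-process")
```

Process.argv0 keeps reporting the original script name, while $0 reflects the new process name.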

I like your proposed naming in https://bugs.ruby-lang.org/issues/6648#note-10 but I think we should add RbConfig.ruby_args before the rest and file a new ticket for the rest.

Agreed. This will also allow me to make a clearer point for the security risk of re-executing $0 when it is equal to "-e"

Updated by mame (Yusuke Endoh) 8 months ago

I am afraid it may be more difficult than expected to "launch subprocess Ruby instances with the same settings".

I am not very familiar with Windows, but I have heard that there is no concept of "an array of command-line arguments" in Windows. A command line is represented as a single string. On Windows, system("exe", "ary1", "ary2") is converted to a single string and executed via the shell (sometimes, I am not sure the condition). This exotic command line argument handling in Windows can lead to vulnerabilities.

What I'm trying to say is, it could be difficult to guarantee exec(RbConfig.ruby, *RbConfig.ruby_args, RbConfig.script, *RbConfig.script_args) will always achieve "launch subprocess Ruby instances with the same settings".

If you really want to "launch subprocess Ruby instances with the same settings", we might want to consider a more dedicated API for it, instead of parsing the command line to a string array and passing it to Kernel#exec.

Updated by Eregon (Benoit Daloze) 8 months ago

@mame (Yusuke Endoh) CRuby already needs to get arguments as an array to parse command-line flags, so RbConfig.ruby_args just exposes that.
If CRuby can parse these Ruby command-line flags, for sure we can save them in some kind of array.

IIRC these extra complications are only relevant in .bat files, the C main still receives an array of arguments on Windows.

Updated by shyouhei (Shyouhei Urabe) 8 months ago

Eregon (Benoit Daloze) wrote in #note-24:

@mame (Yusuke Endoh) CRuby already needs to get arguments as an array to parse command-line flags, so RbConfig.ruby_args just exposes that.
If CRuby can parse these Ruby command-line flags, for sure we can save them in some kind of array.

This is true. Technically we can provide such array. But for what reason? The question is its usage.

IIRC these extra complications are only relevant in .bat files, the C main still receives an array of arguments on Windows.

Background: This is how we execute external process in Windows: https://github.com/ruby/ruby/blob/029d92b8988d26955d0622f0cbb8ef3213200749/win32/win32.c#L1541-L1544
Also background: Windows API for creating a process: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw

So this is not about receiving arguments but calling a process. As you see there is no Windows API that takes char**. We cannot safely pass through what we have. You have to concatenate them into one argument string (LPWSTR lpCommandLine), with proper escaping of whitespace etc. This is where the security concern arises. Because process arguments come from out of the process itself by nature, there is no guarantee that they are written by good will. I have to say it is at least dangerous to "escape" them to be "safe" to pass to a process invoking API. Our current implementation is not ready for that... Is it even possible?

Updated by shyouhei (Shyouhei Urabe) 8 months ago

In short the problem we see is feeding strings from untrusted sources to generic Kernel#exec. Sounds ultra risky, no?

Let's not do so. If what is needed is just launching a ruby process, we could perhaps design a workaround.

Updated by Dan0042 (Daniel DeLorme) 8 months ago

shyouhei (Shyouhei Urabe) wrote in #note-26:

In short the problem we see is feeding strings from untrusted sources to generic Kernel#exec. Sounds ultra risky, no?

It also sounds nothing like what this proposal is about. If the current script was executed with ruby --enable=jit foo.rb then it is, by definition, safe to run exec("ruby", "--enable=jit", "foo.rb")
It's already possible to run exec("ruby", "foo.rb"); changing it to exec("ruby", *RbConfig.ruby_args, "foo.rb") does not reduce security, in fact it increases security.

Updated by Dan0042 (Daniel DeLorme) 8 months ago

As you see there is no Windows API that takes char**. We cannot safely pass through what we have. You have to concatenate them into one argument string (LPWSTR lpCommandLine), with proper escaping of whitespace etc.

But this issue is not specific to RbConfig.ruby_args is it? You have to do the concatenation in exec/system anyway; RbConfig.ruby_args will not change this situation either for better or worse.

Because process arguments come from out of the process itself by nature, there is no guarantee that they are written by good will.

Can you explain that one? I don't understand how valid ruby options like --enable=jit could be "not written by good will".

Updated by shyouhei (Shyouhei Urabe) 8 months ago

Please note that I'm not necessarily against a way to call the current ruby executable. I just say doing so using exec is a bad idea, because exec is not designed for that purpose.

The current situation is that ruby is not the only valid executable that the method takes. Allowing untrusted inputs for it means it has to be secure for everything. This is too much of a hassle. Better find a fine-grained alternative.

Updated by Eregon (Benoit Daloze) 8 months ago

shyouhei (Shyouhei Urabe) wrote in #note-29:

The current situation is that ruby is not the only valid executable that the method takes. Allowing untrusted inputs for it means it has to be secure for everything. This is too much of a hassle. Better find a fine-grained alternative.

There is no untrusted input involved here, because the user chooses what flags to pass to ruby.
If ruby flags can be injected by an attacker, then all is lost regardless of this change (e.g. they can just inject -r/backdoor.rb).

Regarding the Windows concern, it is the exact same problem for e.g. spawn("dir", "*.mp3", "/s").
From what I can see, it is completely separate from this ticket.
The code for this on Windows must already escape as much as feasible, and if it fails it's a bug of that code which should be fixed to fix spawn etc in general, nothing to change in RbConfig.ruby_args.
Or if the escaping fails maybe it's just considered a Windows limitation, independent of this ticket.

For example, the user is running ruby --yjit -rmytracing script.rb, the only addition is the script can now find out the ruby flags it was called with (["--yjit", "-rmytracing"]).
If the script spawns subprocesses, it already did before, so nothing changes there.
It can choose to use RbConfig.ruby_args, and that's fine, the user running the script is responsible for whether it's safe to run the script, as always.

Let's take the MSpec use-case (mentioned before in https://bugs.ruby-lang.org/issues/6648#note-11), what we want is to run Ruby subprocess with the same Ruby flags.
So if e.g. ruby --yjit -rmytracing path/to/mspec is called, then if specs create subprocesses (via ruby_exe()), then those subprocesses (running some fixture) also have --yjit -rmytracing, as desired.
You might argue RUBYOPT could be used instead, but that is problematic for various reasons: some flags are not allowed in RUBYOPT, RUBYOPT gets propagated arbitrarily far which is not necessarily desired (including to other Ruby implementations and executables written in Ruby, etc).
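
To make the contrast with RUBYOPT concrete, here is a minimal sketch of passing flags explicitly to a single child interpreter. The helper name run_ruby_with_flags is hypothetical, and the flags array is supplied by hand because nothing like RbConfig.ruby_args exists yet to provide it.

```ruby
require "rbconfig"

# Hypothetical helper: runs `code` in a child interpreter, passing each flag
# as its own argv entry. No shell is involved and the environment is not
# modified, so the flags do not leak into grandchild processes.
def run_ruby_with_flags(flags, code)
  IO.popen([RbConfig.ruby, *flags, "-e", code], &:read)
end

# -W0 silences warnings in the child, which shows up as $VERBOSE being nil.
print run_ruby_with_flags(["-W0"], "p $VERBOSE")
```

With the proposed API, the hand-written flags array would presumably be replaced by whatever the parent process was actually started with.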

A concrete example I often run into is running ruby/spec with TruffleRuby,
I pass --core-load-path=.../src/main/ruby/truffleruby to use core library files from disk in development.
It is critical that subprocesses in specs also use that (otherwise we'd get an inconsistent core library).
The current workaround is to pass that flag both to the ruby process and as -T, which is quite ugly and also slow, because it means mspec must create an extra subprocess just to apply these -T flags (it actually uses exec but that's just as slow):

$ mxbuild/truffleruby-jvm/bin/ruby \
  --experimental-options --core-load-path=src/main/ruby/truffleruby \
  spec/mspec/bin/mspec run \
  --config spec/truffleruby.mspec \
  -t .../mxbuild/truffleruby-jvm/bin/ruby \
  --excl-tag fails --excl-tag slow \
  -T--vm.ea -T--vm.esa \
  -T--experimental-options -T--core-load-path=src/main/ruby/truffleruby

(as you can see there are already some bugs there because the -T and regular flags don't match exactly).
(if you think this could be wrapped in some helper script, it already is, but it changes nothing because it must accept arbitrary ruby flags to be passed)

With RbConfig.ruby_args, MSpec can know which ruby flags it was passed, which would avoid needing the extra subprocess, and it would be:

$ mxbuild/truffleruby-jvm/bin/ruby \
  --vm.ea --vm.esa \
  --experimental-options --core-load-path=src/main/ruby/truffleruby \
  spec/mspec/bin/mspec run \
  --config spec/truffleruby.mspec \
  -t .../mxbuild/truffleruby-jvm/bin/ruby \
  --excl-tag fails --excl-tag slow

This would be a killer feature when attaching a debugger, because then one could just myruby -rmydebug spec/mspec/bin/mspec and it would start running specs with the debugger (e.g. for TruffleRuby with the Java debugger).
Instead of the current situation where the debugger is started on this mspec "wrapper" which just exec's to handle -T flags and is very annoying.

This happens in CRuby just as much, for example https://github.com/ruby/ruby/blob/69c0b1438a45938e79e63407035f116de4634dcb/spec/default.mspec#L27-L31 is a workaround causing some duplication.
And it makes it much messier to e.g. run a single spec under gdb/lldb.

A very similar situation happens for make test-all I would imagine.
I guess currently any Ruby subprocess incorrectly omits Ruby flags, which means the test coverage is lower than intended (e.g. for --yjit, --rjit and all other flags).
This would also be convenient for the way to run built-but-not-installed-ruby, which every CRuby developer uses.

Updated by Eregon (Benoit Daloze) 8 months ago

mame (Yusuke Endoh) wrote in #note-23:

[...] instead of parsing the command line to a string array and passing it to Kernel#exec.

Don't we use execve() (which takes char**) as well on Windows for Kernel#exec?
It does seem to exist: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/execve-wexecve?view=msvc-170

Did you mean Kernel#spawn/Kernel#system maybe?

Of course, RbConfig.ruby_args would return an Array, not a String.

Updated by Eregon (Benoit Daloze) 8 months ago

And from a quick look there is also _spawnv which does take a char**, maybe we could use that on Windows?

Updated by nobu (Nobuyoshi Nakada) 6 months ago

Eregon (Benoit Daloze) wrote in #note-31:

mame (Yusuke Endoh) wrote in #note-23:

[...] instead of parsing the command line to a string array and passing it to Kernel#exec.

Don't we use execve() (which takes char**) as well on Windows for Kernel#exec?
It does seem to exist: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/execve-wexecve?view=msvc-170

Kernel#exec is not implemented on Windows.

Did you mean Kernel#spawn/Kernel#system maybe?

Argument passing is the same for them.

Eregon (Benoit Daloze) wrote in #note-32:

And from a quick look there is also _spawnv which does take a char**, maybe we could use that on Windows?

_spawnv is not an API but just a wrapper around CreateProcess, as is rb_w32_uspawn, but it lacks some arguments.
We use rb_w32_uspawn and rb_w32_uspawn_flags to support redirection and a few flags.
Our functions are designed based on _spawnv in msvcrt, so the security concern is similar.
The argument handling would work fine for many commands using msvcrt or ruby, but not all.

mame (Yusuke Endoh) wrote in #note-23:

If you really want to "launch subprocess Ruby instances with the same settings", we might want to consider a more dedicated API for it, instead of parsing the command line to a string array and passing it to Kernel#exec.

This means a restricted method to invoke only ruby not to run other arbitrary commands, which can interpret the arguments wrongly, by "a more dedicated API", I think.

Updated by Dan0042 (Daniel DeLorme) 6 months ago · Edited

mame (Yusuke Endoh) wrote in #note-23:

If you really want to "launch subprocess Ruby instances with the same settings", we might want to consider a more dedicated API for it, instead of parsing the command line to a string array and passing it to Kernel#exec.

A "more dedicated API" sounds not so good to me. It's more complex and also less versatile. One of the use cases was certainly to (re-)execute the current ruby, but in #note-10 I also mentioned executing a different version of ruby, like exec("ruby-3.0", *filter_for_ruby30(RbConfig.ruby_args), script). This would not be possible with a dedicated API. Not to mention other potential uses like RbConfig.ruby_args.include?('-v') to know if the version number was printed. Imho it's better to keep it simple and composable.

shyouhei (Shyouhei Urabe) wrote in #note-29:

Please note that I'm not necessarily against a way to call the current ruby executable. I just say doing so using exec is a bad idea, because exec is not designed for that purpose.

Maybe that's true on Windows, but on Unix exec is very much the normal and blessed way to re-execute the current program. I understand there are challenges on Windows, but they are inherent to the existing system/spawn methods, and will not change regardless of adding RbConfig.ruby_args. I don't think it's a good idea to hobble Unix just because Windows has some design flaws.

nobu (Nobuyoshi Nakada) wrote in #note-33:

Kernel#exec is not implemented on Windows.

Then maybe RbConfig.ruby_args should not be implemented on Windows if it's such a problem? (Though I don't see why.)

Updated by byroot (Jean Boussier) 5 months ago

This need came up again in https://github.com/rubygems/rubygems/pull/7933.

AFAICT there is currently no good way for a program to re-exec itself with all the same arguments.

Updated by byroot (Jean Boussier) 5 months ago

As for the API, my two cents is that it would make the most sense in Process. e.g. Process.argv, such as:

$ ruby --yjit -e "p Process.argv"
["--yjit", "-e", "p Process.argv"]

So a self re-exec would be:

Process.exec(RbConfig.ruby, *Process.argv)

Updated by kddnewton (Kevin Newton) 5 months ago

It might make sense to split this ticket into two requests: one for Process.argv or similar and one for re-executing the current process with the same or additional flags (Process.reexec??). These seem like slightly different requests, and Process.argv seems like it might be slightly more difficult to implement.

Updated by byroot (Jean Boussier) 5 months ago

Needs a bit of polish (e.g. doc and spec), but that's essentially what I think we should do: https://github.com/ruby/ruby/pull/11370

Updated by Dan0042 (Daniel DeLorme) 5 months ago

byroot (Jean Boussier) wrote in #note-36:

As for the API, my two cents is that it would make the most sense in Process. e.g. Process.argv, such as:

I don't mind using "Process", but the "argv" name is really ambiguous. For the command ruby --yjit script.rb --arg we'd have:

main(char **argv) => ["ruby", "--yjit", "script.rb", "--arg"]
Process.argv      => ["--yjit", "script.rb", "--arg"]
ARGV              => ["--arg"]
Process.argv0     => "script.rb" (not element 0 of any of the 3 arrays above)

The name "argv" already has 3 different conflicting meanings, so adding a 4th one is a bit too much imho.

byroot (Jean Boussier) wrote in #note-38:

Needs a bit of polish (e.g. doc and spec), but that's essentially what I think we should do: https://github.com/ruby/ruby/pull/11370

This works great for re-executing the current process, but not for the original problem described in OP to "launch subprocess Ruby instances with the same settings". It's definitely a bit more complicated to extract just ["--yjit"] from the example above, and requires changing proc_options in ruby.c
...I'll try to see if I can whip up something.

Updated by nobu (Nobuyoshi Nakada) 5 months ago

-C option arguments are cumulative and can be relative paths.
I don't think ruby -C subdir -e 'exec(*Process.argv)' would work as expected.

Updated by Eregon (Benoit Daloze) 4 months ago

  • Related to Misc #20739: Test suite does not carry over CLI options added

Updated by Eregon (Benoit Daloze) 4 months ago

@nobu (Nobuyoshi Nakada) True, it would also be helpful to have a way to capture the original CWD.
Then that example would work just fine with ruby -C subdir -e 'exec(RbConfig.ruby, *Process.ruby_args, "-e", "p :OK", chdir: Process.original_working_directory)'
(using the naming from https://bugs.ruby-lang.org/issues/6648#note-10 but with Process).

@byroot (Jean Boussier) 's PR shows it's quite easy to get this information (although that's for all args, not just ruby command line/vm args, but that should be easy to fix): https://github.com/ruby/ruby/pull/11370

Updated by nobu (Nobuyoshi Nakada) 2 months ago

Eregon (Benoit Daloze) wrote in #note-42:

@nobu (Nobuyoshi Nakada) True, it would also be helpful to have a way to capture the original CWD.
Then that example would work just fine with ruby -C subdir -e 'exec(RbConfig.ruby, *Process.ruby_args, "-e", "p :OK", chdir: Process.original_working_directory)'
(using the naming from https://bugs.ruby-lang.org/issues/6648#note-10 but with Process).

How to "capture the original CWD"?
Keeping a fd open is uselessly expensive in many cases, I think.
Saving a path string is not reliable.
I think the method to run without -C options in the CWD is better.

Updated by Eregon (Benoit Daloze) 2 months ago

nobu (Nobuyoshi Nakada) wrote in #note-43:

Saving a path string is not reliable.

Why not?
Are you thinking if the directory is removed? In that case there is no way to rerun the command faithfully, so an error like Errno::ENOENT is fine.

Updated by nobu (Nobuyoshi Nakada) about 1 month ago

Eregon (Benoit Daloze) wrote in #note-44:

Are you thinking if the directory is removed? In that case there is no way to rerun the command faithfully, so an error like Errno::ENOENT is fine.

Removed or renamed.
It can rerun fine by removing -C options and staying in the current directory.

Updated by Eregon (Benoit Daloze) about 1 month ago

nobu (Nobuyoshi Nakada) wrote in #note-45:

Removed or renamed.
It can rerun fine by removing -C options and staying in the current directory.

But current directory might have changed e.g. by Dir.chdir, so using Dir.pwd (explicitly or implicitly) is no good, it's the same as ARGV vs "original ARGV".
From that I think it's clear we should save the original arguments (Ruby options and Ruby user arguments) as well as the original CWD (as a String).

Of course in the worst case CWD might no longer exist/renamed or even .../bin/ruby has been removed, but those are not the cases we are interested in with this feature (there are intrinsic limitations, they are fine).
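
A rough sketch of the "save the original CWD as a String" idea follows. Since Process.original_working_directory does not exist, a local variable captured at startup stands in for it, and child_pwd is a hypothetical helper name.

```ruby
require "rbconfig"
require "tmpdir"

# Stand-in for the proposed Process.original_working_directory:
# captured as early as possible, before the program changes directory.
original_cwd = Dir.pwd.freeze

# Hypothetical helper: starts a child interpreter in `dir` and returns
# the working directory the child reports.
def child_pwd(dir)
  IO.popen([RbConfig.ruby, "-e", "print Dir.pwd"], chdir: dir, &:read)
end

Dir.mktmpdir do |tmp|
  Dir.chdir(tmp) do
    # The parent has moved, yet the child still starts where the parent began.
    puts child_pwd(original_cwd) == original_cwd
  end
end
```

If the saved directory has been removed or renamed in the meantime, the child cannot be started there, which is the failure mode being debated in this part of the thread.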

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

I'm in favor of not storing -C in ruby_args. In general, -C is mutually exclusive with the program doing its own Dir.chdir. Either the program changes its working directory to "x", or if it doesn't then you'd use -C x as a workaround.

In the odd case that both -C dir1 and Dir.chdir("dir2") are used, I think that "dir2" should be the one used in subsequent #exec calls.

But ideally we would have some real-world examples of how -C is used in order to guide this decision.

Updated by deivid (David Rodríguez) about 1 month ago

I agree with @Eregon (Benoit Daloze) and I believe ideally we would store -C in ruby_args but also provide a way to access the original working directory. That way people can choose where to re-run the original command.

In the case of Bundler, we would use this feature when we detect that a different version of Bundler is running than the one in the Gemfile.lock file. In that case, we would do some environment manipulation to make sure the locked version is actually used in a subsequent process, and then re-run the original process. In our case, I think it would only make sense to re-run the original command in the original directory.

Alternatively, we could do some manipulation of original -C arguments to expand them to a single value with the absolute path where Ruby ends up running. But I like the Process.original_working_directory idea better.
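The "expand -C to a single absolute path" alternative could look roughly like the sketch below. resolve_dash_c is a hypothetical helper, and it assumes its input contains only interpreter options (nothing after the script name) and only the "-C dir" and "-Cdir" spellings; real Ruby option parsing is more involved.

```ruby
# Hypothetical helper: fold every -C option into one absolute directory.
# Assumes `args` holds interpreter options only, and that `original_cwd`
# is the directory Ruby was launched from.
def resolve_dash_c(args, original_cwd)
  dir = original_cwd
  rest = []
  i = 0
  while i < args.length
    arg = args[i]
    if arg == "-C" && i + 1 < args.length
      # "-C dir": each -C is resolved relative to the result of the previous one.
      # Note: File.expand_path also expands a leading "~", which -C itself may not.
      dir = File.expand_path(args[i + 1], dir)
      i += 2
    elsif arg.start_with?("-C") && arg.length > 2
      # "-Cdir": directory attached to the flag.
      dir = File.expand_path(arg[2..-1], dir)
      i += 1
    else
      rest << arg
      i += 1
    end
  end
  [dir, rest]
end

final_dir, other_flags = resolve_dash_c(["-w", "-C", "a", "-C../b"], "/work")
```

A re-exec could then pass other_flags unchanged and use final_dir as the chdir: target, which sidesteps the question of what a relative -C means after the CWD has moved.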

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

deivid (David Rodríguez) wrote in #note-48:

In the case of Bundler, we would use this feature when we detect that a different version of Bundler is running than the one in the Gemfile.lock file. In that case, we would do some environment manipulation to make sure the locked version is actually used in a subsequent process, and then re-run the original process. In our case, I think it would only make sense to re-run the original command in the original directory.

I don't follow the logic here. You would re-run the original command in the original directory, just so that it can change back to the directory specified by -C. What for? Why not just stay in the CWD if you're going to wind up back there anyway? What's the point of using Process.original_working_directory and -C to cancel each other?

Oh, and here's an interesting tidbit I just learned that might impact how to handle this: Ruby loads the script after changing to the dir specified by -C. That means ruby -C foo x.rb loads the script foo/x.rb. And when you re-execute you have to be in directory foo (or expand the script path).
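The load-order behavior can be checked with a small throwaway experiment. This sketch builds a temporary layout in which x.rb exists only inside foo, then launches Ruby from the parent directory with -C foo; the helper name and directory names are made up for the demonstration.

```ruby
require "rbconfig"
require "tmpdir"
require "open3"

# Create <tmp>/foo/x.rb (and no <tmp>/x.rb), then run `ruby -C foo x.rb` from <tmp>.
def run_with_dash_c
  Dir.mktmpdir do |tmp|
    Dir.mkdir(File.join(tmp, "foo"))
    File.write(File.join(tmp, "foo", "x.rb"), "puts File.basename(Dir.pwd)\n")
    out, status = Open3.capture2(RbConfig.ruby, "-C", "foo", "x.rb", chdir: tmp)
    [out.strip, status.success?]
  end
end

result = run_with_dash_c
```

If the script path were resolved before the directory change, the child would fail to find x.rb; a successful run that reports foo as its working directory indicates the path is resolved afterwards.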

Updated by deivid (David Rodríguez) about 1 month ago

I don't follow the logic here. You would re-run the original command in the original directory, just so that it can change back to the directory specified by -C. What for? Why not just stay in the CWD if you're going to wind up back there anyway? What's the point of using Process.original_working_directory and -C to cancel each other?

Well, that's the only safe way, right? What if "user code" changed to a different directory before we restart? Then -C will have a different effect if it's relative.

Updated by deivid (David Rodríguez) about 1 month ago

But to be honest, I get your point about hearing from real-world use cases of -C. I've never seen it used.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

deivid (David Rodríguez) wrote in #note-50:

What if "user code" changed to a different directory before we restart? Then -C will have a different effect if it's relative.

That's why I don't want -C in ruby_args at all. What if user code changed to a different directory and intended for that to be the CWD upon re-exec? Then -C would break that intent. The way I see it, the CWD is part of the environment; in the same way that a re-exec inherits any changes to the ENV variables, a re-exec should inherit any changes to the CWD.
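The "CWD is part of the environment" point can be seen with a short sketch: a child Ruby started after Dir.chdir begins in the parent's current directory, much as it would see the parent's current ENV. The helper name is made up for illustration.

```ruby
require "rbconfig"
require "tmpdir"
require "open3"

# Change directory in the parent, then ask a child Ruby where it starts.
def child_cwd_after_chdir
  Dir.mktmpdir do |tmp|
    Dir.chdir(tmp) do
      out, _status = Open3.capture2(RbConfig.ruby, "-e", "print Dir.pwd")
      [out, Dir.pwd]
    end
  end
end

child_dir, parent_dir = child_cwd_after_chdir
```

Here child_dir and parent_dir should match, which is why re-adding a relative -C on top of an already changed CWD would move the child somewhere the parent never was.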

But anyway, I was talking specifically about your Bundler example; I didn't see the point of -C in that specific example. Bundler executes right from the start; there's no opportunity for "user code" to change to a different directory, is there?

Updated by headius (Charles Nutter) about 1 month ago

I am in favor of including -C in the argv API along with every other flag that was passed at the command line. The purpose of getting the list of arguments is not solely for relaunching, it is also to be able to reprocess that list and act on it within the same process.

The relaunching API, which I agree would make sense as a separate API call, would have more smarts for launching a new process with flags equivalent to this process's. That may mean that flags which change global state, like -C, must be handled differently.

Someone here also mentioned having access to the original state in which the runtime was launched, such as the original working directory before -C or chdir calls have happened. Original environment probably should be included as well.

My goal with these APIs is to make it transparent to Ruby code exactly how the current instance of Ruby was launched, along with providing a standard way to relaunch the same runtime in the same way. The argv API should come first.

Updated by deivid (David Rodríguez) about 1 month ago · Edited

But anyway, I was talking specifically about your Bundler example; I didn't see the point of -C in that specific example. Bundler executes right from the start; there's no opportunity for "user code" to change to a different directory, is there?

The entrypoint to Bundler is not only the bundle CLI, but also a require of bundler/setup, for example. And sometimes stuff happens before that.

But you're right that my example was not directly related to -C.

Overall I don't think whether -C is included or not will affect Bundler a lot, because of how little usage -C seems to have, but it still feels better to me to include it since, after all, it's part of the original argv. If a user of Process.argv does not need it or wants to ignore it, it can be filtered out manually.
