Feature #19972

Install default/bundled gems into dedicated directories

Added by vo.x (Vit Ondruch) 6 months ago. Updated 5 months ago.

Status: Assigned
Target version: -
[ruby-core:115165]

Description

I think that the current situation, where the same directory (let's call it Gem.default_dir) is used for default/bundled gems as well as for user-installed gems, is suboptimal. Over time, this has caused us quite a few issues on Fedora. Historically, we redefined Gem.default_dir to point to the user's home directory, to avoid mixing system gems and user-installed gems. Unfortunately, with the advent of default/bundled gems, we ran into issues such as these gems suddenly not being listed. I am realizing the full extent of this issue once again now that the "user install" RubyGems feature has landed [1]. I also think that we have arrived at this situation by evolution, not by design.

Therefore my proposal is:

Keep Gem.default_dir for gems the user installs via gem install, and let's install default and bundled gems into separate dedicated directories, with separate Gem.bundled_gems_dir and Gem.default_gems_dir structures.

Of course, if Gem.default_dir == Gem.bundled_gems_dir == Gem.default_gems_dir, we still can have the current layout.
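
The fallback described above can be illustrated with a minimal sketch. It is not RubyGems code: the helper names `gem_layout` and `search_path` and the example paths are hypothetical, and `bundled_gems_dir` / `default_gems_dir` are only the names proposed in this ticket.

```ruby
# Hypothetical model of the proposed layout; nothing here is RubyGems API.
# The two new directories fall back to default_dir when not configured.
def gem_layout(default_dir:, bundled_gems_dir: nil, default_gems_dir: nil)
  {
    default_dir: default_dir,
    bundled_gems_dir: bundled_gems_dir || default_dir,
    default_gems_dir: default_gems_dir || default_dir,
  }
end

# Directories to search, with duplicates removed.
def search_path(layout)
  layout.values.uniq
end

# Example paths are made up for illustration.
current = gem_layout(default_dir: "/usr/lib/ruby/gems/3.3.0")
split = gem_layout(
  default_dir: "/usr/lib/ruby/gems/3.3.0",
  bundled_gems_dir: "/usr/lib/ruby/bundled_gems/3.3.0",
  default_gems_dir: "/usr/lib/ruby/default_gems/3.3.0"
)

puts search_path(current).inspect
puts search_path(split).inspect
```

When all three settings are equal, the search path collapses to a single directory, which is the current layout.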

I have a simple POC here:

https://github.com/ruby/ruby/pull/8761

BTW I have reported it here because I think that RubyGems already provides all that is needed, so it is not a RubyGems ticket after all. However, I believe that RubyGems could benefit from this in the long term, and some simplifications/cleanups would become possible.


Related issues 2 (2 open, 0 closed)

Related to Ruby master - Feature #14737: Split default gems into separate directory structure (Assigned, hsbt (Hiroshi SHIBATA))
Related to Ruby master - Feature #5617: Allow install RubyGems into dediceted directory (Assigned, hsbt (Hiroshi SHIBATA))

Updated by Eregon (Benoit Daloze) 6 months ago

I think this makes sense and would be better/clearer and make it easier if a user wants to remove all user-installed gems.
We should make sure it does not hurt application startup since it might have to look in more places, but probably insignificant since this search is likely only done on gem activation time or so.

Some remarks:

  • Default gems have almost no files, e.g. lib/ruby/gems/3.2.0/gems/fcntl-1.0.2 is empty and lib/ruby/gems/3.2.0/gems/racc-1.6.2 only has bin/racc. But still it definitely feels better to separate those. BTW gem install'ing the same version tends to cause trouble; RubyGems should probably reject that and just use the default gem if it is the same version. Also this might remove the need for specifications/default which is a bit odd.
  • Bundled gems are supposed to be "just like another gem, they just happen to be shipped with Ruby without needing gem install". Maybe those make sense to keep together with user gems? Although from the POV that they are sort of "stdlib"/"standard gems" it makes sense to separate them.
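
To see what currently counts as a default gem on a given installation, something like the following should work. It is a rough sketch that relies only on existing RubyGems calls (`Gem.default_dir`, `Gem::Specification#default_gem?`, `#loaded_from`); the `specifications/default` location is built by hand here and assumes the stock layout.

```ruby
require "rubygems"

# Where default gem specifications live in the stock layout
# (assumption: the distribution has not moved them).
default_specs_dir = File.join(Gem.default_dir, "specifications", "default")

# Specifications that RubyGems itself flags as default gems.
default_gems = Gem::Specification.select(&:default_gem?)

puts "default specifications dir: #{default_specs_dir}"
default_gems.sort_by(&:name).each do |spec|
  puts "  #{spec.full_name} (#{spec.loaded_from})"
end
```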

Updated by vo.x (Vit Ondruch) 6 months ago

Eregon (Benoit Daloze) wrote in #note-1:

Also this might remove the need for specifications/default which is a bit odd.

I have PR opened for this for a long time:

https://github.com/rubygems/rubygems/pull/2909

However, I now believe that separating default gems into an independent directory would be a better starting point.

  • Bundled gems are supposed to be "just like another gem, they just happen to be shipped with Ruby without needing gem install". Maybe those make sense to keep together with user gems? Although from the POV that they are sort of "stdlib"/"standard gems" it makes sense to separate them.

I am looking at this problem mainly from the Fedora POV, where we manage these directories via RPM and want to manage them via RPM only, as much as possible. StdLib certainly. And it should be in a different place from the other RPM-managed gems IMHO. I think it might be useful to easily see the distinction. And it certainly needs to be a different place than gem-installed gems.

Updated by jeremyevans0 (Jeremy Evans) 6 months ago

  • Tracker changed from Bug to Feature
  • ruby -v deleted (ruby 3.3.0dev (2023-10-24 master c44d65427e) [x86_64-linux])
  • Backport deleted (3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN)

Updated by rubyFeedback (robert heiler) 6 months ago

I also like the overall idea - aka to have more fine-tuned control. While I don't run
into the same issues described by vo.x, I do have to work in restricted areas every
now and then; university campus sites are a good example, where I tend to have a
Linux system with mostly just my home directory available. These tend to have outdated
ruby installations.

I get around that issue by simply changing $PATH and using a self-compiled ruby in
my $HOME directory then. But sometimes it is a bit confusing since you end up with
multiple different ruby versions and gem versions as well. I understand the use case
of RPM wanting to present a unified query/install/remove setup for the user.

I also think that we have arrived at this situation by evolution, not by design.

I think so too. Later changes sometimes invalidate or simplify prior assumptions.
See bundler's integration into rubygems; before that I remember drbrain/dblack
(I forgot the nick right now) pointing out how they use partially overlapping
functionality (that was before the merge, and I am still not sure if the merge is
totally complete now. I am mostly using gems, so I have no idea about bundler).

I have not thought about whether the proposal he made comes with disadvantages or
trade-offs, but the general idea to allow for more fine-tuned control may be useful.

BTW I have reported it here because I think that RubyGems already provides all
that is needed, so it is not a RubyGems ticket after all.

I think it is ok to suggest it on MRI here because it also asks the
ruby core devs about real world usage of gems (and problems). Again,
I have no idea whether the proposed suggestion solves this issue or
not, but I personally can relate to some of the issues pointed out.

The current way can be confusing, also because as in the example
provided, you don't easily know where ruby finds stuff in such
dual-ruby setups in a restricted setting. A user may install gems
via --user-install but perhaps due to some odd situation may pick
up other gems from another directory. For instance, when I install
JRuby, it also uses the same directory that I used for my self-compiled
MRI on my home system; then it reports that ruby-gtk3 related gems
need to be recompiled, but these were installed by MRI ruby (and
don't work on JRuby anyway). So this is also confusing to me, and
perhaps to other ruby users too (if they have a comparable use case).

I am looking at this problem mainly from the Fedora POV, where we
manage these directories via RPM and we want to manage these
directories via RPM only as much as possible. StdLib certainly.
And it should be different place from the other RPM managed gems
IMHO

I think Debian has had similar issues in the past too. So that's
relatable. GoboLinux (and NixOS) even manage versioned directories
for all applications installed. So one would have e.g. /Programs/Ruby/2.2.0/,
/Programs/Ruby/3.2.1/ and so forth. Managing subdirectories makes sense
as well. I guess the biggest question I have is how users can query
all of this easily.

The single most used command I tend to have for gem, aside from "gem install
foobar", is "gem env". Should this show additional information? For instance,
it could report which gems are installed where, via some new flag or so,
e.g. "gem env extended" or "gem env full" or whatever else fits.
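
As a rough approximation of such a report, existing RubyGems calls can already group installed gems by the directory they live in. This is only a sketch: no "gem env extended" command exists, and the output format below is made up.

```ruby
require "rubygems"

# Group every specification RubyGems can see by its gem directory.
gems_by_dir = Gem::Specification.group_by(&:base_dir)

gems_by_dir.sort_by { |dir, _specs| dir }.each do |dir, specs|
  puts "#{dir} (#{specs.size} gems)"
  specs.sort_by(&:full_name).each do |spec|
    # Flag default gems so they stand out from regularly installed ones.
    marker = spec.default_gem? ? " [default]" : ""
    puts "  #{spec.full_name}#{marker}"
  end
end
```

With the current layout, default, bundled and system-wide installed gems would all show up under the same directory, which is the mixing this ticket is about.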

This probably needs a few iteration steps until it works well. Perhaps Vit
can suggest some proposed steps, including which methods should work.
We have Gem.default_dir right now but not Gem.bundled_gems_dir (and I am
not sure about the API there, perhaps the name should be different; I can't
think of a better name right now though).

Updated by Dan0042 (Daniel DeLorme) 6 months ago

I'm not against the idea, but there are already quite a few directories for ruby libraries and TBH it's getting a bit confusing.

lib/ruby/3.2.0
lib/ruby/gems/3.2.0
lib/ruby/site_ruby/3.2.0/x86_64-linux
lib/ruby/vendor_ruby/3.2.0/x86_64-linux
lib/ruby/default_gems/3.2.0 new!!!

from https://stdgems.org/

  • Default gems: These gems are part of Ruby and you can always require them directly. You cannot remove them. They are maintained by Ruby core.
  • Bundled gems: The behavior of bundled gems is similar to normal gems, but they get automatically installed when you install Ruby. They can be uninstalled and they are maintained outside of Ruby core.

According to the above, bundled gems are closer in behavior to user gems than to default gems; does it really make sense to mix bundled+default?

From my reading of the situation, it seems like there should be

  1. a directory for default gems
  2. the regular "gems" directory for bundled gems and root-installed gems
  3. Gem.user_dir for user-installed gems

Updated by vo.x (Vit Ondruch) 6 months ago

Dan0042 (Daniel DeLorme) wrote in #note-5:

I'm not against the idea, but there are already quite a few directories for ruby libraries and TBH it's getting a bit confusing.

lib/ruby/3.2.0
lib/ruby/gems/3.2.0
lib/ruby/site_ruby/3.2.0/x86_64-linux
lib/ruby/vendor_ruby/3.2.0/x86_64-linux

These are for Ruby libraries as you say. Not for gems. On Fedora, we are occasionally using the vendor dir for our packages, but we prefer the gemified versions of libraries, no matter if RPM managed or gem installed.

For gems, the paths are

$ gem env

... snip ...

  - GEM PATHS:
     - /usr/local/lib/ruby/gems/3.3.0+0
     - /builddir/.local/share/gem/ruby/3.3.0+0

... snip ...

Although RubyGems also knows about the vendor dir, which is not even listed by gem env.

lib/ruby/default_gems/3.2.0 new!!!

Yep, something like this would be listed among the GEM PATHS by the command above. Plus also bundled_gems dir.

from https://stdgems.org/

  • Default gems: These gems are part of Ruby and you can always require them directly. You cannot remove them. They are maintained by Ruby core.
  • Bundled gems: The behavior of bundled gems is similar to normal gems, but they get automatically installed when you install Ruby. They can be uninstalled and they are maintained outside of Ruby core.

I don't think that https://stdgems.org/ is relevant here, because it is a third-party site. The proper documentation is here:

https://github.com/ruby/ruby/blob/14bf7164a69944b4e54aa2502cb5749d700505e5/doc/standard_library.rdoc?plain=1#L24
https://github.com/ruby/ruby/blob/14bf7164a69944b4e54aa2502cb5749d700505e5/doc/standard_library.rdoc?plain=1#L107

And there are differences such as: "They can be uninstalled" vs "They can be uninstallable from Ruby installation."

According to the above, bundled gems are closer in behavior to user gems than to default gems; does it really make sense to mix bundled+default?

My proposal is to install them into separate directories. But maybe the remark was not directed to me :)

From my reading of the situation, it seems like there should be

  1. a directory for default gems
  2. the regular "gems" directory for bundled gems and root-installed gems
  3. Gem.user_dir for user-installed gems

My proposal is to make this flexible enough to allow the situation you describe, although I'd prefer:

  1. a directory for default gems
  2. a directory for bundled gems
  3. a directory for system-wide installed gems
  4. a directory for user-installed gems

Please note that I am deliberately not using terms such as "root-installed gems" or "Gem.user_dir for user-installed gems", to avoid too many implications. And if you look at my POC, it actually does not change anything in RubyGems; it merely changes configuration.
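
To make the four categories concrete, here is a small sketch that classifies a gem's install path against such a layout. Everything in it is hypothetical: the `classify` helper, the category names and the example paths are only for illustration and are not part of the POC or of RubyGems.

```ruby
# Hypothetical four-way layout; paths are examples only.
layout = {
  default: "/usr/lib/ruby/default_gems/3.3.0",
  bundled: "/usr/lib/ruby/bundled_gems/3.3.0",
  system:  "/usr/share/gems",
  user:    "/home/alice/.local/share/gem/ruby/3.3.0",
}

# Return the first category whose directory contains gem_path.
def classify(gem_path, layout)
  layout.each do |kind, dir|
    return kind if gem_path == dir || gem_path.start_with?(dir + File::SEPARATOR)
  end
  :unknown
end

puts classify("/usr/lib/ruby/bundled_gems/3.3.0/gems/rake-13.1.0", layout)
puts classify("/opt/elsewhere/gems/foo-1.0", layout)
```

If several categories are configured to the same directory, the first match wins, so the distinction simply disappears, as in the current layout.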

Updated by hsbt (Hiroshi SHIBATA) 6 months ago

  • Status changed from Open to Assigned
  • Assignee set to hsbt (Hiroshi SHIBATA)

Updated by hsbt (Hiroshi SHIBATA) 6 months ago

  • Related to Feature #14737: Split default gems into separate directory structure added

Updated by vo.x (Vit Ondruch) 5 months ago

I have made sure the test suite is green with this configuration. Although I am still not really convinced about the workaround I have put in place for the last remaining test failures:

https://github.com/rubygems/rubygems/pull/7187

Updated by martinemde (Martin Emde) 5 months ago

I like the proposal, and wanted to comment about naming.

I know we’ve called them bundled gems for a while and it’s documented that way, but if we solidify this in a directory name it may be confusing for users of bundler. Finding path names in backtraces that show “bundled_gems” for certain gems could be very confusing when they are not related to the bundler bundled gems.

If we end up with a separate directory, can we use a name that is unambiguous? packaged_gems was the first name that occurred to me but I’ll leave it up to the implementor to decide what’s best.

Updated by vo.x (Vit Ondruch) 5 months ago

Naming is hard, obviously. When you mentioned that, vendored_gems immediately popped into my mind, but there is already the --vendor option. The next thing I could think of was to shift the meaning a bit and rename default_gems => std_lib gems and bundled_gems => default_gems, which would have its own share of issues.

Nevertheless, I am not that afraid of confusion with Bundler.

Updated by Dan0042 (Daniel DeLorme) 5 months ago

Although RubyGems also knows about the vendor dir, which is not even listed by gem env.

I think it's only listed if present

  - GEM PATHS:
     - /opt/ruby/3.2/lib/ruby/gems/3.2.0
     - /home/dan42/.gem/ruby/3.2.0
     - /opt/ruby/3.2/lib/ruby/vendor_ruby/gems/3.2.0
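
A quick way to check this on a given installation is sketched below. It uses the existing `Gem.vendor_dir` and `Gem.path` calls; the report format is made up for illustration, and the result depends on how the local Ruby was configured.

```ruby
require "rubygems"

# Gem.vendor_dir may be nil when Ruby was built without a vendor dir.
vendor = Gem.vendor_dir

report = {
  vendor_dir: vendor,
  exists: !vendor.nil? && File.directory?(vendor),
  in_gem_path: !vendor.nil? && Gem.path.include?(vendor),
}

report.each { |key, value| puts "#{key}: #{value.inspect}" }
```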

My proposal is to make this flexible enough to allow the situation as you describe

I definitely appreciate this flexibility and I think it's a great design.

Please note that I am deliberately not using terms such as "root-installed gems"

And yet we need a word to describe those gems. If we have "default" gems and "bundled" gems, then what are "gems that are installed via rubygems to the lib/ruby/gems/ dir that may actually be configured to be something else" ? I used "root-installed" gems as the closest thing I could think of, but maybe "site" gems is the best word here?

The next thing I could think of was to shift the meaning a bit and rename default_gems => std_lib gems and bundled_gems => default_gems, which would have its own share of issues.

I had a similar thought; what about using a "gems" subdir for everything?

lib/ruby/                   for stdlib
lib/ruby/gems/              for stdgems (default (and bundled?) gems)
         vendor_ruby/  
         vendor_ruby/gems/  already the case for `gem install --vendor`
         site_ruby/
         site_ruby/gems/    for regular `gem install`

Updated by hsbt (Hiroshi SHIBATA) 5 months ago

  • Related to Feature #5617: Allow install RubyGems into dediceted directory added