Feature #16428

open

Add Array#uniq?, Enumerable#uniq?

Added by kyanagi (Kouhei Yanagita) over 1 year ago. Updated 5 months ago.

Status: Feedback
Priority: Normal
Assignee: -
Target version: -
[ruby-core:96288]

Description

I propose Array#uniq?.

I often need to check whether an array has duplicate elements.

This method returns true if no duplicates are found in self, otherwise returns false.
If a block is given, it will use the return value of the block for comparison.

This is equivalent to array.uniq.size == array.size, but faster.

% ~/tmp/r/bin/ruby -rbenchmark/ips -e 'a = Array.new(100) { rand(1000) }; Benchmark.ips { |x| x.report("uniq") { a.uniq.size == a.size }; x.report("uniq?") { a.uniq? } }'
Warming up --------------------------------------
                uniq    25.765k i/100ms
               uniq?    76.544k i/100ms
Calculating -------------------------------------
                uniq    278.144k (± 4.1%) i/s -      1.391M in   5.010858s
               uniq?    981.868k (± 5.1%) i/s -      4.975M in   5.081611s
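
For readers who want to try the semantics without building the patch, here is a minimal pure-Ruby sketch. The module name UniqSketch and its uniq? helper are hypothetical and are not taken from the patch, which is written in C. It assumes the speedup comes from stopping at the first duplicate and from not building an intermediate array; the patch may work differently.

```ruby
require "set"

# Hypothetical stand-in for the proposed predicate (not the code from the patch).
module UniqSketch
  # Returns true if enum has no duplicate elements, false otherwise.
  # With a block, the block's return values are compared instead of the elements.
  def self.uniq?(enum)
    seen = Set.new
    enum.each do |item|
      key = block_given? ? yield(item) : item
      # Set#add? returns nil when the key is already in the set,
      # so the scan stops at the first duplicate.
      return false unless seen.add?(key)
    end
    true
  end
end
```

Like Array#uniq, the Set-based check compares elements with hash and eql?, so for ordinary values it should agree with array.uniq.size == array.size.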

I think the name uniq? is natural because Array already has uniq.

patch: https://github.com/ruby/ruby/pull/2762

Updated by shevegen (Robert A. Heiler) over 1 year ago

I often need to check whether an array has duplicate elements.

Makes sense to me; I have had situations where I needed this
too in the past (including situations for non-unique entries
in an Array), so I agree on the general use case opportunities
in this regard.

Updated by duerst (Martin Dürst) over 1 year ago

I seem to remember that many years ago I made the same proposal and Nobu created a patch, but unfortunately I can no longer find any trace of it on this tracker or in my mail.

Anyway, I support this proposal. It's definitely useful functionality, and it's clearly faster than doing it indirectly via #uniq.

Updated by kyanagi (Kouhei Yanagita) over 1 year ago

  • Subject changed from Add Array#uniq? to Add Array#uniq?, Enumerable#uniq?

Following a suggestion to add Enumerable#uniq?, I have added it to my patch as well.
Array#uniq? is kept because it is faster than Enumerable#uniq?.

Updated by matz (Yukihiro Matsumoto) over 1 year ago

  • Status changed from Open to Feedback

You said, "I often need to check whether an array has duplicate elements". But we cannot think of a real-world use case.
Could you elaborate on how you would use the proposed #uniq? and what its benefit is?

Matz.

Updated by kyanagi (Kouhei Yanagita) over 1 year ago

I was developing mobile games, and I met these situations:

A card deck can't have duplicate characters.
i.e. deck.cards.map(&:character_id).uniq.size == deck.cards.size
-> deck.cards.map(&:character_id).uniq? or deck.cards.uniq?(&:character_id)

When players compose items, each of them should be different.
i.e. materials.map(&:item_id).uniq.size == materials.size
-> materials.map(&:item_id).uniq? or materials.uniq?(&:item_id)

Another situation:

I developed a registration form for relay runners.
A request body is like this:

# Missing sections are allowed. You can send them later.
[
  { section: 1, name: 'aaa' },
  { section: 3, name: 'bbb' },
  { section: 5, name: 'ccc' },
]

In this case, duplication of section is not allowed.
runners.map(&:section).uniq.size == runners.size
-> runners.map(&:section).uniq? or runners.uniq?(&:section)
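
As a runnable illustration of the check as it can be written today, here is a small sketch. It assumes the request body has been parsed into an Array of Hashes with symbol keys, which is an assumption about the application; in that case the section is read with r[:section] rather than &:section.

```ruby
# Sample data copied from the request body above.
runners = [
  { section: 1, name: "aaa" },
  { section: 3, name: "bbb" },
  { section: 5, name: "ccc" },
]

# Collect the sections, then compare the deduplicated size with the original size.
sections = runners.map { |r| r[:section] }
no_duplicates = sections.uniq.size == sections.size
```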

I think uniq? is easier to write and read than x.uniq.size == x.size
as a way to express that there are no duplicates. It is also faster.

This check is also found in Ruby's repository (bundler):
https://github.com/ruby/ruby/blob/master/spec/bundler/support/matchers.rb#L84

Updated by shyouhei (Shyouhei Urabe) over 1 year ago

kyanagi (Kouhei Yanagita) wrote in #note-5:

I was developing mobile games, and I met these situations:

A card deck can't have duplicate characters.
i.e. deck.cards.map(&:character_id).uniq.size == deck.cards.size
-> deck.cards.map(&:character_id).uniq? or deck.cards.uniq?(&:character_id)

So you just want to test? Why doesn't deck.cards.map(...).uniq!'s return value work?

When players compose items, each of them should be different.
i.e. materials.map(&:item_id).uniq.size == materials.size
-> materials.map(&:item_id).uniq? or materials.uniq?(&:item_id)

So you just want to test? Don't you want to show the duplicated materials to the players? Does uniq? help then?

Another situation:

I developed a registration form for relay runners.
A request body is like this:

# Missing sections are allowed. You can send them later.
[
  { section: 1, name: 'aaa' },
  { section: 3, name: 'bbb' },
  { section: 5, name: 'ccc' },
]

In this case, duplication of section is not allowed.
runners.map(&:section).uniq.size == runners.size
-> runners.map(&:section).uniq? or runners.uniq?(&:section)

So you just want to test? Don't you want to render an error message saying which section is duplicated? Does uniq? help then?

I think uniq? is easier to write and read than x.uniq.size == x.size
as a way to express that there are no duplicates. It is also faster.

My main question is: it isn't faster when you render error messages. How do you use it?
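
To make the error-message case concrete, here is a hedged sketch of how the duplicated sections could be collected for reporting. The sample data is made up for illustration (the second and third entries share a section), the message wording is hypothetical, and Enumerable#tally requires Ruby 2.7 or later.

```ruby
# Made-up sample data: section 3 appears twice.
runners = [
  { section: 1, name: "aaa" },
  { section: 3, name: "bbb" },
  { section: 3, name: "ccc" },
]

# Count how often each section occurs, then keep the ones seen more than once.
counts = runners.map { |r| r[:section] }.tally
duplicated = counts.select { |_section, n| n > 1 }.keys

# Build a message only when there is something to report.
message = duplicated.empty? ? nil : "duplicated sections: #{duplicated.join(', ')}"
```

A boolean predicate alone would not yield the list of offending sections, which is the point being raised here.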

This check is also found in Ruby's repository (bundler):
https://github.com/ruby/ruby/blob/master/spec/bundler/support/matchers.rb#L84

Honestly, I don't understand what this matcher is trying to achieve.

Updated by kyanagi (Kouhei Yanagita) over 1 year ago

In my cases, I (on the server side) only had to check for duplication, because the client also has validations.
Legitimate users can't send a request with duplicates, so a detailed error message was not required.
(If needed, I could investigate the logged request.)

uniq!'s return value is also usable, but I think uniq? is more fitting.
(I want to check for duplicates, not to get a deduplicated array.)
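
For comparison, a short sketch of the uniq! approach mentioned above. Array#uniq! returns nil when no elements were removed and returns the receiver otherwise, so nil means the array was already unique. The variable names are illustrative only.

```ruby
unique_ids    = [1, 2, 3]
duplicate_ids = [1, 2, 2]

# uniq! mutates its receiver, so work on a copy to keep the original intact.
all_unique     = unique_ids.dup.uniq!.nil?
has_duplicates = !duplicate_ids.dup.uniq!.nil?
```

This works, but the intent (a yes/no check) is hidden behind a mutating method and a nil test.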

Updated by keithrbennett (Keith Bennett) 5 months ago

I was just going to post this suggestion, but saw that it was already here.

uniq? could be helpful, for example, when you are loading objects from an external source (e.g. from JSON or YAML) and need to verify that the objects' ids are unique. objects.map(&:id).uniq? is much more expressive, clear, and concise than the lower-level, longer form, which might be something like this:

ids = objects.map(&:id)
ids.size == ids.uniq.size

Also, it's consistent with the style of existing methods like empty?, one?, etc.
