Bug #20614 (closed)
Integer#size returns incorrect values on 64-bit Windows
Description
According to ruby/spec, 0.size should return the size of the machine word in bytes, but on x64-mswin64_140 (both release 3.3.3 and git revision 02c4f0c89d) it doesn't. The following example:
a, b = 0.size, [0].pack('J').length
puts a, b
should print two 8s, but on x64-mswin64_140, a is 4.
The issue is most likely caused by the use of long instead of SIGNED_VALUE in internal/fixnum.h and in the fix_size function in numeric.c, because on Windows, long is always a 32-bit number.
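A minimal sketch of how the mismatch can be observed on any platform, assuming the rbconfig/sizeof extension is available. The helper name word_size_report is hypothetical and only serves this illustration; no output is shown because the values depend on the platform.

```ruby
require 'rbconfig/sizeof'

# Hypothetical helper: gathers the sizes (in bytes) that the report compares.
def word_size_report
  {
    integer_size: 0.size,                        # what Integer#size reports for a fixnum
    packed_pointer: [0].pack('J').bytesize,      # 'J' packs a pointer-width unsigned integer
    sizeof_long: RbConfig::SIZEOF['long'],       # C long
    sizeof_pointer: RbConfig::SIZEOF['void*']    # data pointer
  }
end

word_size_report.each { |name, bytes| puts "#{name}: #{bytes}" }
```

On a platform where long and pointers have the same width, all four values should agree; the report above describes a platform where they do not.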
Updated by akr (Akira Tanaka) 7 months ago
You can use RbConfig::SIZEOF to query the size of a C type.
% ruby -v -rrbconfig/sizeof -e 'pp RbConfig::SIZEOF'
ruby 3.4.0dev (2024-05-16T19:35:22Z master 854cbbd5a9) [x86_64-linux]
{"int"=>4,
"short"=>2,
"long"=>8,
"long long"=>8,
"__int128"=>16,
"off_t"=>8,
"void*"=>8,
"float"=>4,
"double"=>8,
"time_t"=>8,
"clock_t"=>8,
"size_t"=>8,
"ptrdiff_t"=>8,
"dev_t"=>8,
"int8_t"=>1,
"uint8_t"=>1,
"int16_t"=>2,
"uint16_t"=>2,
"int32_t"=>4,
"uint32_t"=>4,
"int64_t"=>8,
"uint64_t"=>8,
"int128_t"=>16,
"uint128_t"=>16,
"intptr_t"=>8,
"uintptr_t"=>8,
"ssize_t"=>8,
"int_least8_t"=>1,
"int_least16_t"=>2,
"int_least32_t"=>4,
"int_least64_t"=>8,
"int_fast8_t"=>1,
"int_fast16_t"=>8,
"int_fast32_t"=>8,
"int_fast64_t"=>8,
"intmax_t"=>8,
"sig_atomic_t"=>4,
"wchar_t"=>4,
"wint_t"=>4,
"wctrans_t"=>8,
"wctype_t"=>8,
"_Bool"=>1,
"long double"=>16,
"float _Complex"=>8,
"double _Complex"=>16,
"long double _Complex"=>32,
"__float128"=>16,
"_Decimal32"=>4,
"_Decimal64"=>8,
"_Decimal128"=>16,
"__float80"=>16}
Updated by alanwu (Alan Wu) 7 months ago · Edited
IMO, based on the current wording of the documentation, it should always return sizeof(VALUE) for fixnums, because VALUE holds the machine representation for fixnums.
By the way, ruby/spec is not a specification for how things ought to behave; it's descriptive not prescriptive. Check out the README of the project: https://github.com/ruby/spec?tab=readme-ov-file#description-and-motivation
Updated by Eregon (Benoit Daloze) 7 months ago
Agreed with @alanwu (Alan Wu), the docs seem clear, and one would expect this method to return how many bytes are used to represent the Integer (not counting object header overhead for bignums, fair enough):
int.size -> int
Document-method: Integer#size
Returns the number of bytes in the machine representation of int
(machine dependent).
1.size #=> 8
-1.size #=> 8
2147483647.size #=> 8
(256**10 - 1).size #=> 10
(256**20 - 1).size #=> 20
(256**40 - 1).size #=> 40
So it should be pointer-size bytes for any Fixnum on any platform (since Fixnums are tagged pointers).
Updated by nobu (Nobuyoshi Nakada) 6 months ago
- Status changed from Open to Rejected
This is a very implementation-dependent thing, and Fixnum is based on long, at least in the current implementation.
So it should be sizeof(long).
Updated by Eregon (Benoit Daloze) 6 months ago
@nobu (Nobuyoshi Nakada) Could you give some pointers?
At least VALUE is pointer-sized: https://github.com/ruby/ruby/blob/82aee1a9467c0f1bd33eb0247c5a0a8b8b9a5049/include/ruby/internal/value.h#L102-L121 (and needs to be, otherwise any cast between void* and VALUE could break).
And Fixnums are stored in VALUE variables.
So what do you mean by "Fixnum is based on long"?
Isn't 1 << 33 a tagged pointer/a Fixnum on 64-bit Windows?
Updated by Eregon (Benoit Daloze) 6 months ago · Edited
I found it: https://github.com/ruby/ruby/blob/82aee1a9467c0f1bd33eb0247c5a0a8b8b9a5049/include/ruby/internal/arithmetic/fixnum.h#L43-L58
This feels very strange. Why does CRuby only use half of the bits for fixnums on 64-bit Windows? (And so it has the overhead of using bignums for any Integer of 2**30 or larger.)
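The arithmetic behind that observation can be sketched as follows, assuming the linked header defines the fixnum maximum as LONG_MAX / 2 (one bit of the long is given up for the tag). The method names here are hypothetical and not part of any Ruby API.

```ruby
# Hypothetical helper: largest value of a signed C long of the given width.
def long_max_for(long_bytes)
  2**(long_bytes * 8 - 1) - 1
end

# Hypothetical helper: largest fixnum when fixnums are derived from long,
# assuming FIXNUM_MAX == LONG_MAX / 2.
def fixnum_max_for(long_bytes)
  long_max_for(long_bytes) / 2
end

puts fixnum_max_for(4) # 32-bit long (e.g. 64-bit Windows) => 1073741823, i.e. 2**30 - 1
puts fixnum_max_for(8) # 64-bit long (e.g. x86_64 Linux)   => 4611686018427387903, i.e. 2**62 - 1
```

Under these assumptions, 1 << 33 would not fit in a fixnum when long is 4 bytes, even though a pointer-sized VALUE has room for it.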
Updated by Hanmac (Hans Mackowiak) 6 months ago
@Eregon might it be because FLONUM uses the other half?
But the restriction to only half of the 64 bits is much older than the FLONUM implementation.
Updated by alanwu (Alan Wu) 6 months ago · Edited
I tried making it return 8 for fixnums on Windows and that revealed a bunch of false assumptions ruby/spec makes. It has code like def max_long = 2**(0.size * 8 - 1) - 1 and assumptions about sizeof(long) == sizeof(void*) sprinkled around (e.g. tests for Array#pack and friends).
If we change it to return 8 on Windows with no other changes, we expose people making (false) assumptions to a few issues: getting fixnum bounds from 0.size by assuming that most of the bytes are used; getting long-related bounds from 0.size by assuming the implementation will never change; assuming 0.size has some relationship with the size of data pointers. The most prominent gem that makes false assumptions is concurrent-ruby. (Thanks for the code search ko1!)
These are definitely user errors, but we should try to minimize breakage when making changes regardless. People can avoid all of these issues by using RbConfig::{LIMITS,SIZEOF} from rbconfig/sizeof, but that's undocumented in all releases; I added docs recently. We can fix the weirdness of having unused bytes in fixnums on LLP64 platforms like Windows by defining fixnums based on VALUE. That's good for everyone and probably a better time to change what 0.size returns.
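A minimal sketch of the alternative suggested above, assuming RbConfig::LIMITS exposes the 'LONG_MAX' and 'FIXNUM_MAX' keys in the Ruby being used. The method names are hypothetical; the first one reproduces the fragile pattern only for comparison.

```ruby
require 'rbconfig/sizeof'

# Fragile pattern: infers the range of a C long from Integer#size, which
# breaks if 0.size ever stops being equal to sizeof(long).
def max_long_guess
  2**(0.size * 8 - 1) - 1
end

# Reads the limits from the build configuration instead of inferring them.
def max_long
  RbConfig::LIMITS['LONG_MAX']
end

def max_fixnum
  RbConfig::LIMITS['FIXNUM_MAX']
end

puts "LONG_MAX:   #{max_long}"
puts "FIXNUM_MAX: #{max_fixnum}"
```

Code that needs the width of a data pointer could likewise read RbConfig::SIZEOF['void*'] rather than relying on 0.size.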
Updated by Eregon (Benoit Daloze) 6 months ago · Edited
alanwu (Alan Wu) wrote in #note-8:
We can fix the weirdness of having unused bytes in fixnums on LLP64 platforms like Windows by defining fixnums based on VALUE. That's good for everyone and probably a better time to change what 0.size returns.
Right, that makes perfect sense to me.
I'm happy to already merge your changes (even if partial) to ruby/spec to future-proof it, i.e. https://github.com/ruby/ruby/pull/11130 without the Integer#size change.
Could you open an issue or PR for concurrent-ruby?