Feature #8206
Should Ruby core implement String#blank?
Description
There has been some discussion in the past about porting the #blank? protocol over to Ruby, which Matz rejected. This proposal, however, is only about String.
At the moment, to figure out if you have a blank string you would write:

    " ".strip.length == 0

The disadvantage is that this forces unneeded allocations and does too much work.
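For reference, the two idioms usually written in plain Ruby today look roughly like this. It is only an illustrative sketch: the helper names `blank_via_strip?` and `blank_via_regex?` are hypothetical and exist neither in core nor in this proposal.

```ruby
# Hypothetical helpers showing today's idioms; neither is a core method.

# Builds a stripped copy of the whole string just to test whether it is empty.
def blank_via_strip?(str)
  str.strip.empty?
end

# Avoids the copy, but still goes through the regex engine.
def blank_via_regex?(str)
  /\A\s*\z/.match?(str)
end

puts blank_via_strip?("  \t\n")  # => true
puts blank_via_regex?("  a  ")   # => false
```

For a long string that is mostly whitespace, the strip variant allocates a new string only to throw it away again.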
An optimal implementation would be:
    static VALUE
    rb_str_blank(VALUE str)
    {
        rb_encoding *enc;
        char *s, *e;

        enc = STR_ENC_GET(str);
        s = RSTRING_PTR(str);
        /* An empty string is blank. */
        if (!s || RSTRING_LEN(str) == 0) return Qtrue;

        e = RSTRING_END(str);
        while (s < e) {
            int n;
            /* Decode one code point; n receives its length in bytes. */
            unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

            /* Bail out on the first character that is neither whitespace nor NUL. */
            if (!rb_isspace(cc) && cc != 0) return Qfalse;
            s += n;
        }
        return Qtrue;
    }
This in turn is about 5-8x faster than the regex solution to the problem, and far faster than allocating one massive string with strip when the string is long.
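To make the semantics of the C function easier to check, here is a rough pure-Ruby mirror of its logic. It is a sketch for illustration, not a replacement: the method name `str_blank?` and the constant `BLANK_CODEPOINTS` are hypothetical, and this version will not have the performance characteristics of the C code.

```ruby
# Code points treated as blank by the C sketch: the ASCII whitespace accepted
# by rb_isspace (tab, LF, VT, FF, CR, space) plus NUL, which the C code also skips.
BLANK_CODEPOINTS = [0x00, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20].freeze

# Hypothetical name; walks the string one code point at a time and stops
# at the first one that is not in the blank set.
def str_blank?(str)
  str.each_codepoint.all? { |cc| BLANK_CODEPOINTS.include?(cc) }
end

puts str_blank?(" \t\r\n")  # => true
puts str_blank?(" x ")      # => false
```

Like the C version, it returns as soon as a non-blank character is seen, so a long string starting with text is rejected after a single character.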
Should Ruby take on this method, to accompany #strip and follow its practice?
A slight caveat, though, is that Active Support has a somewhat different definition of blank?:
    /* Code points Active Support treats as blank, in ascending order. */
    const unsigned int as_blank[26] = {9, 0xa, 0xb, 0xc, 0xd,
        0x20, 0x85, 0xa0, 0x1680, 0x180e, 0x2000, 0x2001,
        0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007, 0x2008,
        0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
    };

    static VALUE
    rb_str_blank_as(VALUE str)
    {
        rb_encoding *enc;
        char *s, *e;
        int i;
        int found;

        enc = STR_ENC_GET(str);
        s = RSTRING_PTR(str);
        /* An empty string is blank. */
        if (!s || RSTRING_LEN(str) == 0) return Qtrue;

        e = RSTRING_END(str);
        while (s < e) {
            int n;
            unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

            found = 0;
            for (i = 0; i < 26; i++) {
                unsigned int current = as_blank[i];
                if (current == cc) {
                    found = 1;
                    break;
                }
                /* The table is sorted, so cc cannot appear later. */
                if (cc < current) {
                    break;
                }
            }

            if (!found) return Qfalse;
            s += n;
        }
        return Qtrue;
    }
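The practical difference between the two definitions can be seen with a small pure-Ruby sketch of the table lookup above. The names `AS_BLANK` and `as_blank?` are hypothetical and used only for illustration; the code points are copied from the as_blank table.

```ruby
# Code points copied from the as_blank table above.
AS_BLANK = [
  0x9, 0xa, 0xb, 0xc, 0xd, 0x20, 0x85, 0xa0, 0x1680, 0x180e,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007,
  0x2008, 0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
].freeze

# Hypothetical name; true when every code point is in the table.
def as_blank?(str)
  str.each_codepoint.all? { |cc| AS_BLANK.include?(cc) }
end

puts as_blank?("\u00a0\u3000 ")  # => true  (NBSP and ideographic space are in the table)
puts as_blank?("\0")             # => false (NUL is not in the table)
```

So a no-break space counts as blank under this table but not under the rb_isspace-based check, while NUL is the other way around.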
Clearly it makes no sense to have such a method.
If Ruby took over implementing String#blank?, it would clash with Active Support, but IMHO it would enforce better API consistency.
Thoughts?