Feature #8206

Updated by nobu (Nobuyoshi Nakada) over 8 years ago

There has been some discussion in the past about porting the `#blank?` protocol over to Ruby, which Matz rejected.

 This proposal is only about `String`, however.

 At the moment, to figure out whether you have a blank string you would write:

 ```ruby 
 "    ".strip.length == 0 
 ``` 

 The disadvantage is that this forces an unneeded allocation (`strip` builds a new string) and does too much work.

 An optimal implementation would be: 

 ```c 
 static VALUE 
 rb_str_blank(VALUE str) 
 { 
   rb_encoding *enc; 
   char *s, *e; 

   enc = STR_ENC_GET(str); 
   s = RSTRING_PTR(str); 
   if (!s || RSTRING_LEN(str) == 0) return Qtrue; 

   e = RSTRING_END(str); 
   while (s < e) { 
     int n;
     unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

     /* NUL and ASCII whitespace count as blank; anything else does not */
     if (!rb_isspace(cc) && cc != 0) return Qfalse;
     s += n;
   } 
   return Qtrue; 
 } 
 ``` 
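For readers who do not follow the C API, the semantics of the function above can be sketched in plain Ruby. This is only an illustration of the behavior, not the proposed implementation: the names `ASCII_SPACE` and `blank_like_c?` are hypothetical, and the sketch assumes `rb_isspace` accepts only ASCII space and the range `\t` through `\r`.

```ruby
# Codepoints assumed to be accepted by rb_isspace: \t, \n, \v, \f, \r, space.
ASCII_SPACE = [0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20].freeze

# Hypothetical helper mirroring rb_str_blank: stop at the first codepoint
# that is neither NUL nor ASCII whitespace.
def blank_like_c?(str)
  str.each_codepoint do |cc|
    return false unless cc.zero? || ASCII_SPACE.include?(cc)
  end
  true
end

blank_like_c?("  \t\n")   # => true
blank_like_c?(" a ")      # => false
blank_like_c?("\u00a0")   # => false (no-break space is not ASCII whitespace)
```

Unlike the C version, this still walks the string through an enumerator, so it shows the rule being applied rather than the performance gain.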

 This in turn is about 5-8x faster than the regex solution to the problem, and far faster than allocating one massive string with `strip` when the string is long.
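The post does not say which regex it was measured against. One plausible form of the regex solution, shown here purely as an assumption, anchors a whitespace class over the whole string. The method name `blank_by_regex?` is hypothetical, and `String#match?` requires Ruby 2.4 or later.

```ruby
# Hypothetical regex-based check: true when the string contains nothing
# but whitespace (or is empty).
def blank_by_regex?(str)
  str.match?(/\A[[:space:]]*\z/)
end

blank_by_regex?("   ")   # => true
blank_by_regex?(" x ")   # => false
```

This avoids the extra string that `strip` allocates, but it still goes through the regex engine, which is the overhead the C version sidesteps.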

 Should Ruby take on this method to accompany `#strip`, following its practice?

 ---  

 A slight caveat, though, is that Active Support has a somewhat different definition of `blank?`:

 ```c 
 const unsigned int as_blank[26] = {9, 0xa, 0xb, 0xc, 0xd, 
   0x20, 0x85, 0xa0, 0x1680, 0x180e, 0x2000, 0x2001, 
   0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007, 0x2008, 
   0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000 
 }; 

 static VALUE 
 rb_str_blank_as(VALUE str) 
 { 
   rb_encoding *enc; 
   char *s, *e; 
   int i; 
   int found; 

   enc = STR_ENC_GET(str); 
   s = RSTRING_PTR(str); 
   if (!s || RSTRING_LEN(str) == 0) return Qtrue; 

   e = RSTRING_END(str); 
   while (s < e) { 
     int n;
     unsigned int cc = rb_enc_codepoint_len(s, e, &n, enc);

     /* as_blank is sorted ascending, so the scan can stop early */
     found = 0;
     for (i = 0; i < 26; i++) {
       unsigned int current = as_blank[i];
       if (current == cc) {
         found = 1;
         break;
       }
       if (cc < current) {
         break;
       }
     }

     if (!found) return Qfalse;
     s += n;
   } 
   return Qtrue; 
 } 
 ``` 
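The table-driven variant above can likewise be sketched in Ruby to make the difference in definitions concrete. The codepoints are copied from the `as_blank` table in the post; the names `AS_BLANK` and `blank_like_as?` are hypothetical. Because the table is sorted, this sketch uses `Array#bsearch` instead of the linear scan with early exit.

```ruby
# Codepoints copied from the as_blank table above (sorted ascending).
AS_BLANK = [
  0x9, 0xa, 0xb, 0xc, 0xd, 0x20, 0x85, 0xa0, 0x1680, 0x180e,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007,
  0x2008, 0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000
].freeze

# Hypothetical helper mirroring rb_str_blank_as: every codepoint must
# appear in the table for the string to count as blank.
def blank_like_as?(str)
  str.each_codepoint.all? do |cc|
    # bsearch returns the first table entry >= cc, or nil if there is none.
    AS_BLANK.bsearch { |candidate| candidate >= cc } == cc
  end
end

blank_like_as?(" \u00a0\u3000")   # => true
blank_like_as?("\0")              # => false (NUL is not in the table)
```

Note that the two definitions disagree in both directions: this one accepts Unicode spaces such as U+00A0 but rejects NUL, while `rb_str_blank` does the opposite.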
 Clearly it makes no sense to have such a method.  

 If Ruby took over implementing `String#blank?` it would clash with Active Support, but IMHO it would enforce better API consistency.

 Thoughts? 


 
