Feature #13434

Updated by naruse (Yui NARUSE) over 7 years ago

Current ways to define and parse arguments in the Ruby C API are clumsy, 
 slow, and impede potential optimizations. 

 The current C API for defining methods (rb_define_{singleton_,}method) 
 and parsing arguments (rb_scan_args, rb_get_kwargs) is orthogonal but inefficient. 

 rb_get_kwargs creates garbage which pure Ruby kwarg methods do not. 
 [Feature #11339] was an ugly workaround to use Ruby wrapper methods 
 for IO#*nonblock methods to avoid garbage from rb_get_kwargs. 

 Furthermore, it should be possible to annotate args for C functions as 
 "read-only, use-once" or similar. In other words, it should be possible to 
 implement my idea from [ruby-core:80626] where method lookup can be done 
 out-of-order in some cases, and allow optimizations such as replacing 
 "putstring" insns with garbage-free "putobject" insns for constant strings 
 without introducing backwards incompatibility for Rubyists. 

 We can also get rid of the limited basic op redefinition checks and 
 implement more generic versions of opt_aref_with / opt_aset_with 
 for more functions that can take frozen string args. 

 The "read-only, use-once" annotation can even make it safe for 
 dynamic strings to be immediately recycled to reduce garbage. 

 So we could annotate "puts" and IO#write in a way that causes the VM to 
 immediately recycle its argument if it's a dynamically-generated string: 

	 puts "#{dynamic} #{string(:here)}" 

 I am not good at API design, so I'm not sure what it should look like. 

 Perhaps sendmsg_nonblock may be implemented like: 
 ``` 
 struct rb_method_info { 
     /* to be filled in by rb_def_method ... */ 
 }; 

 static VALUE 
 sendmsg_nonblock(struct rb_method_info *info, int argc, VALUE *argv, VALUE self) 
 { 
     VALUE mesg, flags, dest_sockaddr, control, exception; 

     rb_get_args(info, argc, argv, 
		 &mesg, &flags, &dest_sockaddr, &control, &exception); 

     ... 
 } 

 /* 
  * ALLCAPS variable names mean read-only (like "constants" in Ruby) 
  * "1" prefix means use only once, eligible for immediate recycling 
  * if it is a dynamic string 
  */ 

 rb_def_method(rb_cBasicSocket, sendmsg_nonblock, 
               "sendmsg_nonblock(1MESG, " 
				 "1FLAGS = 0, " 
				 "1DEST_SOCKADDR = nil, " 
				 "*1CONTROL, exception: true)", -1); 

 /* 
  * rb_hash_aset could be defined as follows, where 0KEY ("0" prefix 
  * instead of "1") means the key is constant and persistent, and "val" 
  * (all lower case, no prefix) means it is a normal variable which 
  * can persist after the function returns 
  */ 
 rb_def_method(rb_cHash, rb_hash_aset, "[0KEY]=val", 2); 
 ``` 
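
 To make the naming convention above concrete, here is a rough sketch of how 
 a single parameter token such as "1MESG", "0KEY", "val" or "*1CONTROL" might 
 be decoded. Everything in it is hypothetical: struct param_info and 
 classify_param are names made up for illustration, not part of any existing 
 or proposed Ruby API. It ignores defaults ("= 0") and keywords 
 ("exception: true"), and since the case of an ALLCAPS name with no prefix 
 is not defined above, it is simply left unmarked as persistent here. 

 ```c
 #include <ctype.h>
 #include <stdbool.h>
 #include <string.h>

 /* Hypothetical decoded form of one parameter token (illustration only). */
 struct param_info {
     bool splat;      /* leading '*', as in "*1CONTROL" */
     bool read_only;  /* name is ALLCAPS */
     bool use_once;   /* "1" prefix: eligible for immediate recycling */
     bool persistent; /* "0" prefix, or a plain lower-case name */
     char name[32];   /* name with '*' and the digit prefix stripped */
 };

 /* Returns 0 on success, -1 if the name is empty or too long. */
 static int
 classify_param(const char *tok, struct param_info *out)
 {
     size_t i, len;
     bool has_alpha = false, all_upper = true;

     memset(out, 0, sizeof(*out));
     if (*tok == '*') { out->splat = true; tok++; }
     if (*tok == '1') { out->use_once = true; tok++; }
     else if (*tok == '0') { out->persistent = true; tok++; }

     len = strlen(tok);
     if (len == 0 || len >= sizeof(out->name)) return -1;

     for (i = 0; i < len; i++) {
         unsigned char c = (unsigned char)tok[i];
         if (isalpha(c)) {
             has_alpha = true;
             if (!isupper(c)) all_upper = false;
         }
     }
     out->read_only = has_alpha && all_upper;

     /* No prefix and not ALLCAPS: a normal variable that may persist. */
     if (!out->use_once && !out->read_only) out->persistent = true;

     memcpy(out->name, tok, len + 1); /* copies the trailing NUL too */
     return 0;
 }
 ```

 With this sketch, "1MESG" comes out as read-only and use-once, "0KEY" as 
 read-only and persistent, and "val" as a normal persistent variable. 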

 Thoughts? 

 The existing C API must continue to work, so 3rd-party extensions can 
 migrate to the new API slowly. 
