SummerOfCode2010 Ideas for Ruby¶
This is a list of ideas for Ruby Summer of Code 2010. Consider it as a starting point -- any other ideas that interest you are very welcome.
Ruby's GC can be improved.
mmap allocation of memory¶
Currently MRI uses malloc() to allocate memory regions. Replace it with mmap(). Windows support is mandatory though.
- Is this really useful? Large blocks are already allocated with mmap() by the glibc malloc implementation. - Hongli Lai
- Hence the "Windows support" notice. AFAIK Windows malloc does not use such a technique. -- shyouhei
Currently the GC's marking phase writes to every live Ruby object, so it does not work well with CoW (copy-on-write) memory pages (e.g. after forking the process). This can be improved by moving the mark bit from inside each Ruby object to a dedicated memory page. Code can be ported from Ruby Enterprise Edition, which already does this.
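The idea of keeping mark bits outside the objects can be illustrated with a toy model. This is only a sketch in plain Ruby, not code from MRI or Ruby Enterprise Edition; `ToyHeap` and `Slot` are hypothetical names. The slots are frozen to make the point that marking never writes to object memory, only to the separate bit table.

```ruby
# Toy model of bitmap marking (hypothetical, not MRI/REE code):
# mark bits live in a separate table instead of inside each object.
class ToyHeap
  # refs holds the indexes of the slots this slot points to
  Slot = Struct.new(:value, :refs)

  attr_reader :slots

  def initialize
    @slots = []
    @mark_bits = 0 # Integer used as a bit vector, one bit per slot
  end

  # Returns the index of the new slot. Slots are frozen, so any write
  # to them during marking would raise an error.
  def allocate(value, refs = [])
    @slots << Slot.new(value, refs.dup.freeze).freeze
    @slots.size - 1
  end

  def marked?(index)
    @mark_bits[index] == 1
  end

  # Marks everything reachable from the root indexes. Only @mark_bits
  # is modified; the slots themselves are only read.
  def mark_from(roots)
    @mark_bits = 0
    stack = roots.dup
    until stack.empty?
      index = stack.pop
      next if marked?(index)
      @mark_bits |= (1 << index)
      stack.concat(@slots[index].refs)
    end
  end
end
```

In a real collector the bit table would sit on its own memory pages, so that after fork() only those pages get copied while the pages holding objects stay shared.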
The sweeping phase can be made asynchronous to the rest of Ruby execution. That would improve your process's responsiveness, which is essential in some areas -- such as multimedia.
generational (second challenge)¶
There was once an attempt (back in 2001) to replace the GC with a generational one. It was not merged into MRI because it turned out to be slower than the non-generational one; the write barrier cost too much. If you are interested in this area, tricks around write barriers are the key to success.
Symbols are not garbage collected for now. But what if they could be? Long-running processes (daemons, web applications, ...) could reduce memory usage this way.
Ruby is slow™
theoretically optimal algorithms for some methods¶
For instance, Array#shift and Array#unshift are O(n) today, but can be made O(1). That should speed up programs which use Arrays as FIFO queues.
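One way to get constant-time removal from the front is to advance a head offset instead of moving the remaining elements. The sketch below shows the idea in plain Ruby; `OffsetQueue` is a hypothetical class, the compaction threshold is an arbitrary assumption, and this is not a description of how MRI implements Array.

```ruby
# Hypothetical FIFO queue with amortized O(1) shift: removing from the
# front only advances @head; storage is compacted occasionally.
class OffsetQueue
  COMPACT_THRESHOLD = 32 # arbitrary value chosen for this sketch

  def initialize
    @items = []
    @head = 0
  end

  def push(item)
    @items << item
    self
  end

  # Returns the oldest item, or nil when the queue is empty.
  def shift
    return nil if @head >= @items.size
    item = @items[@head]
    @items[@head] = nil # drop the reference so the item can be collected
    @head += 1
    compact! if @head > COMPACT_THRESHOLD && @head * 2 > @items.size
    item
  end

  def size
    @items.size - @head
  end

  private

  # Copies the live tail into a fresh array. This is O(n), but it runs
  # only after at least half of the storage has been consumed.
  def compact!
    @items = @items[@head..-1]
    @head = 0
  end
end
```

The same trick applied inside Array itself would keep the existing API while avoiding the element copy on each shift.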
Ruby's regexp engine acts as a pure NFA. But there can be situations where DFA-like behaviours are possible. Automatic compilation of a regexp into a DFA will reduce execution costs.
Is it too heavy for a summer?
Bright new ideas are the source of improvements
rewrite build system (autoconf etc.) cleanly¶
It is chaos now. The build system is an essential part of ruby but rarely attracts interest from researchers. Rewriting it into a saner one should ease other maintainers' daily work.
E.g. jar for Java. This sounds easy, but it is not always so. Kernel#require should also be changed to fit with it. Maybe also rubygems. Archiving format, compression algorithm, digital certification of an archive … much work to be done.
The key part here is validation. Dumping bytecode is easy. Loading it back while avoiding a SEGV needs a lot of work.
There is already an experimental dtrace branch (→ http://github.com/yugui/ruby/tree/feature/dtrace), but more work is needed to merge it into trunk.
More concurrency primitives¶
- Reader/writer locks. With timeout support.
- Semaphores. With timeout support.
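As an illustration of what "with timeout support" could mean for a semaphore, here is a minimal sketch built on Mutex and ConditionVariable#wait with its timeout argument. `TimedSemaphore` and its method names are hypothetical; a primitive shipped with Ruby would more likely be written in C.

```ruby
# Hypothetical counting semaphore whose acquire can give up after a timeout.
class TimedSemaphore
  def initialize(permits)
    @permits = permits
    @mutex = Mutex.new
    @cond = ConditionVariable.new
  end

  # Takes one permit. Returns true on success, or false if no permit
  # became available within `timeout` seconds (nil waits forever).
  def acquire(timeout = nil)
    deadline = timeout && (monotonic_now + timeout)
    @mutex.synchronize do
      while @permits.zero?
        if deadline
          remaining = deadline - monotonic_now
          return false if remaining <= 0
          @cond.wait(@mutex, remaining)
        else
          @cond.wait(@mutex)
        end
      end
      @permits -= 1
      true
    end
  end

  # Gives one permit back and wakes up one waiting thread.
  def release
    @mutex.synchronize do
      @permits += 1
      @cond.signal
    end
  end

  private

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

The loop re-checks the permit count after every wakeup, so spurious wakeups and races with other acquirers only shorten the remaining wait instead of breaking the count.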
More data structures¶
- Self-balancing binary tree. (But it looks like it's already being worked on?)
- Priority queues.
- Linked lists, so that people can write performant FIFO queues without abusing Array for this.
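To give a feel for the scope of such an addition, a priority queue can be sketched as a binary min-heap in a few dozen lines of Ruby. `MinHeap` and its interface are hypothetical; the actual API would be up to the student and the maintainers.

```ruby
# Hypothetical priority queue backed by a binary min-heap stored in an Array.
class MinHeap
  def initialize
    @heap = [] # each entry is [priority, value]
  end

  def empty?
    @heap.empty?
  end

  # O(log n): append at the end, then move up while smaller than the parent.
  def push(priority, value)
    @heap << [priority, value]
    sift_up(@heap.size - 1)
    self
  end

  # O(log n): returns the value with the smallest priority, or nil if empty.
  def pop
    return nil if @heap.empty?
    top = @heap.first
    last = @heap.pop
    unless @heap.empty?
      @heap[0] = last
      sift_down(0)
    end
    top[1]
  end

  private

  def sift_up(i)
    while i > 0
      parent = (i - 1) / 2
      break if @heap[parent][0] <= @heap[i][0]
      @heap[parent], @heap[i] = @heap[i], @heap[parent]
      i = parent
    end
  end

  def sift_down(i)
    loop do
      left = 2 * i + 1
      right = left + 1
      smallest = i
      smallest = left if left < @heap.size && @heap[left][0] < @heap[smallest][0]
      smallest = right if right < @heap.size && @heap[right][0] < @heap[smallest][0]
      break if smallest == i
      @heap[smallest], @heap[i] = @heap[i], @heap[smallest]
      i = smallest
    end
  end
end
```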
"Did you mean:" for NameError message¶
Put likely candidates for the misspelled name into the NameError message.
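One possible way to pick candidates is to compare the unknown name against the known names by edit distance. The sketch below assumes Levenshtein distance and a cutoff of 2; neither is specified by this proposal, and `edit_distance` and `suggest` are hypothetical helpers.

```ruby
# Levenshtein distance between two strings, computed row by row.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      curr << [prev[j] + 1,            # deletion
               curr[j - 1] + 1,        # insertion
               prev[j - 1] + cost].min # substitution or match
    end
    prev = curr
  end
  prev.last
end

# Returns the known names within max_distance edits of `name`, closest
# first. The default cutoff of 2 is an assumption made for this sketch.
def suggest(name, known_names, max_distance = 2)
  scored = known_names.map { |known| [edit_distance(name.to_s, known.to_s), known.to_s] }
  scored.select { |distance, _| distance <= max_distance }
        .sort_by { |distance, _| distance }
        .map { |_, known| known }
end
```

For example, `suggest("unsihft", %w[unshift shift push])` returns `["unshift"]`, which could then be appended to the error message as "Did you mean: unshift".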
… and more
GUtopIa is a concept for an ultra-high level GUI API implemented via adapters to other back-end systems (Qt, Gtk, Cocoa, etc). The basic concepts for the project have been around for a while, but very little implementation has taken shape. An SOC student might find it entertaining and educational to bring such a system to life (see http://rubyworks.github.com/gutopia).
Improving existing things¶
net/http is slow¶
Because it uses Timeout, it spawns a new thread every time Net::HTTP makes a call. It should just use non-blocking sockets and select() for timed blocking.
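The suggested approach can be sketched with IO#read_nonblock and IO.select, which involve no extra thread. This is an illustration only, not a patch against net/http; `read_with_timeout` and `ReadTimeout` are hypothetical names.

```ruby
# Hypothetical error raised when no data arrives in time.
class ReadTimeout < StandardError; end

# Reads up to max_bytes from io, waiting at most `timeout` seconds.
# No thread is spawned: the wait happens inside IO.select.
def read_with_timeout(io, max_bytes, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  begin
    io.read_nonblock(max_bytes)
  rescue IO::WaitReadable
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    raise ReadTimeout, "no data within #{timeout}s" if remaining <= 0
    # IO.select returns nil when the timeout expires without io becoming readable
    raise ReadTimeout, "no data within #{timeout}s" unless IO.select([io], nil, nil, remaining)
    retry
  end
end
```

The deadline is computed once, outside the retried block, so repeated wakeups cannot extend the total wait beyond the requested timeout.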
net/http doesn't always timeout correctly¶
Timeout wasn't that reliable on 1.8, and is even less reliable on 1.9.
Some things are excellently documented, other things very badly.
The docs are weird; too verbose in some areas, not verbose enough in others. For anything beyond the simplest use case (download a URL into a string) I always end up searching Google. It is hard to figure out how to set request headers, how to get response headers, and how to set a timeout.
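As an example of the kind of snippet the documentation could include, the sketch below sets timeouts and request headers and reads a response header using the existing Net::HTTP API. The helper names (`build_client`, `build_get`, `fetch`) and the timeout values are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Creates a client with explicit timeouts (in seconds). No connection
# is opened until a request is actually sent.
def build_client(uri, open_timeout = 5, read_timeout = 10)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = open_timeout # time allowed to establish the connection
  http.read_timeout = read_timeout # time allowed for each read from the socket
  http
end

# Builds a GET request and sets request headers with []=.
def build_get(uri, headers = {})
  request = Net::HTTP::Get.new(uri.request_uri)
  headers.each { |name, value| request[name] = value }
  request
end

# Sends the request; response headers are read with [] (case-insensitive).
def fetch(url, headers = {})
  uri = URI.parse(url)
  response = build_client(uri).request(build_get(uri, headers))
  { :code => response.code,
    :content_type => response['Content-Type'],
    :body => response.body }
end
```

A call such as `fetch("http://www.ruby-lang.org/", "User-Agent" => "soc-example")` performs the request; both URL and header value are placeholders.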
net/https is badly documented. Needs documentation on usage, how to set client certificate, how (not) to verify server certificate, etc.
Completely undocumented right now. Especially the encryption and SSL stuff can use some decent documentation.
The Ruby 1.9 encoding stuff¶
Internal/external encoding is not well-documented. How to enforce encodings and how to convert between encodings is not clear.
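A short example could make the difference between converting and enforcing an encoding concrete. The sketch below uses String#encode, String#force_encoding and String#valid_encoding?; the helper names are hypothetical and the sample text is arbitrary.

```ruby
# Converting: the bytes are rewritten so that they represent the same
# characters in the target encoding.
def convert(text, target_encoding)
  text.encode(target_encoding)
end

# Enforcing (relabeling): the bytes stay as they are, only the encoding
# tag changes. Use this when the data carries the wrong tag.
def relabel(bytes, actual_encoding)
  bytes.dup.force_encoding(actual_encoding)
end

# Typical input handling: raw bytes known to be in source_encoding are
# relabeled first and then converted to UTF-8.
def to_utf8(bytes, source_encoding)
  relabel(bytes, source_encoding).encode("UTF-8")
end

utf8   = "caf\u00E9"                  # UTF-8 text, the last character takes 2 bytes
latin1 = convert(utf8, "ISO-8859-1")  # same text, the last character takes 1 byte
wrong  = relabel(latin1, "UTF-8")     # same bytes with a wrong tag
wrong.valid_encoding?                 # false: "\xE9" alone is not valid UTF-8
to_utf8(latin1, "ISO-8859-1") == utf8 # true: round trip back to UTF-8
```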
Various software is still written for 1.8. That's OK, but our future is in 1.9, so it needs to move there someday. Why not make that someday today?
Significant challenges are increasing in the realm of online security. OWASP ESAPI (Enterprise Security API) is an effort to consolidate security defenses. A number of implementations are in progress. http://www.owasp.org/index.php/Category:OWASP_Enterprise_Security_API A Ruby implementation would be welcome.