Feature #5392
Closed
Symbol GC
Description
I looked more into Symbol GC. The biggest problem is that IDs are not VALUEs. My outburst at RubyConf was based on my stupid assumption that they were -- I was trying to attack the problem using WeakRefs.
If IDs were VALUEs and Symbols were allocated like any other Object, the existing GC mark and root machinery (including C stack root scans) would take care of it, with an additional sweep of the global_symbol lookup tables.
However, the remaining issue is IDs stored in globals. No matter what, IDs stored in C globals will need to be rb_gc_register_address(VALUE*) roots -- this means CRuby API/contract changes.
Adding a standalone ID mark table and a rb_gc_mark_id() function will not fix the problem of lone IDs on the C stack.
What was the original reason to distinguish Symbol IDs from Object VALUEs, besides making lexer tokens simple to map?
Would changing IDs to be allocated VALUE objects simplify internals anyway? This change could also allow Anonymous Symbols and Anonymous Methods.
-- Kurt Stephens
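To make the root-registration point concrete, here is a minimal toy model in Ruby, not CRuby's actual collector. `ToySym`, `ToyHeap` and `register_root` are hypothetical names invented for this sketch; `register_root` only plays the role that rb_gc_register_address(VALUE*) would play for a C global. It assumes symbols are ordinary heap objects and that the sweep prunes the lookup table.

```ruby
# Toy model only: ToySym, ToyHeap and register_root are hypothetical names,
# not CRuby APIs.
ToySym = Struct.new(:name)

class ToyHeap
  def initialize
    @table = {}  # name => ToySym, stands in for the global symbol table
    @roots = []  # blocks that read a "global slot", stand in for registered addresses
  end

  # Return the existing symbol for name, or allocate one.
  def intern(name)
    @table[name] ||= ToySym.new(name)
  end

  # Analogue of registering a global's address as a GC root.
  def register_root(&slot)
    @roots << slot
  end

  def gc
    # Mark phase: only symbols reachable from registered roots are marked.
    marked = @roots.map(&:call).compact
    # Sweep phase: drop every table entry whose symbol was not marked.
    @table.delete_if { |_, sym| marked.none? { |m| m.equal?(sym) } }
  end

  def live?(name)
    @table.key?(name)
  end
end

heap = ToyHeap.new
registered   = heap.intern("each")
unregistered = heap.intern("tmp_name")  # held in a "global" the collector never hears about
heap.register_root { registered }
heap.gc

puts heap.live?("each")      # survives: reachable from a registered root
puts heap.live?("tmp_name")  # swept: the holder was never registered
```

In this model the unregistered holder still points at a symbol the table has forgotten, which is the failure that existing extensions keeping IDs in plain C globals would hit unless they change.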
Updated by rkh (Konstantin Haase) about 13 years ago
How would you ensure identity? Do a search on every Symbol creation? Keep a hash map?
On Oct 3, 2011, at 09:41, Kurt Stephens wrote:
Issue #5392 has been reported by Kurt Stephens.
Feature #5392: Symbol GC
http://redmine.ruby-lang.org/issues/5392
Author: Kurt Stephens
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
Updated by kstephens (Kurt Stephens) about 13 years ago
Konstantin Haase wrote:
How would you ensure identity? Do a search on every Symbol creation? Keep a hash map?
Unless I misunderstand your question, we would ensure identity with the same mechanism that exists now: a String->Symbol hash map. The difference is that the hash map is pruned of dead Symbols during GC sweep. If available, WeakRefs and RefQueues would reduce the cost.
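A rough sketch of that idea in Ruby, under the assumption that the mark phase hands the table the set of symbols it reached. `InternTable`, `Sym` and `prune` are hypothetical names for illustration and do not correspond to CRuby structures.

```ruby
# Hypothetical sketch: InternTable is not a CRuby structure.
class InternTable
  class Sym
    attr_reader :name

    def initialize(name)
      @name = name.dup.freeze
    end
  end

  def initialize
    @by_name = {}  # String => Sym
  end

  # Identity costs one hash lookup per creation, not a search over all symbols.
  def intern(name)
    @by_name[name] ||= Sym.new(name)
  end

  # Called at sweep time with the symbols the mark phase reached.
  def prune(marked)
    keep = {}.compare_by_identity
    marked.each { |sym| keep[sym] = true }
    @by_name.select! { |_, sym| keep.key?(sym) }
    self
  end

  def size
    @by_name.size
  end
end

table = InternTable.new
a = table.intern("foo")
b = table.intern("foo")  # same object as a
table.intern("bar")      # nobody keeps a reference to this one

table.prune([a])         # pretend the mark phase reached only a
puts a.equal?(b)
puts table.size
```

A name that is interned again after its entry was pruned gets a fresh object. That is harmless in this model because nothing live could still be holding the old one.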
Updated by mame (Yusuke Endoh) over 12 years ago
- Status changed from Open to Assigned
- Assignee set to authorNari (Narihiro Nakamura)
Updated by authorNari (Narihiro Nakamura) about 11 years ago
- Status changed from Assigned to Closed
Duplicate of #7791.