Feature #5392

closed

Symbol GC

Added by kstephens (Kurt Stephens) about 13 years ago. Updated almost 11 years ago.

Status: Closed
Target version:
[ruby-core:39881]

Description

I looked further into Symbol GC. The biggest problem is that IDs are not VALUEs. My outburst at RubyConf was based on my stupid assumption that they were -- I was trying to attack the problem using WeakRefs.

If IDs were VALUEs and Symbols were allocated like any other Object, the existing GC mark and root machinery (including C stack root scans) would take care of it, with an additional sweep of the global_symbol lookup tables to drop dead entries.

However, the remaining issue is IDs stored in globals. No matter what, IDs stored in C globals will need to be rb_gc_register_address(VALUE*) roots -- this means CRuby API/contract changes.
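To illustrate why globals are the sticking point, here is a minimal toy model, not CRuby code. Only rb_gc_register_address(VALUE*) is a real API name from the paragraph above; every mini_* name, the fixed-size heap, and the two example globals are hypothetical. The toy collector marks only through registered addresses (it does no stack scanning), so an object held solely by an unregistered global is swept.

```c
#include <assert.h>
#include <stddef.h>

/* Toy heap object standing in for an allocated Symbol (hypothetical). */
typedef struct mini_obj {
    int live;    /* slot is in use */
    int marked;  /* set during the mark phase */
} mini_obj;

#define MINI_HEAP  8
#define MINI_ROOTS 8

static mini_obj   mini_heap[MINI_HEAP];
static mini_obj **mini_roots[MINI_ROOTS];  /* addresses of registered globals */
static size_t     mini_root_count;

/* Toy analogue of rb_gc_register_address(VALUE *): remember where a
 * global lives so the collector can read its current value later. */
void mini_register_address(mini_obj **addr) {
    assert(mini_root_count < MINI_ROOTS);
    mini_roots[mini_root_count++] = addr;
}

/* Hand out the first free slot, or NULL if the toy heap is full. */
mini_obj *mini_alloc(void) {
    for (size_t i = 0; i < MINI_HEAP; i++) {
        if (!mini_heap[i].live) {
            mini_heap[i].live = 1;
            mini_heap[i].marked = 0;
            return &mini_heap[i];
        }
    }
    return NULL;
}

/* Mark everything reachable from registered globals, then sweep.
 * Returns the number of objects freed. */
int mini_gc(void) {
    for (size_t i = 0; i < mini_root_count; i++) {
        mini_obj *o = *mini_roots[i];  /* value currently stored in the global */
        if (o) o->marked = 1;
    }
    int freed = 0;
    for (size_t i = 0; i < MINI_HEAP; i++) {
        if (!mini_heap[i].live) continue;
        if (mini_heap[i].marked) {
            mini_heap[i].marked = 0;   /* survivor: reset for the next cycle */
        } else {
            mini_heap[i].live = 0;     /* unreachable: reclaim the slot */
            freed++;
        }
    }
    return freed;
}

/* Two C globals, as an extension might keep for cached IDs. */
mini_obj *registered_id;
mini_obj *unregistered_id;
```

If both globals are assigned from mini_alloc() but only &registered_id is passed to mini_register_address(), a call to mini_gc() reclaims the object behind unregistered_id. That is the contract change in miniature: every extension caching an ID in a global would have to register it.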

Adding a standalone ID mark table and a rb_gc_mark_id() function will not fix the problem of lone IDs on the C stack.

What was the original reason to distinguish Symbol IDs from Object VALUEs, besides making lexer tokens simple to map?
Would changing IDs to be allocated VALUE objects simplify internals anyway? This change could also allow Anonymous Symbols and Anonymous Methods.

-- Kurt Stephens

Updated by rkh (Konstantin Haase) about 13 years ago

How would you ensure identity? Do a search on every Symbol creation? Keep a hash map?

On Oct 3, 2011, at 09:41, Kurt Stephens wrote:

[Quoted copy of the issue description trimmed.]

Updated by kstephens (Kurt Stephens) about 13 years ago

Konstantin Haase wrote:

How would you ensure identity? Do a search on every Symbol creation? Keep a hash map?

Unless I misunderstand your question, we would ensure identity with the same mechanism that exists now: a String->Symbol hash map. The difference is that the hash map is pruned of dead Symbols during GC sweep. If available, WeakRefs and RefQueues would reduce the cost.
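A rough sketch of that idea as a self-contained toy in C. It is not CRuby's symbol table: the toy_* names, the chained hash table, and the explicit toy_mark() call standing in for the GC mark phase are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy Symbol: a heap object with a mark bit (hypothetical layout). */
typedef struct toy_symbol {
    char *name;
    int marked;               /* set by the mark phase when reachable */
    struct toy_symbol *next;  /* chain within one hash bucket */
} toy_symbol;

#define TOY_BUCKETS 64
static toy_symbol *toy_table[TOY_BUCKETS];  /* String -> Symbol map */

static unsigned toy_hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h % TOY_BUCKETS;
}

/* Identity: the same name always yields the same pointer while the
 * Symbol is alive, because lookup happens before allocation. */
toy_symbol *toy_intern(const char *name) {
    assert(name != NULL);
    unsigned b = toy_hash(name);
    for (toy_symbol *s = toy_table[b]; s; s = s->next)
        if (strcmp(s->name, name) == 0) return s;

    toy_symbol *s = malloc(sizeof *s);
    if (!s) abort();
    s->name = malloc(strlen(name) + 1);
    if (!s->name) abort();
    strcpy(s->name, name);
    s->marked = 0;
    s->next = toy_table[b];
    toy_table[b] = s;
    return s;
}

/* Stand-in for the GC mark phase reaching this Symbol. */
void toy_mark(toy_symbol *s) { s->marked = 1; }

/* Sweep: unlink and free unmarked Symbols, clear marks on survivors.
 * Returns the number of Symbols pruned from the table. */
int toy_sweep(void) {
    int freed = 0;
    for (int b = 0; b < TOY_BUCKETS; b++) {
        toy_symbol **link = &toy_table[b];
        while (*link) {
            toy_symbol *s = *link;
            if (s->marked) {
                s->marked = 0;
                link = &s->next;
            } else {
                *link = s->next;
                free(s->name);
                free(s);
                freed++;
            }
        }
    }
    return freed;
}

/* Report whether a name currently has a live table entry. */
int toy_is_interned(const char *name) {
    for (toy_symbol *s = toy_table[toy_hash(name)]; s; s = s->next)
        if (strcmp(s->name, name) == 0) return 1;
    return 0;
}
```

Interning "foo" twice returns the same pointer, so identity needs no search beyond the usual hash lookup. A dead Symbol simply disappears from the table at sweep time and is recreated if the name is interned again.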

Updated by mame (Yusuke Endoh) over 12 years ago

  • Status changed from Open to Assigned
  • Assignee set to authorNari (Narihiro Nakamura)

Updated by mame (Yusuke Endoh) almost 12 years ago

  • Target version set to 2.6

Updated by authorNari (Narihiro Nakamura) almost 11 years ago

  • Status changed from Assigned to Closed

Duplicate of #7791.
