ObjectSpace.dump_all is a very useful method for debugging memory leaks and the like, and hence is frequently needed in production. But since the content of all 7-bit strings is included in the dump, it incurs the risk of leaking personal data or secrets.
Also, in many cases the string content isn't that helpful and just makes the dump much bigger for no good reason. And only pure-ASCII strings are dumped this way, which means all the tools that process these dumps should already be compatible with a dump without any string content.
I propose to add another optional parameter to dump_all: string_value: false. When passed, no String content is ever dumped regardless of its coderange.
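To illustrate what such a dump would look like, the effect can be approximated by post-processing an existing dump. This is only a sketch, not the proposed implementation: it assumes the dump is newline-delimited JSON in which string objects carry their content under a "value" key, `strip_string_values` is a hypothetical helper name, and the sample records are made up.

```ruby
require "json"
require "stringio"

# Hypothetical helper (not part of objspace): copy a heap dump, one JSON
# object per line, from `input` to `output`, dropping the "value" field
# of STRING records. All other fields are left untouched.
def strip_string_values(input, output)
  input.each_line do |line|
    line = line.strip
    next if line.empty?

    record = JSON.parse(line)
    record.delete("value") if record["type"] == "STRING"
    output.puts(JSON.generate(record))
  end
  output
end

# Made-up sample records, shaped like heap dump lines.
sample = StringIO.new(<<~JSONL)
  {"address":"0x1","type":"STRING","bytesize":6,"value":"secret"}
  {"address":"0x2","type":"HASH","size":1}
JSONL

puts strip_string_values(sample, StringIO.new).string
```

Note that with post-processing the string content is still written out once before being stripped, which an option on dump_all itself would avoid.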
Why not just stop dumping string values? I'm proposing this because I see no reason to keep them. It is practically proven unnecessary; all non-ASCII bits are already silently dropped and no one complains. I prefer a simple API for ObjectSpace.dump_all. We could add options later, if we find any use cases.
I see no reason to keep them. It is practically proven unnecessary
I disagree. Just to give one example among many, it's very useful when tracking memory leaks. For instance, when you notice a pattern of a Hash growing, being able to see the content of its keys in the dump often allows you to map that object to actual code.
I'm not sure I'm in favor of this request then. ObjectSpace.dump_all is very much analogous to a coredump. Both are very handy on occasion. I don't doubt your experience of finding memory leaks is real. But... people normally don't try to scrub a coredump. One often does include sensitive info, but being able to access a coredump is already a big threat. We normally strictly restrict access to them. The same can go for ObjectSpace.dump_all output.
I wrote "I prefer a simple API for ObjectSpace.dump_all" because I'm pretty sure this is not the last thing you wanted for the output. People need to filter out some object fields, order by something, group by something, build a histogram, ... and I'm pretty sure we would end up needing an entire SQL engine. My preference is that this method should remain as simple as possible, and that this business be left to jq(1) etc.
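As a rough illustration of leaving that kind of aggregation to external tooling, a per-type histogram can be computed from the dump outside of ObjectSpace. This is a minimal sketch, assuming the dump is newline-delimited JSON with a "type" field on each record; `type_histogram` is a hypothetical helper name and the sample records are made up.

```ruby
require "json"
require "stringio"

# Hypothetical helper (not part of objspace): count dumped objects per
# "type" field, most frequent first, ties broken by type name.
def type_histogram(input)
  counts = Hash.new(0)
  input.each_line do |line|
    line = line.strip
    next if line.empty?

    counts[JSON.parse(line)["type"]] += 1
  end
  counts.sort_by { |type, count| [-count, type.to_s] }.to_h
end

# Made-up sample records, shaped like heap dump lines.
sample = StringIO.new(<<~JSONL)
  {"address":"0x1","type":"STRING","bytesize":3}
  {"address":"0x2","type":"HASH","size":1}
  {"address":"0x3","type":"STRING","bytesize":5}
JSONL

type_histogram(sample).each { |type, count| puts "#{type}\t#{count}" }
```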
I'm not sure reasoning by analogy with core dumps is sound here. If there were a way to be sure a core dump is stripped of all personally identifiable information, I'd definitely use it to share core dumps when it's useful.
because I'm pretty sure this is not the last thing you wanted for the output. ... and I'm pretty sure we would end up needing an entire SQL engine.
I think this is a bit of an unfair argument. Yes, I have requested multiple additions to this API over the last few years, but in my opinion there is a very long way to go before it can be considered a complex API, especially for an API that is intended for very advanced debugging. And it's not like I have a long list of feature requests I'm drip-feeding.
Also, I don't even need that capability myself. I suggested it because I was trying to help @zzak (zzak _) fix a memory leak at his company, and the dumps containing string values made it hard for him to get approval to generate heap dumps from production because of security concerns. I thought this new option could be useful for the community.