Bug #11706
closed
Clean up files etc/unicode/name2ctype.{h.blt,kwd,src}
Description
The files name2ctype.{h.blt,kwd,src} in enc/unicode are intermediate products that are not needed in the repository, and they haven't been committed consistently. I propose to remove them.
[I'm not sure this is a bug or a feature, but it doesn't provide any new functionality, so feature doesn't seem right.]
[I've assigned this to Nobu for feedback; I can execute it once we agree on a way forward.]
On 2015/11/17 15:39, Nobuyoshi Nakada wrote:
> Please update name2ctype.{h.blt,kwd,src} files too.
Thanks for the reminder. I had a look at these files. Before any further commits, maybe we can simplify things a bit and/or ignore what is irrelevant.
Sorry this message is long. Looking at the three files you mentioned, I noticed the following:
enc/unicode/name2ctype.h.kwd was also produced on the Onigmo side when I worked on the update (see also https://github.com/k-takata/Onigmo/pull/58). However, it is not part of the Onigmo distribution.
In the Ruby repository, it was last committed by Yui Naruse at r36070, on 2012/06/14. This is well before the update to Unicode 7.0.0 in r46831.
On 2011/11/20, K. Takata introduced https://github.com/k-takata/Onigmo/blob/master/tool/convert-name2ctype.sh, which is used as follows:
    convert-name2ctype.sh name2ctype.kwd > name2ctype.h
This converts name2ctype.kwd directly to name2ctype.h (although it produces a few numbered intermediary files, which are removed in the last step).
enc/unicode/name2ctype.h.blt was last committed by you in r49292 on 2015/01/17. Your log message mentions r46831, but it is unclear why you updated .h.blt and not .kwd and .src. The last commit before this was r36070, the same as for name2ctype.h.kwd.
enc/unicode/name2ctype.src also was last committed in r36070.
Makefile.in contains instructions to create enc/unicode/name2ctype.h from enc/unicode/name2ctype.kwd at http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/Makefile.in?view=markup#l340. The .h.blt and .src files are mentioned there, but my knowledge of shell syntax isn't good enough to understand what exactly is supposed to happen.
My conclusions so far would be:
- name2ctype.{h.blt,kwd,src} are all intermediary files that are not
 actually used directly for building Ruby.
- In the last few years, these three files have been committed only
 rarely and accidentally, not in any visible sync with actual bug fixes
 or feature additions.
- Onigmo no longer uses name2ctype.h.blt and .src, and does not commit
 .kwd.
- The build process on the Onigmo side, although I did it manually, was
 well documented and painless; on the Ruby side, it may be possible to
 build enc/unicode/name2ctype.h (the file that's finally used for
 compilation), but I haven't found out how to do so.
- For a process that needs to be done about once a year, this amount of
 manual work seems perfectly fine (at least for me, and I volunteer to
 do it again next year).
- Therefore, I suggest that we don't care about committing
 name2ctype.{h.blt,kwd,src}. If you want me to commit
 enc/unicode/name2ctype.h.kwd, I can do it (because I have the new
 version). Indeed, it might be better to remove these three files;
 they only make checkouts heavier.
- If we want to simplify the production process, my preference would be
 to update Makefile.in based on convert-name2ctype.sh, or to directly
 integrate convert-name2ctype.sh into tool/enc-unicode.rb
 (why would one want to use sed and friends if we already use ruby?)
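To make the last point a bit more concrete, here is a rough sketch of what a Ruby replacement for the sed step might look like. It is only an illustration: the two cleanup rules (dropping `#line` directives and trailing whitespace) are assumptions chosen for the example, not the actual rules in convert-name2ctype.sh or Makefile.in, and the names `clean_gperf_output` and `convert_kwd` are hypothetical. It also assumes a `gperf` executable on the PATH and leaves the gperf options to the caller.

```ruby
require "open3"

# Hypothetical helper: tidy up C source generated by gperf.
# The rules here are illustrative assumptions, not the real sed rules.
def clean_gperf_output(source)
  source.each_line
        .reject { |line| line.start_with?("#line") } # drop #line directives
        .map { |line| line.rstrip }                   # drop trailing whitespace
        .join("\n") + "\n"
end

# Hypothetical helper: run gperf on a .kwd file and return the cleaned text.
# Assumes `gperf` is installed; gperf_options is passed through unchanged.
def convert_kwd(kwd_path, gperf_options = [])
  output, status = Open3.capture2("gperf", *gperf_options, kwd_path)
  raise "gperf failed for #{kwd_path}" unless status.success?
  clean_gperf_output(output)
end

# Possible use from a script such as tool/enc-unicode.rb:
#   File.write("name2ctype.h", convert_kwd("name2ctype.kwd"))
```

Something along these lines would do the whole conversion in one process, without numbered temporary files. Whether that is worth the effort for a once-a-year job is of course open to discussion.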