Bug #20526

File.open(encoding: "bom|utf-8") converts "\r\n" to "\n" on Windows

Added by kou (Kouhei Sutou) 6 months ago. Updated 5 months ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x64-mingw-ucrt]
[ruby-core:118182]

Description

I'm not sure whether this is intentional, but encoding: "utf-8" does not change newline conversion while encoding: "bom|utf-8" does:

File.write("a.txt", "a\r\n")
File.read("a.txt").bytes # => [97, 13, 10]
File.open("a.txt", encoding: "utf-8") {|f| f.read.bytes} # => [97, 13, 10]
File.open("a.txt", encoding: "bom|utf-8") {|f| f.read.bytes} # => [97, 10] XXX: \r\n -> \n
File.open("a.txt", encoding: "bom|utf-8", universal_newline: false) {|f| f.read.bytes} # => [97, 13, 10]

Note the line marked XXX: in the code above. Is this behavior intentional?
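
The comparison in the description can be packaged as a small script. This is a minimal sketch and not part of the original report: the helper name read_bytes is hypothetical, and the fixture is written with File.binwrite so that its on-disk bytes are known on every platform. It only prints what each File.open variant returns. On platforms without text-mode newline conversion, all variants are expected to preserve "\r\n", so the discrepancy reported here should only be visible on Windows.

```ruby
require "tmpdir"

# Hypothetical helper: open +path+ with the given File.open options
# and return the bytes of the string that File#read produces.
def read_bytes(path, **open_opts)
  File.open(path, **open_opts) { |f| f.read.bytes }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "a.txt")
  # Binary write: exactly 3 bytes on disk (97, 13, 10) on any platform.
  File.binwrite(path, "a\r\n")

  [
    {},
    { encoding: "utf-8" },
    { encoding: "bom|utf-8" },
    { encoding: "bom|utf-8", universal_newline: false },
  ].each do |opts|
    # Print the options next to the bytes that were read back.
    puts "#{opts.inspect}: #{read_bytes(path, **opts).inspect}"
  end
end
```

Writing the fixture in binary mode keeps the write side out of the picture, so any difference between the printed lines comes from the read options alone.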

Updated by nobu (Nobuyoshi Nakada) 6 months ago

This is probably a bug in the push-back after the BOM look-ahead.

BTW, on Windows, File.write and File.read use text mode by default.
So that file would be 4 bytes on disk, "a\r\r\n", when read in binary mode.
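
To check what is actually stored on disk, independent of text mode, the file can be read back with File.binread. The following is a sketch under that assumption; the helper name on_disk_bytes and the file names are hypothetical. The text-mode result is left without an expected value because it depends on the platform, as described in the comment above.

```ruby
require "tmpdir"

# Hypothetical helper: return the raw bytes stored in +path+,
# bypassing any text-mode newline conversion.
def on_disk_bytes(path)
  File.binread(path).bytes
end

Dir.mktmpdir do |dir|
  text_path = File.join(dir, "text.txt")
  bin_path  = File.join(dir, "bin.txt")

  File.write(text_path, "a\r\n")    # default mode; platform-dependent on disk
  File.binwrite(bin_path, "a\r\n")  # binary mode; bytes are written as given

  puts "File.write:    #{on_disk_bytes(text_path).inspect}"
  puts "File.binwrite: #{on_disk_bytes(bin_path).inspect}"
end
```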

Actions #2

Updated by nobu (Nobuyoshi Nakada) 6 months ago

  • Backport changed from 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN to 3.1: REQUIRED, 3.2: REQUIRED, 3.3: REQUIRED
Actions #3

Updated by kou (Kouhei Sutou) 6 months ago

  • Description updated (diff)
Actions #4

Updated by hsbt (Hiroshi SHIBATA) 5 months ago

  • Target version deleted (3.2)