Feature #20210

Invalid source encoding raises ArgumentError, not SyntaxError

Added by kddnewton (Kevin Newton) 9 months ago. Updated 9 months ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:116435]

Description

I was hoping we could change the error that is raised when an invalid source encoding is found from an ArgumentError to a SyntaxError.

First let me say, if this isn't possible for backward compatibility, I understand. Please do not take this as me not caring about backward compatibility.

Right now, if you have the script "# encoding: foo\n\"bar\"", it will raise an ArgumentError, not a SyntaxError. If there are other syntax errors in the file, there's no way to combine them to give feedback to the user. If a user wants to consistently handle the errors coming back from a parse, they currently have to rescue both ArgumentError and SyntaxError.
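A minimal sketch of what that double rescue looks like today, assuming CRuby (RubyVM::InstructionSequence is CRuby-specific). The helper name parse_error_for is hypothetical and only for illustration. Note that SyntaxError is not a StandardError, so a bare rescue would not catch it; both classes have to be named.

```ruby
# Hypothetical helper: compile the source without running it and
# return the exception that was raised, or nil if it compiled cleanly.
def parse_error_for(source)
  RubyVM::InstructionSequence.compile(source)
  nil
rescue ArgumentError, SyntaxError => e
  # An unknown magic-comment encoding currently surfaces as ArgumentError;
  # ordinary parse failures surface as SyntaxError.
  e
end

bad_encoding = "# encoding: foo\n\"bar\""
bad_syntax   = "1 +"

[bad_encoding, bad_syntax].each do |src|
  err = parse_error_for(src)
  puts "#{err.class}: #{err.message.lines.first}" if err
end
```

The two sources are reported through different exception classes, which is why a caller cannot treat "the parse failed" as a single case.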

Ideally it would all be SyntaxError, so we could handle them consistently and append all errors together.

Updated by nobu (Nobuyoshi Nakada) 9 months ago

  • Tracker changed from Misc to Feature

I don't remember the reason for choosing ArgumentError; SyntaxError feels more reasonable.
https://github.com/ruby/ruby/pull/9701

Updated by yui-knk (Kaneko Yuichiro) 9 months ago

I'm wondering which encoding should be used if the parser hits an invalid source encoding like # coding: foo. I think this ticket needs to clarify which encoding is assumed.

Updated by naruse (Yui NARUSE) 9 months ago

Parsing the entire source code with the wrong encoding is not reasonable: in some encodings, including SJIS (Windows-31J), the parse result won't be valid because a multibyte character may contain an ASCII character as its trailing byte. The developer needs to fix the encoding first.
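As an illustration of the trailing-byte problem (the choice of character here is mine, not from the comment above): in Windows-31J the katakana "so" (U+30BD) is encoded as two bytes, and the second byte is 0x5C, the ASCII backslash. A parser reading those bytes under an ASCII-compatible single-byte assumption would see an escape character in the middle of the text.

```ruby
# Encode U+30BD (katakana "so") into Windows-31J and inspect its bytes.
sjis  = "\u30BD".encode("Windows-31J")
bytes = sjis.bytes

puts bytes.map { |b| format("0x%02X", b) }.join(" ")  # prints "0x83 0x5C"

# The trailing byte is identical to the ASCII backslash.
trailing_is_backslash = bytes.last == "\\".ord
puts trailing_is_backslash                            # prints "true"
```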

Updated by Anonymous 9 months ago

Hi. One question:

When parsing begins, what encoding do Prism and parse.y use by default?

Updated by kddnewton (Kevin Newton) 9 months ago

@naruse (Yui NARUSE) I'm fine exiting immediately, I was just hoping to make it a syntax error.

@Edwing123 By default Ruby source assumes UTF-8 unless told otherwise by a magic comment or a command line option.

Updated by mame (Yusuke Endoh) 9 months ago

Discussed at the dev meeting.

We need a good reason to introduce an incompatibility. You say you are fine with the current behavior (exiting immediately), so we can't see any reason to change it.

This is just my idea: it would be great for prism, as a library, to continue parsing even with an invalid encoding magic comment (maybe as ASCII-8BIT?), but it would be good to keep the behavior of the Ruby interpreter as much as possible.

Updated by kddnewton (Kevin Newton) 9 months ago

  • Status changed from Open to Closed

I think that makes sense! Let's keep it as an argument error. Prism will keep parsing for now, but raise the right error.
