Feature #16005
openA variation of Time.iso8601 that can parse yyyy-MM-dd HH:mm:ss
Description
Let me propose a String to Time conversion method that can parse "yyyy-MM-dd HH:mm:ss" format, which is very much similar to Time.iso8601
, but delimits the date part and the time part with a space character.
This format is defined as the "timestamp string" literal in SQL 92 standard: http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt (see P. 90)
and so this format is very widely used as the default datetime / timestamp literal for major existing RDBMS implementations.
PostgreSQL
https://www.postgresql.org/docs/11/datatype-datetime.html#id-1.5.7.13.19.7.2
MySQL
https://dev.mysql.com/doc/refman/8.0/en/datetime.html
SQLite3
https://www.sqlite.org/lang_datefunc.html
In order to handle this conversion in Ruby on Rails framework, we define our own String => Time conversion method
https://github.com/rails/rails/blob/b4c715fe/activemodel/lib/active_model/type/helpers/time_value.rb#L62-L76
and Time => String conversion for now,
https://github.com/rails/rails/blob/b4c715fe/activesupport/lib/active_support/core_ext/time/conversions.rb#L7-L59
and I think it's nicer if we had them in the language level with a faster implementation.
As for the method name, maybe we can name it Time.sql92
, Time.sql
, Time.parse_sql92
or whatever, or maybe we can add an option to Time.iso8601
if it could be regarded as a variation of Time.iso8601
? (https://en.wikipedia.org/wiki/ISO_8601#cite_note-30)
Updated by akr (Akira Tanaka) over 5 years ago
Time object needs timezone offset.
"yyyy-MM-dd HH:mm:ss" doesn't contain it.
Always UTC?
Updated by matsuda (Akira Matsuda) over 5 years ago
How about respecting local timezone with Z
, and doing the UTC without Z
?
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
For many databases, if the timestamp has a fractional component, the fractional component will be included, and if the type includes a timezone, the timezone offset will be included. For example, on PostgreSQL:
SELECT CAST(CAST('2019-07-14 23:21:36.638795-0700' AS timestamp) AS text) AS "v" LIMIT 1
-- "2019-07-14 23:21:36.638795"
SELECT CAST(CAST('2019-07-14 23:21:40.838896-0700' AS timestamptz) AS text) AS "v" LIMIT 1
-- "2019-07-14 23:21:40.838896-07"
I'm guessing the expectation to handle these formats as well (with a variable number of decimal points supported), as Time.iso8601
does?
I would be OK with modifying Time.iso8601
to make either T
or space a valid date/time separator.
Updated by duerst (Martin Dürst) over 5 years ago
matsuda (Akira Matsuda) wrote:
How about respecting local timezone with
Z
, and doing the UTC withoutZ
?
This seems backwards. Timezone 'Z' denotes UTC, see e.g. https://www.timeanddate.com/time/zones/z.
Updated by matsuda (Akira Matsuda) over 5 years ago
This seems backwards. Timezone 'Z' denotes UTC, see e.g. https://www.timeanddate.com/time/zones/z.
Indeed. My mistake.
Updated by matsuda (Akira Matsuda) over 5 years ago
I would be OK with modifying Time.iso8601 to make either T or space a valid date/time separator.
In fact this was my first proposal. I firstly asked @akr (Akira Tanaka) about that exact modification on Time.iso8601
, but he didn't like the idea because that behavior is against ISO 8601 specification.
Updated by akr (Akira Tanaka) over 5 years ago
matsuda (Akira Matsuda) wrote:
I would be OK with modifying Time.iso8601 to make either T or space a valid date/time separator.
In fact this was my first proposal. I firstly asked @akr (Akira Tanaka) about that exact modification on
Time.iso8601
, but he didn't like the idea because that behavior is against ISO 8601 specification.
I said its against XML Schema.
https://www.w3.org/TR/xmlschema-2/
Time[.#]iso8601 is alias of Time[.#]xmlschema.
ISO 8601 defines various representation of date and time.
For example, 1985102T1015Z is valid (10:15 of 102th day of 1985 in UTC).
So, practically, some profile (subset) is required.
XML Schema defines a such profile and it defines that [T] is mandatory.
ISO 8601 itself describes that [T] can be omitted by mutual agreement.
But as far as considering XML Schema as a mutual agreement, we cannot omit [T].
Apart from that, I extracted timestamp related syntax from the SQL 92 standard.
<timestamp literal> ::=
TIMESTAMP <timestamp string>
<timestamp string> ::=
<quote> <date value> <space> <time value> [ <time zone interval> ] <quote>
<date value> ::=
<years value> <minus sign> <months value> <minus sign> <days value>
<time value> ::=
<hours value> <colon> <minutes value> <colon> <seconds value>
<time zone interval> ::=
<sign> <hours value> <colon> <minutes value>
<years value> ::= <datetime value>
<months value> ::= <datetime value>
<days value> ::= <datetime value>
<hours value> ::= <datetime value>
<minutes value> ::= <datetime value>
<seconds value> ::=
<seconds integer value> [ <period> [ <seconds fraction> ] ]
<seconds integer value> ::= <unsigned integer>
<seconds fraction> ::= <unsigned integer>
<datetime value> ::= <unsigned integer>
<unsigned integer> ::= <digit>...
<sign> ::= <plus sign> | <minus sign>
<digit> ::=
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<space> ::= !! space character in character set in use
<quote> ::= '
<plus sign> ::= +
<minus sign> ::= -
<period> ::= .
<colon> ::= :
I feel it is difficult to consider ISO 8601 variant because
months value, days value, hours value, minutes value and
seconds integer value can be one digit (and 3 or more digits).
Also, I'm still not certain that what Time object to be generated
when time zone interval is not given.
Updated by akr (Akira Tanaka) over 5 years ago
(1) The SQL spec. defines the range of year as 0001 to 9999.
(not in the syntax but in another table.)
However Ruby Time object can represent arbitrary integer year.
There is a RDB which works with a year outside of 0001 to 9999.
It seems SQLite3 supports -4173 to 9999.
(-4713-11-24 12:00:00 is Julian day number zero in the proleptic Gregorian calendar).
% sqlite3
SQLite version 3.27.2 2019-02-25 16:06:06
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> SELECT datetime("-4713-11-24 11:59:59");
sqlite> SELECT datetime("-4713-11-24 12:00:00");
-4713-11-24 12:00:00
sqlite> SELECT datetime("9999-12-31 23:59:59");
9999-12-31 23:59:59
sqlite> SELECT datetime("10000-01-01 00:00:00");
sqlite>
I feel our method should extend the "years value" syntax with optional "minus sign" at
beginning.
(2) Another mismatch between Ruby's Time and SQL timestamp is
a resolution of timezone offset.
Ruby can represent Rational offset but SQL supports only minutes.
Olson timezone database supports seconds.
Practically, minutes is enough for current timezones.
But there is a historical timezone which use a offset not multiple of 60 seconds:
Europe/Lisbon has a timezone offset -2205.
% zdump -v Europe/Lisbon|head -4
Europe/Lisbon -9223372036854775808 = NULL
Europe/Lisbon -9223372036854689408 = NULL
Europe/Lisbon Sun Dec 31 23:59:59 1911 UT = Sun Dec 31 23:23:14 1911 LMT isdst=0 gmtoff=-2205
Europe/Lisbon Mon Jan 1 00:00:00 1912 UT = Mon Jan 1 00:00:00 1912 WET isdst=0 gmtoff=0
I feel that we can ignore such offset, though.
(3) Method name idea: sql_timestamp
Updated by akr (Akira Tanaka) over 5 years ago
akr (Akira Tanaka) wrote:
Also, I'm still not certain that what Time object to be generated
when time zone interval is not given.
PostgreSQL has SET TIME ZONE command to set a time zone for a session.
https://www.postgresql.org/docs/11/datatype-datetime.html#DATATYPE-TIMEZONES
"Always UTC" seems not appropriate.
However, I feel that SQL timestamp without time zone interval is dangerous for
communication between RDB and Ruby because it is ambiguous at Autumn DST change
(and other time zone offset change).
For example,
1999-10-31 01:30:00 in US/Pacific can be interpreted as
1999-10-31 08:30:00 UTC and 1999-10-31 09:30:00 UTC.
If the method is expected to be used internally (not for human interaction),
using timestamp with time zone interval is the right way.
So, I think supporting timestamp without time zone interval is not important.
Updated by naruse (Yui NARUSE) almost 4 years ago
https://pubs.opengroup.org/onlinepubs/9695959099/toc.pdf
P.42 says " If the application or the user cannot infer a time zone (for example, from the context of the user’s session) then the date/time value is ambiguous."
- C implementation (time.c)
- Time.sql_timestamp(str) for parser
- It should parse SQL timestamp string
- This should cover Rails's use case
- Rails should return the result if is_utc_ is true, and re-create an object with Time.local if is_utc_ is false.
- https://github.com/rails/rails/blob/5cfd58bbfb8425ab1931c618d98b649bab059ce6/activemodel/lib/active_model/type/date_time.rb
- https://github.com/rails/rails/blob/5cfd58bbfb8425ab1931c618d98b649bab059ce6/activemodel/lib/active_model/type/time.rb
- https://github.com/rails/rails/blob/b4c715fe/activemodel/lib/active_model/type/helpers/time_value.rb#L62-L76
- https://github.com/rails/rails/blob/533c5c3a532a01ca5ae6ddd3f04137b13e9271ab/activemodel/lib/active_model/type/helpers/time_value.rb#L64-L85
- Time#sql_timestamp for serializer
Updated by akr (Akira Tanaka) almost 4 years ago
We (akr, naruse and matsuda) discussed this issue.
- implementing this method, Time.sql_timestamp(string), in time.c would be faster implementation than Ruby version because it avoids several string object allocations:
/(\d+)-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)/ =~ str; Time.new($1.to_i, $2.to_i, $3.to_i, $4.to_i, $5.to_i, $6.to_i);
$1...$6 allocates strings but C-implementation can avoid them -
Time.sql_timestamp("yyyy-mm-dd HH:MM:SS")
returns a Time object in UTC mode andTime.sql_timestamp("yyyy-mm-dd HH:MM:SS +HH:MM")
returns a Time object in fixed-offset mode. If an application needs to interpret "yyyy-mm-dd HH:MM:SS" as local time, the application can instantiate a new Time object ast = Time.sql_timestamp(string); t = Time.new(t.year, t.month, t.day, t.hour, t.min, t.sec+t.subsec) if t.utc?
Updated by naruse (Yui NARUSE) almost 4 years ago
kamipo says Rails 6.1 uses their RDB driver's implementation to decode serialized string into Time object.
https://twitter.com/kamipo/status/1342820891256733696
For example PostgreSQL as below:
- https://github.com/rails/rails/pull/35062
- https://github.com/ged/ruby-pg/blob/98610b13d728e63f97db2cff1a25e091192df84a/ext/pg_text_decoder.c#L527
Maybe this issue is less important now.
Updated by nobu (Nobuyoshi Nakada) almost 4 years ago
What about making Time.new(string)
to work more?
Time.new("2020-12-25 17:09:16 +0900") #=> 2020-01-01 00:00:00 +0900
Time.new("2020-12-25 08:09:16 UTC") #=> 2020-01-01 00:00:00 +0900
It is surprising that only the first number is parsed as the year.
Updated by akr (Akira Tanaka) almost 4 years ago
naruse (Yui NARUSE) wrote in #note-11:
- Time#sql_timestamp for serializer
Time#sql_timestamp needs to convert rational subsec to finite digits (or raise an error).
truncate to nanoseconds?
Updated by mame (Yusuke Endoh) over 3 years ago
- Related to Feature #18033: Time.new to parse a string added
Updated by hsbt (Hiroshi SHIBATA) over 2 years ago
- Target version changed from 3.1 to 3.2