@riverszhang89
Contributor
Our customers have an issue using Spark to create a dataframe on a datetime column: Spark interpolates the provided datetime string but drops the "T" separator when filling the range.

It seems that omitting the "T" character is fairly acceptable, and ISO 8601 does allow it to be omitted in some cases. This is a very simple patch to allow it.

@roborivers left a comment

Coding style check: Error ⚠.
Smoke testing: Success ✓.
Cbuild submission: Error ⚠.
Regression testing: 3/625 tests failed ⚠.

The first 10 failing tests are:
udf
consumer_non_atomic_default_consumer_generated
consumer

@dorinhogea
Contributor

If we allow strings without T to be converted to datetimes, we have the following conundrum: is "20250101" a datetime or an epoch value?

@riverszhang89 force-pushed the datetime_literal branch 2 times, most recently from de0a305 to c8c985d on August 16, 2025 at 11:58
@riverszhang89
Contributor Author

> If we allow strings without T to be converted to datetimes, we have the following conundrum: is "20250101" a datetime or an epoch value?

If the string contains only digits, it's treated as a unix timestamp (code). So "20250101" will still be an epoch value! I've also added a test for it!

@roborivers left a comment

Coding style check: Error ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 5/625 tests failed ⚠.

The first 10 failing tests are:
sc_transactional_rowlocks_generated
udf
consumer_non_atomic_default_consumer_generated
consumer
sc_downgrade

@akshatsikarwar
Contributor

This breaks UDF and consumer tests?

Our customers have issue using spark to create a dataframe on a
datetime column, because spark interpolates the provided datetime
string, but drops the "T" separator when filling the range.

It seems that omitting the "T" character is fairly acceptable, and
ISO 8601 does allow it to be omitted in a few cases. This is a
very simple patch to allow it.

Signed-off-by: Rivers Zhang <[email protected]>
@riverszhang89
Contributor Author

> This breaks UDF and consumer tests?

There's one case that I overlooked, e.g. "2025-01-01 UTC". Apparently the type code expects the timezone to have a leading space. But because we now additionally allow spaces between the date portion and the rest of the datetime, we may not have a leading space when we read the timezone. For instance:

2025-01-01 UTC
          ^ old code would stop here, timezone would be "[space]UTC"
   vs
   
2025-01-01 UTC
           ^ new code stops here, timezone is "UTC"

@roborivers left a comment

Coding style check: Error ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 2/625 tests failed ⚠.

The first 10 failing tests are:
cdb2_close
sc_downgrade
