Implement DataType::{Date32,Date64} => Variant::Date #8081

Merged
merged 1 commit into from
Aug 15, 2025

Conversation

superserious-dev
Contributor

@superserious-dev superserious-dev commented Aug 7, 2025

Which issue does this PR close?

Rationale for this change

Adds Date32, Date64 conversions to the cast_to_variant kernel

What changes are included in this PR?

  • adds fallibility to cast macro
  • conversion of DataType::{Date32, Date64} => Variant::Date

Are these changes tested?

Yes, additional unit tests have been added.
Currently there is no negative test for a Date64Array with a date element out-of-range.

Are there any user-facing changes?

Yes, adds new type conversions to kernel

for i in 0..array.len() {
if array.is_null(i) {
$builder.append_null();
continue;
}
- let cast_value = $cast_fn(array.value(i));
+ let cast_value = $cast_fn(array.value(i))?;
Contributor Author

@superserious-dev superserious-dev Aug 7, 2025

New tweak to the macro here. Made cast_fn fallible so that an ArrowError::CastError can be returned if something goes wrong.

Contributor Author

@superserious-dev superserious-dev Aug 7, 2025

A potential downside to this tweak is that it forces the caller's closure to handle a Result even if the cast is known to be infallible, like the case of f16 -> f32.
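
As a rough illustration of the trade-off raised in these two comments, here is a minimal, std-only sketch. It is not the actual cast_to_variant macro: `convert_values!`, `CastError`, `narrow`, and `widen` are hypothetical stand-ins, and an `i32 -> i64` widening plays the role of the infallible `f16 -> f32` case.

```rust
// Stand-in error type; the real kernel would return ArrowError::CastError.
#[derive(Debug, PartialEq)]
enum CastError {
    OutOfRange(i64),
}

// Hypothetical, simplified macro: `$cast_fn` returns a Result and `?`
// propagates the first error to the enclosing function.
macro_rules! convert_values {
    ($values:expr, $cast_fn:expr) => {{
        let mut out = Vec::with_capacity($values.len());
        for v in $values.iter() {
            match v {
                None => out.push(None), // nulls pass through untouched
                Some(x) => out.push(Some($cast_fn(*x)?)),
            }
        }
        out
    }};
}

// A genuinely fallible cast: narrowing i64 to i32 can overflow.
fn narrow(values: &[Option<i64>]) -> Result<Vec<Option<i32>>, CastError> {
    Ok(convert_values!(values, |v: i64| i32::try_from(v)
        .map_err(|_| CastError::OutOfRange(v))))
}

// An infallible cast: widening never fails, yet the closure still has to
// wrap its result in `Ok` to satisfy the fallible macro.
fn widen(values: &[Option<i32>]) -> Result<Vec<Option<i64>>, CastError> {
    Ok(convert_values!(values, |v: i32| Ok::<i64, CastError>(i64::from(v))))
}
```

Because the macro body uses `?`, it can only be expanded inside a function that itself returns a `Result`, which is part of the cost being weighed here.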

Date64Type,
as_primitive,
|v: i64| {
Date64Type::to_naive_date_opt(v).ok_or(ArrowError::CastError(format!(
Contributor Author

If we don't want to tweak the macro, we could call expect or unwrap here.

Contributor

Another behavior which might be better is writing Variant::Null when the value can not be converted 🤔

Some(Variant::Date(NaiveDate::from_ymd_opt(2025, 8, 1).unwrap())),
Some(Variant::Date(NaiveDate::MAX)),
],
);
Contributor Author

Could potentially add a negative test here, although it's difficult to construct an invalid Date64Type.
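
One possible starting point for such a negative test, sketched with std only. It assumes a Date64 value is a raw `i64` count of milliseconds since the Unix epoch, and that the target calendar type covers roughly ±262,000 years (an assumption about chrono's `NaiveDate`, not checked against the crate here). The names `date64_days`, `approx_years_from_epoch`, and `ASSUMED_MAX_YEARS` are hypothetical.

```rust
const MS_PER_DAY: i64 = 86_400_000;

// Assumed order of magnitude for the supported calendar range, in years.
const ASSUMED_MAX_YEARS: i64 = 262_000;

/// Whole days since 1970-01-01 for a Date64 millisecond value.
/// `div_euclid` floors toward negative infinity, so -1 ms maps to day -1.
fn date64_days(ms: i64) -> i64 {
    ms.div_euclid(MS_PER_DAY)
}

/// Very rough year offset from 1970 (ignores leap years; only the order
/// of magnitude matters for this check).
fn approx_years_from_epoch(ms: i64) -> i64 {
    date64_days(ms) / 365
}

/// True when the raw value is far outside the assumed calendar range,
/// which makes it a candidate input for an out-of-range test.
fn is_out_of_assumed_range(ms: i64) -> bool {
    approx_years_from_epoch(ms).abs() > ASSUMED_MAX_YEARS
}
```

Under these assumptions a raw value such as `i64::MAX` lies hundreds of millions of years past the epoch, so an array built from it may be enough to exercise the failure path.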

Contributor

@alamb alamb left a comment

Thanks @superserious-dev -- sorry for the delay in reviewing. I left a comment about returning Variant::Null vs an error on conversion failure -- let me know what you think

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Aug 14, 2025
@superserious-dev
Contributor Author

> Thanks @superserious-dev -- sorry for the delay in reviewing. I left a comment about returning Variant::Null vs an error on conversion failure -- let me know what you think

That's an interesting idea. One issue could be that a Null value in the Output Variant would represent 2 things: either a) an error in the casting process or b) a Null value in the input.

For now, I undid the macro change and used unwrap instead, to avoid modifying the macro. Once all the conversions are in, it could be useful to unify the error handling so that they all align on one approach (i.e., unwrap vs. Err vs. Variant::Null).
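
The ambiguity described above can be sketched with a small std-only example. `Out` and `to_variant_or_null` are hypothetical stand-ins for the Variant type and the conversion, and the failure condition (days that do not fit in an `i32`) is chosen only to keep the sketch free of external crates.

```rust
// Stand-in for the output Variant; Date holds days since the Unix epoch.
#[derive(Debug, PartialEq)]
enum Out {
    Null,
    Date(i32),
}

// Hypothetical conversion that writes Null whenever the cast fails.
fn to_variant_or_null(input: Option<i64>) -> Out {
    const MS_PER_DAY: i64 = 86_400_000;
    match input {
        // Case (b): the input slot itself was null.
        None => Out::Null,
        Some(ms) => match i32::try_from(ms.div_euclid(MS_PER_DAY)) {
            Ok(days) => Out::Date(days),
            // Case (a): the value could not be converted.
            Err(_) => Out::Null,
        },
    }
}
```

A null input and an unconvertible input both come out as `Out::Null`, so a reader of the output cannot tell the two cases apart without extra information.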

Contributor

@alamb alamb left a comment

Thank you @superserious-dev

generic_conversion!(
Date64Type,
as_primitive,
|v: i64| { Date64Type::to_naive_date_opt(v).unwrap() },
Contributor

this is probably good for now. I'll file a follow on ticket to track handling this case

@alamb alamb merged commit ace8dad into apache:main Aug 15, 2025
12 checks passed
@alamb
Contributor

alamb commented Aug 15, 2025

Thank you @superserious-dev

@superserious-dev superserious-dev deleted the date-cast-to-variant branch August 15, 2025 13:31
Labels
parquet-variant parquet-variant* crates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Variant]: Implement DataType::Date32 / DataType::Date64 support for cast_to_variant kernel
2 participants