I can't figure out how to model this partition mapping the best way. #31865
isaacsanders started this conversation in General
Replies: 0 comments
A vendor has an SFTP server. The files on the server all contain the same type of data. However, the files are split up in different ways.
The data is focused on a specific trait, called `trait`. There are about 50 different values for `trait`. The data happens all day, almost every day.

For data from 2010-01-01 to 2020-12-31, the files are split by `trait` and into two parts, with 2015-01-01 being the date on which the split occurs. From 2021-01-01 forward, the files are split only by date, with all values of `trait` being in the same file, together.

Here is an example file listing:
I currently use two assets to write data from both types of files into the same location. I want to use an external asset to represent this location. I want the partitions of both upstream assets to map to the observed partitions of the downstream external asset.
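To make the desired mapping concrete, here is a minimal plain-Python sketch of it. It uses no orchestration-library API. It assumes (none of this is stated above) that the downstream location is partitioned by `(date, trait)`, that the first legacy part covers 2010-01-01 through 2014-12-31 and the second covers 2015-01-01 through 2020-12-31, and that one downstream key exists per calendar day. All function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterator, List, Tuple

# Boundaries taken from the description above.
LEGACY_START = date(2010, 1, 1)
LEGACY_SPLIT = date(2015, 1, 1)
DAILY_START = date(2021, 1, 1)

# Assumed downstream key shape: (ISO date, trait value).
DownstreamKey = Tuple[str, str]


def legacy_file_range(part: int) -> Tuple[date, date]:
    """Half-open [start, end) date range covered by a legacy file part.

    Assumption: part 1 ends the day before the split date and
    part 2 starts on the split date.
    """
    if part == 1:
        return LEGACY_START, LEGACY_SPLIT
    if part == 2:
        return LEGACY_SPLIT, DAILY_START
    raise ValueError(f"legacy files have parts 1 and 2, got {part}")


def downstream_keys_for_legacy(trait: str, part: int) -> Iterator[DownstreamKey]:
    """Downstream keys fed by one legacy (trait, part) file."""
    start, end = legacy_file_range(part)
    day = start
    while day < end:
        yield (day.isoformat(), trait)
        day += timedelta(days=1)


def downstream_keys_for_daily(day: date, traits: List[str]) -> List[DownstreamKey]:
    """Downstream keys fed by one date-only file (2021-01-01 onward)."""
    if day < DAILY_START:
        raise ValueError(f"{day} predates the date-only file layout")
    return [(day.isoformat(), trait) for trait in traits]
```

Under these assumptions, each legacy file fans out to many downstream keys for a single `trait`, while each date-only file fans out to one key per `trait` for a single day. The two upstream assets therefore never write to the same downstream key.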