Dataset add operator. Fixes #17 #19
…, query builders, constants. (Motivation: this will help extend the code base to Pangaea.) 2. Support multi-valued parameters by changing the input type to str/list[str], coercing list values to a pipe-delimited (|) string, and adding the respective {name}AndOr flag to the query. 3. Lay the groundwork for using the logging package. 4. Change input params with kwargs.
…am; changed patches for utils.api.http.get
pyleotups/core/Dataset.py
if not same:
    log.warning(
        "Dataset union: duplicate StudyID %s with differing content. "
        "Keeping left-hand version.", sid
Add:
(If doing C = A + B, contents of A will be kept)
pyleotups/core/Dataset.py
for sid, study in other.studies.items():
    if sid in merged.studies:
        try:
            same = (merged.studies[sid].to_dict() == study.to_dict())
Add a comment on this line stating that `same` refers to the study content and not the StudyId.
Please add pytest tests for A = A + B and C = A + B.
Test with two datasets (do an xmlid or NOAA study ID query).
Case 1: A and B have different IDs, so C should contain both A's and B's IDs.
Case 2: A and B have the same ID, so C should look like A.
Case 3: A and B have the same ID but different content, so the warning should be printed and C should still look like A.
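One possible shape for these three cases is sketched below. It runs against hypothetical stand-ins (`FakeStudy`, `FakeDataset`, and the helper `make` are illustrative names, not part of the package) instead of live xmlid or NOAA study ID queries, and it assumes the union logs through the standard `logging` module so that pytest's `caplog` fixture can capture the warning. The real tests would swap the stand-ins for the actual Dataset class and query results.

```python
import logging

log = logging.getLogger("dataset_union_sketch")  # assumed logger name


class FakeStudy:
    """Hypothetical stand-in for a study object exposing to_dict()."""

    def __init__(self, study_id, title):
        self.study_id = study_id
        self.title = title

    def to_dict(self):
        return {"StudyID": self.study_id, "title": self.title}


class FakeDataset:
    """Hypothetical stand-in: maps StudyID -> study, left operand wins."""

    def __init__(self, studies=None):
        self.studies = dict(studies or {})

    def __add__(self, other):
        merged = FakeDataset(self.studies)
        for sid, study in other.studies.items():
            if sid in merged.studies:
                if merged.studies[sid].to_dict() != study.to_dict():
                    log.warning(
                        "Dataset union: duplicate StudyID %s with differing "
                        "content. Keeping left-hand version.", sid)
                continue
            merged.studies[sid] = study
        return merged


def make(*pairs):
    """Build a FakeDataset from (study_id, title) pairs."""
    return FakeDataset({sid: FakeStudy(sid, title) for sid, title in pairs})


def test_case1_different_ids_are_combined():
    a, b = make(("100", "x")), make(("200", "y"))
    c = a + b                      # C = A + B
    assert set(c.studies) == {"100", "200"}
    a = a + b                      # A = A + B
    assert set(a.studies) == {"100", "200"}


def test_case2_same_id_same_content_looks_like_a():
    a, b = make(("100", "x")), make(("100", "x"))
    c = a + b
    assert list(c.studies) == ["100"]
    assert c.studies["100"] is a.studies["100"]


def test_case3_differing_content_warns_and_keeps_a(caplog):
    a, b = make(("100", "x")), make(("100", "changed"))
    with caplog.at_level(logging.WARNING):
        c = a + b
    assert "duplicate StudyID 100" in caplog.text
    assert c.studies["100"].to_dict() == a.studies["100"].to_dict()
```

Cases 1 and 2 need no fixtures; only Case 3 depends on `caplog`, so it is the one that would need adjusting if the warning were emitted some other way.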
This feature implements set-union-like addition of two Dataset objects. In simpler words, two different searches can be combined.
A Dataset holds the StudyId(s) and the respective study object(s) from the response retrieved by the query.
With two or more Dataset objects, datasets can now be added as:
C = A+B
A = A+B
Multiple Study objects can be created for the same StudyId across multiple concurrent searches. However, the data stored in these objects is expected to be the same. Programmatically, this is checked before merging two datasets whenever an overlapping ID is found, to avoid incorrect deletion.
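The merge rule described above can be sketched roughly as follows. This is a minimal illustration pieced together from the diff hunks under review, not the actual implementation: the `Study` and `Dataset` classes here are hypothetical stand-ins, and the constructor signature and the `data` field are assumptions.

```python
import logging

log = logging.getLogger(__name__)


class Study:
    """Hypothetical stand-in for the real study object."""

    def __init__(self, study_id, data):
        self.study_id = study_id
        self.data = data

    def to_dict(self):
        return {"StudyID": self.study_id, "data": self.data}


class Dataset:
    """Hypothetical stand-in: maps StudyID -> Study."""

    def __init__(self, studies=None):
        self.studies = dict(studies or {})

    def __add__(self, other):
        if not isinstance(other, Dataset):
            return NotImplemented
        # Start from a shallow copy of the left operand so neither input is mutated.
        merged = Dataset(self.studies)
        for sid, study in other.studies.items():
            if sid in merged.studies:
                # `same` compares study content; the StudyIDs already match here.
                same = merged.studies[sid].to_dict() == study.to_dict()
                if not same:
                    log.warning(
                        "Dataset union: duplicate StudyID %s with differing "
                        "content. Keeping left-hand version.", sid)
                continue  # overlapping ID: the left-hand study is kept
            merged.studies[sid] = study
        return merged


A = Dataset({"1": Study("1", "a"), "2": Study("2", "b")})
B = Dataset({"2": Study("2", "b"), "3": Study("3", "c")})

C = A + B
print(sorted(C.studies))           # ['1', '2', '3']

original_A = A
A = A + B                          # rebinds the name A to a new merged Dataset
print(sorted(original_A.studies))  # ['1', '2']
```

In this sketch only `__add__` is defined, so `A = A + B` builds a new object and rebinds the name; the dataset that `A` previously referred to is left unchanged.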
Freehand tests have been added to the example notebook. pytest tests will be added soon.