📝 Refactor curate guide #2957

Merged
merged 3 commits into from
Jul 21, 2025

Conversation

sunnyosun
Member

@sunnyosun sunnyosun commented Jul 17, 2025

| Before | After |
| --- | --- |
| Screenshot 2025-07-21 at 11 22 25 | Screenshot 2025-07-21 at 11 22 46 |
| Screenshot 2025-07-21 at 11 23 38 | Screenshot 2025-07-21 at 11 23 59 |
| NA | Screenshot 2025-07-21 at 11 26 13 |
| NA | Screenshot 2025-07-21 at 11 26 41 |
| Screenshot 2025-07-21 at 11 27 15 | Screenshot 2025-07-21 at 11 27 33 |
| NA | Screenshot 2025-07-21 at 11 28 16 |
| NA | Screenshot 2025-07-21 at 11 28 45 |
| NA | Screenshot 2025-07-21 at 11 29 04 |

codecov bot commented Jul 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.72%. Comparing base (dcea0f6) to head (ae2b829).
Report is 17 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2957      +/-   ##
==========================================
- Coverage   91.89%   89.72%   -2.18%     
==========================================
  Files          73       70       -3     
  Lines       11442     9890    -1552     
==========================================
- Hits        10515     8874    -1641     
- Misses        927     1016      +89     

github-actions bot commented Jul 17, 2025

Deployment URL: https://350459c1.lamindb.pages.dev

@falexwolf
Member

I'm sure this is great! It's just impossible to see what changed and why.

Could you write a brief summary of rationales and add Before and After screenshots?

@falexwolf
Member

image

Three comments here:

  1. I'd not say "messy, real-world data" because the guide is full of synthetic examples. I also don't think we have to stress "messiness" because validation is also for internal data.
  2. I think the emojis break style because we don't use them anywhere else. If we want to consistently adopt them, sure, but otherwise I think it's prettier without.
  3. The sentences in the enumeration are so long that maybe a period at the end wouldn't hurt now.

@falexwolf
Member

falexwolf commented Jul 21, 2025

image

One could add: "similar to pydantic.BaseModel for dictionaries, and pandera.DataFrameSchema and pyarrow.lib.Schema for tables, but supporting more complicated data structures."

@falexwolf
Member

I'd say:

image

What Features (dimensions) exist in your dataset

@falexwolf
Member

image

I'd say gene-derived features.

Member

@falexwolf falexwolf left a comment

I think it's great!

Just very few stylistic remarks.

@sunnyosun sunnyosun marked this pull request as ready for review July 21, 2025 09:55
@sunnyosun sunnyosun merged commit 36198f3 into main Jul 21, 2025
12 of 14 checks passed
@sunnyosun sunnyosun deleted the refactor-curate branch July 21, 2025 09:58
2 participants