Conversation

@AdnanElAssadi56
Contributor

If you add a model or a dataset, please add the corresponding checklist:

@KennethEnevoldsen
Contributor

@Samoed, can I ask you to take this one?

lambda x: x[self.label_column_name] is not None and len(x[self.label_column_name]) > 0
)

# Only subsample splits that are larger than n_samples to avoid division by zero
Member

Not sure why this is required

Contributor Author

The label column had the wrong type, and handling it was taking some time.
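For context, here is a rough sketch of what the two steps in the snippet under review seem to do: drop rows whose label is missing or empty, then subsample only when the split is larger than n_samples. This is not the code from this PR. It uses plain Python lists of dicts instead of the `datasets` library, and the helper names `drop_unlabeled` and `subsample_if_larger`, the `seed` parameter, and the example rows are all hypothetical.

```python
import random


def drop_unlabeled(rows, label_column_name):
    """Keep only rows whose label is present and non-empty (hypothetical helper)."""
    return [
        row
        for row in rows
        if row[label_column_name] is not None and len(row[label_column_name]) > 0
    ]


def subsample_if_larger(rows, n_samples, seed=42):
    """Subsample only when the split has more than n_samples rows (hypothetical helper).

    Splits at or below n_samples are returned unchanged, so no sampling
    arithmetic is ever done on a split that is too small.
    """
    if len(rows) <= n_samples:
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(rows, n_samples)


# Made-up example rows, for illustration only.
rows = [
    {"label": ["a"]},
    {"label": []},
    {"label": None},
    {"label": ["b", "c"]},
]
kept = drop_unlabeled(rows, "label")
small = subsample_if_larger(kept, n_samples=5)  # split too small: left as-is
one = subsample_if_larger(kept, n_samples=1)    # split large enough: sampled
```

With the guard in place, a split that ends up small after filtering passes through untouched instead of going into the sampling step.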

@Samoed
Member

Samoed commented Jan 7, 2026

Can you reupload this dataset? You can do it with task.push_dataset_to_hub(f"mteb/{task.metadata.name}"). I tried to do this on Kaggle, but the casting took too much memory. Reuploading would help close #3499.

@Samoed
Member

Samoed commented Jan 7, 2026

By the way, do you know why we're using only the HSN subset?

@AdnanElAssadi56
Contributor Author

@imadtyx I see that you first integrated this. Could you tell us why only the HSN subset is used?
