Labels
r: This issue is specific to the R language cookbook
Description
It would be great if there were a way to sample rows from an Arrow dataset. I put together this somewhat hacky example, but I bet there's something a bit more elegant.
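library(arrow)
library(dplyr)
library(nycflights13)

# Add a unique row id so rows can be selected after writing to disk
flights <- nycflights13::flights
flights$id <- seq_len(nrow(flights))

# Write one Parquet file per month into a dataset directory
dir.create("flight_ds", showWarnings = FALSE)
for (i in unique(flights$month)) {
  out <- filter(flights, month == i)
  arrow::write_parquet(out, paste0("flight_ds/", i, ".parquet"))
}

ds <- arrow::open_dataset("flight_ds")

# Draw 100 ids at random, then pull only the matching rows
sampled_ids <- sample(flights$id, 100)
ds %>%
  filter(id %in% sampled_ids) %>%
  collect()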
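One possible tidy-up, offered as a sketch rather than an established recipe: the ids can be drawn from the dataset's own row count, so the full data frame does not need to be in memory when sampling. The helper name `sample_ds_rows` is hypothetical, and it assumes the dataset already has an integer `id` column running from 1 to the number of rows with no gaps, as in the example above. A small built-in data frame stands in for the flights data to keep the sketch self-contained.

```r
library(arrow)
library(dplyr)

# Hypothetical helper (not part of arrow): assumes `ds` has an integer
# column `id` holding 1..N with no gaps.
sample_ds_rows <- function(ds, n) {
  total <- nrow(ds)                               # row count of the dataset
  ids <- sample.int(total, size = min(n, total))  # sample without replacement
  ds %>%
    filter(id %in% ids) %>%                       # filter runs in Arrow before collecting
    collect()
}

# Small stand-in dataset: mtcars with an id column, partitioned by cyl
df <- mtcars
df$id <- seq_len(nrow(df))
ds_path <- file.path(tempdir(), "mtcars_ds")
write_dataset(df, ds_path, partitioning = "cyl")

ds <- open_dataset(ds_path)
res <- sample_ds_rows(ds, 10)
```

This still depends on the `id` column having been written into the files, so it does not remove the need for that preprocessing step; it only avoids keeping the original data frame around to sample from.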