Tidyup 7: Recoding and replacing values in the tidyverse #29
`recode_values()` is the boss for Likert scale responses. So great for a 100 question questionnaire with 5 item Likerts for every Q.
I'm not sure if this is intended, but it's currently not possible to change the data type with

```r
penguins |>
  mutate(
    size = body_mass |>
      replace_when(
        body_mass > 4750 ~ "large",
        body_mass > 3550 ~ "medium",
        body_mass > 0 ~ "small"
      )
  )
#> Error in `mutate()`:
#> ℹ In argument: `size = replace_when(...)`.
#> Caused by error in `replace_when()`:
#> ! Can't convert `..1 (right)` <character> to <integer>.
#> Run `rlang::last_trace()` to see where the error occurred.
```

It also looks like if we wanted to use

```r
penguins |>
  mutate(
    size = body_mass |>
      replace_when(
        body_mass > 4750 ~ 3,
        body_mass > 3550 ~ 2,
        TRUE ~ 1
      )
  )
```

The proposal doesn't say that
@JoFrhwld to be extremely clear, these 3 functions join
And that's exactly the point!
I'm sure you have thought about it, but I didn't see it explicitly stated. I'm going to assume that if there are duplicate values in
@EmilHvitfeldt yep, same idea as

```r
dplyr::replace_values(1, from = c(1, 1), to = c(2, 3))
#> [1] 2
dplyr::replace_values(1, 1 ~ 2, 1 ~ 3)
#> [1] 2
```

Created on 2025-08-05 with reprex v2.1.1
What is the expected use with factors? If the lookup tbl contains a factor in the
What is the relationship with
This all looks really great btw.
Easy to read link:
https://github.com/tidyverse/tidyups/blob/feature/007/007-tidyverse-recoding-and-replacing.md
We’d love to get your thoughts on this proposal to add new column recoding and replacing tools to dplyr. The goal is to fill some important gaps left by `case_when()` and `case_match()` by creating a slightly larger family of interconnected functions. Specifically, we wish to improve on:

- Recoding columns, both interactively and programmatically (i.e. with a precomputed lookup table, like `plyr::mapvalues()`)
  - `case_when()`
  - `recode_values()`
- Replacing a few values within an existing column. In particular by providing obviously named, easy to use, and type stable tools for doing so, which function as enhanced forms of `[<-` and `base::replace()`.
  - `replace_when()`
  - `replace_values()`
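
For readers who want a concrete picture of the two gaps, here is a rough base R sketch. It deliberately avoids the proposed functions, since their final signatures are defined in the tidyup rather than here, and the data and variable names (`x`, `lookup`, `y`) are made up for illustration.

```r
# Hypothetical data, for illustration only.
x <- c("a", "b", "c", "a")

# 1. Recoding with a precomputed lookup table, done by hand with match().
lookup <- data.frame(
  from = c("a", "b"),
  to = c("apple", "banana"),
  stringsAsFactors = FALSE
)
recoded <- lookup$to[match(x, lookup$from)]
# `recoded` is c("apple", "banana", NA, "apple"): the unmatched "c" becomes NA.

# 2. Replacing a few values with `[<-` is not type stable:
#    assigning a string silently turns the integer vector into character.
y <- c(1L, 5L, 10L)
y[y > 4L] <- "big"
# `y` is now c("1", "big", "big") and class(y) is "character".
```

The first half shows the kind of lookup-table recoding that currently has to be spelled out manually. The second half shows the silent type change that a type stable replacement tool would presumably turn into an error.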

Please feel free to contribute however you feel comfortable: you're welcome to comment here on individual lines of the tidyup, or open bigger discussion topics in a new issue. If there are things you’d prefer to discuss in private, please feel free to email me. I’ll plan to close the discussion on Aug 18 and we will advance to the implementation stage.