Description
🚀 Feature request
Is it possible to sample the negatives for the Two-Tower model from a column provided by the input data?
For example, we want to sample negatives from the list of items we displayed to a user in the future. The data schema would look like this:
item_id_inputs | item_id_negatives | item_id_target
[1, 2, 3, ...] | [4, 5, ...]       | 3
We could get the negatives for each row from item_id_negatives. Usually item_id_negatives will be about the same length for all users, but we could treat that column as ragged and slice every row to the shortest list in the batch when building the cost function matrix.
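To make the idea concrete, here is a minimal NumPy sketch of the slicing step, not an existing library API. The function names (`build_logits`, `softmax_loss`), the dot-product scoring, and the softmax cross-entropy loss are all assumptions made for illustration; the embeddings stand in for the outputs of the two towers.

```python
import numpy as np

def build_logits(user_emb, item_table, targets, negatives):
    """Score each row's target and its own negatives (hypothetical helper).

    user_emb:   (B, D) array, output of the user/query tower
    item_table: (N, D) array, item tower embeddings indexed by item id
    targets:    length-B sequence of positive item ids (item_id_target)
    negatives:  length-B ragged list of negative id lists (item_id_negatives)
    """
    # Slice every row to the shortest negatives list in the batch
    min_len = min(len(row) for row in negatives)
    neg_ids = np.array([row[:min_len] for row in negatives], dtype=np.int64)  # (B, K)

    # Column 0 holds the positive, columns 1..K hold that row's negatives
    pos_ids = np.asarray(targets, dtype=np.int64)[:, None]                    # (B, 1)
    cand_ids = np.concatenate([pos_ids, neg_ids], axis=1)                     # (B, 1+K)

    cand_emb = item_table[cand_ids]                                           # (B, 1+K, D)
    return np.einsum("bd,bkd->bk", user_emb, cand_emb)                        # (B, 1+K)

def softmax_loss(logits):
    """Softmax cross-entropy with the positive item in column 0."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[:, 0].mean()

# Toy batch: two rows with ragged negatives of length 3 and 2
rng = np.random.default_rng(0)
item_table = rng.normal(size=(10, 4))
user_emb = rng.normal(size=(2, 4))
logits = build_logits(user_emb, item_table, targets=[3, 1], negatives=[[4, 5, 6], [7, 8]])
print(logits.shape)  # (2, 3): one positive plus two negatives per row
print(softmax_loss(logits))
```

Slicing to the shortest list keeps the matrix dense at the cost of discarding some negatives; padding with a mask would be the alternative if the lengths vary a lot.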
Motivation
We're planning to sample negatives from items the user has been exposed to in the future, so random or in-batch negative sampling won't get us there. This feature could also be useful for anybody working on problems related to implicit negative feedback, dwell time, etc. that require more control over how negatives are sampled in Two-Tower models.