[FEA] Two Tower negatives sampled from a dedicated input column  #885

@EderSantana

🚀 Feature request

Is it possible to sample the negatives for the Two-Tower model from a column provided in the input data?
For example, we want to sample negatives from the list of items displayed to the user in the future. The data schema would look like this:

item_id_inputs | item_id_negatives | item_id_target
[1, 2, 3, ...] | [4, 5, ...]       | 3

We could get the negatives for each row from item_id_negatives. Usually item_id_negatives will be about the same length for all users, but we could treat that column as ragged and slice every row to the shortest list in the batch when building the cost function matrix.
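
To make the slicing idea concrete, here is a rough NumPy sketch. It is not the library's API: `truncate_negatives`, `sampled_softmax_loss`, `user_emb`, and `item_table` are hypothetical names, and using a softmax over the target plus its per-row negatives is just one assumed choice of cost function.

```python
import numpy as np


def truncate_negatives(negatives):
    """Slice every row's negative list to the shortest length in the batch.

    `negatives` is a ragged list of lists of item ids (hypothetical input).
    Returns a dense (batch, K) integer array, where K is the shortest length.
    """
    min_len = min(len(row) for row in negatives)
    return np.array([row[:min_len] for row in negatives], dtype=np.int64)


def sampled_softmax_loss(user_emb, item_table, targets, negatives):
    """Cross-entropy over [target, negatives...] for each row.

    user_emb:   (batch, dim) output of the user/query tower (assumed given)
    item_table: (num_items, dim) item embeddings (stand-in for the item tower)
    targets:    (batch,) positive item id per row
    negatives:  ragged list of per-row negative item ids
    """
    neg_ids = truncate_negatives(negatives)                 # (batch, K)
    # Column 0 holds the positive, columns 1..K hold that row's negatives.
    cand_ids = np.concatenate([targets[:, None], neg_ids], axis=1)
    cand_emb = item_table[cand_ids]                         # (batch, 1+K, dim)
    logits = np.einsum("bd,bkd->bk", user_emb, cand_emb)    # (batch, 1+K)
    # Numerically stable log-softmax along the candidate axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive always sits in column 0.
    return -log_probs[:, 0].mean()
```

Note that slicing to the shortest row discards some negatives from the longer rows, and a row with an empty negatives list would leave the whole batch with no negatives. Padding plus a mask would be an alternative if that matters.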

Motivation

We're planning to sample negatives from items the user will be exposed to in the future, so random or in-batch negative sampling won't get us there. This feature could also be useful for anybody working on problems related to implicit negative feedback, dwell time, etc., that require more control over how negatives are sampled in Two-Tower models.
