Some questions about the prototype matrix

Hi, thanks for your code! I have a question about the model.
When the prototype matrix (N_l x N_p x D) is constructed, each 1 x D vector in it is derived from a whole image or a whole sentence.
However, the subsequent Cross-modal Prototype Querying and Cross-modal Prototype Responding steps look up the most suitable vector in the prototype matrix for each patch or word. Isn't there a granularity mismatch here? The prototypes are built at the image/sentence level, while the queries are made at the patch/word level.
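
To make the question concrete, here is a minimal NumPy sketch of how I understand the shapes involved. It is not taken from the repository: the function name `query_prototypes`, the use of cosine similarity, and the random features are all assumptions made purely for illustration.

```python
import numpy as np

def query_prototypes(tokens, prototypes):
    """Hypothetical querying step: match each local token to a prototype.

    tokens:     (T, D) patch or word features (local granularity)
    prototypes: (N_p, D) prototypes of one label, each assumed to come
                from a whole image or sentence (global granularity)
    """
    # Normalise rows so the dot product below is a cosine similarity.
    t = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = t @ p.T  # (T, N_p): one row of similarities per patch/word
    return sim.argmax(axis=1), sim

rng = np.random.default_rng(0)
N_l, N_p, D, T = 3, 4, 8, 5  # illustrative sizes, not the real config
prototype_matrix = rng.normal(size=(N_l, N_p, D))  # stand-in for global features
patch_features = rng.normal(size=(T, D))           # stand-in for patch features

best, sim = query_prototypes(patch_features, prototype_matrix[0])
print(best.shape, sim.shape)  # (5,) (5, 4)
```

In this sketch every patch-level row is compared against vectors that summarise an entire image, which is exactly the pairing I am unsure about.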

Some questions about the prototype matrix #11