Vector Aggregation function #3777

@barryhunter

Proposal:

It's probably a little unusual, but it would be nice to be able to run a query like:

SELECT GROUPBY(), AVG(float_vector) FROM index WHERE ... GROUP BY year

i.e. get an 'average' vector from a vector column: loop through all dimensions and calculate the average of each float across the rows in the group. In theory this finds a 'central' vector for the group.
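
To make the intended semantics concrete, here is a minimal application-side sketch of the per-group, element-wise average. It is not Manticore code; the function name `average_vectors_by_group` and the sample rows are hypothetical, and it assumes all vectors in a group have the same dimension.

```python
def average_vectors_by_group(rows):
    """rows: iterable of (group_key, vector) pairs.

    Returns {group_key: element-wise mean vector}. Hypothetical helper,
    only meant to illustrate what AVG(float_vector) would compute.
    """
    sums = {}    # group_key -> running per-dimension sums
    counts = {}  # group_key -> number of vectors seen
    for key, vec in rows:
        if key not in sums:
            sums[key] = [0.0] * len(vec)
            counts[key] = 0
        elif len(vec) != len(sums[key]):
            raise ValueError("vectors in a group must have the same dimension")
        for i, x in enumerate(vec):
            sums[key][i] += x
        counts[key] += 1
    # Divide each per-dimension sum by the number of vectors in the group.
    return {key: [s / counts[key] for s in total] for key, total in sums.items()}


# Made-up sample data: (year, float_vector)
rows = [
    (2023, [1.0, 2.0]),
    (2023, [3.0, 4.0]),
    (2024, [0.5, 0.5]),
]
result = average_vectors_by_group(rows)
```

With this sample data, the 2023 group averages to `[2.0, 3.0]` and the single-row 2024 group is returned unchanged.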

In practice, SUM() alone would probably be enough. Since only the direction of the vector is important, it can just sum each dimension individually and not bother dividing by the count.
It would be OK to return an unnormalized vector (normalization can be done externally): normalizing changes the magnitude anyway, so it effectively does the dividing.
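
The claim that the division can be skipped is easy to check outside the engine. The sketch below uses made-up vectors and hypothetical helper names (`normalize`, `elementwise_sum`) to show that the element-wise sum and the element-wise mean normalize to the same unit vector, because they differ only by a positive scalar.

```python
import math


def normalize(vec):
    """Scale vec to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def elementwise_sum(vectors):
    """Sum a list of equal-length vectors dimension by dimension."""
    return [sum(dims) for dims in zip(*vectors)]


# Made-up sample vectors standing in for one group's float_vector values.
vectors = [[1.0, 2.0, 2.0], [3.0, 0.0, 4.0], [0.0, 1.0, 0.0]]

total = elementwise_sum(vectors)           # what SUM() would return
mean = [x / len(vectors) for x in total]   # what AVG() would return

# Both point in the same direction, so their unit vectors match.
same_direction = all(
    math.isclose(a, b) for a, b in zip(normalize(total), normalize(mean))
)
```

One edge case to keep in mind: if the vectors in a group cancel out, the sum is the zero vector, which has no direction and cannot be normalized.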

I know it means processing a lot of data (summing high-dimensional vectors), but it should still be better to do it inside Manticore rather than in the application.

Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation reviewed
  • OpenAPI YAML updated and issue created to rebuild clients
