tharropoulos
Collaborator

TLDR

Adds new options to transform Firestore documents before they are indexed in Typesense.

Change Summary

Added Configuration:

  1. In extension.yaml:
    • Added 4 new configuration parameters:
      • TRANSFORM_FUNCTION_NAME: Name of Cloud Function for document transformation
      • TRANSFORM_FUNCTION_PROJECT_ID: Project where transform function is deployed
      • TRANSFORM_FUNCTION_REGION: Region of transform function
      • TRANSFORM_FUNCTION_SECRET: Auth secret for transform function

Added Functionality:

  1. In utils.js:
    • Added transformDocument(): calls the external transform function with error handling
    • Falls back to the original document when the transformation fails
  2. In indexOnWrite.js:
    • Updated to conditionally apply the document transformation before indexing
    • Added a branch to handle the transformed vs. non-transformed document flow
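
For readers skimming the description, here is a minimal sketch of the flow described above: skip the call when no function is configured, call the function otherwise, and fall back to the original document on any failure. It is not the PR's actual code. The request body shape (`{ data: document }`), the bearer-token header, the JSON response shape, and the injectable `fetchImpl` parameter are all assumptions made for illustration; only `transformDocument` and the config key names come from this PR.

```javascript
// Sketch only: payload shape, auth header, and fetchImpl are assumptions.
async function transformDocument(document, config, fetchImpl = fetch) {
  // No transform function configured: index the document unchanged.
  if (!config.transformFunctionName) {
    return document;
  }

  const url = `https://${config.transformFunctionRegion}-${config.transformFunctionProjectId}.cloudfunctions.net/${config.transformFunctionName}`;

  try {
    const response = await fetchImpl(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: the secret is sent as a bearer token.
        Authorization: `Bearer ${config.transformFunctionSecret}`,
      },
      // Assumption: the function expects the document under a "data" key.
      body: JSON.stringify({ data: document }),
    });
    if (!response.ok) {
      throw new Error(`Transform function returned HTTP ${response.status}`);
    }
    // Assumption: the response body is the transformed document as JSON.
    return await response.json();
  } catch (error) {
    // Fallback: a failed transformation must not block indexing.
    console.error("Document transformation failed, using original document", error);
    return document;
  }
}
```

Passing a stub as `fetchImpl` keeps the function testable without network access, which is one way the scenarios listed under "Added Tests" could be exercised.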

Added Tests:

  1. New file test/utilsTransform.spec.js:
    • Tests for the transform function covering:
      • Success path with proper transformation
      • Missing transform function configuration
      • Error handling for failed transformations
      • Edge cases such as documents without IDs
  2. In package.json:
    • Added new test command: test:utils

PR Checklist


const url = `https://${region}-${projectId}.cloudfunctions.net/${transformFunctionName}`;

const response = await fetch(url, {
Member

@tharropoulos Can we add automatic retries here with exponential backoff?

Member

@tharropoulos Looks like this is not addressed yet
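
One possible shape for the retries requested in this thread is sketched below. This is an illustration, not code from the PR: the helper name `fetchWithRetry`, the default of 3 retries, the 200 ms base delay, and the choice to retry only on network errors, HTTP 429, and 5xx responses are all assumptions.

```javascript
// Hypothetical helper: name, defaults, and retry policy are assumptions.
async function fetchWithRetry(
  url,
  options = {},
  { retries = 3, baseDelayMs = 200, fetchImpl = fetch } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const response = await fetchImpl(url, options);
      // Retry only on rate limiting and server errors; hand everything else back.
      if (response.status < 500 && response.status !== 429) {
        return response;
      }
      lastError = new Error(`HTTP ${response.status}`);
    } catch (error) {
      // Network-level failure (DNS, connection reset, and so on).
      lastError = error;
    }
    if (attempt < retries) {
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

Because the helper throws after the final attempt, a caller that already falls back to the original document on error would keep that behavior unchanged. Adding random jitter to `delayMs` is a common refinement when many documents are written at once.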

extension.yaml Outdated
example: transformDoc
default: ""
required: false
- param: TRANSFORM_FUNCTION_PROJECT_ID
Member

@tharropoulos Given the number of new parameters this will add to the UI, what do you think about adding a constraint so that the function has to be in the same project and region as the one the extension is installed in?

I know it will limit some functionality, but it will also help keep the config simple.

Alternatively, is there a way to move these parameters under an advanced collapsible section in the installation UI?

- add `transformDocument` function to call external transformation functions
- implement error handling for failed transformations with fallback to original doc
- add comprehensive tests for the transformation functionality
- update package.json with new test command
- update indexOnWrite to conditionally use transformation function
- check for transformFunctionName in config before transforming
- apply transformation before converting to typesense document
- add TRANSFORM_FUNCTION_NAME parameter for specifying transform function
- add TRANSFORM_FUNCTION_PROJECT_ID for cross-project function support
- add TRANSFORM_FUNCTION_REGION for region specification
- add TRANSFORM_FUNCTION_SECRET for authorization to protected functions
- Remove `TRANSFORM_FUNCTION_PROJECT_ID` and `TRANSFORM_FUNCTION_REGION` parameters from `extension.yaml`
- Update `config.js` to use `GCLOUD_PROJECT` and `LOCATION` environment variables

transformFunctionName: process.env.TRANSFORM_FUNCTION_NAME,
transformFunctionSecret: process.env.TRANSFORM_FUNCTION_SECRET,
transformFunctionProjectId: process.env.GCLOUD_PROJECT,
transformFunctionRegion: process.env.LOCATION,
Member

Are the above two lines still needed?

Collaborator Author

We have to call out to the region-project ID URL, and both of those values come from environment variables. The gcloud project is set automatically, and the location is set from extension.yaml.
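
To make the dependency on those two values concrete, here is a small sketch of how the URL could be assembled from the environment. The helper name `buildTransformFunctionUrl` and the `env` parameter are hypothetical (the PR builds the URL inline); the environment variable names and the URL pattern are the ones shown in this conversation.

```javascript
// Hypothetical helper for illustration; the PR builds this URL inline.
function buildTransformFunctionUrl(env = process.env) {
  const projectId = env.GCLOUD_PROJECT; // per this thread, set automatically
  const region = env.LOCATION; // per this thread, set from extension.yaml
  const name = env.TRANSFORM_FUNCTION_NAME;

  // No transform function configured: there is nothing to call.
  if (!name) {
    return null;
  }
  return `https://${region}-${projectId}.cloudfunctions.net/${name}`;
}
```

With this approach the function is always resolved in the same project and region as the extension, which is the constraint proposed earlier in the review.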
