Description
Description of configuration
- Extension name: firestore-bigquery-export
- Extension version: 0.2.5
Configuration values (redact info where appropriate):
- Cloud Functions location: us-east4
- Firestore Instance ID: qarik-spearinai-demo-project-chat (Note: This is a non-default database)
- Collection path: qarik-spearinai-demo-project-chat
Description of problem
The official backfill script (fs-bq-import-collection) is unusable for projects that have a non-default Firestore database. The version of the script installed in a standard Google Cloud Shell environment is too old and lacks the feature to specify a target database ID. This makes it impossible to backfill historical data in this common architecture.
Steps to reproduce:
- Create a Firebase project with a named, non-default Firestore database (e.g., my-project-db). Do not use the (default) database.
- Successfully install the firestore-bigquery-export extension, configuring it to sync a collection from the non-default database.
- Open Google Cloud Shell and attempt to run the backfill script for the collection.
- Observe the failures described below.
Expected result
The backfill script should provide a way to specify the non-default Firestore database ID, either through an interactive prompt or a command-line flag (e.g., --firestoreInstanceId), and successfully import the data.
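To make the expected behavior concrete, here is a minimal sketch of how a database ID could be read from the command line, falling back to (default) when it is absent. The flag name --firestoreInstanceId is only the one suggested in this issue, and `resolveDatabaseId` and `collectionResourceName` are hypothetical helpers, not part of the actual script. The resource name format (`projects/.../databases/.../documents/...`) is the standard Firestore one, which shows why a lookup against (default) cannot find a collection that lives in a named database.

```typescript
// Firestore's reserved ID for the database every project starts with.
const DEFAULT_DATABASE_ID = "(default)";

// Hypothetical flag name, taken from the suggestion in this issue.
const DATABASE_FLAG = "--firestoreInstanceId";

/** Returns the value following DATABASE_FLAG, or "(default)" when the flag is absent. */
function resolveDatabaseId(argv: string[]): string {
  const i = argv.indexOf(DATABASE_FLAG);
  if (i === -1) return DEFAULT_DATABASE_ID;
  const value = argv[i + 1];
  if (value === undefined || value.startsWith("-")) {
    throw new Error(`${DATABASE_FLAG} requires a database ID`);
  }
  return value;
}

/** Builds the fully qualified Firestore resource name for a collection. */
function collectionResourceName(
  projectId: string,
  databaseId: string,
  collectionPath: string
): string {
  return `projects/${projectId}/databases/${databaseId}/documents/${collectionPath}`;
}

// Example with placeholder project, database, and collection names.
const exampleArgs = ["--project", "my-project", DATABASE_FLAG, "my-project-db"];
console.log(
  collectionResourceName("my-project", resolveDatabaseId(exampleArgs), "chat")
);
// prints "projects/my-project/databases/my-project-db/documents/chat"
```

Without the flag, the same call resolves to `projects/my-project/databases/(default)/documents/chat`, which is a different database and consistent with the NOT_FOUND error reported below.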
Actual result
The script fails in two different ways depending on the method used:
- Interactive Mode (npx @firebaseextensions/fs-bq-import-collection):
The script does not ask for a database ID. It presumably searches the (default) database, fails to find the collection, and exits with a 5 NOT_FOUND error.
Error importing Collection to BigQuery: Error: Failed to access collection: 5 NOT_FOUND:
- Non-Interactive Mode (with modern flags): Attempts to use modern flags like --firestoreInstanceId fail with an unknown option error. This is because npx and npm install -g in the Cloud Shell environment consistently install a very old version of the script (0.1.26), even when @latest is requested.
The --help output from the installed version proves that the flag to specify a database does not exist:
```
akaasula@cloudshell:~ (spearinai)$ npx @firebaseextensions/fs-bq-import-collection --help
Usage: fs-bq-import-collection [options]
Import a Firestore Collection into a BigQuery Changelog Table
Options:
-V, --version output the version number
--non-interactive Parse all input from command line flags instead of prompting the caller. (default: false)
-P, --project Firebase Project ID for project containing the Cloud Firestore database.
-B, --big-query-project Google Cloud Project ID for BigQuery.
-q, --query-collection-group [true|false] Use 'true' for a collection group query, otherwise a collection query is performed.
-s, --source-collection-path The path of the the Cloud Firestore Collection to import from. (This may or may not be the same Collection for which you plan to mirror changes.)
-d, --dataset The ID of the BigQuery dataset to import to. (A dataset will be created if it doesn't already exist.)
-t, --table-name-prefix The identifying prefix of the BigQuery table to import to. (A table will be created if one doesn't already exist.)
-b, --batch-size [batch-size] Number of documents to stream into BigQuery at once. (default: 300)
-l, --dataset-location Location of the BigQuery dataset.
-m, --multi-threaded [true|false] Whether to run standard or multi-thread import version
-u, --use-new-snapshot-query-syntax [true|false] Whether to use updated latest snapshot query
-f, --transform-function-url URL of function to transform data before export (e.g., https://us-west1-project.cloudfunctions.net/transform)
-e, --use-emulator [true|false] Whether to use the firestore emulator
-f, --failed-batch-output Path to the JSON file where failed batches will be recorded.
-h, --help display help for command
```
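The same check can be done programmatically. The sketch below is an illustration only: `helpListsOption` is a hypothetical helper that scans captured --help text for a long option name, and the excerpt contains three lines copied from the output above. It is not part of the import script.

```typescript
/** Returns true when the --help text lists the given long option as an option name. */
function helpListsOption(helpText: string, longOption: string): boolean {
  return helpText.split("\n").some((line) => {
    // Option lines look like "-x, --long-name description" or "--long-name description",
    // so the option name is always among the first two tokens.
    const tokens = line.trim().split(/[\s,]+/);
    return tokens.slice(0, 2).includes(longOption);
  });
}

// Three lines copied from the --help output of the installed version.
const helpExcerpt = [
  "-V, --version output the version number",
  "-P, --project Firebase Project ID for project containing the Cloud Firestore database.",
  "-h, --help display help for command",
].join("\n");

console.log(helpListsOption(helpExcerpt, "--project")); // true
console.log(helpListsOption(helpExcerpt, "--firestoreInstanceId")); // false
```

Only the first two tokens of each line are inspected so that option names mentioned inside a description do not count as a match.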
This tooling issue makes it impossible to backfill historical data, which is a critical blocker.