Collaborator

@fmigneault fmigneault commented May 1, 2025

Based on #27

These extensions can be imported with updated Python dependencies, but the underlying SQL collection_search function requires a more recent pgstac Docker image. However, that upgrade (pgstac 0.6 to 0.7.x, 0.8.x, 0.9.x) requires many PG 13→15→17 migrations, so it will be done in a follow-up version to keep this fix minimal.

That is, it requires a few iterative calls of pypgstac migrate --debug from ghcr.io/stac-utils/pgstac:v0.7.0, ghcr.io/stac-utils/pgstac:v0.8.0, ghcr.io/stac-utils/pgstac:v0.9.0, and maybe others in between. In https://github.com/bird-house/birdhouse-deploy, we should probably set up something that detects the current schema and the desired minor version, and applies the migrations in a loop. Otherwise, just provide manual operations to be applied.
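
The version-stepping part of such a loop could be sketched roughly as follows. This is only an illustration, not what birdhouse-deploy does: the plan_migration_steps helper and the hard-coded list of tags are hypothetical, and the current schema version is assumed to have been detected beforehand. Each returned image would then be used in turn to run pypgstac migrate --debug.

```python
# Hypothetical sketch: pick the pgstac images to step through, in order.
# The list of known tags is an assumption for illustration; intermediate
# releases may also be needed.
KNOWN_TAGS = ["v0.7.0", "v0.8.0", "v0.9.0"]
IMAGE = "ghcr.io/stac-utils/pgstac"


def parse_version(tag: str) -> tuple[int, ...]:
    """Convert a tag such as 'v0.8.0' into a comparable tuple (0, 8, 0)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def plan_migration_steps(current: str, target: str) -> list[str]:
    """Return images to migrate through: after `current`, up to and including `target`."""
    low, high = parse_version(current), parse_version(target)
    steps = [tag for tag in KNOWN_TAGS if low < parse_version(tag) <= high]
    return [f"{IMAGE}:{tag}" for tag in sorted(steps, key=parse_version)]


if __name__ == "__main__":
    for image in plan_migration_steps("v0.6.0", "v0.9.0"):
        # Each step would run `pypgstac migrate --debug` from this image.
        print(image)
```

Keeping the planning separate from the execution would make it possible to print the steps as manual operations when automation is not wanted.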

To avoid the complicated procedure above (evaluated in bird-house/birdhouse-deploy#534), we will instead wait for bird-house/birdhouse-deploy#532 to simply export the data, and then force-upgrade the STAC app from scratch using the latest https://github.com/stac-utils/pgstac/pkgs/container/pgstac and STAC-FastAPI. After that, this PR can be safely included, since STAC-FastAPI will be able to include the relevant collection_search SQL function.

@fmigneault fmigneault self-assigned this May 1, 2025
@fmigneault fmigneault changed the base branch from fix-token-paging to main June 5, 2025 23:33
@fmigneault fmigneault marked this pull request as ready for review June 5, 2025 23:40
@fmigneault fmigneault requested a review from mishaschwartz June 5, 2025 23:40
Contributor

@mishaschwartz mishaschwartz left a comment

Can you also please add an entry in CHANGES.md and run the pre-commit checks?

The pre-commit checks should run whenever a commit is pushed but maybe that got disabled at some point? Can you please check if the pre-commit.ci plugin is enabled for this repo? (I don't have permission to view that).

For now can you either run the pre-commit hooks locally or run:

ruff check
ruff format

@fmigneault
Collaborator Author

@mishaschwartz

The pre-commit checks should run [...]

It wasn't enabled. I've activated it.

@mishaschwartz
Contributor

mishaschwartz commented Jun 9, 2025

It wasn't enabled. I've activated it.

Thanks!

@fmigneault
Collaborator Author

Got something working, but FreeTextAdvancedExtension is blocked by invalid parsing.
Waiting on feedback from stac-utils/stac-fastapi#849.

@fmigneault
Collaborator Author

Evaluated search variants

Search items across collections

> curl -X GET 'http://localhost:8000/stac/search?q=SeaLake%20OR%20River' | jq '.features[]|.id,.collection' | paste - -
"EuroSAT-full-train-sample-14830-class-SeaLake"	"EuroSAT-full-train"
"EuroSAT-full-train-sample-13483-class-River"	"EuroSAT-full-train"
> curl -X POST http://localhost:8000/stac/search -d '{"q": ["SeaLake", "River"]}' -H 'Content-Type: application/json' | jq '.features[]|.id,.collection' | paste - -
"EuroSAT-full-train-sample-14830-class-SeaLake"	"EuroSAT-full-train"
"EuroSAT-full-train-sample-13483-class-River"	"EuroSAT-full-train"
> curl -X POST http://localhost:8000/stac/search -d '{"q": "SeaLake OR River"}' -H 'Content-Type: application/json' | jq '.features[]|.id,.collection' | paste - -
"EuroSAT-full-train-sample-14830-class-SeaLake"	"EuroSAT-full-train"
"EuroSAT-full-train-sample-13483-class-River"	"EuroSAT-full-train"

Search items in collection

> curl -X GET 'http://localhost:8000/stac/collections/EuroSAT-full-train/items?q=SeaLake%20OR%20River' | jq '.features[]|.id,.collection' | paste - -
"EuroSAT-full-train-sample-14830-class-SeaLake"	"EuroSAT-full-train"
"EuroSAT-full-train-sample-13483-class-River"	"EuroSAT-full-train"

Search collections metadata

> curl -X GET 'http://localhost:8000/stac/collections?q=SeaLake%20OR%20River'  | jq '.collections[].id'
**no match - OK**
> curl -X GET 'http://localhost:8000/stac/collections?q=train'  | jq '.collections[].id'
"EuroSAT-full-train"
> curl -X GET 'http://localhost:8000/stac/collections?q=train,subset'  | jq '.collections[].id'
"EuroSAT-full-train"
> curl -X GET 'http://localhost:8000/stac/collections?q=train%20OR%20subset'  | jq '.collections[].id'
"EuroSAT-full-train"
> curl -X GET 'http://localhost:8000/stac/collections?q=train%20AND%20subset'  | jq '.collections[].id'
**no match - OK**
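
For reference, the GET and POST variants above differ only in how the q value is encoded. A minimal Python sketch, using only the standard library and the same local base URL as the examples; the helper names are hypothetical, and no request is actually sent:

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000/stac"  # same local deployment as in the examples


def build_get_url(path: str, q: str) -> str:
    """Build a GET URL with the free-text query percent-encoded (spaces as %20)."""
    return f"{BASE_URL}{path}?{urlencode({'q': q}, quote_via=quote)}"


def build_post_body(terms: list[str]) -> str:
    """Build a JSON body for POST /search using the array form of `q`."""
    return json.dumps({"q": terms})


print(build_get_url("/search", "SeaLake OR River"))
print(build_get_url("/collections/EuroSAT-full-train/items", "SeaLake OR River"))
print(build_post_body(["SeaLake", "River"]))
```

The resulting strings can be passed to curl or any HTTP client in place of the hand-written URLs and bodies.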

Contributor

@mishaschwartz mishaschwartz left a comment

Nice! This looks great and I'm glad it's working. Thanks for adding the usage examples in the comment as well.

Just one question below about commented code but otherwise it looks good.

CollectionSearchFilterExtension(client=FiltersClient()),
],
)
# collection_search_extension = CollectionSearchPostExtension.from_extensions( # GET + POST
Contributor

Is this left on purpose or should it be removed?

Collaborator Author

Yes, it is left on purpose to highlight the conflicting endpoints, since the extension must be only partially enabled.
(✅ POST /search, ✅ GET /search, ❌ POST /collections, ✅ GET /collections?q=...)

@fmigneault fmigneault merged commit fe077b3 into main Aug 15, 2025
1 check passed
fmigneault added a commit to bird-house/birdhouse-deploy that referenced this pull request Sep 2, 2025
## Overview

Update all STAC components and extra functionalities related to them.

## Changes

**Non-breaking changes**

>[!WARNING]
> Assumes that STAC API `5.2.0-crim-1.1.0` (default) is already applied.
If migrating from older versions, all database migration issues when
moving from 3.x still apply.

- STAC: Update STAC Browser to
  [`crim-ca/stac-browser:3.4.0-dev`](https://github.com/crim-ca/stac-browser/releases/tag/3.4.0-dev).

  - Docker images are now built directly with the GitHub CI releases
    (see https://github.com/crim-ca/stac-browser/pkgs/container/stac-browser).
  - Synchronize with latest changes (as of 2025-08-16).
    - Besides the necessary `prefixPath` override for Nginx Proxy redirect
      and a minor HTML file resolution fix, the image is entirely up-to-date
      with the official upstream code.
    - Supports STAC 1.1.0.
    - Supports language locales.
    - Greatly improves parsing of STAC metadata and their visual rendering.
    - Allows dynamic runtime
      [config.js](https://github.com/radiantearth/stac-browser/blob/main/config.js)
      overrides.
      For the time being, only the required `catalogUrl` is overridden, but
      further settings could be added later on.

- STAC: Update STAC API to `crim-ca/stac-app:2.0.1`.

  - Changes in
    [`crim-ca/stac-app:2.0.0`](https://github.com/crim-ca/stac-app/releases/tag/2.0.0)
    include:
    - migration to `stac-fastapi==6.0.0` and corresponding fixes to support it
    - add `q` parameter free-text search on `/search`, `/collections` and
      `/collections/{collectionId}/items` endpoints
    - enforce JSON schema validation of all `stac_extensions` referenced by
      published STAC Items and Collections
    - multiple dependency updates
  - Minor package dependency fix in
    [`crim-ca/stac-app:2.0.1`](https://github.com/crim-ca/stac-app/releases/tag/2.0.1).

- STAC: Update `stac-db` and `stac-migration` to version `0.9.8`.

- STAC: Add `optional/stac-db-persist` and `STAC_DB_PERSIST_DIR` to
  allow custom STAC DB metadata storage location.

**Breaking changes**

>[!WARNING]
> If a STAC Item or Collection is POST'd to the STAC API and its
JSON definition does not respect the schemas of all its `stac_extensions`,
the operation will now be refused. While this is potentially "breaking",
it is the right expectation, as invalid contents would otherwise be
allowed and inserted into the DB.

- n/a

## Related Issue / Discussion

- #561 
- crim-ca/stac-app#37
- crim-ca/stac-app#28
- crim-ca/stac-app#40
- crim-ca/stac-browser#6
- fixes #562
- fixes #498, fixes #346
- technically, still a fork, but it's a 1-liner diff until
radiantearth/stac-browser#653 is merged
- diff:
radiantearth/stac-browser@main...crim-ca:stac-browser:master
- closes crim-ca/stac-populator#100

## CI Operations

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false