DM-50405: Create service descriptor for hoverdrive endpoints #338
jonathansick
left a comment
Here's some things I'm not sure of:
- Is it valid to include two RESOURCE tags in one service descriptor XML file?
- What is the usage of the INFO tags, or are these deprecated?
- How should the datalink-manifest.json file be used? It seems to be related to the INFO tags.
Some feedback on these, but folks who were more involved in writing these might want to correct me if any of this is wrong: It should be perfectly valid to include multiple RESOURCE tags in one service descriptor XML file. The INFO tags are being used as template placeholders in this implementation. The datalink-manifest.json file basically serves as a registry that maps datalink service IDs to required column names.
I don't know if there is a better way to encode the conditional validation for using In particular I'm thinking adding something like this to the redirect param description: And then for the column params (column, table..) |
gpdf
left a comment
@jonathansick Can we find a time to talk about what you had in mind w.r.t. the "optional" parameter? Perhaps this is something that could be brought up with the IVOA, if we agree that it's valuable.
I think @robyww would be interested in this as well.
The whole idea of a |
11be4ef to
a6c7025
jonathansick
left a comment
@gpdf I've updated Hoverdrive with new redirect-specific endpoints, which now make the service descriptor (updated on this branch) easier to write. At the moment we only have endpoints for the redirect functionality. We'll have to discuss what response is desired when asking for multiple links.
Does this service descriptor look good? I'm still unsure of what, if anything, to do with datalink-manifest.json since these endpoints run on any table or column name, not a specific column name.
datalink/hoverdrive.xml
Outdated
<PARAM name="accessURL" datatype="char" arraysize="*"
value="$baseUrl$/api/hoverdrive/column-docs-redirect"/>
<GROUP name="inputParams">
<PARAM name="table" datatype="char" use="required">
I think we cannot have the 'use' attribute in the PARAM so I suspect we will have to remove it from the three PARAMs
Also it looks as if the PARAMs need a value attribute. I think we can set value="" for these which are being templated.
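The two points above (no 'use' attribute on PARAM, and every PARAM carrying a value attribute) can be checked mechanically. The following is only a rough sketch and not a schema validation: the descriptor fragment is a hypothetical, namespace-free example modelled on the snippet under review, and check_params is an illustrative helper name, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor fragment, modelled on the snippet under review.
# The 'table' PARAM deliberately shows both problems discussed above.
DESCRIPTOR = """\
<RESOURCE type="meta" utype="adhoc:service">
  <PARAM name="accessURL" datatype="char" arraysize="*"
         value="$baseUrl$/api/hoverdrive/column-docs-redirect"/>
  <GROUP name="inputParams">
    <PARAM name="table" datatype="char" use="required"/>
    <PARAM name="column" datatype="char" arraysize="*" value=""/>
  </GROUP>
</RESOURCE>
"""


def check_params(xml_text):
    """Return a list of problems found on PARAM elements."""
    problems = []
    root = ET.fromstring(xml_text)
    for param in root.iter("PARAM"):
        name = param.get("name", "<unnamed>")
        # 'use' is not an attribute that VOTable defines for PARAM.
        if "use" in param.attrib:
            problems.append(f"{name}: unexpected 'use' attribute")
        # PARAM needs a value; templated params can use value="".
        if "value" not in param.attrib:
            problems.append(f"{name}: missing 'value' attribute")
    return problems


problems = check_params(DESCRIPTOR)
for problem in problems:
    print(problem)
```

A real descriptor declares the VOTable namespace, so the tag lookup would need to be namespace-qualified there.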
@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8"?>
<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.2">
I think we need a couple of INFO elements here:
<INFO name="$tap_schema_columns_table_name$" ID="$tap_schema_columns_table_name$" value="this will be dropped..." />
<INFO name="$tap_schema_columns_column_name$" ID="$tap_schema_columns_column_name$" value="this will be dropped..." />
Do you know what's meant by "this will be dropped..."? I saw it elsewhere and wasn't sure whether it means this was a deprecated feature or something else.
Yea that is confusing, but it basically means that those INFO elements are temporary placeholders that get replaced during processing. They are used to establish references between the columns and the datalink service params.
So basically the ref value in the datalink resource definition ends up being set to the unique ID of the column (and the INFO elements are used during the processing step to achieve that).
So assuming you have this field in your results that corresponds to a datalink parameter:
<FIELD name="table_name" datatype="char" arraysize="64*" ID="col_0">
<DESCRIPTION>the table this column belongs to</DESCRIPTION>
</FIELD>
It would end up being referenced like this in the datalink resource:
<PARAM name="table" datatype="char" ref="col_0" value="">
<DESCRIPTION>The name of the table.</DESCRIPTION>
</PARAM>
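To make the FIELD ID to PARAM ref link concrete, here is a rough sketch of how a client might resolve it for each result row. It assumes a simplified, namespace-free VOTable with a single service descriptor; the example.org host and the build_links helper are placeholders for illustration, not Hoverdrive or TAP service code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Simplified, namespace-free VOTable (an assumption for this sketch).
VOTABLE = """\
<VOTABLE>
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="table_name" datatype="char" arraysize="64*" ID="col_0"/>
      <FIELD name="column_name" datatype="char" arraysize="64*" ID="col_1"/>
      <DATA><TABLEDATA>
        <TR><TD>dp02_dc2_catalogs.CcdVisit</TD><TD>band</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
  <RESOURCE type="meta" utype="adhoc:service">
    <PARAM name="accessURL" datatype="char" arraysize="*"
           value="https://example.org/api/hoverdrive/column-docs-redirect"/>
    <GROUP name="inputParams">
      <PARAM name="table" datatype="char" arraysize="*" ref="col_0" value=""/>
      <PARAM name="column" datatype="char" arraysize="*" ref="col_1" value=""/>
    </GROUP>
  </RESOURCE>
</VOTABLE>
"""


def build_links(votable_text):
    """Build one service URL per result row by following PARAM ref -> FIELD ID."""
    root = ET.fromstring(votable_text)
    fields = root.findall("./RESOURCE[@type='results']/TABLE/FIELD")
    # Map each FIELD ID (e.g. "col_0") to its position in a row.
    index_by_id = {f.get("ID"): i for i, f in enumerate(fields)}

    service = root.find("./RESOURCE[@utype='adhoc:service']")
    access_url = service.find("./PARAM[@name='accessURL']").get("value")
    inputs = service.findall("./GROUP[@name='inputParams']/PARAM")

    links = []
    for row in root.iter("TR"):
        cells = [td.text or "" for td in row.findall("TD")]
        # Each input PARAM takes its value from the cell its ref points to.
        query = {p.get("name"): cells[index_by_id[p.get("ref")]] for p in inputs}
        links.append(f"{access_url}?{urlencode(query)}")
    return links


links = build_links(VOTABLE)
print(links[0])
```

The same lookup applies to a descriptor with only a table PARAM; it simply produces a one-parameter query string.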
Hoverdrive provides two endpoints for getting documentation links about tables and columns respectively. These endpoints are described in https://sqr-086.lsst.io. Both endpoints provide an optional mode triggered by a ?redirect=true query parameter where the client is redirected to the most-relevant documentation URL. In this case, only a single column or table can be specified by the ?table and ?column parameters. I'm not sure how to encode this logic in the service descriptor.
Hoverdrive now provides /column-docs-redirect and /table-docs-redirect endpoints. This removes the need to express the optional redirect query parameter in the service descriptor. Since we haven't implemented getting VO Tables of documentation links yet, only these redirects are implemented and represented in the service descriptor right now.
Now we're connecting the service descriptor to the schema table's columns:
- ref attributes in the PARAM fields map to INFO tags
- INFO tags map those ref templates to column names in the tap schema
- datalink-manifest.json maps the column names to the datalink service descriptor
Co-authored-by: stvoutsin <[email protected]>
91a3638 to
03c872c
Thanks to some helpful coaching from @stvoutsin I think this service descriptor is closer to "right", although I still don't know how to test it. Any advice on next steps would be great! What we've done is figure out the |
The hoverdrive/column-docs-redirect endpoint needs both the table name and column name columns, but the hoverdrive/table-docs-redirect endpoint takes just the table name column. With the previous setup, the manifest prevented the table-docs-redirect endpoint from being used because a circumstance with only the table+column names was present. Splitting the service descriptor into separate XML files for each set of unique column dependencies should solve this.
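The intended behaviour after the split can be sketched as a subset check: a descriptor applies whenever all of its required columns are present in the result. This is only an illustration of the idea. The manifest layout, the descriptor names, and the applicable_descriptors helper below are all hypothetical and do not reflect the actual datalink-manifest.json format or the TAP service's matching logic.

```python
# Hypothetical manifest: descriptor name -> columns it requires.
MANIFEST = {
    "hoverdrive-column-docs": ["table_name", "column_name"],
    "hoverdrive-table-docs": ["table_name"],
}


def applicable_descriptors(manifest, result_columns):
    """Return the descriptors whose required columns all appear in the result."""
    available = set(result_columns)
    return sorted(
        name for name, required in manifest.items() if set(required) <= available
    )


# Both table and column names present: both descriptors apply.
print(applicable_descriptors(MANIFEST, ["table_name", "column_name", "ucd"]))
# Only the table name present: just the table descriptor applies.
print(applicable_descriptors(MANIFEST, ["table_name", "description"]))
```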
@stvoutsin stood this up in data-dev and this is the result for https://data-dev.lsst.cloud/api/tap/sync?LANG=ADQL&REQUEST=doQuery&QUERY=SELECT+TOP+1+*+FROM+tap_schema.columns :
<?xml version="1.0" encoding="UTF-8"?>
<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.4">
<RESOURCE type="results">
<INFO name="QUERY_STATUS" value="OK" />
<INFO name="QUERY_TIMESTAMP" value="2025-05-22T23:02:37.441" />
<INFO name="QUERY" value="SELECT TOP 1 * FROM tap_schema.columns" />
<TABLE>
<FIELD name="table_name" datatype="char" arraysize="64*" ID="col_0">
<DESCRIPTION>the table this column belongs to</DESCRIPTION>
</FIELD>
<FIELD name="column_name" datatype="char" arraysize="64*" ID="col_1">
<DESCRIPTION>the column name</DESCRIPTION>
</FIELD>
<FIELD name="utype" datatype="char" arraysize="512*" ID="col_2">
<DESCRIPTION>lists the utypes of columns in the tableset</DESCRIPTION>
</FIELD>
<FIELD name="ucd" datatype="char" arraysize="64*" ID="col_3">
<DESCRIPTION>lists the UCDs of columns in the tableset</DESCRIPTION>
</FIELD>
<FIELD name="unit" datatype="char" arraysize="64*" ID="col_4">
<DESCRIPTION>lists the unit used for column values in the tableset</DESCRIPTION>
</FIELD>
<FIELD name="description" datatype="char" arraysize="512*" ID="col_5">
<DESCRIPTION>describes the columns in the tableset</DESCRIPTION>
</FIELD>
<FIELD name="datatype" datatype="char" arraysize="64*" ID="col_6">
<DESCRIPTION>lists the ADQL datatype of columns in the tableset</DESCRIPTION>
</FIELD>
<FIELD name="xtype" datatype="char" arraysize="64*" ID="col_7">
<DESCRIPTION>a DALI or custom extended type annotation</DESCRIPTION>
</FIELD>
<FIELD name="arraysize" datatype="char" arraysize="16*" ID="col_8">
<DESCRIPTION>lists the size of variable-length columns in the tableset</DESCRIPTION>
</FIELD>
<FIELD name="&quot;size&quot;" datatype="int" ID="col_9">
<DESCRIPTION>deprecated: use arraysize</DESCRIPTION>
</FIELD>
<FIELD name="principal" datatype="int" ID="col_10">
<DESCRIPTION>a principal column; 1 means 1, 0 means 0</DESCRIPTION>
</FIELD>
<FIELD name="indexed" datatype="int" ID="col_11">
<DESCRIPTION>an indexed column; 1 means 1, 0 means 0</DESCRIPTION>
</FIELD>
<FIELD name="std" datatype="int" ID="col_12">
<DESCRIPTION>a standard column; 1 means 1, 0 means 0</DESCRIPTION>
</FIELD>
<FIELD name="column_index" datatype="int" ID="col_13">
<DESCRIPTION>recommended sort order when listing columns of a table</DESCRIPTION>
</FIELD>
<DATA>
<TABLEDATA>
<TR>
<TD>dp02_dc2_catalogs.CcdVisit</TD>
<TD>band</TD>
<TD />
<TD>meta.id;instr.bandpass</TD>
<TD />
<TD>Name of the band used to take the exposure where this source was measured. Abstract filter that is not associated with a particular instrument.</TD>
<TD>char</TD>
<TD />
<TD>*</TD>
<TD />
<TD>1</TD>
<TD>0</TD>
<TD>0</TD>
<TD>13</TD>
</TR>
</TABLEDATA>
</DATA>
</TABLE>
<INFO name="placeholder" value="ignore" />
</RESOURCE>
<RESOURCE type="meta" name="ColumnDocumentationRedirect" utype="adhoc:service">
<DESCRIPTION>Redirect to the most relevant documentation link for a column.</DESCRIPTION>
<PARAM name="accessURL" datatype="char" arraysize="*" value="https://data-dev.lsst.cloud/api/hoverdrive/column-docs-redirect" />
<PARAM name="exampleURL" datatype="char" arraysize="*" value="https://data-dev.lsst.cloud/api/hoverdrive/column-docs-redirect?table=dp02_dc2_catalogs.Object&amp;column=detect_isPrimary">
<DESCRIPTION>Example request to redirect to the documentation for the 'detect_isPrimary' column in the 'dp02_dc2_catalogs.Object' table.</DESCRIPTION>
</PARAM>
<GROUP name="inputParams">
<PARAM name="table" datatype="char" arraysize="*" ref="col_0" value="">
<DESCRIPTION>The name of the table.</DESCRIPTION>
</PARAM>
<PARAM name="column" datatype="char" arraysize="*" ref="col_1" value="">
<DESCRIPTION>The name of the column.</DESCRIPTION>
</PARAM>
</GROUP>
</RESOURCE>
<RESOURCE type="meta" name="TableDocumentationRedirect" utype="adhoc:service">
<DESCRIPTION>Redirect to the most relevant documentation link for a table.</DESCRIPTION>
<PARAM name="accessURL" datatype="char" arraysize="*" value="https://data-dev.lsst.cloud/api/hoverdrive/table-docs-redirect" />
<PARAM name="exampleURL" datatype="char" arraysize="*" value="https://data-dev.lsst.cloud/api/hoverdrive/table-docs-redirect?table=dp02_dc2_catalogs.Object">
<DESCRIPTION>Example request to redirect to the documentation for the 'dp02_dc2_catalogs.Object' table.</DESCRIPTION>
</PARAM>
<GROUP name="inputParams">
<PARAM name="table" datatype="char" arraysize="*" ref="col_0" value="">
<DESCRIPTION>The name of the table.</DESCRIPTION>
</PARAM>
</GROUP>
</RESOURCE>
</VOTABLE>
That column schema query provides the data links for both the table and column documentation redirect, and it looks like the
Hoverdrive provides two endpoints for getting documentation links about tables and columns respectively. These endpoints are described in https://sqr-086.lsst.io.
Both endpoints provide an optional mode triggered by a ?redirect=true query parameter where the client is redirected to the most-relevant documentation URL. In this case, only a single column or table can be specified by the ?table and ?column parameters. I'm not sure how to encode this logic in the service descriptor.
Hoverdrive is currently deployed on data-dev and data-int. Online API docs.
Checklist
When making changes to YAML files in the schemas directory: