-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add netexMode and netexSubmode to GTFS GraphQL Trip type #6676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev-2.x
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev-2.x #6676 +/- ##
=============================================
+ Coverage 71.61% 71.80% +0.19%
- Complexity 18936 19162 +226
=============================================
Files 2059 2081 +22
Lines 77481 77946 +465
Branches 7902 7961 +59
=============================================
+ Hits 55488 55971 +483
+ Misses 19166 19150 -16
+ Partials 2827 2825 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@@ -2496,6 +2496,10 @@ type Trip implements Node { | |||
gtfsId: String! | |||
"Global object ID provided by Relay. This value can be used to refetch this object using **node** query." | |||
id: ID! | |||
"Mode of the trip as a netex mode (string), gtfs modes of routes are mapped into them if necessary." | |||
netexMode: String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this isn't the NeTEx mode but applies to GTFS trips as well, doesn't it?
I think you can return the TransitMode instance rather than a String.
(I don't want to preempt the discussion.)
@@ -2496,6 +2496,10 @@ type Trip implements Node { | |||
gtfsId: String! | |||
"Global object ID provided by Relay. This value can be used to refetch this object using **node** query." | |||
id: ID! | |||
"Mode of the trip as a netex mode (string), gtfs modes of routes are mapped into them if necessary." | |||
netexMode: String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
netexMode: String | |
mode: TransitMode |
@@ -199,14 +200,10 @@ public static class GraphQLBicycleParkingPreferencesInput { | |||
public GraphQLBicycleParkingPreferencesInput(Map<String, Object> args) { | |||
if (args != null) { | |||
if (args.get("filters") != null) { | |||
this.filters = ((List<Map<String, Object>>) args.get("filters")).stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to run yarn install
again before generating this code. The library version has changed: #6647
Why do you need to add netexSubmode into the GTFS API? I think that the GTFS API should not contain any Transmodel, likewise Transmodel API should not contain any GTFS. The modes should be mapped in our internal model, e.g. we should treat 714 from a GTFS data source to be the same as a RAIL_REPLACEMENT_BUS from a NeTEx data source, so no matter if the source data is from GTFS or from NeTEx, 714 will be returned in the GTFS API and RAIL_REPLACEMENT_BUS in the Transmodel API. In concrete terms, I suggest that we add |
This is a possibility. But exactly because of the possible lack of equivalence I think that both versions (gtfs and netex) should be exposed. The other possibility, and possibly a nicer one, is to make the actual mapper configurable, so that when there are equivalence problems, the problem is not an OTP problem but a HSL problem which we can solve by editing a file called FinlandTransmodelSubmodeMapper or something. And then if there is an "unmappable" netex submode, I'll just invent a random very high number and use it as the corresponding type. Ugly, but limited to my installation only. |
I think this is the way to go, with configurable mappings in the configuration JSON (which includes sensible defaults e.g. 714 = replacement bus). If you have clients who want to know about the NeTEx they should use the NeTEx API. |
I think, and I think also Thomas thinks, that a client should be able to not care whether the original data is gtfs or netex. The business situation here is that we will be getting some data as netex and some as gtfs for several years, but it makes life for the frontend developer unnecessarily hard if they have to care. |
I like what we discussed in the last meeting: keep submode internal and introduce API properties for what the submode is currently used for. In your case you want to know if something is a rail replacement bus. |
Yes, I think that a client should be able to not care whether the original data is gtfs or netex, but your proposal does the exact opposite thing. (it forces user to use a certain API and a certain field depending on the source data, as gtfsType from the GTFS API is currently null for NeTEx data) If they want to know NeTEx they should use the NeTEx API, which can be used across both data sources with the correct GTFS types mapped to the NeTEx types (e.g. GTFS 714 will be returned as Replacement Bus in the NeTEx API), while if they are using GTFS they should start seeing 714 for all kinds of replacement buses even if the replacement bus is from NeTEx data. |
Oh, the version of this PR right now is definitely not going forward. I'm just floating ideas because there are multiple competing views on how this should be done. Leonard's suggestion would mean that there is something like trip.replacementMode = 'BUS' which would be populated by trip.route.type = 714 || trip.submode = railReplacementBus. Your suggestion would mean there is trip.type = 714 which is populated by trip.route.type = 714 || trip.submode = railReplacementBus. So not that different in implementation right now, but they indicate different futures of the interface. Leonard's suggestion was born in Tuesday's dev meeting because everyone agrees that both the gtfs type and netex submode collapse several different dimensions into one, and that makes it almost completely useless. |
We discussed this in yesterday's meeting and agreed that we will not expose the submode itself but the properties that the submode is supposed to be communicating. In this case we want to know if it's a replacement bus and @tkalvas will propose an API for it. There was talk about a generic container object holding these properties but it's hard to predict what we will need in the future, so I'm not sure how well we can plan for it. How we extract this property from the input feeds may or may not be deployment-specific. If necessary it can be made configurable later. Is this what we agreed? |
Design NoteNeTEx submodes / GTFS extended route types are not clearly defined - and poorly designed. The sub-modes is just a label to group a set of properties(geo. service area [local, regional, national, internatonal], Wifi, bike raks, food options, seating and sleeping facilities, ticket options, express and so on). What properties a submode mappes to is usually deployment specific and may also be feed specific. To deal with this my suggested strategy is to create each property on the OTP Internal model - these can be exposed in filters and properties on the APIs. To map into the OTP Internal model we should create a mapping matrix. If we can agree on the matrix we can keep it in OTP, if not we need to make it configurable, but I suggest that we just use a simple CSV file for it. Then overriding the default in config would be easy at any point in time(now or later). Mapping table We only add columns now that are requested, other columns can be added later. This is just a draft, example details are included to make it easier to understand. The target properties can be basic types like boolean, numbers, strings, scalars (duration, time), enums, or expressions.
|
I agree with most of that. However, I think it is better to implement the csv file based configuration right away. There is no way a common mapping will survive very long. I also think it's best to postpone the actual mapping until query time. The alternative uses more memory and is more complicated to develop and test (because it then has to happen slightly differently for scheduled data and realtime data). I would also not wish to have the configuration part of the graph, it is so lightweight it would be better as a router-config.json option and read at process startup. |
Now that Trip has both replacementMode and originalMode, maybe the case for having a container level for them is stronger than before. On the other hand, originalMode is defined even when there is no replacement going on. |
Still no resolution on how search should work. I asked one representative end user and they said that they expect the replacement bus to be found both when searching/filtering for buses or trains. |
I did a highly scientific™ survey in my household and there people found that it's ok to include the replacement bus when you search for rail only. |
I think a new API input is needed as well for the traveller to include replacement transport. For example, if replacement service is off, a search for rail will only include rail services, if replacement service is on, a search for rail will include rail services and also other services which replaces rail. However, what should be the behaviour if I want to search for a rail service, but the service involved is a bus replacement train (a train service which acts as a replacement service for a bus)? Also, in addition, how to handle services which are both a general service and a replacement service? For example, when a rail line is closed, a specified service bus is designated to be the rail replacement bus on that route? |
Joel just suggested, that instead of the Trip.originalMode I now introduced, we would not have Trip.originalMode at all. Instead, we would change Route.mode for GTFS-sourced Routes where Route.type indicates replacement service, so that Route.mode would work similarly to the way it works for NeTEx-sourced Routes, i.e. Route.mode is always the original mode, even for replacement services. Trip.replacementMode would continue to be null for non-replacement services, and the replacing mode for replacement services. I opposed this on the principle of not introducing breaking API changes, but couldn't convince Joel. |
One alternative suggestion would be to detect during graph build what is actually the most used mode in routes and use it as the route mode (at least in the GTFS API). I assume that quite often even in netex these rail replacements for example are designed as separate lines instead of just trips within a train line. With that approach, replacement trips originating from GTFS data would behave like before and sometimes the behaviour of netex lines would be different. However, that wouldn't completely eliminate the need for two extra fields on trip to tell what is the original mode and what is the replacement mode. |
Is there anything for me to review yet? I didn't understand if you made a decision what you want to do about the GTFS/NeTEx main mode problem. |
I don't think we have a decision about that, but resolving it should not really cause large changes to the PR as it stands. So whether you want to review everything we have so far is up to you. |
We have conflicting requirements which have to be prioritized to resolve them.
The problematic case now under consideration is rail replacement buses and Specifically, a GTFS rail replacement bus has Route.mode = BUS and Route.type The current code fails the requirements by revealing the source feed type. There is a choice between keeping the GTFS query result almost exactly the same Teemu prefers adding new fields to the result types in the schema, so that Joel prefers not extending the schema if at all possible, and instead What Transmodel GraphQL query returns for NeTEx-sourced data would not be Thomas, Teemu, and Joel all prefer configurability, possibly so that without |
I think we should consider submitting a proposal to change the GTFS standard as this is a deficiency in the GTFS standard. For example, in Stockholm, route 7 is a tram route but bus departures can be seen for the purpose of extra capacity. There is no way for GTFS to represent all them in a single route. In the meantime, I think that Joel's suggestion is the best before we can get the specification changed. |
The root problem here is there is no way to officially associate a trip of a different mode to a route in GTFS compared to NeTEx. I have opened a proposal of the Specifically, the problem MBTA faced was that, when they started adding bus trips in subway routes which call at bus stops, consumers were confused if they should put the subway symbol or the bus symbol at the stop, and they started plotting the subway lines linking the bus stops. OpenTripPlanner has the Therefore, I have written in the proposal that it should be used for cases where the different modes practically behave the same, such as Stockholm route 7, where trams and buses call at the same stops and run on the same shape. I would like a discussion of the actual use case in HSL to see how we should model the trips in GTFS and NeTEx, and decide how we should progress with the proposal and this PR. |
NeTEx is much nicer here, and I'd probably start moving our frontend to the Transmodel GraphQL API, but it is not possible to translate the GTFS-sourced replacement bus data consistently and correctly to Transmodel, because GTFS fails to specify the original mode when a replacement happens. HSL has kind of ended up in a difficult position when we agreed to take GTFS data from one source and NeTEx data from another source at the same time, because both sources include replacement services (the GTFS one for trams, and the NeTEx one for rail). Many possibilities that we can work with probably exist, but they won't be correct in the general sense. I'm in the progress of combining Joel's suggestion with the CSV configuration file, so that only submodes mentioned in the CSV actually affect the mode and type reported by the GTFS GraphQL API at all, so nothing changes behaviourally unless you configure it. |
…s api should report a route as replacement
Now that I attempted to integrate Thomas's and Joel's points (with both of them on vacation so I can't even ask), I'm not sure if their goals can be simultaneously achieved at all. Thomas wanted us to move away from the extended type towards a more property-based system, but now basically Joel wants to simulate a GTFS Route based on what a collection of NeTEx ServiceJourneys happened to have as a submode. Both are implementable, proof: I implemented them. |
…even if source is NeTEx
I now pushed the version which implements Joel's vision (as far as I can tell), but it's lacking on the features Thomas desired. |
So do you mean that your version will now return a BUS for a route consisting of rail replacement bus trips if queried in GTFS, but Rail if queried in Transmodel? What about a route from NeTEx source consisting of both? |
First of all - the scope of this PR is TOO BIG. Sorry, but someone should have stoped this a long time a go. There need to be a specification on how this should work in one or more Github issues. I read through the whole discussion, and not sitting with this on a dayly basis it is impossible to know what is decided. I want this:
Some comments:
This will not work, there are a lot of domain rules on mode and submodes - we are not changing them unless we have to. So, we need to map into the OTP domain model.
The clients need to handle this, OTP should NOT be "smart".
+1
OTP should be deterministic and thread all data the same way - the state of one trip should be independent of the state of another.
+1 for Joel, "keep their old behaviour in all cases" is extremely difficult. To move forward you sometimes need to break things... I suggest we discuss this in the next design meeting. |
Summary
Issue: #6817
GTFS and NeTEx have trip modes and submodes in different places. This PR makes it possible to read information derived from GTFS extended route types and NeTEx trip submodes, using the GTFS graphql interface, from trips, not routes. There is a CSV file (named in the router configuration) which defines the mapping, and can be extended for new derived information.
Sample file:
The first data line of the sample file means that if NeTEx-sourced data generates a Route, and all Trips it has have submode "replacementRailService", the Route.mode will be reported as "BUS" in the GTFS query. The second data line of the sample file means that if NeTEx-sourced data generates a Route which has no Trips at all, the Route.mode will be reported as "BUS" in the GTFS query.
This is now Joel's preferred version as far as I understand it. I probably made a mess of the NeTEx loader's dependency injection, but it still accurately represents where the logic has to go or it will be either non-performant or much harder to do.
Digitransit (Finland) will move into a transitory period when some things will be in NeTEx and some things in GTFS, possibly for multiple years. Something like this is required because we need a mechanism to query data from both sources, GTFS and NeTEx.
Unit tests
None yet. Let's discuss if this approach is ok first.
Changelog
Yes.
Bumping the serialization version id
Yes.