
Separating crew runs from public-facing trips using distinct service_ids #76

Open
jeffkessler-keolis opened this issue Jul 3, 2024 · 4 comments

Comments

@jeffkessler-keolis
Contributor

tl;dr This non-breaking change would allow producers to have separate sets of runs that map to the same trips — as is standard in most schedule systems — by adding an optional trip_service_id field to run_events, and allowing the assignment of run-only service_ids on dates using calendar and calendar_dates supplement files.

Context

TODS v2 represents a major step forward in the standard with respect to the modeling of crew runs. The overhauled run_events.txt file likewise provides enhancements for the modeling of runs alongside trips that may already exist in GTFS.

Problem

Since entries in run_events.txt are mapped to existing GTFS trips via service_ids, runs and trips are paired in a 1:1 relationship. This presents two challenges for producers seeking to model crew runs in run_events.txt.

  1. Many scheduling systems (e.g. Hastus) decouple "vehicle schedules" and "crew schedules." Representing both under a single shared service_id limits the ability to distinguish between the two and to create a 1:many mapping of trips to runs.

    • For example, while a set of trips may remain unchanged, the underlying runs could be modified themselves (generally around trackwork or holidays).

      • Consider a set of trips with service_id spring24-schedule1. While there may be a standard set of crew runs mapped to the trips on that day, some crew runs may not operate on the day before/after a holiday, and other runs may be modified to accommodate the lower staffing, even though all trips will operate normally.

      • e.g. For a US railroad, this Friday (7/5/24) will likely be very light ridership with individuals taking off after the Independence Day [Federal/≈"Bank"] Holiday, despite operating a standard weekday schedule. As a result, additional Assistant Conductors to support higher-ridership trains may not be required on Friday, meaning some CREW runs will not operate, and others will be modified, despite operating the same public schedule of TRIPS (which would continue to use the standard weekday schedule's service_id in the base GTFS).

  2. Crew schedules could have different runs depending on the day of the week, yet be stored in the same schedule (e.g. special Friday service).

    • Many scheduling systems allow for deviations in runs within a singular schedule, an example of which might be the inclusion of certain runs and trips that operate only on a particular day of the week within a larger set of service.

    • e.g. The seasonal CapeFLYER service operates only on Fridays, with some modified crew runs to accommodate the additional service on Fridays within a singular Hastus crew schedule.

Potential Mitigations

Create a unique service_id for every applicable combination of trips and runs (vehicle and crew schedule)

While one could create a concatenated service_id representing the combination of vehicle and crew schedules (e.g. service_id spring24-schedule1-crew1 and spring24-schedule1-crew2), doing so is inadvisable for two reasons:

  1. This approach requires producers to change the way in which they produce their public GTFS, which is something the working group expressed we wanted to generally avoid; PLUS,

  2. This approach requires duplication of otherwise identical trip data for each service_id, alongside modification to the underlying primary keys (e.g. trip_ids of trip1 and trip1-crew2 to ensure uniqueness to each service_id), thereby (a) adding considerably to the size of the underlying file, (b) adding extensive onus to producers, and (c) potentially also having downstream impacts on customer-facing applications that group service by service_id.

Proposed Solution

Adding a new, optional trip_service_id field to run_events

To combat these issues, ensure backwards-compatibility, and more easily support export from existing scheduling systems, the addition of an optional trip_service_id field is proposed.

  • This would allow individuals with paired vehicle and crew schedules to use the existing v2 specification's use of run_events without modifications (ensuring backwards-compatibility with TODS v2).

  • Those with decoupled crew and trip schedules could keep them separate by defining a new service_id for the runs in run_events, and mapping the runs to corresponding trips using the trips' existing service_ids by entering them in the added trip_service_id field.

  • In the above examples, the distinct crew schedules of spring24-crew1 and spring24-crew2 could be used as the service_id fields in run_events.txt, with the trip_service_id value being spring24-schedule1 in both instances.

    • This allows each instance of the crew schedule to map to the same set of trips without having to adjust the public GTFS data.
    • Exporting of data from scheduling systems (e.g. Hastus) would be easier, with better paradigms for supporting data exchange/interoperability.
    • Where there is a split of the data by day-of-week, these deviations could be modeled as distinct service_ids (e.g. spring24-crew1-fri for runs in a Friday schedule, and spring24-crew1-mtwr for other weekdays) and assigned on their applicable days, yet all mapping back to the same service_ids for the respective trips.
    • Dates that certain runs are in effect, including special runs, could be addressed via the existing approaches for putting certain service_ids in effect.
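
A minimal Python sketch of the mapping described above. The rows, run_ids, and trip_ids are hypothetical, only a subset of run_events.txt columns is shown, and the fallback rule (a blank trip_service_id means the run's service_id is also the trips' service_id) is an assumption drawn from the backwards-compatibility goal, not settled spec text.

```python
import csv
import io

# Hypothetical run_events.txt excerpt (subset of columns; IDs invented for illustration).
RUN_EVENTS = """\
service_id,trip_service_id,run_id,event_sequence,trip_id
spring24-crew1,spring24-schedule1,run-100,1,trip1
spring24-crew2,spring24-schedule1,run-100A,1,trip1
spring24-schedule2,,run-200,1,trip9
"""


def effective_trip_service_id(row):
    """Return the service_id used to look up the row's trip in GTFS.

    Assumption: when trip_service_id is blank or absent, fall back to
    service_id, which keeps existing TODS v2 files valid unchanged.
    """
    return row.get("trip_service_id") or row["service_id"]


def runs_by_trip_service(text):
    """Group run service_ids by the trip service_id they map to."""
    mapping = {}
    for row in csv.DictReader(io.StringIO(text)):
        trip_service = effective_trip_service_id(row)
        mapping.setdefault(trip_service, set()).add(row["service_id"])
    return mapping


mapping = runs_by_trip_service(RUN_EVENTS)
for trip_service, run_services in sorted(mapping.items()):
    print(trip_service, "<-", sorted(run_services))
```

Here two crew schedules (spring24-crew1 and spring24-crew2) resolve to the same trips under spring24-schedule1, while the third row shows a coupled schedule that needs no trip_service_id at all.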

Supporting additional service_id definitions via calendar_supplement.txt and calendar_dates_supplement.txt

Introducing new service_id entries means a mechanism would need to be added for the assignment of these entries to particular dates. Fortunately, the newly-introduced supplement paradigm allows for an updating of the calendar.txt and calendar_dates.txt files via the addition of entries in applicable TODS supplement files.

These files would permit the assignment of any new run-specific service_ids on both specific dates and in a given date range via the existing data standard of the calendar.txt and calendar_dates.txt files.
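
To make the date assignment concrete, here is a rough Python sketch that resolves which run-only service_ids are in effect on a given day. It assumes the supplement files reuse the GTFS calendar.txt and calendar_dates.txt columns and that their rows are simply treated as additions to the base feed; the rows and dates are hypothetical, and any other supplement merge semantics are out of scope.

```python
import csv
import io
from datetime import date

# Hypothetical calendar_supplement.txt rows defining run-only service_ids.
CALENDAR_SUPPLEMENT = """\
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
spring24-crew1-mtwr,1,1,1,1,0,0,0,20240401,20240630
spring24-crew1-fri,0,0,0,0,1,0,0,20240401,20240630
"""

# Hypothetical calendar_dates_supplement.txt rows: swap the Friday crew
# schedule for a reduced one on a single date.
CALENDAR_DATES_SUPPLEMENT = """\
service_id,date,exception_type
spring24-crew1-fri,20240524,2
spring24-crew2,20240524,1
"""

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def parse_date(text):
    """Parse a GTFS YYYYMMDD date string."""
    return date(int(text[:4]), int(text[4:6]), int(text[6:8]))


def active_service_ids(day, calendar_text, calendar_dates_text):
    """Return the service_ids in effect on `day` using GTFS calendar rules."""
    active = set()
    for row in csv.DictReader(io.StringIO(calendar_text)):
        in_range = parse_date(row["start_date"]) <= day <= parse_date(row["end_date"])
        if in_range and row[WEEKDAYS[day.weekday()]] == "1":
            active.add(row["service_id"])
    for row in csv.DictReader(io.StringIO(calendar_dates_text)):
        if parse_date(row["date"]) != day:
            continue
        if row["exception_type"] == "1":    # service added on this date
            active.add(row["service_id"])
        elif row["exception_type"] == "2":  # service removed on this date
            active.discard(row["service_id"])
    return active


print(active_service_ids(date(2024, 5, 24), CALENDAR_SUPPLEMENT, CALENDAR_DATES_SUPPLEMENT))
```

The public GTFS calendar is untouched in this sketch; only the run-specific service_ids change on the exception date.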

Approach Limitations

  • A check would need to be added to ensure applicable trip service_ids are in effect whenever the corresponding run service_ids are in effect.

  • A service_id used for runs could NOT be the same as one used for trips in public GTFS, UNLESS the schedules are coupled together (i.e. the same runs will always be operated when those trips are being operated).

    • Where these need to be decoupled, the service_ids must be distinct.
    • e.g. Where trips are listed with service_id of spring24-schedule1, assigning runs to service_id of spring24-schedule1 indicates that those runs will always operate when the corresponding trips are running.
    • Plain Language: If you assign runs in run_events.txt to a service_id that is also assigned to trips, you're saying that those trips only operate when those runs are operating, and those runs operate whenever those trips are operating.
  • Where a service_id of runs is already paired to a service_id of trips (i.e. run_events.txt has entries with a defined service_id matching one already defined in the standard GTFS), a second set of runs can NOT be assigned to use the trips assigned to the same service_id, as — by definition — the runs paired to the same service_id would be in effect at the same time.

    • e.g. Where trips are listed with service_id of spring24-schedule1, and runs are also assigned to service_id of spring24-schedule1, runs with service_id of spring24-schedule1-crew2 could NOT have a trip_service_id of spring24-schedule1, as the calendar/calendar_dates entries would indicate the runs with service_id of spring24-schedule1 will always operate when trips with service_id of spring24-schedule1 are operating, thus creating duplicate runs for the same trips.
    • Plain Language: If you have one service_id that represents both trips and runs, you can't then create another set of runs that reference just the trip component of that service_id. Doing so would give duplicate runs, since the calendar would say both sets of runs are in effect at the same time. To do so, the runs and trips would need to be decoupled with separate service_ids.
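
The limitations above could translate into validation checks along these lines. This is only a sketch: the function name, the input shapes (a run-to-trip service mapping and a per-date set of active service_ids), and the choice to flag every case where several run services cover one trip service on the same date are assumptions made for illustration, not proposed spec language.

```python
from datetime import date


def check_run_services(run_to_trip, active_by_date):
    """Flag dates where run and trip service_ids are inconsistent.

    run_to_trip: dict of run service_id -> trip service_id it maps to
                 (a coupled schedule maps a service_id to itself).
    active_by_date: dict of date -> set of all service_ids in effect.
    Returns a list of human-readable problem descriptions.
    """
    problems = []
    for day, active in sorted(active_by_date.items()):
        covering = {}
        for run_sid, trip_sid in sorted(run_to_trip.items()):
            if run_sid not in active:
                continue
            # Check 1: the trips a run refers to must be operating that day.
            if trip_sid not in active:
                problems.append(
                    f"{day}: run service {run_sid} is active "
                    f"but trip service {trip_sid} is not"
                )
            covering.setdefault(trip_sid, []).append(run_sid)
        # Check 2: two run services on one trip service risk duplicate runs.
        for trip_sid, run_sids in sorted(covering.items()):
            if len(run_sids) > 1:
                problems.append(
                    f"{day}: trip service {trip_sid} is covered by "
                    f"multiple run services: {', '.join(run_sids)}"
                )
    return problems


# Hypothetical data: spring24-schedule1 is coupled (runs and trips share it),
# yet spring24-schedule1-crew2 also points at its trips.
run_to_trip = {
    "spring24-schedule1": "spring24-schedule1",
    "spring24-schedule1-crew2": "spring24-schedule1",
    "spring24-crew3": "spring24-schedule2",
}
active_by_date = {
    date(2024, 5, 23): {"spring24-schedule1"},
    date(2024, 5, 24): {"spring24-schedule1", "spring24-schedule1-crew2"},
    date(2024, 5, 25): {"spring24-crew3"},
}

problems = check_run_services(run_to_trip, active_by_date)
for problem in problems:
    print(problem)
```

A real validator might treat the second check as a warning rather than an error if the spec ends up allowing several run services to cover disjoint trips within one trip service.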

Next Steps

  • Happy to use this as a springboard for discussion as to potential approaches and evolutions of the spec for TODS 2.1+.

  • This also ties in nicely to discussion of approaches for rostering, as the assignment of runs within a roster can be mapped more flexibly with run variations uncoupled from trips.

  • We'll draft a pull request of what this'd look like to provide a more tangible overview of how this'd be represented in the documentation.

@jfabi
Collaborator

jfabi commented Jul 8, 2024

@jeffkessler-keolis Thanks for sharing the concern around the different ways schedules can be implemented as well as the possible solution.

Question: Your proposal is only to add the new optional trip_service_id. There may be multiple possible readings, but mine is that #66 doesn't prohibit a new service_id value from being added in TODS. The proposed spec does not say that run_events.service_id must come from trips.txt, and doing so would necessarily preclude runs containing only non-trip/deadhead events (re #11). Would it be worth adding some clarification to #66 to note how run_events.service_id can be utilized?

@skyqrose
Contributor

Okay, I think I understand how this is an issue, and how this suggestion fixes it. Responses to specific parts:

  1. Problem:
    1. CapeFLYER:
      1. To clarify, the way this is represented in GTFS is normal-service M-F and additional-cape-flyer-service on Fridays only. Then for crew, you're proposing normal-run-service on M-Thur, and entire-friday-run-service on Friday? And it wouldn't work to assign runs to only the additional-cape-flyer-service because some employees work part of a day on the CapeFLYER and part of their day on other routes?
  2. Potential Mitigations:
    1. Yeah this seems like not a good option.
    2. Are you (Keolis) blocked from producing run_events.txt until this is done?
  3. Proposed Solution:
    1. new trip_service_id:
      1. Is it 100% backwards compatible, or would consumers have to change to avoid misinterpreting anything if they unexpectedly get a TODS file that uses this approach?
      2. How would it compare to do a new run_service_id instead, and then the existing service_id field refers to the same trip service id as in GTFS? It's probably not better, but I just want to make sure the possibility is considered. run services and trip services are distinct categories, so it'd be nice if service_id always referred to the same thing, but run_events.txt should probably have run services as its primary key instead of trip services.
  4. Limitations:
    1. There's quite a bit of complexity in the requirements here. All of it makes sense, but writing it down in a spec that's easy to understand and easy to translate into validation scripts will be a challenge.
    2. If it's not too much of a burden, I think writing a draft spec-quality description for the service_id and trip_service_id columns would help make the issue easier to discuss and could uncover more problems to fix.

I think it'd be useful to see some example data (run_events.txt, calendar.txt, and calendar_supplement.txt), probably modeled after the CapeFLYER case.

@skyqrose
Contributor

And responding to Josh:
#66 without this proposal couldn't use newly-defined service_ids because without calendar_supplement.txt those services wouldn't happen on any dates. Once calendar_supplement.txt exists, yes, you should be able to define new deadhead-only or event-only services. (And any PR to implement this proposal should make that clear in the documentation for calendar_supplement.txt)

@jeffkessler-keolis
Contributor Author

@skyqrose I detailed a number of different examples in #80, but to directly answer your questions:

  1. That's one example, but yes, the CapeFLYER employees work in both additional-cape-flyer-service realm AND in normal-service. (I realize now I did not provide an example matching this specific scenario, although I can add one.)
  2. Yes.
  3. It is backwards compatible in the sense that it fully supports any previously-existing file, but could be viewed as a breaking change in that a run_events.txt file could fail a primary key validation in having two entries with the same PK if a consumer is not reading the additional trip_service_id field… but the underlying run data would be garbage, anyway, if that field were provided but not interpreted, so the non-silent failure would be desired.
  4. Drafted in Enabling separation of trips and runs via trip_service_ids #80
  5. (Per @jfabi's question and @skyqrose's answer) I hadn't even thought of this, but it's an excellent point and one I've added to the documentation.
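
The primary-key point in item 3 can be illustrated with a short Python sketch. It assumes the existing key is (service_id, run_id, event_sequence) and that trip_service_id would join the key under the proposal, as the answer implies; the rows are hypothetical and not taken from #80.

```python
import csv
import io
from collections import Counter

# Hypothetical rows that differ only in trip_service_id and trip_id.
RUN_EVENTS = """\
service_id,trip_service_id,run_id,event_sequence,trip_id
spring24-crew1,spring24-schedule1,run-100,1,trip1
spring24-crew1,spring24-schedule2,run-100,1,trip7
"""


def duplicate_keys(text, key_columns):
    """Return key tuples that appear on more than one row."""
    rows = csv.DictReader(io.StringIO(text))
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


# A consumer unaware of trip_service_id sees a duplicate key and fails loudly.
legacy = duplicate_keys(RUN_EVENTS, ["service_id", "run_id", "event_sequence"])

# A consumer that reads trip_service_id sees two distinct rows.
aware = duplicate_keys(
    RUN_EVENTS, ["service_id", "trip_service_id", "run_id", "event_sequence"]
)

print("legacy duplicates:", legacy)
print("aware duplicates:", aware)
```

The legacy check rejecting the file is the non-silent failure described above: the consumer stops instead of silently pairing runs with the wrong trips.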

3 participants