add optimise-services-and-exceptions ✅📝
part of #10
derhuerst committed Feb 26, 2022
1 parent f77a344 commit d0b6b2e
Showing 6 changed files with 327 additions and 5 deletions.
92 changes: 92 additions & 0 deletions docs/api.md
@@ -18,6 +18,7 @@
- [`readPathways(readFile, filters)`](#readpathways)
- [`readShapes(readFile, filters)`](#readshapes)
- [`computeTrajectories(readFile, filters)`](#computetrajectories)
- [`optimiseServicesAndExceptions(readFile, timezone, filters)`](#optimiseservicesandexceptions)


## `readCsv`
@@ -764,3 +765,94 @@ for await (const trajectory of computeTrajectories(readFile, filters)) {
```

*Note:* In order to work, `computeTrajectories` must load reduced forms of `trips.txt`, `stop_times.txt`, `frequencies.txt` and `shapes.txt` into memory. See [*store API*](#store-api) for more details.


## `optimiseServicesAndExceptions`

A GTFS feed may have a set of `calendar.txt` and/or `calendar_dates.txt` rows that express service days in an overly verbose way. Some examples:

- feeds without `calendar.txt`, where every service day is expressed as an `exception_type=1` (added) exception – In many such cases, we can reduce the number of exceptions by adding a row in `calendar.txt` with the respective day(s) turned on (e.g. `tuesday=1`).
- feeds with `calendar.txt`, where some services have more `exception_type=2` (removed) exceptions than "regular" day-of-the-week-based service dates (e.g. `thursday=1`) – In this case, we can turn off the "regular" service dates (`thursday=0`) and use `exception_type=1` (added) exceptions.

For each service, **`optimiseServicesAndExceptions` computes the optimal combination of day-of-the-week flags (e.g. `monday=1`) and exceptions, minimising the number of exceptions necessary to express the set of service dates**.

```js
const readCsv = require('gtfs-utils/read-csv')
const optimiseServices = require('gtfs-utils/optimise-services-and-exceptions')

const readFile = name => readCsv('path/to/gtfs/' + name + '.txt')

const services = optimiseServices(readFile, 'Europe/Berlin')
for await (const [id, changed, service, exceptions] of services) {
if (changed) {
console.log(id, 'changed!')
console.log('service:', service)
console.log('exceptions:', exceptions)
} else {
console.log(id, 'unchanged!', id)
}
}
```

`optimiseServicesAndExceptions(readFile, timezone, filters = {})` reads `calendar.txt` and `calendar_dates.txt`. It returns an [async iterable](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/asyncIterator) of `[serviceId, changed, service, exceptions]` entries.

- If `changed` is `true`,
	- the service's `calendar.txt` row or `calendar_dates.txt` rows (or both) have been optimised,
	- `service` contains the optimised service,
	- `exceptions` contains all `calendar_dates.txt` rows applying to the *optimised* service.
- If `changed` is `false`,
	- the service cannot be optimised,
	- `service` contains the `calendar.txt` row as it was before, or a mock service if there was none before,
	- `exceptions` contains the `calendar_dates.txt` rows as they were before.
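
Because every entry carries both the service and its exceptions, the results can be gathered into the rows of a new `calendar.txt` and `calendar_dates.txt`. The following is a minimal sketch of that idea, not part of `gtfs-utils`: `collectOptimisedRows` is a hypothetical helper, and `mockOptimised` is a stand-in with made-up data for the iterable that `optimiseServicesAndExceptions` returns. Whether a mock service (one that had no `calendar.txt` row before) should be written out is left to the caller.

```js
// Hypothetical helper, not part of gtfs-utils: gathers all entries into
// two arrays of rows, one per output file.
const collectOptimisedRows = async (optimised) => {
	const calendarRows = []
	const calendarDatesRows = []
	let nrChanged = 0
	for await (const [id, changed, service, exceptions] of optimised) {
		if (changed) nrChanged++
		calendarRows.push(service)
		calendarDatesRows.push(...exceptions)
	}
	return {calendarRows, calendarDatesRows, nrChanged}
}

// Stand-in for optimiseServicesAndExceptions(readFile, timezone),
// yielding made-up entries of the documented shape.
const mockOptimised = async function* () {
	yield ['a', true, {service_id: 'a', monday: '1'}, []]
	yield ['b', false, {service_id: 'b', monday: '0'}, [
		{service_id: 'b', date: '20220302', exception_type: '1'},
	]]
}

collectOptimisedRows(mockOptimised())
.then(({calendarRows, calendarDatesRows, nrChanged}) => {
	console.log(nrChanged, 'of', calendarRows.length, 'services changed')
	console.log(calendarDatesRows.length, 'exceptions in total')
})
```

Note that this holds all rows in memory; for large feeds, writing each entry out as it arrives is likely the better fit.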

The [test fixture](../test/fixtures/optimise-services-and-exceptions) contains three services (`more-exceptions-than-regular`, `more-regular-than-exceptions`, `should-stay-unchanged`), of which the first two can be optimised. With its files as input, the code above will print the following:

```
more-exceptions-than-regular changed!
service: {
service_id: 'more-exceptions-than-regular',
start_date: '20220301',
end_date: '20220410',
monday: '0',
tuesday: '0',
wednesday: '0',
thursday: '0',
friday: '0',
saturday: '0',
sunday: '0',
}
exceptions: [{
service_id: 'more-exceptions-than-regular',
date: '20220302',
exception_type: '1',
}, {
service_id: 'more-exceptions-than-regular',
date: '20220324',
exception_type: '1',
}, {
service_id: 'more-exceptions-than-regular',
date: '20220330',
exception_type: '1',
}, {
service_id: 'more-exceptions-than-regular',
date: '20220331',
exception_type: '1',
}]
more-regular-than-exceptions changed!
service: {
service_id: 'more-regular-than-exceptions',
monday: '1',
tuesday: '0',
wednesday: '0',
thursday: '0',
friday: '1',
saturday: '0',
sunday: '0',
start_date: '20220301',
end_date: '20220410',
}
exceptions: []
should-stay-unchanged unchanged! should-stay-unchanged
```
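
The underlying idea is a per-weekday majority rule: a day-of-the-week flag is turned on if the service runs on more than half of the dates with that weekday between `start_date` and `end_date`; every date that then disagrees with its flag becomes an exception. The sketch below illustrates this rule in a simplified, in-memory form. It is not the library's implementation: `optimiseOne` is a hypothetical function, it takes ISO dates (`YYYY-MM-DD`) rather than GTFS dates, and it walks every date in the range instead of working on streamed rows.

```js
// JS Date ordering, as returned by Date.prototype.getUTCDay()
const WEEKDAYS = [
	'sunday', 'monday', 'tuesday', 'wednesday',
	'thursday', 'friday', 'saturday',
]

// Hypothetical helper, not part of gtfs-utils.
// startDate, endDate & serviceDates are ISO dates, e.g. '2022-03-01'.
const optimiseOne = (startDate, endDate, serviceDates) => {
	const active = new Set(serviceDates)
	const possible = new Array(7).fill(0) // nr of dates in the range, per weekday
	const actual = new Array(7).fill(0) // nr of service dates, per weekday
	const days = []

	const end = new Date(endDate + 'T00:00Z')
	const t = new Date(startDate + 'T00:00Z')
	for (; t <= end; t.setUTCDate(t.getUTCDate() + 1)) {
		const date = t.toISOString().slice(0, 10)
		const wd = t.getUTCDay()
		days.push([date, wd])
		possible[wd]++
		if (active.has(date)) actual[wd]++
	}

	// majority rule: flag on if more than half of the weekday's dates are served
	const flags = actual.map((n, wd) => n > possible[wd] / 2)

	const exceptions = []
	for (const [date, wd] of days) {
		if (active.has(date) && !flags[wd]) {
			exceptions.push({date, exception_type: '1'}) // added
		} else if (!active.has(date) && flags[wd]) {
			exceptions.push({date, exception_type: '2'}) // removed
		}
	}

	const weekdays = Object.fromEntries(
		WEEKDAYS.map((name, wd) => [name, flags[wd] ? '1' : '0']),
	)
	return {weekdays, exceptions}
}

// two Tuesdays and one Wednesday within a two-week range
console.log(optimiseOne('2022-03-01', '2022-03-14', [
	'2022-03-01', '2022-03-02', '2022-03-08',
]))
```

In this sketch, a weekday served on exactly half of its dates stays turned off, since the comparison is strict.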
27 changes: 27 additions & 0 deletions examples/optimise-services-and-exceptions.js
@@ -0,0 +1,27 @@
'use strict'

const {join: pathJoin} = require('path')
const readCsv = require('../read-csv')
const optimiseServicesAndExceptions = require('../optimise-services-and-exceptions')

const fixtureDir = pathJoin(__dirname, '..', 'test', 'fixtures', 'optimise-services-and-exceptions')
const readFile = (file) => {
return readCsv(pathJoin(fixtureDir, file + '.csv'))
}

;(async () => {
const optimisedSvcs = optimiseServicesAndExceptions(readFile, 'Europe/Berlin')
for await (const [id, changed, service, exceptions] of optimisedSvcs) {
if (changed) {
console.log(id, 'changed!')
console.log('service:', service)
console.log('exceptions:', exceptions)
} else {
console.log(id, 'unchanged!', id)
}
}
})()
.catch((err) => {
console.error(err)
process.exit(1)
})
13 changes: 10 additions & 3 deletions lib/dates-between.js
@@ -30,7 +30,7 @@ const weekdayIndexes = [

const cache = new LRUCache({maxSize: 50})

const computeDatesBetween = (beginning, end, weekdays, timezone) => {
const computeDatesBetween = (beginning, end, weekdays, timezone, weekdayMap = null) => {
if (!isObj(weekdays)) throw new Error('weekdays must be an object.')
weekdays = Object.assign(Object.create(null), noWeekdays, weekdays)
for (let weekday in weekdays) {
@@ -50,6 +50,7 @@ const computeDatesBetween = (beginning, end, weekdays, timezone) => {
weekdays.saturday,
weekdays.sunday,
timezone,
weekdayMap !== null ? 'wd' : '',
].join('-')
if (cache.has(signature)) {
return Array.from(cache.get(signature))
@@ -62,8 +63,14 @@ const computeDatesBetween = (beginning, end, weekdays, timezone) => {
const dates = []
let t = new Date(beginning + 'T00:00Z')
for (let i = 0; t <= end; i++) {
if (weekdays[weekdayIndexes[t.getUTCDay()]]) {
dates.push(t.toISOString().slice(0, 10))
const weekday = t.getUTCDay()
if (weekdays[weekdayIndexes[weekday]]) {
const date = t.toISOString().slice(0, 10)
dates.push(date)

if (weekdayMap !== null) {
weekdayMap.set(date, weekday)
}
}
t.setUTCDate(t.getUTCDate() + 1)
}
85 changes: 85 additions & 0 deletions optimise-services-and-exceptions.js
@@ -0,0 +1,85 @@
'use strict'

const readServicesAndExceptions = require('./read-services-and-exceptions')
const datesBetween = require('./lib/dates-between')

const WEEKDAYS = [
// JS Date ordering
'sunday',
'monday',
'tuesday',
'wednesday',
'thursday',
'friday',
'saturday',
]

const noWeekday = {
monday: false,
tuesday: false,
wednesday: false,
thursday: false,
friday: false,
saturday: false,
sunday: false,
}

const formatDate = isoDate => isoDate.split('-').join('')

const optimiseServicesAndExceptions = async function* (readFile, timezone, filters = {}, opt = {}) {
const weekdaysMap = new Map()
const svcsAndExceptions = readServicesAndExceptions(readFile, timezone, filters, {
...opt,
exposeStats: true,
weekdaysMap,
})

for await (let [serviceId, dates, svc, nrOfDates, removedDates] of svcsAndExceptions) {
const nrOfDefaultDates = []
for (let wd = 0; wd < WEEKDAYS.length; wd++) {
const defaultDates = datesBetween(
svc.start_date, svc.end_date,
{...noWeekday, [WEEKDAYS[wd]]: true},
timezone,
weekdaysMap,
)
nrOfDefaultDates[wd] = defaultDates.length
}

let changed = false
svc = {...svc}
for (let wd = 0; wd < 7; wd++) {
// todo: make this customisable
// turn the weekday on if the service runs on more than half of its dates
const flag = nrOfDates[wd] > nrOfDefaultDates[wd] / 2 ? '1' : '0'
changed = changed || (flag !== svc[WEEKDAYS[wd]])
svc[WEEKDAYS[wd]] = flag
}

const exceptions = []
for (const date of dates) {
const wd = weekdaysMap.get(date)
if (svc[WEEKDAYS[wd]] === '1') continue
exceptions.push({
service_id: serviceId,
date: formatDate(date),
exception_type: '1', // added
})
}

for (const date of removedDates) {
const wd = weekdaysMap.get(date)
if (svc[WEEKDAYS[wd]] === '0') continue
exceptions.push({
service_id: serviceId,
date: formatDate(date),
exception_type: '2', // removed
})
}

// todo [breaking]: remove serviceId (idx 0), move svc first,
// follow read-services-and-exceptions here
yield [serviceId, changed, svc, exceptions]
}
}

module.exports = optimiseServicesAndExceptions
40 changes: 38 additions & 2 deletions read-services-and-exceptions.js
@@ -34,6 +34,15 @@ const readServicesAndExceptions = async function* (readFile, timezone, filters =
throw new TypeError('filters.serviceException must be a function')
}

const {
exposeStats,
weekdaysMap,
} = {
exposeStats: false,
weekdaysMap: new Map(),
...opt,
}

await new Promise(r => setTimeout(r, 0))

let servicesFileExists = true
@@ -83,9 +92,19 @@ const readServicesAndExceptions = async function* (readFile, timezone, filters =
filterB: serviceExceptionFilter,
})

const weekdayOf = (date) => {
if (weekdaysMap.has(date)) return weekdaysMap.get(date)
// use UTC, to be consistent with lib/dates-between & independent of the local timezone
const weekday = new Date(date + 'T00:00Z').getUTCDay()
weekdaysMap.set(date, weekday)
return weekday
}

const {NONE} = joinIteratively
let dates = []
let svc = {service_id: NaN}
// todo: default to null? perf?
let nrOfDates = new Array(7).fill(0)
let removedDates = []

for await (const [s, ex] of pairs) {
let _svc = {service_id: NaN}
@@ -113,12 +132,13 @@ const readServicesAndExceptions = async function* (readFile, timezone, filters =
if (dates.length > 0) {
if (svc.start_date === null) svc.start_date = dates[0]
if (svc.end_date === null) svc.end_date = dates[dates.length - 1]
yield [svc.service_id, dates, svc]
yield [svc.service_id, dates, svc, nrOfDates, removedDates]
}

svc = _svc

if (s !== NONE) {
const wdm = exposeStats ? weekdaysMap : null
dates = datesBetween(
s.start_date, s.end_date,
{
@@ -131,10 +151,19 @@ const readServicesAndExceptions = async function* (readFile, timezone, filters =
sunday: s.sunday === '1',
},
timezone,
wdm,
)
} else {
dates = []
}

if (exposeStats) {
nrOfDates = new Array(7).fill(0)
for (const date of dates) {
nrOfDates[weekdayOf(date)]++
}
removedDates = []
}
}

if (ex !== NONE) {
@@ -145,10 +174,17 @@ const readServicesAndExceptions = async function* (readFile, timezone, filters =
const i = arrEq(dates, date)
if (i >= 0) {
dates.splice(i, 1) // delete
if (exposeStats) {
nrOfDates[weekdayOf(date)]--
removedDates.push(date)
}
}
} else if (ex.exception_type === ADDED) {
if (!arrHas(dates, date)) {
arrInsert(dates, date)
if (exposeStats) {
nrOfDates[weekdayOf(date)]++
}
}
} // todo: else emit error
}
@@ -157,7 +193,7 @@ const readServicesAndExceptions = async function* (readFile, timezone, filters =
if (dates.length > 0) {
if (svc.start_date === null) svc.start_date = dates[0]
if (svc.end_date === null) svc.end_date = dates[dates.length - 1]
yield [svc.service_id, dates, svc]
yield [svc.service_id, dates, svc, nrOfDates, removedDates]
}
}
