Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow extensions of other datatypes #323

Closed
lfrank opened this issue Jan 12, 2018 · 10 comments
Closed

Allow extensions of other datatypes #323

lfrank opened this issue Jan 12, 2018 · 10 comments
Labels
category: enhancement improvements of code or code behavior
Milestone

Comments

@lfrank
Copy link

lfrank commented Jan 12, 2018

It would be very helpful if it were possible to extend the NWBTable and NWBTableRegion classes as well.
For example, we'd like to be able to do the following:

#define a franklab ElectrodeTableRegion that includes:
#		an attribute of ntrode to correspond to the tetrode or probe shank for this region
#		a link to an IntervalSeries defining the set of times that are valid for this ntrode
FL_ElectrodeTableRegion_ext = NWBGroupSpec('Frank Lab ElectrodeTableRegion',
 								 attributes=[NWBAttributeSpec('ntrode',
 															  'int',
 															  'The user defined number for this collection of electrodes')],
 								 links=[NWBLinkSpec('link to IntervalSeries indicating valid time range',
 													'IntervalSeries',
 													quantity=1,
 													name='valid_times')],
 								 neurodata_type_inc='ElectrodeTableRegion',
 								 neurodata_type_def='FL_ElectrodeTableRegion')
@lfrank lfrank added the category: enhancement improvements of code or code behavior label Jan 12, 2018
@oruebel
Copy link
Contributor

oruebel commented Jan 12, 2018

@lfrank 'ElectrodeTableRegion is a dataset, not a group. You should be able to extend it like any other dataset. In your example, this would mean instead of defining a NWBGroupSpec you would need to define a NWBDatasetSpec instead. You can then add attributes etc. to the dataset. However, since it is a dataset you cannot add sub groups, datasets, or links. Similarly, ElectrodeTable is also a dataset so you will also need to use NWBDatasetSpec when you want to extend the table.

@ajtritt
Copy link
Member

ajtritt commented Jan 12, 2018

@lfrank I suggest you add valid times for ntrodes by extending ElectrodeGroup. This sounds like a property of an NTrode, rather than the subset of ElectrodeTable.

ntrode_ext = NWBGroupSpec('a group of electrodes that is not always valid'
                          links=[NWBLinkSpec('link to IntervalSeries indicating valid time range',
                                             'IntervalSeries',
                                             quantity=1,
                                             name='valid_times')],
                          neurodata_type_inc='ElectrodeGroup',
                          neurodata_type_def='NtrodeGroup')

After doing this, you have two options. 1) Leave ElectrodeTableRegion alone, since the Ntrode Id should be accessible already in the NtrodeGroup referred to in the ElectrodeTable (ElectrodeTableRegion is just a subset of this). Or 2) extend it to add an attribute containing the Ntrode id.

If you want to attach it to the ElectrodeTableRegion for convenience, go for it. The only draw back is that it duplicates some data, but I don't think that will be a problem since I assume you're not going to update ntrode IDs.

FL_etr_ext = NWBDatasetSpec('Frank Lab ElectrodeTableRegion',   # use a DatasetSpec, since ElectrodeTableRegion is a dataset
                attributes=[NWBAttributeSpec('ntrode','int',
                                             'The user defined number for this collection of electrodes')],
                neurodata_type_inc='ElectrodeTableRegion',
                neurodata_type_def='FL_ElectrodeTableRegion')

Alternatively, you could store a reference to the NtrodeGroup that the ElectrodeTableRegion applies to. Would it make sense to just add this functionality to the core schema (in addition to the base ElectrodeTableRegion)? I suspect storing data from electrodes that are part of the same electrode group is a common use-case.

FL_etr_ext = NWBDatasetSpec('Frank Lab ElectrodeTableRegion',   # use a DatasetSpec, since ElectrodeTableRegion is a dataset
                attributes=[NWBAttributeSpec('ntrode', RefSpec('NtrodeGroup', 'object'),
                                             'The Ntrode these electrodes belong to')],
                neurodata_type_inc='ElectrodeTableRegion',
                neurodata_type_def='NtrodeElectrodeTableRegion')

@lfrank
Copy link
Author

lfrank commented Jan 13, 2018

@oruebel @ajtritt Thanks to both of you for the quick responses.

First question: how could I have figured out that ElectrodeTableRegion was a dataset?

Second, in terms of the best way to do this, the situation I am trying to address is as follows: A given physical devices (say, a tetrode) has four electrodes which are part of an electrode group. That device has a fixed ID value (so that could indeed be part of the electrode group) but is moved from day to day. As such, the z coordinate (for example) would change.

Based on our earlier discussion, I had thought the proposed solution was to create separate entries for each day in the electrode table, such that that device would have it's own table region for each day which could have different coordinates associated with it. In that case, we need a separate list of valid times for each day, which would seem to me to suggest that I should extend the table region to include validtimes.

This is a reasonably common scenario, so I'd certainly be in favor of extending the spec to include it.

Thoughts on the best way to do this?

@oruebel
Copy link
Contributor

oruebel commented Jan 16, 2018

In terms of looking up the type, I'm not sure if we have a shorthand function for that right now. I typically just look at the spec either in YAML directly or by looking it up in the catalog:

>>> import pynwb
>>> # Get the namespace catalog
>>> ts =  pynwb.get_type_map()
>>> nc = ts.namespace_catalog
>>> # Look up the spec for the ElectrodeTable
>>> es =   nc.get_spec('core', 'ElectrodeTable')
>>> # Check the type
>>> type(es)
<class 'pynwb.spec.NWBDatasetSpec'>

@ajtritt is there a shorter (or more elegant) way of looking up the type (dataset, group, attribute etc.) of a neurodata_type?

Regarding your second question, I need to have a look at the ElectrodeTable again to see how you would do this.

@ajtritt
Copy link
Member

ajtritt commented Jan 17, 2018

@lfrank The schema documentation indicates that ElectrodeTableRegion is a dataset, and the PyNWB documentation says to use NWBDatasetSpec for extending these. If you think this is confusing, please let us know--any suggestions on making things clearer are much appreciated.

@ajtritt
Copy link
Member

ajtritt commented Jan 17, 2018

@lfrank I think you should keep the valid times off of the ElectrodeTableRegion since you may use the same region in multiple places. For example, you may want to filter some raw data and create multiple ElectricalSeries from (e.g. one for each band) from the recording of a single tetrode. In this case, you would end up with multiple ElectrodeTableRegions duplicating the same information about valid times. To avoid duplicating data all over a file, I suggest keeping valid times attached to the ElectrodeGroup/ElectrodeTable somehow.

How to best structure it depends on how you want to access the data later on. Do you want to query for electrodes valid during a given time? Or do you want to query for the time during which a given electrode is valid? The latter query would be efficient with the schema I proposed. If you think you'll do the former, a better way would be to create a table with start, stop, and a region-reference to the valid rows in the ElectrodeTable (I can give specifics if you want to do this).

@lfrank
Copy link
Author

lfrank commented Jan 17, 2018

@ajtritt Thanks, that's helpful.

First, I didn't know to look in the schema documentation (although it seems fairly obvious in retrospect 8-). It would be useful to add a notation to that effect in the documentation of the extensions at some point.

As for the actual extension, the queries are much more likely to want to find electrodes that are valid during a given time, so I think the table is the way to go there. If you have time to suggest a specific way to do that, I'd appreciate it, and I think this is of broad enough relevance that it could be added to the spec, but that is a decision for the TAB...

@oruebel
Copy link
Contributor

oruebel commented Jan 17, 2018

@lfrank @ajtritt I've updated the schema docs to explicitly list the base type for each type in the main overview (instead of just in the table) (see NeurodataWithoutBorders/nwb-schema#107). I've also added a tip to the extension tutorial to point users to the schema docs (see #328). Hopefully that takes care of the documentation problem part of this issues.

@ajtritt
Copy link
Member

ajtritt commented Jan 18, 2018

@lfrank The table for querying for electrodes that were valid at a given time sounds like a data structure we would need for a query interface. We can bring this up in the next TAB meeting.

@oruebel oruebel added this to the NWB 2.x milestone May 18, 2018
@rly
Copy link
Contributor

rly commented Apr 11, 2024

All data types can be extended now. The electrodes table is not yet a data type but we are working on revamping the electrodes table and making it a data type as part of NEP002 and SpikeInterface/ndx-probeinterface#24 over the next couple of months.

@rly rly closed this as completed Apr 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: enhancement improvements of code or code behavior
Projects
None yet
Development

No branches or pull requests

4 participants