Currently, you need to write your own conversion code, since different datasets come in different formats. Here is a brief (and incomplete) description of the data format we use:
You will need two files: `datasetName_note_chords.length.npy` and `datasetName_note_chords.npy`.
The `datasetName_note_chords.npy` file is a matrix with shape `[T, D]`, where T is the number of frames (one frame is a sixteenth note) and D is the per-frame feature dimension (17, as broken down below).
The second dimension is a concatenation of the following:
* 1-D notes: range 0-129. 0 for silence, 1 for sustain, and 2-129 for note onsets at the 128 MIDI pitches (value = MIDI pitch + 2; see the sketch after this list).
* 1-D root: range 0-11, or -1. The chord root pitch class; -1 for no chord.
* 12-D chroma: boolean. The pitch classes present in the chord.
* 1-D bass: range 0-11, or -1. The chord bass pitch class; -1 for no chord. In the Nottingham dataset, we use the root as the bass.
* 1-D beat: boolean. Whether this frame falls on a beat (unused in training).
* 1-D downbeat: boolean. Whether this frame falls on a downbeat (unused in training).
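As a concrete illustration of the notes encoding, here is a minimal sketch; the `(start_frame, duration_frames, midi_pitch)` input format for `onsets` is my own assumption for the example, not something the repo defines:

```python
import numpy as np

def encode_notes(onsets, n_frames):
    """Encode a monophonic melody as the 1-D notes column (range 0-129).

    onsets: list of (start_frame, duration_frames, midi_pitch) tuples
            (a hypothetical input format, just for illustration).
    """
    notes = np.zeros(n_frames, dtype=np.int64)      # 0 = silence
    for start, dur, pitch in onsets:
        notes[start] = pitch + 2                    # 2-129 = onset of MIDI pitch 0-127
        notes[start + 1:start + dur] = 1            # 1 = sustain
    return notes

# A quarter-note C4 (MIDI 60) at frame 0, then an eighth-note E4 (MIDI 64):
# one frame = a sixteenth note, so the durations are 4 and 2 frames.
print(encode_notes([(0, 4, 60), (4, 2, 64)], n_frames=8))
# -> [62  1  1  1 66  1  0  0]
```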
Assuming your dataset has N songs, concatenate the per-song matrices along the first axis (the T axis) and record each song's length in `datasetName_note_chords.length.npy` (i.e., `datasetName_note_chords.length.npy` is an integer array of size N).
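A minimal conversion sketch, assuming a hypothetical `load_song_features` that yields the six per-frame arrays described above for each song (replace it with your own conversion code):

```python
import numpy as np

def pack_song(notes, root, chroma, bass, beat, downbeat):
    """Stack one song's per-frame features into a [T, 17] matrix
    in the column order documented above."""
    return np.column_stack([notes, root, chroma, bass, beat, downbeat])

# Hypothetical loader: yields (notes, root, chroma, bass, beat, downbeat)
# for each song in your dataset.
songs = [pack_song(*feats) for feats in load_song_features('myDataset')]

lengths = np.array([m.shape[0] for m in songs], dtype=np.int64)   # size N
np.save('datasetName_note_chords.length.npy', lengths)
np.save('datasetName_note_chords.npy', np.concatenate(songs, axis=0))
```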
You can refer to `nottingham_note_chords.length.npy` and `nottingham_note_chords.npy` for an example.
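To sanity-check your files against those examples, you can load the concatenated matrix and split it back into individual songs:

```python
import numpy as np

data = np.load('nottingham_note_chords.npy')            # [sum(T_i), D]
lengths = np.load('nottingham_note_chords.length.npy')  # [N]

assert data.shape[0] == lengths.sum()
# Split back into a list of N per-song [T_i, D] matrices.
songs = np.split(data, np.cumsum(lengths)[:-1], axis=0)
print(len(songs), songs[0].shape)
```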
After that, you can create a dataset split file `datasetName_note_chords.split.txt` that splits the song indices (0 to N-1) into training, validation, and test sets (3 lines, one line per split).
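A sketch of writing that split file; I'm assuming space-separated indices on each line, so please double-check the separator against the repo's loader before relying on it:

```python
import numpy as np

n_songs = len(np.load('datasetName_note_chords.length.npy'))
indices = np.random.default_rng(0).permutation(n_songs)

# Illustrative 80/10/10 proportions; adjust to taste.
n_train = int(0.8 * n_songs)
n_val = int(0.1 * n_songs)
splits = (indices[:n_train],
          indices[n_train:n_train + n_val],
          indices[n_train + n_val:])

with open('datasetName_note_chords.split.txt', 'w') as f:
    for split in splits:                      # 3 lines: train, val, test
        f.write(' '.join(map(str, split)) + '\n')
```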
Thanks.