Locata dataset #34

Open
wants to merge 12 commits into
base: master
14 changes: 13 additions & 1 deletion CHANGELOG.rst
@@ -11,6 +11,13 @@ adheres to `Semantic Versioning <http://semver.org/spec/v2.0.0.html>`_.
`Unreleased`_
-------------


Added
~~~~~

- Support for the `LOCATA <http://www.locata-challenge.org>`_ dataset in
  ``pyroomacoustics.datasets.locata``

Bugfix
~~~~~~

@@ -209,7 +216,6 @@ Added
- Add iterative Wiener filtering approach for single channel denoising in
  ``pyroomacoustics.denoise.iterative_wiener``.


Changed
~~~~~~~

@@ -302,6 +308,12 @@ Deprecation Notice
``pyroomacoustics.overlap_add``, etc, are now **deprecated**
and will be removed in the near future

Bugfix
~~~~~~

- Fixed the way multichannel signals are handled by the ``plot`` and ``play``
  methods of ``pyroomacoustics.datasets.AudioSample``.

`0.1.18`_ - 2018-04-24
----------------------

1 change: 1 addition & 0 deletions README.rst
@@ -107,6 +107,7 @@ moment we support the following.
* `CMU ARCTIC <http://www.festvox.org/cmu_arctic/>`_
* `TIMIT <https://catalog.ldc.upenn.edu/ldc93s1>`_
* `Google Speech Commands Dataset <https://research.googleblog.com/2017/08/launching-speech-commands-dataset.html>`_
* `LOCATA <http://www.locata-challenge.org>`_

For more details, see the `doc <http://pyroomacoustics.readthedocs.io/en/pypi-release/pyroomacoustics.datasets.html>`_.

7 changes: 7 additions & 0 deletions docs/pyroomacoustics.datasets.locata.rst
@@ -0,0 +1,7 @@
pyroomacoustics.datasets.locata module
======================================

.. automodule:: pyroomacoustics.datasets.locata
    :members:
    :undoc-members:
    :show-inheritance:
1 change: 1 addition & 0 deletions docs/pyroomacoustics.datasets.rst
@@ -16,6 +16,7 @@ Datasets Available

pyroomacoustics.datasets.cmu_arctic
pyroomacoustics.datasets.google_speech_commands
pyroomacoustics.datasets.locata
pyroomacoustics.datasets.timit

Tools and Helpers
15 changes: 15 additions & 0 deletions examples/locata.py
@@ -0,0 +1,15 @@
import argparse
import numpy as np
import pyroomacoustics as pra

if __name__ == '__main__':
    # parse arguments
    parser = argparse.ArgumentParser(description='Test DOA algorithm on the locata data.')
    parser.add_argument('-a', '--algo', choices=pra.doa.algorithms.keys(),
            help='doa algorithm')
    parser.add_argument('-l', '--locata', type=str, default=None,
            help='Location of LOCATA files')
    parser.add_argument('--task', type=int, default=1,
            help='LOCATA task number')
    parser.add_argument('--rec', type=int, default=1,
            help='LOCATA recording number')
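Review note: the script above stops after declaring its options. A minimal sketch of how the parser might be exercised is given here, under stated assumptions: `ALGORITHMS` is a hypothetical stand-in for the keys of the DOA algorithm dictionary, `build_parser` is an illustrative helper that is not part of this PR, and the argument list passed to `parse_args` is made up for the demonstration.

```python
import argparse

# Hypothetical stand-in for the DOA algorithm names; the script in this PR
# takes its choices from the library's algorithm dictionary instead.
ALGORITHMS = ("SRP", "MUSIC", "TOPS")


def build_parser():
    """Build the same four options as examples/locata.py (illustrative helper)."""
    parser = argparse.ArgumentParser(
        description="Test DOA algorithm on the locata data."
    )
    parser.add_argument("-a", "--algo", choices=ALGORITHMS, help="doa algorithm")
    parser.add_argument("-l", "--locata", type=str, default=None,
                        help="Location of LOCATA files")
    parser.add_argument("--task", type=int, default=1, help="LOCATA task number")
    parser.add_argument("--rec", type=int, default=1, help="LOCATA recording number")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--algo", "MUSIC", "--task", "3"])
print(args.algo, args.locata, args.task, args.rec)
```

Options that are not passed keep their declared defaults, so `--rec` stays at 1 and `--locata` stays `None` in this call.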
4 changes: 4 additions & 0 deletions examples/locata_challenge.py
@@ -0,0 +1,4 @@
import argparse
import pyroomacoustics as pra


1 change: 1 addition & 0 deletions pyroomacoustics/datasets/__init__.py
@@ -141,3 +141,4 @@
from .timit import Word, Sentence, TimitCorpus
from .cmu_arctic import CMUArcticCorpus, CMUArcticSentence, cmu_arctic_speakers
from .google_speech_commands import GoogleSpeechCommands, GoogleSample
from .locata import LOCATA, LocataRecording
14 changes: 12 additions & 2 deletions pyroomacoustics/datasets/base.py
@@ -99,7 +99,7 @@ class AudioSample(Sample):
     The sampling frequency of the samples is an extra parameter.
 
     For multichannel audio, we assume the same format used by
-    ```scipy.io.wavfile <https://docs.scipy.org/doc/scipy-0.14.0/reference/io.html#module-scipy.io.wavfile>`_``,
+    `scipy.io.wavfile <https://docs.scipy.org/doc/scipy-0.14.0/reference/io.html#module-scipy.io.wavfile>`_,
     that is ``data`` is then a 2D array with each column being a channel.

Attributes
@@ -130,7 +130,12 @@ def play(self, **kwargs):
             print('Warning: sounddevice package is required to play audiofiles.')
             return
 
-        sd.play(self.data, samplerate=self.fs, **kwargs)
+        if self.data.ndim > 1:
+            data = self.data[:,0]
+        else:
+            data = self.data
+
+        sd.play(data, samplerate=self.fs, **kwargs)
 
     def plot(self, NFFT=512, noverlap=384, **kwargs):
         '''
@@ -150,6 +155,8 @@ def plot(self, NFFT=512, noverlap=384, **kwargs):
        # Handle single channel case
        if self.data.ndim == 1:
            data = self.data[:,None]
        else:
            data = self.data

        nchannels = data.shape[1]

@@ -158,11 +165,14 @@ def plot(self, NFFT=512, noverlap=384, **kwargs):
        prows = int(np.ceil(nchannels / pcols))

        for c in range(nchannels):
            plt.subplot(prows, pcols, c+1)
            plt.specgram(data[:,c], NFFT=NFFT, Fs=self.fs, noverlap=noverlap, **kwargs)
            plt.xlabel('Time [s]')
            plt.ylabel('Frequency [Hz]')
            plt.title('Channel {}'.format(c+1))

        plt.tight_layout(pad=0.5)


class Dataset(object):
'''
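Review note: the multichannel handling in the `play` and `plot` hunks of `base.py` can be checked in isolation with a short sketch like the one below. The helper names `first_channel`, `as_columns`, and `subplot_grid` are hypothetical and do not appear in the PR. The near-square grid in `subplot_grid` is an assumption, because the line that defines `pcols` lies outside the hunks shown.

```python
import math

import numpy as np


def first_channel(data):
    """Return a 1D signal for playback: column 0 of a 2D array, or the input if it is 1D."""
    data = np.asarray(data)
    return data[:, 0] if data.ndim > 1 else data


def as_columns(data):
    """Return a 2D (n_samples, n_channels) array so that plotting can loop over columns."""
    data = np.asarray(data)
    return data[:, None] if data.ndim == 1 else data


def subplot_grid(nchannels):
    """Assumed layout: a near-square grid (the PR's own definition of pcols is not shown)."""
    pcols = int(math.ceil(math.sqrt(nchannels)))
    prows = int(math.ceil(nchannels / pcols))
    return prows, pcols


mono = np.zeros(8)            # single channel, shape (8,)
stereo = np.zeros((8, 2))     # two channels, one per column

print(first_channel(stereo).shape, as_columns(mono).shape, subplot_grid(5))
```

Both helpers accept either layout, which mirrors the `ndim` checks in the diff: playback always receives a 1D signal, and the plotting loop always sees one column per channel.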
6 changes: 0 additions & 6 deletions pyroomacoustics/datasets/cmu_arctic.py
@@ -33,12 +33,6 @@
 import numpy as np
 from scipy.io import wavfile
 
-try:
-    import sounddevice as sd
-    have_sounddevice = True
-except:
-    have_sounddevice = False
-
 from .utils import download_uncompress
 from .base import Meta, AudioSample, Dataset
 