Bug report

I encounter unexpected output in the second code cell of the "Topic: Manual catalog verification" demo.

I would expect to get (as in the doc):
```
Validating catalog at path https://data.lsdb.io/hats/gaia_dr3/gaia/ ...
Found 3933 partitions.
Approximate coverage is 100.00 % of the sky.
True
```
Instead, I get:
```
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
Cell In[2], line 1
----> 1 is_valid_catalog(gaia_catalog_path, verbose=True, fail_fast=True, strict=True)

File ~/.local/lib/python3.11/site-packages/hats/io/validation.py:125, in is_valid_catalog(pointer, strict, fail_fast, verbose)
    113 ignore_prefixes = [
    114     "_common_metadata",
    115     "_metadata",
        (...)
    121     "README",
    122 ]
    124 # As a side effect, this confirms that we can load the directory as a valid dataset.
--> 125 (dataset_path, dataset) = read_parquet_dataset(
    126     pointer,
    127     ignore_prefixes=ignore_prefixes,
    128     exclude_invalid_files=False,
    129 )
    131 parquet_path_pixels = []
    132 for hats_file in dataset.files:

File ~/.local/lib/python3.11/site-packages/hats/io/file_io/file_io.py:190, in read_parquet_dataset(source, **kwargs)
    187     file_system = source.fs
    188     source = source.path
--> 190 dataset = pds.dataset(
    191     source,
    192     filesystem=file_system,
    193     format="parquet",
    194     **kwargs,
    195 )
    196 return (str(source), dataset)

File ~/.conda/envs/lsdb/lib/python3.11/site-packages/pyarrow/dataset.py:794, in dataset(source, schema, format, filesystem, partitioning, partition_base_dir, exclude_invalid_files, ignore_prefixes)
    783 kwargs = dict(
    784     schema=schema,
    785     filesystem=filesystem,
        (...)
    790     selector_ignore_prefixes=ignore_prefixes
    791 )
    793 if _is_path_like(source):
--> 794     return _filesystem_dataset(source, **kwargs)
    795 elif isinstance(source, (tuple, list)):
    796     if all(_is_path_like(elem) or isinstance(elem, FileInfo) for elem in source):

File ~/.conda/envs/lsdb/lib/python3.11/site-packages/pyarrow/dataset.py:486, in _filesystem_dataset(source, schema, filesystem, partitioning, format, partition_base_dir, exclude_invalid_files, selector_ignore_prefixes)
    478 options = FileSystemFactoryOptions(
    479     partitioning=partitioning,
    480     partition_base_dir=partition_base_dir,
    481     exclude_invalid_files=exclude_invalid_files,
    482     selector_ignore_prefixes=selector_ignore_prefixes
    483 )
    484 factory = FileSystemDatasetFactory(fs, paths_or_selector, format, options)
--> 486 return factory.finish(schema)

File ~/.conda/envs/lsdb/lib/python3.11/site-packages/pyarrow/_dataset.pyx:3126, in pyarrow._dataset.DatasetFactory.finish()

File ~/.conda/envs/lsdb/lib/python3.11/site-packages/pyarrow/error.pxi:155, in pyarrow.lib.pyarrow_internal_check_status()

File ~/.conda/envs/lsdb/lib/python3.11/site-packages/pyarrow/error.pxi:92, in pyarrow.lib.check_status()

ArrowInvalid: Error creating dataset. Could not read schema from 'https://data.lsdb.io/hats/gaia_dr3/gaia/'. Is this a 'parquet' file?: Could not open Parquet input source 'https://data.lsdb.io/hats/gaia_dr3/gaia/': Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.
```
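For context on the error message: every valid Parquet file begins and ends with the 4-byte magic marker `PAR1`, and pyarrow checks the footer before parsing. "Parquet magic bytes not found in footer" therefore means pyarrow read *something* at that URL (likely an HTML directory listing or error page, not a data file) whose last bytes are not `PAR1`. The sketch below illustrates that footer check using only the standard library; `looks_like_parquet` is a made-up helper for illustration, not part of hats or pyarrow's actual code:

```python
import os
import tempfile

def looks_like_parquet(path):
    """Return True if the file's last 4 bytes are the Parquet footer magic b'PAR1'."""
    if os.path.getsize(path) < 8:  # a real Parquet file is at least header + footer magic
        return False
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)  # jump to the final 4 bytes (the footer magic)
        return f.read(4) == b"PAR1"

# Demo: a Parquet-shaped file passes, an HTML error page does not.
with tempfile.TemporaryDirectory() as tmp:
    fake_parquet = os.path.join(tmp, "fake.parquet")
    with open(fake_parquet, "wb") as f:
        f.write(b"PAR1" + b"\x00" * 16 + b"PAR1")  # header magic, padding, footer magic
    html_page = os.path.join(tmp, "index.html")
    with open(html_page, "wb") as f:
        f.write(b"<html><body>404 Not Found</body></html>")
    print(looks_like_parquet(fake_parquet))  # True
    print(looks_like_parquet(html_page))     # False
```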
I am running this in a Python notebook on USDF, with lsdb installed via `pip install 'lsdb[full]'`.
Maybe I've missed an installation step? The previous cell (and all previous tutorial notebooks) run fine.
Yes - this has been addressed in astronomy-commons/hats#404, but the fix has not been released yet. Does this still occur if you install hats from current main?
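For anyone wanting to try the unreleased fix, installing hats from the main branch would typically look like this (the repository URL is inferred from the org/repo name above; adjust if the repository lives elsewhere):

```shell
# Replace the released hats with the current main branch (includes unreleased fixes).
pip install --upgrade 'git+https://github.com/astronomy-commons/hats.git@main'
```

After installing, restart the notebook kernel so the new hats version is picked up.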