
Error when downloading with HSDS Local Server: IncompleteRead #31

Closed
hawbecker opened this issue Jan 28, 2025 · 12 comments

@hawbecker
I created a new conda environment and installed the necessary modules for HSDS, h5pyd, and the others required in the tutorial notebook. I then started a local HSDS server following the steps in the guide and everything up to this point seems to be working fine.

I am running through the NSRDB tutorial (notebook 3) and everything goes well until cell 19, which downloads a time slice at every 10th location. After almost exactly one minute (give or take a second or two), I get the following error:

IncompleteRead                            Traceback (most recent call last)
File ~/.local/lib/python3.13/site-packages/urllib3/response.py:754, in HTTPResponse._error_catcher(self)
    753 try:
--> 754     yield
    756 except SocketTimeout as e:
    757     # FIXME: Ideally we'd like to include the url in the ReadTimeoutError but
    758     # there is yet no clean way to get at it from this context.

File ~/.local/lib/python3.13/site-packages/urllib3/response.py:900, in HTTPResponse._raw_read(self, amt, read1)
    890     if (
    891         self.enforce_content_length
    892         and self.length_remaining is not None
   (...)
    898         # raised during streaming, so all calls with incorrect
    899         # Content-Length are caught.
--> 900         raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
    901 elif read1 and (
    902     (amt != 0 and not data) or self.length_remaining == len(data)
    903 ):
   (...)
    906     # `http.client.HTTPResponse`, so we close it here.
    907     # See https://github.com/python/cpython/issues/113199

IncompleteRead: IncompleteRead(0 bytes read, 403680 more expected)

The above exception was the direct cause of the following exception:

ProtocolError                             Traceback (most recent call last)
File ~/.local/lib/python3.13/site-packages/requests/models.py:820, in Response.iter_content.<locals>.generate()
    819 try:
--> 820     yield from self.raw.stream(chunk_size, decode_content=True)
    821 except ProtocolError as e:

File ~/.local/lib/python3.13/site-packages/urllib3/response.py:1066, in HTTPResponse.stream(self, amt, decode_content)
   1065 while not is_fp_closed(self._fp) or len(self._decoded_buffer) > 0:
-> 1066     data = self.read(amt=amt, decode_content=decode_content)
   1068     if data:

File ~/.local/lib/python3.13/site-packages/urllib3/response.py:955, in HTTPResponse.read(self, amt, decode_content, cache_content)
    953         return self._decoded_buffer.get(amt)
--> 955 data = self._raw_read(amt)
    957 flush_decoder = amt is None or (amt != 0 and not data)

File ~/.local/lib/python3.13/site-packages/urllib3/response.py:878, in HTTPResponse._raw_read(self, amt, read1)
    876 fp_closed = getattr(self._fp, "closed", False)
--> 878 with self._error_catcher():
    879     data = self._fp_read(amt, read1=read1) if not fp_closed else b""

File ~/.conda/envs/nsrdb/lib/python3.13/contextlib.py:162, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    161 try:
--> 162     self.gen.throw(value)
    163 except StopIteration as exc:
    164     # Suppress StopIteration *unless* it's the same exception that
    165     # was passed to throw().  This prevents a StopIteration
    166     # raised inside the "with" statement from being suppressed.

File ~/.local/lib/python3.13/site-packages/urllib3/response.py:778, in HTTPResponse._error_catcher(self)
    777         arg = f"Connection broken: {e!r}"
--> 778     raise ProtocolError(arg, e) from e
    780 except (HTTPException, OSError) as e:

ProtocolError: ('Connection broken: IncompleteRead(0 bytes read, 403680 more expected)', IncompleteRead(0 bytes read, 403680 more expected))

During handling of the above exception, another exception occurred:

ChunkedEncodingError                      Traceback (most recent call last)
File ~/.local/lib/python3.13/site-packages/h5pyd/_hl/dataset.py:1205, in Dataset.__getitem__(self, args, new_dtype)
   1204 try:
-> 1205     rsp = self.GET(req, params=params, format="binary")
   1206 except IOError as ioe:

File ~/.local/lib/python3.13/site-packages/h5pyd/_hl/base.py:986, in HLObject.GET(self, req, params, use_cache, format)
    985 downloaded_bytes = 0
--> 986 for http_chunk in rsp.iter_content(chunk_size=HTTP_CHUNK_SIZE):
    987     if http_chunk:  # filter out keep alive chunks

File ~/.local/lib/python3.13/site-packages/requests/models.py:822, in Response.iter_content.<locals>.generate()
    821 except ProtocolError as e:
--> 822     raise ChunkedEncodingError(e)
    823 except DecodeError as e:

ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 403680 more expected)', IncompleteRead(0 bytes read, 403680 more expected))

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
File <timed exec>:1

File ~/.local/lib/python3.13/site-packages/h5pyd/_hl/dataset.py:1214, in Dataset.__getitem__(self, args, new_dtype)
   1212         break
   1213     else:
-> 1214         raise IOError(f"Error retrieving data: {ioe.errno}")
   1215 if isinstance(rsp, str):
   1216     # hexencoded response?
   1217     # this is returned by API Gateway for lamba responses
   1218     rsp = bytes.fromhex(rsp)

OSError: Error retrieving data: None

If I change the stride to every 1000th point instead of every 10th, the download succeeds. So it seems like I'm hitting some limit on request size or duration.

The exact line that is failing is:

%time data = dset[timestep, ::10]   # extract every 10th location at a particular time

Are you aware of any limits for this, or is there a way that I can extend the limits?
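
For what it's worth, the 403680 bytes in the IncompleteRead message looks consistent with the size of the strided slice, if I assume the dataset is stored as 2-byte scaled integers and the NSRDB grid has roughly 2,018,392 sites (both assumptions on my part):

n_sites = 2_018_392                       # assumed NSRDB site count
n_points = len(range(0, n_sites, 10))     # every 10th location -> 201,840
payload = n_points * 2                    # 2-byte ints -> 403,680 bytes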

Thanks a lot for any help you can provide.

@rolson2
Collaborator

rolson2 commented Jan 28, 2025

Thanks for pointing this out. I'll see if I can find a solution. In the meantime, I have run this notebook successfully using GitHub Codespaces (instructions are in the README).

@hawbecker
Author

I was going to run in Codespaces, but I need the downloaded data stored locally and didn't think I could do that through Codespaces. Is that not the case?

@rolson2
Collaborator

rolson2 commented Jan 28, 2025

@hawbecker, I just tested running notebook 03_NSRDB_introduction using the Setup a Local HSDS Server instructions.

The notebook ran okay for me. I did run into a very similar error to the one you're getting when I ran the notebook via the NREL developer API, which is due to rate limiting. I'm not sure why you're getting this error while running via a local HSDS server.

I believe there are methods to retrieve output files from Codespaces, though I've not attempted this yet. I will see if I can figure it out and add instructions to the README.

@hawbecker
Author

Ok, great - thanks a lot. I look forward to any workaround that helps me get some NSRDB data. I find this local HSDS server route very easy to use, so a fix on that end would be really nice. I understand if that's a bigger ask, though.

@rolson2
Collaborator

rolson2 commented Jan 29, 2025

@hawbecker, downloading the output files from Codespaces might not be the best option. It might be better to try to resolve the issue you're getting with the local HSDS server setup. Would you share the output you get from step 7 in the setup instructions (hsinfo)?

@hawbecker
Author

No problem! hsinfo output:

local ➜ hsinfo
server name: NREL prod HSDS server
server state: READY
endpoint: https://developer.nrel.gov/api/hsds
username: api_gateway
password: *****
server version: 0.8.4
node count: 8
up: 148 days, 7 hours 6 min 56 sec
h5pyd version: 0.21.0

@rolson2
Collaborator

rolson2 commented Jan 29, 2025

It looks like you have things set up to use the NREL developer API rather than a local HSDS server. The developer API is rate limited, which is probably why you are getting that error. If you have things set up for the local server, the output of hsinfo should look more like this (versions will likely differ)...

server name: Highly Scalable Data Service (HSDS)
server state: READY
endpoint: http://localhost:5101
username: anonymous
password:
server version: 0.8.4
node count: 4
up: 53 sec
h5pyd version: 0.18.0

Try double-checking the steps in these instructions.
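
For reference, after following those steps your ~/.hscfg should point at the local endpoint. Roughly what hsconfigure writes for a local setup (the exact values below are illustrative, not necessarily your config):

hs_endpoint = http://localhost:5101
hs_username = anonymous
hs_password = None
hs_api_key = None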

@hawbecker
Author

Ahhh, great catch - thank you! I believe I went through those steps and then overwrote the .hscfg file with the developer API lines. I'm now seeing the correct output:

local ➜ hsinfo
server name: Highly Scalable Data Service (HSDS)
server state: READY
endpoint: http://localhost:5101
username: anonymous
password:
server version: 0.9.2
node count: 4
up: 1 min 15 sec
h5pyd version: 0.21.0

Trying the download again as written in the tutorial, I now get a more explicit "timeout" error. It happens after 3 minutes.

The command:
%time data = dset[timestep, ::10] # extract every 10th location at a particular time

The error:

---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/urllib3/response.py:444, in HTTPResponse._error_catcher(self)
    443 try:
--> 444     yield
    446 except SocketTimeout:
    447     # FIXME: Ideally we'd like to include the url in the ReadTimeoutError but
    448     # there is yet no clean way to get at it from this context.

File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/urllib3/response.py:567, in HTTPResponse.read(self, amt, decode_content, cache_content)
    566 with self._error_catcher():
--> 567     data = self._fp_read(amt) if not fp_closed else b""
    568     if amt is None:

File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/urllib3/response.py:533, in HTTPResponse._fp_read(self, amt)
    531 else:
    532     # StringIO doesn't like amt=None
--> 533     return self._fp.read(amt) if amt is not None else self._fp.read()

File ~/.conda/envs/nsrdb/lib/python3.13/http/client.py:479, in HTTPResponse.read(self, amt)
    478     amt = self.length
--> 479 s = self.fp.read(amt)
    480 if not s and amt:
    481     # Ideally, we would raise IncompleteRead if the content-length
    482     # wasn't satisfied, but it might break compatibility.

File ~/.conda/envs/nsrdb/lib/python3.13/socket.py:719, in SocketIO.readinto(self, b)
    718 try:
--> 719     return self._sock.recv_into(b)
    720 except timeout:

TimeoutError: timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError                          Traceback (most recent call last)
File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/requests/models.py:816, in Response.iter_content.<locals>.generate()
    815 try:
--> 816     yield from self.raw.stream(chunk_size, decode_content=True)
    817 except ProtocolError as e:

File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/urllib3/response.py:628, in HTTPResponse.stream(self, amt, decode_content)
    627 while not is_fp_closed(self._fp):
--> 628     data = self.read(amt=amt, decode_content=decode_content)
    630     if data:

File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/urllib3/response.py:566, in HTTPResponse.read(self, amt, decode_content, cache_content)
    564 fp_closed = getattr(self._fp, "closed", False)
--> 566 with self._error_catcher():
    567     data = self._fp_read(amt) if not fp_closed else b""

File ~/.conda/envs/nsrdb/lib/python3.13/contextlib.py:162, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    161 try:
--> 162     self.gen.throw(value)
    163 except StopIteration as exc:
    164     # Suppress StopIteration *unless* it's the same exception that
    165     # was passed to throw().  This prevents a StopIteration
    166     # raised inside the "with" statement from being suppressed.

File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/urllib3/response.py:449, in HTTPResponse._error_catcher(self)
    446 except SocketTimeout:
    447     # FIXME: Ideally we'd like to include the url in the ReadTimeoutError but
    448     # there is yet no clean way to get at it from this context.
--> 449     raise ReadTimeoutError(self._pool, None, "Read timed out.")
    451 except BaseSSLError as e:
    452     # FIXME: Is there a better way to differentiate between SSLErrors?

ReadTimeoutError: HTTPConnectionPool(host='localhost', port=5101): Read timed out.

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
File ~/.local/lib/python3.13/site-packages/h5pyd/_hl/dataset.py:1205, in Dataset.__getitem__(self, args, new_dtype)
   1204 try:
-> 1205     rsp = self.GET(req, params=params, format="binary")
   1206 except IOError as ioe:

File ~/.local/lib/python3.13/site-packages/h5pyd/_hl/base.py:986, in HLObject.GET(self, req, params, use_cache, format)
    985 downloaded_bytes = 0
--> 986 for http_chunk in rsp.iter_content(chunk_size=HTTP_CHUNK_SIZE):
    987     if http_chunk:  # filter out keep alive chunks

File ~/.conda/envs/nsrdb/lib/python3.13/site-packages/requests/models.py:822, in Response.iter_content.<locals>.generate()
    821 except ReadTimeoutError as e:
--> 822     raise ConnectionError(e)
    823 except SSLError as e:

ConnectionError: HTTPConnectionPool(host='localhost', port=5101): Read timed out.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
File <timed exec>:1

File ~/.local/lib/python3.13/site-packages/h5pyd/_hl/dataset.py:1214, in Dataset.__getitem__(self, args, new_dtype)
   1212         break
   1213     else:
-> 1214         raise IOError(f"Error retrieving data: {ioe.errno}")
   1215 if isinstance(rsp, str):
   1216     # hexencoded response?
   1217     # this is returned by API Gateway for lamba responses
   1218     rsp = bytes.fromhex(rsp)

OSError: Error retrieving data: None

When I tried again downloading every 1000th cell, it also timed out, which was not the case before when using the dev API. Downloading every 5000th cell worked.

The instructions for setting up the local server mention using rex to download the data. Should I switch to that?

@rolson2
Collaborator

rolson2 commented Feb 1, 2025

@hawbecker, I'm not sure what the issue is at this point, but I am also getting this. You can definitely try rex and see if that helps; notebook 08 gives an introduction to it. I'm going to reach out to some people to try to resolve this timeout issue. Thanks for pointing it out, and sorry if it's holding you up. I was able to get at that data by reading it in chunks and then merging them, but that took quite a long time, so it's not the best solution...

import numpy as np

chunk_size = 1000   # keep this a multiple of the stride so the pieces line up
data = []

# request the strided slice in smaller pieces so each HTTP request stays small
for i in range(0, dset.shape[1], chunk_size):
    chunk = dset[timestep, i:i + chunk_size:10]
    data.append(chunk)

data = np.concatenate(data)
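
If you do go the rex route, a minimal read of the same slice might look something like this sketch (the domain path below is a placeholder -- notebook 08 has the actual NSRDB file paths -- and I'm assuming rex's NSRDBX handler with its hsds flag):

from rex import NSRDBX

nsrdb_file = '/nrel/nsrdb/v3/nsrdb_2018.h5'   # placeholder domain path
timestep = 0                                  # time index, as in the notebook

# hsds=True tells rex to read through HSDS/h5pyd instead of a local file
with NSRDBX(nsrdb_file, hsds=True) as f:
    data = f['ghi', timestep, ::10]           # every 10th location at one time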

@rolson2
Collaborator

rolson2 commented Feb 5, 2025

@hawbecker, after talking to @jreadey about this issue, he suggested I retry downloading from Codespaces, and it does seem to be working just fine. Codespaces might be the best option for cases where you are accessing very large amounts of data (like in notebook 3). In your Codespace you can right-click a file and then select Download.

@jreadey
Collaborator

jreadey commented Feb 5, 2025

I tried out downloading from Codespaces as well and it was no problem. Here's a short sample of how to create an HDF5 file within Codespaces:

  import h5py

  data = dset[timestep, ::10]   # extract every 10th location at a particular time
  fout = h5py.File("nsrdb.h5", "w")
  fout.create_dataset("ghi", data=data)
  fout.close()

When the file shows up in the explorer side panel, you can right-click on it and select Download. This will copy it to your Downloads folder.

In general this will be faster than running HSDS locally, since the Codespaces environment runs in the cloud and has faster access to the S3 store (when you create the codespace, selecting "US West" as the region will help).
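
As a quick sanity check before you download, you can reopen the file with plain h5py and confirm the dataset was written:

  import h5py

  with h5py.File("nsrdb.h5", "r") as f:
      print(f["ghi"].shape, f["ghi"].dtype)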

@hawbecker
Author

Ok great - thanks a lot to you both for the help on this - I really appreciate it!

@rolson2 rolson2 closed this as completed Feb 5, 2025