Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement seekable streams #418

Closed
cdmurph32 opened this issue Nov 22, 2024 · 8 comments
Closed

Implement seekable streams #418

cdmurph32 opened this issue Nov 22, 2024 · 8 comments

Comments

@cdmurph32
Copy link

Seekable streams would enable a great deal of functionality for my C2PA SDK work and likely many other use cases. I don't have much of an understanding of how this would be implemented, but I'd like to start a conversation.

The most obvious use case would be file handles. This would allow feature parity with web_sys::FileSystemFileHandle.
For example this JS code uses this wasm-bindgen rust code:

 async signFileSystemHandle(
    signer: AsyncSigner,
    format: string,
    source: FileSystemFileHandle,
    dest: FileSystemFileHandle,
  ): Promise<Uint8Array> {
    const {
      value: { signerPtr, manifestBytes },
      ptr,
    } = await Builder.pool.builder_signFileSystemHandle(
      Builder.workerConfig,
      this.getPtr(),
      signer.getPtr(),
      format,
      source,
      dest,
    );
    this.ptr = ptr;
    signer.setPtr(signerPtr);

    return manifestBytes;
  }

The wasm-bindgen rust code can use FIleSystemFileHandle as a stream using createWritable

    #[wasm_bindgen(js_name = signFileSystemHandle)]
    pub async fn sign_file_system_handle(
        mut self,
        signer_ptr: u32,
        format: &str,
        source_handle: &FileSystemFileHandle,
        dest_handle: &FileSystemFileHandle,
    ) -> Result<JsValue, JsSysError> {
        let signer = JavaScriptAsyncSigner::from_ptr(signer_ptr)?;
        let mut source = HandleStream::new(source_handle).await?;
        let mut dest = HandleStream::new(dest_handle).await?;
        let manifest_bytes = self
            .builder
            .sign_async(&signer, format, &mut source, &mut dest)
            .await
            .unwrap();

Besides file handles, seekable streams would be very useful for other use cases. C2PA manifests must be located in very large asset containers, which must be parsed to understand their structure. This requires the streams to be seekable. As a workaround, my POC of a C2PA component reads the input stream into a cursor before passing it to the c2pa SDK functions. This is obviously far from ideal and will not work with very large containers streams such as video.

fn add_seek_to_read<R: Read>(mut reader: R) -> std::io::Result<Cursor<Vec<u8>>> {
    let mut buffer = Vec::new();
    reader.read_to_end(&mut buffer)?;
    Ok(Cursor::new(buffer))
}
@dicej
Copy link
Collaborator

dicej commented Nov 22, 2024

@lukewagner have you thought at all about how seeking would work for stream<T>?

@lukewagner
Copy link
Member

In general, I've been assuming that seeking happens via methods on resources that return new streams, e.g.:

resource file {
  pread-stream: func(offset: u64) -> stream<u8>;
}

which reflects the fact that not all stream sources are seekable.

However, one operation that does make sense on a stream (and almost made it into wasi-io/streams, iirc) which I think makes sense as a canon built-in would be a stream.skip that in effect lets you "seek" forward N elements as an optimization over simply reading and discarding N elements. I think this make sense for all streams.

@sunfishcode
Copy link
Member

(For what it's worth, wasi-io/streams does have skip for seeking forward.)

@lukewagner
Copy link
Member

Oops, right, I just missed it in my scan :P

@cdmurph32
Copy link
Author

For wasi-io/streams skip and not seek makes a lot of sense, but for wasi::filesystem specifically it would be very useful to have seek. It would allow us to use wasi for our NodeJS SDK. Does that sound like a reasonable feature?

@dicej
Copy link
Collaborator

dicej commented Nov 26, 2024

For wasi-io/streams skip and not seek makes a lot of sense, but for wasi::filesystem specifically it would be very useful to have seek. It would allow us to use wasi for our NodeJS SDK. Does that sound like a reasonable feature?

Would the pread-stream sketch Luke posted above be suitable? The idea is that you can't seek an existing stream, but you can open a new one at an arbitrary offset within the file. And you can always build an abstraction on top of that which makes it look like you're seeking with a single stream.

@cdmurph32
Copy link
Author

Yes. I think that sounds good. I wasn't aware that we already had what I was basically looking for with descriptors.
Should we leave this open until filesystem 0.3.0, or just close it?

@sunfishcode
Copy link
Member

I think we can close this, as wasi-filesystem 0.2.0 already uses this pattern of returning new streams at offsets rather than having a seek on input-stream or output-stream , so it's now what wasi-filesystem 0.3.0 is proposed to do as well, so I don't think there's anything else needed from the component-model repo side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants