Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(drivers): webdav driver #276

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

shareable-vision
Copy link

πŸ”— Linked issue

Issue #275

❓ Type of change

  • πŸ“– Documentation (updates to the documentation, readme, or JSdoc annotations)
  • 🐞 Bug fix (a non-breaking change that fixes an issue)
  • πŸ‘Œ Enhancement (improving an existing functionality like performance)
  • ✨ New feature (a non-breaking change that adds functionality)
  • 🧹 Chore (updates to the build process or auxiliary tools and libraries)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)

πŸ“š Description

Webdav driver for unstorage powered by the webdav NPM package.

πŸ“ Checklist

  • I have linked an issue or discussion.
  • I have updated the documentation accordingly.

Considerations

The driver currently only implements read-only functions.

To-do

  • Implement setItem and removeItem, leveraging putFileContents and deleteFile

@nuxt-studio
Copy link

nuxt-studio bot commented Jul 23, 2023

βœ… Live Preview ready!

Name Edit Preview Latest Commit
unstorage Edit on Studio β†—οΈŽ View Live Preview a01ef02

@shareable-vision
Copy link
Author

Driver (read-only) is tested and working with Nuxt Content.

yarrow.love added 2 commits July 24, 2023 10:27
Passing the 'directory' option implies that this source prefix should be stripped from the path.
@pi0
Copy link
Member

pi0 commented Jul 24, 2023

Hi dear @shareable-vision i love the idea of supporting webdav as a driver!

Checking the webdav npm package used, it seems little bit overcomplex. I think we probably can directly implement a webdav client only using native fetch API.

An (old) example to inspire: https://github.com/ai-zelenin/webdav-client-fetch/blob/master/ts/index.ts

Do you think you can try to reimplement it with fetch only and no dependencies? πŸ™πŸΌ

@shareable-vision
Copy link
Author

Do you think you can try to reimplement it with fetch only and no dependencies? πŸ™πŸΌ

That's a great idea. I will work on it.

@shareable-vision
Copy link
Author

shareable-vision commented Jul 25, 2023

WebDav server sends XML data. Although testing with NextCloud, I'm cross-referencing with other authorities and avoiding any namespaced properties, keeping it minimal and hopefully generally compatible.

Native fetch is great . .. but there is not a Node equivalent of browser-native DOMParser.

If we can use xml2js, then I'm almost done re-implementing read-only driver. Installed in an empty repo, xml2js clocks in at about 250kb. Or xpath.

Any idea for reliably parsing XML with no dependencies? β€”folks highly discourage parsing XML with regex. What do you think?

β€”I'm really glad you put me onto this!

@shareable-vision
Copy link
Author

shareable-vision commented Jul 25, 2023

After taking time to consider, I don't think we should use a manual regex approach for parsing the XML for the sake of having zero dependencies, while I really appreciate that you got me to thinking about the efficiency of the package that we're leveraging.

WebDav is a standardization of XML β€”the parsing of the XML responses is key to the function of the driver, and it is reasonable to require that a user install an additional peer dependency.

fast-xml-parser looks like it will be a good fit.

@pi0
Copy link
Member

pi0 commented Jul 25, 2023

Hi dear @shareable-vision. Sorry for answering late and thanks for willing to continue working on this.

Nice point about XML parser lacking. Looking at webdav dependencies, they use fast-xml-parser i would recommend to try it.

Usng bundlephobia.com, fast-xml-parser is 25KB vs JSDom 2.4 MB.

We might find something even smaller but i think that's a good start.

@shareable-vision
Copy link
Author

shareable-vision commented Jul 27, 2023

Just pushed re-implementation of read-only version of the Webdav driver using 'fast-xml-parser'.

I will work on implementing setItem() and removeItem() next.

Will have to study Webdav resource locking.

@shareable-vision
Copy link
Author

shareable-vision commented Jul 27, 2023

I want to work on this a little longer and make some improvements. One very major improvement that I see now will be to use the Depth header β€”currently implementation is recursively re-fetching subdirectories and awaiting a Promise.all. I tried the header with a faulty value ('Infinity') β€”now I see that a value of 'infinity' will eliminate the re-fetching and speed up getKeys() β€”specs say that server should support "infinity requests" β€”could implement re-fetching as a fall-back mechanism, but it's probably not necessary.

I'm also mulling over the advantages of polling. Will study the other drivers more about this . .
I think where I'm landing is that it would be good to provide optional caching and polling, which are each disabled with a default value of 0.

WebDAV uses 'etag' to version files. When re-syncing does need to happen, the etag can be referenced to quickly validate the cache.

@pi0 pi0 marked this pull request as draft July 27, 2023 15:35
@pi0
Copy link
Member

pi0 commented Jul 27, 2023

Thanks for updates and nice work on this @shareable-vision

I have marked PR as draft in meantime please ping when felt ready

I will also check locally in coming days and try to help if fine for you πŸ‘πŸΌ

yarrow.love added 2 commits July 27, 2023 18:07
…rectories.

Implement optional 'infinityDepthHeaderUnavailable' to use fallback mechanism.
@shareable-vision
Copy link
Author

shareable-vision commented Jul 27, 2023

Perfect πŸ‘

Did implement Depth: "infinity" header as default with config option to fallback if unsupported β€”much better.

Will do write functions as best as I can without getting into the weeds about locked resources.

Will touch base soon.

@shareable-vision
Copy link
Author

shareable-vision commented Jul 30, 2023

Almost ready for review. Open to feedback.

setItem() and removeItem() are implemented, honoring all WebDAV specs requirements.

setItem() will ensure that the target parent directory exists by making MKCOL requests on each non-existent ancestor directory. If 'type' key on opts is set to "directory", setItem() will make a directory on WebDAV storage instead of uploading a file.

WebDAV defines etag as a versioning mechanism that can be used for cache validation. When getKeys() updates meta for an item, if the etag changes, then the item.body is deleted, thus is no longer "cached".

Concerns:

Can optimize PUT request?: I'm not sure if the PUT requests may be optimized by using a ReadableStream. Probably because they are already utf8-encoded, this won't make any difference . .

What should be the desired behavior of the ttl key and caching? In my mind, the most optimal solution for WebDAV is for the consumer of the driver to receive a Webhook from the WebDAV service and then intentionally re-fetch keys/meta by calling getKeys(). Unstorage itself is the "cache". getItem() and getMeta() should immediately return available data, fetching only as needed. There should be no reason to repeatedly call getKeys() unless the consumer is expecting the values to have changed, or else are unsure and are polling the source.

My thinking previously was that an undefined ttl might signify ("Do not cache"), however another reasonable interpretation would be (TTL of zero or undefined => Always re-fetch), which would be the opposite of my intention.

Polling? I haven't implemented any polling (which might be configurable via an 'interval' key). I'm not sure if it's a good idea. It seems like the driver consumer can do that if they wish simply by re-calling getKeys() as needed. If polling were implemented, the consumer would have to provide a callback function to be called when source changes. Actually configuring an interval, or manually polling may be more efficient than using TTL β€”when polling, the fetching is happening in the background, not in the middle of a user's request.

yarrow.love added 2 commits July 29, 2023 23:54
 * Consolidate redundant assignment of latest.atime
 * Avoid bug of early return on failing to make (already existent) empty directory, which will occur because Object.keys(files) does not provide keys/meta for empty directories.
 * Use delete keyword on files[key].body when etag changes.
 * undefined 'ttl' disables cache expiration.
 * `getKeys()` always refetches meta from source; uses 'etag' to validate.

Update docs:
 * setItem/removeItem functions are available.
@shareable-vision shareable-vision marked this pull request as ready for review July 30, 2023 16:36
@shareable-vision
Copy link
Author

Driver is ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants