Support creating operator from URL #5407

Closed
Xuanwo opened this issue Dec 11, 2024 Discussed in #1494 · 4 comments · Fixed by #5444

Comments

@Xuanwo
Member

Xuanwo commented Dec 11, 2024

Discussed in #1494

Originally posted by frostming March 7, 2023

Proposal

I propose an alternative way to create an OpenDAL Operator: from a resource URI (or URL, either is fine) string.

I am new to OpenDAL, so feel free to close this if it already exists. That would be great!

Examples:

let op = Operator::from_uri("file:///tmp")?.finish();

This does the same as

let mut builder = Fs::default();
builder.root("/tmp");
let op: Operator = Operator::create(builder)?.finish();

Another example, HDFS:

let op = Operator::from_uri("hdfs://127.0.0.1:9000/tmp")?.finish();

Advantages

  • This saves a few lines of initialization code.
  • This also makes configuration easier: only one config line is needed.
  • Language bindings can choose not to expose Builder structs at all. For a minimal interface, an Operator struct is sufficient, and service backends can be set up with a primitive string. This will significantly reduce the effort of supporting a new language.

Possible Solutions

We already have a unique Scheme for each Builder, so we can give each one a unique URL prefix. The URI will be encoded as follows:

$scheme://$positional_arg1/$positional_arg2?arg1=value1&arg2=value2

Here, positional arguments are the required parameters for the specific service, such as root for Fs. Specifically, the URI will be parsed with Url::parse, and the query_pairs will be fed to the ::from_iter() method of the corresponding Builder to get a builder object. Thanks to the good API design of OpenDAL, the existing APIs are ready for this.
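To make the encoding above concrete, here is a minimal, std-only sketch of the parsing step. It is not OpenDAL's API and does not use the url crate: ParsedUri, parse_uri, and percent_decode are hypothetical names, and treating everything between :// and ? as one positional string is an assumption made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical parsed form of `$scheme://$positional?key=value&...`.
/// These names are illustrative only, not part of OpenDAL.
#[derive(Debug, PartialEq)]
pub struct ParsedUri {
    pub scheme: String,
    /// Everything between `://` and `?`, percent-decoded (an assumption).
    pub positional: String,
    /// Query pairs, percent-decoded; these would be fed to a builder.
    pub options: HashMap<String, String>,
}

/// Decode `%XX` escapes. Returns None on a malformed escape or invalid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // `get` returns None if the escape is truncated.
            let hex = input.get(i + 1..i + 3)?;
            if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// Split a URI into scheme, positional part, and decoded query options.
pub fn parse_uri(uri: &str) -> Option<ParsedUri> {
    let (scheme, rest) = uri.split_once("://")?;
    if scheme.is_empty() {
        return None;
    }
    let (path, query) = rest.split_once('?').unwrap_or((rest, ""));
    let mut options = HashMap::new();
    for pair in query.split('&').filter(|p| !p.is_empty()) {
        // A key without `=` is treated as having an empty value.
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        options.insert(percent_decode(key)?, percent_decode(value)?);
    }
    Some(ParsedUri {
        scheme: scheme.to_string(),
        positional: percent_decode(path)?,
        options,
    })
}
```

With this sketch, parse_uri("file:///tmp") yields the scheme file and the positional part /tmp, which a real implementation could then map to the root of the Fs builder.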

Be aware that because all values are URL components, they must be percent-encoded.
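As an illustration of the percent-encoding requirement, the following std-only sketch encodes every byte outside the RFC 3986 unreserved set. The helper names percent_encode and build_query are hypothetical, and a real implementation would more likely rely on an existing URL library than on hand-rolled code like this.

```rust
/// Percent-encode every byte except RFC 3986 unreserved characters
/// (ASCII letters, digits, `-`, `.`, `_`, `~`). Hypothetical helper.
pub fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &b in input.as_bytes() {
        if b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_' | b'~') {
            out.push(b as char);
        } else {
            // Non-ASCII characters are encoded byte by byte from their UTF-8 form.
            out.push_str(&format!("%{:02X}", b));
        }
    }
    out
}

/// Join key/value pairs into a query string, encoding both sides.
pub fn build_query(pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .map(|(k, v)| format!("{}={}", percent_encode(k), percent_encode(v)))
        .collect::<Vec<_>>()
        .join("&")
}
```

Without encoding, a value such as a&b=c would be misread as two separate query pairs, which is why each key and value is encoded before being joined.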

Some concerns

For services that themselves take a URL as a parameter, such as IPFS, the resource URI will be a bit different:

ipfs://...... # for http endpoint
ipfss://...... # for https endpoint

There can also be multiple schemes that map to the same builder. For example, both http:// and https:// map to the HTTP builder.

This might cause some confusion, so I will leave it to the maintainers to make the choice of how this feature will go.
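One possible way to handle several schemes pointing at the same builder is a small lookup that returns both the target service and whether the endpoint should use TLS. This is only a sketch under stated assumptions: Service, Resolved, and resolve_scheme are hypothetical names, and the scheme table simply mirrors the examples in this proposal rather than any decision by the maintainers.

```rust
/// Hypothetical stand-in for the per-builder scheme; not OpenDAL's type.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Service {
    Fs,
    Hdfs,
    Http,
    Ipfs,
}

/// Result of resolving a URI scheme to a service.
#[derive(Debug, PartialEq)]
pub struct Resolved {
    pub service: Service,
    /// Some(true) for an https endpoint, Some(false) for http,
    /// None when the service has no HTTP endpoint to choose.
    pub tls: Option<bool>,
}

/// Map a URI scheme (case-insensitive) to a service, if it is known.
pub fn resolve_scheme(scheme: &str) -> Option<Resolved> {
    let (service, tls) = match scheme.to_ascii_lowercase().as_str() {
        "file" => (Service::Fs, None),
        "hdfs" => (Service::Hdfs, None),
        // Two schemes share one builder; the scheme selects the endpoint protocol.
        "http" => (Service::Http, Some(false)),
        "https" => (Service::Http, Some(true)),
        "ipfs" => (Service::Ipfs, Some(false)),
        "ipfss" => (Service::Ipfs, Some(true)),
        _ => return None,
    };
    Some(Resolved { service, tls })
}
```

Keeping the many-to-one mapping in a single table makes the ambiguity explicit: adding or rejecting an alias such as ipfss is a one-line change.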

@jorgehermo9
Contributor

Hi, I'm interested in addressing this. Did you start working on this @Xuanwo ? apache/hudi-rs#131 (comment)

@Xuanwo
Member Author

Xuanwo commented Dec 23, 2024

> Hi, I'm interested in addressing this. Did you start working on this @Xuanwo ? apache/hudi-rs#131 (comment)

Hi, thank you very much for your interest. I'm currently working on a design for this.

@jorgehermo9
Contributor

hi @Xuanwo, this issue was closed by the RFC PR, is that intended? should we reopen that until the RFC is implemented?

@Xuanwo
Member Author

Xuanwo commented Dec 27, 2024

> hi @Xuanwo, this issue was closed by the RFC PR, is that intended? should we reopen that until the RFC is implemented?

Work will be tracked at #5444
