…8632)

### Rationale for this change

This PR enhances Gandiva by supporting the registration of external C functions in its function registry, so that developers can author third-party functions with complex dependencies and expose them as C functions for use in Gandiva expressions. See GH-38589 for more details.

### What changes are included in this PR?

This PR primarily adds a new API to `FunctionRegistry` that developers can use to register external C functions:

```C++
arrow::Status Register(
    NativeFunction func, void* c_function_ptr,
    std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);
```

### Are these changes tested?

* The changes are covered by unit tests in this PR. The tests include several C functions written in C++ and confirm that such functions can be used by Gandiva after being registered through the new API above.
* Additionally, I wrote some Rust-based functions locally, integrated them into a C++ program through the new registration API, and verified that this approach works. That work is not included in this PR.

### Are there any user-facing changes?

Several new APIs are added to the `FunctionRegistry` class:

```C++
/// \brief register a C function into the function registry
/// @param func the registered function's metadata
/// @param c_function_ptr the function pointer to the
/// registered function's implementation
/// @param function_holder_maker this will be used as the function holder if the
/// function requires a function holder
arrow::Status Register(
    NativeFunction func, void* c_function_ptr,
    std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);

/// \brief get a list of C functions saved in the registry
const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;

const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;
```

* Closes: #38589

### Notes

* This PR is related to #38116, which adds the initial support for registering LLVM IR based external functions into Gandiva.

Authored-by: Yue Ni <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
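The registration API described in this commit message takes the implementation as a type-erased `void*` next to the function's metadata. The sketch below illustrates that idea only and does not use Gandiva. `ToyRegistry`, `CallUnary`, and `multiply_by_three` are hypothetical names introduced for illustration, and the real `Register` also takes a `NativeFunction` and an optional `FunctionHolderMaker`, which are omitted here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// A function exposed with C linkage, the kind of symbol the PR wants to
// make callable from an expression. The name is hypothetical.
extern "C" int64_t multiply_by_three(int64_t value) { return value * 3; }

// Hypothetical stand-in for a function registry: it stores a name together
// with a type-erased pointer to the implementation.
class ToyRegistry {
 public:
  // Returns false when the pointer is null, true when the entry is stored.
  bool Register(std::string name, void* c_function_ptr) {
    if (c_function_ptr == nullptr) {
      return false;
    }
    c_functions_.emplace_back(std::move(name), c_function_ptr);
    return true;
  }

  // Mirrors the shape of GetCFunctions(): a list of (metadata, pointer) pairs.
  const std::vector<std::pair<std::string, void*>>& GetCFunctions() const {
    return c_functions_;
  }

 private:
  std::vector<std::pair<std::string, void*>> c_functions_;
};

// The caller must know the real signature to cast the pointer back.
// In Gandiva that knowledge comes from the registered signature instead.
int64_t CallUnary(void* fn, int64_t arg) {
  assert(fn != nullptr);
  auto typed = reinterpret_cast<int64_t (*)(int64_t)>(fn);
  return typed(arg);
}
```

Because the pointer is type-erased, nothing checks that the registered metadata matches the real C signature. In this sketch that responsibility sits with whoever calls `CallUnary`.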
Showing 25 changed files with 550 additions and 121 deletions.
@@ -0,0 +1,79 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements.  See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership.  The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License.  You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License

#include <llvm/IR/Type.h>

#include "gandiva/engine.h"
#include "gandiva/exported_funcs.h"

namespace {
// calculate the number of arguments for a function signature
size_t GetNumArgs(const gandiva::FunctionSignature& sig,
                  const gandiva::NativeFunction& func) {
  auto num_args = 0;
  num_args += func.NeedsContext() ? 1 : 0;
  num_args += func.NeedsFunctionHolder() ? 1 : 0;
  for (auto const& arg : sig.param_types()) {
    num_args += arg->id() == arrow::Type::STRING ? 2 : 1;
  }
  num_args += sig.ret_type()->id() == arrow::Type::STRING ? 1 : 0;
  return num_args;
}

// map from a NativeFunction's signature to the corresponding LLVM signature
arrow::Result<std::pair<std::vector<llvm::Type*>, llvm::Type*>> MapToLLVMSignature(
    const gandiva::FunctionSignature& sig, const gandiva::NativeFunction& func,
    gandiva::LLVMTypes* types) {
  std::vector<llvm::Type*> arg_llvm_types;
  arg_llvm_types.reserve(GetNumArgs(sig, func));

  if (func.NeedsContext()) {
    arg_llvm_types.push_back(types->i64_type());
  }
  if (func.NeedsFunctionHolder()) {
    arg_llvm_types.push_back(types->i64_type());
  }
  for (auto const& arg : sig.param_types()) {
    arg_llvm_types.push_back(types->IRType(arg->id()));
    if (arg->id() == arrow::Type::STRING) {
      // string type needs an additional length argument
      arg_llvm_types.push_back(types->i32_type());
    }
  }
  if (sig.ret_type()->id() == arrow::Type::STRING) {
    // for string output, the last arg is the output length
    arg_llvm_types.push_back(types->i32_ptr_type());
  }
  auto ret_llvm_type = types->IRType(sig.ret_type()->id());
  return std::make_pair(std::move(arg_llvm_types), ret_llvm_type);
}
}  // namespace

namespace gandiva {
Status ExternalCFunctions::AddMappings(Engine* engine) const {
  auto const& c_funcs = function_registry_->GetCFunctions();
  auto const types = engine->types();
  for (auto& [func, func_ptr] : c_funcs) {
    for (auto const& sig : func.signatures()) {
      ARROW_ASSIGN_OR_RAISE(auto llvm_signature, MapToLLVMSignature(sig, func, types));
      auto& [args, ret_llvm_type] = llvm_signature;
      engine->AddGlobalMappingForFunc(func.pc_name(), ret_llvm_type, args, func_ptr);
    }
  }
  return Status::OK();
}
}  // namespace gandiva
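The argument layout built by `GetNumArgs` and `MapToLLVMSignature` can be easier to follow outside of LLVM. The sketch below mirrors the same ordering rules with plain strings: optional context, optional function holder, each parameter (a string parameter contributes a data pointer and a 32-bit length), and a trailing output-length pointer when the return type is a string. `ArgType`, `ToySignature`, and `FlattenToCParams` are hypothetical names, and the `i8*` label for string data is an assumption about what `IRType` returns for strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced set of value types for illustration.
enum class ArgType { kInt32, kInt64, kString };

// Hypothetical stand-in for a signature plus the two NativeFunction flags
// that the diff above consults.
struct ToySignature {
  std::vector<ArgType> param_types;
  ArgType ret_type;
  bool needs_context;
  bool needs_function_holder;
};

// Produce the flattened C-level parameter list in the same order as the
// diff above: context, holder, parameters, then the output length.
std::vector<std::string> FlattenToCParams(const ToySignature& sig) {
  std::vector<std::string> params;
  if (sig.needs_context) {
    params.push_back("i64 context");
  }
  if (sig.needs_function_holder) {
    params.push_back("i64 holder");
  }
  for (ArgType type : sig.param_types) {
    switch (type) {
      case ArgType::kInt32:
        params.push_back("i32");
        break;
      case ArgType::kInt64:
        params.push_back("i64");
        break;
      case ArgType::kString:
        // A string parameter expands to a data pointer plus its length.
        params.push_back("i8* data");
        params.push_back("i32 length");
        break;
    }
  }
  if (sig.ret_type == ArgType::kString) {
    // For a string result, the last parameter receives the output length.
    params.push_back("i32* out_length");
  }
  return params;
}
```

For example, a function taking `(string, int32)` and returning a string, with a context but no holder, flattens to five C-level parameters under these rules.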
@@ -0,0 +1,72 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements.  See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership.  The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License.  You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License.

#include "gandiva/function_holder_maker_registry.h"

#include <functional>

#include "arrow/util/string.h"
#include "gandiva/function_holder.h"
#include "gandiva/interval_holder.h"
#include "gandiva/random_generator_holder.h"
#include "gandiva/regex_functions_holder.h"
#include "gandiva/to_date_holder.h"

namespace gandiva {

using arrow::internal::AsciiToLower;

FunctionHolderMakerRegistry::FunctionHolderMakerRegistry()
    : function_holder_makers_(DefaultHolderMakers()) {}

arrow::Status FunctionHolderMakerRegistry::Register(const std::string& name,
                                                    FunctionHolderMaker holder_maker) {
  function_holder_makers_.emplace(AsciiToLower(name), std::move(holder_maker));
  return arrow::Status::OK();
}

template <typename HolderType>
static arrow::Result<FunctionHolderPtr> HolderMaker(const FunctionNode& node) {
  std::shared_ptr<HolderType> derived_instance;
  ARROW_RETURN_NOT_OK(HolderType::Make(node, &derived_instance));
  return derived_instance;
}

arrow::Result<FunctionHolderPtr> FunctionHolderMakerRegistry::Make(
    const std::string& name, const FunctionNode& node) {
  auto lowered_name = AsciiToLower(name);
  auto found = function_holder_makers_.find(lowered_name);
  if (found == function_holder_makers_.end()) {
    return Status::Invalid("function holder not registered for function " + name);
  }

  return found->second(node);
}

FunctionHolderMakerRegistry::MakerMap FunctionHolderMakerRegistry::DefaultHolderMakers() {
  static const MakerMap maker_map = {
      {"like", HolderMaker<LikeHolder>},
      {"to_date", HolderMaker<ToDateHolder>},
      {"random", HolderMaker<RandomGeneratorHolder>},
      {"rand", HolderMaker<RandomGeneratorHolder>},
      {"regexp_replace", HolderMaker<ReplaceHolder>},
      {"regexp_extract", HolderMaker<ExtractHolder>},
      {"castintervalday", HolderMaker<IntervalDaysHolder>},
      {"castintervalyear", HolderMaker<IntervalYearsHolder>}};
  return maker_map;
}
}  // namespace gandiva
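The registry above is a map from a lowercased function name to a factory that builds a holder. The sketch below reproduces that pattern with standard-library types only, as a rough illustration and not the Gandiva implementation. `ToyHolder`, `ToyHolderMaker`, `ToyHolderMakerRegistry`, and `ToLowerAscii` are hypothetical names, and a null pointer stands in for the `Status::Invalid` result returned by the real `Make`.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical holder: keeps whatever state a function needs per expression.
struct ToyHolder {
  std::string pattern;
};

// A maker builds a holder from some description of the call site.
using ToyHolderMaker = std::function<std::shared_ptr<ToyHolder>(const std::string&)>;

// Lowercase ASCII letters so that lookups are case-insensitive.
std::string ToLowerAscii(std::string text) {
  std::transform(text.begin(), text.end(), text.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return text;
}

class ToyHolderMakerRegistry {
 public:
  // emplace() keeps an existing entry, so the first registration wins.
  void Register(const std::string& name, ToyHolderMaker maker) {
    makers_.emplace(ToLowerAscii(name), std::move(maker));
  }

  // Returns nullptr when no maker is registered under the name.
  std::shared_ptr<ToyHolder> Make(const std::string& name,
                                  const std::string& arg) const {
    auto found = makers_.find(ToLowerAscii(name));
    if (found == makers_.end()) {
      return nullptr;
    }
    return found->second(arg);
  }

 private:
  std::unordered_map<std::string, ToyHolderMaker> makers_;
};
```

One design point worth noting: because `emplace` does not overwrite, registering a second maker under an existing name is silently ignored in this sketch. Assuming `MakerMap` is a standard map type, the `Register` method in the diff would behave the same way while still returning `Status::OK()`.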