Use clang's native modules for header generation #448

Closed
madsmtm opened this issue May 22, 2023 · 4 comments
Labels
A-framework Affects the framework crates and the translator for them enhancement New feature or request help wanted Extra attention is needed

Comments

@madsmtm
Owner

madsmtm commented May 22, 2023

clang has functionality for loading modules, which is in part how Swift gets their nice import Foundation.NSString.

We should use that in header-translator too, especially for feature-gating things (instead of feature-gating based on class name).

It can be enabled using something like:

-fmodules -fapinotes-modules -fmodules-cache-path=./target

Though after doing that, I'm having trouble with getting libclang to give me a cursor to each module, so that I can check what's in it!

Other possibly interesting options: -Xclang -fmodule-format=raw, -fmodule-feature=objc.

@madsmtm madsmtm added enhancement New feature or request help wanted Extra attention is needed A-framework Affects the framework crates and the translator for them labels May 22, 2023
@madsmtm
Owner Author

madsmtm commented May 22, 2023

I think this is going to be yet another case of "libclang does not expose enough", and the path forwards is probably #345

@madsmtm
Owner Author

madsmtm commented May 26, 2023

A huge gain from this would be in compile-times: We could feature-flag things depending on module name instead of the current, clunky class name!

Though reconsidering, we can maybe already do this? I know that the module.modulemap file can change this, and we should handle that in the future, but most frameworks just have the behaviour "export each separate header as a submodule".

So maybe we can just use filenames in the first iteration?
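As a rough illustration of the filename-based first iteration, the sketch below derives a feature name such as Foundation_NSArray from a header path. It assumes the default "each header is a submodule" layout and a `Name.framework/Headers/Header.h` directory structure; the function name `feature_for_header` is hypothetical and not part of header-translator, and a custom module.modulemap would not be handled.

```rust
use std::path::Path;

/// Derive a feature name like `Foundation_NSArray` from a header path.
/// Assumption: every header is exported as its own submodule, so the
/// submodule name equals the header's file stem.
fn feature_for_header(header: &Path) -> Option<String> {
    // Only consider `.h` files; module maps and other files are skipped.
    if header.extension()?.to_str()? != "h" {
        return None;
    }
    let submodule = header.file_stem()?.to_str()?;

    // The framework name is taken from the nearest ancestor directory
    // whose name ends in `.framework`.
    let framework = header
        .ancestors()
        .find_map(|dir| dir.file_name()?.to_str()?.strip_suffix(".framework"))?;

    Some(format!("{framework}_{submodule}"))
}
```

Headers outside a framework directory yield `None` here, which a real implementation would likely need to treat differently.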

Concretely, statements would require the following cfg attributes:

  • Internal module imports (NSArray.rs) would be cfg-gated behind Foundation_NSArray.
    • Actually, maybe we should consider making these public? They are in Swift (import Foundation.NSArray), and it would give us a nice place to put the iterator adapters that arrays need.
  • Classes would need cfg based on their superclasses' modules. So using NSWindow would require AppKit_NSWindow and AppKit_NSResponder, unless we figure out a scheme for transitively enabling features (easy with class hierarchies, not so much for modules).
    • But perhaps that's actually desirable? Transitively enabling features can be confusing, especially if at some point you want to cut down on features, but disabling AppKit_NSWindow suddenly also breaks some other, totally unrelated part of the codebase that just uses NSResponder.
  • Protocols would need cfgs for their super protocols, though likely less of an issue than with classes.
  • Protocol implementations would require cfg from both the class and the protocol.
  • Categories would need the class' cfg
  • Enumerations would basically never need cfgs
    • Enumeration cases may need it though, if they're using constants/enumeration cases from another file.
    • Similarly for free-standing constants
  • Typed enumerations would need to be cfg-gated depending on their inner type.
  • Statics would need to be cfg-gated depending on their type, and possibly their value, if they have one (similar to enumeration cases).
  • Structs would need cfg-gating depending on their fields.
  • Methods and functions would need to be cfg-gated based on their arguments.
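To make the rules above a bit more concrete, here is a minimal sketch of how the cfg attribute for a single declaration might be rendered from the types it mentions (arguments, fields, superclass, and so on). `TypeRef`, `feature` and `cfg_attribute` are hypothetical names for illustration, not header-translator's actual data model, and the sketch assumes the declaration's own module is already gated at the file level.

```rust
use std::collections::BTreeSet;

/// Hypothetical reference to a type: the framework and the header
/// (submodule) that declares it.
struct TypeRef {
    framework: &'static str,
    header: &'static str,
}

/// Feature name for the module that declares `ty`, e.g. `AppKit_NSWindow`.
fn feature(ty: &TypeRef) -> String {
    format!("{}_{}", ty.framework, ty.header)
}

/// Render the `#[cfg(...)]` attribute for a declaration living in `own`
/// that mentions the `used` types. Returns `None` if no extra gate is needed.
fn cfg_attribute(own: &TypeRef, used: &[TypeRef]) -> Option<String> {
    let own_feature = feature(own);
    // Deduplicate and sort; skip the declaration's own module, which is
    // assumed to be gated where the module file is imported.
    let required: BTreeSet<String> = used
        .iter()
        .map(feature)
        .filter(|f| *f != own_feature)
        .collect();

    let mut gates = required.iter().map(|f| format!("feature = \"{f}\""));
    match required.len() {
        0 => None,
        1 => Some(format!("#[cfg({})]", gates.next().unwrap())),
        _ => Some(format!("#[cfg(all({}))]", gates.collect::<Vec<_>>().join(", "))),
    }
}
```

For a method declared in AppKit's NSWindow header that takes an NSResponder and an NSString, this would produce a gate on AppKit_NSResponder and Foundation_NSString.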

Many of these are similar to what we've already implemented, although I suspect that if we drop the "classes have their superclasses transitively enabled", it will be harder to gather the list of required features elsewhere (and it will be more confusing in docs). Though again, this will already be a problem with all the other types, so solving it for classes will not be enough.
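If superclasses are no longer transitively enabled, the full list of features a class needs has to be gathered explicitly, for example to show it in the docs. A sketch of that lookup, assuming a hypothetical `Class` record and a map from class name to record (the feature assignments in any example data are illustrative, not taken from the real frameworks):

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical record for a class: the feature of the module that
/// declares it, and the name of its superclass (if any).
struct Class {
    feature: &'static str,
    superclass: Option<&'static str>,
}

/// Collect the features required to use `name`: its own module plus the
/// modules of every class in its superclass chain.
fn class_features(classes: &HashMap<&str, Class>, name: &str) -> BTreeSet<String> {
    let mut features = BTreeSet::new();
    let mut current = Some(name);
    // Walk up the hierarchy until the root class or an unknown class.
    while let Some(class_name) = current {
        let Some(class) = classes.get(class_name) else { break };
        features.insert(class.feature.to_string());
        current = class.superclass;
    }
    features
}
```

The same walk does not carry over directly to the other kinds of declarations, since their dependencies form a graph of types rather than a single chain.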

@silvanshade
Contributor

silvanshade commented May 26, 2023

clang has functionality for loading modules, which is in part how Swift gets their nice import Foundation.NSString.

We should use that in header-translator too, especially for feature-gating things (instead of feature-gating based on class name).

[...]

Though after doing that, I'm having trouble with getting libclang to give me a cursor to each module, so that I can check what's in it!

[...]

I think this is going to be yet another case of "libclang does not expose enough", and the path forwards is probably #345

@madsmtm This sounds like a good idea going forward and indeed seems related to #345.

Just to give an update on my last comment in that issue, I've been working on a new rewrite of the bindings I initially made for ClangImporter, now using a more efficient combination of autocxx and hand-written cxx bindings.

It's still in a private repo but I plan to replace the contents of silvanshade/framework-translator with it soon (though I will probably rename it to clang-importer-cxx).

In any case, the bindings are much more complete now, and I'm nearly at the point where it's possible to query modules and headers for declarations and then walk through the AST nodes.

Since the ClangImporter machinery seems heavily oriented around clang modules, I guess it makes sense that it ties into the approach you're suggesting.

The current status of the bindings is that it's possible to create an ASTContext and ClangImporter instance and work with most of the basic related LLVM and clang data structures and support machinery.

I'm currently working on adding enough remaining bindings from swift/include/swift/AST to be able to query the ASTContext and ClangImporter for useful information about modules and declarations. I think I should have something working for that by the end of next week judging by the current rate of progress.

I was planning on finishing that up and then cleaning up the build process stuff before making the new version public but since this may be relevant for your ideas here, I can try to just get the current repo updated with what I've got sooner and then just worry about finishing up the other stuff later.

Just to give an example of how using ClangImporter via those bindings will look, here is a unit test for ClangImporter (from the swift repo here) translated to Rust:

/// Create a temporary cache on disk and clean it up at the end.
#[test]
fn cache() -> BoxResult<()> {
    let temp = tempfile::tempdir()?;

    // Initialize default clang importer options.
    moveit!(let mut clang_importer_options = unsafe { swift::ClangImporterOptions::new() });

    // Create a cache subdirectory for the modules and PCH.
    let cache = temp.path().join("cache");
    std::fs::create_dir(&cache)?;
    unsafe { clang_importer_options.as_mut().set_module_cache_path(&cache) };
    unsafe { clang_importer_options.as_mut().set_precompiled_header_output_dir(&cache) };

    // Create the includes.
    let include = temp.path().join("include");
    std::fs::create_dir(&include)?;

    let module_dot_modulemap = include.join("module.modulemap");
    let a_dot_h = include.join("A.h");
    let bridging_dot_h = include.join("bridging.h");

    unsafe { clang_importer_options.as_mut().modify_extra_args_push_back("-nosysteminc") };
    {
        let include = include.as_os_str().to_str().expect("path should be a valid UTF-8 string");
        unsafe { clang_importer_options.as_mut().modify_extra_args_push_back(&format!("-I{include}")) };
    }
    {
        std::fs::write(&module_dot_modulemap, indoc! {r#"
            module A {
                header "A.h"
            }
        "#})?;
        std::fs::write(&a_dot_h, indoc! {r#"
            int foo(void);
        "#})?;
        std::fs::write(&bridging_dot_h, indoc! {r#"
            #import <A.h>
        "#})?;
    }

    // Create a bridging header.
    unsafe { clang_importer_options.as_mut().set_bridging_header(&bridging_dot_h) };

    // Set up the importer.
    moveit!(let mut lang_options = unsafe { swift::LangOptions::new() });

    let (os_was_invalid, arch_was_invalid) = {
        let arch = llvm::Twine::from("x86_64");
        let vendor = llvm::Twine::from("apple");
        let os = llvm::Twine::from("darwin");
        moveit!(let target = unsafe { llvm::Triple::new(&arch, &vendor, &os) });
        unsafe { lang_options.as_mut().set_target(target) }
    };
    if os_was_invalid {
        return Err("invalid os".into());
    }
    if arch_was_invalid {
        return Err("invalid arch".into());
    }

    moveit!(let mut sil_options = unsafe { swift::SilOptions::new() });
    moveit!(let mut type_checker_options = unsafe { swift::TypeCheckerOptions::new() });

    unsafe { llvm::initialize_llvm() };

    moveit!(let mut search_path_options = unsafe { swift::SearchPathOptions::new() });
    moveit!(let mut symbol_graph_options = unsafe { swift::symbolgraphgen::SymbolGraphOptions::new() });
    moveit!(let mut source_manager = unsafe { swift::SourceManager::new() });
    moveit!(let mut diagnostic_engine = unsafe { swift::DiagnosticEngine::construct_from_source_manager(source_manager.as_mut()) });

    let mut ast_context = unsafe {
        swift::AstContext::get(
            lang_options.as_mut(),
            type_checker_options.as_mut(),
            sil_options.as_mut(),
            search_path_options.as_mut(),
            clang_importer_options.as_mut(),
            symbol_graph_options.as_mut(),
            diagnostic_engine.as_mut(),
            |_module_name, _is_overlay| true,
        )
    };

    let swift_pch_hash = None;
    let dependency_tracker = None;
    let mut clang_importer =
        unsafe { swift::ClangImporter::create(ast_context.pin_mut(), swift_pch_hash, dependency_tracker) };

    let pch = cache.join("bridging.h.pch");
    std::fs::File::create(&pch)?;

    // Emit a bridging PCH and check that we can read the PCH.
    assert!(!unsafe { clang_importer.pin_mut().can_read_pch(&pch) });
    assert!(!unsafe { clang_importer.pin_mut().emit_bridging_pch(&bridging_dot_h, &pch) });
    assert!(unsafe { clang_importer.pin_mut().can_read_pch(&pch) });

    // Overwrite the PCH with garbage.  We should still be able to read it from the in-memory cache.
    std::fs::write(&pch, "garbage")?;
    assert!(unsafe { clang_importer.pin_mut().can_read_pch(&pch) });

    temp.close()?;
    Ok(())
}

@madsmtm
Owner Author

madsmtm commented Jan 10, 2025

I managed to do this with the current header-translator in #678, the trick was to pass the path to the .modulemap directly to clang, and use the -Xclang -emit-module -fmodule-name=Foundation flags.
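For readers who want to try this, a minimal sketch of assembling such an argument list follows. Only the flags themselves come from the comment above; the helper name `emit_module_args`, the argument order, and the idea of passing the list to libclang as-is are assumptions, and #678 may well pass additional flags that are not shown here.

```rust
use std::path::Path;

/// Build a clang argument list that names the module to emit and passes
/// the module map as the input file. Hypothetical helper; the exact
/// invocation used in #678 may differ.
fn emit_module_args(modulemap: &Path, module_name: &str) -> Vec<String> {
    vec![
        // Forward `-emit-module` to the clang frontend.
        "-Xclang".to_string(),
        "-emit-module".to_string(),
        // Select which module from the module map to build.
        format!("-fmodule-name={module_name}"),
        // The module map itself is the input, not a header.
        modulemap.display().to_string(),
    ]
}
```

The resulting vector could then be handed to whatever drives clang, whether that is a libclang translation-unit parse or a `clang` subprocess.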

@madsmtm madsmtm closed this as completed Jan 10, 2025