Convert trivial copy constructor and assignment operator calls to memcpy #1202

Open
smeenai opened this issue Dec 3, 2024 · 8 comments
Labels
good first issue Good for newcomers optimization Optimizations that can be implemented at the ClangIR level

Comments

@smeenai
Collaborator

smeenai commented Dec 3, 2024

CodeGen replaces trivial copy constructors and assignment operators with memcpy. ClangIR intentionally doesn't do so when generating CIR, so that all those function calls are available for analysis: https://godbolt.org/z/xcKdzKaWM. We'd ideally switch to the memcpy at some point before generating LLVM IR, though, to generate better code. One potential idea would be to tag either the functions themselves or the call sites during CIR generation, and then have a later pass (e.g. LoweringPrepare) do the memcpy conversion. Search for isMemcpyEquivalentSpecialMember under clang/lib/CodeGen for examples of how it handles this.
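As a rough illustration of why the two forms are interchangeable, here is a minimal sketch in plain C++ (no ClangIR or Clang APIs involved). The type `Point` and the helpers `copyViaConstructor` and `copyViaMemcpy` are hypothetical names made up for this example; the point is only that for a type with trivial copy operations, a bytewise copy produces the same object as calling the copy constructor.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical example type: its implicitly defined copy constructor and
// copy assignment operator are both trivial.
struct Point {
  int x;
  int y;
};

static_assert(std::is_trivially_copy_constructible<Point>::value,
              "Point's copy constructor is trivial");
static_assert(std::is_trivially_copy_assignable<Point>::value,
              "Point's copy assignment operator is trivial");

// Copies through the (trivial) copy constructor. This is the form that stays
// visible as a function call when the call is not folded away.
Point copyViaConstructor(const Point &src) { return Point(src); }

// Copies the object representation directly; for a trivially copyable type
// this yields the same value as the copy constructor.
Point copyViaMemcpy(const Point &src) {
  Point dst;
  std::memcpy(&dst, &src, sizeof(Point));
  return dst;
}
```

Calling both helpers on the same `Point` should give objects that compare equal member by member, which is the property a later memcpy conversion would rely on.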

@smeenai smeenai added good first issue Good for newcomers optimization Optimizations that can be implemented at the ClangIR level labels Dec 3, 2024
@bcardosolopes
Member

Btw, we have a missing feature for isMemcpyEquivalentSpecialMember as well; not sure if it's in all the places it should be, though.

@shubhe25p

Hey Bruno and Shoaib, I am new to the MLIR/ClangIR space and would love to contribute. I am not sure if I can solve it, but I would like to try. Can someone give more details on what exactly I need to do, or resources that I can look into? Thank you!

@smeenai
Collaborator Author

smeenai commented Dec 13, 2024

Welcome aboard!

The ClangIR website has good resources around getting started, building and testing, etc. For this task, we're concerned with two parts of ClangIR:

  • CIRGen, which takes an AST and produces the corresponding ClangIR. This is closely modeled after Clang's CodeGen library, which produces LLVM IR from an AST.
  • Transforms, which produce lower-level ClangIR from higher-level ClangIR.

Clang's CodeGen directly emits memcpy instead of calls to trivial copy constructors and assignment operators. CIRGen intentionally diverges from this behavior and emits actual function calls, but we'd also like to replace those calls with a memcpy at some later stage in the pipeline.

You can search for isMemcpyEquivalentSpecialMember under clang/lib/CodeGen to see the places where it produces memcpy instead of function calls. You'll also be able to find the corresponding CIRGen files and functions under clang/lib/CIR/CodeGen and see how its behavior diverges.

For introducing the actual memcpy operations, there are multiple possible approaches. One idea would be to tag the function calls generated by CIRGen with some attribute like memcpy_equivalent, and then have a later pass like LoweringPrepare replace those calls with memcpy. You'll get a better sense of the potential designs and their trade-offs as you start working on this, and of course we're available to answer questions and give suggestions.
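The tag-then-rewrite idea can be sketched with a toy model in standard C++. Nothing below uses real MLIR or ClangIR classes: `ToyOp`, `emitSpecialMemberCall`, and `lowerMemcpyEquivalentCalls` are hypothetical stand-ins, and the `memcpyEquivalent` flag only plays the role that an attribute like memcpy_equivalent might play on a call. It is meant to show the shape of the two stages, not an actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for IR operations. These are NOT the real ClangIR ops;
// every name here is hypothetical.
enum class OpKind { Call, Memcpy };

struct ToyOp {
  OpKind kind;
  std::string callee;       // only meaningful while kind == Call
  std::string dst;          // destination operand
  std::string src;          // source operand
  std::size_t sizeInBytes;  // size of the object being copied
  bool memcpyEquivalent;    // tag recorded when the call is generated
};

// "CIRGen"-style step: always emit a call, but remember whether the callee
// is a trivial copy constructor or assignment operator.
ToyOp emitSpecialMemberCall(std::string callee, std::string dst,
                            std::string src, std::size_t size,
                            bool isTrivialCopy) {
  return ToyOp{OpKind::Call, std::move(callee), std::move(dst),
               std::move(src), size, isTrivialCopy};
}

// "LoweringPrepare"-style step: rewrite every tagged call into a memcpy and
// leave untagged calls alone. Returns the number of rewritten calls.
std::size_t lowerMemcpyEquivalentCalls(std::vector<ToyOp> &ops) {
  std::size_t rewritten = 0;
  for (ToyOp &op : ops) {
    if (op.kind == OpKind::Call && op.memcpyEquivalent) {
      op.kind = OpKind::Memcpy;
      op.callee.clear();
      op.memcpyEquivalent = false;
      ++rewritten;
    }
  }
  return rewritten;
}
```

Keeping the tag on the call until the rewrite step means any analysis that runs in between still sees ordinary calls, which matches the motivation for not emitting memcpy up front.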

The ClangIR Discord channel is another good place to get advice or just say hi (you'll need to join the LLVM Discord server first). Keep in mind though that winter holidays are approaching, so Bruno and I will both be only sporadically available from now till early January.

@shubhe25p

Hi Shoaib, thank you so much for the detailed response. I am setting up my local environment and will continue to explore the code. I am already on the LLVM Discord. Please enjoy your holidays; I will try to understand the task better on my own before asking for help.

@smeenai
Collaborator Author

smeenai commented Dec 13, 2024

Sounds good! A few other tips which might be useful:

@shubhe25p

Thank you! I will take this incrementally and work on it over the holidays. Apologies for the delay in my response, as I am currently wrapping up my internship. I have ClangIR installed on my remote machines and was able to generate the AST, CIR, and other components. I'm currently reviewing the CodeGen code to understand how the AST is passed to the lower layers.
Additionally, I was wondering how frequently ClangIR is merged with the main LLVM project. I tried to build it first and encountered some errors.

@shubhe25p

Hey, happy new year! I am thinking that attributing both FuncOp and CallOp seems redundant. What if we just attribute the CallOp and have the lowering pass discard the corresponding FuncOp? Also, how are attributes actually set? The ClangIR documentation doesn't seem to cover this, so I'm not sure. I was thinking of adding a tag (as extra attrs) if a CallOp invokes a trivial copy constructor (isMemcpyEquivalentSpecialMember). What do you think? Apologies if these questions seem basic.
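To make the "tag only the call, then discard the function" variant concrete, here is a toy sketch in standard C++. `ToyCall`, `ToyModule`, and `lowerAndPrune` are hypothetical and do not correspond to real ClangIR classes. The sketch assumes that a function can be dropped once no remaining call refers to it; a real pass would likely also need to consider linkage and any other references to the function, which this model ignores.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only: hypothetical stand-ins, not real FuncOp / CallOp classes.
struct ToyCall {
  std::string callee;
  bool memcpyEquivalent;  // tag carried by the call site alone
  bool loweredToMemcpy;   // set once the call has been rewritten
};

struct ToyModule {
  std::vector<std::string> funcs;  // names of function definitions
  std::vector<ToyCall> calls;
};

// Rewrites tagged calls, then drops every function that no remaining
// (unrewritten) call refers to. Functions with an untagged caller are kept.
void lowerAndPrune(ToyModule &m) {
  std::map<std::string, int> remainingUses;
  for (ToyCall &c : m.calls) {
    if (c.memcpyEquivalent)
      c.loweredToMemcpy = true;  // this call no longer references its callee
    else
      ++remainingUses[c.callee];
  }
  std::vector<std::string> kept;
  for (const std::string &f : m.funcs)
    if (remainingUses.count(f) != 0)
      kept.push_back(f);
  m.funcs = kept;
}
```

In this model the function itself never needs a tag: whether it survives falls out of counting the calls that were not rewritten.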

@bcardosolopes
Member

Additionally, I was wondering how frequently ClangIR is merged with the main LLVM project

ClangIR is being incrementally upstreamed to llvm-project. All development should be done in the incubator until we reach the point to move over (which is probably at least 6 months away).
