BERT Layer Built-In #2184
Conversation
@phaniarnab I will add support for the GELU activation here once the GELU PR is merged.
The GELU PR is merged now, @MaximilianSchreff.
Force-pushed from 320fc20 to 043338b.
@phaniarnab added support for GELU and also a new test case with GELU activation.
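For context, GELU is defined as GELU(x) = x · Φ(x), where Φ is the standard normal CDF; the original BERT code uses a tanh approximation of this. A minimal NumPy sketch of that approximation, purely for illustration (this is not the SystemDS implementation):

```python
import numpy as np

def gelu_tanh(x):
    # Tanh approximation of GELU from the original BERT code:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

print(gelu_tanh(np.array([-1.0, 0.0, 1.0])))  # approx. [-0.159, 0.0, 0.841]
```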
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@             Coverage Diff             @@
##               main    #2184     +/-  ##
===========================================
  Coverage     71.86%   71.87%
- Complexity    44707    44715      +8
===========================================
  Files          1450     1450
  Lines        169272   169272
  Branches      32996    32996
===========================================
+ Hits         121652   121667     +15
+ Misses        38290    38279     -11
+ Partials       9330     9326      -4

☔ View full report in Codecov by Sentry.
Thanks, @MaximilianSchreff. I will merge this in.
@phaniarnab Added the intermediate outputs to the layer's outputs, since they are required in the backward pass. This should be the last patch for this PR, and it can be merged :)
Thank you @MaximilianSchreff. I will have a look and merge it in.
This PR introduces the full BERT layer from the BERT transformer architecture to SystemDS as a built-in operation.
This PR is part of a series of PRs to support the BERT architecture in SystemDS. The BERT layer is the core building block of the BERT architecture: a transformer encoder block combining multi-head self-attention with a position-wise feed-forward network, each followed by a residual connection and layer normalization (a minimal sketch follows below). The backward pass will follow, as will the remaining components.
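For orientation, here is a minimal NumPy sketch of the forward computation such a layer performs. The parameter names (Wq, bq, ...) are hypothetical, dropout is omitted, and this is not the built-in's actual interface:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-12):
    # Normalize over the hidden dimension (BERT uses eps = 1e-12).
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def gelu(x):
    # Tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def bert_layer_forward(x, p, num_heads):
    # x: (seq_len, hidden); p: dict of weights/biases with hypothetical names.
    T, H = x.shape
    d = H // num_heads
    def split(m):  # (T, H) -> (num_heads, T, d)
        return m.reshape(T, num_heads, d).transpose(1, 0, 2)
    q = split(x @ p["Wq"] + p["bq"])
    k = split(x @ p["Wk"] + p["bk"])
    v = split(x @ p["Wv"] + p["bv"])
    # Scaled dot-product attention per head.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d)) @ v
    attn = attn.transpose(1, 0, 2).reshape(T, H)
    # Attention output projection, first residual + layer norm.
    x = layer_norm(x + attn @ p["Wo"] + p["bo"], p["g1"], p["b1"])
    # Position-wise feed-forward with GELU, second residual + layer norm.
    ffn = gelu(x @ p["Wi"] + p["bi"]) @ p["Wf"] + p["bf"]
    return layer_norm(x + ffn, p["g2"], p["b2"])
```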
Includes
Testing:
Added comprehensive test cases comparing the forward-pass results against the Hugging Face Transformers library's implementation for correctness:
transformers.models.bert.modeling_bert.BertLayer
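For illustration, reference outputs for such a comparison can be generated roughly like this; the configuration values below are made up for a quick check, and the exact BertLayer forward signature may vary across transformers versions:

```python
import torch
from transformers import BertConfig
from transformers.models.bert.modeling_bert import BertLayer

# Small hypothetical configuration for a fast correctness check.
config = BertConfig(hidden_size=64, num_attention_heads=4,
                    intermediate_size=256, hidden_act="gelu")
layer = BertLayer(config).eval()  # eval() disables dropout for determinism

x = torch.randn(1, 8, config.hidden_size)  # (batch, seq_len, hidden)
with torch.no_grad():
    out = layer(x)[0]  # BertLayer returns a tuple; [0] is the hidden states
print(out.shape)  # torch.Size([1, 8, 64])
```

The layer's weights and the resulting output tensor can then be exported and compared elementwise against the SystemDS built-in's result.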