From 970c6f57c5ed4933529657f3bddf30945bcff883 Mon Sep 17 00:00:00 2001
From: Doris Amoakohene <143125519+DorisAmoakohene@users.noreply.github.com>
Date: Mon, 4 Dec 2023 11:10:15 -0700
Subject: [PATCH] Update README.md

---
 README.md | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 9420afc..3ae145e 100644
--- a/README.md
+++ b/README.md
@@ -21,16 +21,47 @@ A more stable and optimized data.table package, with improved performance and us
 
 ## Performance Regression Analysis in the data.table Package
 
-A. link to the issue(s) comment(s) with original code(s) that reported the regression. (more than one if there was more than one issue with code that exhibits the regression) summary of symptom(s) in your own words.
+## Description:
+This repository aims to investigate performance regressions in the data.table package by examining its history, creating relevant performance tests, and using atime to analyze the performance of different code branches (before regression, regression, fix regression). Additionally, it provides a GitHub Action called r-asymptotic-testing that allows you to perform asymptotic testing on your data.table repository package upon every push/pull request.
 
-B. link to PR which caused the regression. what was the cause of the performance regression in the data.table code? (in your own words)
+## User Guide
+# To Start:
 
-C. link to PR which fixed the regression. what was changed to fix it? (in your own words)
+1. To begin, conduct the atime test for the different code branches (before regression, regression, fix regression) to identify potential performance issues. [Here](https://github.com/DorisAmoakohene/Researchwork_Rdata.table/blob/main/Performance%20regression%20with%235424.Rmd) is an example of how to perform the atime test
+
+   NB: Set up the necessary environment and dependencies, ensuring that the data.table package and the atime package are installed and loaded.
+
+2. Generate a plot to showcase the fixes made in the data.table package using the atime package. Utilize the `atime_versions` function to track the fixes across different versions. Pass the following named arguments to `atime::atime_versions`: N, setup, expr, and the different code branches. More documentation of the atime package can be found [here](https://github.com/tdhock/atime/blob/compare-dt-tidy/man/atime_pkg.Rd). The documentation provides detailed information on how to use the atime package for performance analysis and tracking changes across different versions
+
+3. Use the `plot` function to visually represent the execution times of the expression evaluated across different versions of the data.table package.
+
+4. Run the GitHub Action by writing tests in `inst/atime/tests.R`. Define `test.list` as a list with names corresponding to different tests. Each element should be a list with named elements N, setup, expr, to be passed as named arguments to `atime::atime_versions`. For further elaboration on the process of performing asymptotic time testing using the atime package, please refer to [this](https://github.com/marketplace/actions/r-asymptotic-testing)
+
+# Example Usage
+
+A. Provide links to the issue(s) comment(s) containing the original code(s) that reported the regression. If there are multiple issues with code exhibiting the regression, include links to each issue along with a summary of the observed symptoms in your own words.
+
+B. Link to the pull request (PR) that caused the regression. Explain the cause of the performance regression in the data.table code using your own words.
+
+C. Link to the PR that fixed the regression. Describe the changes made to fix the regression in your own words.
+
+D. Provide links to your atime test code(s) and plot(s) that illustrate the performance regression and its fix. If there are multiple issues with code exhibiting regressions, include links and plots for each issue.
 
-D. link to your atime test code(s), and plot(s) which lets us see the performance regression and fix. (more than one if there was more than one issue with code that exhibits the regression)
 - R atime code file(s)
 - png atime figure file(s)
+# Inputs:
+- Links to the issue(s) comment(s) with the original code(s) reporting the regression.
+- Link to the pull request (PR) that caused the regression.
+- Link to the PR that fixed the regression.
+- Links to the atime test code(s) and plot(s) illustrating the performance regression and its fix.
+
+# Outputs:
+- Visualizations of the performance regressions and fixes.
+- GitHub Action report indicating the results of asymptotic testing on the data.table repository package.
+
+
+
 
 ## Performance Regression Analysis
 # 1.