Merge pull request #26 from worldbank/release-v1.2
Version v1.2
kbjarkefur authored Feb 22, 2024
2 parents 3ad6235 + 18f1c16 commit 97910c2
Showing 21 changed files with 1,996 additions and 895 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -61,6 +61,13 @@
!src/dev/assets/*.png
!src/dev/assets/*.css

####################################
# Ignore ssc outputs
src/dev/ssc

# Ignore test outputs
src/tests/outputs/

# Ignore the local dev env set up by repado
src/tests/dev-env/

25 changes: 19 additions & 6 deletions README.md
@@ -2,7 +2,7 @@

This Stata module is a package providing a utility toolkit
for reproducibility best-practices.
The motivation for this package is to make DIME Analytics'
The motivation for this package is to make the World Bank's
reproducibility best-practices more accessible to a wider Stata community.
The best-practices promoted in this package were
identified and implemented as part of the
@@ -15,6 +15,12 @@ Currently, this toolkit has the following commands:
| [repado](https://worldbank.github.io/repkit/reference/repado.html) | Command used to manage project ado command dependencies. This command provides a way to make sure that all team members as well as future reproducers of the projects code use the exact same version of all command dependencies. |
| [repkit](https://worldbank.github.io/repkit/reference/repkit.html) | Command named the same as the package. Most important purpose is that this command makes the code `which repkit` work. |
| [reprun](https://worldbank.github.io/repkit/reference/reprun.html) | This command is used to automate reproducibility checks by running a do-file or a set of do-files twice and comparing all state values (RNG value, datasignature, etc.) between the two runs. This command is currently only released as a beta version. |
| [reproot](https://dime-worldbank.github.io/repkit/reference/reproot.html) | This command allows teams to set up dynamic root-paths that require no manual user-specific set-up. It also supports root-paths in multi-rooted projects, meaning projects that use different tools to collaborate on, for example, code and data. |
| [reprun](https://dime-worldbank.github.io/repkit/reference/reprun.html) | This command is used to automate reproducibility checks by running a do-file or a set of do-files twice and comparing all state values (RNG value, data signature, etc.) between the two runs. This command is currently only released as a beta version. |

# Installation

@@ -63,9 +69,16 @@ with contribution to the code.
# Authors

This package is written and published by
[DIME Analytics](https://www.worldbank.org/en/research/dime/data-and-analytics).
[DIME Analytics](https://www.worldbank.org/en/research/dime/data-and-analytics)
and the [LSMS Team](https://www.worldbank.org/en/programs/lsms).
Both teams are part of the [World Bank](https://www.worldbank.org/).
DIME Analytics is a research data methodology team part of the
[Development Impact](https://www.worldbank.org/en/research/dime)
department within the [World Bank](https://www.worldbank.org/).

Contact: dimeanalytics@worldbank.org
[Development Impact](https://www.worldbank.org/en/research/dime) department.
The Living Standards Measurement Study (LSMS) is the World Bank's
flagship household survey program and is
part of the World Bank’s
[Development Data Group](https://www.worldbank.org/en/about/unit/unit-dec/dev).

Contact:
- dimeanalytics@worldbank.org
- lsms@worldbank.org
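
As a rough usage sketch based on the `syntax` line of `src/ado/reproot.ado` in this commit, a do-file could request its roots as follows. The project name `my-project`, the root names `code` and `data`, and the prefix `myprj_` are hypothetical. The sketch also assumes that repkit v1.2 is installed, that a `reproot-env.yaml` file exists in the home directory, and that matching `reproot.yaml` files sit in the project's root folders.

```stata
* Hypothetical names: project "my-project", roots "code" and "data"
reproot, project("my-project") roots("code data") prefix("myprj_")

* Each required root that is found is stored in a global named <prefix><root>
display "Code root: ${myprj_code}"
display "Data root: ${myprj_data}"

* The clear option forces a new search even if the globals are already set
reproot, project("my-project") roots("code data") prefix("myprj_") clear
```

Without `clear`, the command only searches for roots whose globals are still empty, so repeated calls in the same Stata session should skip the file system search.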
4 changes: 2 additions & 2 deletions src/ado/repado.ado
@@ -1,11 +1,11 @@
*! version 1.1 17DEC2024 DIME Analytics dimeanalytics@worldbank.org
*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org

cap program drop repado
program define repado, rclass

qui {

version 13.0
version 14.1

syntax [using/], ///
/// Optional commands
8 changes: 4 additions & 4 deletions src/ado/repkit.ado
@@ -1,13 +1,13 @@
*! version 1.1 17DEC2024 DIME Analytics dimeanalytics@worldbank.org
*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org

cap program drop repkit
program define repkit, rclass

version 13.0
version 14.1

* UPDATE THESE LOCALS FOR EACH NEW VERSION PUBLISHED
local version "1.1"
local versionDate "17DEC2024"
local version "1.2"
local versionDate "20240222"
local cmd "repkit"

syntax [anything]
217 changes: 217 additions & 0 deletions src/ado/reproot.ado
@@ -0,0 +1,217 @@
*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org

cap program drop reproot
program define reproot, rclass

qui {

version 14.1

* Update the syntax. This is only a placeholder to make the command run
syntax , Project(string) Roots(string) [prefix(string) clear]

noi di _n "{hline}"

* initiate locals
local tot_time 0
local tot_dirs 0
local rootfiles ""

local env_file "~/reproot-env.yaml"
local root_file "reproot.yaml"

/***************************************************
Test if all roots are already loaded
***************************************************/

local roots_set ""
local roots_notset ""

* If clear is used, then add all roots to roots_notset,
* and search for all of them again
if !missing("`clear'") {
local roots_notset "`roots'"
}

* If clear is not used, test what root globals are already set,
* and search only for roots not already set in root globals
else {
* Test which roots if any are already loaded
foreach root of local roots {
* Test if root exists with prefix
if missing("${`prefix'`root'}") {
local roots_notset : list roots_notset | root
}
else local roots_set : list roots_set | root
}
}

/***************************************************
Output any roots that are already set
***************************************************/

if !missing("`roots_set'") {
noi di as text _n "{pstd}These roots were already set in these globals:{p_end}"
foreach root of local roots_set {
local prefix_root "`prefix'`root'"
noi di as text "{phang2}- Global: {result:`prefix_root'} - Root: {result:${`prefix_root'}}{p_end}"
}
}

/***************************************************
Output if all roots are already set
***************************************************/

if missing("`roots_notset'") {
noi di as result _n "{pstd}All required roots are already loaded. No search for roots will be done.{p_end}" _n _n "{hline}"

** The command ends here
}

* There are roots to search for
else {

/***************************************************
Output that at least some roots were not loaded
***************************************************/

noi di as text _n "{pstd}These required roots were not already loaded:{p_end}"
foreach root of local roots_notset {
noi di as text "{pmore}- {bf:`root'}{p_end}"
}
noi di as text _n "{pstd}Starting search of file system.{p_end}" _n

/***************************************************
Read env file before search
***************************************************/

* Get home dir
local pwd = "`c(pwd)'"
cd ~
local homedir = "`c(pwd)'"
cd "`pwd'"

* Test if the env file exists in the home directory
cap confirm file "`env_file'"
if (_rc) {
noi di as text `"{phang}No file {inp:reproot-env.yaml} found in home directory {it:`homedir'}. This file must be set up once per computer before {cmd:reproot} can be used. See instructions on how to set up this file {browse "https://dime-worldbank.github.io/repkit/articles/reproot-files.html":here}.{p_end}"' _n
error 601
exit
}

* Get reprootpaths and skipdirs from env file
reproot_parse env , file("`env_file'")
local envpaths `"`r(envpaths)'"'
local skipdirs `"`r(skipdirs)'"'

/***************************************************
Search each reprootpaths
***************************************************/

foreach envpath of local envpaths {

noi di as smcl `"{hline}"' _n

* Parse max recursion and search path from reprootpath
gettoken maxrecs search_path : envpath, parse(":")
local search_path = substr("`search_path'",2,.)

* Search next folder
noi di as result `"{pstd}{ul:Searching folder: `search_path', with folder depth: `maxrecs'}{p_end}"'
noi reproot_search, ///
path(`"`search_path'"') skipdirs(`"`skipdirs'"') recsleft(`maxrecs')

* Get time, dir_count, and roots found
local time = `r(timer)'
local dirs = `r(num_dir_searched)'
local this_rootdirs = `"`r(rootdirs)'"'

* Output this search
noi di_search_results, ///
time(`time') dcount(`dirs') rootdirs(`"`this_rootdirs'"')

* Add these rootdirs to the list of all dirs
local rootdirs = trim(`"`rootdirs' `this_rootdirs'"')

* Update the time and dir_count to the grand totals
local tot_time = `tot_time' + `time'
local tot_dirs = `tot_dirs' + `dirs'
}

* Output the grand total
noi di as smcl `"{hline}"'
noi di_search_results, total ///
time(`tot_time') dcount(`tot_dirs') rootdirs(`"`rootdirs'"')
noi di as smcl `"{hline}"'


/***************************************************
Parse the root files
***************************************************/

local found_roots ""

foreach rootdir of local rootdirs {
reproot_parse root, file("`rootdir'/`root_file'")
local this_root "`r(root)'"
local this_root_global "`prefix'`this_root'"
local this_root_project "`r(project)'"

* Test if this root belongs to the relevant project
if "`project'" == "`this_root_project'" {

* Test if root was already found, if not then add to found_roots
if (`: list this_root in found_roots') {
noi di as error _n "{pstd}A second root called {result:`this_root'} was found for this project in folder {result:`rootdir'}.{p_end}"
error 99
exit
}
local found_roots : list found_roots | this_root
noi di "found_roots `found_roots'"

local found_str "Root {result:`this_root'} for project {result:`this_root_project'} found"


if (`: list this_root in roots') {
* Output that a relevant root has been found
noi di _n as text "{pstd}`found_str'. Setting global {result:{c S|}{c -(}`this_root_global'{c )-}} to: {result:`rootdir'}{p_end}"

global `this_root_global' "`rootdir'"
}
* Root not required - just skip it
else {
noi di _n as text "{pstd}`found_str', but root not required, so no global is set for this root.{p_end}"
}
}
}
noi di _n `"{hline}"'
}

* Return all root dirs found, regardless of whether they were for this project
return local rootdirs "`rootdirs'"

// Remove when command is no longer in beta
noi repkit "beta reproot"

}
end


cap program drop di_search_results
program define di_search_results

syntax, time(numlist) dcount(numlist) [rootdirs(string) total]

local time: display %8.2f `time'
local dcount: display %14.0fc `dcount'

local rcount: list sizeof rootdirs

local time = trim("`time'")
local dcount = trim("`dcount'")

if missing("`total'") local intro_str "In this search directory"
else local intro_str "In total"

noi di as result _n `"{pstd}`intro_str', `dcount' directories were searched in `time' seconds, and `rcount' reproot root(s) were found.{p_end}"' _n
end
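
The `gettoken` step inside the search loop of `reproot` can be tried in isolation. The sketch below assumes an `envpath` value of the form `<max recursion depth>:<search path>`, which is what the loop appears to receive from `reproot_parse`. The value `4:C:/Users/me/github` is hypothetical, and the format inside `reproot-env.yaml` itself is not shown in this diff.

```stata
* Hypothetical envpath value: recursion depth 4 and a Windows-style search path
local envpath "4:C:/Users/me/github"

* Split at the first colon only: maxrecs gets "4",
* search_path gets the remainder, including the leading ":"
gettoken maxrecs search_path : envpath, parse(":")

* Drop the leading colon that gettoken leaves in the remainder
local search_path = substr("`search_path'", 2, .)

display "depth: `maxrecs'"
display "path:  `search_path'"
```

Because `gettoken` stops at the first parse character, a later colon such as the one in a Windows drive letter stays in the search path.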