diff --git a/.gitignore b/.gitignore index 5f07ed7..12d980b 100644 --- a/.gitignore +++ b/.gitignore @@ -61,6 +61,13 @@ !src/dev/assets/*.png !src/dev/assets/*.css +#################################### +# Ignore ssc outputs +src/dev/ssc + +# Ignore test outputs +src/tests/outputs/ + * Ignore the local dev env set up by repado src/tests/dev-env/ diff --git a/README.md b/README.md index bc386ae..1e9d7f3 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ This Stata module is package providing a utility toolkit for reproducibility best-practices. -The motivation for this package is to make DIME Analytics' +The motivation for this package is to make the World Bank's reproducibility best-practices more accessible to a wider Stata community. The best-practices promoted in this package appreciated identified and implemented as part of the @@ -15,6 +15,12 @@ Currently, this toolkit has the following commands: | [repado](https://worldbank.github.io/repkit/reference/repado.html) | Command used to manage project ado command dependencies. This command provides a way to make sure that all team members as well as future reproducers of the projects code use the exact same version of all command dependencies. | | [repkit](https://worldbank.github.io/repkit/reference/repkit.html) | Command named the same as the package. Most important purpose is that this command makes the code `which repkit` work. | | [reprun](https://worldbank.github.io/repkit/reference/reprun.html) | This command is used to automate reproducibility checks by running a do-file or a set of do-files and compare all state values (RNG-value, datasignature etc.) between the two runs. This command is currently only release as a beta-version. | +| [reproot](https://dime-worldbank.github.io/repkit/reference/reproot.html) | +This command allows teams to set up dynamic root-paths that require +no manual user-specific set-up. 
It also supports root-paths in +multi-rooted projects, meaning projects that use different tools to +collaborate on, for example, code and data. | +| [reprun](https://dime-worldbank.github.io/repkit/reference/reprun.html) | This command is used to automate reproducibility checks by running a do-file or a set of do-files and comparing all state values (RNG-value, data signature etc.) between the two runs. This command is currently only released as a beta-version. | # Installation @@ -63,9 +69,16 @@ with contribution to the code. # Authors This package is written and published by -[DIME Analytics](https://www.worldbank.org/en/research/dime/data-and-analytics). +[DIME Analytics](https://www.worldbank.org/en/research/dime/data-and-analytics) +and the [LSMS Team](https://www.worldbank.org/en/programs/lsms). +Both teams are part of the [World Bank](https://www.worldbank.org/). DIME Analytics is a research data methodology team part of the -[Development Impact](https://www.worldbank.org/en/research/dime) -department within the [World Bank](https://www.worldbank.org/). - -Contact: dimeanalytics@worldbank.org +[Development Impact](https://www.worldbank.org/en/research/dime) department. +The Living Standards Measurement Study (LSMS) is the World Bank's +flagship household survey program and is +part of the World Bank’s +[Development Data Group](https://www.worldbank.org/en/about/unit/unit-dec/dev). + +Contact: +- dimeanalytics@worldbank.org +- lsms@worldbank.org diff --git a/src/ado/repado.ado b/src/ado/repado.ado index c1b9df8..32dcba1 100644 --- a/src/ado/repado.ado +++ b/src/ado/repado.ado @@ -1,11 +1,11 @@ -*! version 1.1 17DEC2024 DIME Analytics dimeanalytics@worldbank.org +*! 
version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org cap program drop repado program define repado, rclass qui { - version 13.0 + version 14.1 syntax [using/], /// /// Optional commands diff --git a/src/ado/repkit.ado b/src/ado/repkit.ado index f594a46..a77b302 100644 --- a/src/ado/repkit.ado +++ b/src/ado/repkit.ado @@ -1,13 +1,13 @@ -*! version 1.1 17DEC2024 DIME Analytics dimeanalytics@worldbank.org +*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org cap program drop repkit program define repkit, rclass - version 13.0 + version 14.1 * UPDATE THESE LOCALS FOR EACH NEW VERSION PUBLISHED - local version "1.1" - local versionDate "17DEC2024" + local version "1.2" + local versionDate "20240222" local cmd "repkit" syntax [anything] diff --git a/src/ado/reproot.ado b/src/ado/reproot.ado new file mode 100644 index 0000000..ba89d89 --- /dev/null +++ b/src/ado/reproot.ado @@ -0,0 +1,217 @@ +*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org + +cap program drop reproot + program define reproot, rclass + +qui { + + version 14.1 + + * Update the syntax. 
This is only a placeholder to make the command run + syntax , Project(string) Roots(string) [prefix(string) clear] + + noi di _n "{hline}" + + * initiate locals + local tot_time 0 + local tot_dirs 0 + local rootfiles "" + + local env_file "~/reproot-env.yaml" + local root_file "reproot.yaml" + + /*************************************************** + Test if all roots are already loaded + ***************************************************/ + + local roots_set "" + local roots_notset "" + + * If clear is used, then add all roots to roots_notset, + * and search for all of them again + if !missing("`clear'") { + local roots_notset "`roots'" + } + + * If clear is not used, test what root globals are already set, + * and search only for roots not already set in root globals + else { + * Test which roots if any are already loaded + foreach root of local roots { + * Test if root exists with prefix + if missing("${`prefix'`root'}") { + local roots_notset : list roots_notset | root + } + else local roots_set : list roots_set | root + } + } + + /*************************************************** + Output any roots are already set + ***************************************************/ + + if !missing("`roots_set'") { + noi di as text _n "{pstd}These roots were already set in these globals:{p_end}" + foreach root of local roots_set { + local prefix_root "`prefix'`root'" + noi di as text "{phang2}- Global: {result:`prefix_root'} - Root: {result:${`prefix_root'}}{p_end}" + } + } + + /*************************************************** + Output if all roots are already set + ***************************************************/ + + if missing("`roots_notset'") { + noi di as result _n "{pstd}All required roots are already loaded. 
No search for roots will be done.{p_end}" _n _n "{hline}" + + ** The command ends here + } + + * There are roots to search for + else { + + /*************************************************** + Output that at least some roots were not loaded + ***************************************************/ + + noi di as text _n "{pstd}These required roots were not already loaded:{p_end}" + foreach root of local roots_notset { + noi di as text "{pmore}- {bf:`root'}{p_end}" + } + noi di as text _n "{pstd}Starting search of file system.{p_end}" _n + + /*************************************************** + Read env file before search + ***************************************************/ + + * Get home dir + local pwd = "`c(pwd)'" + cd ~ + local homedir = "`c(pwd)'" + cd "`pwd'" + + * Test if this location has a root file + cap confirm file "`env_file'" + if (_rc) { + noi di as text `"{phang}No file {inp:reproot-env.yaml} found in home directory {it:`homedir'}. This file is required to set up once per computer to use {cmd:reproot}. See instructions on how to set up this file {browse "https://dime-worldbank.github.io/repkit/articles/reproot-files.html":here}.{p_end}"' _n + error 601 + exit + } + + * Get reprootpaths and skipdirs from env file + reproot_parse env , file("`env_file'") + local envpaths `"`r(envpaths)'"' + local skipdirs `"`r(skipdirs)'"' + + /*************************************************** + Search each reprootpaths + ***************************************************/ + + foreach envpath of local envpaths { + + noi di as smcl `"{hline}"' _n + + * Parse max recursion and search path from reprootpath + gettoken maxrecs search_path : envpath, parse(":") + local search_path = substr("`search_path'",2,.) 
+ + * Search next folder + noi di as result `"{pstd}{ul:Searching folder: `search_path', with folder depth: `maxrecs'}{p_end}"' + noi reproot_search, /// + path(`"`search_path'"') skipdirs(`"`skipdirs'"') recsleft(`maxrecs') + + * Get time, dir_count, and roots found + local time = `r(timer)' + local dirs = `r(num_dir_searched)' + local this_rootdirs = `"`r(rootdirs)'"' + + * Output this search + noi di_search_results, /// + time(`time') dcount(`dirs') rootdirs(`"`this_rootdirs'"') + + * Add these rootdirs to the list of all dirs + local rootdirs = trim(`"`rootdirs' `this_rootdirs'"') + + * Update the time and dir_count to the grand totals + local tot_time = `tot_time' + `time' + local tot_dirs = `tot_dirs' + `dirs' + } + + * Output the grand total + noi di as smcl `"{hline}"' + noi di_search_results, total /// + time(`tot_time') dcount(`tot_dirs') rootdirs(`"`rootdirs'"') + noi di as smcl `"{hline}"' + + + /*************************************************** + Parse the root files + ***************************************************/ + + local found_roots "" + + foreach rootdir of local rootdirs { + reproot_parse root, file("`rootdir'/`root_file'") + local this_root "`r(root)'" + local this_root_global "`prefix'`this_root'" + local this_root_project "`r(project)'" + + * Test if this root belongs to the relevant project + if "`project'" == "`this_root_project'" { + + * Test if root was already found, if not then add to found_roots + if (`: list this_root in found_roots') { + noi di as error _n "{pstd}A second root called {result:`this_root'} was found for this project in folder {result:`rootdir'}.{p_end}" + error 99 + exit + } + local found_roots : list found_roots | this_root + noi di "found_roots `found_roots'" + + local found_str "Root {result:`this_root'} for project {result:`this_root_project'} found" + + + if (`: list this_root in roots') { + * Output that a relevant root has been found + noi di _n as text "{pstd}`found_str'. 
Setting global {result:{c S|}{c -(}`this_root_global'{c )-}} to: {result:`rootdir'}{p_end}" + + global `this_root_global' "`rootdir'" + } + * Root not required - just skip it + else { + noi di _n as text "{pstd}`found_str', but root not required, so no global is set for this root.{p_end}" + } + } + } + noi di _n `"{hline}"' + } + + * Return all root dires found regardless if they were for this project + return local rootdirs "`rootdirs'" + + // Remove then command is no longer in beta + noi repkit "beta reproot" + +} +end + + +cap program drop di_search_results + program define di_search_results + + syntax, time(numlist) dcount(numlist) [rootdirs(string) total] + + local time: display %8.2f `time' + local dcount: display %14.0fc `dcount' + + local rcount: list sizeof rootdirs + + local time = trim("`time'") + local dcount = trim("`dcount'") + + if missing("`total'") local intro_str "In this search directory" + else local intro_str "In total" + + noi di as result _n `"{pstd}`intro_str', `dcount' directories were searched in `time' seconds, and `rcount' reproot root(s) were found.{p_end}"' _n +end diff --git a/src/ado/reproot_parse.ado b/src/ado/reproot_parse.ado new file mode 100644 index 0000000..2344779 --- /dev/null +++ b/src/ado/reproot_parse.ado @@ -0,0 +1,363 @@ +*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org + +cap program drop reproot_parse + program define reproot_parse, rclass + +qui { + + version 14.1 + + * Update the syntax. 
This is only a placeholder to make the command run + syntax anything, file(string) + + if ("`anything'" == "env") { + reproot_parse_env , file("`file'") + return local envpaths = `"`r(envpaths)'"' + return local skipdirs = `"`r(skipdirs)'"' + } + else if ("`anything'" == "root") { + reproot_parse_root, file("`file'") + return local project = `"`r(project)'"' + return local root = `"`r(root)'"' + } + else { + noi di as error "{pstd}reproot_parse: incorrect subcommand [`anything']{p_end}" + error 198 + exit + } +} +end + +cap program drop reproot_parse_env + program define reproot_parse_env, rclass + + syntax, file(string) + + local paths "" + local skipdirs "" + local recursedepth 31 // Default depth = Stata max recursion + + /********************************************************** + READ YAML FILE LINE BY LINE + **********************************************************/ + + * Open template to read from and new tempfile to write to + tempname re_file + file open `re_file' using "`file'", read + file read `re_file' line + + local linenum = 1 + + * Read YAML content into string + while r(eof)==0 { + + * Skip comments + if (substr(trim(`"`line'"'),1,1) != "#") { + + local this_indent = 0 + local this_keyword = "" + local this_value = "" + local valid_value = 0 + + * Count indent of this line - and set indent dependent locals + count_indent, line(`"`line'"') + local this_indent = "`r(indent)'" + if (`this_indent' == 0) { + local is_list 0 + local list_of "" + } + + * Trim line to remove indent + local line = trim(`"`line'"') + + ***************************************** + * Parse items that are part of a list + if (`is_list' == 1) { + parse_listitem, line(`"`line'"') allowed_value("string") + local this_value = `"`r(list_value)'"' + if (`r(valid_value)' == 0) { + noi di as error `"{pstd}Invalid list item on line `linenum' in file [`file']: [`line']{p_end}"' + error 98 + } + * Add to local named after the key word this list is part of, paths etc. 
+ local `list_of' `"``list_of'' `this_value'"' + } + + ***************************************** + * Parse top level keywords + + else { + parse_keyword, line(`"`line'"') /// + allowed_keys("paths skipdirs recursedepth") + local this_keyword = "`r(keyword)'" + local this_value = `"`r(value)'"' + + if ("`this_keyword'" == "paths") { + parse_value, value(`"`this_value'"') allowed_values("list string") + local valid_value = `r(valid_value)' + } + else if ("`this_keyword'" == "skipdirs") { + parse_value, value(`"`this_value'"') allowed_values("list string") + local valid_value = `r(valid_value)' + } + else if ("`this_keyword'" == "recursedepth") { + parse_value, value(`"`this_value'"') allowed_values("number") + local valid_value = `r(valid_value)' + } + else { + noi di as error `"{pstd}Incorrect keyword used on line `linenum' in file [`file']: [`line']{p_end}"' + error 98 + } + + * Output error if invalid value + if (`valid_value' == 0) { + noi di as error `"{pstd}Invalid value in file [`file'] on line `linenum': [`line']{p_end}"' + error 98 + } + + * Unless value is beginning of list, add the value to this keyword + if (`"`this_value'"' != "begin_list") { + local `this_keyword' `"`this_value'"' + } + else { + local is_list = 1 + local list_of = "`this_keyword'" + } + } + } + + * Read next line + file read `re_file' line + local linenum = 1 + `linenum' + } + + /********************************************************** + PREPARE VALUES TO RETURN + **********************************************************/ + + * Add default recurse depth if path does not have custom depth + local formatted_paths "" + foreach path of local paths { + noi prepend_recdepth , path(`path') recursedepth(`recursedepth') + local formatted_paths `"`formatted_paths' "`r(path)'" "' + } + + return local envpaths = trim(`"`formatted_paths'"') + return local skipdirs `"`skipdirs'"' +end + +cap program drop reproot_parse_root + program define reproot_parse_root, rclass + + * Update the syntax. 
This is only a placeholder to make the command run + syntax, file(string) + + /********************************************************** + READ YAML FILE LINE BY LINE + **********************************************************/ + + * Open template to read from and new tempfile to write to + tempname re_file + file open `re_file' using "`file'", read + file read `re_file' line + + local linenum = 1 + + while r(eof)==0 { + + * Skip comments + if (substr(trim(`"`line'"'),1,1) != "#") { + + local this_indent = 0 + local this_keyword = "" + local this_value = "" + local valid_value = 0 + + * Make sure that the root file does not have any indent + count_indent, line(`"`line'"') + if (`r(indent)' != 0) { + noi di as error `"{pstd}The root file [`file'] has an indent in line `linenum': [`line']. The root file is not allowed to have any indents.{p_end}"' + error 98 + exit + } + + * Trim line to remove indent + local line = trim(`"`line'"') + + * Parse the line for keyword and value + parse_keyword, line(`"`line'"') allowed_keys("project_name root_name") + local this_keyword = trim("`r(keyword)'") + local this_value = trim("`r(value)'") + + if ("`this_keyword'" == "project_name") { + parse_value, value(`"`this_value'"') allowed_values("string") + local valid_value = `r(valid_value)' + } + else if ("`this_keyword'" == "root_name") { + parse_value, value(`"`this_value'"') allowed_values("string") + local valid_value = `r(valid_value)' + } + else { + noi di as error `"{pstd}Incorrect keyword used on line `linenum' in file [`file']: [`line']{p_end}"' + error 98 + } + + * Add value named after this + local `this_keyword' `"`this_value'"' + } + + * Read next line + file read `re_file' line + local linenum = 1 + `linenum' + } + + * Test that both required keywords were used + local has_required_keys = 1 + if missing("`project_name'") local has_required_keys = 0 + if missing("`root_name'") local has_required_keys = 0 + if (`has_required_keys'==0) { + noi di as error `"{pstd}The 
root file [`file'] is missing at least one of the keywords project_name and root_name. Both are required.{p_end}"' + error 98 + exit + } + + * Return project and root + return local project = trim("`project_name'") + return local root = trim("`root_name'") + +end + + +* Parse out keyword from top level item +cap program drop parse_keyword + program define parse_keyword, rclass + + syntax, line(string) [allowed_keys(string)] + + * Parse key and value from line + gettoken keyword value : line, parse(": ") + + * Trim and clean the locals + local keyword = trim("`keyword'") + local value = trim(subinstr(`"`value'"',":","",1)) + + if !missing("`allowed_keys'") & !(`: list keyword in allowed_keys') { + noi di as error `"{pstd}The keyword [`keyword'] in line [`line'] is not allowed in the context it is used. Allowed keywords in that context are: [`allowed_keys'].{p_end}"' + error 99 + } + else { + if missing(`"`value'"') local value "begin_list" + * Return the keyword and value + return local keyword `"`keyword'"' + return local value `"`value'"' + } +end + +cap program drop parse_value + program define parse_value, rclass + + syntax, value(string) allowed_values(string) + + local valid = 0 + + * Test if valid number + if (strpos("`allowed_values'","number")) { + cap confirm number `value' + if (_rc != 7) local valid = 1 + } + + * Test if valid list + if (strpos("`allowed_values'","list")) { + if (`"`value'"' == "begin_list") local valid = 1 + } + + * Test if valid double quoted string with char(34) (i.e. 
") + * as first and last character + if (strpos("`allowed_values'","string")) { + * Test that first and last charchter is char(34) - (.i.e ") + local c1 = (substr(`"`macval(value)'"',1,1) == char(34)) + local c2 = (substr(strreverse(`"`macval(value)'"'),1,1) == char(34)) + * test that the string do not have more than 2 char(34) - (.i.e ") + local s1 = !(strpos(subinstr(`"`macval(value)'"',char(34),"",2),char(34))) + + * Test that all above resulted in valid + if ((`c1') & (`c2') & (`s1')) local valid = 1 + } + + return local valid_value `valid' + +end + +cap program drop parse_listitem + program define parse_listitem, rclass + + syntax, line(string) allowed_value(string) + + local valid = 1 + + * Parse key and value from line + gettoken bullet value : line + if (trim(`"`bullet'"') != "-") local valid = 0 + + local value = trim(`"`value'"') + + parse_value, value(`"`value'"') allowed_values("`allowed_value'") + if (`r(valid_value)' == 0) local valid = 0 + + return local list_value `"`value'"' + return local valid_value `valid' + +end + +cap program drop prepend_recdepth + program define prepend_recdepth, rclass + + syntax , path(string) recursedepth(numlist) + + * Get part before first : + gettoken depth : path, parse(":") + + * Test if part before : is a valid depth, otherwise add general depth + cap confirm number `depth' + if (_rc) local returnpath `"`recursedepth':`path'"' + else local returnpath `"`path'"' + + return local path `"`returnpath'"' +end + +* Count indents, throws error if any non-standard single space is used. 
+cap program drop count_indent + program define count_indent, rclass + + syntax, line(string) + + * Get the line length + local linelen = strlen(`"`line'"') + * Initiate locals + local i = 0 + local indent_count = 0 + + * Loop over each character + while (`i'<`linelen') { + * Get next character + local c = substr(`"`line'"',`++i',1) + + * increment indent with 1 if a regular space + if (`"`c'"' == char(32)) { + local indent_count = 1 + `indent_count' + } + * Test for non standard whitespaces (tabs etc) + * This list comes from https://www.stata.com/manuals/fnstringfunctions.pdf + * in str function ustrltrim + else if inlist(`"`c'"',char(9),char(10),char(11),char(12),char(13)) { + * Set indent count to -1 and terminate while loop if found + local indent_count = -1 + local i = `linelen' + } + * If non-whitespace then terminate while loop as no more indent + else local i = `linelen' + } + + * Return the indent + return local indent `indent_count' +end diff --git a/src/ado/reproot_search.ado b/src/ado/reproot_search.ado new file mode 100644 index 0000000..5ce21e1 --- /dev/null +++ b/src/ado/reproot_search.ado @@ -0,0 +1,99 @@ +*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org + +cap program drop reproot_search + program define reproot_search, rclass + +qui { + + version 14.1 + + syntax , /// + path(string) /// + recsleft(numlist) /// + [ /// + skipdirs(string) /// + recurse /// + ] + + * Initiate dir counter at 1 + local d_count 1 + local next_recsleft = `recsleft' - 1 + local rootdirs "" + + local root_file "reproot.yaml" + + /*************************************************** + Initiate things in the original call + ***************************************************/ + + * Start a timer if original run. 
+ if missing("`recurse'") { + * TODO: do not hardcode timer number, find first available + timer clear 68 + timer on 68 + } + + /*************************************************** + Look for reproot file + ***************************************************/ + + * Test if this location has a root file + cap confirm file "`path'/`root_file'" + * File found, handle it + if (_rc == 0) { + noi di as text "{phang}{bf:root:} {it:`path'}{p_end}" + local rootdirs `""`path'""' + } + + /*************************************************** + Recurse over dirs + ***************************************************/ + + * test if recursion depth is met + if (`next_recsleft'>=0) { + * List all sub-folders (if any) in this directory + cap local dir_list : dir `"`path'"' dirs "*" + + * Handle file not found error + if (_rc == 601) { + noi di as text _n "{phang}{red:Warning:} Directory {inp:`path'} could not be searched. Check if the folder path is corrupt. It could also be that the path is longer than what your operating system can handle.{p_end}" + local dir_list "" + } + + * Run command again to throw unhandled error to user + else if (_rc != 0) local dir_list : dir `"`path'"' dirs "*" + + * Recurse into dirs unless it is part of skip folders + foreach dir of local dir_list { + if !(`: list dir in skipdirs') { + reproot_search, /// + path("`path'/`dir'") /// + recsleft(`next_recsleft') /// + skipdirs(`skipdirs') /// + recurse + local d_count = `d_count' + `r(num_dir_searched)' + if !missing(`"`r(rootdirs)'"') { + local rootdirs `"`rootdirs' `r(rootdirs)'"' + } + } + } + } + + /*************************************************** + Return number of dirs counted + ***************************************************/ + + return local num_dir_searched `d_count' + return local rootdirs `"`rootdirs'"' + + /*************************************************** + If original call, output info + ***************************************************/ + + if missing("`recurse'") { + timer off 68 + 
qui timer list 68 + return local timer `r(t68)' + } +} +end diff --git a/src/ado/reprun.ado b/src/ado/reprun.ado index 40ac04d..1d9bb94 100644 --- a/src/ado/reprun.ado +++ b/src/ado/reprun.ado @@ -1,992 +1,992 @@ -*! version 1.1 17DEC2024 DIME Analytics dimeanalytics@worldbank.org +*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org cap program drop reprun program define reprun, rclass - qui { +qui { - version 13.0 + version 14.1 - syntax anything [using/] , [Verbose] [Compact] [noClear] [Debug] [Suppress(passthru)] + syntax anything [using/] , [Verbose] [Compact] [noClear] [Debug] [Suppress(passthru)] - /***************************************************************************** - Syntax parsing and setup - *****************************************************************************/ + /***************************************************************************** + Syntax parsing and setup + *****************************************************************************/ - * Get the name of just the file without the path - local dofile `anything' - local orig_fname = substr(`"`dofile'"',strrpos(`"`dofile'"',"/")+1,.) + * Get the name of just the file without the path + local dofile `anything' + local orig_fname = substr(`"`dofile'"',strrpos(`"`dofile'"',"/")+1,.) 
- local output `using' - if `"`output'"' == `""' { - local output = substr(`"`dofile'"',1,strrpos(`"`dofile'"',"/")) - } + local output `using' + if `"`output'"' == `""' { + local output = substr(`"`dofile'"',1,strrpos(`"`dofile'"',"/")) + } - if !missing("`debug'") di `"`dofile' || `output'"' + if !missing("`debug'") di `"`dofile' || `output'"' - if missing(`"`clear'"') { - clear // Data matches, zeroed out by default - set seed 12345 // Use Stata default setting when starting routine - } + if missing(`"`clear'"') { + clear // Data matches, zeroed out by default + set seed 12345 // Use Stata default setting when starting routine + } - /************************************************************************* - Test input - *************************************************************************/ + /************************************************************************* + Test input + *************************************************************************/ - *Test that output location exist - mata : st_numscalar("r(dirExist)", direxists("`output'")) - if (`r(dirExist)' == 0) { - noi di as error `"{phang}The folder used in [output(`output')] does not exist.{p_end}"' - error 693 - exit - } + *Test that output location exist + mata : st_numscalar("r(dirExist)", direxists("`output'")) + if (`r(dirExist)' == 0) { + noi di as error `"{phang}The folder used in [output(`output')] does not exist.{p_end}"' + error 693 + exit + } - * Cannot choose verbose and compact - if !missing(`"`verbose'"') local compact "" - - /************************************************************************* - Set up output structure - *************************************************************************/ - - local dirout "`output'/reprun" - * Remove existing output if it exists - mata : st_numscalar("r(dirExist)", direxists("`dirout'")) - if (`r(dirExist)' == 1) rm_output_dir, folder("`dirout'") - * Create the new output folder structure - mkdir "`dirout'" - - * Create the subfolders 
in the output folder structure - foreach outdir_run in run1 run2 { - * Create a local to folder path and create the folder - local d`outdir_run' "`dirout'/`outdir_run'" - mkdir "`d`outdir_run''" - } + * Cannot choose verbose and compact + if !missing(`"`verbose'"') local compact "" + + /************************************************************************* + Set up output structure + *************************************************************************/ + + local dirout "`output'/reprun" + * Remove existing output if it exists + mata : st_numscalar("r(dirExist)", direxists("`dirout'")) + if (`r(dirExist)' == 1) rm_output_dir, folder("`dirout'") + * Create the new output folder structure + mkdir "`dirout'" + + * Create the subfolders in the output folder structure + foreach outdir_run in run1 run2 { + * Create a local to folder path and create the folder + local d`outdir_run' "`dirout'/`outdir_run'" + mkdir "`d`outdir_run''" + } - /************************************************************************* - Generate the run 1 and run 2 do-files - *************************************************************************/ - - noi di as res "" - noi di as err "{phang}Starting reprun. 
Creating the do-files for run 1 and run 2.{p_end}" - noi reprun_recurse, dofile("`dofile'") output("`dirout'") stub("m") - local code_file_run1 "`r(code_file_run1)'" - local code_file_run2 "`r(code_file_run2)'" - noi di as err "{phang}Done creating the do-files for run 1 and run 2.{p_end}" - - /************************************************************************* - Execute the run 1 and run 2 file to write the data files - *************************************************************************/ - - * Run 1 - noi di as err `"{phang}Executing "`orig_fname'" for run 1.{p_end}"' - clear - do "`code_file_run1'" - noi di as err `"{phang}Done executing "`orig_fname'" for run 1.{p_end}"' - - * Run 2 - noi di as err `"{phang}Executing "`orig_fname'" for run 2.{p_end}"' - clear - do "`code_file_run2'" - noi di as err `"{phang}Done executing "`orig_fname'" for run 2.{p_end}"' - - /************************************************************************* - Compare the data files and output the result - *************************************************************************/ - - * Output locals - local outputcolumns "10 37 64 91 110" - tempname h_smcl - tempfile f_smcl - - noi di as res "{phang}Generating the report for comparing the two runs.{p_end}" - - * Set up output smcl file - file open `h_smcl' using `f_smcl', write - noi write_and_print_output, h_smcl(`h_smcl') intro_output - - * Set up the titles for the first recursive call - noi write_and_print_output, h_smcl(`h_smcl') l1(" ") /// - l2(`"{phang}Checking file:{p_end}"') - noi print_filetree_and_verbose_title, /// - files(`" "`dofile'" "') h_smcl(`h_smcl') `verbose' `compact' - output_writetitle , outputcolumns("`outputcolumns'") - noi write_and_print_output, h_smcl(`h_smcl') /// - l1("`r(topline)'") l2("`r(state_titles)'") /// - l3("`r(col_titles)'") l4("`r(midline)'") + /************************************************************************* + Generate the run 1 and run 2 do-files + 
*************************************************************************/ + + noi di as res "" + noi di as err "{phang}Starting reprun. Creating the do-files for run 1 and run 2.{p_end}" + noi reprun_recurse, dofile("`dofile'") output("`dirout'") stub("m") + local code_file_run1 "`r(code_file_run1)'" + local code_file_run2 "`r(code_file_run2)'" + noi di as err "{phang}Done creating the do-files for run 1 and run 2.{p_end}" + + /************************************************************************* + Execute the run 1 and run 2 file to write the data files + *************************************************************************/ + + * Run 1 + noi di as err `"{phang}Executing "`orig_fname'" for run 1.{p_end}"' + clear + do "`code_file_run1'" + noi di as err `"{phang}Done executing "`orig_fname'" for run 1.{p_end}"' + + * Run 2 + noi di as err `"{phang}Executing "`orig_fname'" for run 2.{p_end}"' + clear + do "`code_file_run2'" + noi di as err `"{phang}Done executing "`orig_fname'" for run 2.{p_end}"' + + /************************************************************************* + Compare the data files and output the result + *************************************************************************/ + + * Output locals + local outputcolumns "10 37 64 91 110" + tempname h_smcl + tempfile f_smcl + + noi di as res "{phang}Generating the report for comparing the two runs.{p_end}" + + * Set up output smcl file + file open `h_smcl' using `f_smcl', write + noi write_and_print_output, h_smcl(`h_smcl') intro_output + + * Set up the titles for the first recursive call + noi write_and_print_output, h_smcl(`h_smcl') l1(" ") /// + l2(`"{phang}Checking file:{p_end}"') + noi print_filetree_and_verbose_title, /// + files(`" "`dofile'" "') h_smcl(`h_smcl') `verbose' `compact' + output_writetitle , outputcolumns("`outputcolumns'") + noi write_and_print_output, h_smcl(`h_smcl') /// + l1("`r(topline)'") l2("`r(state_titles)'") /// + l3("`r(col_titles)'") l4("`r(midline)'") + + * 
Start the recursive call + noi recurse_comp_lines , dirout("`dirout'") stub("m") /// + orgfile(`"`dofile'"') outputcolumns("`outputcolumns'") /// + `verbose' `compact' h_smcl(`h_smcl') `suppress' + + * Write line that close table + output_writetitle , outputcolumns("`outputcolumns'") + noi write_and_print_output, h_smcl(`h_smcl') /// + l1(`"{phang}Done checking file:{p_end}"') /// + l2(`"{pstd}{c BLC}{hline 1}> `dofile'{p_end}"') l3("{hline}") + file close `h_smcl' + + /***************************************************************************** + Write smcl file to disk and clean up intermediate files unless debugging + *****************************************************************************/ + + copy `f_smcl' "`dirout'/`orig_fname'.reprun.smcl" , replace + noi di as res "" + noi di as res `"{phang}SMCL-file with report written to: {view "`dirout'/`orig_fname'.reprun.smcl"}{p_end}"' + + if missing("`debug'") { + rm_output_dir , folder("`dirout'/run1/") + rm_output_dir , folder("`dirout'/run2/") + } + } - * Start the recursive call - noi recurse_comp_lines , dirout("`dirout'") stub("m") /// - orgfile(`"`dofile'"') outputcolumns("`outputcolumns'") /// - `verbose' `compact' h_smcl(`h_smcl') `suppress' + // Remove then command is no longer in beta + noi repkit "beta reprun" - * Write line that close table - output_writetitle , outputcolumns("`outputcolumns'") - noi write_and_print_output, h_smcl(`h_smcl') /// - l1(`"{phang}Done checking file:{p_end}"') /// - l2(`"{pstd}{c BLC}{hline 1}> `dofile'{p_end}"') l3("{hline}") - file close `h_smcl' +end - /***************************************************************************** - Write smcl file to disk and clean up intermediate files unless debugging - *****************************************************************************/ + /*************************************************************************** + **************************************************************************** - copy `f_smcl' 
"`dirout'/`orig_fname'.reprun.smcl" , replace - noi di as res "" - noi di as res `"{phang}SMCL-file with report written to: {view "`dirout'/`orig_fname'.reprun.smcl"}{p_end}"' + Sub-programs for: Writing run 1 and run 2 dofile - if missing("`debug'") { - rm_output_dir , folder("`dirout'/run1/") - rm_output_dir , folder("`dirout'/run2/") - } - } + **************************************************************************** + ***************************************************************************/ - // Remove then command is no longer in beta - noi repkit "beta reprun" + * Go over the do-file to create run 1 and run 2 do-files. + * Run 1 and 2 are identical to each other. + cap program drop reprun_recurse + program define reprun_recurse, rclass + qui { - end + syntax, dofile(string) output(string) stub(string) - /*************************************************************************** - **************************************************************************** + /************************************************************************* + Create the files that this recursive call needs + *************************************************************************/ - Sub-programs for: Writing run 1 and run 2 dofile + * For each run there will be two files file. + * One code do-file that is a copy of the original file but writes states. 
+ * One data txt-file that the states are written to + foreach run in 1 2 { + * Create code and data output file for each run + tempname code_`run' data_`run' + *Create locals for the file + local code_f`run' "`output'/run`run'/`stub'.do" + local data_f`run' "`output'/run`run'/`stub'.txt" + * Create the files + file open `code_`run'' using "`code_f`run''", write + file open `data_`run'' using "`data_f`run''", write + } - **************************************************************************** - ***************************************************************************/ + /************************************************************************* + Loop over all lines in the do-file + *************************************************************************/ + + * Line write locals + local lnum = 1 // line number tracker + local leof = 0 // end-of file tracker + local subf_n = 0 // tracker of sub-dofiles + + * Line parse locals + local block_stack "" + local loopblock 0 + local commentblock 0 + local last_line "" + local loop_stack "" + local lastline_capture 0 + + * Open the orginal file + tempname code_orig + file open `code_orig' using "`dofile'", read + + * Loop until end of file + while `leof' == 0 { + * Read next line + file read `code_orig' line + local leof = `r(eof)' + + /* Lines with /// are concatenated to long single lines. + If the previous line was a /// line then that content is + in the last_line local which is here concatenated. 
*/ + local line = `"`macval(last_line)' `macval(line)'"' + + * Analyze line in parser to see if this line needs + * and special handling + org_line_parse, line(`"`macval(line)'"') + local write_dataline = `r(write_dataline)' + local firstw = `"`r(firstw)'"' + local secondw = `"`r(secondw)'"' + local thirdw = `"`r(thirdw)'"' + local line_wrap = `r(line_wrap)' + local block_end = `r(block_end)' + local block_add = `"`r(block_add)'"' + local has_rc = `"`r(has_rc)'"' + + * If this row is a closed curly bracket then + * remove most recent word from stack + if (`r(block_end)' == 1) { + local block_pop : word 1 of `block_stack' + local block_stack = subinstr("`block_stack'","`block_pop'","",1) + + if inlist("`block_pop'","foreach","forvalues","while") { + local loop_stack = strreverse("`loop_stack'") + local loop_pop : word 1 of `loop_stack' + local loop_stack = strreverse( /// + subinstr("`loop_stack'","`loop_pop'","",1)) + } + } - * Go over the do-file to create run 1 and run 2 do-files. - * Run 1 and 2 are identical to each other. - cap program drop reprun_recurse - program define reprun_recurse, rclass - qui { + * Add if/else/noi/qui to block stack + if !missing("`r(block_add)'") { + local block_stack "`r(block_add)' `block_stack' " + } - syntax, dofile(string) output(string) stub(string) + * Reset default locals for this line + local write_recline = 0 - /************************************************************************* - Create the files that this recursive call needs - *************************************************************************/ + * Remove /// and pass this line to be included in next line as + * multiline code is being written to one line in the write/check files + if (`line_wrap' == 1) { + local break_pos = strpos(`"`macval(line)'"',"///") + local last_line = substr(`"`macval(line)'"',1,`break_pos'-1) + } - * For each run there will be two files file. - * One code do-file that is a copy of the original file but writes states. 
- * One data txt-file that the states are written to - foreach run in 1 2 { - * Create code and data output file for each run - tempname code_`run' data_`run' - *Create locals for the file - local code_f`run' "`output'/run`run'/`stub'.do" - local data_f`run' "`output'/run`run'/`stub'.txt" - * Create the files - file open `code_`run'' using "`code_f`run''", write - file open `data_`run'' using "`data_f`run''", write - } + * Not part of a multiline line + else { - /************************************************************************* - Loop over all lines in the do-file - *************************************************************************/ - - * Line write locals - local lnum = 1 // line number tracker - local leof = 0 // end-of file tracker - local subf_n = 0 // tracker of sub-dofiles - - * Line parse locals - local block_stack "" - local loopblock 0 - local commentblock 0 - local last_line "" - local loop_stack "" - local lastline_capture 0 - - * Open the orginal file - tempname code_orig - file open `code_orig' using "`dofile'", read - - * Loop until end of file - while `leof' == 0 { - * Read next line - file read `code_orig' line - local leof = `r(eof)' - - /* Lines with /// are concatenated to long single lines. - If the previous line was a /// line then that content is - in the last_line local which is here concatenated. 
*/ - local line = `"`macval(last_line)' `macval(line)'"' - - * Analyze line in parser to see if this line needs - * and special handling - org_line_parse, line(`"`macval(line)'"') - local write_dataline = `r(write_dataline)' - local firstw = `"`r(firstw)'"' - local secondw = `"`r(secondw)'"' - local thirdw = `"`r(thirdw)'"' - local line_wrap = `r(line_wrap)' - local block_end = `r(block_end)' - local block_add = `"`r(block_add)'"' - local has_rc = `"`r(has_rc)'"' - - * If this row is a closed curly bracket then - * remove most recent word from stack - if (`r(block_end)' == 1) { - local block_pop : word 1 of `block_stack' - local block_stack = subinstr("`block_stack'","`block_pop'","",1) - - if inlist("`block_pop'","foreach","forvalues","while") { - local loop_stack = strreverse("`loop_stack'") - local loop_pop : word 1 of `loop_stack' - local loop_stack = strreverse( /// - subinstr("`loop_stack'","`loop_pop'","",1)) - } + *Reset the last line local + local last_line = "" + get_command, word("`firstw'") + local line_command = "`r(command)'" + + * If using capture, log it and take second word as command + if (inlist("`line_command'","capture")) { + local lastline_capture = 1 + local write_dataline = 0 + * Move forward each word + local firstw = "`secondw'" + local secondw = "`thirdw'" + get_command, word("`firstw'") + local line_command = "`r(command)'" } - - * Add if/else/noi/qui to block stack - if !missing("`r(block_add)'") { - local block_stack "`r(block_add)' `block_stack' " + * Handle row that is not capture + else { + * Test if _rc was used on line that is + * not immedeatly after line with capture + if `lastline_capture' == 0 & `has_rc' == 1 { + noi di as error "{pstd}To make sure that {cmd:reprun} runs correctly, {cmd:_rc} is only allowed to be used immedeatly after the line where {cmd:capture} was used. See this article (TODO) for examples on how code can be rewritten to satisfy this requirement. 
Line number `lnum'.{p_end}" + error 99 + exit + } + * Make sure that is capture local is reset + local lastline_capture = 0 } - * Reset default locals for this line - local write_recline = 0 - - * Remove /// and pass this line to be included in next line as - * multiline code is being written to one line in the write/check files - if (`line_wrap' == 1) { - local break_pos = strpos(`"`macval(line)'"',"///") - local last_line = substr(`"`macval(line)'"',1,`break_pos'-1) - } + * Line is do or run, so call recursive function + if (inlist("`line_command'","do","run")) { - * Not part of a multiline line - else { + * Write line handling recursion in data file + local write_recline = 1 + * Keep working on the stub + local recursestub "`stub'_`++subf_n'" - *Reset the last line local - local last_line = "" - get_command, word("`firstw'") - local line_command = "`r(command)'" + * Get the file path from the second word + local file = `"`macval(secondw)'"' - * If using capture, log it and take second word as command - if (inlist("`line_command'","capture")) { - local lastline_capture = 1 - local write_dataline = 0 - * Move forward each word - local firstw = "`secondw'" - local secondw = "`thirdw'" - get_command, word("`firstw'") - local line_command = "`r(command)'" - } - * Handle row that is not capture - else { - * Test if _rc was used on line that is - * not immedeatly after line with capture - if `lastline_capture' == 0 & `has_rc' == 1 { - noi di as error "{pstd}To make sure that {cmd:reprun} runs correctly, {cmd:_rc} is only allowed to be used immedeatly after the line where {cmd:capture} was used. See this article (TODO) for examples on how code can be rewritten to satisfy this requirement. 
Line number `lnum'.{p_end}" - error 99 - exit - } - * Make sure that is capture local is reset - local lastline_capture = 0 - } + noi reprun_recurse, dofile("`file'") /// + output("`output'") /// + stub("`recursestub'") + local sub_f1 "`r(code_file_run1)'" + local sub_f2 "`r(code_file_run2)'" - * Line is do or run, so call recursive function - if (inlist("`line_command'","do","run")) { + * Substitute the original sub-dofile with the check/write ones + local run1_line = /// + subinstr(`"`line'"',`"`file'"',`""`sub_f1'""',1) + local run2_line = /// + subinstr(`"`line'"',`"`file'"',`""`sub_f2'""',1) - * Write line handling recursion in data file - local write_recline = 1 - * Keep working on the stub - local recursestub "`stub'_`++subf_n'" + *Correct potential ""path"" to "path" + local run1_line = subinstr(`"`run1_line'"',`""""',`"""',.) + local run2_line = subinstr(`"`run2_line'"',`""""',`"""',.) + } - * Get the file path from the second word - local file = `"`macval(secondw)'"' + * No special thing with row needing alteration, write row as is + else { - noi reprun_recurse, dofile("`file'") /// - output("`output'") /// - stub("`recursestub'") - local sub_f1 "`r(code_file_run1)'" - local sub_f2 "`r(code_file_run2)'" + * Copy the lines as is + local run1_line `"`macval(line)'"' + local run2_line `"`macval(line)'"' - * Substitute the original sub-dofile with the check/write ones - local run1_line = /// - subinstr(`"`line'"',`"`file'"',`""`sub_f1'""',1) - local run2_line = /// - subinstr(`"`line'"',`"`file'"',`""`sub_f2'""',1) + * Load the local in memory - important to + * build file paths in recursive calls + if inlist("`line_command'","local","global") { + `line' + } - *Correct potential ""path"" to "path" - local run1_line = subinstr(`"`run1_line'"',`""""',`"""',.) - local run2_line = subinstr(`"`run2_line'"',`""""',`"""',.) 
+ * Write foreach/forvalues to block stack and + * it's macro name to loop stack + if inlist("`line_command'","foreach","forvalues") { + local block_stack "`line_command' `block_stack' " + local loop_stack = trim("`loop_stack' `secondw'") } - * No special thing with row needing alteration, write row as is - else { + * Write while to block stack and + * also "while" to loop stack as it does not have a macro name + if inlist("`line_command'","while") { + local block_stack "`line_command' `block_stack' " + local loop_stack = trim("`loop_stack' `line_command'") + } + } - * Copy the lines as is - local run1_line `"`macval(line)'"' - local run2_line `"`macval(line)'"' + if (`write_recline' == 1) { + file write `code_1' `"reprun_dataline, run(1) lnum(`lnum') datatmp("`data_f1'") recursestub(`recursestub') orgsubfile(`file')"' "`macval(rcout)'" _n + file write `code_2' `"reprun_dataline, run(2) lnum(`lnum') datatmp("`data_f2'") recursestub(`recursestub') orgsubfile(`file')"' "`macval(rcout)'" _n + } - * Load the local in memory - important to - * build file paths in recursive calls - if inlist("`line_command'","local","global") { - `line' - } + * Write the line copied from original file + file write `code_1' `"`macval(run1_line)'"' _n "`macval(rcin)'" _n + file write `code_2' `"`macval(run2_line)'"' _n "`macval(rcin)'" _n - * Write foreach/forvalues to block stack and - * it's macro name to loop stack - if inlist("`line_command'","foreach","forvalues") { - local block_stack "`line_command' `block_stack' " - local loop_stack = trim("`loop_stack' `secondw'") + if (`write_dataline' == 1) { + * prepare loop_string with macros + local loop_str = "" + foreach loop_macname of local loop_stack { + if ("`loop_macname'"=="while") { + local loop_str = "`macval(loop_str)' while" } - - * Write while to block stack and - * also "while" to loop stack as it does not have a macro name - if inlist("`line_command'","while") { - local block_stack "`line_command' `block_stack' " - local 
loop_stack = trim("`loop_stack' `line_command'") + else { + local loop_str = /// + "`macval(loop_str)' `loop_macname':\``loop_macname''" } } - if (`write_recline' == 1) { - file write `code_1' `"reprun_dataline, run(1) lnum(`lnum') datatmp("`data_f1'") recursestub(`recursestub') orgsubfile(`file')"' "`macval(rcout)'" _n - file write `code_2' `"reprun_dataline, run(2) lnum(`lnum') datatmp("`data_f2'") recursestub(`recursestub') orgsubfile(`file')"' "`macval(rcout)'" _n - } + * Write lines to run file 1 and 2 + file write `code_1' `"reprun_dataline, run(1) lnum(`lnum') datatmp("`data_f1'") looptracker("`macval(loop_str)'")"' _n "`macval(rcout)'" _n + file write `code_2' `"reprun_dataline, run(2) lnum(`lnum') datatmp("`data_f2'") looptracker("`macval(loop_str)'")"' _n "`macval(rcout)'" _n + } + } + local ++lnum + } - * Write the line copied from original file - file write `code_1' `"`macval(run1_line)'"' _n "`macval(rcin)'" _n - file write `code_2' `"`macval(run2_line)'"' _n "`macval(rcin)'" _n - - if (`write_dataline' == 1) { - * prepare loop_string with macros - local loop_str = "" - foreach loop_macname of local loop_stack { - if ("`loop_macname'"=="while") { - local loop_str = "`macval(loop_str)' while" - } - else { - local loop_str = /// - "`macval(loop_str)' `loop_macname':\``loop_macname''" - } - } + /************************************************************************* + Close all tempfiles + *************************************************************************/ - * Write lines to run file 1 and 2 - file write `code_1' `"reprun_dataline, run(1) lnum(`lnum') datatmp("`data_f1'") looptracker("`macval(loop_str)'")"' _n "`macval(rcout)'" _n - file write `code_2' `"reprun_dataline, run(2) lnum(`lnum') datatmp("`data_f2'") looptracker("`macval(loop_str)'")"' _n "`macval(rcout)'" _n - } - } - local ++lnum + foreach fh in `code_orig' `code_1' `code_2' `data_1' `data_2' { + file close `fh' + } + + 
/************************************************************************* + Return tempfiles so they can be used in when the test is run + *************************************************************************/ + return local code_file_run1 "`code_f1'" + return local code_file_run2 "`code_f2'" + } + end + + cap program drop org_line_parse + program define org_line_parse , rclass + + syntax, line(string) + + *Define defaults to be returned + local write_dataline 1 + local firstw "" + local secondw "" + local thirdw "" + local line_wrap 0 + local block_add "" + local block_end 0 + local has_rc 0 + + * Get the first words + tokenize `" `macval(line)' "' + + *********************************** + * Handle quietly and noisily + *********************************** + + get_command , word(`"`1'"') + local line_command = "`r(command)'" + + if inlist("`line_command'","quietly","noisily") { + * Test if beginning of a noi/qui block + if strpos(`"`macval(line)'"',"{") { + local block_add "`line_command'" } + * Retokenize without the noi/qui syntax (including the ":") + local nline = subinstr(`"`macval(line)'"',"`1'","",1) + if (`"`2'"' == ":") /// + local nline = subinstr(`"`macval(nline)'"',"`2'","",1) + if (substr(`"`1'"',1,1)==":") /// + local nline = subinstr(`"`macval(nline)'"',":","",1) + tokenize `" `macval(nline)' "' + } - /************************************************************************* - Close all tempfiles - *************************************************************************/ + *********************************** + * Handle if-else + *********************************** - foreach fh in `code_orig' `code_1' `code_2' `data_1' `data_2' { - file close `fh' + if inlist("`line_command'","if","else") { + if strpos(`"`macval(line)'"',"{") { + local block_add "`line_command'" } + } + + *********************************** + * Parse the line + *********************************** + + local firstw `"`macval(1)'"' + local secondw `"`macval(2)'"' + local thirdw 
`"`macval(3)'"' - /************************************************************************* - Return tempfiles so they can be used in when the test is run - *************************************************************************/ - return local code_file_run1 "`code_f1'" - return local code_file_run2 "`code_f2'" + * Empty line - skip writing to data file + if (itrim(trim(`"`macval(line)'"')) == "") { + local write_dataline 0 } - end - cap program drop org_line_parse - program define org_line_parse , rclass + * Closed curly bracket - End of block + else if (substr(`"`firstw'"',1,1)=="}") { + local write_dataline 0 + local block_end = 1 + } - syntax, line(string) + /* /// line wrap */ + if (strpos(`"`macval(line)'"',"///")) local line_wrap 1 - *Define defaults to be returned - local write_dataline 1 - local firstw "" - local secondw "" - local thirdw "" - local line_wrap 0 - local block_add "" - local block_end 0 - local has_rc 0 + /* Uses _rc */ + if (strpos(`"`macval(line)'"'," _rc ")) local has_rc 1 - * Get the first words - tokenize `" `macval(line)' "' + * Return all info + return local write_dataline `write_dataline' + return local firstw `"`macval(firstw)'"' + return local secondw `"`macval(secondw)'"' + return local thirdw `"`macval(thirdw)'"' + return local line_wrap `line_wrap' + return local block_end `block_end' + return local block_add "`block_add'" + return local has_rc "`has_rc'" - *********************************** - * Handle quietly and noisily - *********************************** + end - get_command , word(`"`1'"') - local line_command = "`r(command)'" + * This program see if the string passed in word() is a match + * (full word or abbreviation) to a command that toggles some + * special beavior when writing the write and check files + cap program drop get_command + program define get_command, rclass - if inlist("`line_command'","quietly","noisily") { - * Test if beginning of a noi/qui block - if strpos(`"`macval(line)'"',"{") { - local 
block_add "`line_command'" - } - * Retokenize without the noi/qui syntax (including the ":") - local nline = subinstr(`"`macval(line)'"',"`1'","",1) - if (`"`2'"' == ":") /// - local nline = subinstr(`"`macval(nline)'"',"`2'","",1) - if (substr(`"`1'"',1,1)==":") /// - local nline = subinstr(`"`macval(nline)'"',":","",1) - tokenize `" `macval(nline)' "' - } + syntax, [word(string)] + + local wlen = strlen(`"`word'"') - *********************************** - * Handle if-else - *********************************** + local commands "" + local commands "`commands' do ru:n" // File execution + local commands "`commands' foreach forv:alues while" // Iterations + local commands "`commands' if else" // Logic + local commands "`commands' loc:al gl:obal" // Macros + local commands "`commands' qui:etly n:oisily" // Qui/noi + local commands "`commands' cap:ture" // Qui/noi + local match = 0 - if inlist("`line_command'","if","else") { - if strpos(`"`macval(line)'"',"{") { - local block_add "`line_command'" + foreach command of local commands { + if (`match'==0) { + gettoken abbr rest : command, parse(":") + local rest = subinstr("`rest'",":","",1) + local labbr = strlen("`abbr'") + + *Test if minimum abbreviation is the same + if (substr(`"`word'"',1,`labbr')=="`abbr'") { + *Test if remaining part of the word match the rest of the command + if (substr(`"`word'"',`labbr'+1,.)==substr("`rest'",1,`wlen'-`labbr')) { + return local command `"`abbr'`rest'"' + local match = 1 + } + } } } - *********************************** - * Parse the line - *********************************** + * No match, return OTHER + if (`match'==0) { + return local command "OTHER" + } - local firstw `"`macval(1)'"' - local secondw `"`macval(2)'"' - local thirdw `"`macval(3)'"' + end - * Empty line - skip writing to data file - if (itrim(trim(`"`macval(line)'"')) == "") { - local write_dataline 0 - } + /***************************************************************************** + 
****************************************************************************** - * Closed curly bracket - End of block - else if (substr(`"`firstw'"',1,1)=="}") { - local write_dataline 0 - local block_end = 1 - } + Sub-programs for: Comparing results in data files line by line - /* /// line wrap */ - if (strpos(`"`macval(line)'"',"///")) local line_wrap 1 - - /* Uses _rc */ - if (strpos(`"`macval(line)'"'," _rc ")) local has_rc 1 - - * Return all info - return local write_dataline `write_dataline' - return local firstw `"`macval(firstw)'"' - return local secondw `"`macval(secondw)'"' - return local thirdw `"`macval(thirdw)'"' - return local line_wrap `line_wrap' - return local block_end `block_end' - return local block_add "`block_add'" - return local has_rc "`has_rc'" - - end - - * This program see if the string passed in word() is a match - * (full word or abbreviation) to a command that toggles some - * special beavior when writing the write and check files - cap program drop get_command - program define get_command, rclass - - syntax, [word(string)] - - local wlen = strlen(`"`word'"') - - local commands "" - local commands "`commands' do ru:n" // File execution - local commands "`commands' foreach forv:alues while" // Iterations - local commands "`commands' if else" // Logic - local commands "`commands' loc:al gl:obal" // Macros - local commands "`commands' qui:etly n:oisily" // Qui/noi - local commands "`commands' cap:ture" // Qui/noi - local match = 0 - - foreach command of local commands { - if (`match'==0) { - gettoken abbr rest : command, parse(":") - local rest = subinstr("`rest'",":","",1) - local labbr = strlen("`abbr'") - - *Test if minimum abbreviation is the same - if (substr(`"`word'"',1,`labbr')=="`abbr'") { - *Test if remaining part of the word match the rest of the command - if (substr(`"`word'"',`labbr'+1,.)==substr("`rest'",1,`wlen'-`labbr')) { - return local command `"`abbr'`rest'"' - local match = 1 - } - } - } - } + 
****************************************************************************** + *****************************************************************************/ - * No match, return OTHER - if (`match'==0) { - return local command "OTHER" - } + cap program drop recurse_comp_lines + program define recurse_comp_lines, rclass + qui { + syntax, dirout(string) stub(string) orgfile(string) /// + outputcolumns(string) h_smcl(string) [verbose] [compact] [suppress(passthru)] - end - /***************************************************************************** - ****************************************************************************** + local df1 "`dirout'/run1/`stub'.txt" + local df2 "`dirout'/run2/`stub'.txt" - Sub-programs for: Comparing results in data files line by line + tempname handle_df1 handle_df2 + file open `handle_df1' using "`df1'", read + file open `handle_df2' using "`df2'", read - ****************************************************************************** - *****************************************************************************/ + local prev_line1 "" + local prev_line2 "" - cap program drop recurse_comp_lines - program define recurse_comp_lines, rclass - qui { - syntax, dirout(string) stub(string) orgfile(string) /// - outputcolumns(string) h_smcl(string) [verbose] [compact] [suppress(passthru)] + * Loop over all lines in the two data files + local eof = 0 + while `eof' == 0 { + * Read next line of data file 1 + file read `handle_df1' line1 + local eof1 = `r(eof)' + * Read next line of data file 2 + file read `handle_df2' line2 + local eof2 = `r(eof)' - local df1 "`dirout'/run1/`stub'.txt" - local df2 "`dirout'/run2/`stub'.txt" + ***************************** + * Test lines to see if the comparison is valid + ***************************** - tempname handle_df1 handle_df2 - file open `handle_df1' using "`df1'", read - file open `handle_df2' using "`df2'", read + *Test if both lines are identitical + local lines_identical = 
(`"`line1'"'==`"`line2'"') - local prev_line1 "" - local prev_line2 "" + * Testing that not just one + local eof = (`eof1' + `eof2')/2 + if (`eof' == .5) { + noi di "Only one data file came to an end, that is an error" + error 198 + } - * Loop over all lines in the two data files - local eof = 0 - while `eof' == 0 { + * Test if rows are recurse rows + local is_recurse1 = ("`: word 1 of `line1''" == "recurse") + local is_recurse2 = ("`: word 1 of `line2''" == "recurse") + local recurse = (`is_recurse1' + `is_recurse2')/2 + if (`recurse' == .5) { + noi di as error "Internal error: It should never be the case that only one row is a recurse row" + error 198 + } + else if (`recurse' == 1 & `lines_identical' == 0) { + noi di as error "Internal error: Both rows are recurse but they are different" + error 198 + } - * Read next line of data file 1 - file read `handle_df1' line1 - local eof1 = `r(eof)' - * Read next line of data file 2 - file read `handle_df2' line2 - local eof2 = `r(eof)' + ***************************** + * Test lines to see if the comparison is valid + ***************************** - ***************************** - * Test lines to see if the comparison is valid - ***************************** + * Skip rest if reached end of file + if (`eof' != 1) { - *Test if both lines are identitical - local lines_identical = (`"`line1'"'==`"`line2'"') + * If line is a recurse line, then recurese over that file + if (`recurse' == 1 ) { - * Testing that not just one - local eof = (`eof1' + `eof2')/2 - if (`eof' == .5) { - noi di "Only one data file came to an end, that is an error" - error 198 - } + * Getting stub name and orig ndofile name from recurse line + local new_stub : word 2 of `line1' + local new_orgfile : word 3 of `line1' - * Test if rows are recurse rows - local is_recurse1 = ("`: word 1 of `line1''" == "recurse") - local is_recurse2 = ("`: word 1 of `line2''" == "recurse") - local recurse = (`is_recurse1' + `is_recurse2')/2 - if (`recurse' == .5) { - noi di 
as error "Internal error: It should never be the case that only one row is a recurse row" - error 198 - } - else if (`recurse' == 1 & `lines_identical' == 0) { - noi di as error "Internal error: Both rows are recurse but they are different" - error 198 - } + * Write end to previous table, write the file tree for the next + * recursion, and write the beginning of that table + output_writetitle , outputcolumns("`outputcolumns'") + noi write_and_print_output, h_smcl(`h_smcl') /// + l1("`r(botline)'") l2(" ") /// + l3(`"{pstd} Stepping into sub-file:{p_end}"') + noi print_filetree_and_verbose_title, /// + files(`" "`orgfile'" "`new_orgfile'" "') h_smcl(`h_smcl') `verbose' `compact' + output_writetitle , outputcolumns("`outputcolumns'") + noi write_and_print_output, h_smcl(`h_smcl') /// + l1("`r(topline)'") l2("`r(state_titles)'") /// + l3("`r(col_titles)'") l4("`r(midline)'") + + * Make the recurisive call for next file + noi recurse_comp_lines , dirout("`dirout'") stub("`new_stub'") /// + orgfile(`"`orgfile' "`new_orgfile'" "') /// + outputcolumns("`outputcolumns'") h_smcl(`h_smcl') `verbose' `compact' `suppress' - ***************************** - * Test lines to see if the comparison is valid - ***************************** - - * Skip rest if reached end of file - if (`eof' != 1) { - - * If line is a recurse line, then recurese over that file - if (`recurse' == 1 ) { - - * Getting stub name and orig ndofile name from recurse line - local new_stub : word 2 of `line1' - local new_orgfile : word 3 of `line1' - - * Write end to previous table, write the file tree for the next - * recursion, and write the beginning of that table - output_writetitle , outputcolumns("`outputcolumns'") - noi write_and_print_output, h_smcl(`h_smcl') /// - l1("`r(botline)'") l2(" ") /// - l3(`"{pstd} Stepping into sub-file:{p_end}"') - noi print_filetree_and_verbose_title, /// - files(`" `orgfile' "`new_orgfile'" "') h_smcl(`h_smcl') `verbose' `compact' - output_writetitle , 
outputcolumns("`outputcolumns'") - noi write_and_print_output, h_smcl(`h_smcl') /// + * Step back into this data file after the recursive call and: + * Write file tree, and write the titles to the continuation for + * this file + noi write_and_print_output, h_smcl(`h_smcl') /// + l1(`"{phang} Stepping back into file:{p_end}"') + noi print_filetree_and_verbose_title, /// + files(`" "`orgfile'" "') h_smcl(`h_smcl') `verbose' `compact' + output_writetitle , outputcolumns("`outputcolumns'") + noi write_and_print_output, h_smcl(`h_smcl') /// l1("`r(topline)'") l2("`r(state_titles)'") /// l3("`r(col_titles)'") l4("`r(midline)'") + } + * Line is data and not recurse : compare the lines + else { - * Make the recurisive call for next file - noi recurse_comp_lines , dirout("`dirout'") stub("`new_stub'") /// - orgfile(`"`orgfile' "`new_orgfile'" "') /// - outputcolumns("`outputcolumns'") h_smcl(`h_smcl') `verbose' `compact' `suppress' - - * Step back into this data file after the recursive call and: - * Write file tree, and write the titles to the continuation for - * this file - noi write_and_print_output, h_smcl(`h_smcl') /// - l1(`"{phang} Stepping back into file:{p_end}"') - noi print_filetree_and_verbose_title, /// - files(`" "`orgfile'" "') h_smcl(`h_smcl') `verbose' `compact' - output_writetitle , outputcolumns("`outputcolumns'") - noi write_and_print_output, h_smcl(`h_smcl') /// - l1("`r(topline)'") l2("`r(state_titles)'") /// - l3("`r(col_titles)'") l4("`r(midline)'") - } - * Line is data and not recurse : compare the lines - else { - - * Compare if lines are different across runs, but also if lines has changed since last line - compare_data_lines, /// - l1("`line1'") pl1("`prev_line1'") /// - l2("`line2'") pl2("`prev_line2'") /// - `suppress' - - * Only display line if there is a mismatch, or if option verbose - * is used, also output if there is a change from previous line - local write_outputline 0 - - * Check each value individually for changes and mismatches - 
foreach matchtype in rng srng dsig { - * Test if any line is "Change" - local any_change = /// - strpos("`r(`matchtype'_c1)'`r(`matchtype'_c2)'","Change") - if (`any_change' > 0 & !missing(`"`verbose'"')) /// - local write_outputline 1 - * Test if any line is "Missmatch" - local any_mismatch = /// - max(strpos("`r(`matchtype'_m)'","ERR"),strpos("`r(`matchtype'_m)'","DIFF")) - if (`any_mismatch' > 0) & missing(`"`compact'"') local write_outputline 1 - * Compact display - if (`any_mismatch' > 0) & (`any_change' > 0) local write_outputline 1 - } - - * If line is supposed to be outputted, write line - if (`write_outputline' == 1 ) { - output_writerow , /// - outputcolumns("`outputcolumns'") lnum("`r(lnum)'") /// - rng1("`r(rng_c1)'") rng2("`r(rng_c2)'") rngm("`r(rng_m)'") /// - srng1("`r(srng_c1)'") srng2("`r(srng_c2)'") srngm("`r(srng_m)'") /// - dsig1("`r(dsig_c1)'") dsig2("`r(dsig_c2)'") dsigm("`r(dsig_m)'") /// - loopiteration("`r(loopt)'") - noi write_and_print_output, h_smcl(`h_smcl') l1("`r(outputline)'") + * Compare if lines are different across runs, but also if lines has changed since last line + compare_data_lines, /// + l1("`line1'") pl1("`prev_line1'") /// + l2("`line2'") pl2("`prev_line2'") /// + `suppress' + + * Only display line if there is a mismatch, or if option verbose + * is used, also output if there is a change from previous line + local write_outputline 0 + + * Check each value individually for changes and mismatches + foreach matchtype in rng srng dsig { + * Test if any line is "Change" + local any_change = /// + strpos("`r(`matchtype'_c1)'`r(`matchtype'_c2)'","Change") + if (`any_change' > 0 & !missing(`"`verbose'"')) /// + local write_outputline 1 + * Test if any line is "Missmatch" + local any_mismatch = /// + max(strpos("`r(`matchtype'_m)'","ERR"),strpos("`r(`matchtype'_m)'","DIFF")) + if (`any_mismatch' > 0) & missing(`"`compact'"') local write_outputline 1 + * Compact display + if (`any_mismatch' > 0) & (`any_change' > 0) local 
write_outputline 1 } - * Load these lines into pre_line locals for next run - local prev_line1 "`line1'" - local prev_line2 "`line2'" + * If line is supposed to be outputted, write line + if (`write_outputline' == 1 ) { + output_writerow , /// + outputcolumns("`outputcolumns'") lnum("`r(lnum)'") /// + rng1("`r(rng_c1)'") rng2("`r(rng_c2)'") rngm("`r(rng_m)'") /// + srng1("`r(srng_c1)'") srng2("`r(srng_c2)'") srngm("`r(srng_m)'") /// + dsig1("`r(dsig_c1)'") dsig2("`r(dsig_c2)'") dsigm("`r(dsig_m)'") /// + loopiteration("`r(loopt)'") + noi write_and_print_output, h_smcl(`h_smcl') l1("`r(outputline)'") } - } - * End of this data file - else { - * Close the table for his file - output_writetitle , outputcolumns("`outputcolumns'") - noi write_and_print_output, h_smcl(`h_smcl') /// - l1("`r(botline)'") l2(" ") + + * Load these lines into pre_line locals for next run + local prev_line1 "`line1'" + local prev_line2 "`line2'" } } + * End of this data file + else { + * Close the table for his file + output_writetitle , outputcolumns("`outputcolumns'") + noi write_and_print_output, h_smcl(`h_smcl') /// + l1("`r(botline)'") l2(" ") + } } - end - - cap program drop compare_data_lines - program define compare_data_lines, rclass - - syntax, l1(string) l2(string) [pl1(string) pl2(string) suppress(string)] - - * Parse all lines and put then into locals to be compared - foreach line in l1 l2 pl1 pl2 { - local data "``line''" - while !missing(`"`data'"') { - * Parse next key:value pair of data - gettoken keyvaluepair data : data, parse("&") - local data = substr("`data'",2,.) // remove parse char - * Get key and value from pair and return - gettoken key value : keyvaluepair, parse(":") - local `line'_`key' = substr("`value'",2,.) 
// remove parse char - } + } + end + + cap program drop compare_data_lines + program define compare_data_lines, rclass + + syntax, l1(string) l2(string) [pl1(string) pl2(string) suppress(string)] + + * Parse all lines and put then into locals to be compared + foreach line in l1 l2 pl1 pl2 { + local data "``line''" + while !missing(`"`data'"') { + * Parse next key:value pair of data + gettoken keyvaluepair data : data, parse("&") + local data = substr("`data'",2,.) // remove parse char + * Get key and value from pair and return + gettoken key value : keyvaluepair, parse(":") + local `line'_`key' = substr("`value'",2,.) // remove parse char } + } - * Testing an returning line number - if ("`l1_l'" != "`l2_l'") { - noi di as error "Internal error: The line number should always be the same in data line from run 1 and run 2. But in this case line number in run 1 it is `l1_l', and in run 2 it is `l1_2'" - error 198 - } - return local lnum "`l1_l'" + * Testing an returning line number + if ("`l1_l'" != "`l2_l'") { + noi di as error "Internal error: The line number should always be the same in data line from run 1 and run 2. But in this case line number in run 1 it is `l1_l', and in run 2 it is `l1_2'" + error 198 + } + return local lnum "`l1_l'" - if ("`l1_loopt'" != "`l1_loopt'") { - noi di as error "Internal error: The looptracker should always be the same in data line from run 1 and run 2. But in this case looptracker in run 1 it is `l1_loopt', and in run 2 it is `l2_loopt'" - error 198 - } + if ("`l1_loopt'" != "`l1_loopt'") { + noi di as error "Internal error: The looptracker should always be the same in data line from run 1 and run 2. 
But in this case looptracker in run 1 it is `l1_loopt', and in run 2 it is `l2_loopt'" + error 198 + } - // Suppress loop info - if ("`l1_loopt'" == "`pl1_loopt'") & !missing("`l1_loopt'") & strpos("`suppress'","loop") /// - local l1_loopt "" - return local loopt "`l1_loopt'" + // Suppress loop info + if ("`l1_loopt'" == "`pl1_loopt'") & !missing("`l1_loopt'") & strpos("`suppress'","loop") /// + local l1_loopt "" + return local loopt "`l1_loopt'" - * Logic for minimal SRNG checker - local l1_srng = "`l1_srngstate'" - local pl1_srng = "`pl1_srngstate'" + * Logic for minimal SRNG checker + local l1_srng = "`l1_srngstate'" + local pl1_srng = "`pl1_srngstate'" - if ("`l2_srngcheck'" != "0") { - local l2_srng = "`l2_srngstate'" - local pl2_srng = "`pl2_srngstate'" - } - else { - local l2_srng = "`l1_srngstate'" - local pl2_srng = "`pl1_srngstate'" - } + if ("`l2_srngcheck'" != "0") { + local l2_srng = "`l2_srngstate'" + local pl2_srng = "`pl2_srngstate'" + } + else { + local l2_srng = "`l1_srngstate'" + local pl2_srng = "`pl1_srngstate'" + } - local arrow "{c -}{c -}{c -}{c -}{c -}>" + local arrow "{c -}{c -}{c -}{c -}{c -}>" - * Comparing all states since previous line and between runs - foreach state in rng srng dsig { + * Comparing all states since previous line and between runs + foreach state in rng srng dsig { - * Compare state in each run compared to previous line - local `state'_c1 = "" - local change1 0 - if ("`l1_`state''" != "`pl1_`state''") { - local `state'_c1 = "Change" - local change1 1 - } + * Compare state in each run compared to previous line + local `state'_c1 = "" + local change1 0 + if ("`l1_`state''" != "`pl1_`state''") { + local `state'_c1 = "Change" + local change1 1 + } - local `state'_c2 = "" - local change2 0 - if ("`l2_`state''" != "`pl2_`state''") { - local `state'_c2 = "Change" - local change2 1 - } + local `state'_c2 = "" + local change2 0 + if ("`l2_`state''" != "`pl2_`state''") { + local `state'_c2 = "Change" + local change2 1 + } - // 
Ignore RNG if seed is still on default seed - if ("`state'" == "rng") { - set seed 12345 - if ("`l1_`state''" == "`c(rngstate)'") { - local `state'_c1 = "" - local `state'_c2 = "" - local l1_`state' = "DEFAULT" - local l2_`state' = "DEFAULT" - local change1 0 - local change2 0 - } + // Ignore RNG if seed is still on default seed + if ("`state'" == "rng") { + set seed 12345 + if ("`l1_`state''" == "`c(rngstate)'") { + local `state'_c1 = "" + local `state'_c2 = "" + local l1_`state' = "DEFAULT" + local l2_`state' = "DEFAULT" + local change1 0 + local change2 0 } + } - * Return the labels for each state - return local `state'_c1 "``state'_c1'" - return local `state'_c2 "``state'_c2'" + * Return the labels for each state + return local `state'_c1 "``state'_c1'" + return local `state'_c2 "``state'_c2'" - ************************************************************ - * Compare states across runs + ************************************************************ + * Compare states across runs - * Match - if ("`l1_`state''" == "`l2_`state''") { - if !missing("``state'_c1'``state'_c2'") return local `state'_m "OK!" - } + * Match + if ("`l1_`state''" == "`l2_`state''") { + if !missing("``state'_c1'``state'_c2'") return local `state'_m "OK!" 
+ } - * Not matching + * Not matching + else { + * Stata changes in both runs, but to different values + if (`change1' & `change2') { + return local `state'_m "{err:DIFF}" + } + * Neither value changed, they were different from before + else if (!`change1' & !`change2') { + return local `state'_m "" + } + * Only one value changed - that is an error else { - * Stata changes in both runs, but to different values - if (`change1' & `change2') { - return local `state'_m "{err:DIFF}" - } - * Neither value changed, they were different from before - else if (!`change1' & !`change2') { - return local `state'_m "" - } - * Only one value changed - that is an error - else { - return local `state'_m "{err:ERR}" - } + return local `state'_m "{err:ERR}" } + } - } - end + } + end - /***************************************************************************** - ****************************************************************************** + /***************************************************************************** + ****************************************************************************** - Sub-programs for: Output results + Sub-programs for: Output results - ****************************************************************************** - *****************************************************************************/ + ****************************************************************************** + *****************************************************************************/ - * This sub-program prints output to file and screen. 
- * It can print up to 6 lines at the same time l1-l6 - * It has a shorthand to print the intro output - cap program drop write_and_print_output - program define write_and_print_output, rclass - - syntax , h_smcl(string) [intro_output /// - l1(string) l2(string) l3(string) l4(string) l5(string) l6(string)] - - * Prepare setup lines - if !missing("`intro_output'") { - local l1 " " - local l2 "{hline}" - local l3 "`line'{phang}reprun output created by user `c(username)' at `c(current_date)' `c(current_time)'{p_end}" - local l4 "`line'{phang}Operating System `c(machine_type)' `c(os)' `c(osdtl)'{p_end}" - local l5 "`line'{phang}Stata `c(edition_real)' - Version `c(stata_version)' running as version `c(version)'{p_end}" - local l6 "{hline}" - } + * This sub-program prints output to file and screen. + * It can print up to 6 lines at the same time l1-l6 + * It has a shorthand to print the intro output + cap program drop write_and_print_output + program define write_and_print_output, rclass - * Output and write the lines - forvalues line = 1/6 { - if !missing(`"`l`line''"') { - noi di as res `"`l`line''"' - file write `h_smcl' `"`l`line''"' _n - } - } - end - - cap program drop output_writerow - program define output_writerow, rclass - - syntax , outputcolumns(numlist) lnum(string) /// - [rng1(string) rng2(string) rngm(string) /// - srng1(string) srng2(string) srngm(string) /// - dsig1(string) dsig2(string) dsigm(string) /// - loopiteration(string)] - - local c1 : word 1 of `outputcolumns' - local c2 : word 2 of `outputcolumns' - local c3 : word 3 of `outputcolumns' - local c4 : word 4 of `outputcolumns' - local c5 : word 5 of `outputcolumns' - - * Line number - local out_line "{c |} `lnum' {col `c1'}" - - * Rng state - local c1 = (`c1' + 9) - local out_line "`out_line'{c |} `rng1'{col `c1'}" - local c1 = (`c1' + 9) - local out_line "`out_line' `rng2'{col `c1'}" - local out_line "`out_line' `rngm'{col `c2'}" - - * Sort rng state - local c2 = (`c2' + 9) - local out_line 
"`out_line'{c |} `srng1'{col `c2'}" - local c2 = (`c2' + 9) - local out_line "`out_line' `srng2'{col `c2'}" - local out_line "`out_line' `srngm'{col `c3'}" - - * Datasignature - local c3 = (`c3' + 9) - local out_line "`out_line'{c |} `dsig1'{col `c3'}" - local c3 = (`c3' + 9) - local out_line "`out_line' `dsig2'{col `c3'}" - local out_line "`out_line' `dsigm'{col `c4'}" - - - local out_line "`out_line'{c |} `loopiteration'" - return local outputline `out_line' - - end - - cap program drop output_writetitle - program define output_writetitle, rclass - - syntax , outputcolumns(string) - - local c1 : word 1 of `outputcolumns' - local c2 : word 2 of `outputcolumns' - local c3 : word 3 of `outputcolumns' - local c4 : word 4 of `outputcolumns' - local c5 : word 5 of `outputcolumns' - - local h1 = `c1'-2 - local h2 = `c2'-`c1'-1 - local h3 = `c3'-`c2'-1 - local h4 = `c4'-`c3'-1 - local h5 = `c5'-`c4'-1 - - * Top-line - local tt "{c TT}" - local tl "{c TLC}{hline `h1'}`tt'{hline `h2'}`tt'{hline `h3'}`tt'{hline `h4'}" - return local topline "`tl'`tt'{hline `h5'}" - - * State titel line - local sl "{c |}{col `c1'}" - local sl "`sl'{c |}{dup 6: }Seed RNG State{col `c2'}" - local sl "`sl'{c |}{dup 6: }Sort Order RNG{col `c3'}" - local sl "`sl'{c |}{dup 6: }Data Signature{col `c4'}" - return local state_titles "`sl'{c |}" - - * Column title line - local ct "{c |} Run 1 {c |} Run 2 {c |} Match " - local cl "{c |} Line # {col `c1'}" - local cl "`cl'`ct'{col `c2'}`ct'{col `c3'}`ct'{col `c4'}{c |}" - return local col_titles "`cl' Loop iteration:" - - * Middle-line - local mt "{c +}" - local ml "{c LT}{hline 8}`mt'" - local ml "`ml'{hline 8}`mt'{hline 8}`mt'{hline 8}`mt'" - local ml "`ml'{hline 8}`mt'{hline 8}`mt'{hline 8}`mt'" - local ml "`ml'{hline 8}`mt'{hline 8}`mt'{hline 8}`mt'" - return local midline "`ml'{hline `h5'}" - - * Bottom-line - local bt "{c BT}" - local bl "{c BLC}{hline 8}`bt'" - local bl "`bl'{hline 8}`bt'{hline 8}`bt'{hline 8}`bt'" - local bl "`bl'{hline 
8}`bt'{hline 8}`bt'{hline 8}`bt'" - local bl "`bl'{hline 8}`bt'{hline 8}`bt'{hline 8}`bt'" - return local botline "`bl'{hline `h5'}" - - end - - * Print file tree - cap program drop print_filetree_and_verbose_title - program define print_filetree_and_verbose_title, rclass - syntax , files(string) h_smcl(string) [verbose] [compact] - local file_count = 0 - foreach file of local files { - noi write_and_print_output, h_smcl(`h_smcl') /// - l1(`"{pstd}{c BLC}{hline `++file_count'}> `file'{p_end}"') - } + syntax , h_smcl(string) [intro_output /// + l1(string) l2(string) l3(string) l4(string) l5(string) l6(string)] - noi di "" - if missing("`verbose'") & missing("`compact'") { - noi di as res "{phang}Lines where Run 1 and Run 2 mismatch for any value:{p_end}" - } + * Prepare setup lines + if !missing("`intro_output'") { + local l1 " " + local l2 "{hline}" + local l3 "`line'{phang}reprun output created by user `c(username)' at `c(current_date)' `c(current_time)'{p_end}" + local l4 "`line'{phang}Operating System `c(machine_type)' `c(os)' `c(osdtl)'{p_end}" + local l5 "`line'{phang}Stata `c(edition_real)' - Version `c(stata_version)' running as version `c(version)'{p_end}" + local l6 "{hline}" + } - if !missing("`verbose'") { - noi di as res "{phang}Lines where Run 1 and Run 2 mismatch {ul:or} change for any value:{p_end}" + * Output and write the lines + forvalues line = 1/6 { + if !missing(`"`l`line''"') { + noi di as res `"`l`line''"' + file write `h_smcl' `"`l`line''"' _n } + } + end + + cap program drop output_writerow + program define output_writerow, rclass + + syntax , outputcolumns(numlist) lnum(string) /// + [rng1(string) rng2(string) rngm(string) /// + srng1(string) srng2(string) srngm(string) /// + dsig1(string) dsig2(string) dsigm(string) /// + loopiteration(string)] + + local c1 : word 1 of `outputcolumns' + local c2 : word 2 of `outputcolumns' + local c3 : word 3 of `outputcolumns' + local c4 : word 4 of `outputcolumns' + local c5 : word 5 of `outputcolumns' 
+ + * Line number + local out_line "{c |} `lnum' {col `c1'}" + + * Rng state + local c1 = (`c1' + 9) + local out_line "`out_line'{c |} `rng1'{col `c1'}" + local c1 = (`c1' + 9) + local out_line "`out_line' `rng2'{col `c1'}" + local out_line "`out_line' `rngm'{col `c2'}" + + * Sort rng state + local c2 = (`c2' + 9) + local out_line "`out_line'{c |} `srng1'{col `c2'}" + local c2 = (`c2' + 9) + local out_line "`out_line' `srng2'{col `c2'}" + local out_line "`out_line' `srngm'{col `c3'}" + + * Datasignature + local c3 = (`c3' + 9) + local out_line "`out_line'{c |} `dsig1'{col `c3'}" + local c3 = (`c3' + 9) + local out_line "`out_line' `dsig2'{col `c3'}" + local out_line "`out_line' `dsigm'{col `c4'}" + + + local out_line "`out_line'{c |} `loopiteration'" + return local outputline `out_line' + + end + + cap program drop output_writetitle + program define output_writetitle, rclass + + syntax , outputcolumns(string) + + local c1 : word 1 of `outputcolumns' + local c2 : word 2 of `outputcolumns' + local c3 : word 3 of `outputcolumns' + local c4 : word 4 of `outputcolumns' + local c5 : word 5 of `outputcolumns' + + local h1 = `c1'-2 + local h2 = `c2'-`c1'-1 + local h3 = `c3'-`c2'-1 + local h4 = `c4'-`c3'-1 + local h5 = `c5'-`c4'-1 + + * Top-line + local tt "{c TT}" + local tl "{c TLC}{hline `h1'}`tt'{hline `h2'}`tt'{hline `h3'}`tt'{hline `h4'}" + return local topline "`tl'`tt'{hline `h5'}" + + * State titel line + local sl "{c |}{col `c1'}" + local sl "`sl'{c |}{dup 6: }Seed RNG State{col `c2'}" + local sl "`sl'{c |}{dup 6: }Sort Order RNG{col `c3'}" + local sl "`sl'{c |}{dup 6: }Data Signature{col `c4'}" + return local state_titles "`sl'{c |}" + + * Column title line + local ct "{c |} Run 1 {c |} Run 2 {c |} Match " + local cl "{c |} Line # {col `c1'}" + local cl "`cl'`ct'{col `c2'}`ct'{col `c3'}`ct'{col `c4'}{c |}" + return local col_titles "`cl' Loop iteration:" + + * Middle-line + local mt "{c +}" + local ml "{c LT}{hline 8}`mt'" + local ml "`ml'{hline 8}`mt'{hline 
8}`mt'{hline 8}`mt'" + local ml "`ml'{hline 8}`mt'{hline 8}`mt'{hline 8}`mt'" + local ml "`ml'{hline 8}`mt'{hline 8}`mt'{hline 8}`mt'" + return local midline "`ml'{hline `h5'}" + + * Bottom-line + local bt "{c BT}" + local bl "{c BLC}{hline 8}`bt'" + local bl "`bl'{hline 8}`bt'{hline 8}`bt'{hline 8}`bt'" + local bl "`bl'{hline 8}`bt'{hline 8}`bt'{hline 8}`bt'" + local bl "`bl'{hline 8}`bt'{hline 8}`bt'{hline 8}`bt'" + return local botline "`bl'{hline `h5'}" + + end + + * Print file tree + cap program drop print_filetree_and_verbose_title + program define print_filetree_and_verbose_title, rclass + syntax , files(string) h_smcl(string) [verbose] [compact] + local file_count = 0 + foreach file of local files { + noi write_and_print_output, h_smcl(`h_smcl') /// + l1(`"{pstd}{c BLC}{hline `++file_count'}> `file'{p_end}"') + } - if !missing("`compact'") { - noi di as res "{phang}Lines where Run 1 and Run 2 mismatch {ul:and} change for any value:{p_end}" - } + noi di "" + if missing("`verbose'") & missing("`compact'") { + noi di as res "{phang}Lines where Run 1 and Run 2 mismatch for any value:{p_end}" + } - end + if !missing("`verbose'") { + noi di as res "{phang}Lines where Run 1 and Run 2 mismatch {ul:or} change for any value:{p_end}" + } - /***************************************************************************** - ****************************************************************************** + if !missing("`compact'") { + noi di as res "{phang}Lines where Run 1 and Run 2 mismatch {ul:and} change for any value:{p_end}" + } - Utility sub-programs + end - ****************************************************************************** - *****************************************************************************/ + /***************************************************************************** + ****************************************************************************** - * This program can delete all your folders on your computer if used incorrectly. 
- cap program drop rm_output_dir - program define rm_output_dir + Utility sub-programs - syntax , folder(string) + ****************************************************************************** + *****************************************************************************/ - *Test that folder exist - mata : st_numscalar("r(dirExist)", direxists("`folder'")) - if (`r(dirExist)' != 0) { + * This program can delete all your folders on your computer if used incorrectly. + cap program drop rm_output_dir + program define rm_output_dir - * File paths can have both forward and/or back slash. - * We'll standardize them so they're easier to handle - local folderStd = subinstr(`"`folder'"',"\","/",.) + syntax , folder(string) - * List directories, files and other files - local dlist : dir `"`folderStd'"' dirs "*" , respectcase - local flist : dir `"`folderStd'"' files "*" , respectcase - local olist : dir `"`folderStd'"' other "*" , respectcase - local files `"`flist' `olist'"' + *Test that folder exist + mata : st_numscalar("r(dirExist)", direxists("`folder'")) + if (`r(dirExist)' != 0) { - * Recursively call this command on all subfolders - foreach dir of local dlist { - rm_output_dir , folder(`"`folderStd'/`dir'"') - } + * File paths can have both forward and/or back slash. + * We'll standardize them so they're easier to handle + local folderStd = subinstr(`"`folder'"',"\","/",.) 
- * Remove files in this folder - foreach file of local files { - rm `"`folderStd'/`file'"' - } + * List directories, files and other files + local dlist : dir `"`folderStd'"' dirs "*" , respectcase + local flist : dir `"`folderStd'"' files "*" , respectcase + local olist : dir `"`folderStd'"' other "*" , respectcase + local files `"`flist' `olist'"' - * Remove this folder as it is now empty - rmdir `"`folderStd'"' - } - end + * Recursively call this command on all subfolders + foreach dir of local dlist { + rm_output_dir , folder(`"`folderStd'/`dir'"') + } + + * Remove files in this folder + foreach file of local files { + rm `"`folderStd'/`file'"' + } + + * Remove this folder as it is now empty + rmdir `"`folderStd'"' + } + end diff --git a/src/ado/reprun_dataline.ado b/src/ado/reprun_dataline.ado index 66bba22..054ba7b 100644 --- a/src/ado/reprun_dataline.ado +++ b/src/ado/reprun_dataline.ado @@ -1,4 +1,4 @@ -*! version 1.1 17DEC2024 DIME Analytics dimeanalytics@worldbank.org +*! version 1.2 20240222 - DIME Analytics & LSMS Team, The World Bank - dimeanalytics@worldbank.org, lsms@worldbank.org * Command intended to exclusively be run from the run files * that the command iedorep is generating @@ -6,7 +6,7 @@ cap program drop reprun_dataline program define reprun_dataline, rclass - version 13.0 + version 14.1 syntax , /// run(string) /// run 1 or 2 diff --git a/src/dev/run-adodown-util.do b/src/dev/run-adodown-util.do index 6e073a6..5b12e7b 100644 --- a/src/dev/run-adodown-util.do +++ b/src/dev/run-adodown-util.do @@ -1,14 +1,17 @@ * Kristoffer's root path if "`c(username)'" == "wb462869" { - global clone "C:/Users/wb462869/github/repkit" + global clone "C:/Users/wb462869/github/" } * Fill in your root path here if "`c(username)'" == "bbdaniels" { - global clone "/Users/bbdaniels/GitHub/repkit" + global clone "/Users/bbdaniels/GitHub/" } + + local rk "${clone}/repkit" + local ad_src "${clone}/adodown/src" - do 
"https://raw.githubusercontent.com/lsms-worldbank/adodown/main/src/ado/ad_sthlp.ado" - - ad_sthlp , adf("${clone}") - - //ad_command create reprun_dataline , adf("`repkit'") pkg(repkit) + cap net uninstall adodown + net install adodown, from("`ad_src'") replace + + ad_publish, adf("`rk'") undoc("reproot_parse reproot_search reprun_dataline") ssczip + \ No newline at end of file diff --git a/src/mdhlp/reproot.md b/src/mdhlp/reproot.md new file mode 100644 index 0000000..ffa12ec --- /dev/null +++ b/src/mdhlp/reproot.md @@ -0,0 +1,104 @@ +# Title + +__reproot__ - Command for managing root file paths. + +# Syntax + +__reproot__ , __**p**roject__(_string_) __**r**oots__(_string_) [ __**pre**fix__(_string_) __clear__] + +| _options_ | Description | +|-------------|-----------------| +| __**p**roject__(_string_) | The project name to search roots for. | +| __**r**oots__(_string_) | The root name(s) to search for. | +| __**pre**fix__(_string_) | Adds a project-specific prefix to root globals. | +| __clear__ | Always search for roots even if already loaded. | + +`reproot` is a framework for automatically handling file paths across +teams without requiring project-specific setup from individual users. +Each user needs to set up `reproot` once on their computer (see the next paragraph). +Afterward, users can automatically load root paths with +no manual setup in all projects using `reproot` on that computer. + +`reproot` works by having the team save a root file in root folders required in the project. +Such root folders could be the root of a Git clone folder, +the root of a OneDrive/DropBox folder where data is shared, +or the root of a project folder on a network drive where files are shared, etc. +As long as the folder is accessible from the file system, +a root can be placed in that folder. +File paths to specific files can then be expressed in the code as +relative paths from any of those roots. 
+ +To avoid searching the entire file system for roots (which would take too much time), +each user needs to configure a `reproot-env` file. +This file lists which folders and how many sub-folders of those folders +`reproot` should search for root files. +This setup should make the search take less than a second. + +The `reproot-env` file should be created in the folder that +Stata outputs when running the code `cd ~`. +This location can be accessed by all users without having to set any root paths first. + +Read more about setting up this file in +this [article](https://worldbank.github.io/repkit/articles/reproot-files.html). +The rest of this help file will focus on how to use this command once those files are set up. + +# Options + +__project__(_string_) indicates the name of the current project. When searching for root files, only root files for this project will be considered. Use a project name that will remain unique across all team members' computers. + +__roots__(_string_) indicates which roots are expected to be found for this project. +The command creates a global based on the root name of that root +if that root folder is found. +The content of the global will be the file path to the location of the root file. +This command does not set globals for roots not listed here, +even if such roots for this project were found. +Unless the __clear__ option is used, +the command does not overwrite any global that already existed before running the command. +Finally, the command tests that there is a global named after each root and +that all of them are non-empty. + +__prefix__(_string_) allows the user to set a project-specific global prefix. +This is strongly recommended to ensure that a global from another project +is not mistaken as a global for the current project. +Unless the __clear__ option is used, +a global already set with a common name, such as `data` or `code`, +will be interpreted as a root global with that name for the current project. 
+The __prefix()__ option allows a project-specific prefix that is set to all globals. +So, if __prefix("abc_")__ is used, then the globals `data` and `code` +will be set to `abc_data` and `abc_code`. + +__clear__ overwrites globals that already exist with the name that `reproot` would create. +This is all the roots listed in __roots()__ with +the __prefix()__ prepended if that option is used. +The default behavior is to not search for roots that are already set up. +If all globals are already set, then the command does not execute the search for roots. + +# Examples + +These examples demonstrate how to include `reproot` in the do-file. +See this [article](https://worldbank.github.io/repkit/articles/reproot-files.html) +for information on how to set up the `.yaml` files this command needs to run. + +## Example 1. + +In this example, the command searches the search paths indicated in +the `reproot-env.yaml` file for root files for the project `my_proj`. +Then it sets the globals `data` and `clone` to the file location where +root files with those names for this project are found. + +``` +reproot , project("my_proj") roots("data clone") +``` + +# Feedback, bug reports and contributions + +Read more about the commands in this package at https://worldbank.github.io/repkit. + +Please provide any feedback by opening an issue at https://github.com/worldbank/repkit. + +PRs with suggestions for improvements are also greatly appreciated. + +# Authors + +LSMS Team, The World Bank lsms@worldbank.org +DIME Analytics, The World Bank dimeanalytics@worldbank.org diff --git a/src/repkit.pkg b/src/repkit.pkg index 20b351d..a0af3eb 100644 --- a/src/repkit.pkg +++ b/src/repkit.pkg @@ -1,31 +1,39 @@ * This package file is generated in the adodown workflow. Do not edit directly.
*** version -v 1.1 -*** Title -d 'REPKIT': A module with tools related to collaboration and computational reproducibility +v 1.2 +*** title +d 'REPKIT': a module facilitating collaboration and computational reproducibility *** description -d A Stata package with tools related to computational reproducibility +d repkit is a package that aims to standardize best practices for +d reproducibility and collaboration as well as making them more accessible to +d the wider Stata community. This includes features ranging from root-path +d management, dependencies management and other tools that will help ensure the +d reproducibility of a project. d *** stata -d Version: Stata 14.1 +d Requires: Stata version 14.1 d *** author -d Author: DIME Analytics, LSMS Team +d Author: DIME Analytics & LSMS Team, The World Bank *** contact d Contact: dimeanalytics@@worldbank.org, lsms@@worldbank.org *** url d URL: https://github.com/worldbank/repkit d *** date -d Distribution-Date: 20230822 +d Distribution-Date: 20240222 d *** adofiles +f ado/reproot_search.ado +f ado/reproot_parse.ado +f ado/reproot.ado f ado/reprun_dataline.ado f ado/reprun.ado f ado/repado.ado f ado/repkit.ado *** helpfiles +f sthlp/reproot.sthlp f sthlp/reprun.sthlp f sthlp/repado.sthlp f sthlp/repkit.sthlp diff --git a/src/sthlp/repado.sthlp b/src/sthlp/repado.sthlp index b0a4f5e..a9bc49c 100644 --- a/src/sthlp/repado.sthlp +++ b/src/sthlp/repado.sthlp @@ -1,5 +1,5 @@ {smcl} -{* 17 Jan 2024}{...} +{* *! version 1.2 20240222}{...} {hline} {pstd}help file for {hi:repado}{p_end} {hline} @@ -27,12 +27,12 @@ {pstd}This command is used to make sure that all users in a project use the exact same version of the commands the project code requires. This is done by creating a folder that we will call the ado-folder. This folder should be shared with the rest of the code of the project. This will work no matter how the files are shared. 
It can be using a syncing service like DropBox, a Git repository, a network drive, an external hard drive, a .zip folder etc. {p_end} -{pstd}Using {inp:repado} in the {it:strict} mode, means that no other commands can be used apart from Stata{c 39}s built in commands and the commands in the shared ado-folder. +{pstd}Using {inp:repado} in the {it:strict} mode, means that no other commands can be used apart from Stata{c 39}s built in commands and the commands in the shared ado-folder. The commands that users have installed on their computers will not be available. These settings are restored next time Stata is restarted. {p_end} -{pstd}Using {inp:repado} in the {it:nostrict} mode, means that built-commands and the commands ado-folder are available to the script in addition to any command any user has installed on their computer. However, if a command is installed on a user{c 39}s computer that has the same name as a command in the ado-folder, then the exact version of the command in the ado-folder will be used. +{pstd}Using {inp:repado} in the {it:nostrict} mode, means that built-commands and the commands ado-folder are available to the script in addition to any command any user has installed on their computer. However, if a command is installed on a user{c 39}s computer that has the same name as a command in the ado-folder, then the exact version of the command in the ado-folder will be used. These settings are restored next time Stata is restarted. {p_end} @@ -58,7 +58,7 @@ the differences between the two modes. {dlgtab:Note on old and undocumented but still supported options} -{pstd}In earlier versions of {inp:repado}, {bf:adopath}({it:adopath}) +{pstd}In earlier versions of {inp:repado}, {bf:adopath}({it:adopath}) and {bf:mode}({it:{c -(}} {it:strict} {it:|} {it:nostrict} {it:{c )-}}) were two documented options. These two options are replaced by {bf:using} {it:adopath} and {bf:nostrict}, but they are still supported for the sake of backward compatibility. 
@@ -68,7 +68,7 @@ but they are still supported for the sake of backward compatibility. {dlgtab:Example 1} -{pstd}In this example, the ado-folder is a folder called {inp:ado} in the folder that the global {inp:myproj} is pointing to. +{pstd}In this example, the ado-folder is a folder called {inp:ado} in the folder that the global {inp:myproj} is pointing to. {p_end} {input}{space 8}repado using "${myproj}/ado" @@ -76,8 +76,8 @@ but they are still supported for the sake of backward compatibility. {dlgtab:Example 2} {pstd}Similarly to example 1, in this example, -the ado-folder is a folder called {inp:ado} in the folder -that the global {inp:myproj} is pointing to. +the ado-folder is a folder called {inp:ado} in the folder +that the global {inp:myproj} is pointing to. In this example the {it:nostrict} mode is used. {p_end} @@ -85,7 +85,7 @@ In this example the {it:nostrict} mode is used. {text} {title:Feedback, bug reports and contributions} -{pstd}Read more about these commands on {browse "https://github.com/dime-worldbank/repkit":this repo} where this package is developed. Please provide any feedback by {browse "https://github.com/dime-worldbank/repkit/issues":opening an issue}. PRs with suggestions for improvements are also greatly appreciated. +{pstd}Read more about these commands on {browse "https://github.com/worldbank/repkit":this repo} where this package is developed. Please provide any feedback by {browse "https://github.com/worldbank/repkit/issues":opening an issue}. PRs with suggestions for improvements are also greatly appreciated. {p_end} {title:Authors} diff --git a/src/sthlp/repkit.sthlp b/src/sthlp/repkit.sthlp index 237002a..0c41748 100644 --- a/src/sthlp/repkit.sthlp +++ b/src/sthlp/repkit.sthlp @@ -1,5 +1,5 @@ {smcl} -{* 17 Jan 2024}{...} +{* *! version 1.2 20240222}{...} {hline} {pstd}help file for {hi:repkit}{p_end} {hline} @@ -30,7 +30,7 @@ That is the main purpose of this command. 
{title:Feedback, bug reports and contributions} -{pstd}Read more about these commands on {browse "https://github.com/dime-worldbank/repkit":this repo} where this package is developed. Please provide any feedback by {browse "https://github.com/dime-worldbank/repkit/issues":opening an issue}. PRs with suggestions for improvements are also greatly appreciated. +{pstd}Read more about these commands on {browse "https://github.com/worldbank/repkit":this repo} where this package is developed. Please provide any feedback by {browse "https://github.com/worldbank/repkit/issues":opening an issue}. PRs with suggestions for improvements are also greatly appreciated. {p_end} {title:Authors} diff --git a/src/sthlp/reproot.sthlp b/src/sthlp/reproot.sthlp new file mode 100644 index 0000000..b994bce --- /dev/null +++ b/src/sthlp/reproot.sthlp @@ -0,0 +1,127 @@ +{smcl} +{* *! version 1.2 20240222}{...} +{hline} +{pstd}help file for {hi:reproot}{p_end} +{hline} + +{title:Title} + +{phang}{bf:reproot} - Command for managing root file paths. +{p_end} + +{title:Syntax} + +{phang}{bf:reproot} , {bf:{ul:p}roject}({it:string}) {bf:{ul:r}oots}({it:string}) [ {bf:{ul:pre}fix}({it:string}) {bf:clear}] +{p_end} + +{synoptset 15}{...} +{synopthdr:options} +{synoptline} +{synopt: {bf:{ul:p}roject}({it:string})}The project name to search roots for.{p_end} +{synopt: {bf:{ul:r}oots}({it:string})}The root name(s) to search for.{p_end} +{synopt: {bf:{ul:pre}fix}({it:string})}Adds a project-specific prefix to root globals.{p_end} +{synopt: {bf:clear}}Always search for roots even if already loaded.{p_end} +{synoptline} + +{phang}{inp:reproot} is a framework for automatically handling file paths across +teams without requiring project-specific setup from individual users. +Each user needs to set up {inp:reproot} once on their computer (see the next paragraph). +Afterward, users can automatically load root paths with +no manual setup in all projects using {inp:reproot} on that computer. 
+{p_end} + +{phang}{inp:reproot} works by having the team save a root file in root folders required in the project. +Such root folders could be the root of a Git clone folder, +the root of a OneDrive/DropBox folder where data is shared, +or the root of a project folder on a network drive where files are shared, etc. +As long as the folder is accessible from the file system, +a root can be placed in that folder. +File paths to specific files can then be expressed in the code as +relative paths from any of those roots. +{p_end} + +{phang}To avoid searching the entire file system for roots (which would take too much time), +each user needs to configure a {inp:reproot-env} file. +This file lists which folders and how many sub-folders of those folders +{inp:reproot} should search for root files. +This setup should make the search take less than a second. +{p_end} + +{phang}The {inp:reproot-env} file should be created in the folder that +Stata outputs when running the code {inp:cd ~}. +This location can be accessed by all users without having to set any root paths first. +{p_end} + +{phang}Read more about setting up this file in +this {browse "https://worldbank.github.io/repkit/articles/reproot-files.html":article}. +The rest of this help file will focus on how to use this command once those files are set up. +{p_end} + +{title:Options} + +{pstd}{bf:project}({it:string}) indicates the name of the current project. When searching for root files, only root files for this project will be considered. Use a project name that will remain unique across all team members{c 39} computers. +{p_end} + +{pstd}{bf:roots}({it:string}) indicates which roots are expected to be found for this project. +The command creates a global based on the root name of that root +if that root folder is found. +The content of the global will be the file path to the location of the root file. +This command does not set globals for roots not listed here, +even if such roots for this project were found. 
+Unless the {bf:clear} option is used, +the command does not overwrite any global that already existed before running the command. +Finally, the command tests that there is a global named after each root and +that all of them are non-empty. +{p_end} + +{pstd}{bf:prefix}({it:string}) allows the user to set a project-specific global prefix. +This is strongly recommended to ensure that a global from another project +is not mistaken for a global for the current project. +Unless the {bf:clear} option is used, +a global already set with a common name, such as {inp:data} or {inp:code}, +will be interpreted as a root global with that name for the current project. +The {bf:prefix()} option sets a project-specific prefix that is prepended to all globals. +So, if {bf:prefix({c 34}abc_{c 34})} is used, then the globals {inp:data} and {inp:code} +will be named {inp:abc_data} and {inp:abc_code}. +{p_end} + +{pstd}{bf:clear} overwrites globals that already exist with the name that {inp:reproot} would create. +These are all the roots listed in {bf:roots()} with +the {bf:prefix()} prepended if that option is used. +The default behavior is to not search for roots that are already set up. +If all globals are already set, then the command does not execute the search for roots. +{p_end} + +{title:Examples} + +{pstd}These examples demonstrate how to include {inp:reproot} in the do-file. +See this {browse "https://worldbank.github.io/repkit/articles/reproot-files.html":article} +for information on how to set up the {inp:.yaml} files this command needs to run. +{p_end} + +{dlgtab:Example 1.} + +{pstd}In this example, the command searches the search paths indicated in +the {inp:reproot-env.yaml} file for root files for the project {inp:my_proj}. +Then it sets the globals {inp:data} and {inp:clone} to the file location where +root files with those names for this project are found.
+{p_end} + +{input}{space 8}reproot , project("my_proj") roots("data clone") +{text} +{title:Feedback, bug reports and contributions} + +{pstd}Read more about the commands in this package at https://worldbank.github.io/repkit. +{p_end} + +{pstd}Please provide any feedback by opening an issue at https://github.com/worldbank/repkit. +{p_end} + +{pstd}PRs with suggestions for improvements are also greatly appreciated. +{p_end} + +{title:Authors} + +{pstd}LSMS Team, The World Bank lsms@worldbank.org +DIME Analytics, The World Bank dimeanalytics@worldbank.org +{p_end} diff --git a/src/sthlp/reprun.sthlp b/src/sthlp/reprun.sthlp index 7008791..d3f746f 100644 --- a/src/sthlp/reprun.sthlp +++ b/src/sthlp/reprun.sthlp @@ -1,5 +1,5 @@ {smcl} -{* 17 Jan 2024}{...} +{* *! version 1.2 20240222}{...} {hline} {pstd}help file for {hi:reprun}{p_end} {hline} @@ -16,7 +16,7 @@ [{bf:{ul:d}ebug}] [{bf:{ul:noc}lear}] {p_end} -{phang}By default, {bf:reprun} will execute the complete do-file specified in {c 34}{it:do-file.do}{c 34} once (Run 1), and record the {c 34}seed RNG state{c 34}, {c 34}sort order RNG{c 34}, and {c 34}data signature{c 34} after the execution of every line, as well as the exact data in certain cases. {bf:reprun} will then execute the do-file a second time (Run 2), and find all {it:changes} and {it:mismatches} in these states throughout Run 2. A table of mismatches will be reported in the Results window, as well as in a SMCL file in a new directory called {inp:/reprun/} in the same location as the do-file. If the {inp:using} argument is supplied, the {inp:/reprun/} directory containing the SMCL file will be stored in that location instead. +{phang}By default, {bf:reprun} will execute the complete do-file specified in {c 34}{it:do-file.do}{c 34} once (Run 1), and record the {c 34}seed RNG state{c 34}, {c 34}sort order RNG{c 34}, and {c 34}data signature{c 34} after the execution of every line, as well as the exact data in certain cases.
{bf:reprun} will then execute the do-file a second time (Run 2), and find all {it:changes} and {it:mismatches} in these states throughout Run 2. A table of mismatches will be reported in the Results window, as well as in a SMCL file in a new directory called {inp:/reprun/} in the same location as the do-file. If the {inp:using} argument is supplied, the {inp:/reprun/} directory containing the SMCL file will be stored in that location instead. {p_end} {synoptset 15}{...} @@ -50,12 +50,12 @@ {pstd}The {bf:{ul:c}ompact} option, by contrast, produces less detailed reporting, but is often a good first step to begin locating issues in the code. If the {bf:{ul:c}ompact} option is specified, then {it:only} those lines which have changes {it:during} Run 1 or Run 2 {bf:and} mismatches {it:between} the runs will be flagged and reported. This is intended to reduce the reporting of {c 34}cascading{c 34} flags, which are caused because some state value changes inconsistently at a single point and remains inconsistent for the remainder of the run. {p_end} -{pstd}The {bf:{ul:s}uppress()} option is used to hide the reporting of changes that do not lead to mismatches (especially when the {bf:{ul:v}erbose} option is specified) for one or more of the types. In particular, since the sort order RNG frequently changes and should {it:not} be forced to match between runs, it will very often have changes that do not produce errors, specifying {bf:{ul:s}uppress(srng)} will remove a great deal of unhelpful output from the reporting table. To do this for all states, write {bf:{ul:s}uppress(rng srng dsig)}. Suppressing {inp:loop} will clean up the display of loops so that the titles are only shown on the first line; but if combined with {inp:compact} may not display at all. +{pstd}The {bf:{ul:s}uppress()} option is used to hide the reporting of changes that do not lead to mismatches (especially when the {bf:{ul:v}erbose} option is specified) for one or more of the types. 
In particular, since the sort order RNG frequently changes and should {it:not} be forced to match between runs, it will very often have changes that do not produce errors, specifying {bf:{ul:s}uppress(srng)} will remove a great deal of unhelpful output from the reporting table. To do this for all states, write {bf:{ul:s}uppress(rng srng dsig)}. Suppressing {inp:loop} will clean up the display of loops so that the titles are only shown on the first line; but if combined with {inp:compact} may not display at all. {p_end} {dlgtab:Reporting and debugging options} -{pstd}The {bf:{ul:d}ebug} option allows the user to save all of the underlying materials used by {bf:reprun} in the {inp:/reprun/} folder where the reporting SMCL file will be written. This will include copies of all do-files for each run for manual inspection, text files of the states of Stata after each line, and copies of the dataset at specific lines when it is needed. This can take a lot of space, and is automatically cleaned up after execution if {bf:{ul:d}ebug} is not specified. +{pstd}The {bf:{ul:d}ebug} option allows the user to save all of the underlying materials used by {bf:reprun} in the {inp:/reprun/} folder where the reporting SMCL file will be written. This will include copies of all do-files for each run for manual inspection, text files of the states of Stata after each line, and copies of the dataset at specific lines when it is needed. This can take a lot of space, and is automatically cleaned up after execution if {bf:{ul:d}ebug} is not specified. {p_end} {dlgtab:Other options} @@ -65,7 +65,7 @@ {title:Feedback, bug reports and contributions} -{pstd}Read more about these commands on {browse "https://github.com/dime-worldbank/repkit":this repo} where this package is developed. Please provide any feedback by {browse "https://github.com/dime-worldbank/repkit/issues":opening an issue}. PRs with suggestions for improvements are also greatly appreciated. 
+{pstd}Read more about these commands on {browse "https://github.com/worldbank/repkit":this repo} where this package is developed. Please provide any feedback by {browse "https://github.com/worldbank/repkit/issues":opening an issue}. PRs with suggestions for improvements are also greatly appreciated. {p_end} {title:Authors} diff --git a/src/tests/repado/repado-schema.do b/src/tests/repado/repado-schema.do index 2e9c936..73c35c8 100644 --- a/src/tests/repado/repado-schema.do +++ b/src/tests/repado/repado-schema.do @@ -2,7 +2,7 @@ global repkit "/Users/bbdaniels/GitHub/repkit" -repado "${repkit}/src/tests/dev-env" +repado using "${repkit}/src/tests/dev-env" copy "https://github.com/graykimbrough/uncluttered-stata-graphs/raw/master/schemes/scheme-uncluttered.scheme" /// "${repkit}/src/tests/plus-ado/scheme-uncluttered.scheme" , replace diff --git a/src/tests/reproot/reproot.do b/src/tests/reproot/reproot.do new file mode 100644 index 0000000..dd18bed --- /dev/null +++ b/src/tests/reproot/reproot.do @@ -0,0 +1,66 @@ + + cap which repkit + if _rc == 111 { + di as error `"{pstd}This test file uses features from the package {browse "https://dime-worldbank.github.io/repkit/":repkit}.
Click {stata ssc install repkit} to install it and run this file again.{p_end}"' + } + + ************************************* + * Set root path + * TODO: Update with reprun once published + + di "Your username: `c(username)'" + * Set each user's root path + if "`c(username)'" == "`c(username)'" { + global root "C:/Users/wb462869/github//repkit" + } + * Set all other users' root paths in this format + if "`c(username)'" == "" { + global root "" + } + + * Set global to the test folder + global src "${root}/src" + global tests "${src}/tests" + + * Set up a dev environment for testing locally + cap mkdir "${tests}/dev-env" + repado using "${tests}/dev-env" + + * If not already installed in dev-env, add repkit to the dev environment + cap which repkit + if _rc == 111 ssc install repkit + + /* TODO: Uncomment once adodown is published + * If not already installed, add adodown to the dev environment + cap which adodown + if _rc == 111 ssc install adodown + */ + + * Install the latest version of repkit to the dev environment + net uninstall repkit + net install repkit, from("${src}") replace + + * Test 1 - this should all work without error + local prj "reproot-test-1" + local pref "test1_" + + * Reset globals + global `pref'clone "" + global `pref'data "" + + * Run command + reproot, project("`prj'") roots("clone") prefix("`pref'") + reproot, project("`prj'") roots("clone data") prefix("`pref'") + reproot, project("`prj'") roots("clone data") prefix("`pref'") + + * Test 2 - this project has two clone roots + local prj "reproot-test-2" + local pref "test2_" + + * Reset globals + global `pref'clone "" + + * Run command - expected error as two roots named clone exist + cap reproot, project("`prj'") roots("clone") prefix("`pref'") + di _rc + if !(_rc == 99) reproot, project("`prj'") roots("clone") prefix("`pref'") clear diff --git a/src/tests/reproot_parse/reproot_parse.do b/src/tests/reproot_parse/reproot_parse.do new file mode 100644 index 0000000..cc63b94 --- /dev/null
+++ b/src/tests/reproot_parse/reproot_parse.do @@ -0,0 +1,46 @@ + + cap which repkit + if _rc == 111 { + di as error `"{pstd}This test file uses features from the package {browse "https://dime-worldbank.github.io/repkit/":repkit}. Click {stata ssc install repkit} to install it and run this file again.{p_end}"' + } + + ************************************* + * Set root path + * TODO: Update with reprun once published + + di "Your username: `c(username)'" + * Set each user's root path + if "`c(username)'" == "`c(username)'" { + global root "C:/Users/wb462869/github//repkit" + } + * Set all other users' root paths in this format + if "`c(username)'" == "" { + global root "" + } + + * Set global to the test folder + global src "${root}/src" + global tests "${src}/tests" + + * Set up a dev environment for testing locally + cap mkdir "${tests}/dev-env" + repado using "${tests}/dev-env" + + * If not already installed in dev-env, add repkit to the dev environment + cap which repkit + if _rc == 111 ssc install repkit + + /* TODO: Uncomment once adodown is published + * If not already installed, add adodown to the dev environment + cap which adodown + if _rc == 111 ssc install adodown + */ + + * Install the latest version of repkit to the dev environment + cap net uninstall repkit + net install repkit, from("${src}") replace + + * Test basic case of the command reproot_parse + reproot_parse + + // Add more tests here... diff --git a/src/tests/reproot_search/reproot_search.do b/src/tests/reproot_search/reproot_search.do new file mode 100644 index 0000000..11fdc39 --- /dev/null +++ b/src/tests/reproot_search/reproot_search.do @@ -0,0 +1,46 @@ + + cap which repkit + if _rc == 111 { + di as error `"{pstd}This test file uses features from the package {browse "https://dime-worldbank.github.io/repkit/":repkit}.
Click {stata ssc install repkit} to install it and run this file again.{p_end}"' + } + + ************************************* + * Set root path + * TODO: Update with reprun once published + + di "Your username: `c(username)'" + * Set each user's root path + if "`c(username)'" == "`c(username)'" { + global root "C:/Users/wb462869/github//repkit" + } + * Set all other users' root paths in this format + if "`c(username)'" == "" { + global root "" + } + + * Set global to the test folder + global src "${root}/src" + global tests "${src}/tests" + + * Set up a dev environment for testing locally + cap mkdir "${tests}/dev-env" + repado using "${tests}/dev-env" + + * If not already installed in dev-env, add repkit to the dev environment + cap which repkit + if _rc == 111 ssc install repkit + + /* TODO: Uncomment once adodown is published + * If not already installed, add adodown to the dev environment + cap which adodown + if _rc == 111 ssc install adodown + */ + + * Install the latest version of repkit to the dev environment + cap net uninstall repkit + net install repkit, from("${src}") replace + + * Test basic case of the command reproot_search + reproot_search + + // Add more tests here... diff --git a/src/tests/reprun/reprun.do b/src/tests/reprun/reprun.do index 59db63a..2747441 100644 --- a/src/tests/reprun/reprun.do +++ b/src/tests/reprun/reprun.do @@ -22,7 +22,9 @@ * Install the version of this package in * the plus-ado folder in the test folder - repado "${test_fldr}/dev-env/" + + cap mkdir "${test_fldr}/dev-env/" + repado using "${test_fldr}/dev-env/" cap net uninstall repkit net install repkit, from("${src_fldr}") replace