
technical_documentation

Ulf Kronman edited this page Mar 14, 2017 · 2 revisions

Technical documentation for Open APC Sweden

Ulf Kronman, 2017-02-25

Swedish pre-processing

Run the Python script /python/se/clean_and_merge_apc_files.py -l sv_SE.UTF-8

The script reads the list of files to process from ../data/apc_file_list.txt

Function:

  • Merges APC files
  • Changes SANT/FALSKT to TRUE/FALSE
  • Changes comma (,) decimal delimiter to period (.) decimal delimiter
  • Removes the whitespace that Excel inserts as a thousands separator in large numbers (1 234 becomes 1234)

Result: /data/apc_se_merged.tsv
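
The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in clean_and_merge_apc_files.py: the function names clean_cell and merge_tables, the column names in the usage note, and the assumption that every input file has the same header row are all hypothetical.

```python
import re

# Swedish Excel exports write booleans as SANT/FALSKT.
BOOLEAN_MAP = {"SANT": "TRUE", "FALSKT": "FALSE"}

# A number with optional whitespace thousands grouping (space, no-break
# space or narrow no-break space) and an optional decimal part.
NUMBER_PATTERN = re.compile(
    r"\d{1,3}(?:[ \u00a0\u202f]\d{3})+(?:[.,]\d+)?|\d+(?:[.,]\d+)?"
)


def clean_cell(value):
    """Normalise one cell; non-boolean, non-numeric text is returned unchanged."""
    text = value.strip()
    if text.upper() in BOOLEAN_MAP:
        return BOOLEAN_MAP[text.upper()]
    if NUMBER_PATTERN.fullmatch(text):
        # Drop the grouping whitespace, then switch decimal comma to period.
        compact = re.sub(r"[ \u00a0\u202f]", "", text)
        return compact.replace(",", ".")
    return value


def merge_tables(tables):
    """Merge several tables (lists of rows) into one, keeping a single header.

    Assumption: every table starts with the same header row.
    """
    merged = []
    for rows in tables:
        if not rows:
            continue
        header, *body = rows
        if not merged:
            merged.append(list(header))
        merged.extend([clean_cell(cell) for cell in row] for row in body)
    return merged
```

For example, merging two one-row tables with the illustrative header institution, euro, is_hybrid turns the cells "1 234,50" and "SANT" into "1234.50" and "TRUE" and leaves ordinary text such as an institution name untouched.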

Main enrichment process

Run the Python script /python/se/apc_csv_processing.py -l sv_SE.UTF-8 ../data/apc_se_merged.tsv

Uses: /data/apc_se_merged.tsv

Result: /python/out.csv

Swedish post-processing

Run the Python script /python/se/normalise_and_copy.py

Uses: /python/out.csv

Result: /data/apc_se.csv
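
The three steps have to run in the order given on this page, since each one reads the file the previous one writes. A minimal sketch of that order as a list of commands is shown below. The helper build_pipeline_commands and its repo_root parameter are hypothetical, and the working directory each script expects is an assumption; the script names, the -l option and the ../data/apc_se_merged.tsv argument are taken from the steps above.

```python
from pathlib import PurePosixPath


def build_pipeline_commands(locale_name, repo_root="."):
    """Return the three processing commands in the order they must run.

    locale_name is the value passed to -l (the Swedish locale).
    repo_root is a hypothetical parameter: the checkout directory that
    contains the python/ and data/ folders.
    """
    scripts = PurePosixPath(repo_root) / "python" / "se"
    return [
        # 1. Swedish pre-processing: writes data/apc_se_merged.tsv
        ["python", str(scripts / "clean_and_merge_apc_files.py"),
         "-l", locale_name],
        # 2. Main enrichment: reads the merged file, writes python/out.csv
        ["python", str(scripts / "apc_csv_processing.py"),
         "-l", locale_name, "../data/apc_se_merged.tsv"],
        # 3. Swedish post-processing: reads out.csv, writes data/apc_se.csv
        ["python", str(scripts / "normalise_and_copy.py")],
    ]
```

Each returned list can be passed to subprocess.run(command, check=True) so that a failing step stops the run before the next step reads a stale or missing file.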

Analysis
