- `Rep_content_dat_prep.R`: This script takes the Repeat_content_sum.csv output files for all lines and combines them into a single data set with more consistent columns. The output data set is called `NAM_array_coords.tsv`.
- `Synteny_Knobs.R`: This script identifies knob synteny by comparing knob positions based on orthologous genes. In the version for the main figure, only knobs that are syntenic to B73 knobs >= 100 kb are identified. In the supplemental version, which compares sequence-defined arrays with classical cytological knobs, the orthologous gene sets are limited to orthologs present in all lines. When structural variants are identified in the coordinates, the order in B73 is treated as the null, and the coordinates are adjusted to match B73.
- `NAM_plot_Supp.R_linux.sh`: This script modifies the GFF files of repeats and TEs, subsetting them to the elements that fall within repeat arrays to reduce the memory and time required for further analysis. This is prep for NAM_plot_Supp.txt.
- `NAM_plot_Supp.R`: This script generates the supplemental figure of the largest knobs with TE content. It takes the output from `NAM_plot_Supp.R_linux.sh` and the `NAM_array_coords.tsv` file as inputs. `NAM_array_coords.tsv` is the output from `Rep_content_dat_prep.R`. `NAM_array_coords_annotation_cyt_search_edit.csv` is the data from above with hand annotations checking for a relationship with cytological knobs evident in FISH imagery.
- All scripts are in `scripts` folders, and data are in the assets folder.
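
The GFF subsetting step described for the linux.sh script above can be illustrated with a minimal sketch. This is not the repository's actual script: the function name `subset_gff_to_arrays`, the three-column array file layout (sequence, start, end, tab-separated, no header), and the choice to keep only features fully contained in an array are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): keep only GFF features
# that lie entirely inside one of the given array intervals.
#   $1 = tab-separated array file: seqid, start, end (assumed layout)
#   $2 = GFF file (columns 1, 4, 5 are seqid, start, end)
subset_gff_to_arrays() {
    awk -F'\t' '
        # First file: store every array interval, grouped by sequence.
        NR == FNR {
            n[$1]++
            s[$1, n[$1]] = $2
            e[$1, n[$1]] = $3
            next
        }
        # Second file: skip GFF comment and directive lines.
        /^#/ { next }
        # Print the feature if an array on the same sequence contains it.
        {
            for (i = 1; i <= n[$1]; i++) {
                if ($4 + 0 >= s[$1, i] + 0 && $5 + 0 <= e[$1, i] + 0) {
                    print
                    break
                }
            }
        }
    ' "$1" "$2"
}
```

It could be called as `subset_gff_to_arrays arrays.tsv repeats.gff > repeats_in_arrays.gff`, where the file names are placeholders. Scanning every array per feature is simple but slow for very large inputs, and the `NR == FNR` idiom assumes the array file is not empty; a dedicated interval tool would be a reasonable alternative.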