workshop_day2

Instructions for the sequence phylogeny activity:

1. Take some time to get to know your data.
2. Count how many sequences (DNA seqs from different organisms) you have in each fasta file.
3. Combine all of the sequences into one large fasta file (the all_seqs file).
4. Make a subdirectory in the parent directory called "raw data".
5. Move your original fastas into the raw data folder (do this so you keep the raw data safe while running other commands).
6. Use the combined fasta file to grab only the header line of each sequence and save them as a list of the fasta headers (names). Name this file "headers.txt".
7. Now, count how many sequences made it into your combined fasta to make sure all of them did in fact get added. (You should get the same total you got in step 2.)
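The steps above can be sketched in bash roughly as follows. This is only one possible approach, run on toy data in a scratch directory: the file names a.fasta and b.fasta, the .fasta extension, and the name all_seqs.fasta are assumptions for illustration, so swap in your own files. It relies on the fact that every fasta record starts with a ">" header line.

```shell
#!/usr/bin/env bash
# Sketch on toy data; a.fasta, b.fasta and all_seqs.fasta are hypothetical names.
set -euo pipefail

# Work in a scratch directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two toy fasta files standing in for the workshop data.
printf '>cat\nACGT\n>dog\nTTGA\n' > a.fasta
printf '>cow\nGGCC\n' > b.fasta

# Count sequences per file: each record has exactly one ">" header line.
grep -c '^>' a.fasta b.fasta

# Combine all sequences into one large fasta file.
cat a.fasta b.fasta > all_seqs.fasta

# Make the raw data folder (the quotes are needed because of the space)
# and move the originals into it.
mkdir "raw data"
mv a.fasta b.fasta "raw data"/

# Grab only the header lines from the combined file.
grep '^>' all_seqs.fasta > headers.txt

# Count sequences in the combined file and compare with the originals.
combined=$(grep -c '^>' all_seqs.fasta)
original=$(cat "raw data"/*.fasta | grep -c '^>')
echo "combined: $combined, original: $original"
```

If the two numbers printed on the last line differ, some sequences did not make it into the combined file.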

Append this count to the end of the headers.txt file. Make sure you don't overwrite the headers already in that file.
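A minimal sketch of the append, again on toy data with an assumed all_seqs.fasta file name. The key point is the redirection operator: ">>" adds to the end of a file, while a single ">" would replace its contents.

```shell
#!/usr/bin/env bash
# Sketch on toy data; all_seqs.fasta is a hypothetical file name.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Toy combined fasta and its header list.
printf '>cat\nACGT\n>dog\nTTGA\n>cow\nGGCC\n' > all_seqs.fasta
grep '^>' all_seqs.fasta > headers.txt

# ">>" appends the count to headers.txt; ">" here would wipe the headers.
grep -c '^>' all_seqs.fasta >> headers.txt

# Show the result: the three headers followed by the count.
cat headers.txt
```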

Go to https://www.genome.jp/tools-bin/mafft and use the alignment tool to align your sequences (upload the all_seqs file), then generate a phylogenetic tree.

Challenge: Use the challenge seqs folder. Working from that parent directory, write one bash script that goes into each subfolder and counts the number of sequences in each fasta file in that folder. Then save the output for each subfolder into a file in the parent directory. (Hint: use a for loop.)
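One way the challenge could be approached is sketched below. It builds a toy "challenge seqs" folder first so that it runs anywhere; the subfolder names set1 and set2, the .fasta extension, and the output naming scheme (subfolder name plus _counts.txt) are all assumptions, not part of the workshop data. The -H option, available in GNU and BSD grep, prints the file name next to each count even when a folder holds a single file.

```shell
#!/usr/bin/env bash
# Sketch on toy data; set1, set2 and the *_counts.txt names are hypothetical.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Toy layout standing in for the challenge seqs folder.
mkdir -p "challenge seqs/set1" "challenge seqs/set2"
printf '>s1\nAC\n>s2\nGT\n' > "challenge seqs/set1/x.fasta"
printf '>s3\nTT\n' > "challenge seqs/set2/y.fasta"

# Run from the parent directory, as the challenge asks.
cd "challenge seqs"

# "*/" matches only subdirectories; each gets its own output file here.
for dir in */; do
    name=${dir%/}   # strip the trailing slash, e.g. "set1/" becomes "set1"
    # Count ">" header lines in every fasta file of this subfolder.
    grep -c -H '^>' "$dir"*.fasta > "${name}_counts.txt"
done

cat set1_counts.txt set2_counts.txt
```

Note that this sketch assumes every subfolder contains at least one .fasta file with at least one sequence; an empty subfolder would make grep fail and stop the script because of `set -e`.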

Folders and files in this repository:

challenge seqs
seqs
README.md
animals.txt
instructions.txt

callmcgovern/workshop_day2

Folders and files

Latest commit

History

Repository files navigation

workshop_day2

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages