change added sample attributes format
komstat committed May 17, 2024
1 parent 4d50850 commit 8df8fe6
Showing 4 changed files with 98 additions and 27 deletions.
Binary file modified JGA_metadata.xlsx
Binary file not shown.
14 changes: 10 additions & 4 deletions README.md
@@ -3,8 +3,8 @@
## 日本語

* Bioinformation and DDBJ Center
* Release date: 2024-02-14
* version: v2.6
* Release date: 2024-05-17
* version: v2.7

A tool that generates and checks metadata XML files for submission to the databases of the [Bioinformation and DDBJ Center](https://www.ddbj.nig.ac.jp/index-e.html).
* [DDBJ Sequence Read Archive (DRA)](https://www.ddbj.nig.ac.jp/dra/submission.html): Excel file and scripts for generating and checking Submission, Experiment, Run and Analysis (optional) XML files
@@ -13,6 +13,7 @@

## History

* 2024-05-17: v2.7 Changed the description format for added JGA Sample attributes
* 2024-02-14: v2.6 DRA xsd 1.6.0
* 2024-01-31: v2.5 Fixed templates and sample data
* 2023-12-21: v2.4 Changed center name
@@ -215,6 +216,8 @@ singularity exec excel2xml.simg excel2xml_jga -j JSUB999999 example/JSUB999999_j
* JSUB999999_Study.xml
* JSUB999999_Submission.xml

If the added Sample attributes are written in the v2.6 or earlier format (e.g. age:37; collection_date:2015-03-05), generate the XML with the -r option.

Check the XML files by specifying the JGA Submission ID. The XML files must be placed directly under the submission-excel2xml directory. The JGA xsd files are downloaded to /opt/submission-excel2xml/ inside the container during the build.
```
singularity exec excel2xml.simg validate_meta_jga -j JSUB999999
@@ -243,6 +246,8 @@ sudo docker run -v /path_to_excel_directory:/data -w /data excel2xml excel2xml_j
* JSUB999999_Study.xml
* JSUB999999_Submission.xml

If the added Sample attributes are written in the v2.6 or earlier format (e.g. age:37; collection_date:2015-03-05), generate the XML with the -r option.

Check the XML files by specifying the Submission ID. The XML files must be placed directly under the submission-excel2xml directory. The JGA xsd files are downloaded to /opt/submission-excel2xml/ inside the container during the build.
```
sudo docker run -v /path_to_excel_directory:/data -w /data excel2xml validate_meta_jga -j JSUB999999
@@ -317,8 +322,8 @@ TBD
## English

* Bioinformation and DDBJ Center
* release: 2024-02-14
* version: v2.6
* release: 2024-05-17
* version: v2.7

These files are Excel, container images and tools for generation and validation of metadata XML files for databases of [Bioinformation and DDBJ Center](https://www.ddbj.nig.ac.jp/index-e.html).
* [DDBJ Sequence Read Archive (DRA)](https://www.ddbj.nig.ac.jp/dra/submission-e.html): generate and check Submission, Experiment and Run XML files.
@@ -327,6 +332,7 @@ These files are Excel, container images and tools for generation and validation

## History

* 2024-05-17: v2.7 Changed the description format for added Sample attributes
* 2024-02-14: v2.6 DRA xsd 1.6.0
* 2024-01-31: v2.5 Fixing templates and sample data
* 2023-12-21: v2.4 center name changes
Binary file modified example/JSUB999999_jga_metadata.xlsx
Binary file not shown.
111 changes: 88 additions & 23 deletions exe/excel2xml_jga
@@ -11,15 +11,17 @@ require 'optparse'
#

# Update history
# 2022-12-23 change handling of submission date
# 2024-05-17 Change the way of describing Sample attributes
# 2022-12-23 Change handling of submission date
# 2022-12-22 AGD
# 2022-12-14 publicly released
# 2022-12-14 Publicly released

### Options
account = ""
submission_no = ""
submission_id = ""
study_accession = ""
previous_sample_attrs_flag = false
OptionParser.new{|opt|

opt.on('-j [JSUB ID]', 'JSUB/ASUB submission ID'){|v|
@@ -34,6 +36,11 @@ OptionParser.new{|opt|
puts "JGA/AGD Study Accession: #{v}"
}

opt.on('-r', 'flag for the previous added Sample attributes format, internal use'){
previous_sample_attrs_flag = true
puts "Previous added Sample attributes format: #{previous_sample_attrs_flag}"
}

begin
opt.parse!
rescue
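The `-r` switch added in this hunk can be exercised in isolation. The following is a minimal sketch, not the script's actual structure: `parse_options` and the `options` hash are hypothetical names introduced here for illustration, and the `-r` description string is shortened. Only `OptionParser#on` and `OptionParser#parse!` from Ruby's standard library are used.

```ruby
require 'optparse'

# Hypothetical helper (not in the commit): parse argv and return the
# option values as a hash instead of setting top-level variables.
def parse_options(argv)
  options = { submission_id: "", previous_sample_attrs: false }

  OptionParser.new { |opt|
    # Same switch definition as in the script: the ID is a placed, optional argument.
    opt.on('-j [JSUB ID]', 'JSUB/ASUB submission ID') { |v|
      options[:submission_id] = v.to_s
    }

    # A switch without an argument: its presence alone turns the flag on.
    opt.on('-r', 'use the previous added Sample attributes format') {
      options[:previous_sample_attrs] = true
    }

    opt.parse!(argv)
  }

  options
end
```

Calling `parse_options(['-j', 'JSUB999999', '-r'])` would select the legacy attribute handling, while omitting `-r` leaves the flag at its default of `false`.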
@@ -306,21 +313,62 @@ end
# Sample
samples_a = Array.new
sample_aliases_a = Array.new
added_attr_name_a = Array.new
fixed_attr_name_a = Array.new
i = 0 # array index number

for num, line in sample_a

# Get the names of the added attributes
if num == 2

unless previous_sample_attrs_flag

raise "Added Sample attributes in the previous format. Use -r option." if line[11] == "Attributes"

line[1..10].each{|attr_name|
fixed_attr_name_a.push(attr_name)
}

line[11..-1].each{|attr_name|
# Keep one entry per column (nil for a blank header) so indexes match the data cells
name = attr_name.to_s.strip
if name.empty?
added_attr_name_a.push(nil)
else
raise "Added attribute is included in fixed attributes: #{name}" if fixed_attr_name_a.map{|fixed| fixed.to_s.downcase}.include?(name.downcase)
added_attr_name_a.push(name)
end
}

end

end

if /^Sample-\d{1,6}/ =~ line[0]
# alias

added_attr_h = Hash.new

# alias
sample_number = line[0].split("-")[1].to_i
sample_alias = submission_id + "_Sample_" + sprintf("%06d", line[0].split("-")[1].to_i)
sample_aliases_a.push(sample_alias)

# If Title is present
# If Title is present. Added attributes are stored in a hash
if line[4]
samples_a.push([sample_alias, line[1], line[2], line[3], line[4], line[5], line[6], line[7], line[8], line[9], line[10], line[11]])
end


if previous_sample_attrs_flag
samples_a.push([sample_alias, line[1], line[2], line[3], line[4], line[5], line[6], line[7], line[8], line[9], line[10], line[11]])
else

# If there are added attributes
if line[11..-1].size > 0
line[11..-1].each_with_index{|attr, idx|
added_attr_h.store(added_attr_name_a[idx], attr.to_s.strip) if added_attr_name_a[idx] && attr && attr.to_s
}
end

samples_a.push([sample_alias, line[1], line[2], line[3], line[4], line[5], line[6], line[7], line[8], line[9], line[10], added_attr_h])

end

end # if line[4]

end

end
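The new Sample handling in this hunk reads attribute names from the header row (columns 11 onward) and pairs them with the cells of each data row. A minimal standalone sketch of that idea follows. The names `ADDED_ATTR_START`, `added_attr_names` and `added_attrs` are hypothetical and do not appear in the commit. As an assumption of this sketch, blank header cells are kept as `nil` placeholders so that name and value indexes stay aligned, and empty values are skipped.

```ruby
# Hypothetical constant: index of the first added-attribute column,
# matching the line[11..-1] slices used in the diff.
ADDED_ATTR_START = 11

# Collect added-attribute names from the header row, one entry per column.
# Raises when an added name duplicates a fixed column name (case-insensitive).
def added_attr_names(header_row)
  fixed = header_row[1...ADDED_ATTR_START].compact.map { |n| n.to_s.strip.downcase }

  (header_row[ADDED_ATTR_START..-1] || []).map { |name|
    clean = name.to_s.strip
    next nil if clean.empty? # blank header cell: keep a placeholder

    if fixed.include?(clean.downcase)
      raise ArgumentError, "Added attribute is included in fixed attributes: #{clean}"
    end
    clean
  }
end

# Pair the names with the cells of one data row; skip blank names and values.
def added_attrs(names, row)
  attrs = {}
  (row[ADDED_ATTR_START..-1] || []).each_with_index { |value, idx|
    name = names[idx]
    next if name.nil? || value.nil? || value.to_s.strip.empty?
    attrs[name] = value.to_s.strip
  }
  attrs
end
```

With a header whose added columns are `age`, a blank cell, and `collection_date`, a data row carrying `37`, some stray text, and `2015-03-05` in those columns would yield a hash with only the `age` and `collection_date` entries.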
@@ -916,20 +964,37 @@ sample_f.puts xml_sample.SAMPLE_SET{|sample_set|
}
end

# phenotypes
if sam[11] && sam[11].split(";")

sam[11].split(";").each{|phenotype|
sample_attributes.SAMPLE_ATTRIBUTE{|sample_attribute|
pp phenotype if phenotype.strip.split(":")[0].nil?

sample_attribute.TAG(phenotype.strip.split(":")[0].strip)
sample_attribute.VALUE(phenotype.strip.split(":")[1].strip)
}
}

end

if previous_sample_attrs_flag

if sam[11] && sam[11].split(";")

sam[11].split(";").each{|phenotype|
sample_attributes.SAMPLE_ATTRIBUTE{|sample_attribute|
pp phenotype if phenotype.strip.split(":")[0].nil?

sample_attribute.TAG(phenotype.strip.split(":")[0].strip)
sample_attribute.VALUE(phenotype.strip.split(":")[1].strip)
}
}

end

else

# added attributes (hash of attribute name => value built while reading the Sample sheet)
sample_added_attr_h = sam[11]
unless sample_added_attr_h.empty?
sample_added_attr_h.each{|attr_name, attr_value|
sample_attributes.SAMPLE_ATTRIBUTE{|sample_attribute|
sample_attribute.TAG(attr_name)
sample_attribute.VALUE(attr_value)
}
}
end

end

}

}
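Both branches of this hunk end up emitting TAG/VALUE pairs: the `-r` branch parses a single `tag:value; tag:value` cell, while the new branch iterates a hash. The sketch below shows the legacy parsing on its own, without the XML builder. `parse_legacy_attrs` is a hypothetical helper name. Unlike the diff, which takes `split(":")[1]`, this sketch splits on the first colon only, so a value that itself contains a colon is kept whole. That is a deliberate difference, not the committed behavior.

```ruby
# Hypothetical helper: turn "age:37; collection_date:2015-03-05"
# into [["age", "37"], ["collection_date", "2015-03-05"]].
def parse_legacy_attrs(text)
  return [] if text.nil?

  pairs = text.to_s.split(";").map { |item|
    # Split on the first colon only so values may contain ":".
    tag, value = item.split(":", 2).map(&:strip)
    next nil if tag.nil? || tag.empty? || value.nil?
    [tag, value]
  }

  pairs.compact # drop items that had no usable tag or value
end

# The new format already holds a name => value hash, so the pairs are
# simply its entries in insertion order.
def hash_attr_pairs(attr_hash)
  attr_hash.to_a
end
```

For the README example, the legacy cell `age:37; collection_date:2015-03-05` and the hash built from separate `age` and `collection_date` columns give the same list of pairs, which is why both branches can feed the same SAMPLE_ATTRIBUTE output.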