\documentclass[11pt,a4paper]{ivoa}
\input tthdefs
\input gitmeta
\usepackage{todonotes}
\usepackage{textcomp}
\newcommand{\xsamsref}[1]{\textsuperscript{\ref{#1}}}
\lstloadlanguages{SQL,XML}
\lstset{flexiblecolumns=true,numberstyle=\small,showstringspaces=False,
identifierstyle=\texttt}
\definecolor{lightgray}{rgb}{0.8,0.8,0.8}
\def\rowsep{\noalign{\vspace{2pt}}}
% since we don't do a lot of UCDs here and XSAMS references share a lot
% of the typesetting issues of UCDs, we'll re-use the ucd macro for now
% for them:
\let\xsref=\ucd
\newenvironment{xsamsbridge}
{\begin{list}{}{\setlength\labelwidth{15em}%
\setlength\labelsep{1em}%
\setlength\leftmargin{5em}%
}}
{\end{list}}
\title{LineTAP: IVOA Relational Model for Spectral Lines}
% see ivoatexDoc for what group names to use here
\ivoagroup{DAL}
\author{Castro Neves, M.}
\author{Moreau, N.}
\author{Demleitner, M.}
\editor{Margarida Castro Neves, Nicolas Moreau}
\previousversion[https://www.ivoa.net/documents/LineTAP/20230323]{WD-1.0-2023-03-23}
\begin{document}
\begin{abstract}
This document proposes a relational schema to describe spectral line
transitions that can be queried using the TAP protocol. Its purpose is
to derive from the VAMDC data model a simplified way to query
spectral line databases from VO applications. The underlying model is
rooted in the widely-deployed VAMDC, and the intent is that at least the
atomic and molecular data from VAMDC can easily be re-published using
LineTAP.
\end{abstract}
%\section*{Acknowledgments}
\section*{Conformance-related definitions}
The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
``OPTIONAL'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.
The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{http://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.
\section{Introduction}
The Simple Line Access Protocol SLAP \citep{2010ivoa.specQ1209O}
currently is the VO
recommendation for querying spectral line collections.
It is based on the Simple Spectral Line Data
Model SSLDM \citep{2010ivoa.spec.1209O}, which defines the underlying
data model.
As in SSLDM, a \emph{spectral line} in this document is considered to
be the result of a (radiative) transition between two energy levels.
More than ten years after the protocol's definition, there are still
very few SLAP services registered in the VO.
On the other hand, the Virtual Atomic and Molecular Data Center
VAMDC \citep{atoms8040076} offers a great amount of spectral line
data. Making this data available to VO clients without major extra
tooling is certainly desirable.
While the query part of VAMDC clearly betrays its origins in VO
standards and thus might readily be integrated into the VO protocol
stack, the service output comes in a very comprehensive derivative of
the XML Schema for Atomic, Molecular and Solid Data XSAMS
\citep{XSAMS:Docs}; in particular, its tree-like nature complicates
casual use. In addition, many interesting use cases can already be
satisfied with a simple relational mapping of XSAMS.
This document proposes LineTAP, a simple way to access spectral line
data through a VO service employing such a simplified relational
mapping. The resulting table schema is presented in
section~\ref{sect:quantities}, while the mapping between our columns and the
VAMDC-XSAMS Data Model is given in section~\ref{sect:mapping}.
During the development of the standard, a major problem in molecular
spectroscopy turned out to be species nomenclature. The core LineTAP
table sidesteps this problem by identifying species using IUPAC standard
InChIs, a choice unpopular with many practitioners. To facilitate the
use of colloquial species designations (``ethyl alcohol''), this
specification also defines a \textit{species table} associating common
names and sum formulas with InChIs in section \ref{sect:speciestable}.
When accessed using the Table Access Protocol TAP
\citep{2019ivoa.spec.0927D}, the tables can be queried using the
expressive SQL-derived query language ADQL, while query results are
available in the VOTable format, easily readable by VO client
applications. Line databases accessible in this way can be registered
in the VO Registry. The detailed rules for this registration, and
recommendations for how to discover LineTAP services, are given in
section~\ref{sect:regmatters}.
\subsection{Role within the VO Architecture}
\begin{figure}
\centering
% As of ivoatex 1.2, the architecture diagram is generated by ivoatex in
% SVG; copy ivoatex/archdiag-full.xml to role_diagram.xml and throw out
% all lines not relevant to your standard.
% Notes don't generally need this. If you don't copy role_diagram.xml,
% you must remove role_diagram.pdf from SOURCES in the Makefile.
\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
\caption{Architecture diagram for LineTAP}
\label{fig:archdiag}
\end{figure}
Fig.~\ref{fig:archdiag} shows the role this document plays within the
IVOA architecture \citep{2021ivoa.spec.1101D}. It uses TAP
\citep{2019ivoa.spec.0927D} for the communication of queries to the
server and the results back to the client, where the queries are written
in ADQL \citep{2008ivoa.spec.1030O}. To make LineTAP tables
discoverable, they are registered using VODataService's
\citep{2021ivoa.spec.1102D} tablesets, and we give recipes on how to
discover them using RegTAP \citep{2019ivoa.spec.1011D}.
\section{Use Cases}
\label{sect:use-cases}
LineTAP really only has a single use case, the discovery of spectral
lines for identification purposes. To structure standards development,
we discuss some situations specifically:
\subsection{Identifying a Single Line}
A user sees a feature in a spectrum with known (and reliable) spectral
calibration and now wants to know what might possibly be responsible for
it. Hence, they query a narrow spectral range and retrieve all known
lines from all services.
To select which of the candidate lines are plausible matches, users
would inspect line metadata such as the originating atom or molecule, the
ionisation state, and perhaps oscillator strengths.
\subsection{Getting Properties of Well-Known Lines}
A user wants to display, say, the Lyman series over a plot of a
spectrum. Hence, a client needs to discover which service holds such
data, select the appropriate records -- presumably by their properties,
perhaps even by their name --, and retrieve them. If multiple services
hold the desired data, it might need to reconcile differing
specifications.
\subsection{Retrieving Spectral Lines for Cross-Identification}
Users may have various reasons to retrieve a larger number of spectral
lines:
\begin{itemize}
\item When analysing a given spectrum, selecting spectral lines that may
fit the ones in the spectrum, for instance to establish the source's
chemistry or physical state. Depending on the prior knowledge of the
source, they will want to constrain the matches to specific species in
specific ionisation (or even excitation) states.
\item When estimating the redshift of an object, features found in the
spectrum need to be matched to the rest wavelengths.
\item When computing theoretical spectra, a comparison to the (observed)
ground truth is desirable.
\end{itemize}
The challenge in all these cases is that displaying all known lines
is impractical due to the sheer volume of the data,
and it would not help users in any way.
Hence, the client needs to have some idea of which lines can be expected
to be strong given the physics of the emission's source region.
Selecting the lines before retrieval is a significant optimisation in
this case, as in wider spectra at least hundreds of thousands of lines
will be within the spectral range, while it probably rarely makes sense
to plot more than a hundred or so. Hence, careful selection of lines
can reduce the volume of data transferred and processed by the client by
several orders of magnitude.
To make good on this promise, the tables need to be queryable such that
lines suspected to be strong for some combination of chemistry,
temperature, and pressure can be selected with some accuracy.
\subsection{Finding spectral lines for specific species}
Using a mass spectrometer, researchers find a molecule with the
sum formula C$_{16}$H$_{10}$ in a comet particle. They now want to
figure out whether any line in the spectrum of the coma of the parent
object corresponds to some molecule with that sum formula.
Conversely, a researcher may want to find lines of methane or perhaps
even methane with one hydrogen atom replaced by deuterium.
\subsection{Credit}
To provide an incentive to contribute to the global
repository of line data, it should be as simple as possible for users to
give credit to the contributors of line data.
\subsection{Resolution of Molecule Designation}
\label{uc:resolution}
A researcher wants to find lines for a molecule that they have long been
calling ``Methyl Mercaptan'' or designating by a pseudo-structural formula
like \verb|CH3SHv=0|.
\subsection{Non-Use Cases}
This specification differs from VAMDC in that it does not attempt to
cover all possible uses of spectral line data. In particular, no
attempt is made to
\begin{itemize}
\item Publish sufficient information to feed sophisticated,
high-precision atmosphere models.
\item Deal with solid-state spectroscopy.
\item Publish lines of non-electromagnetic messengers.
\end{itemize}
\begin{table}[hpt]
\hskip -0.05\linewidth
\begin{tabular}{p{0.43\linewidth}cp{0.5\linewidth}}
\sptablerule
\textbf{Name [Unit]} \ucd{UCD}&\textbf{Type}&\textbf{Description}\\
\sptablerule
% GENERATED: python3 make-columns-table.py
\texttt{title} \hfil\break\ucd{meta.id} & \textbf{text} & \raggedright Human-readable line designation.\tabularnewline
\rowsep
\texttt{vacuum\_wavelength} [Å] \hfil\break\ucd{em.wl} & \textbf{float} & \raggedright Vacuum wavelength of the transition\tabularnewline
\rowsep
\texttt{vacuum\_wavelength\_error} [Å] \hfil\break\ucd{stat.error;em.wl} & float & \raggedright Total error in vacuum\_wavelength\tabularnewline
\rowsep
\texttt{method} \hfil\break\ucd{meta.code.class} & text & \raggedright Method the wavelength was obtained with (XSAMS controlled vocabulary)\tabularnewline
\rowsep
\texttt{element} \hfil\break\ucd{phys.atmol.element} & text & \raggedright Element name for atomic transitions, NULL otherwise.\tabularnewline
\rowsep
\texttt{ion\_charge} \hfil\break\ucd{phys.electCharge} & integer & \raggedright Total charge (ionisation level) of the emitting particle.\tabularnewline
\rowsep
\texttt{mass\_number} \hfil\break\ucd{phys.atmol.weight} & integer & \raggedright Number of nucleons in the atom or molecule\tabularnewline
\rowsep
\texttt{upper\_energy} [J] \hfil\break\ucd{phys.energy;phys.atmol.initial} & float & \raggedright Energy of the upper state\tabularnewline
\rowsep
\texttt{lower\_energy} [J] \hfil\break\ucd{phys.energy;phys.atmol.final} & float & \raggedright Energy of the lower state\tabularnewline
\rowsep
\texttt{inchi} \hfil\break\ucd{meta.id;phys.atmol;meta.main} & text & \raggedright International Chemical Identifier InChI.\tabularnewline
\rowsep
\texttt{inchikey} \hfil\break\ucd{meta.id;phys.atmol} & text & \raggedright The InChIKey (hash) generated from inchi.\tabularnewline
\rowsep
\texttt{einstein\_a} \hfil\break\ucd{phys.atmol.transProb} & float & \raggedright Einstein A coefficient of the radiative transition.\tabularnewline
\rowsep
\texttt{xsams\_uri} \hfil\break\ucd{meta.ref} & text & \raggedright A URI for a full XSAMS description of this line.\tabularnewline
\rowsep
\texttt{line\_reference} \hfil\break\ucd{meta.ref} & \textbf{text} & \raggedright Reference to the source of the line data; this could be a bibcode, a DOI, or a plain URI.\tabularnewline
% /GENERATED
\sptablerule
\end{tabular}
\caption{The columns that make up the LineTAP data model. Column names
and units are mandatory, columns with types in \textbf{bold face} must
not be NULL.}
\label{tab:ltcols}
\end{table}
\section{Spectral Lines Table}\label{sect:quantities}
Table~\ref{tab:ltcols} gives the columns that make up the LineTAP
relational model. Implementations MUST have all columns given in this
table, MUST exactly declare the units as given there, and MUST NOT have
NULL values in columns with types printed in bold face in the table.
Implementations are free to adapt the UCDs and descriptions given in the
table, but they SHOULD give UCDs and descriptions for all columns.
Column types can be adjusted to local needs; in the
above table, ``float'' is to be understood as any suitable
floating-point number, ``integer'' as any integral type, and ``text''
as some sort of character sequence.
Implementors are free to add additional, custom columns.
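
As a minimal sketch of these requirements, a line table could be declared
roughly as follows. The table name \texttt{line\_tap} and the concrete SQL
types are assumptions made for illustration only; this specification
prescribes neither.
\begin{lstlisting}[language=SQL]
-- Hypothetical DDL: column names and NOT NULL constraints follow the
-- columns table; the table name and the SQL types are illustrative.
CREATE TABLE line_tap (
  title                   TEXT NOT NULL,
  vacuum_wavelength       DOUBLE PRECISION NOT NULL,  -- Angstrom
  vacuum_wavelength_error DOUBLE PRECISION,           -- Angstrom
  method                  TEXT,
  element                 TEXT,     -- NULL for molecular transitions
  ion_charge              INTEGER,
  mass_number             INTEGER,
  upper_energy            DOUBLE PRECISION,           -- J
  lower_energy            DOUBLE PRECISION,           -- J
  inchi                   TEXT,
  inchikey                TEXT,
  einstein_a              DOUBLE PRECISION,
  xsams_uri               TEXT,
  line_reference          TEXT NOT NULL
);
\end{lstlisting}
\noindent Units, UCDs, and descriptions are not part of such a declaration;
a service would declare them in its column metadata (for instance, in
\texttt{TAP\_SCHEMA.columns}).
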
Implementors wishing to communicate the quantum numbers of the two
states participating in the transition SHOULD use the column names
\textit{lo\-wer\_sta\-te\_configuration} and
\textit{up\-per\_sta\-te\_configuration}. This specification does not
constrain the content of these columns, and hence these cannot be
queried or interpreted interoperably.
The following additional notes apply to individual columns:
\begin{itemize}
\item \texttt{title} This is a human-readable string representing the
element or molecule producing the line. Implementors are free to choose
titles as they see fit, but they should keep in mind that these titles
should work in cramped places (i.e., they should not exceed 20
characters or so) and that they should speak to astronomers. ``21 cm
HI'' or ``[NII]6583'' would be good examples.
\item \texttt{vacuum\_wavelength}
All wavelengths in LineTAP are given for the vacuum,
and they are stored in the database in
Angstrom. An ADQL user-defined function is provided to convert these
wavelengths to other units. It is described in
sect.~\ref{sect:udfs}.
\item \texttt{vacuum\_wavelength\_error} The integrated error for the
\texttt{vacuum\_wave\-length}. This subsumes the complex error model of
VAMDC and is obviously not intended to replace it for precise analyses.
Data providers should, where necessary, combine systematic and
statistical errors. Where the original errors are asymmetric, this
quantity should ideally reflect the FWHM of a Gaussian fit of the true
error profile, or simply employ the larger error as an upper limit.
\item \texttt{method} Describes which method was used to obtain the
wavelength value. The method names admitted here are listed in the XSAMS
Schema\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/methods.html\#method}}.
\item \texttt{element} For atomic lines, this column gives the
conventional IUPAC element symbol (e.g., H, He, or Rn), using the
conventional capitalisation (i.e., uppercase first letter, lowercase
second letter where present). No additional qualifications are admitted
here; use \texttt{ion\_charge} to denote ionisation levels and
\texttt{mass\_number} to distinguish isotopes. Molecular transitions
have NULL here; use \texttt{inchi} and \texttt{inchikey} to identify
those. For atomic transitions, \texttt{element} is mandatory.
\item \texttt{ion\_charge} Ionisation level of the species.
\item \texttt{mass\_number} The mass number of an atom or the sum of
the mass numbers of the atoms forming a molecule.
\item \texttt{upper\_energy} Energy in J of the quantum state
with higher energy.
\item \texttt{lower\_energy} Energy in J of the quantum state
with lower energy.
\item \texttt{inchi} The IUPAC International Chemical Identifier
(InChI), which provides unique labels for well-defined chemical
substances \citep{INCHI}. It is meant to be human readable, but
depending on the molecule it can be very long. Here, the main layer is
mandatory, the charge, stereochemical, and isotopic layers are optional.
Only standard InChIs may be used in LineTAP. This is optional for
atomic transitions (hence, query using \texttt{element},
\texttt{ion\_charge}, and \texttt{mass\_number} for those) but mandatory
for molecules.
\item \texttt{inchikey} Character signature based on a hash code of
the InChI string. It will be used to uniquely identify species. As with
\texttt{inchi}, this is optional for atomic transitions but mandatory
for molecules.
\item \texttt{einstein\_a} Einstein coefficient, or transition probability.
\item \texttt{xsams\_uri} Where full XSAMS metadata is available for a
transition, this link resolves to that document. The document returned MUST
contain exactly one transition, and it MUST be in XSAMS version 1.
\item \texttt{line\_reference} Information about the source of the line data,
such as a URI, DOI, or bibcode. This is a mandatory column. Users should
have a guaranteed way of finding out where a piece of information came
from and where additional information is available.
\end{itemize}
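
For orientation, the energy columns and the wavelength are tied together
by the usual relation for a radiative transition. The following worked
example is only a sketch; it assumes that both energies refer to the same
zero point and uses Lyman~$\alpha$ with a vacuum wavelength of about
1215.67 Angstrom as an illustrative value:
$$E_{\textrm{upper}}-E_{\textrm{lower}}=\frac{hc}{\lambda_{\textrm{vac}}}
\approx\frac{1.98645\times 10^{-25}\,\textrm{J\,m}}{1.21567\times 10^{-7}\,\textrm{m}}
\approx 1.634\times 10^{-18}\,\textrm{J}\approx 10.2\,\textrm{eV}.$$
\noindent This specification does not require services to enforce this
relation, but data providers may find it useful as a plausibility check
when filling \texttt{upper\_energy}, \texttt{lower\_energy}, and
\texttt{vacuum\_wavelength} from different sources.
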
\section{Species Table}\label{sect:speciestable}
\label{ref:speciestable}
The species table is used to facilitate the referencing of molecules. As
there are many sum formulas and colloquial molecule names for common
species (and more than one species may correspond to a given sum
formula and even colloquial name), the resolution of such identifiers to
InChIs is generally non-trivial.

LineTAP's species table maps common names and
sum formulas to InChIs. It should be populated by data providers
publishing molecule data to the best of their knowledge. It is
explicitly possible to associate multiple names with a single InChI.
There is no explicit relationship between a species table and LineTAP
tables on a given service, i.e., the presence of a species in the
species table is not a guarantee that data on it is available from any
table in the service.

In most cases, the InChIKey alone is enough to reference a molecule. The InChI
column is present in this table so that users can confirm that the
returned molecule is the one they are searching for.
\begin{table}[hpt]
\hskip -0.05\linewidth
\begin{tabular}{p{0.43\linewidth}cp{0.5\linewidth}}
\sptablerule
\textbf{Name [Unit]} \ucd{UCD}&\textbf{Type}&\textbf{Description}\\
\sptablerule
% GENERATED: python3 make-species-table.py
\texttt{inchikey} \hfil\break\ucd{} & text & \raggedright InChIKey of this species\tabularnewline
\rowsep
\texttt{inchi} \hfil\break\ucd{} & text & \raggedright InChI of this species\tabularnewline
\rowsep
\texttt{name} \hfil\break\ucd{} & text & \raggedright A common name of this species\tabularnewline
\rowsep
\texttt{formula} \hfil\break\ucd{} & text & \raggedright Chemical formula of this species in some free-ish notation\tabularnewline
\rowsep
\texttt{source\_id} \hfil\break\ucd{} & text & \raggedright VAMDC identifier of the origin of this mapping\tabularnewline
% /GENERATED
\sptablerule
\end{tabular}
\caption{The columns that make up the Species Table. }
\label{tab:spcols}
\end{table}
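
A sketch of how a client might resolve a colloquial name to candidate
species using the columns of Table~\ref{tab:spcols}. The table name
\texttt{species.main} is borrowed from the query examples in this document,
and the name pattern is an illustrative assumption:
\begin{lstlisting}[language=SQL]
-- List candidate species for a colloquial name; several rows may
-- come back, since names and formulas are not unique.
SELECT inchikey, inchi, name, formula
FROM species.main
WHERE name LIKE '%ercaptan%'
\end{lstlisting}
\noindent Since more than one species may match, a client would typically
show the returned \texttt{inchi} and \texttt{formula} values to the user
and then query the line table with the \texttt{inchikey} they select.
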
\section{ADQL User-defined functions}
\label{sect:udfs}
LineTAP services MUST implement the \texttt{ivo\_specconv} user defined
function as defined by the Catalogue of ADQL User Defined Functions
\citep{2021ivoa.spec.0310C}\todo{Update to a reference to 1.1 when it is
actually in}.
With this function, users can query LineTAP databases in their preferred
units, as in
\begin{lstlisting}[language=SQL]
SELECT
  title,
  ivo_specconv(vacuum_wavelength, 'GHz') as freq
FROM toss.line_tap
WHERE
  vacuum_wavelength BETWEEN ivo_specconv(300, 'GHz', 'Angstrom')
    AND ivo_specconv(200, 'GHz', 'Angstrom')
\end{lstlisting}
The shorter alternative
\begin{lstlisting}[language=SQL]
SELECT
  title,
  ivo_specconv(vacuum_wavelength, 'GHz') as freq
FROM toss.line_tap
WHERE
  ivo_specconv(vacuum_wavelength, 'GHz') BETWEEN 200 AND 300
\end{lstlisting}
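\noindent would work as well, but it forces the server to evaluate the
conversion for every row and typically prevents the use of an index on
\texttt{vacuum\_wavelength}, so it will probably put a significantly higher
load on the server machine. It is therefore discouraged.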
Also note that since unit conversions may be non-linear, it is generally
wrong to use \texttt{ivo\_specconv} on
\texttt{vacuum\_wavelength\_error}.
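
Should an error in another spectral unit be needed, a first-order
propagation is a common approach. The following is a sketch under the
assumption of small relative errors, not a procedure defined by this
specification. For a conversion to frequency, $\nu=c/\lambda$, one has
$$\sigma_\nu=\left|\frac{\mathrm{d}\nu}{\mathrm{d}\lambda}\right|\sigma_\lambda
=\frac{c}{\lambda^2}\,\sigma_\lambda=\nu\,\frac{\sigma_\lambda}{\lambda}.$$
\noindent For example, a line at $\lambda=10^{7}$ Angstrom
($\nu\approx 299.79\,\textrm{GHz}$) with $\sigma_\lambda=1$ Angstrom has
$\sigma_\nu\approx 30\,\textrm{kHz}$, whereas converting the value of 1
Angstrom itself to a frequency would yield about
$3\times 10^{9}\,\textrm{GHz}$.
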
\section{LineTAP Query Examples}
\subsection{Use Case Examples}
In this section, we give queries addressing the use cases from
section~\ref{sect:use-cases}.
\subsubsection{Identifying a Single Line}
To obtain human-readable labels for a feature between 4005 and 4005.5
Angstrom, run:
% please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT title, vacuum_wavelength
FROM toss.line_tap
WHERE
vacuum_wavelength BETWEEN 4005 AND 4005.5
\end{lstlisting}
While we suggest it is in general preferable to do unit conversion
client-side, here is how to express such a query in
frequency.
Note that you will have to swap lower and upper limits when converting
from energy or frequency to wavelength.
%please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT title, ivo_specconv(vacuum_wavelength, 'THz')
FROM toss.line_tap
WHERE
vacuum_wavelength BETWEEN
ivo_specconv(6.01, 'THz', 'Angstrom')
AND ivo_specconv(6.00, 'THz', 'Angstrom')
\end{lstlisting}
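
To see why the limits swap, the conversion can be worked by hand
(a sketch with rounded values, using $c=299\,792\,458\,\textrm{m/s}$):
$$\lambda(6.01\,\textrm{THz})=\frac{c}{\nu}\approx 4.9882\times 10^{-5}\,\textrm{m}
\approx 498\,823\ \textrm{Angstrom},\qquad
\lambda(6.00\,\textrm{THz})\approx 4.9965\times 10^{-5}\,\textrm{m}
\approx 499\,654\ \textrm{Angstrom}.$$
\noindent The higher frequency maps to the shorter wavelength, so it has
to become the lower limit of the \texttt{BETWEEN} condition.
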
\subsubsection{Retrieving Atomic Spectral Lines for Identification}
When a researcher has reason to believe significant amounts of
Technetium are in a hot atmosphere, a client might look for lines for
three and four times ionised Technetium between 3000 and
3100 Angstrom using the element column:
%please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT *
FROM toss.line_tap
WHERE
element = 'Tc'
AND ion_charge BETWEEN 3 AND 4
AND vacuum_wavelength BETWEEN 3000 AND 3100
\end{lstlisting}
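
The same columns serve the use case of retrieving well-known lines. As a
sketch, and assuming the service holds lines of neutral hydrogen with an
\texttt{ion\_charge} of 0, the Lyman series could be selected through its
wavelength range alone, since the hydrogen lines between the series limit
near 912 Angstrom and Lyman~$\alpha$ near 1216 Angstrom belong to it:
\begin{lstlisting}[language=SQL]
-- Neutral hydrogen lines in the Lyman series range,
-- longest wavelength (Lyman alpha) first.
SELECT title, vacuum_wavelength, einstein_a, line_reference
FROM toss.line_tap
WHERE
  element = 'H'
  AND ion_charge = 0
  AND vacuum_wavelength BETWEEN 911 AND 1216
ORDER BY vacuum_wavelength DESC
\end{lstlisting}
\noindent Adding \texttt{AND mass\_number = 1} would exclude deuterium
lines on services that fill this column.
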
\subsubsection{Finding Molecular Spectral Lines for Specific Species}
Where species are only known by elemental composition, lines can be
located using SQL patterns against the inchi column, for instance:
% please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT *
FROM casa_lines.line_tap
WHERE
inchi LIKE 'InChI=1S/C4H3N/%'
\end{lstlisting}
For a well-defined species, clients should use the InChIKey to
constrain the species. For
normal water, that would be:
% please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT title, vacuum_wavelength, method, einstein_a
FROM casa_lines.line_tap
WHERE
inchikey='XLYOFNOQVPJJNP-NJFSPNSNSA-N'
\end{lstlisting}
\noindent -- for water with one hydrogen atom substituted
by deuterium, the InChIKey
would be XLYOFNOQVPJJNP-DYCDLGHINA-N.
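
Where the isotopic substitution is not known in advance, a pattern on the
isotopic layer of the InChI can help. The following sketch assumes that the
service stores standard InChIs in which isotopic substitutions appear in a
layer introduced by \texttt{/i}; under that assumption it would list all
isotopically labelled variants of methane held by the service:
\begin{lstlisting}[language=SQL]
-- Methane has the standard InChI 'InChI=1S/CH4/h1H4'; isotopic
-- variants append a layer starting with '/i'.
SELECT DISTINCT inchi, inchikey
FROM casa_lines.line_tap
WHERE
  inchi LIKE 'InChI=1S/CH4/h1H4/i%'
\end{lstlisting}
\noindent The InChIKeys returned can then be used for exact matches as
in the water example.
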
\subsubsection{Selecting Candidate Lines}
To only retrieve the 10 lines with the highest Einstein A for a given
species (in this case, CN) and having an upper level in the octave
around
$1.5\,\textrm{meV}$ above the ground state, one would write
% please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT TOP 10
title, vacuum_wavelength, einstein_a, line_reference,
ivo_specconv(upper_energy, 'J', 'eV') as ue, inchi
FROM casa_lines.line_tap
WHERE
inchikey='JEVCWSUVFOYBFI-UHFFFAOYSA-N'
AND upper_energy BETWEEN
ivo_specconv(1, 'meV', 'J')
AND ivo_specconv(2, 'meV', 'J')
ORDER BY einstein_a DESC
\end{lstlisting}
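
To support the credit use case, a client could also collect the
references behind the lines of a species. This is a sketch re-using the
table and InChIKey of the preceding query:
\begin{lstlisting}[language=SQL]
-- One row per source, with the number of lines it contributed.
SELECT line_reference, COUNT(*) AS n_lines
FROM casa_lines.line_tap
WHERE
  inchikey='JEVCWSUVFOYBFI-UHFFFAOYSA-N'
GROUP BY line_reference
ORDER BY n_lines DESC
\end{lstlisting}
\noindent Since \texttt{line\_reference} must not be NULL, every line
contributes to exactly one group.
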
\subsubsection{Characterising a Service's Data Holdings}
Determining how many lines of which species are available on a given
service could be done with a query like this:
% please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT
inchi, count(*) as n_lines
FROM casa_lines.line_tap
GROUP BY inchi
\end{lstlisting}
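
For services that mainly hold atomic lines, where \texttt{inchi} may be
NULL, grouping by element and charge is probably more informative. A
sketch, using the table name from the atomic examples above:
\begin{lstlisting}[language=SQL]
-- Number of lines per element and ionisation level;
-- molecular transitions (element IS NULL) are skipped.
SELECT element, ion_charge, COUNT(*) AS n_lines
FROM toss.line_tap
WHERE element IS NOT NULL
GROUP BY element, ion_charge
ORDER BY element, ion_charge
\end{lstlisting}
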
\subsubsection{Searching With Trivial Molecule Names}
Searching with trivial names as discussed in use
case~\ref{uc:resolution} would often be a two-step process where clients
ask the researcher which InChI corresponds to the species they
are looking for. In simple cases, however, a single joined query can be
run, too.
% please-run-a-test
\begin{lstlisting}[language=SQL]
SELECT
*
FROM casa_lines.line_tap
JOIN species.main as s USING (inchikey)
WHERE s.name='Methylidynium'
\end{lstlisting}
\section{Mapping from VAMDCXSAMS}
\label{sect:mapping}
The quantities used in the mapping belong to the following elements of the XSAMS
Schema\footnote{\url{https://vamdc.org/documents/vamdc-xsams-doc-1.0/}\label{fn:schema}}.
\begin{itemize}
\item \xsref{XSAMSData.Species.Atoms.Atom}
\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/speciesAtoms.html\#atom}\label{fn:atom}},
\item \xsref{XSAMSData.Species.Molecules.Molecule}
\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/speciesMolecules.html\#molecule}\label{fn:molecule}},
\item \xsref{XSAMSData.Processes.Radiative.RadiativeTransition}
\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/processRadiative.html\#radiativetransition}\label{fn:radtrans}},
\item \xsref{XSAMSData.Sources}
\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/sources.html}\label{fn:source}}
\end{itemize}
Listed below are the LineTAP quantities defined in section \ref{sect:quantities} and the corresponding elements from the XSAMS schema:
\begin{bigdescription}
\item [vacuum\_wavelength]
\begin{xsamsbridge}
\item[data model] \xsref{RadiativeTransition.EnergyWavelength}\xsamsref{fn:radtrans}
\item[constraints] if \xsref{vacuum} is not true, use
\xsref{AirToVac} to convert.
\end{xsamsbridge}
\item [vacuum\_wavelength\_error]
\begin{xsamsbridge}
\item[data model] \xsref{RadiativeTransition.EnergyWavelength}, \xsref{DataType.Accuracy} \footnote{%
\url{https://standards.vamdc.eu/dataModel/vamdcxsams/types.html\#accuracytype}}
\end{xsamsbridge}
\item [method]
\begin{xsamsbridge}
\item[data model] \xsref{RadiativeTransition.EnergyWavelength.Method}
\xsamsref{fn:radtrans} \footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/methods.html\#method}}
\item[constraints]
\xsref{MethodID} must be equal to
\xsref{MethodRef} of the respective energy wavelength. Possible values are given in the VAMDC standard.
\end{xsamsbridge}
\item [inchi]
\begin{xsamsbridge}
\item[data model] \xsref{Atom.Isotope.Ion.InChI}
\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/speciesAtoms.html\#atomicion}\label{fn:ion}} or
\xsref{Molecule.MolecularChemicalSpecies.InChI}
\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/speciesMolecules.html\#molecularchemicalspecies}\label{fn:moleculespecies}}
\item[constraints]
\xsref{SpeciesRef} of radiative transition must be equal
to \xsref{SpeciesID} of the respective atom or molecule.
\end{xsamsbridge}
\item [inchikey]
\begin{xsamsbridge}
\item[data model] \xsref{Atom.Isotope.Ion.InChIKey}\xsamsref{fn:ion},
\xsref{Molecule.MolecularChemicalSpecies.InChIKey}\xsamsref{fn:moleculespecies},
\item[constraints]
\xsref{SpeciesRef} of radiative transition must be equal
to \xsref{SpeciesID} of the respective atom or molecule.
\end{xsamsbridge}
\item [ion\_charge]
\begin{xsamsbridge}
\item[data model] \xsref{Atom.Isotope.Ion.IonCharge}\xsamsref{fn:ion} or
\xsref{Molecule.MolecularChemicalSpecies.IonCharge}\xsamsref{fn:moleculespecies}
\item[constraints]
\xsref{SpeciesRef} of radiative transition must be equal
to \xsref{SpeciesID} of the respective atom or molecule.
\end{xsamsbridge}
\item [mass\_number (atoms)]
\begin{xsamsbridge}
\item[data model] \xsref{Atom.Isotope.IsotopeParameters.MassNumber}\xsamsref{fn:atom}
\item[constraints] \xsref{SpeciesRef} of radiative transition must be equal
to \xsref{SpeciesID} of the respective atom or molecule.
\end{xsamsbridge}
\item [upper\_energy]
\begin{xsamsbridge}
\item[data model]
\xsref{Atom.Isotope.Ion.AtomicState.AtomicNumericalData.StateEnergy}\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/speciesAtoms.html\#atomicstate}\label{fn:atomicstate}} or
\xsref{Molecule.MolecularState.MolecularStateCharacterisation.StateEnergy}\footnote{\url{https://standards.vamdc.eu/dataModel/vamdcxsams/speciesMolecules.html\#molecularstate}\label{fn:molecularstate}}
\item[constraints]
\xsref{UpperStateRef} must be equal to
\xsref{StateID} of the respective molecular state or atomic state.
\end{xsamsbridge}
\item [lower\_energy]
\begin{xsamsbridge}
\item[data model]
\xsref{Atom.Isotope.Ion.AtomicState.AtomicNumericalData.StateEnergy} \xsamsref{fn:atomicstate} or
\xsref{Molecule.MolecularState.MolecularStateCharacterisation.StateEnergy}\xsamsref{fn:molecularstate}
\item[constraints] \xsref{LowerStateRef} must be equal to
\xsref{StateID} of the respective molecular state or atomic state.
\end{xsamsbridge}
\item [einstein\_a]
\begin{xsamsbridge}
\item[data model]
\xsref{RadiativeTransition.Probability.TransitionProbabilityA} \xsamsref{fn:radtrans}
\end{xsamsbridge}
\item [line\_reference]
\begin{xsamsbridge}
\item[data model] \xsref{Sources.Source.DigitalObjectIdentifier}\xsamsref{fn:source},
\xsref{Sources.Source.UniformResourceIdentifier}\xsamsref{fn:source}
\end{xsamsbridge}
\item [title]
A human-readable string for information. It can be composed of the species name and/or other information that might be useful. This quantity is not part of XSAMS (but is part of SSLDM).
\end{bigdescription}
\section{LineTAP and the VO Registry}
\label{sect:regmatters}
\subsection{Registering LineTAP-conforming Tables}
LineTAP line tables are registered using VODataService \citep{2021ivoa.spec.1102D}
tablesets, where the table utype is set to
$$\hbox{\verb|ivo://ivoa.net/std/linetap#lines-1.0|}.$$
The tableset is contained in a VODataService \xmlel{CatalogResource}
record with a TAP auxiliary capability
as per DDC \citep{2019ivoa.spec.0520D}.
Further capabilities, for instance for full VAMDC or legacy SLAP
services, may be given in the same record.
An example registry record in VOResource, for the case of
an auxiliary capability referencing a main TAP service, comes with
this document\footnote{\auxiliaryurl{example-record.xml}}.
The noteworthy points in the record are:
\begin{itemize}
\item A \xmlel{relationship} element referencing the main TAP service
through which the service can be queried as per DDC:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<relationship>
  <relationshipType>served-by</relationshipType>
  <relatedResource ivo-id="ivo://org.gavo.dc/tap"
    >GAVO Data Center TAP service</relatedResource>
</relationship>
\end{lstlisting}
\item The declaration for the auxiliary capability, including the access
URL so clients do not need to follow the relationship just declared if
all they need is the access URL:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<capability standardID="ivo://ivoa.net/std/TAP#aux">
  <interface role="std" version="1.1" xsi:type="vs:ParamHTTP">
    <accessURL use="base">http://dc.zah.uni-heidelberg.de/tap</accessURL>
  </interface>
</capability>
\end{lstlisting}
\item Most importantly, the declaration of the table utype that lets
clients discover that this particular table contains LineTAP data:
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<table>
  <name>toss.ivoa_lines</name>
  <title>TOSS</title>
  <description>The LineTAP version of...</description>
  <utype>ivo://ivoa.net/std/linetap#lines-1.0</utype>
  ...
</table>
\end{lstlisting}
\end{itemize}
In the example record, the resource description is identical to the
description of the schema, which in turn is identical to the description
of the table. This is an artefact of LineTAP registrations being
single-table and is thus to be expected in most registrations of this
type. Clients are advised to use the resource description for full text
searches.
Species tables are registered in exactly the same way, except their
utype is
$$\hbox{\verb|ivo://ivoa.net/std/linetap#species-1.0|}.$$
Data providers should only register line and species tables in one
resource record if the species table really has the same metadata
(description, author, source, etc.) as the line table.
\subsection{Discovering LineTAP services}
LineTAP consumers in general are interested in TAP endpoints and table names for
LineTAP services. By our registration pattern, this translates into
resources with TAP capabilities that have a standard key for version 1
LineTAP in a table utype.
Translated into RegTAP \citep{2019ivoa.spec.1011D}, the following query
would return TAP access URLs and the table names:
\begin{lstlisting}[language=SQL]
SELECT table_name, access_url
FROM rr.res_table
  NATURAL JOIN rr.resource
  NATURAL JOIN rr.capability
  NATURAL JOIN rr.interface
WHERE
  table_utype LIKE 'ivo://ivoa.net/std/linetap#lines-1.%'
  AND standard_id LIKE 'ivo://ivoa.net/std/tap%'
  AND intf_role='std'
  AND res_type='vs:catalogresource'
\end{lstlisting}
The wildcard pattern in the utype match makes sure minor version
increments do not prevent service discovery; by IVOA versioning rules,
all LineTAP services of major version 1 can be operated by all LineTAP
clients of version 1. We do not constrain the version of the TAP
service. Clients may want to adapt the TAP discovery pattern to match
their specific needs.
Adapting the utype, this query will work analogously for species tables.
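As a minimal sketch of that adaptation, assuming species tables follow
the same registration pattern as line tables, only the utype pattern
changes. The join to \verb|rr.resource| is written out here because that
is the RegTAP table carrying \verb|res_type|:
\begin{lstlisting}[language=SQL]
-- Sketch: discover LineTAP species tables of major version 1
SELECT table_name, access_url
FROM rr.res_table
  NATURAL JOIN rr.resource
  NATURAL JOIN rr.capability
  NATURAL JOIN rr.interface
WHERE
  table_utype LIKE 'ivo://ivoa.net/std/linetap#species-1.%'
  AND standard_id LIKE 'ivo://ivoa.net/std/tap%'
  AND intf_role='std'
  AND res_type='vs:catalogresource'
\end{lstlisting}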
\appendix
\section{Changes from WD-2023-03-23}
\begin{itemize}
\item Adding the species table
\item Changing the line table utype to \dots lines-1.0 (rather than
\dots table-1.0 before).
\item Making line\_reference mandatory
\end{itemize}
\bibliography{ivoatex/ivoabib,ivoatex/docrepo, localrefs}
\end{document}