%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Chapter: Tools
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Tools and Debuggers}
\label{chap:api_tools}
The term \textit{tool} broadly refers to programs executed by the user or system administrator on a command line. Tools frequently interact with either the \ac{SMS}, user applications, or both to perform administrative and support functions. For example, a debugger tool might be used to remotely control the processes of a parallel application, monitoring their behavior on a step-by-step basis. Historically, such tools were custom-written for each specific host environment due to the customized and/or proprietary nature of the environment's interfaces.
The advent of \ac{PMIx} offers the possibility for creating portable tools capable of interacting with multiple \acp{RM} without modification. Possible use-cases include:
\begin{itemize}
\item querying the status of scheduling queues and estimated allocation time for various resource options
\item job submission and allocation requests
\item querying job status for executing applications
\item launching, monitoring, and debugging applications
\end{itemize}
Enabling these capabilities requires some extensions to the \ac{PMIx} Standard (both in terms of \acp{API} and attributes), and utilization of client-side \acp{API} for more tool-oriented purposes.
This chapter defines specific \acp{API} related to tools, provides tool developers with an overview of the support provided by \ac{PMIx}, and serves to guide \ac{RM} vendors regarding roles and responsibilities of \acp{RM} to support tools. As the number of tool-specific \acp{API} and attributes is fairly small, the bulk of the chapter serves to provide a "theory of operation" for tools and debuggers. Description of the \acp{API} themselves is therefore deferred to Section \ref{chap:api_tools:apis} later in the chapter.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Connection Mechanisms}
\label{chap:api_tools:cnct}
The key to supporting tools lies in providing mechanisms by which a tool can connect to a \ac{PMIx} server. Application processes are able to connect because their local \ac{RM} daemon provides them with the necessary contact information upon execution. A command-line tool, however, isn't spawned by an \ac{RM} daemon, and therefore lacks the information required for rendezvous with a \ac{PMIx} server.
Once a tool has started, it initializes \ac{PMIx} as a tool (via \refapi{PMIx_tool_init}) if its access is restricted to \ac{PMIx}-based informational services such as \refapi{PMIx_Query_info}. However, if the tool intends to start jobs, then it must include the \refattr{PMIX_LAUNCHER} attribute to inform the library of that intent so that the library can initialize and provide access to the corresponding support.
Support for tools requires that the \ac{PMIx} server be initialized with an appropriate attribute indicating that tool connections are to be allowed. Separate attributes are provided to "fine-tune" this permission by allowing the environment to independently enable (or disable) connections from tools executing on nodes other than the one hosting the server itself. The \ac{PMIx} server library shall provide an opportunity for the host environment to authenticate and approve each connection request from a specific tool by calling the \refapi{pmix_server_tool_connection_fn_t} "hook" provided in the server module for that purpose. Servers in environments that do not provide this "hook" shall automatically reject all tool connection requests.
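The admission rules above can be modeled in a few lines of C. The sketch below is illustrative only and does not call the \ac{PMIx} library: \code{tool_policy}, \code{tool_hook_fn}, \code{admit_tool}, and \code{approve_all} are hypothetical names, and the hook type is a simplified stand-in for \refapi{pmix_server_tool_connection_fn_t}, whose actual signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host environment's approval hook. */
typedef bool (*tool_hook_fn)(const char *tool_host);

/* Hypothetical model of the server's tool-connection settings. */
typedef struct {
    bool allow_tools;   /* server was initialized to allow tool connections */
    bool allow_remote;  /* tools on other nodes may connect as well */
    tool_hook_fn hook;  /* NULL if the host environment provides no hook */
} tool_policy;

/* Decide whether a connection request from a tool is admitted. */
bool admit_tool(const tool_policy *policy, const char *tool_host, bool is_remote)
{
    assert(policy != NULL);
    if (!policy->allow_tools) {
        return false;
    }
    if (is_remote && !policy->allow_remote) {
        return false;
    }
    if (policy->hook == NULL) {
        return false;  /* no hook provided: every request is rejected */
    }
    return policy->hook(tool_host);  /* host authenticates and approves */
}

/* Example hook that approves every request (illustration only). */
bool approve_all(const char *tool_host)
{
    (void)tool_host;
    return true;
}
```

A host environment would normally perform real authentication inside the hook; \code{approve_all} exists only to exercise the decision path.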
Tools can connect to any local or remote \ac{PMIx} server provided they are either explicitly given the required connection information, or are able to discover it via one of several defined rendezvous protocols. Connection discovery centers around the existence of \emph{rendezvous files} containing the necessary connection information, as illustrated in Fig. \ref{fig:rndvz}.
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.9\textwidth]{figs/rndvz.pdf}
\end{center}
\caption{Tool rendezvous files}
\label{fig:rndvz}
\end{figure*}
\endgroup
The contents of each rendezvous file are specific to a given \ac{PMIx} implementation, but should at least contain the namespace and rank of the server along with its connection \ac{URI}. Note that tools linked to one \ac{PMIx} implementation are therefore unlikely to successfully connect to \ac{PMIx} server libraries from another implementation.
The top of the directory tree is defined by either the \refattr{PMIX_SYSTEM_TMPDIR} attribute (if given) or the \code{TMPDIR} environmental variable. \ac{PMIx} servers that are designated as \emph{system servers} by including the \refattr{PMIX_SERVER_SYSTEM_SUPPORT} attribute when calling \refapi{PMIx_server_init} will create a rendezvous file in this top-level directory. The filename will be of the form \emph{pmix.sys.hostname}, where \emph{hostname} is the string returned by the \code{gethostname} system call. Note that only one \ac{PMIx} server on a node can be designated as the system server.
Non-system \ac{PMIx} servers will create a set of three rendezvous files in the directory defined by either the \refattr{PMIX_SERVER_TMPDIR} attribute or the \code{TMPDIR} environmental variable:
\begin{itemize}
\item \emph{pmix.host.tool.nspace} where \emph{host} is the string returned by the \code{gethostname} system call and \emph{nspace} is the namespace of the server.
\item \emph{pmix.host.tool.pid} where \emph{host} is the string returned by the \code{gethostname} system call and \emph{pid} is the \ac{PID} of the server.
\item \emph{pmix.host.tool} where \emph{host} is the string returned by the \code{gethostname} system call. Note that servers which are not given a namespace-specific \refattr{PMIX_SERVER_TMPDIR} attribute may not generate this file due to conflicts should multiple servers be present on the node.
\end{itemize}
The files are identical and may be implemented as symlinks to a single instance. The individual file names are composed so as to aid the search process should a tool wish to connect to a server identified by its namespace or \ac{PID}.
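The naming scheme can be expressed as a small formatting routine. The following is a minimal sketch, assuming the name components are joined exactly as written above; \code{rndz_kind}, \code{rndz_filename}, and \code{rndz_is_system} are hypothetical helper names, not part of any \ac{PMIx} \ac{API}, and an actual implementation may compose the names differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* The four rendezvous file names described in the text. */
typedef enum { RNDZ_SYSTEM, RNDZ_NSPACE, RNDZ_PID, RNDZ_PLAIN } rndz_kind;

/*
 * Write the file name for the given kind into buf.
 * Returns the name length, or -1 if buf is too small.
 */
int rndz_filename(char *buf, size_t len, rndz_kind kind,
                  const char *host, const char *nspace, pid_t pid)
{
    int n;

    assert(buf != NULL && host != NULL);
    switch (kind) {
    case RNDZ_SYSTEM:   /* pmix.sys.hostname */
        n = snprintf(buf, len, "pmix.sys.%s", host);
        break;
    case RNDZ_NSPACE:   /* pmix.host.tool.nspace */
        assert(nspace != NULL);
        n = snprintf(buf, len, "pmix.%s.tool.%s", host, nspace);
        break;
    case RNDZ_PID:      /* pmix.host.tool.pid */
        n = snprintf(buf, len, "pmix.%s.tool.%ld", host, (long)pid);
        break;
    default:            /* pmix.host.tool */
        n = snprintf(buf, len, "pmix.%s.tool", host);
        break;
    }
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Return true if the name carries the system-server prefix. */
bool rndz_is_system(const char *name)
{
    assert(name != NULL);
    return strncmp(name, "pmix.sys.", 9) == 0;
}
```

A tool that knows the namespace or \ac{PID} of its target server could build the expected name with \code{rndz_filename} and look for it directly instead of scanning every file in the directory.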
Servers will additionally provide a rendezvous file in any given location if the path (either absolute or relative) and filename are specified either during \refapi{PMIx_server_init} using the \refattr{PMIX_LAUNCHER_RENDEZVOUS_FILE} attribute, or by the \refenvar{PMIX_LAUNCHER_RNDZ_FILE} environmental variable prior to executing the process containing the server. The latter may be the preferred mechanism for tools such as debuggers that need to fork/exec a launcher (e.g., "mpiexec") and then rendezvous with it. This is described in more detail in Section \ref{chap:api_tools:indirect}.
Rendezvous file ownerships are set to the \ac{UID} and \ac{GID} of the server that created them, with permissions set according to the desires of the implementation and/or system administrator policy. All connection attempts are first governed by read access privileges to the target rendezvous file - thus, the combination of permissions, \ac{UID}, and \ac{GID} of the rendezvous files acts as a first level of security for tool access.
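The read-access gate can be illustrated with the POSIX \code{access} call. This is only a sketch of the idea: \code{rndz_readable} is a hypothetical helper, and the Standard does not prescribe how a library performs the check (an implementation may simply attempt to open the file and act on the result).

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Return true if the calling process may read the rendezvous file.
 * access() evaluates permissions against the real UID and GID.
 */
bool rndz_readable(const char *path)
{
    assert(path != NULL);
    return access(path, R_OK) == 0;
}
```

A missing file and an unreadable file both yield \code{false}; a caller that needs to distinguish the two cases can inspect \code{errno} after the call.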
A tool may connect to as many servers at one time as the implementation supports, but is limited to designating only one such connection as its \emph{primary} server. This is done to avoid confusion when the tool calls an \ac{API} as to which server should service the request. The first server the tool connects to is automatically designated as the \emph{primary} server.
Tools are allowed to change their primary server at any time via the \refapi{PMIx_tool_set_server} \ac{API}, and to connect/disconnect from a server as many times as desired. Note that standing requests (e.g., event registrations) with the current primary server may be lost and/or may not be transferred when transitioning to another primary server - \ac{PMIx} implementors are not required to maintain or transfer state across tool-server connections.
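The primary-server rules can be captured by a small bookkeeping model. The sketch below does not use the \ac{PMIx} library; \code{tool_conns}, \code{conns_add}, \code{conns_set_primary}, and \code{conns_primary} are hypothetical names, the table size is arbitrary, and the \code{make_primary} flag merely plays the role of a request to designate the new server as primary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SERVERS 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical bookkeeping for a tool's server connections. */
typedef struct {
    const char *servers[MAX_SERVERS];  /* namespaces of connected servers */
    size_t count;                      /* number of connections held */
    size_t primary;                    /* index of the primary server */
} tool_conns;

/* Record a new connection; returns its index, or -1 if the table is full. */
int conns_add(tool_conns *c, const char *nspace, bool make_primary)
{
    size_t idx;

    assert(c != NULL && nspace != NULL);
    if (c->count == MAX_SERVERS) {
        return -1;
    }
    idx = c->count++;
    c->servers[idx] = nspace;
    /* The first server connected is primary automatically. */
    if (idx == 0 || make_primary) {
        c->primary = idx;
    }
    return (int)idx;
}

/* Switch the primary server; returns 0 on success, -1 on a bad index. */
int conns_set_primary(tool_conns *c, size_t idx)
{
    assert(c != NULL);
    if (idx >= c->count) {
        return -1;
    }
    c->primary = idx;
    return 0;
}

/* Return the namespace of the primary server, or NULL if unconnected. */
const char *conns_primary(const tool_conns *c)
{
    assert(c != NULL);
    return (c->count > 0) ? c->servers[c->primary] : NULL;
}
```

The model deliberately keeps no per-server state such as event registrations, mirroring the point that such state need not survive a change of primary server.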
Tool process identifiers are assigned by one of the following methods:
\begin{itemize}
\item If \refattr{PMIX_TOOL_NSPACE} is given, then the namespace of the tool will be assigned that value.
\begin{itemize}
\item If \refattr{PMIX_TOOL_RANK} is also given, then the rank of the tool will be assigned that value.
\item If \refattr{PMIX_TOOL_RANK} is not given, then the rank will be set to a default value of zero.
\end{itemize}
\item If a process ID is not provided and the tool connects to a server, then one will be assigned by the host environment upon connection to that server.
\item If a process ID is not provided and the tool does not connect to a server (e.g., if \refattr{PMIX_TOOL_DO_NOT_CONNECT} is given), then the tool shall self-assign a unique identifier. This is often done using some combination involving hostname and \ac{PID}.
\end{itemize}
Tool process identifiers remain constant across servers. Thus, it is critical that a system-wide unique namespace be provided if the tool itself sets the identifier, and that host environments provide a system-wide unique identifier in the case where the identifier is set by the server upon connection. The host environment is required to reject any connection request that fails to meet this criterion.
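The assignment rules above can be summarized in code. The following sketch is a model under stated assumptions rather than library behavior: \code{tool_id} and \code{assign_tool_id} are hypothetical names, and the self-assigned namespace format (\code{tool.} followed by hostname and \ac{PID}) is just one possible combination of the kind the text mentions.

```c
#define _POSIX_C_SOURCE 200112L  /* expose gethostname under strict -std modes */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical container for the outcome of identifier assignment. */
typedef struct {
    char nspace[256];
    uint32_t rank;
    bool assigned_by_server;  /* true: host environment assigns on connect */
} tool_id;

/*
 * nspace:       value of PMIX_TOOL_NSPACE, or NULL if not given
 * rank:         pointer to the value of PMIX_TOOL_RANK, or NULL if not given
 * will_connect: false when PMIX_TOOL_DO_NOT_CONNECT is given
 */
void assign_tool_id(tool_id *id, const char *nspace,
                    const uint32_t *rank, bool will_connect)
{
    assert(id != NULL);
    memset(id, 0, sizeof *id);  /* rank defaults to zero */

    if (nspace != NULL) {
        snprintf(id->nspace, sizeof id->nspace, "%s", nspace);
        if (rank != NULL) {
            id->rank = *rank;
        }
    } else if (will_connect) {
        id->assigned_by_server = true;  /* filled in upon connection */
    } else {
        char host[128] = "unknown";
        gethostname(host, sizeof host);
        host[sizeof host - 1] = '\0';   /* gethostname may not terminate */
        snprintf(id->nspace, sizeof id->nspace, "tool.%s.%ld",
                 host, (long)getpid());
    }
}
```

Note that a hostname-plus-\ac{PID} namespace is unique only while the process is alive on that node, so a tool that supplies its own identifier remains responsible for the system-wide uniqueness required above.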
For simplicity, the following descriptions will refer to the:
\begin{itemize}
\item \code{PMIX_SYSTEM_TMPDIR} as the directory specified by either the \refattr{PMIX_SYSTEM_TMPDIR} attribute (if given) or the \code{TMPDIR} environmental variable.
\item \code{PMIX_SERVER_TMPDIR} as the directory specified by either the \refattr{PMIX_SERVER_TMPDIR} attribute or the \code{TMPDIR} environmental variable.
\end{itemize}
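The directory resolution just described amounts to "attribute if given, otherwise \code{TMPDIR}". A minimal sketch follows; \code{resolve_tmpdir} and \code{rndz_path} are hypothetical helpers, and returning failure when neither source is set is an assumption of this sketch, since the text does not define a further fallback.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * attr_value: value of the relevant TMPDIR attribute, or NULL if not given.
 * Returns the directory to use, or NULL if neither source is set.
 */
const char *resolve_tmpdir(const char *attr_value)
{
    return (attr_value != NULL) ? attr_value : getenv("TMPDIR");
}

/*
 * Join the resolved directory and a rendezvous file name into buf.
 * Returns the path length, or -1 if no directory is known or buf is too small.
 */
int rndz_path(char *buf, size_t len, const char *attr_value,
              const char *filename)
{
    const char *dir;
    int n;

    assert(buf != NULL && filename != NULL);
    dir = resolve_tmpdir(attr_value);
    if (dir == NULL) {
        return -1;
    }
    n = snprintf(buf, len, "%s/%s", dir, filename);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The same helper serves both directories: the caller passes the value of \refattr{PMIX_SYSTEM_TMPDIR} or \refattr{PMIX_SERVER_TMPDIR} as \code{attr_value}, depending on which location is being searched.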
The rendezvous methods are automatically employed for the initial tool connection during \refapi{PMIx_tool_init} unless the \refattr{PMIX_TOOL_DO_NOT_CONNECT} attribute is specified, and on all subsequent calls to \refapi{PMIx_tool_attach_to_server}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Rendezvousing with a local server}
Connection to a local \ac{PMIx} server is pursued according to the following precedence chain based on attributes contained in the call to the \refapi{PMIx_tool_init} or \refapi{PMIx_tool_attach_to_server} \acp{API}. Servers to which the tool already holds a connection will be ignored. Except where noted, the \ac{PMIx} library will return an error if the specified file cannot be found, the caller lacks permissions to read it, or the server specified within the file does not respond to or accept the connection — the library will not proceed to check for other connection options as the user specified a particular one to use.
Note that the \ac{PMIx} implementation may choose to introduce a "delayed connection" protocol between steps in the precedence chain - i.e., the library may cycle several times, checking for creation of the rendezvous file each time after a delay of some period of time, thereby allowing the tool to wait for the server to create the rendezvous file before either returning an error or continuing to the next step in the chain.
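One way to picture such a "delayed connection" protocol is a bounded polling loop. The sketch below is an assumption-laden illustration: \code{wait_for_rndz_file} is a hypothetical helper, and whether an implementation drives the loop from \refattr{PMIX_CONNECT_MAX_RETRIES} and \refattr{PMIX_CONNECT_RETRY_DELAY} is left to the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Check for the rendezvous file, retrying up to max_retries times with
 * delay_secs seconds between checks. Returns true once a regular file
 * (or a symlink resolving to one) exists at path.
 */
bool wait_for_rndz_file(const char *path, unsigned max_retries,
                        unsigned delay_secs)
{
    struct stat sb;
    unsigned attempt;

    assert(path != NULL);
    for (attempt = 0; attempt <= max_retries; attempt++) {
        if (stat(path, &sb) == 0 && S_ISREG(sb.st_mode)) {
            return true;
        }
        if (attempt < max_retries) {
            sleep(delay_secs);  /* give the server time to create the file */
        }
    }
    return false;  /* caller returns an error or moves to the next step */
}
```

Because \code{stat} follows symbolic links, the check also succeeds when the rendezvous file is implemented as a symlink to a single shared instance.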
\begin{itemize}
%
\item If \refattr{PMIX_TOOL_ATTACHMENT_FILE} is given, then the tool will attempt to read the specified file and connect to the server based on the information contained within it. The format of the attachment file is identical to the rendezvous files described earlier in this section. An error will be returned if the specified file cannot be found.
%
\item If \refattr{PMIX_SERVER_URI} or \refattr{PMIX_TCP_URI} is given, then connection will be attempted to the server at the specified \ac{URI}. Note that it is an error for both of these attributes to be specified. \refattr{PMIX_SERVER_URI} is the preferred method as it is more generalized — \refattr{PMIX_TCP_URI} is provided for those cases where the user specifically wants to use a \ac{TCP} transport for the connection and wants to error out if one isn’t available or cannot be used.
%
\item If \refattr{PMIX_SERVER_PIDINFO} was provided, then the tool will search for a rendezvous file created by a \ac{PMIx} server of the given \ac{PID} in the \code{PMIX_SERVER_TMPDIR} directory. An error will be returned if a matching rendezvous file cannot be found.
%
\item If \refattr{PMIX_SERVER_NSPACE} is given, then the tool will search for a rendezvous file created by a \ac{PMIx} server of the given namespace in the \code{PMIX_SERVER_TMPDIR} directory. An error will be returned if a matching rendezvous file cannot be found.
%
\item If \refattr{PMIX_CONNECT_TO_SYSTEM} is given, then the tool will search for a system-level rendezvous file created by a \ac{PMIx} server in the \code{PMIX_SYSTEM_TMPDIR} directory. An error will be returned if a matching rendezvous file cannot be found.
%
\item If \refattr{PMIX_CONNECT_SYSTEM_FIRST} is given, then the tool will look for a system-level rendezvous file created by a \ac{PMIx} server in the \code{PMIX_SYSTEM_TMPDIR} directory. If found, then the tool will attempt to connect to it. In this case, no error will be returned if the rendezvous file is not found or connection is refused — the \ac{PMIx} library will silently continue to the next option.
%
\item By default, the tool will search the directory tree under the \code{PMIX_SERVER_TMPDIR} directory for rendezvous files of \ac{PMIx} servers, attempting to connect to each it finds until one accepts the connection. If no rendezvous files are found, or all contacted servers refuse connection, then the \ac{PMIx} library will return an error. No "delayed connection" protocols may be utilized at this point.
%
\end{itemize}
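The precedence chain above can be sketched as a selection function over the attributes the caller supplied. This is a model, not library code: \code{connect_opts}, \code{connect_step}, \code{first_step}, and \code{step_failure_is_fatal} are hypothetical names, treating a \ac{PID} of zero as "not given" is a convention of the sketch, and checking the conflicting-\ac{URI} error before anything else is an assumption about ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical view of the connection attributes passed by the tool. */
typedef struct {
    const char *attachment_file;  /* PMIX_TOOL_ATTACHMENT_FILE, or NULL */
    const char *server_uri;       /* PMIX_SERVER_URI, or NULL */
    const char *tcp_uri;          /* PMIX_TCP_URI, or NULL */
    pid_t server_pid;             /* PMIX_SERVER_PIDINFO, 0 if not given */
    const char *server_nspace;    /* PMIX_SERVER_NSPACE, or NULL */
    bool connect_to_system;       /* PMIX_CONNECT_TO_SYSTEM */
    bool system_first;            /* PMIX_CONNECT_SYSTEM_FIRST */
} connect_opts;

typedef enum {
    STEP_ERROR,          /* both URI attributes were specified */
    STEP_ATTACH_FILE,
    STEP_URI,
    STEP_PID,
    STEP_NSPACE,
    STEP_SYSTEM_ONLY,
    STEP_SYSTEM_FIRST,
    STEP_DEFAULT_SEARCH
} connect_step;

/* Pick the first applicable step in the precedence chain. */
connect_step first_step(const connect_opts *o)
{
    assert(o != NULL);
    if (o->server_uri != NULL && o->tcp_uri != NULL) return STEP_ERROR;
    if (o->attachment_file != NULL) return STEP_ATTACH_FILE;
    if (o->server_uri != NULL || o->tcp_uri != NULL) return STEP_URI;
    if (o->server_pid > 0) return STEP_PID;
    if (o->server_nspace != NULL) return STEP_NSPACE;
    if (o->connect_to_system) return STEP_SYSTEM_ONLY;
    if (o->system_first) return STEP_SYSTEM_FIRST;
    return STEP_DEFAULT_SEARCH;
}

/* Only the "system first" step falls through silently on failure. */
bool step_failure_is_fatal(connect_step step)
{
    return step != STEP_SYSTEM_FIRST;
}
```

When \code{step_failure_is_fatal} returns \code{false}, the caller would proceed to the default directory search; in every other case a failed step ends the attempt with an error.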
Note that there can be multiple local servers - one from the system plus others from launchers and active jobs. The \ac{PMIx} tool connection search method is not guaranteed to pick a particular server unless directed to do so. Tools can obtain a list of servers available on their local node using the \refapi{PMIx_Query_info} \ac{API} with the \refattr{PMIX_QUERY_AVAIL_SERVERS} key.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Connecting to a remote server}
Connecting to remote servers is complicated due to the lack of access to the previously-described rendezvous files. Two methods are required to be supported, both based on the caller having explicit knowledge of either connection information or a path to a local file that contains such information:
\begin{itemize}
%
\item If \refattr{PMIX_TOOL_ATTACHMENT_FILE} is given, then the tool will attempt to read the specified file and connect to the server based on the information contained within it. The format of the attachment file is identical to the rendezvous files described earlier in this section.
%
\item If \refattr{PMIX_SERVER_URI} or \refattr{PMIX_TCP_URI} is given, then connection will be attempted to the server at the specified \ac{URI}. Note that it is an error for both of these attributes to be specified. \refattr{PMIX_SERVER_URI} is the preferred method as it is more generalized — \refattr{PMIX_TCP_URI} is provided for those cases where the user specifically wants to use the \ac{TCP} transport for the connection and wants to error out if it isn’t available or cannot be used.
%
\end{itemize}
Additional methods may be provided by particular \ac{PMIx} implementations. For example, the tool may use \emph{ssh} to launch a \emph{probe} process onto the remote node so that the probe can search the \code{PMIX_SYSTEM_TMPDIR} and \code{PMIX_SERVER_TMPDIR} directories for rendezvous files, relaying the discovered information back to the requesting tool. If sufficient information is found to allow for remote connection, then the tool can use it to establish the connection. Note that this method is not required to be supported - it is provided here as an example and left to the discretion of \ac{PMIx} implementors.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Attaching to running jobs}
When attaching to a running job, the tool must connect to a \ac{PMIx} server that is associated with that job - e.g., a server residing in the host environment's local daemon that spawned one or more of the job's processes, or the server residing in the launcher that is overseeing the job. Identifying an appropriate server can sometimes prove challenging, particularly in an environment where multiple job launchers may be in operation, possibly under control of the same user.
In cases where the user has only the one job of interest in operation on the local node (e.g., when engaged in an interactive session on the node from which the launcher was executed), the normal rendezvous file discovery method can often be used to successfully connect to the target job, even in the presence of jobs executed by other users. The permissions and security authorizations can, in many cases, reliably ensure that only the one connection can be made. However, this is not guaranteed in all cases.
The most common method, therefore, for attaching to a running job is to specify either the \ac{PID} of the job's launcher or the namespace of the launcher's job (note that the launcher's namespace frequently differs from the namespace of the job it has launched). Unless the application processes themselves act as \ac{PMIx} servers, connection must be to the servers in the daemons that oversee the application. These are typically either daemons specifically started by the job's launcher process or daemons belonging to the host environment; in both cases they are responsible for starting the application's processes and overseeing their execution.
Identifying the correct \ac{PID} or namespace can be accomplished in a variety of ways, including:
\begin{itemize}
\item Using typical \ac{OS} or host environment tools to obtain a listing of active jobs and perusing those to find the target launcher.
\item Using a \ac{PMIx}-based tool attached to a system-level server to query the active jobs and their command lines, thereby identifying the application of interest and its associated launcher.
\item Manually recording the \ac{PID} of the launcher upon starting the job.
\end{itemize}
Once the namespace and/or \ac{PID} of the target server has been identified, either of the previous methods can be used to connect to it.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Tool initialization attributes}
\label{api:tools:attributes:tool}
The following attributes are passed to the \refapi{PMIx_tool_init} \ac{API} for use when initializing the \ac{PMIx} library.
%
\declareAttribute{PMIX_TOOL_NSPACE}{"pmix.tool.nspace"}{char*}{
Name of the namespace to use for this tool.
}
%
\declareAttribute{PMIX_TOOL_RANK}{"pmix.tool.rank"}{uint32_t}{
Rank of this tool.
}
%
\declareAttribute{PMIX_LAUNCHER}{"pmix.tool.launcher"}{bool}{
Tool is a launcher and needs to create rendezvous files.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Tool initialization environmental variables}
\label{api:tools:envars:tool}
The following environmental variables are used during \refapi{PMIx_tool_init} and \refapi{PMIx_server_init} to control various rendezvous-related operations when the process is started manually (e.g., on a command line) or by a fork/exec-like operation.
%
\declareEnvar{PMIX_LAUNCHER_RNDZ_URI}{
The spawned tool is to be connected back to the spawning tool using the given \ac{URI} so that the spawning tool can provide directives (e.g., a \refapi{PMIx_Spawn} command) to it.
}
%
\declareEnvar{PMIX_LAUNCHER_RNDZ_FILE}{
If the specified file does not exist, this variable contains the absolute path of the file where the spawned tool is to store its connection information so that the spawning tool can connect to it. If the file does exist, it contains the information specifying the server to which the spawned tool is to connect.
}
%
\declareEnvar{PMIX_KEEPALIVE_PIPE}{
An integer \code{read}-end of a POSIX pipe that the tool should monitor for closure, thereby indicating that the parent tool has terminated. Used, for example, when a tool fork/exec's an intermediate launcher that should self-terminate if the originating tool exits.
}
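Monitoring the keepalive pipe can be sketched with \code{poll} and \code{read}. The code below is illustrative: \code{parse_fd}, \code{keepalive_fd_from_env}, and \code{keepalive_closed} are hypothetical helpers, and it assumes that the variable holds the descriptor number in decimal and that the parent never writes data that the child needs to keep.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <poll.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Parse a decimal descriptor number; returns -1 if absent or malformed. */
int parse_fd(const char *val)
{
    char *end;
    long fd;

    if (val == NULL || *val == '\0') {
        return -1;
    }
    errno = 0;
    fd = strtol(val, &end, 10);
    if (errno != 0 || *end != '\0' || fd < 0 || fd > INT_MAX) {
        return -1;
    }
    return (int)fd;
}

/* Read the descriptor from the environment (assumed decimal encoding). */
int keepalive_fd_from_env(void)
{
    return parse_fd(getenv("PMIX_KEEPALIVE_PIPE"));
}

/*
 * Non-blocking check: returns true once the write end has been closed,
 * i.e. the parent tool has terminated. Any byte found in the pipe is
 * consumed and ignored.
 */
bool keepalive_closed(int fd)
{
    struct pollfd pfd;
    char byte;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) <= 0) {
        return false;  /* nothing pending: parent is still alive */
    }
    if (pfd.revents & POLLIN) {
        return read(fd, &byte, 1) == 0;  /* 0 bytes means end-of-file */
    }
    return (pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) != 0;
}
```

An intermediate launcher could call \code{keepalive_closed} periodically from its event loop and begin an orderly shutdown once it returns \code{true}.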
%
Note that these environmental variables should be cleared from the environment after use and prior to forking child processes to avoid potentially unexpected behavior by the child processes.
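Clearing the variables is straightforward with the POSIX \code{unsetenv} call. The sketch below is a minimal illustration in which \code{rndz_envars} and \code{clear_rndz_envars} are hypothetical names; the tool is assumed to have read and saved any values it needs before calling it.

```c
#define _POSIX_C_SOURCE 200112L  /* expose unsetenv under strict -std modes */
#include <assert.h>
#include <stdlib.h>

/* Rendezvous-related variables defined in this section. */
static const char *const rndz_envars[] = {
    "PMIX_LAUNCHER_RNDZ_URI",
    "PMIX_LAUNCHER_RNDZ_FILE",
    "PMIX_KEEPALIVE_PIPE"
};

/*
 * Remove the rendezvous variables from the environment so that child
 * processes do not inherit them. Returns how many were actually set.
 */
int clear_rndz_envars(void)
{
    int cleared = 0;
    size_t i;

    for (i = 0; i < sizeof rndz_envars / sizeof rndz_envars[0]; i++) {
        if (getenv(rndz_envars[i]) != NULL) {
            int rc = unsetenv(rndz_envars[i]);
            assert(rc == 0);
            (void)rc;  /* silence unused warning when NDEBUG is defined */
            cleared++;
        }
    }
    return cleared;
}
```

Calling the function immediately before any fork/exec keeps the window in which a child could observe stale rendezvous information as small as possible.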
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Tool connection attributes}
\label{api:struct:attributes:connection}
These attributes are defined to assist \ac{PMIx}-enabled tools to connect with a \ac{PMIx} server by passing them into either the \refapi{PMIx_tool_init} or the \refapi{PMIx_tool_attach_to_server} \acp{API} - thus, they are not typically accessed via the \refapi{PMIx_Get} \ac{API}.
%
\declareAttribute{PMIX_SERVER_PIDINFO}{"pmix.srvr.pidinfo"}{pid_t}{
\ac{PID} of the target \ac{PMIx} server for a tool.
}
%
\declareAttribute{PMIX_CONNECT_TO_SYSTEM}{"pmix.cnct.sys"}{bool}{
The requester requires that a connection be made only to a local, system-level \ac{PMIx} server.
}
%
\declareAttribute{PMIX_CONNECT_SYSTEM_FIRST}{"pmix.cnct.sys.first"}{bool}{
Preferentially, look for a system-level \ac{PMIx} server first.
}
%
\declareAttribute{PMIX_SERVER_URI}{"pmix.srvr.uri"}{char*}{
\ac{URI} of the \ac{PMIx} server to be contacted.
}
%
\declareAttribute{PMIX_SERVER_HOSTNAME}{"pmix.srvr.host"}{char*}{
Host where target \ac{PMIx} server is located.
}
%
\declareAttribute{PMIX_CONNECT_MAX_RETRIES}{"pmix.tool.mretries"}{uint32_t}{
Maximum number of times to try to connect to \ac{PMIx} server - the default value is implementation specific.
}
%
\declareAttribute{PMIX_CONNECT_RETRY_DELAY}{"pmix.tool.retry"}{uint32_t}{
Time in seconds between connection attempts to a \ac{PMIx} server - the default value is implementation specific.
}
%
\declareAttribute{PMIX_TOOL_DO_NOT_CONNECT}{"pmix.tool.nocon"}{bool}{
The tool wants to use internal \ac{PMIx} support, but does not want to connect to a \ac{PMIx} server.
}
%
\declareAttribute{PMIX_TOOL_CONNECT_OPTIONAL}{"pmix.tool.conopt"}{bool}{
The tool shall connect to a server if available, but otherwise continue to operate unconnected.
}
%
\declareAttribute{PMIX_TOOL_ATTACHMENT_FILE}{"pmix.tool.attach"}{char*}{
Pathname of file containing connection information to be used for attaching to a specific server.
}
%
\declareAttribute{PMIX_LAUNCHER_RENDEZVOUS_FILE}{"pmix.tool.lncrnd"}{char*}{
Pathname of file where the launcher is to store its connection information so that the spawning tool can connect to it.
}
%
\declareAttribute{PMIX_PRIMARY_SERVER}{"pmix.pri.srvr"}{bool}{
The server to which the tool is connecting shall be designated the \emph{primary} server once connection has been accomplished.
}
%
\declareAttribute{PMIX_WAIT_FOR_CONNECTION}{"pmix.wait.conn"}{bool}{
Wait until the specified process has connected to the requesting tool or server, or the operation times out (if the \refattr{PMIX_TIMEOUT} directive is included in the request).
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Launching Applications with Tools}
\label{chap:api_tools:launch}
Tool-directed launches require that the tool include the \refattr{PMIX_LAUNCHER} attribute when calling \refapi{PMIx_tool_init}. Two launch modes are supported:
\begin{itemize}
\item \emph{Direct launch} where the tool itself is directly responsible for launching all processes, including debugger daemons, using either the \ac{RM} or daemons launched by the tool – i.e., there is no \emph{intermediate launcher} (IL) such as \emph{mpiexec}. The case where the tool is self-contained (i.e., uses its own daemons without interacting with an external entity such as the \ac{RM}) lies outside the scope of this Standard; and
\item \emph{Indirect launch} where all processes are started via an \ac{IL} such as \emph{mpiexec} and the tool itself is not directly involved in launching application processes or debugger daemons. Note that the \ac{IL} may utilize the \ac{RM} to launch processes and/or daemons under the tool's direction.
\end{itemize}
Either of these methods can be executed interactively or by a batch script. Note that not all host environments may support the direct launch method.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Direct launch}
\label{chap:api_tools:direct}
In the direct-launch use-case (Fig. \ref{fig:dlaunch}), the tool itself performs the role of the launcher. Once invoked, the tool connects to an appropriate \ac{PMIx} server - e.g., a system-level server hosted by the \ac{RM}. The tool is responsible for assembling the description of the application to be launched (e.g., by parsing its command line) into a spawn request containing an array of \refstruct{pmix_app_t} applications and \refstruct{pmix_info_t} job-level information. An allocation of resources may or may not have been made in advance – if not, then the spawn request must include allocation request information.
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/directlaunch.pdf}
\end{center}
\caption{Direct Launch}
\label{fig:dlaunch}
\end{figure*}
\endgroup
In addition to the attributes described in \refapi{PMIx_Spawn}, the tool may wish to include the following tool-specific attributes in the \emph{job_info} argument to that \ac{API} (the debugger-related attributes are discussed in more detail in Section \ref{chap:api_tools:debuggers}):
\begin{itemize}
\item \pasteAttributeItem{PMIX_FWD_STDIN}
\item \pasteAttributeItem{PMIX_FWD_STDOUT}
\item \pasteAttributeItem{PMIX_FWD_STDERR}
\item \pasteAttributeItem{PMIX_FWD_STDDIAG}
\item \pasteAttributeItem{PMIX_IOF_CACHE_SIZE}
\item \pasteAttributeItem{PMIX_IOF_DROP_OLDEST}
\item \pasteAttributeItem{PMIX_IOF_DROP_NEWEST}
\item \pasteAttributeItem{PMIX_IOF_BUFFERING_SIZE}
\item \pasteAttributeItem{PMIX_IOF_BUFFERING_TIME}
\item \pasteAttributeItem{PMIX_IOF_OUTPUT_RAW}
\item \pasteAttributeItem{PMIX_IOF_TAG_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_TIMESTAMP_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_XML_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_RANK_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_OUTPUT_TO_FILE}
\item \pasteAttributeItem{PMIX_IOF_OUTPUT_TO_DIRECTORY}
\item \pasteAttributeItem{PMIX_IOF_FILE_PATTERN}
\item \pasteAttributeItem{PMIX_IOF_FILE_ONLY}
\item \pasteAttributeItem{PMIX_IOF_MERGE_STDERR_STDOUT}
\item \pasteAttributeItem{PMIX_NOHUP}
\item \pasteAttributeItem{PMIX_NOTIFY_JOB_EVENTS}
\item \pasteAttributeItem{PMIX_NOTIFY_COMPLETION}
\item \pasteAttributeItem{PMIX_LOG_JOB_EVENTS}
\item \pasteAttributeItem{PMIX_LOG_COMPLETION}
\item \pasteAttributeItem{PMIX_DEBUG_STOP_ON_EXEC}
\item \pasteAttributeItem{PMIX_DEBUG_STOP_IN_INIT}
\item \pasteAttributeItem{PMIX_DEBUG_STOP_IN_APP}
\end{itemize}
\adviceuserstart
The \refattr{PMIX_IOF_FILE_ONLY} attribute indicates that output is directed to files and
no copy is sent back to the application. For example, this can be combined with
\refattr{PMIX_IOF_OUTPUT_TO_FILE} or \refattr{PMIX_IOF_OUTPUT_TO_DIRECTORY} to
write output only to files.
\adviceuserend
The tool then calls the \refapi{PMIx_Spawn} \ac{API} so that the \ac{PMIx} library can communicate the spawn request to the server.
Upon receipt, the \ac{PMIx} server library passes the spawn request to its host \ac{RM} daemon for processing via the \refapi{pmix_server_spawn_fn_t} server module function. If this callback was not provided, then the \ac{PMIx} server library will return the \refconst{PMIX_ERR_NOT_SUPPORTED} error status.
If an allocation must be made, then the host environment is responsible for
communicating the request to its associated scheduler. Once resources are
available, the host environment initiates the launch process to start the job.
The host environment must parse the spawn request for relevant directives,
returning an error if any required directive cannot be supported. Optional
directives may be ignored if they cannot be supported.
Any error while executing the spawn request must be returned by
\refapi{PMIx_Spawn} to the requester. Once the spawn request has succeeded in
starting the specified processes, the request will return
\refconst{PMIX_SUCCESS} back to the requester along with the namespace of the
started job. Upon termination of the spawned job, the host environment must
generate a \refconst{PMIX_EVENT_JOB_END} event for normal or abnormal
termination if requested to do so. The event shall include:
\begin{itemize}
\item the returned status code (\refattr{PMIX_JOB_TERM_STATUS}) for the
corresponding job;
\item the identity (\refattr{PMIX_PROCID}) and exit status
(\refattr{PMIX_EXIT_CODE}) of the first failed process, if applicable;
\item a \refattr{PMIX_EVENT_TIMESTAMP} indicating the time the termination
occurred; plus
\item any other info provided by the host environment.
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Indirect launch}
\label{chap:api_tools:indirect}
In the indirect launch use-case, the application processes are started via an intermediate launcher (e.g., \emph{mpiexec}) that is itself started by the tool (see Fig \ref{fig:indirlnch}). Thus, at a high level, this is a two-stage launch procedure to start the application: the tool (henceforth referred to as the \emph{initiator}) starts the \ac{IL}, which then starts the applications. In practice, additional steps may be involved if, for example, the \ac{IL} starts its own daemons to shepherd the application processes.
A key aspect of this operational mode is that the initiator is never required to parse or understand the command line of the \ac{IL}. Instead, the indirect launch procedure supports either of two methods: one where the initiator assumes responsibility for parsing its command line to obtain the application as well as the \ac{IL} and its options, and another where the initiator defers the command line parsing to the \ac{IL}. Both of these methods are described in the following sections.
\subsubsection{Initiator-based command line parsing}
\label{chap:api_tools:indirect:tool}
This method utilizes a first call to the \refapi{PMIx_Spawn} \ac{API} to start the \ac{IL} itself, and then uses a second call to \refapi{PMIx_Spawn} to request that the \ac{IL} spawn the actual job. The burden of analyzing the initial command line to separately identify the \ac{IL}'s command line from the application itself falls upon the initiator. An example is provided below:
\begin{verbatim}
$ initiator --launcher "mpiexec --verbose" -n 3 ./app <appoptions>
\end{verbatim}
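As a rough illustration of the parsing burden placed on the initiator, the following C sketch separates the launcher portion of the example command line from the application portion. The \code{--launcher} option is taken from the example above; the helper names \code{sketch_find_launcher} and \code{sketch_tokenize} are hypothetical, and the tokenizer splits on whitespace only (no quoting or escaping), which is an assumption made for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split 'line' in place on spaces and tabs.
 * Stores at most max-1 tokens in 'out' and NULL-terminates the array.
 * Returns the token count, or -1 if 'out' is too small. */
int sketch_tokenize(char *line, char **out, size_t max)
{
    size_t n = 0;
    char *tok;

    if (max == 0) {
        return -1;
    }
    for (tok = strtok(line, " \t"); tok != NULL; tok = strtok(NULL, " \t")) {
        if (n + 1 >= max) {
            return -1;
        }
        out[n++] = tok;
    }
    out[n] = NULL;
    return (int)n;
}

/* Hypothetical helper: locate "--launcher <string>" on the initiator's
 * command line. On success, *launcher points at the launcher string and
 * the return value is the index of the first argument after it, i.e. the
 * start of the application description. Returns -1 if not found. */
int sketch_find_launcher(int count, char **args, char **launcher)
{
    int i;

    for (i = 1; i + 1 < count; i++) {
        if (strcmp(args[i], "--launcher") == 0) {
            *launcher = args[i + 1];
            return i + 2;
        }
    }
    return -1;
}
```

In this sketch, the launcher tokens would describe the \ac{IL} in the first call to \refapi{PMIx_Spawn}, while the arguments starting at the returned index would describe the application in the second call.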
The initiator spawns the \ac{IL} using the same procedure for launching an application - it begins by assembling the description of the \ac{IL} into a spawn request containing an array of \refstruct{pmix_app_t} and \refstruct{pmix_info_t} job-level information. Note that this step does not include any information regarding the application itself - only the launcher is included. In addition, the initiator must include the rendezvous \ac{URI} in the environment so the \ac{IL} knows how to connect back to it.
An allocation of resources for the \ac{IL} itself may or may not be required – if it is, then the allocation must be made in advance or the spawn request must include allocation request information.
\begin{figure*}[ht!]
\centering
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/indirlnch-start.pdf}
\caption{Indirect Launch - Start}
\label{fig:indirlnch-start}
\end{subfigure}%
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/indirlnch-end.pdf}
\caption{Indirect Launch - End}
\label{fig:indirlnch-end}
\end{subfigure}
\caption{Indirect launch procedure}
\label{fig:indirlnch}
\end{figure*}
The initiator may include the following tool-specific attributes in the \refarg{job_info} argument to \refapi{PMIx_Spawn} - note that these attributes refer only to the behavior of the \ac{IL} itself and not the eventual job to be launched:
\begin{itemize}
\item \pasteAttributeItem{PMIX_FWD_STDIN}
\item \pasteAttributeItem{PMIX_FWD_STDOUT}
\item \pasteAttributeItem{PMIX_FWD_STDERR}
\item \pasteAttributeItem{PMIX_FWD_STDDIAG}
\item \pasteAttributeItem{PMIX_IOF_CACHE_SIZE}
\item \pasteAttributeItem{PMIX_IOF_DROP_OLDEST}
\item \pasteAttributeItem{PMIX_IOF_DROP_NEWEST}
\item \pasteAttributeItem{PMIX_IOF_BUFFERING_SIZE}
\item \pasteAttributeItem{PMIX_IOF_BUFFERING_TIME}
\item \pasteAttributeItem{PMIX_IOF_TAG_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_TIMESTAMP_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_XML_OUTPUT}
\item \pasteAttributeItem{PMIX_NOHUP}
\item \pasteAttributeItem{PMIX_LAUNCHER_DAEMON}
\item \pasteAttributeItem{PMIX_FORKEXEC_AGENT}
\item \pasteAttributeItem{PMIX_EXEC_AGENT}
\item \pasteAttributeItemBegin{PMIX_DEBUG_STOP_IN_INIT}In this context, the initiator is directing the \ac{IL} to stop in \refapi{PMIx_tool_init}. This gives the initiator a chance to connect to the \ac{IL} and register for events prior to the \ac{IL} launching the application job.
\pasteAttributeItemEnd
\end{itemize}
and the following optional variables in the environment of the \ac{IL}:
\begin{itemize}
\item \refenvar{PMIX_KEEPALIVE_PIPE} - an integer \code{read}-end of a POSIX pipe that the \ac{IL} should monitor for closure, thereby indicating that the initiator has terminated.
\end{itemize}
The initiator then calls the \refapi{PMIx_Spawn} \ac{API} so that the \ac{PMIx} library can either communicate the spawn request to a server (if connected to one), or locally spawn the \ac{IL} itself if not connected to a server and the \ac{PMIx} implementation includes self-spawn support. \refapi{PMIx_Spawn} shall return an error if neither of these conditions is met.
When initialized by the \ac{IL}, the \refapi{PMIx_tool_init} function must perform two operations:
\begin{itemize}
\item check for the presence of the \refenvar{PMIX_KEEPALIVE_PIPE} environment variable - if provided, then the library shall monitor the pipe for closure, providing a \refconst{PMIX_EVENT_JOB_END} event when the pipe closes (thereby indicating the termination of the initiator). The \ac{IL} should register for this event after completing \refapi{PMIx_tool_init} - the initiator's namespace can be obtained via a call to \refapi{PMIx_Get} with the \refattr{PMIX_PARENT_ID} key. Note that this feature will only be available if the spawned \ac{IL} is local to the initiator.
\item check for the \refenvar{PMIX_LAUNCHER_RNDZ_URI} environment variable - if found, the library shall connect back to the initiator using the \refapi{PMIx_tool_attach_to_server} \ac{API}, retaining its current server as its primary server.
\end{itemize}
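The keepalive check in the first item can be sketched in plain POSIX C. This only illustrates the idea and does not describe the behavior of any particular \ac{PMIx} library: the helper names \code{keepalive_fd_from_env} and \code{keepalive_closed} are hypothetical, and the sketch assumes that the environment variable carries the descriptor as a decimal integer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <poll.h>
#include <stdlib.h>

/* Hypothetical helper: parse the read-end descriptor from the
 * PMIX_KEEPALIVE_PIPE environment variable. Returns -1 if the variable
 * is unset or does not hold a valid non-negative integer. */
int keepalive_fd_from_env(void)
{
    const char *val = getenv("PMIX_KEEPALIVE_PIPE");
    char *end = NULL;
    long fd;

    if (val == NULL || *val == '\0') {
        return -1;
    }
    errno = 0;
    fd = strtol(val, &end, 10);
    if (errno != 0 || *end != '\0' || fd < 0 || fd > INT_MAX) {
        return -1;
    }
    return (int)fd;
}

/* Hypothetical helper: non-blocking check of the pipe's read end.
 * Returns 1 if the write end has been closed (the initiator is gone),
 * 0 if it is still open, and -1 on error. */
int keepalive_closed(int fd)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, 0);   /* zero timeout: never blocks */
    } while (rc < 0 && errno == EINTR);
    if (rc < 0 || (pfd.revents & POLLNVAL)) {
        return -1;
    }
    /* POLLHUP is reported on the read end once every writer has closed */
    return (pfd.revents & (POLLHUP | POLLERR)) ? 1 : 0;
}
```

A library following this approach would presumably poll the descriptor from its progress thread and generate the \refconst{PMIX_EVENT_JOB_END} event once closure is detected.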
Once the \ac{IL} completes \refapi{PMIx_tool_init}, it must register for the \refconst{PMIX_EVENT_JOB_END} termination event and then idle until receiving that event - either directly from the initiator, or from the \ac{PMIx} library upon detecting closure of the keepalive pipe. The \ac{IL} idles in the intervening time as it is solely acting as a relay (if connected to a server that is performing the actual application launch) or as a \ac{PMIx} server responding to spawn requests.
Upon return from the \refapi{PMIx_Spawn} \ac{API}, the initiator should set the spawned \ac{IL} as its primary server using the \refapi{PMIx_tool_set_server} \ac{API} with the nspace returned by \refapi{PMIx_Spawn} and any valid rank (a rank of zero would ordinarily be used as only one \ac{IL} process is typically started). It is advisable to set a connection timeout value when calling this function. The initiator can then proceed to spawn the actual application according to the procedure described in Section \ref{chap:api_tools:direct}.
\subsubsection{\ac{IL}-based command line parsing}
\label{chap:api_tools:indirect:il}
In the case where the initiator cannot parse its command line, it must defer that parsing to the \ac{IL}. A common example is provided below:
\begin{verbatim}
$ initiator mpiexec --verbose -n 3 ./app <appoptions>
\end{verbatim}
For this situation, the initiator proceeds as above with only one notable exception: instead of calling \refapi{PMIx_Spawn} twice (once to start the \ac{IL} and again to start the actual application), the initiator only calls that \ac{API} one time:
\begin{itemize}
\item The \refarg{app} parameter passed to the spawn request contains only one \refstruct{pmix_app_t} that contains the entire command line, including both launcher and application(s).
\item The launcher executable must be in the \refarg{app.cmd} field and in \refarg{app.argv[0]}, with the rest of the command line appended to the \refarg{app.argv} array.
\item Any job-level directives for the \ac{IL} itself (e.g., \refattr{PMIX_FORKEXEC_AGENT} or \refattr{PMIX_FWD_STDOUT}) are included in the \refarg{job_info} parameter of the call to \refapi{PMIx_Spawn}.
\item The job-level directives must include both the \refattr{PMIX_SPAWN_TOOL} attribute indicating that the initiator is spawning a tool, and the \refattr{PMIX_DEBUG_STOP_IN_INIT} attribute directing the \ac{IL} to stop during the call to \refapi{PMIx_tool_init}. The latter directive allows the initiator to connect to the \ac{IL} prior to launch of the application.
\item The \refenvar{PMIX_LAUNCHER_RNDZ_URI} and \refenvar{PMIX_KEEPALIVE_PIPE} environment variables are provided to the launcher in its environment via the \refarg{app.env} field.
\item The \ac{IL} must use \refapi{PMIx_Get} with the \refattr{PMIX_LAUNCH_DIRECTIVES} key to obtain any initiator-provided directives (e.g., \refattr{PMIX_DEBUG_STOP_IN_INIT} or \refattr{PMIX_DEBUG_STOP_ON_EXEC}) aimed at the application(s) it will spawn.
\end{itemize}
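The first, second, and fifth items above can be sketched in C as follows. To keep the example self-contained it does not use the \ac{PMIx} headers: \code{sketch_app_t} is a hypothetical stand-in that mirrors only the \refarg{app.cmd}, \refarg{app.argv}, and \refarg{app.env} fields named in the text, and \code{sketch_build_il_app} is likewise a hypothetical helper. The job-level directives passed in \refarg{job_info} are not modeled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the three app fields named in the text */
typedef struct {
    char  *cmd;    /* launcher executable */
    char **argv;   /* NULL-terminated: launcher plus the rest of the line */
    char **env;    /* NULL-terminated "NAME=value" strings */
} sketch_app_t;

/* Copy a string, aborting if memory is exhausted */
static char *sketch_dup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy == NULL) {
        abort();
    }
    memcpy(copy, s, len);
    return copy;
}

/* Build a "NAME=value" environment entry */
static char *sketch_env_entry(const char *name, const char *value)
{
    size_t len = strlen(name) + strlen(value) + 2;
    char *entry = malloc(len);

    if (entry == NULL) {
        abort();
    }
    snprintf(entry, len, "%s=%s", name, value);
    return entry;
}

/* Fill one app description from the command line that follows the
 * initiator's own name. Returns 0 on success, -1 on bad input. */
int sketch_build_il_app(sketch_app_t *app, int ntokens, char *const tokens[],
                        const char *rndz_uri, int keepalive_fd)
{
    char fdbuf[32];
    int i;

    if (app == NULL || tokens == NULL || ntokens < 1 || rndz_uri == NULL) {
        return -1;
    }
    app->argv = calloc((size_t)ntokens + 1, sizeof(char *));
    app->env = calloc(3, sizeof(char *));
    if (app->argv == NULL || app->env == NULL) {
        free(app->argv);
        free(app->env);
        return -1;
    }
    app->cmd = sketch_dup(tokens[0]);          /* launcher in cmd ... */
    for (i = 0; i < ntokens; i++) {
        app->argv[i] = sketch_dup(tokens[i]);  /* ... and in argv[0] */
    }
    snprintf(fdbuf, sizeof(fdbuf), "%d", keepalive_fd);
    app->env[0] = sketch_env_entry("PMIX_LAUNCHER_RNDZ_URI", rndz_uri);
    app->env[1] = sketch_env_entry("PMIX_KEEPALIVE_PIPE", fdbuf);
    return 0;
}
```

The initiator never interprets the tokens beyond the first one; everything after the launcher name is passed through untouched for the \ac{IL} to parse.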
Upon return from \refapi{PMIx_Spawn}, the initiator must:
\begin{itemize}
\item use the \refapi{PMIx_tool_set_server} \ac{API} to set the spawned \ac{IL} as its primary server
\item register with that server to receive the \refconst{PMIX_LAUNCH_COMPLETE} event. This allows the initiator to know when the \ac{IL} has completed launch of the application
\item release the \ac{IL} from its ``hold'' in \refapi{PMIx_tool_init} by issuing the \refconst{PMIX_DEBUGGER_RELEASE} event, specifying the \ac{IL} as the custom range. Upon receipt of the event, the \ac{IL} is free to parse its command line, apply any provided directives, and execute the application.
\end{itemize}
Upon receipt of the \refconst{PMIX_LAUNCH_COMPLETE} event, the initiator should register to receive notification of completion of the returned namespace of the application. Receipt of the \refconst{PMIX_EVENT_JOB_END} event provides a signal that the initiator may itself terminate.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Tool spawn-related attributes}
\label{api:tools:attributes:spawn}
Tools are free to utilize the spawn attributes available to applications (see \ref{api:struct:attributes:spawn}) when constructing a spawn request, but can also utilize the following attributes that are specific to tool-based spawn operations:
%
\declareAttribute{PMIX_FWD_STDIN}{"pmix.fwd.stdin"}{pmix_rank_t}{
The requester intends to push information from its \code{stdin} to the
indicated process. The local spawn agent should, therefore, ensure that the
\code{stdin} channel to that process remains available. A rank of
\refconst{PMIX_RANK_WILDCARD} indicates that all processes in the spawned job
are potential recipients. The requester will issue a call to
\refapi{PMIx_IOF_push} to initiate the actual forwarding of information to
specified targets - this attribute simply requests that the \ac{IL} retain the
ability to forward the information to the designated targets.
}
%
\declareAttribute{PMIX_FWD_STDOUT}{"pmix.fwd.stdout"}{bool}{
Requests that the ability to forward the \code{stdout} of the spawned
processes be
maintained. The requester will issue a call to \refapi{PMIx_IOF_pull} to
specify the callback function and other options for delivery of the forwarded
output.
}
%
\declareAttribute{PMIX_FWD_STDERR}{"pmix.fwd.stderr"}{bool}{
Requests that the ability to forward the \code{stderr} of the spawned
processes be
maintained. The requester will issue a call to \refapi{PMIx_IOF_pull} to
specify the callback function and other options for delivery of the forwarded
output.
}
%
\declareAttribute{PMIX_FWD_STDDIAG}{"pmix.fwd.stddiag"}{bool}{
Requests that the ability to forward the diagnostic channel (if it exists) of
the spawned processes be
maintained. The requester will issue a call to \refapi{PMIx_IOF_pull} to
specify the callback function and other options for delivery of the forwarded
output.
}
%
\declareAttribute{PMIX_NOHUP}{"pmix.nohup"}{bool}{
Any processes started on behalf of the calling tool (or the specified namespace, if such specification is included in the list of attributes) should continue after the tool disconnects from its server.
}
%
\declareAttribute{PMIX_LAUNCHER_DAEMON}{"pmix.lnch.dmn"}{char*}{
Path to executable that is to be used as the backend daemon for the launcher. This replaces the launcher's own daemon with the specified executable. Note that the user is therefore responsible for ensuring compatibility of the specified executable and the host launcher.
}
%
\declareAttribute{PMIX_FORKEXEC_AGENT}{"pmix.frkex.agnt"}{char*}{
Path to executable that the launcher's backend daemons are to fork/exec in place of the actual application processes. The fork/exec agent shall connect back (as a \ac{PMIx} tool) to the launcher's daemon to receive its spawn instructions, and is responsible for starting the actual application process it replaced. See Section \ref{api:tools:debugger:agent} for details.
}
%
\declareAttribute{PMIX_EXEC_AGENT}{"pmix.exec.agnt"}{char*}{
Path to executable that the launcher's backend daemons are to fork/exec in place of the actual application processes. The launcher's daemon shall pass the full command line of the application on the command line of the exec agent, which shall not connect back to the launcher's daemon. The exec agent is responsible for exec'ing the specified application process in its own place. See Section \ref{api:tools:debugger:agent} for details.
}
%
\declareAttribute{PMIX_LAUNCH_DIRECTIVES}{"pmix.lnch.dirs"}{pmix_data_array_t*}{
Array of \refstruct{pmix_info_t} containing directives for the launcher - a convenience attribute for retrieving all directives with a single call to \refapi{PMIx_Get}.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Tool rendezvous-related events}
\label{api:tools:attributes:spawnconst}
The following constants refer to events relating to rendezvous of a tool and launcher during spawn of the \ac{IL}.
\begin{constantdesc}
%
\declareconstitemvalue{PMIX_LAUNCHER_READY}{-155}
An application launcher (e.g., \emph{mpiexec}) shall generate this event to signal a tool that started it that the launcher is ready to receive directives/commands (e.g., \refapi{PMIx_Spawn}). This is only used when the initiator is able to parse the command line itself, or the launcher is started as a persistent \ac{DVM}.
%
\end{constantdesc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{IO Forwarding}
\label{chap:api_tools:iof}
Underlying the operation of many tools is a common need to forward \code{stdin} from the tool to targeted processes, and to return \code{stdout}/\code{stderr} from those processes to the tool (e.g., for display on the user’s console). Historically, each tool developer was responsible for creating their own \ac{IO} forwarding subsystem. However, the introduction of \ac{PMIx} as a standard mechanism for interacting between applications and the host environment has made it possible to relieve tool developers of this burden.
This section defines functions by which tools can request forwarding of input/output to/from other processes and serves as a design guide to:
\begin{itemize}
\item provide tool developers with an overview of the expected behavior of the \ac{PMIx} \ac{IO} forwarding support;
\item guide \ac{RM} vendors regarding roles and responsibilities expected of the \ac{RM} to support \ac{IO} forwarding; and
\item provide insight into the thinking of the \ac{PMIx} community behind the definition of the \ac{PMIx} \ac{IO} forwarding \acp{API}.
\end{itemize}
Note that the forwarding of \ac{IO} via \ac{PMIx} requires that both the host environment and the tool support \ac{PMIx}, but does not impose any similar requirements on the application itself.
The responsibility of the host environment in forwarding of \ac{IO} falls into the following areas:
\begin{itemize}
\item Capturing output from specified processes.
\item Forwarding that output to the host of the \ac{PMIx} server library that requested it.
\item Delivering that payload to the \ac{PMIx} server library via the \refapi{PMIx_server_IOF_deliver} \ac{API} for final dispatch to the requesting tool.
\end{itemize}
It is the responsibility of the \ac{PMIx} library to buffer, format, and deliver the payload to the requesting client. This may require caching of output until a forwarding registration is received, as governed by the corresponding \ac{IO} forwarding attributes of Section \ref{api:tools:attributes:iof} that are supported by the implementation.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Forwarding stdout/stderr}
At an appropriate point in its operation (usually during startup), a tool will utilize the \refapi{PMIx_tool_init} function to connect to a \ac{PMIx} server. The \ac{PMIx} server can be hosted by an \ac{RM} daemon or could be embedded in a library-provided starter program such as \textit{mpiexec} - in terms of \ac{IO} forwarding, the operations remain the same either way. For purposes of this discussion, we will assume the server is in an \ac{RM} daemon and that the application processes are directly launched by the \ac{RM}, as shown in Fig \ref{fig:stdouterr}.
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/output.pdf}
\end{center}
\caption{Forwarding stdout/stderr}
\label{fig:stdouterr}
\end{figure*}
\endgroup
Once the tool has connected to the target server, it can request that
processes be spawned on its behalf or that output from a specified set of
existing processes in a given executing application be forwarded to it.
Requests to spawn processes should include the \refattr{PMIX_FWD_STDIN},
\refattr{PMIX_FWD_STDOUT}, and/or \refattr{PMIX_FWD_STDERR} attributes if the
tool intends to request that the corresponding streams be forwarded at some
point during execution.
Note that requests to capture output from existing processes via the
\refapi{PMIx_IOF_pull} \ac{API}, and/or to forward input to specified
processes via the \refapi{PMIx_IOF_push} \ac{API}, can only succeed if the
required attributes to retain that ability were passed when the corresponding
job was spawned. The host is required to return an error for all such requests
in cases where this condition is not met.
Two modes are supported when requesting that the host forward standard output/error via the \refapi{PMIx_IOF_pull} \ac{API} - these can be controlled by including one of the following attributes in the \refarg{info} array passed to that function:
\begin{itemize}
\item \pasteAttributeItem{PMIX_IOF_COPY}
\item \pasteAttributeItemBegin{PMIX_IOF_REDIRECT}This is the default mode of operation.
\pasteAttributeItemEnd{}
\end{itemize}
When requesting to forward \code{stdout}/\code{stderr}, the tool can specify several formatting options to be used on the resulting output stream. These include:
\begin{itemize}
\item \pasteAttributeItem{PMIX_IOF_TAG_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_TIMESTAMP_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_XML_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_RANK_OUTPUT}
\item \pasteAttributeItem{PMIX_IOF_OUTPUT_TO_FILE}
\item \pasteAttributeItem{PMIX_IOF_OUTPUT_TO_DIRECTORY}
\item \pasteAttributeItem{PMIX_IOF_FILE_PATTERN}
\item \pasteAttributeItem{PMIX_IOF_FILE_ONLY}
\item \pasteAttributeItem{PMIX_IOF_MERGE_STDERR_STDOUT}
\end{itemize}
The \ac{PMIx} client in the tool is responsible for formatting the output stream. Note that output from multiple processes will often be interleaved due to variations in arrival time - ordering of output is not guaranteed across processes and/or nodes.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Forwarding stdin}
A tool is not necessarily a child of the \ac{RM} as it may have been started directly from the command line. Thus, provision must be made for the tool to collect its \code{stdin} and pass it to the host \ac{RM} (via the \ac{PMIx} server) for forwarding. Two methods of support for forwarding of \code{stdin} are defined:
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/stdin.pdf}
\end{center}
\caption{Forwarding stdin}
\label{fig:stdin}
\end{figure*}
\endgroup
\begin{itemize}
\item internal collection by the \ac{PMIx} tool library itself. This is requested via the \refattr{PMIX_IOF_PUSH_STDIN} attribute in the \refapi{PMIx_IOF_push} call. When this mode is selected, the tool library begins collecting all \code{stdin} data and internally passing it to the local server for distribution to the specified target processes. All collected data is sent to the same targets until \code{stdin} is closed, or a subsequent call to \refapi{PMIx_IOF_push} is made that includes the \refattr{PMIX_IOF_COMPLETE} attribute indicating that forwarding of \code{stdin} is to be terminated.
\item external collection directly by the tool. It is assumed that the tool will provide its own code/mechanism for collecting its \code{stdin} as the tool developers may choose to insert some filtering and/or editing of the stream prior to forwarding it. In addition, the tool can directly control the targets for the data on a per-call basis – i.e., each call to \refapi{PMIx_IOF_push} can specify its own set of target recipients for that particular \emph{blob} of data. Thus, this method provides maximum flexibility, but requires that the tool developer provide their own code to capture \code{stdin}.
\end{itemize}
Note that it is the responsibility of the \ac{RM} to forward data to the host where the target process(es) are executing, and for the host daemon on that node to deliver the data to the \code{stdin} of target process(es). The \ac{PMIx} server on the remote node is not involved in this process. Systems that do not support forwarding of \code{stdin} shall return \refconst{PMIX_ERR_NOT_SUPPORTED} in response to a forwarding request.
\adviceuserstart
Scalable forwarding of \code{stdin} represents a significant challenge. Most environments will at least handle a \emph{send-to-1} model whereby \code{stdin} is forwarded to a single identified process, and occasionally an additional \emph{send-to-all} model where \code{stdin} is forwarded to all processes in the application. Users are advised to check their host environment for available support as the distribution method lies outside the scope of \ac{PMIx}.
\code{Stdin} buffering by the \ac{RM} and/or \ac{PMIx} library can be problematic. If any targeted recipient is slow reading data (or decides never to read data), then the data must be buffered in some intermediate daemon or the \ac{PMIx} tool library itself. Thus, piping a large amount of data into \code{stdin} can result in a very large memory footprint in the system management stack or the tool. Best practices, therefore, typically focus on reading of input files by application processes as opposed to forwarding of \code{stdin}.
\adviceuserend
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{IO Forwarding Channels}
\declarestruct{pmix_iof_channel_t}
\label{api:tool:iofchannels}
\versionMarker{3.0}
The \refstruct{pmix_iof_channel_t} structure is a \code{uint16_t} type that defines a set of bit-mask flags for specifying IO forwarding channels. These can be bitwise OR'd together to reference multiple channels.
\begin{constantdesc}
%
\declareconstitemvalue{PMIX_FWD_NO_CHANNELS}{0x0000}
Forward no channels.
%
\declareconstitemvalue{PMIX_FWD_STDIN_CHANNEL}{0x0001}
Forward \code{stdin}.
%
\declareconstitemvalue{PMIX_FWD_STDOUT_CHANNEL}{0x0002}
Forward \code{stdout}.
%
\declareconstitemvalue{PMIX_FWD_STDERR_CHANNEL}{0x0004}
Forward \code{stderr}.
%
\declareconstitemvalue{PMIX_FWD_STDDIAG_CHANNEL}{0x0008}
Forward \code{stddiag}, if available.
%
\declareconstitemvalue{PMIX_FWD_ALL_CHANNELS}{0x00ff}
Forward all available channels.
%
\end{constantdesc}
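The following C sketch shows how these flags combine. So that it compiles without the \ac{PMIx} headers, it defines local copies of the values listed above under a hypothetical \code{SKETCH_} prefix; the helpers \code{sketch_has_channels} and \code{sketch_drop_channels} are likewise illustrative and not part of any \ac{PMIx} \ac{API}.

```c
#include <stdint.h>

/* Local copies of the values listed above (hypothetical names) */
typedef uint16_t sketch_iof_channel_t;
#define SKETCH_FWD_NO_CHANNELS      0x0000
#define SKETCH_FWD_STDIN_CHANNEL    0x0001
#define SKETCH_FWD_STDOUT_CHANNEL   0x0002
#define SKETCH_FWD_STDERR_CHANNEL   0x0004
#define SKETCH_FWD_STDDIAG_CHANNEL  0x0008
#define SKETCH_FWD_ALL_CHANNELS     0x00ff

/* Returns nonzero if every channel in 'wanted' is present in 'set' */
int sketch_has_channels(sketch_iof_channel_t set, sketch_iof_channel_t wanted)
{
    return (set & wanted) == wanted;
}

/* Returns 'set' with the channels in 'drop' cleared */
sketch_iof_channel_t sketch_drop_channels(sketch_iof_channel_t set,
                                          sketch_iof_channel_t drop)
{
    return (sketch_iof_channel_t)(set & ~drop);
}
```

For example, a request covering both output streams would use the bitwise OR of the \code{stdout} and \code{stderr} flags, which evaluates to \code{0x0006}.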
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{IO Forwarding constants}
\begin{constantdesc}
%
\declareconstitemvalue{PMIX_ERR_IOF_FAILURE}{-172}
An \ac{IO} forwarding operation failed - the affected channel will be included in the notification.
%
\declareconstitemvalue{PMIX_ERR_IOF_COMPLETE}{-173}
\ac{IO} forwarding of the standard input for this process has completed - i.e., the \code{stdin} file descriptor has closed.
%
\end{constantdesc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{IO Forwarding attributes}
\label{api:tools:attributes:iof}
The following attributes are used to control \ac{IO} forwarding behavior at the request of tools. Use of the attributes is optional - any option not provided will default to an implementation-specific value.
%
\declareAttributeProvisional{PMIX_IOF_LOCAL_OUTPUT}{"pmix.iof.local"}{bool}{
Write output streams to the local \code{stdout}/\code{stderr}.
}
%
\declareAttributeProvisional{PMIX_IOF_MERGE_STDERR_STDOUT}{"pmix.iof.mrg"}{bool}{
Merge the \code{stdout} and \code{stderr} streams from application processes.
}
%
\declareAttribute{PMIX_IOF_CACHE_SIZE}{"pmix.iof.csize"}{uint32_t}{
The requested size of the \ac{PMIx} server cache in bytes for each specified channel. By default, the server is allowed (but not required) to drop all bytes received beyond the max size.
}
%
\declareAttribute{PMIX_IOF_DROP_OLDEST}{"pmix.iof.old"}{bool}{
In an overflow situation, the \ac{PMIx} server is to drop the oldest bytes to make room in the cache.
}
%
\declareAttribute{PMIX_IOF_DROP_NEWEST}{"pmix.iof.new"}{bool}{
In an overflow situation, the \ac{PMIx} server is to drop any new bytes received until room becomes available in the cache (default).
}
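The difference between the two overflow policies can be illustrated with a small bounded cache. This is a minimal sketch, not a description of how any \ac{PMIx} server implements its cache: the type and function names are hypothetical, the cache is assumed to have a nonzero capacity, and the drop-newest branch stores whatever still fits before discarding the remainder, which is one possible reading of that policy.

```c
#include <stddef.h>
#include <string.h>

typedef enum { SKETCH_DROP_NEWEST, SKETCH_DROP_OLDEST } sketch_policy_t;

typedef struct {
    unsigned char *data;      /* caller-provided storage of 'cap' bytes */
    size_t cap;               /* cache size in bytes (must be nonzero) */
    size_t len;               /* bytes currently cached */
    sketch_policy_t policy;
} sketch_cache_t;

/* Append n bytes to the cache, applying the overflow policy.
 * Returns the number of bytes dropped (old or new). */
size_t sketch_cache_add(sketch_cache_t *c, const unsigned char *src, size_t n)
{
    size_t room = c->cap - c->len;
    size_t dropped;

    if (n <= room) {                       /* no overflow */
        memcpy(c->data + c->len, src, n);
        c->len += n;
        return 0;
    }
    if (c->policy == SKETCH_DROP_NEWEST) { /* keep cached bytes */
        memcpy(c->data + c->len, src, room);
        c->len += room;
        return n - room;
    }
    if (n >= c->cap) {                     /* new data alone fills cache */
        dropped = c->len + (n - c->cap);
        memcpy(c->data, src + (n - c->cap), c->cap);
        c->len = c->cap;
        return dropped;
    }
    dropped = n - room;                    /* evict the oldest bytes */
    memmove(c->data, c->data + dropped, c->len - dropped);
    c->len -= dropped;
    memcpy(c->data + c->len, src, n);
    c->len += n;
    return dropped;
}
```

With a four-byte cache holding \code{abc}, adding \code{de} leaves \code{abcd} under the drop-newest policy and \code{bcde} under the drop-oldest policy.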
%
\declareAttribute{PMIX_IOF_BUFFERING_SIZE}{"pmix.iof.bsize"}{uint32_t}{
Requests that \ac{IO} on the specified channel(s) be aggregated in the \ac{PMIx} tool library until the specified number of bytes is collected to avoid being called every time a block of \ac{IO} arrives. The \ac{PMIx} tool library will execute the callback and reset the collection counter whenever the specified number of bytes becomes available. Any remaining buffered data will be \emph{flushed} to the callback upon a call to deregister the respective channel.
}
%
\declareAttribute{PMIX_IOF_BUFFERING_TIME}{"pmix.iof.btime"}{uint32_t}{
Max time in seconds to buffer \ac{IO} before delivering it. Used in conjunction with buffering size, this prevents \ac{IO} from being held indefinitely while waiting for another payload to arrive.
}
%
\declareAttributeProvisional{PMIX_IOF_OUTPUT_RAW}{"pmix.iof.raw"}{bool}{
Do not buffer output to be written as complete lines - output characters as the stream delivers them
}
%
\declareAttribute{PMIX_IOF_COMPLETE}{"pmix.iof.cmp"}{bool}{
Indicates that the specified \ac{IO} channel has been closed by the source.
}
%
\declareAttribute{PMIX_IOF_TAG_OUTPUT}{"pmix.iof.tag"}{bool}{
Requests that output be prefixed with the nspace,rank of the source and a string identifying the channel (\code{stdout}, \code{stderr}, etc.).
}
%
\declareAttribute{PMIX_IOF_TIMESTAMP_OUTPUT}{"pmix.iof.ts"}{bool}{
Requests that output be marked with the time at which the data was received by the tool - note that this will differ from the time at which the data was collected from the source.
}
%
\declareAttributeProvisional{PMIX_IOF_RANK_OUTPUT}{"pmix.iof.rank"}{bool}{
Requests that output be tagged with the rank of the process it came from.
}
%
\declareAttribute{PMIX_IOF_XML_OUTPUT}{"pmix.iof.xml"}{bool}{
Requests that output be formatted in \ac{XML}.
}
%
\declareAttribute{PMIX_IOF_PUSH_STDIN}{"pmix.iof.stdin"}{bool}{
Requests that the \ac{PMIx} library collect the \code{stdin} of the requester and forward it to the processes specified in the \refapi{PMIx_IOF_push} call. All collected data is sent to the same targets until \code{stdin} is closed, or a subsequent call to \refapi{PMIx_IOF_push} is made that includes the \refattr{PMIX_IOF_COMPLETE} attribute indicating that forwarding of \code{stdin} is to be terminated.
}
%
\declareAttribute{PMIX_IOF_COPY}{"pmix.iof.cpy"}{bool}{
Requests that the host environment deliver a copy of the specified output stream(s) to the tool, letting the stream(s) continue to also be delivered to the default location. This allows the tool to tap into the output stream(s) without redirecting it from its current final destination.
}
%
\declareAttribute{PMIX_IOF_REDIRECT}{"pmix.iof.redir"}{bool}{
Requests that the host environment intercept the specified output stream(s) and deliver it to the requesting tool instead of its current final destination. This might be used, for example, during a debugging procedure to avoid injection of debugger-related output into the application’s results file. The original output stream(s) destination is restored upon termination of the tool.
}
%
\declareAttributeProvisional{PMIX_IOF_OUTPUT_TO_FILE}{"pmix.iof.file"}{char*}{
Direct application output into files of form "<filename>.<nspace>.<rank>.stdout" (for \code{stdout}) and "<filename>.<nspace>.<rank>.stderr" (for \code{stderr}). If \refattr{PMIX_IOF_MERGE_STDERR_STDOUT} was given, then only the \code{stdout} file will be created and both streams will be written into it.
}
%
\declareAttributeProvisional{PMIX_IOF_OUTPUT_TO_DIRECTORY}{"pmix.iof.dir"}{char*}{
Direct application output into files of form "<directory>/<nspace>/rank.<rank>/stdout" (for \code{stdout}) and "<directory>/<nspace>/rank.<rank>/stderr" (for \code{stderr}). If \refattr{PMIX_IOF_MERGE_STDERR_STDOUT} was given, then only the \code{stdout} file will be created and both streams will be written into it.
}
%
\declareAttributeProvisional{PMIX_IOF_FILE_PATTERN}{"pmix.iof.fpt"}{bool}{
Specified output file is to be treated as a pattern and not automatically annotated by nspace, rank, or other parameters. The pattern can use \code{\%n} for the namespace, and \code{\%r} for the rank wherever those quantities are to be placed. The resulting filename will be appended with ".stdout" for the \code{stdout} stream and ".stderr" for the \code{stderr} stream. If \refattr{PMIX_IOF_MERGE_STDERR_STDOUT} was given, then only the \code{stdout} file will be created and both streams will be written into it.
}
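The pattern substitution described for \refattr{PMIX_IOF_FILE_PATTERN} can be sketched as follows. The function name \code{sketch_expand_pattern} is hypothetical, the rank is assumed to be printed as an unsigned decimal integer, and the handling of any other \code{\%} sequence (copied through unchanged here) is an assumption, since the text does not define it.

```c
#include <stddef.h>
#include <stdio.h>

/* Expand %n (namespace) and %r (rank) in 'pattern', then append
 * ".<channel>" (e.g., "stdout" or "stderr"). Writes the result to
 * 'out'. Returns 0 on success, -1 if 'out' is too small. */
int sketch_expand_pattern(const char *pattern, const char *nspace,
                          unsigned rank, const char *channel,
                          char *out, size_t outlen)
{
    size_t used = 0;
    const char *p;
    int n;

    if (outlen == 0) {
        return -1;
    }
    out[0] = '\0';
    for (p = pattern; *p != '\0'; p++) {
        if (p[0] == '%' && p[1] == 'n') {
            n = snprintf(out + used, outlen - used, "%s", nspace);
            p++;                         /* skip the 'n' */
        } else if (p[0] == '%' && p[1] == 'r') {
            n = snprintf(out + used, outlen - used, "%u", rank);
            p++;                         /* skip the 'r' */
        } else {
            n = snprintf(out + used, outlen - used, "%c", *p);
        }
        if (n < 0 || (size_t)n >= outlen - used) {
            return -1;                   /* output would be truncated */
        }
        used += (size_t)n;
    }
    n = snprintf(out + used, outlen - used, ".%s", channel);
    if (n < 0 || (size_t)n >= outlen - used) {
        return -1;
    }
    return 0;
}
```

Under these assumptions, the pattern \code{run-\%n/rank\%r} for rank 3 of namespace \code{myjob} yields \code{run-myjob/rank3.stdout} for the \code{stdout} stream. When \refattr{PMIX_IOF_MERGE_STDERR_STDOUT} is given, only the \code{stdout} name would be generated.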
%
\declareAttributeProvisional{PMIX_IOF_FILE_ONLY}{"pmix.iof.fonly"}{bool}{
Output only into designated files - do not also output a copy to the console's \code{stdout}/\code{stderr}.
}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Debugger Support}
\label{chap:api_tools:debuggers}
Debuggers are a class of tool that merits special consideration due to their particular requirements for access to job-related information and control over process execution. The primary advantage of using \ac{PMIx} for these purposes lies in the resulting portability of the debugger as it can be used with any system and/or programming model that supports \ac{PMIx}. In addition to the general tool support described above, debugger support includes:
\begin{itemize}
\item Co-location, co-spawn, and communication wireup of debugger daemons for scalable launch. This includes providing debugger daemons with endpoint connection information across the daemons themselves.
\item Identification of the job that is to be debugged. This includes automatically providing debugger daemons with the job-level information for their target job.
\end{itemize}
Debuggers can also utilize the options in the \refapi{PMIx_Spawn} \ac{API} to exercise a degree of control over spawned jobs for debugging purposes. For example, a debugger can utilize the environmental parameter attributes of Section \ref{api:struct:attributes:spawn} to request \code{LD_PRELOAD} of a memory interceptor library prior to spawning an application process, or interject a custom fork/exec agent to shepherd the application process.
A key element of the debugging process is the ability of the debugger to require that processes \emph{pause} at some well-defined point, thereby providing the debugger with an opportunity to attach and control execution. The actual implementation of the \emph{pause} lies outside the scope of \ac{PMIx} - it typically requires either the launcher or the application itself to implement the necessary operations. However, \ac{PMIx} does provide several standard attributes by which the debugger can specify the desired attach point:
\begin{itemize}
\item \pasteAttributeItemBegin{PMIX_DEBUG_STOP_ON_EXEC}Launchers that cannot support this operation shall return an error from the \refapi{PMIx_Spawn} \ac{API} if this behavior is requested.
\pasteAttributeItemEnd{}
\item \pasteAttributeItemBegin{PMIX_DEBUG_STOP_IN_INIT}\ac{PMIx} implementations that do not support this operation shall return an error from \refapi{PMIx_Init} if this behavior is requested. Launchers that cannot support this operation shall return an error from the \refapi{PMIx_Spawn} \ac{API} if this behavior is requested.
\pasteAttributeItemEnd{}
\item \pasteAttributeItemBegin{PMIX_DEBUG_STOP_IN_APP}Launchers that cannot support this operation shall return an error from the \refapi{PMIx_Spawn} \ac{API} if this behavior is requested.
Note that there is no mechanism by which the \ac{PMIx} library or the launcher can verify that an application will recognize and support the \refattr{PMIX_DEBUG_STOP_IN_APP} request. Debuggers utilizing this attachment method must, therefore, be prepared to deal with the case where the application fails to recognize and/or honor the request.
\pasteAttributeItemEnd{}
\end{itemize}
If the \ac{PMIx} implementation and/or the host environment support it, debuggers can utilize the \refapi{PMIx_Query_info} \ac{API} with the \refattr{PMIX_QUERY_ATTRIBUTE_SUPPORT} attribute to determine which of these features are available. For example, support can be determined for:
\begin{itemize}
\item \refattr{PMIX_DEBUG_STOP_IN_INIT} by checking \refattr{PMIX_CLIENT_ATTRIBUTES} for the \refapi{PMIx_Init} \ac{API}.
\item \refattr{PMIX_DEBUG_STOP_ON_EXEC} by checking \refattr{PMIX_HOST_ATTRIBUTES} for the \refapi{PMIx_Spawn} \ac{API}.
\end{itemize}
The target namespace or process (as given by the debugger in the spawn request) shall be provided to each daemon in its job-level information via the \refattr{PMIX_DEBUG_TARGET} attribute. Debugger daemons are responsible for self-determining their specific target process(es), and can then utilize the \refapi{PMIx_Query_info} \ac{API} to obtain information about them (see Fig. \ref{fig:dbgptable}) - e.g., to obtain the \acp{PID} of the local processes to which they need to attach. \ac{PMIx} provides the \refstruct{pmix_proc_info_t} structure for organizing information about a process' \ac{PID}, location, and state. Debuggers may request information on a given job at two levels:
\begin{itemize}
\item \pasteAttributeItem{PMIX_QUERY_PROC_TABLE}
\item \pasteAttributeItem{PMIX_QUERY_LOCAL_PROC_TABLE}
\end{itemize}
Note that the information provided in the returned proctable represents a snapshot in time. Any process, regardless of role (tool, client, debugger, etc.), can obtain the proctable of a given namespace so long as it has the system-determined authorizations to do so. The list of namespaces available via a given server can be obtained using the \refapi{PMIx_Query_info} \ac{API} with the \refattr{PMIX_QUERY_NAMESPACES} key.
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/dbgptable.pdf}
\end{center}
\caption{Obtaining proctables}
\label{fig:dbgptable}
\end{figure*}
\endgroup
Debugger daemons can be started in two ways - either at the same time the application is spawned, or separately at a later time.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Co-Location of Debugger Daemons}
\label{chap:api_tools:colocate}
Debugging operations typically require the use of daemons that are located on
the same node as the processes they are attempting to debug. The debugger can,
of course, specify its own mapping method when issuing its spawn request or
utilize its own internal launcher to place the daemons. However, when attaching
to a running job, \ac{PMIx} provides debuggers with a simplified method for
requesting that the launcher associated with the job \emph{co-locate} the
required daemons. Debuggers can request \emph{co-location} of their daemons by
adding the following attributes to the \refapi{PMIx_Spawn} used to spawn them:
\begin{itemize}
\item \refattr{PMIX_DEBUGGER_DAEMONS} - indicating that the launcher is
being asked to spawn debugger daemons.
\item \refattr{PMIX_DEBUG_TARGET} - indicating the job or process that is
to be debugged. This allows the launcher to identify the processes to be
debugged and their location. Note that the debugger job shall be assigned
its own namespace (different from that of the job it is being spawned
to debug) and each daemon will be assigned a unique rank within that
namespace.
\item \refattr{PMIX_DEBUG_DAEMONS_PER_PROC} - specifies the number of
debugger daemons to be co-located per target process.
\item \refattr{PMIX_DEBUG_DAEMONS_PER_NODE} - specifies the number of
debugger daemons to be co-located per node where at least one target
process is executing.
\end{itemize}
Debugger daemons spawned in this manner shall be provided with the typical
\ac{PMIx} information for their own job plus the target they are to debug via
the \refattr{PMIX_DEBUG_TARGET} attribute. The debugger daemons spawned on a
given node are responsible for self-determining their specific target
process(es) - e.g., by referencing their own \refattr{PMIX_LOCAL_RANK} in the
debugger daemon job versus the corresponding \refattr{PMIX_LOCAL_RANK} of the
target processes on the node. Note that the debugger will be attaching to the application processes
at some arbitrary point in the application's execution unless some method for pausing the application
(e.g., by providing a \ac{PMIx} directive at time of launch, or via a tool using the
\refapi{PMIx_Job_control} \ac{API} to direct that the process be paused) has been employed.
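One possible way for a daemon to self-determine its targets from local ranks is sketched below. The round-robin policy, the function name \code{select_targets}, and the assumption that target local ranks on the node are contiguous starting at zero are all illustrative choices and are not mandated by \ac{PMIx}; a debugger is free to use any assignment scheme.

```c
#include <stddef.h>

/* Hypothetical helper (not a PMIx API): choose which target processes on
 * this node a debugger daemon should attach to.
 *
 * daemon_lrank - this daemon's local rank within the debugger daemon job
 * ndaemons     - number of debugger daemons on this node
 * ntargets     - number of target processes on this node; their local
 *                ranks are assumed to be 0 .. ntargets-1
 * targets/max  - output array for the selected target local ranks
 *
 * Targets are dealt out round-robin, so daemon d takes local ranks
 * d, d + ndaemons, d + 2*ndaemons, ...
 * Returns the number of entries written to targets[]. */
size_t select_targets(unsigned daemon_lrank, unsigned ndaemons,
                      unsigned ntargets, unsigned *targets, size_t max)
{
    size_t n = 0;

    if (ndaemons == 0) {
        return 0;
    }
    for (unsigned t = daemon_lrank; t < ntargets && n < max; t += ndaemons) {
        targets[n++] = t;
    }
    return n;
}
```

With one daemon per node this policy assigns every local target to that daemon. With one daemon per target process, each daemon is assigned the single target that shares its local rank.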
\adviceuserstart
Note that the tool calling \refapi{PMIx_Spawn} to request the launch of the debugger daemons is \emph{not} included in the resulting job - i.e., the debugger daemons do not inherit the namespace of the tool. Thus, collective operations and notifications that target the debugger daemon job will not include the tool unless the namespace/rank of the tool is explicitly included.
\adviceuserend
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Co-Spawn of Debugger Daemons}
\label{chap:api_tools:cospawn}
In the case where a job is being spawned under the control of a debugger, \ac{PMIx} provides a shortcut method for spawning the debugger's daemons in parallel with the job. This requires that the debugger be specified as one of the \refstruct{pmix_app_t} in the same spawn command used to start the job. The debugger application must include at least the \refattr{PMIX_DEBUGGER_DAEMONS} attribute identifying itself as a debugger, and may utilize either a mapping option to direct daemon placement, or one of the \refattr{PMIX_DEBUG_DAEMONS_PER_PROC} or \refattr{PMIX_DEBUG_DAEMONS_PER_NODE} directives.
The launcher must not include information regarding the debugger daemons in
the job-level information provided to the other applications in the spawn
request, nor in any calculated rank values (e.g., \refattr{PMIX_NODE_RANK} or
\refattr{PMIX_LOCAL_RANK}) in those applications. The
debugger job is to be assigned its own namespace and each debugger daemon shall
receive a unique rank - i.e., the debugger application is to be treated as a
completely separate \ac{PMIx} job that is simply being started in parallel with
the user's applications. The launcher is free to implement the launch as a
single operation for both the applications and debugger daemons (preferred), or
may stage the launches as required. The launcher shall not return from the
\refapi{PMIx_Spawn} command until all included applications and the debugger
daemons have been started.
Attributes that apply to both the debugger daemons and the application processes can
be specified in the \refarg{job_info} array passed into the
\refapi{PMIx_Spawn} \ac{API}. Attributes that either (a) apply solely to the
debugger daemons or to one of the applications included in the spawn request,
or (b) have values that differ from those provided in the \refarg{job_info}
array, should be specified in the \refarg{info} array in the corresponding
\refstruct{pmix_app_t}.
Note that \ac{PMIx} job \emph{pause} attributes (e.g., \refattr{PMIX_DEBUG_STOP_IN_INIT}) do not apply to applications (defined in \refstruct{pmix_app_t}) where the \refattr{PMIX_DEBUGGER_DAEMONS} attribute is set to \code{true}.
Debugger daemons spawned in this manner shall be provided with the typical
\ac{PMIx} information for their own job plus the target they are to debug via
the \refattr{PMIX_DEBUG_TARGET} attribute. The debugger daemons spawned on a
given node are responsible for self-determining their specific target
process(es) - e.g., by referencing their own \refattr{PMIX_LOCAL_RANK} in the
debugger daemon job versus the corresponding \refattr{PMIX_LOCAL_RANK} of the
target processes on the node.
\adviceuserstart
Note that the tool calling \refapi{PMIx_Spawn} to request the launch of the debugger daemons is \emph{not} included in the resulting job - i.e., the debugger daemons do not inherit the namespace of the tool. Thus, collective operations and notifications that target the debugger daemon job will not include the tool unless the namespace/rank of the tool is explicitly included.
The \refapi{PMIx_Spawn} \ac{API} only supports the return of a single namespace resulting from the spawn request. In the case where the debugger job is co-spawned with the application, the spawn function shall return the namespace of the application and not the debugger job. Tools requiring access to the namespace of the debugger job must query the launcher for the spawned namespaces to find the one belonging to the debugger job.
\adviceuserend
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Debugger Agents}
\label{api:tools:debugger:agent}
Individual debuggers may, depending upon implementation, require varying degrees of control over each application process when it is started beyond those available via directives to \refapi{PMIx_Spawn}. \ac{PMIx} offers two mechanisms to help meet these needs.
The \refattr{PMIX_FORKEXEC_AGENT} attribute allows the debugger to specify an intermediate process (the \ac{FEA}) for spawning the actual application process (see Fig. \ref{fig:dbgfea}), thereby interposing the debugger daemon between the application process and the launcher's daemon. Instead of spawning the application process, the launcher will spawn the \ac{FEA}, which will connect back to the \ac{PMIx} server as a tool to obtain the spawn description of the application process it is to spawn. The \ac{PMIx} server in the launcher's daemon shall not register the fork/exec agent as a local client process, nor shall the launcher include the agent in any of the job-level values (e.g., \refattr{PMIX_RANK} within the job or \refattr{PMIX_LOCAL_RANK} on the node) provided to the application process. The launcher shall treat the collection of \acp{FEA} as a debugger job equivalent to the co-spawn use-case described in Section \ref{chap:api_tools:cospawn}.
\begin{figure*}[ht!]
\centering
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/dbgfea.pdf}
\caption{Fork/exec agent}
\label{fig:dbgfea}
\end{subfigure}%
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/dbgea.pdf}
\caption{Exec agent}
\label{fig:dbgea}
\end{subfigure}
\caption{Intermediate agents}
\label{fig:dbginta}
\end{figure*}
In contrast, the \refattr{PMIX_EXEC_AGENT} attribute (Fig. \ref{fig:dbgea}) allows the debugger to specify an agent that will perform some preparatory actions and then exec the eventual application process to replace itself. In this scenario, the exec agent is provided with the application process' command line as arguments on its command line (e.g., \code{"./agent appargv[0] appargv[1]"}) and does not connect back to the host's \ac{PMIx} server. It is the responsibility of the exec agent to properly separate its own command line arguments (if any) from the application description.
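The argument separation performed by an exec agent might look like the following sketch. \ac{PMIx} does not define how an agent distinguishes its own arguments from the application description; the convention used here (agent options begin with \code{-}, and an optional \code{--} explicitly ends them) and the function names \code{find_app_start} and \code{exec_application} are assumptions made purely for illustration.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return the index in argv[] where the application's
 * own command line begins, or argc if no application was given.
 * Assumed convention: agent options start with '-', and "--" explicitly
 * terminates the agent options. */
int find_app_start(int argc, char *const argv[])
{
    int i = 1;

    while (i < argc && argv[i][0] == '-') {
        if (strcmp(argv[i], "--") == 0) {
            return i + 1;      /* application starts after the separator */
        }
        ++i;                   /* skip an agent option */
    }
    return i;
}

/* Hypothetical helper: perform any preparatory work, then replace the agent
 * with the application. argv[] must be NULL-terminated, as the argv passed
 * to main() is. Returns -1 only if the exec could not be performed. */
int exec_application(int argc, char *const argv[])
{
    int start = find_app_start(argc, argv);

    if (start >= argc) {
        return -1;             /* nothing to exec */
    }
    /* ... preparatory actions would be performed here ... */
    execvp(argv[start], &argv[start]);
    return -1;                 /* execvp returns only on failure */
}
```

An agent's \code{main} would simply pass its \code{argc} and \code{argv} to \code{exec\_application}. Because the agent replaces itself via exec, the application inherits the agent's \ac{PID} and environment.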
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Tracking the job lifecycle}
\label{api:tools:trkjob}
There is a wide range of events a debugger can register to receive, but three
are specifically defined for tracking a job's progress:
\begin{itemize}
\item \refconst{PMIX_EVENT_JOB_START} indicates when the first process in
the job has been spawned.
\item \refconst{PMIX_LAUNCH_COMPLETE} indicates when the last process in
the job has been spawned.
\item \refconst{PMIX_EVENT_JOB_END} indicates that all processes have
terminated.
\end{itemize}
Each event is required to contain at least the namespace of the corresponding
job and a \refattr{PMIX_EVENT_TIMESTAMP} indicating the time the event
occurred. In addition, the \refconst{PMIX_EVENT_JOB_END} event shall contain
the returned status code (\refattr{PMIX_JOB_TERM_STATUS}) for the
corresponding job, plus the identity (\refattr{PMIX_PROCID}) and exit status
(\refattr{PMIX_EXIT_CODE}) of the first failed process, if applicable.
Generation of these events by the launcher can be requested by including the
\refattr{PMIX_NOTIFY_JOB_EVENTS} attribute in the spawn request. Note that
these events can be logged via the \refapi{PMIx_Log} \ac{API} by
including the \refattr{PMIX_LOG_JOB_EVENTS} attribute - this can be done either
in conjunction with generated events, or in place of them.
Alternatively, if the debugger or tool solely wants to be alerted to job
termination, then including the \refattr{PMIX_NOTIFY_COMPLETION} attribute in
the spawn request would suffice. This attribute directs the launcher to provide
just the \refconst{PMIX_EVENT_JOB_END} event. Note that this event can be
logged via the \refapi{PMIx_Log} \ac{API} by including the
\refattr{PMIX_LOG_COMPLETION} attribute - this can be done either in
conjunction with the generated event, or in place of it.
\adviceuserstart
The \ac{PMIx} server is required to cache events in order to avoid race
conditions - e.g., when a tool is trying to register for the
\refconst{PMIX_EVENT_JOB_END} event from a very short-lived job. Accordingly,
registering for job-related events can result in receiving events relating to
jobs other than the one of interest.
Users are therefore advised to specify the job whose events are of interest by
including the \refattr{PMIX_EVENT_AFFECTED_PROC} or
\refattr{PMIX_EVENT_AFFECTED_PROCS} attribute in the \refarg{info} array passed
to the \refapi{PMIx_Register_event_handler} \ac{API}.
\adviceuserend
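As a complement to filtering at registration time, a tool may also choose to check the namespace carried by each event inside its handler. The sketch below models that bookkeeping in isolation. The types and names (\code{job\_tracker}, \code{job\_event\_kind}, \code{tracker\_on\_event}) are hypothetical stand-ins and are not \ac{PMIx} definitions; an actual event handler would extract the namespace from the event's source or \refarg{info} array before applying logic of this kind.

```c
#include <string.h>

/* Stand-in event kinds for illustration only; these are NOT the PMIx
 * constants, which a real handler would receive as its status argument. */
typedef enum { EV_JOB_START, EV_LAUNCH_COMPLETE, EV_JOB_END } job_event_kind;

typedef enum { JOB_UNKNOWN, JOB_STARTING, JOB_RUNNING, JOB_ENDED } job_state;

/* Hypothetical per-job bookkeeping kept by a tool. */
typedef struct {
    char nspace[256];   /* namespace of the job of interest */
    job_state state;    /* furthest lifecycle point observed so far */
    unsigned ignored;   /* events seen for other namespaces */
} job_tracker;

void tracker_init(job_tracker *t, const char *nspace)
{
    strncpy(t->nspace, nspace, sizeof t->nspace - 1);
    t->nspace[sizeof t->nspace - 1] = '\0';
    t->state = JOB_UNKNOWN;
    t->ignored = 0;
}

/* Apply one event. Events for other namespaces are counted and ignored.
 * The state only moves forward, so a cached event delivered late cannot
 * roll the job back. Returns 1 if the event was applied, 0 if ignored. */
int tracker_on_event(job_tracker *t, const char *ev_nspace, job_event_kind kind)
{
    if (strcmp(t->nspace, ev_nspace) != 0) {
        t->ignored++;
        return 0;
    }
    switch (kind) {
    case EV_JOB_START:
        if (t->state < JOB_STARTING) t->state = JOB_STARTING;
        break;
    case EV_LAUNCH_COMPLETE:
        if (t->state < JOB_RUNNING) t->state = JOB_RUNNING;
        break;
    case EV_JOB_END:
        t->state = JOB_ENDED;
        break;
    }
    return 1;
}
```

Keeping the state monotonic is a design choice of this sketch: because events may be cached and delivered after registration, the handler tolerates an out-of-order start notification for a job that has already ended.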