Support arbitrary number of LASes and auto-discover flash list to LAS mapping #76
Comments
There are 3 flashlist names that are used across different LASes: diskInfo, hostInfo and jobcontrol.
(*) Chosen by manual configuration. In this case only one LAS URL (kvm-s3562-1-ip151-95.cms:9945) introduces the collision, but this is a problem for which we need a general solution. In the old DAQView I see that this issue is not addressed: the flashlist catalog is converted to a map keyed by flashlist name, on the assumption that the flashlist name is unique. This is not the case with the set of LAS URLs we are using now. How do we handle the general problem of name collisions, which may occur in the future, e.g. through misconfiguration?
I've implemented solution (b), as I have to have the next version ready soon: the mapping of flashlists to LASes is going to be changed today. The order of the LAS URLs in the configuration determines which ones are ignored. The first LAS in the configuration that serves a given flashlist is used, and any later LAS serving the same flashlist name is ignored.
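For illustration, the "first match wins" rule could look roughly like the sketch below. It is not the DAQAggregator code: the function name is hypothetical, the catalogs are assumed to have been fetched already, and the catalog contents are made up for the example (only the host names come from this thread).

```python
from typing import Dict, List


def map_flashlists_first_match(
    las_urls: List[str], catalogs: Dict[str, List[str]]
) -> Dict[str, str]:
    """Map each flashlist name to the first configured LAS that serves it.

    las_urls: LAS URLs in configuration order.
    catalogs: hypothetical pre-fetched catalogs, LAS URL -> flashlist names.
    """
    mapping: Dict[str, str] = {}
    for url in las_urls:
        for name in catalogs.get(url, []):
            # setdefault keeps the LAS seen first; later matches are ignored.
            mapping.setdefault(name, url)
    return mapping


# Illustrative input only: the catalog contents are invented.
las_urls = [
    "http://ucsrv-c2e41-14-01.cms:9942",
    "http://kvm-s3562-1-ip151-95.cms:9945",
]
catalogs = {
    "http://ucsrv-c2e41-14-01.cms:9942": ["diskInfo", "hostInfo", "jobcontrol"],
    "http://kvm-s3562-1-ip151-95.cms:9945": ["diskInfo", "tcds_cpm_rates"],
}
mapping = map_flashlists_first_match(las_urls, catalogs)
```

With this input, diskInfo resolves to the first URL in the list, and the second LAS is used only for flashlists that no earlier LAS serves.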
Hi Maciej,
In these cases you need to use both LASes and merge the rows. An error should be flagged only if two LASes contain rows for the same key (i.e. would be mapped to the same item in the model).
The particular case arises because DAQ and TCDS are in two different monitoring setups. Jobcontrol, diskinfo etc. exist for both setups and end up in different LASes. Entries in both LASes concern exclusive sets of hosts.
Cheers,
Hannes.
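The merge described above could be sketched as follows. This is only an illustration under stated assumptions: rows are represented as plain dictionaries, the key column `hostname` and the LAS labels are hypothetical, and the real model mapping in DAQAggregator may identify rows differently.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, str]


class DuplicateKeyError(Exception):
    """Raised when a row key has already been seen in the merge."""


def merge_rows(
    rows_per_las: Dict[str, Iterable[Row]], key_columns: Tuple[str, ...]
) -> List[Row]:
    """Merge the rows of one flashlist retrieved from several LASes."""
    merged: Dict[Tuple[str, ...], Row] = {}
    origin: Dict[Tuple[str, ...], str] = {}
    for las_url, rows in rows_per_las.items():
        for row in rows:
            key = tuple(row[column] for column in key_columns)
            if key in merged:
                # Same key twice would map to the same item in the model.
                raise DuplicateKeyError(
                    f"key {key} from {las_url} already seen in {origin[key]}"
                )
            merged[key] = row
            origin[key] = las_url
    return list(merged.values())


# Hypothetical rows: the two LASes cover exclusive sets of hosts.
rows_per_las = {
    "las-daq": [{"hostname": "host-a", "state": "ok"}],
    "las-tcds": [{"hostname": "host-b", "state": "ok"}],
}
merged = merge_rows(rows_per_las, ("hostname",))
```

As long as the LASes serve exclusive sets of hosts, the merge succeeds; an overlap raises `DuplicateKeyError` instead of silently overwriting a row.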
…On 19 Apr 2017, at 17:31, Maciej Gladki ***@***.***> wrote:
There are 3 flashlist names that are used across different LASes:
diskInfo:
• http://kvm-s3562-1-ip151-95.cms:9945/urn:xdaq-application:service=xmaslas2g
• http://ucsrv-c2e41-14-01.cms:9942/urn:xdaq-application:service=xmaslas2g
hostInfo:
• http://kvm-s3562-1-ip151-95.cms:9945/urn:xdaq-application:service=xmaslas2g
• http://ucsrv-c2e41-14-01.cms:9942/urn:xdaq-application:service=xmaslas2g
jobcontrol:
• http://kvm-s3562-1-ip151-95.cms:9945/urn:xdaq-application:service=xmaslas2g
• http://ucsrv-c2e41-14-01.cms:9942/urn:xdaq-application:service=xmaslas2g
How do we handle this?
• Which LAS URL should I use for diskInfo, hostInfo and jobcontrol?
• How do we handle the general problem of name collisions, which may occur in the future, e.g. through misconfiguration?
a) halt DAQAggregator and report a fatal problem
b) choose one (which?) and ignore the rest
c) other suggestions
Handling missing flashlists
Hannes:
Missing TCDS flash lists are a completely normal situation.
But any other flash list may also be missing at some point in time. For example, whenever the DAQ gets destroyed, all DAQ-related flash lists (i.e. not Level0 and not TCDS) will disappear for a while.
We need to review when the flash list mapping should be detected. For this we need to know how long the flash list discovery takes. If this time is negligible, we can run discovery at every retrieve. Otherwise we should devise a scheme in which we re-run discovery only for flash lists that have not yet been discovered, and maybe not on every iteration.
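One possible shape for such a scheme is sketched below, assuming discovery is too slow to run on every retrieve. The class, the `fetch_catalogs` callback and the `retry_every` parameter are all hypothetical names introduced for the example; the sketch only covers flash lists that have not been discovered yet, not ones that disappear after having been found.

```python
from typing import Callable, Dict, Iterable, Set


class FlashlistDiscovery:
    """Re-run discovery only while flash lists are missing, every N cycles."""

    def __init__(
        self,
        expected: Iterable[str],
        fetch_catalogs: Callable[[], Dict[str, Iterable[str]]],
        retry_every: int = 10,
    ) -> None:
        self.expected: Set[str] = set(expected)
        self.fetch_catalogs = fetch_catalogs  # returns LAS URL -> flash list names
        self.retry_every = retry_every
        self.mapping: Dict[str, str] = {}
        self.iteration = 0

    def missing(self) -> Set[str]:
        return self.expected.difference(self.mapping)

    def on_retrieve(self) -> Set[str]:
        """Call once per retrieve cycle; returns the still-missing flash lists."""
        due = self.iteration % self.retry_every == 0
        self.iteration += 1
        if due and self.missing():
            for las_url, names in self.fetch_catalogs().items():
                for name in names:
                    if name in self.expected:
                        # Keep an existing mapping; only fill in missing ones.
                        self.mapping.setdefault(name, las_url)
        return self.missing()
```

The set returned by `on_retrieve` could also be written to the log to signal that the aggregator is running without some flash lists.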
…On 28 Apr 2017, at 09:12, Maciej Gladki ***@***.***> wrote:
Handling missing flashlists
Hannes:
The aggregator should be able to tolerate missing flash lists. Especially in the case of TCDS flash lists.
• Which flashlists may be missing?
• EVM
• FMMInput
• FMMInputDetail
• FMMStatus
• RU
• ferolConfiguration
• ferolInputStream
• ferolMonitoring
• ferolStatus
• ferolTcpStream
• frlMonitoring
• hostInfo
• levelZeroFM_dynamic
• levelZeroFM_static
• levelZeroFM_subsys
• jobcontrol
• diskInfo
• tcds_cpm_counts
• tcds_cpm_deadtimes
• tcds_cpm_rates
• tcds_pm_action_counts
• tcds_pm_tts_channel
• ferol40Configuration
• ferol40InputStream
• ferol40Status
• ferol40StreamConfiguration
• ferol40TcpStream
• tcdsFM
• How does the Aggregator indicate that it is running in degraded mode (some flashlists could not be found in the given LASes)? An entry in the log? Something else?
DAQAggregator needs to be able to support any number of LASes in the configuration.
At the startup of the application, DAQAggregator should query all LASes using the retrieveCatalog service in order to determine on which LAS each flash list is served.