[HGCal] TICL v3 major upgrade #31906
Conversation
The code-checks are being triggered in jenkins.
enable profiling
@cmsbuild please test
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31906/19329
The tests are being triggered in jenkins.
A new Pull Request was created by @felicepantaleo (Felice Pantaleo) for master. It involves the following packages: DataFormats/HGCalReco. @perrotta, @andrius-k, @kmaeshima, @ErnestaP, @kpedro88, @cmsbuild, @jfernan2, @fioriNTU, @slava77, @jpata can you please review it and eventually sign? Thanks. cms-bot commands are listed here.
Thanks @felicepantaleo.
+1
Comparison job queued.
This review includes some suggestions to reduce CPU usage and code duplication, as well as a few other minor points. The performance profile from the PR test may provide more direction re: reducing CPU usage (if necessary).
@@ -44,6 +52,9 @@ PatternRecognitionbyCA<TILES>::PatternRecognitionbyCA(const edm::ParameterSet &c
      << "PatternRecognitionbyCA received an empty graph definition from the global cache";
}
eidSession_ = tensorflow::createSession(trackstersCache->eidGraphDef);
if (max_missing_layers_in_trackster_ < 100) {
100 seems like a magic number here - how is it obtained?
Also, this could be done in the constructor initializer list:
check_missing_layers(max_missing_layers_in_trackster_ < 100),
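To illustrate the suggestion, here is a minimal standalone sketch (the class and member names mirror the PR but are simplified, and the wrapping class is hypothetical): the boolean flag is derived from another member directly in the constructor initializer list rather than assigned in the body.

```cpp
#include <cassert>

// Simplified sketch, not the actual PatternRecognitionbyCA class.
class PatternRecognitionSketch {
public:
  explicit PatternRecognitionSketch(int maxMissingLayers)
      : max_missing_layers_in_trackster_(maxMissingLayers),
        // derived flag initialized in the initializer list,
        // reading the member initialized just above
        check_missing_layers_(max_missing_layers_in_trackster_ < 100) {}

  bool checkMissingLayers() const { return check_missing_layers_; }

private:
  // Members are initialized in declaration order, so the threshold
  // must be declared before the flag that depends on it.
  const int max_missing_layers_in_trackster_;
  const bool check_missing_layers_;
};
```

Note that initializer-list order must follow declaration order here, since `check_missing_layers_` reads `max_missing_layers_in_trackster_`.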
@@ -137,29 +155,105 @@ void PatternRecognitionbyCA<TILES>::makeTracksters(
      << input.layerClusters[outerCluster].z() << " " << tracksterId << std::endl;
    }
  }
unsigned showerMinLayerId = 99999;
std::vector<unsigned int> layerIds;
it appears this variable is never used
  lcIdAndLayer.emplace_back(i, layerId);
}
std::sort(uniqueLayerIds.begin(), uniqueLayerIds.end());
uniqueLayerIds.erase(std::unique(uniqueLayerIds.begin(), uniqueLayerIds.end()), uniqueLayerIds.end());
How fast is this procedure (push_back, sort, erase unique) compared to inserting into std::set, for typical occupancies and numbers of duplicates?
The fastest is to build a heap while pushing and then do the erase(unique()).
std::set is by definition slower than a vector, as it allocates a node on the fly for each insertion (no reserve).
For a very small number of elements, the vector is the fastest option. We are talking about 20-30 elements here.
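The dedup idiom under discussion can be isolated as a small self-contained helper (a sketch, not the PR code; the function name is made up):

```cpp
#include <algorithm>
#include <vector>

// Collect ids with push_back/emplace_back, then sort and erase duplicates.
// For the ~20-30 elements typical here, the contiguous vector tends to beat
// std::set, which pays a per-element node allocation.
std::vector<unsigned int> uniqueSorted(std::vector<unsigned int> ids) {
  std::sort(ids.begin(), ids.end());
  ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
  return ids;
}
```

As a bonus, the result comes out sorted, which the later "missing layers" walk relies on.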
int numberOfMissingLayers = 0;
unsigned int j = showerMinLayerId;
unsigned int indexInVec = 0;
for (auto &layer : uniqueLayerIds) {
const auto&
  }
}

bool selected =
temporary is unnecessary
tmpCandidate.setRawEnergy(energy);
math::XYZTLorentzVector p4(track.momentum().x(), track.momentum().y(), track.momentum().z(), energy);
tmpCandidate.setP4(p4);
resultCandidates->push_back(tmpCandidate);
The copy could be avoided by using emplace_back() and then back().
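The suggested pattern, sketched with a toy candidate type (the struct and function here are illustrative stand-ins, not the TICL classes):

```cpp
#include <string>
#include <vector>

// Toy stand-in for the candidate type.
struct Candidate {
  double rawEnergy = 0.;
  std::string label;
};

// Instead of filling a local temporary and pushing a copy, construct the
// element in place and then modify it through back().
void appendCandidate(std::vector<Candidate>& result, double energy) {
  result.emplace_back();        // default-construct directly in the vector
  auto& cand = result.back();   // reference to the in-place element
  cand.rawEnergy = energy;      // no temporary, no copy
  cand.label = "ticlCandidate";
}
```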
cPOnLayer[h.clusterId][lcLayerId].layerClusterIdToEnergyAndScore[mclId].second = FLT_MAX;
//cpsInMultiCluster[multicluster][CPids]
//Connects a multi cluster with all related caloparticles.
cpsInMultiCluster[mclId].emplace_back(std::make_pair<int, float>(h.clusterId, FLT_MAX));
make_pair is unnecessary with emplace_back.
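For illustration (a sketch with made-up data, not the PR code): emplace_back forwards its arguments straight to the pair constructor, so wrapping them in std::make_pair only builds an extra temporary that is then moved.

```cpp
#include <utility>
#include <vector>

std::vector<std::pair<int, float>> fillScores() {
  std::vector<std::pair<int, float>> scores;
  // pair<int, float> constructed in place from the forwarded arguments:
  scores.emplace_back(7, 1.5f);
  // equivalent, but goes through an extra temporary pair:
  // scores.emplace_back(std::make_pair<int, float>(7, 1.5f));
  return scores;
}
```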
occurrencesCPinMCL[c]++;
//Loop through all rechits to count how many of them are noise and how many are matched.
//In case of a matched rechit-simhit, it counts and saves the number of rechits related to the maximum energy CaloParticle.
for (auto& c : hitsToCaloParticleId) {
auto (take primitive types by value)
maxCPNumberOfHitsInMCL = c.second;
//Below, from all maximum energy CaloParticles, it saves the one with the largest number of related rechits.
for (auto& c : occurrencesCPinMCL) {
could use structured binding e.g. auto [id, nhits] : occurrencesCPinMCL
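A standalone sketch of the structured-binding suggestion (the map name mirrors the PR, but the contents and the helper function are made up; requires C++17):

```cpp
#include <map>

// Iterate a map of CaloParticle id -> hit count without spelling out
// .first/.second, using a C++17 structured binding.
int cpsAboveHitThreshold(const std::map<int, int>& occurrencesCPinMCL,
                         int threshold) {
  int n = 0;
  for (const auto& [id, nhits] : occurrencesCPinMCL) {
    (void)id;  // id unused in this toy example
    if (nhits > threshold) {
      ++n;
    }
  }
  return n;
}
```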
for (unsigned int j = 0; j < layers * 2; ++j) {
  totalCPEnergyFromLayerCP = totalCPEnergyFromLayerCP + cPOnLayer[maxCPId_byEnergy][j].energy;
//Find the CaloParticle that has the maximum energy shared with the multicluster under study.
for (auto& c : CPEnergyInMCL) {
could use structured binding
Comparison is ready. Comparison Summary:
One minor correction: the HLT-JME plots linked above by Felice include updates up to 4da60d8 (from the
Thank you @missirol. There were several updates and bug fixes applied to this PR, and I think some final check and validation should be allowed based on the very final version before merging it. Also, the plots in the PR description seem to correspond to a supposed 9b855f6 commit, which I am not able to retrieve either here or in the backport PR. I expect that the authors and the HGCal team can provide some kind of green light based on it. The same for the performance, which should be re-computed with the very final version (by the way, has this PR settled down by now?)
Thanks @missirol
Testing with just 20 events from the wf 23234.0 (TTbar with 2026 D49 geometry), the overall event size reductions are:
@slava77 the new products created in output by this PR are the following:
Trying to summarize the discussion that happened yesterday in this thread: please let me know whether you intend to provide updated validations and comparisons, and if so when they can be ready, so that the review can get finalized here.
@perrotta no further validation will be produced in this PR. I think you can proceed with the finalization of the review.
+upgrade |
min_clusters_per_ntuplet_(conf.getParameter<int>("min_clusters_per_ntuplet")),
skip_layers_(conf.getParameter<int>("skip_layers")),
max_missing_layers_in_trackster_(conf.getParameter<int>("max_missing_layers_in_trackster")),
check_missing_layers_(max_missing_layers_in_trackster_ < 100),
I would have found it more intuitive to use a negative number to identify the max number which corresponds to "do not check". But ok, the behaviour doesn't change anyhow.
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
As expected, we see changes in the Phase-2 workflows.
Do you know why we see many more differences in
Another question: are these (small) differences in the tau ID expected?
I'm beginning to suspect that this workflow is broken in some way
I'm also suspicious that something funny is happening with
+1
auto thisPt = tracksterTotalRawPt + trackstersMergedHandle->at(otherTracksterIdx).raw_pt() - t.raw_pt();
closestTrackster = std::abs(thisPt - track.pt()) < minPtDiff ? otherTracksterIdx : closestTrackster;
}
tracksterTotalRawPt += trackstersMergedHandle->at(closestTrackster).raw_pt() - t.raw_pt();
@felicepantaleo this line (also pointed out by the static analyzer) escaped my review of the PR: this increment is completely useless here. Either the line can/should be removed, or it was originally intended to do something different and it has to be fixed then. Please check and provide the fix.
Thanks @perrotta, that line can be erased safely. I will make a PR now.
reco monitoring now covers these (sorry, it took a while for me to get to updating the script).
PR description:
This PR updates the TICL reconstruction to make it more robust against pile-up.
In particular the following actions have been taken:
PR validation:
This is the electron reconstruction with PU200 before this PR:
This is after the PR:
Single Particle noPU
All samples with 6 energy steps = 10, 20, 50, 100, 200, 300 GeV, eta = 1.8 (HGCAL center)
More info
you can find the latest physics results with this pull request described in the talks by HGCAL and Jets/MET at the HLT Upgrade workshop:
https://indico.cern.ch/event/962025/#4-hgcal
https://indico.cern.ch/event/962025/#6-jetsmet
Timing Report
Timing with PR:
https://fpantale.web.cern.ch/fpantale/circles/web/piechart.html?local=false&dataset=TICLv3_PR31906&resource=time_thread&colours=default&groups=reco_PhaseII&threshold=0