nanoAOD analysis framework
A simple H4l analysis framework based on nanoAODs is being developed in branch Run3 under the "package" ZZAnalysis/NanoAnalysis. This version can be used to run on both Run2 and Run3 nanoAODs. The only caveat is that the current centrally-produced Run2 nanoAODs are made with nanoAODv9, which does not include full information on FSR and on the HZZ electron ID.
Quick pointers:
- The steering file is NanoAnalysis/python/nanoZZ4lAnalysis.py. This specifies which modules are run, the cuts, etc.
- To run local tests, sync etc., an example is provided in ZZAnalysis/NanoAnalysis/test/runLocal.py (run with ./runLocal.py).
- For full production using condor batches, csv files are located in ZZAnalysis/NanoAnalysis/test/prod. Submission works in the same way as for the miniAOD framework (see documentation).
IMPORTANT: contrary to what happens with the miniAOD tree maker, the python executable in the batch scripts depends directly on the .py code in the working area. The .py code should therefore not be modified while a production is running.
Output files are nanoAOD files, skimmed to keep only interesting events (for the signal or control regions), filtered to include only the variables relevant to the analysis, and extended with additional variables for leptons, FSR, ZZ candidates, Z candidates, and the ZLL and ZL control regions. The full documentation of the tree content is available in this auto-generated file.
Details on additional variables in the Events tree:
Signal region candidates (ZZCand block)
Filled in ZZFiller.py.
Each event may contain zero, one, or more SR candidates, depending on how the job is configured. The "best candidate" in the event is the one with index bestCandIdx. A value of -1 means the event has no SR candidate; bestCandIdx should therefore always be checked:

    theZZ = None
    if bestCandIdx >= 0:  # -1 means no SR candidate in this event
        ZZs = Collection(event, 'ZZCand')  # all SR candidates in the event
        theZZ = ZZs[bestCandIdx]
Some extra information is added for the best candidate only, in ZZExtraFiller.py, to be used for categorization:
- ZZCand_nExtraLep: number of extra leptons
- ZZCand_nExtraZ: number of extra Zs

More is to be added (e.g. jets).
Z+LL control region candidates (ZLLCand block)
Also filled in ZZFiller.py.
ZLL CR candidates for all CRs (SS, 2P2F, 3P1F, SIP) are stored together in this block, and only for events with no SR candidate.
At most one candidate per CR is stored. The block can therefore contain between zero and four candidates. The candidate to be used for each specific CR is given by the variables:
* ZLLbestSSIdx: index of the SS CR candidate in the ZLL block (-1 if none)
* ZLLbest2P2FIdx: ditto, for the 2P2F CR
* ZLLbest3P1FIdx: ditto, for the 3P1F CR
* ZLLbestSIPCRIdx: ditto, for the SIP method CR
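The index lookup described above can be sketched as follows. This is only an illustration: it assumes the event is exposed as a plain dictionary with the ZLL candidates as a list, and both the helper `pick_cr_candidate` and the toy event layout are hypothetical, not part of the framework.

```python
# Hypothetical helper, not part of ZZAnalysis/NanoAnalysis.
# Maps each control region to the branch holding its candidate index.
CR_INDEX_BRANCH = {
    "SS": "ZLLbestSSIdx",
    "2P2F": "ZLLbest2P2FIdx",
    "3P1F": "ZLLbest3P1FIdx",
    "SIP": "ZLLbestSIPCRIdx",
}


def pick_cr_candidate(event, cr):
    """Return the ZLL candidate for control region `cr`, or None if absent."""
    idx = event[CR_INDEX_BRANCH[cr]]
    if idx < 0:  # -1 means no candidate for this CR
        return None
    return event["ZLLCand"][idx]


# Toy event with two stored candidates (assumed layout, for illustration only).
toy_event = {
    "ZLLCand": [{"mass": 187.2}, {"mass": 243.9}],
    "ZLLbestSSIdx": -1,
    "ZLLbest2P2FIdx": 0,
    "ZLLbest3P1FIdx": 1,
    "ZLLbestSIPCRIdx": -1,
}
```

With real nanoAOD events the same pattern would presumably use the Collection accessor shown for the ZZCand block instead of dictionary lookups.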
Z candidates passing full selection (ZCand block)
Also filled in ZZFiller.py.
This block includes Z candidates passing the SR selection requirements (full lepton selection, mass cuts). It is stored so that other collections can refer to Zs in the event (eg: Z+L collection, extra Zs in categorization).
The variable bestZIdx stores the index of the "best Z" in the event, defined as the one with mass closest to m_Z.
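The "closest to m_Z" criterion can be illustrated with a short sketch. The function `best_z_index`, the value used for `M_Z`, and the convention of returning -1 for an empty list are assumptions made for this example; the actual implementation in ZZFiller.py may differ.

```python
# Illustrative sketch only; not the code used in ZZFiller.py.
M_Z = 91.1876  # GeV, nominal Z mass (assumed value; the framework constant may differ)


def best_z_index(z_masses):
    """Return the index of the mass closest to M_Z, or -1 for an empty list."""
    if not z_masses:
        return -1  # assumed convention, mirroring the -1 used by other indices
    return min(range(len(z_masses)), key=lambda i: abs(z_masses[i] - M_Z))
```

For example, `best_z_index([85.0, 92.3, 60.1])` picks the second entry, since 92.3 GeV is the value nearest to the nominal mass.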
Z+L control region candidates (ZLCand block)
Also filled in ZZFiller.py.
Z+L candidates are made in events with a Z passing the full selection plus exactly one lepton passing the relaxed selection. By construction, an event with a Z+L candidate can therefore not contain SR or ZLL CR candidates. For this CR:
- ZLCand_lepIdx is the index of the extra lepton in the merged Electron+Muon collection. A value of -1 means that no Z+L candidate is present in the event.
- The Z of the ZL candidate is always the best Z in the event, i.e. ZCand[bestZIdx].
Additional lepton variables are filled in lepFiller.py:
- [Muon|Electron]_ZZFullSel: lepton passes the full selection criteria (including isolation)
- [Muon|Electron]_ZZFullSelNoSIP: lepton passes the full selection criteria (including isolation), except for SIP; to be used for the SIP CR
- [Muon|Electron]_ZZFullID: lepton passes all ID criteria (excluding isolation)
- [Muon|Electron]_ZZRelaxedId: lepton passes the relaxed ID criteria; to be used for the 3P1F, 2P2F and SS CRs
- [Muon|Electron]_ZZRelaxedIdNoSIP: same as above, but with no SIP cut. This is the minimum subset of requirements for all CRs.
- [Muon|Electron]_passIso: lepton passes isolation. This is currently always true for electrons, since isolation is included in the electron BDT.
- [Muon|Electron]_pfRelIso03FsrCorr: isolation value
- [Muon|Electron]_fsrPhotonIdx: index of the associated photon in the FSR collection (overriding the original association in the input nanoAODs)
Trigger information is filled in triggerAndSkim.py. We store information on passed paths, taking into account precedence rules for data PDs based on triggers (to avoid double-counting of the same events in different PDs):
- HLT_passZZ4l: the event passes the trigger and the PD precedence veto rules
- HLT_passZZ4lEle, HLT_passZZ4lMu, HLT_passZZ4lMuEle: the event passes the ele, mu, or MuEle triggers
MC truth information (GenZZ block)
Filled in mcTruthAnalyzer.py:
- GenZZ_FinalState: product of the pdgIds of the four leptons from ZZ
- GenZZ_mass: mass of the 4l system (pre-FSR)
- GenZZ_Z1l1Idx etc.: indices of the four gen leptons in the GenPart collection (pre-FSR)
- FsrPhoton_genFsrIdx: index of the GenPart matched to photons in the FsrPhoton collection (closest gen FSR from Z->l)
- To be added: associated leptons, bosons...
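Since GenZZ_FinalState is a product of pdgIds, it can be decoded back into a final-state label. The sketch below assumes the standard pdgId values (11 for electrons, 13 for muons) and two opposite-sign lepton pairs, so that the product is positive; the function `decode_final_state` and the "other" fallback label are hypothetical.

```python
# Illustrative decoder; the function name and labels are not part of the framework.
# Each opposite-sign pair contributes -(pdgId**2), so two pairs give a positive product.
FINAL_STATE_LABELS = {
    11**4: "4e",             # 14641
    13**4: "4mu",            # 28561
    11**2 * 13**2: "2e2mu",  # 20449
}


def decode_final_state(final_state):
    """Map the product of the four lepton pdgIds to a final-state label."""
    return FINAL_STATE_LABELS.get(final_state, "other")
```

Any value not listed (for instance final states with taus, or a default value for events without a gen-level ZZ) falls into "other" in this sketch.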
Gen-level information for fiducial measurements (FidZ, FidZZ and FidDressedLeps blocks)
Filled in genFiller.py:
- FidZZ_*: gen-level ZZ candidate for the fiducial selection
- FidZ_*: gen-level Z candidates for the fiducial selection
- FidDressedLeps_*: dressed leptons for the fiducial selection
- passedFiducial: true for events that pass the fiducial selection
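One possible use of passedFiducial is a weighted acceptance estimate, sketched below. This is an illustration under stated assumptions, not the procedure used in the analysis: events are represented as plain dictionaries, Generator_weight is taken as the only weight, and the helper `fiducial_acceptance` is hypothetical.

```python
# Hypothetical helper, for illustration only.
def fiducial_acceptance(events):
    """Weighted fraction of events with passedFiducial set to True."""
    total = sum(ev["Generator_weight"] for ev in events)
    if total == 0:
        raise ValueError("sum of weights is zero; acceptance is undefined")
    passed = sum(ev["Generator_weight"] for ev in events if ev["passedFiducial"])
    return passed / total


# Toy sample (made-up numbers).
toy_events = [
    {"Generator_weight": 1.0, "passedFiducial": True},
    {"Generator_weight": 1.0, "passedFiducial": False},
    {"Generator_weight": 2.0, "passedFiducial": True},
]
```

For a real measurement the denominator has to include all generated events, not only those surviving the skim, so the sum of weights stored in the Runs tree would be the natural choice there.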
Event weights are filled in weightFiller.py:
- overallEventWeight: the product of all relevant weights, except for the data/MC efficiency correction (computed from lepton SFs), which depends on the candidate being considered. It is defined as overallEventWeight = XS * Generator_weight * puWeight * K, where:
  - XS is propagated from the XS*BR value in the csv file (note: this is not stored as a separate variable in the tree)
  - Generator_weight is the event weight from the generator, present in the input nanoAOD
  - puWeight is the PU reweighting weight, computed in puWeightProducer.py
  - K is the product of the following variables, which are stored in the tree for debugging purposes if relevant for the specific sample being processed (controlled by options in the csv file):
    - KFactor_QCD_qqZZ_M_Weight: QCD K-factor, for qqZZ only
    - KFactor_QCD_ggZZ_Nominal_Weight: QCD K-factor, for ggZZ only
    - KFactor_EW_qqZZ_Weight: EW K-factor, for qqZZ only. NOTE: the implementation of this is currently missing, so this is currently 1.
    - ggH_NNLOPS_Weight: reweighting of ggH for njets and pT

Note that overallEventWeight should be multiplied by the data/MC efficiency for the chosen candidate, i.e. ZZCand_dataMCWeight[bestCandIdx].
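The weight arithmetic above can be written out as a short sketch. The two functions are hypothetical helpers, not the code in weightFiller.py; they only restate the formula, with the K-factors passed as a list and None returned when the event has no SR candidate (an assumed convention).

```python
# Illustrative restatement of the weight formula; not the framework code.
def overall_event_weight(xs, generator_weight, pu_weight, k_factors=()):
    """XS * Generator_weight * puWeight * K, with K the product of k_factors."""
    k = 1.0
    for factor in k_factors:  # an empty sequence leaves K = 1
        k *= factor
    return xs * generator_weight * pu_weight * k


def candidate_weight(overall_weight, zz_data_mc_weights, best_cand_idx):
    """Apply the data/MC efficiency correction of the best candidate.

    Returns None when best_cand_idx is negative (no SR candidate).
    """
    if best_cand_idx < 0:
        return None
    return overall_weight * zz_data_mc_weights[best_cand_idx]
```

The result is still not normalized: it has to be divided by the sum of weights from the Runs tree (and scaled to the target luminosity) before filling histograms.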
The sum of weights for all events is stored in the Runs tree in the nanoAOD file. A helper to extract the normalization is available here.
Cf usage example.
For testing and development, it may be necessary to privately produce nanoAODs with additions/changes. Instructions are available here.
Specifically, for nano v12 MC (current version of our datasets):

    setenv GT 130X_mcRun3_2022_realistic_v5
    setenv ERA Run3
    cmsDriver.py NANO -s NANO --mc --conditions ${GT} --era ${ERA} --eventcontent NANOAODSIM --datatier NANOAODSIM --customise_commands="process.add_(cms.Service('InitRootHandlers', EnableIMT = cms.untracked.bool(False)));process.MessageLogger.cerr.FwkReport.reportEvery=1000" -n -1 --no_exec

This produces a configuration NANO_NANO.py that can be edited to set the input sample and run with cmsRun.