Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tailcut finder function and script #1325

Open
wants to merge 39 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
9f698dd
Script to "compute" tailcuts and NSB level to be added to the MC
moralejo Dec 13, 2024
7a707e3
Minor fix
moralejo Dec 13, 2024
afd9ba6
mean => median
moralejo Dec 13, 2024
47115f0
cast to int
moralejo Dec 13, 2024
2336cdb
Changed arguments
moralejo Dec 16, 2024
be28773
Minor fix
moralejo Dec 16, 2024
1c8aa49
Path to str in glob argument
moralejo Dec 16, 2024
fcaad2f
Simplification on the selection of the subsample of subruns
moralejo Dec 16, 2024
c359bf6
Fix in subrun selection
moralejo Dec 16, 2024
0733892
Refactoring
moralejo Dec 16, 2024
ee6f29f
set logging.INFO
moralejo Dec 16, 2024
fb1ca4d
Additional NSB computation
moralejo Dec 16, 2024
ef812ce
Better log message
moralejo Dec 16, 2024
17a15c5
Outlier detection
moralejo Dec 16, 2024
2a677b4
Changed outlier detection
moralejo Dec 16, 2024
16bffee
Improved log message
moralejo Dec 16, 2024
c115c11
Better log message
moralejo Dec 16, 2024
9269f2c
Added the median pedestal charge among the returns of find_tailcuts
moralejo Dec 17, 2024
3fd7b3a
string formatting
moralejo Dec 17, 2024
cf54e81
New parameters for function to obtain (from pedestal charges) the add…
moralejo Dec 17, 2024
794f992
Removed from lstchain_dvr_pixselector.py the creation of a json file …
moralejo Dec 18, 2024
353fb18
Remove unused imports
moralejo Dec 18, 2024
b78758c
Check possible wrong input
moralejo Dec 18, 2024
9d88a86
Check for number of valid pixels
moralejo Dec 18, 2024
ac61081
Check for number of pedestals in subrun
moralejo Dec 18, 2024
c8c47de
missing continue
moralejo Dec 18, 2024
2913357
Avoid useless warning from rare empty pixels
moralejo Dec 18, 2024
7486271
Modified limits for cleaning level changes. The former ones referred …
moralejo Dec 19, 2024
c02bf8e
Improved comment
moralejo Dec 19, 2024
45a0cfa
Fixed buggy computation of outlier pixels. Previous one was not removing
moralejo Dec 20, 2024
8f80135
Updated parameters of additional_nsb vs. median_qped parametrization
moralejo Dec 20, 2024
3f0749e
Replaced median of pixel charge in pedestal events by the 95% quantil…
moralejo Dec 20, 2024
9f344d4
Automatic log file name if not provided
moralejo Dec 22, 2024
66add8b
Better treatment of pathological runs
moralejo Jan 15, 2025
57725a7
Update lstchain_find_tailcuts.py
moralejo Jan 15, 2025
684d2ea
Report fraction of unreliable pixels
moralejo Jan 15, 2025
9425c83
Remove spaces from logging message
moralejo Jan 15, 2025
9259b25
Include run number in message
moralejo Jan 15, 2025
69cef5c
Fix log message
moralejo Jan 15, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
241 changes: 240 additions & 1 deletion lstchain/image/cleaning.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,20 @@
import glob
import logging
import numpy as np
from pathlib import Path
from lstchain.paths import run_to_dl1_filename
from lstchain.io.io import dl1_params_lstcam_key, dl1_images_lstcam_key
from lstchain.io.io import dl1_params_tel_mon_cal_key
from lstchain.io.config import get_standard_config
from ctapipe.containers import EventType
from scipy.stats import median_abs_deviation

__all__ = ['apply_dynamic_cleaning']
from ctapipe.io import read_table
__all__ = ['apply_dynamic_cleaning',
'find_tailcuts',
'pic_th']

log = logging.getLogger(__name__)


def apply_dynamic_cleaning(image, signal_pixels, threshold, fraction):
Expand Down Expand Up @@ -36,3 +50,228 @@
mask_dynamic_cleaning = (image >= dynamic_threshold) & signal_pixels

return mask_dynamic_cleaning


def find_tailcuts(input_dir, run_number):
"""
This function uses DL1 files to determine tailcuts which are adequate
for the bulk of the pixels in a given run. The script also returns the
suggested NSB adjustment needed in the "dark-sky" MC to match the data.
The function uses the median (for the whole camera, excluding outliers)
of the 95% quantile of the pixel charge for pedestal events to deduce the
NSB level. It is good to use a large quantile of the pixel charge
distribution (vs. e.g. using the median) because what matters for having
a relaistic noise simulation is the tail on the right-side, i.e. for
large pulses.
For reasons of stability & simplicity of analysis, we cannot decide the
cleaning levels (or the NSB tuning) on a subrun-by-subrun basis. We select
values which are more or less valid for the whole run.

The script will process a subset of the subruns (~10, hardcoded) of the run,
distributed uniformly through it.

Parameters
----------
input_dir: `Path`
directory where the DL1 files (subrun-wise, i,e, including
DL1a) are stored

run_number: ìnt`
run number to be processed

Returns
-------
additional_nsb_rate: p.e./ns rate of NSB to be added to "dark MC" to
match the noise in the data
newconfig (dict): cleaning configuration for running the DL1ab stage
"""

log.setLevel(logging.INFO)

Check warning on line 89 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L89

Added line #L89 was not covered by tests

# subrun-wise dl1 file names:
dl1_filenames = Path(input_dir,

Check warning on line 92 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L92

Added line #L92 was not covered by tests
run_to_dl1_filename(1, run_number, 0).replace(
'.0000.h5', '.????.h5'))
all_dl1_files = glob.glob(str(dl1_filenames))
all_dl1_files.sort()

Check warning on line 96 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L95-L96

Added lines #L95 - L96 were not covered by tests

# Number of median absolute deviations (mad) away from the median that a
# value has to be to be considered an outlier:
mad_max = 5 # would exclude 8e-4 of the pdf for a gaussian

Check warning on line 100 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L100

Added line #L100 was not covered by tests

# Minimum number of interleaved pedestals in subrun to proceed with
# calculation:
min_number_of_ped_events = 100

Check warning on line 104 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L104

Added line #L104 was not covered by tests

# Minimum number of valid pixels to consider the calculation of NSB level
# acceptable:
min_number_of_valid_pixels = 1000

Check warning on line 108 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L108

Added line #L108 was not covered by tests

# Aprox. maximum number of subruns (uniformly distributed through the
# run) to be processed:
max_number_of_processed_subruns = 10

Check warning on line 112 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L112

Added line #L112 was not covered by tests
# Keep only ~max_number_of_processed_subruns subruns, distributed
# along the run:
dl1_files = all_dl1_files[::int(1+len(all_dl1_files) /

Check warning on line 115 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L115

Added line #L115 was not covered by tests
max_number_of_processed_subruns)]

number_of_pedestals = []
median_ped_qt95_pix_charge = []

Check warning on line 119 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L118-L119

Added lines #L118 - L119 were not covered by tests

for dl1_file in dl1_files:
log.info('\nInput file: %s', dl1_file)

Check warning on line 122 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L121-L122

Added lines #L121 - L122 were not covered by tests

data_parameters = read_table(dl1_file, dl1_params_lstcam_key)
event_type_data = data_parameters['event_type'].data
pedestal_mask = event_type_data == EventType.SKY_PEDESTAL.value

Check warning on line 126 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L124-L126

Added lines #L124 - L126 were not covered by tests

num_pedestals = pedestal_mask.sum()
if num_pedestals < min_number_of_ped_events:
log.warning(f' Too few interleaved pedestals ('

Check warning on line 130 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L128-L130

Added lines #L128 - L130 were not covered by tests
f'{num_pedestals}) - skipped subrun!')
continue

Check warning on line 132 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L132

Added line #L132 was not covered by tests

number_of_pedestals.append(pedestal_mask.sum())
data_images = read_table(dl1_file, dl1_images_lstcam_key)
data_calib = read_table(dl1_file, dl1_params_tel_mon_cal_key)

Check warning on line 136 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L134-L136

Added lines #L134 - L136 were not covered by tests
# data_calib['unusable_pixels'] , indices: (Gain Calib_id Pixel)

# Get the "unusable" flags from the pedcal file:
unusable_hg = data_calib['unusable_pixels'][0][0]
unusable_lg = data_calib['unusable_pixels'][0][1]

Check warning on line 141 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L140-L141

Added lines #L140 - L141 were not covered by tests

unreliable_pixels = unusable_hg | unusable_lg
if unreliable_pixels.sum() > 0:
log.info(f' Removed {unreliable_pixels.sum()/unreliable_pixels.size:.2%} of pixels '

Check warning on line 145 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L143-L145

Added lines #L143 - L145 were not covered by tests
f'due to unreliable calibration!')

reliable_pixels = ~unreliable_pixels

Check warning on line 148 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L148

Added line #L148 was not covered by tests

charges_data = data_images['image']
charges_pedestals = charges_data[pedestal_mask]

Check warning on line 151 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L150-L151

Added lines #L150 - L151 were not covered by tests
# pixel-wise 95% quantile of ped pix charge through the subrun
# (#pixels):
qt95_pix_charge = np.nanquantile(charges_pedestals, 0.95, axis=0)

Check warning on line 154 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L154

Added line #L154 was not covered by tests
# ignore pixels with 0 signal:
qt95_pix_charge = np.where(qt95_pix_charge > 0, qt95_pix_charge, np.nan)

Check warning on line 156 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L156

Added line #L156 was not covered by tests
# median of medians accross camera:
median_qt95_pix_charge = np.nanmedian(qt95_pix_charge)

Check warning on line 158 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L158

Added line #L158 was not covered by tests
# mean abs deviation of pixel qt95 values:
qt95_pix_charge_dev = median_abs_deviation(qt95_pix_charge,

Check warning on line 160 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L160

Added line #L160 was not covered by tests
nan_policy='omit')

# Just a cut to remove outliers (pixels):
outliers = (np.abs(qt95_pix_charge - median_qt95_pix_charge) /

Check warning on line 164 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L164

Added line #L164 was not covered by tests
qt95_pix_charge_dev) > mad_max

if outliers.sum() > 0:
removed_fraction = outliers.sum() / outliers.size
log.info(f' Removed {removed_fraction:.2%} of pixels (outliers) '

Check warning on line 169 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L167-L169

Added lines #L167 - L169 were not covered by tests
f'from pedestal median calculation')

# Now compute the median (for the whole camera) of the qt95's (for
# each pixel) of the charges in pedestal events. Use only reliable
# pixels for this, and exclude outliers:
n_valid_pixels = np.isfinite(qt95_pix_charge[reliable_pixels]).sum()
if n_valid_pixels < min_number_of_valid_pixels:
logging.warning(f' Too few valid pixels ({n_valid_pixels}) for '

Check warning on line 177 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L175-L177

Added lines #L175 - L177 were not covered by tests
f'calculation!')
median_ped_qt95_pix_charge.append(np.nan)

Check warning on line 179 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L179

Added line #L179 was not covered by tests
else:
median_ped_qt95_pix_charge.append(np.nanmedian(qt95_pix_charge[

Check warning on line 181 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L181

Added line #L181 was not covered by tests
reliable_pixels &
~outliers]))
# convert to ndarray:
median_ped_qt95_pix_charge = np.array(median_ped_qt95_pix_charge)
number_of_pedestals = np.array(number_of_pedestals)

Check warning on line 186 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L185-L186

Added lines #L185 - L186 were not covered by tests

# Now compute the median for all processed subruns, which is more robust
# against e.g. subruns affected by car flashes. We exclude subruns
# which have less than half of the median statistics per subrun.
good_stats = number_of_pedestals > 0.5 * np.median(number_of_pedestals)

Check warning on line 191 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L191

Added line #L191 was not covered by tests

# First check if we have any valid subruns at all:
if np.isfinite(median_ped_qt95_pix_charge[good_stats]).sum() == 0:
qped = np.nan
qped_dev = np.nan
additional_nsb_rate = np.nan
log.error(f'No valid computation was possible for run {run_number} with any of the processed subruns!')
return qped, additional_nsb_rate, None

Check warning on line 199 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L194-L199

Added lines #L194 - L199 were not covered by tests

qped = np.nanmedian(median_ped_qt95_pix_charge[good_stats])
not_outlier = np.zeros(len(median_ped_qt95_pix_charge), dtype='bool')

Check warning on line 202 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L201-L202

Added lines #L201 - L202 were not covered by tests
# Now we also remove outliers (subruns) if any:
qped_dev = median_abs_deviation(median_ped_qt95_pix_charge[good_stats])
not_outlier = (np.abs(median_ped_qt95_pix_charge - qped) /

Check warning on line 205 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L204-L205

Added lines #L204 - L205 were not covered by tests
qped_dev) < mad_max

if (~good_stats).sum() > 0:
log.info(f'\nNumber of subruns with low statistics: '

Check warning on line 209 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L208-L209

Added lines #L208 - L209 were not covered by tests
f'{(~good_stats).sum()} - removed from pedestal median '
f'calculation')
if (~not_outlier).sum() > 0:
log.info(f'\nRemoved {(~not_outlier).sum()} outlier subruns '

Check warning on line 213 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L212-L213

Added lines #L212 - L213 were not covered by tests
f'(out of {not_outlier.size}) from pedestal median '
f'calculation')

# recompute with good-statistics and well-behaving runs:
qped = np.nanmedian(median_ped_qt95_pix_charge[good_stats & not_outlier])
log.info(f'\nNumber of subruns used in calculations: '

Check warning on line 219 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L218-L219

Added lines #L218 - L219 were not covered by tests
f'{(good_stats & not_outlier).sum()}')

picture_threshold = pic_th(qped)
boundary_threshold = picture_threshold / 2

Check warning on line 223 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L222-L223

Added lines #L222 - L223 were not covered by tests

# We now create a .json files with recommended image cleaning
# settings for lstchain_dl1ab.
newconfig = get_standard_config()['tailcuts_clean_with_pedestal_threshold']

Check warning on line 227 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L227

Added line #L227 was not covered by tests
# casts below are needed, json does not like numpy's int64:
newconfig['picture_thresh'] = int(picture_threshold)
newconfig['boundary_thresh'] = int(boundary_threshold)

Check warning on line 230 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L229-L230

Added lines #L229 - L230 were not covered by tests

additional_nsb_rate = get_nsb(qped)

Check warning on line 232 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L232

Added line #L232 was not covered by tests

return qped, additional_nsb_rate, newconfig

Check warning on line 234 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L234

Added line #L234 was not covered by tests


def pic_th(qt95_ped):
"""
Parameters
----------
qt95_ped: `float`
95% quantile of pixel charge in pedestal events (for the standard
LocalPeakWindowSearch algo & settings in lstchain)

Returns
-------
`int`
recommended picture threshold for image cleaning (from a table)

"""
mp_edges = np.array([5.85, 7.25, 8.75, 10.3, 11.67])
picture_threshold = np.array([8, 10, 12, 14, 16, 18])

Check warning on line 252 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L251-L252

Added lines #L251 - L252 were not covered by tests

if qt95_ped >= mp_edges[-1]:
return picture_threshold[-1]
return picture_threshold[np.where(mp_edges > qt95_ped)[0][0]]

Check warning on line 256 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L254-L256

Added lines #L254 - L256 were not covered by tests


def get_nsb(qt95_ped):
"""
Parameters
----------
qt95_ped: `float`
95% quantile of pixel charge in pedestal events

Returns
-------
`float`
(from a parabolic parametrization) the recommended additional NSB
(in p.e. / ns) that has to be added to the "dark MC" waveforms in
order to match the data for which the 95% quantile of pedestal pixel
charge is qt95_ped

"""
params = [3.95147396, 0.12642504, 0.01718627]
return (params[1] * (qt95_ped - params[0]) +

Check warning on line 276 in lstchain/image/cleaning.py

View check run for this annotation

Codecov / codecov/patch

lstchain/image/cleaning.py#L275-L276

Added lines #L275 - L276 were not covered by tests
params[2] * (qt95_ped - params[0]) ** 2)
36 changes: 1 addition & 35 deletions lstchain/scripts/lstchain_dvr_pixselector.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,12 +27,7 @@
The program creates in the output directory (which is the current by default)
a file DVR_settings_LST-1.Run12469.h5 which contains a table "run_summary"
which includes the DVR algorithm parameters determined for each processed
subrun. It also creates a file with recommended cleaning settings for running
DL1ab, based on the NSB level measured in the processed runs. We use
as picture threshold the closest even number not smaller than the charge
"min_charge_for_certain_selection" (averaged for all subruns and rounded)
which is the value from which a pixel will be certainly kept by the Data Volume
Reduction.
subrun.

Then we run again the script over all subruns, and using the option to create
the pixel maks (this can also be done subrun by subrun, to parallelize the
Expand Down Expand Up @@ -68,7 +63,6 @@
from lstchain.paths import parse_dl1_filename
from lstchain.io.io import dl1_params_lstcam_key, dl1_images_lstcam_key
from lstchain.io.io import dl1_params_tel_mon_cal_key
from lstchain.io.config import get_standard_config, dump_config

from ctapipe.io import read_table
from ctapipe.core import Container, Field
Expand Down Expand Up @@ -506,34 +500,6 @@ def main():
time.sleep(300)
continue


# We now create also .json files with recommended image cleaning
# settings for lstchain_dl1ab. We determine the picture threshold
# from the values of min_charge_for_certain_selection:

if not write_pixel_masks:
log.info('Output files:')
for file in list_of_output_files:
log.info(file)
dvrtable = read_table(file, "/run_summary")
picture_threshold = get_typical_dvr_min_charge(dvrtable)

# we round it to an even number of p.e., just to limit the amount
# of different settings in the analysis (i.e. we change
# picture_threshold in steps of 2 p.e.):
if picture_threshold % 2 != 0:
picture_threshold += 1
boundary_threshold = picture_threshold / 2
newconfig = get_standard_config()['tailcuts_clean_with_pedestal_threshold']
newconfig['picture_thresh'] = picture_threshold
newconfig['boundary_thresh'] = boundary_threshold
run = int(file.name[file.name.find('Run')+3:-3])
json_filename = Path(output_dir, f'dl1ab_Run{run:05d}.json')
dump_config({'tailcuts_clean_with_pedestal_threshold': newconfig,
'dynamic_cleaning': get_standard_config()['dynamic_cleaning']},
json_filename, overwrite=True)
log.info(json_filename)

log.info('lstchain_dvr_pixselector finished successfully!')


Expand Down
Loading
Loading