The Results Class
IDTxl returns all results generated by algorithms (e.g., ActiveInformationStorage(), MultivariateTE()) in dedicated results classes.
Results for individual targets and processes, as well as results for the whole
network can be accessed via methods and attributes of these classes. Results
contain estimated quantities as well as a full list of the settings used to
obtain these estimates.
All results classes offer dot-notation to access dictionary entries, e.g.,
>>> result.settings['kraskov_k']
>>> result.settings.kraskov_k
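Dot-notation access of this kind is commonly built on a dictionary subclass. The following is a minimal sketch of the idea and not IDTxl's actual implementation; the class name DotDict is hypothetical:

```python
class DotDict(dict):
    """Dictionary whose entries can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dictionary
        # methods such as keys() or items() keep working as usual.
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err

    def __setattr__(self, name, value):
        # Store attributes as regular dictionary entries.
        self[name] = value


settings = DotDict({'kraskov_k': '4', 'cmi_estimator': 'JidtKraskovCMI'})
assert settings['kraskov_k'] == settings.kraskov_k
settings.max_lag_sources = 5
assert settings['max_lag_sources'] == 5
```

Both access styles read the same underlying entry, so either can be used interchangeably.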
Jump to section:
ResultsNetworkInference()
ResultsNetworkComparison()
ResultsSingleProcessAnalysis()
ResultsPartialInformationDecomposition()
Results() objects can be pickled:
import pickle
from idtxl.results import Results
results = Results(n_nodes=5, n_realisations=1000, normalised=True)
with open('results.p', 'wb') as f:  # the context manager closes the file after writing
    pickle.dump(results, f)
with open('results.p', 'rb') as f:
    results = pickle.load(f)
ResultsNetworkInference() is returned by all algorithms that infer effective networks (e.g., MultivariateTE(), BivariateMI()):
from idtxl.multivariate_te import MultivariateTE
from idtxl.data import Data
data = Data() # initialise an empty data object
data.generate_mute_data(n_samples=500, n_replications=3)
settings = {
    'cmi_estimator': 'JidtKraskovCMI',
    'n_perm_max_stat': 200,
    'n_perm_min_stat': 200,
    'n_perm_omnibus': 200,
    'n_perm_max_seq': 200,
    'max_lag_sources': 5,
    'min_lag_sources': 1
}
network_analysis = MultivariateTE()
results = network_analysis.analyse_network(settings, data, targets=[0, 1, 2])
For a first overview, the object offers a method to quickly print inferred edges to the console. The method takes an argument that determines the weighting of the printed edges, e.g., the information-transfer delay as the lag of the past variable with maximum information transfer into the target. More options can be found in the method's docstring:
In [1]: results.print_edge_list(weights='max_te_lag', fdr=False)
0 -> 1, max_te_lag: 2
0 -> 2, max_te_lag: 3
Setting the parameter fdr to True returns FDR-corrected results. FDR correction controls the false discovery rate over all analysed targets. See the Wiki's theoretical introduction and the paper by Benjamini and Hochberg (1995) for details.
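To illustrate what controlling the false discovery rate means in practice, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. It is not IDTxl's own implementation, which may differ in detail; the function name and the p-values are hypothetical:

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(pvals)
    # Indices of the p-values, sorted in ascending order of p-value.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k (1-based) for which p_(k) <= k / m * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject the hypotheses belonging to the largest_k smallest p-values.
    significant = [False] * m
    for idx in order[:largest_k]:
        significant[idx] = True
    return significant


# Hypothetical p-values, e.g., one test per analysed target.
pvals = [0.005, 0.03, 0.2, 0.012]
significant = benjamini_hochberg(pvals, alpha=0.05)
print(significant)
```

Because the procedure is step-up, a p-value that exceeds its own rank-based threshold is still rejected if a larger p-value meets its threshold.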
Alternatively, get_adjacency_matrix() returns an instance of AdjacencyMatrix() containing the weighted adjacency matrix of the inferred network:
In [2]: adj_matrix = results.get_adjacency_matrix(weights='max_te_lag', fdr=False)
In [3]: adj_matrix.print_matrix()
Out[3]:
[[False, True, True, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False]]
[[0, 2, 3, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]
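The two printed matrices relate to the edge list as follows: the first marks which edges exist, the second holds their weights. The sketch below assumes that rows index sources and columns index targets, which is what the output above suggests; it works on plain nested lists copied from that output and the helper edge_list() is hypothetical, not part of IDTxl:

```python
# Edge and weight matrices copied from the print_matrix() output above.
edges = [[False, True, True, False, False],
         [False, False, False, False, False],
         [False, False, False, False, False],
         [False, False, False, False, False],
         [False, False, False, False, False]]
weights = [[0, 2, 3, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]]


def edge_list(edges, weights):
    """Return (source, target, weight) tuples for all existing edges."""
    return [(source, target, weights[source][target])
            for source, row in enumerate(edges)
            for target, has_edge in enumerate(row) if has_edge]


for source, target, weight in edge_list(edges, weights):
    print('{0} -> {1}, max_te_lag: {2}'.format(source, target, weight))
```

Keeping a separate Boolean edge matrix makes it possible to tell a missing edge apart from an existing edge whose weight happens to be zero.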
Sources and delays inferred for individual targets can be accessed as numpy arrays:
In [1]: results.get_target_sources(target=1, fdr=False)
Out[1]: array([0]) # 0 is a significant source for target 1
In [2]: results.get_target_delays(target=1, fdr=False)
Out[2]: array([2]) # the delay between the source and target is 2
Detailed results for inferred targets can be accessed as dictionaries:
In [3]: results.get_single_target(1, fdr=False)
Out[3]:
{'selected_vars_sources': [(0, 2), (0, 1)], # selected variables in the sources' past (process index, lag)
'selected_sources_te': array([ 0.31799646, 0.02485651]), # information transfer from individual source variables into the target (same order as 'selected_vars_sources')
'selected_sources_pval': array([ 0.005, 0.005]), # p-values for selected source variables
'selected_vars_target': [(1, 4), (1, 1), (1, 2), (1, 3)], # selected variables in the target's past
'sources_tested': [0, 2, 3, 4], # sources tested for target 1
'omnibus_te': 0.5097320256669797, # the joint information transfer from all source variables into the target
'current_value': (1, 5), # current_value: (process index, absolute samples)
'omnibus_sign': True, # significance of omnibus TE
'omnibus_pval': 0.005,
'te': array([ 0.59638851])} # single link TE: 0 -> 1, i.e. TE: [(0, 2), (0, 1)] -> 1
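Because 'selected_sources_te' and 'selected_sources_pval' follow the order of 'selected_vars_sources', the three entries can be read side by side. A small sketch using the values from the output above, with the NumPy arrays written as plain lists so the snippet runs without IDTxl:

```python
# Values copied from the example output above.
single_target = {
    'selected_vars_sources': [(0, 2), (0, 1)],
    'selected_sources_te': [0.31799646, 0.02485651],
    'selected_sources_pval': [0.005, 0.005],
}

# Print one line per selected source variable.
rows = zip(single_target['selected_vars_sources'],
           single_target['selected_sources_te'],
           single_target['selected_sources_pval'])
for (process, lag), te, pval in rows:
    print('process {0}, lag {1}: TE = {2:.4f}, p = {3}'.format(
        process, lag, te, pval))

# Selected variable with the largest information transfer into the target.
strongest_var, strongest_te = max(
    zip(single_target['selected_vars_sources'],
        single_target['selected_sources_te']),
    key=lambda pair: pair[1])
```

Here the strongest variable is process 0 at lag 2, which agrees with the max_te_lag of 2 reported for the edge 0 -> 1.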
Further, the object results has the following attributes:
results.data_properties # information regarding the data used for analysis
results.settings # settings used for analysis
results.targets_analysed # list of targets analysed
Calling the data_properties attribute lists properties of the data that went into the analysis:
In [1]: results.data_properties
Out[1]: {'n_realisations': 4950, 'normalised': True, 'n_nodes': 5}
Calling the settings attribute lists all settings used for analysis. Settings include both the settings provided by the user and default values set by the toolbox during analysis, providing a complete log of the analysis performed:
In [2]: results.settings
Out[2]:
{'debug': False,
'max_lag_target': 5,
'tau_target': 1,
'alpha_max_stat': 0.05,
'verbose': True,
'fdr_correction': True,
'n_perm_max_seq': 200,
'lag_mi': 0,
'alpha_omnibus': 0.05,
'normalise': 'false',
'theiler_t': '0',
'min_lag_sources': 1,
'add_conditionals': None,
'num_threads': 'USE_ALL',
'n_perm_omnibus': 200,
'permute_in_time': False,
'max_lag_sources': 5,
'alpha_min_stat': 0.05,
'n_perm_min_stat': 200,
'kraskov_k': '4',
'alpha_max_seq': 0.05,
'local_values': False,
'n_perm_max_stat': 200,
'cmi_estimator': 'JidtKraskovCMI',
'tau_sources': 1,
'noise_level': 1e-08}
Last, the results object offers a method results.combine_results() to combine the current partial result with further results, for example if analyses are split over multiple algorithm calls to perform analysis in parallel over processes.
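Conceptually, combining partial results amounts to merging per-target entries while making sure the pieces belong together. The following sketch illustrates that idea on plain dictionaries; it is not IDTxl's code, and the function name, the dictionary layout, and the two consistency checks are assumptions made for illustration:

```python
def combine_partial_results(*partial_results):
    """Merge dictionaries of the form {'settings': {...}, 'targets': {...}}."""
    combined = {'settings': dict(partial_results[0]['settings']),
                'targets': {}}
    for partial in partial_results:
        # Assumed check: only results obtained with equal settings are merged.
        if partial['settings'] != combined['settings']:
            raise RuntimeError('Settings differ between partial results.')
        for target, result in partial['targets'].items():
            # Assumed check: every target may be contributed only once.
            if target in combined['targets']:
                raise RuntimeError(
                    'Results for target {0} already exist.'.format(target))
            combined['targets'][target] = result
    return combined


# Hypothetical partial results from two separate algorithm calls.
res_a = {'settings': {'max_lag_sources': 5},
         'targets': {0: {'omnibus_sign': False}}}
res_b = {'settings': {'max_lag_sources': 5},
         'targets': {1: {'omnibus_sign': True}}}
combined = combine_partial_results(res_a, res_b)
print(sorted(combined['targets']))
```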
ResultsNetworkComparison() is returned by NetworkComparison():
import pickle
import numpy as np
from idtxl.data import Data
from idtxl.network_comparison import NetworkComparison
# Load precomputed results for the MuTE network
path = 'test/data/'
with open(path + 'mute_results_0.p', 'rb') as f:
    res_0 = pickle.load(f)
with open(path + 'mute_results_1.p', 'rb') as f:
    res_1 = pickle.load(f)
# Compare data simulated from the MuTE network against random data to get some
# significant differences in TE.
data_0 = Data()
data_0.generate_mute_data(500, 5)
data_1 = Data(np.random.rand(5, 500, 5), 'psr')
# Comparison settings
comp_settings = {
    'cmi_estimator': 'JidtKraskovCMI',
    'stats_type': 'independent',
    'n_perm_max_stat': 50,
    'n_perm_min_stat': 50,
    'n_perm_omnibus': 50,
    'n_perm_max_seq': 50,
    'alpha_comp': 0.26,
    'n_perm_comp': 200,
    'tail': 'two',
    'permute_in_time': True,
    'perm_type': 'random'
}
comp = NetworkComparison()
results = comp.compare_within(comp_settings, res_0, res_1, data_0, data_1)
For a first overview, the object offers a method to quickly print inferred differences in edges to the console. The method takes an argument that determines the weighting of the printed differences, e.g., the absolute difference in the compared quantity. More options can be found in the method's docstring:
In [1]: results.print_edge_list(weights='diff_abs')
0 -> 1, diff_abs: 0.43374641827540994
0 -> 2, diff_abs: 0.2662496552329512
Alternatively, get_adjacency_matrix() returns an instance of AdjacencyMatrix() containing the weighted adjacency matrix of the inferred network (weights='comparison' returns True for links with significant differences in information transfer):
In [2]: adj_matrix = results.get_adjacency_matrix(weights='comparison')
In [3]: adj_matrix.print_matrix()
Out[3]:
[[False, True, True, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False]]
[[False, True, True, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False]]
Detailed results for the comparison of individual targets can be accessed as dictionaries:
In [3]: results.get_single_target(1)
Out[3]:
{'selected_vars_sources': [(0, 2), (0, 1)], # past source variables used for TE estimation (union of past variables for target 1 in res_0 and res_1)
'selected_vars_target': [(1, 4), (1, 1), (1, 2), (1, 3)], # past target variables used for TE estimation (union of past variables for target 1 in res_0 and res_1)
'sources': array([0])}
Further, the object results has the following attributes:
results.targets_analysed # list of targets analysed
results.ab # True where link a > b
results.pval # p-value for all compared links
results.cmi_diff_abs # absolute differences in network measure compared
results.data_properties # information regarding the data used for analysis
results.settings # settings used for comparison
results.surrogate_distributions # surrogate distribution for each link used for statistical test
Calling the data_properties attribute lists properties of the data that went into the analysis:
In [9]: results.data_properties
Out[9]: {'n_realisations': 4950, 'normalised': True, 'n_nodes': 5}
Calling the settings attribute lists all settings used for analysis. Settings include both the settings provided by the user and default values set by the toolbox during analysis, providing a complete log of the analysis performed:
In [10]: results.settings
Out[10]:
{'alpha_comp': 0.26, # critical alpha level for link comparison
'cmi_estimator': 'JidtKraskovCMI', # CMI estimator used
'debug': False, # print debug information
'kraskov_k': '4', # number of nearest neighbours used by the Kraskov-estimator
'local_values': False, # return local measures
'n_perm_comp': 200, # no. permutations for link comparison
'n_perm_max_seq': 50, # no. permutations for sequential maximum stats
'n_perm_max_stat': 50, # no. permutations for maximum statistic
'n_perm_min_stat': 50, # no. permutations for minimum statistic
'n_perm_omnibus': 50, # no. permutations for omnibus test
'noise_level': 1e-08, # noise added by Kraskov-estimator
'normalise': 'false', # normalisation performed by Kraskov-estimator
'num_threads': 'USE_ALL', # no. threads used by JIDT for CMI estimation
'perm_type': 'random', # permutation type used for temporal surrogate creation
'permute_in_time': True, # permute samples in time to create surrogate data
'stats_type': 'independent', # test type used for link comparison
'tail': 'two', # test tail for CMI test
'tail_comp': 'two', # test tail for link comparison
'theiler_t': '0', # Theiler-T
'verbose': True} # console output
Last, the object offers a method results.combine_results() to combine the current partial result with further results. This is necessary if analyses are split over multiple algorithm calls to speed up analysis through parallelisation over processes.
ResultsSingleProcessAnalysis() is returned by all algorithms that analyse properties of individual processes in the network, e.g., ActiveInformationStorage():
from idtxl.data import Data
from idtxl.active_information_storage import ActiveInformationStorage
data = Data()
data.generate_mute_data(n_samples=500, n_replications=3)
settings = {
    'cmi_estimator': 'JidtKraskovCMI',
    'n_perm_max_stat': 200,
    'n_perm_min_stat': 200,
    'max_lag': 5,
    'tau': 1
}
processes = [1, 2, 3]
network_analysis = ActiveInformationStorage()
results = network_analysis.analyse_network(settings, data, processes)
Calling
In [1]: results.get_significant_processes(fdr=True)
Out[1]: array([ True, True, True], dtype=bool)
returns an array indicating for each process whether it shows significant active information storage. Detailed results for each process can be obtained by calling
In [2]: results.get_single_process(process=1, fdr=True)
Out[2]:
{'ais': 0.08630697271414345, # total AIS
'ais_pval': 0.002, # p-value of total AIS
'ais_sign': True, # whether AIS is statistically significant
'current_value': (1, 5), # current_value: (process index, absolute samples)
'selected_vars': [(1, 4)]} # selected variables in the process' past
Setting the parameter fdr to True returns FDR-corrected results. FDR correction controls the false discovery rate over all analysed processes. See the Wiki's theoretical introduction and the paper by Benjamini and Hochberg (1995) for details.
Additionally, the object results has the following attributes:
results.data_properties # information regarding the data used for analysis
results.processes_analysed # list of processes analysed
results.settings # settings used for analysis
Calling the data_properties attribute lists properties of the data that went into the analysis:
In [1]: results.data_properties
Out[1]: {'n_nodes': 5, 'n_realisations': 475, 'normalised': True}
Calling the settings attribute lists all settings used for analysis. Settings include both the settings provided by the user and default values set by the toolbox during analysis, providing a complete log of the analysis performed:
In [2]: results.settings
Out[2]:
{'add_conditionals': None, # enforce additional conditionals
'alpha_fdr': 0.05, # critical alpha level for FDR-correction
'alpha_max_stat': 0.05, # critical alpha level for maximum statistics
'alpha_mi': 0.05, # critical alpha level for final AIS value
'alpha_min_stat': 0.05, # critical alpha level for minimum statistics
'analytical_surrogates': False, # estimator used analytical surrogates instead of permutation tests
'cmi_estimator': 'JidtKraskovCMI', # CMI estimator used
'debug': False, # print debug information to console
'fdr_constant': 2, # constant used for FDR correction
'fdr_correction': True, # perform FDR correction over processes
'kraskov_k': '4', # number of nearest neighbours used by the Kraskov-estimator
'lag_mi': 0,
'local_values': False, # return local AIS values
'max_lag': 5, # maximum lag when looking for past variables
'n_perm_max_stat': 200, # no. permutations for maximum statistics
'n_perm_mi': 500, # no. permutations for test of final AIS value
'n_perm_min_stat': 200, # no. permutations for minimum statistics
'noise_level': 1e-08, # noise added by Kraskov-estimator
'normalise': 'false', # normalise values within Kraskov-estimator
'num_threads': 'USE_ALL', # no. threads used by JIDT when executing the Kraskov-estimator
'perm_type': 'random', # permutation type used for surrogate creation
'permute_in_time': True, # create surrogates by shuffling samples instead of realisations
'tau': 1, # spacing between variables that are tested in the processes' past
'theiler_t': '0', # Theiler-correction
'verbose': True} # verbose console output during analysis
Last, the object offers a method results.combine_results() to combine the current partial result with further results. This is necessary if analyses are split over multiple algorithm calls to speed up analysis through parallelisation over processes.
ResultsPartialInformationDecomposition() is returned by PartialInformationDecomposition():
import numpy as np
from idtxl.partial_information_decomposition import (
PartialInformationDecomposition)
from idtxl.data import Data
n = 200
alph = 2
x = np.random.randint(0, alph, n)
y = np.random.randint(0, alph, n)
z = np.logical_xor(x, y).astype(int)
data = Data(np.vstack((x, y, z)), 'ps', normalise=False)
settings = {
    'alpha': 0.1,
    'alph_s1': alph,
    'alph_s2': alph,
    'alph_t': alph,
    'pid_estimator': 'TartuPID',
    'lags_pid': [[1, 1], [3, 2], [0, 0]]}
targets = [0, 1, 2]
sources = [[1, 2], [0, 2], [0, 1]]
pid_analysis = PartialInformationDecomposition()
results = pid_analysis.analyse_network(settings, data, targets, sources)
Detailed results for individual targets can be accessed as dictionaries:
In [60]: results.get_single_target(target=1)
Out[60]:
{'current_value': (1, 1),
'num_err': (2.7734070595641924e-10, 0.0, 0.0),
'unq_s1': 0.0018811507166433591, # unique information in source 1
'unq_s2': -1.3723754144808186e-09, # unique information in source 2
'selected_vars_sources': [(0, 1), (2, 1)], # source variables used in PID estimation
'shd_s1_s2': 0.000106651366567541, # shared information between source 1 and 2
'solver': 'ECOS http://www.embotech.com/ECOS', # solver used
'source_1': [(0, 1)], # variables from source 1
'source_2': [(2, 1)], # variables from source 2
'syn_s1_s2': 0.0076602620069411431} # synergy between source 1 and 2
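The four PID terms can be recombined into classical mutual information terms, which provides a quick sanity check on an estimate: the joint mutual information between both sources and the target is the sum of all four terms, and the mutual information between a single source and the target is that source's unique information plus the shared information. A short sketch using the values from the output above, which runs without IDTxl:

```python
# PID terms copied from the example output above.
pid = {
    'unq_s1': 0.0018811507166433591,
    'unq_s2': -1.3723754144808186e-09,
    'shd_s1_s2': 0.000106651366567541,
    'syn_s1_s2': 0.0076602620069411431,
}

# I(target; source 1, source 2): sum of all four PID terms.
joint_mi = sum(pid.values())
# I(target; source 1) and I(target; source 2): unique plus shared information.
mi_s1 = pid['unq_s1'] + pid['shd_s1_s2']
mi_s2 = pid['unq_s2'] + pid['shd_s1_s2']

print('joint MI: {0:.6f}'.format(joint_mi))
print('MI source 1: {0:.6f}, MI source 2: {1:.6f}'.format(mi_s1, mi_s2))
```

The slightly negative value of 'unq_s2' is of the same order of magnitude as the numerical error reported in 'num_err', so it is probably best read as zero.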
The object results has the following attributes:
results.data_properties # information regarding the data used for analysis
results.targets_analysed # list of targets analysed
results.settings # settings used for analysis
Calling the data_properties attribute lists properties of the data that went into the analysis:
In [1]: results.data_properties
Out[1]: {'normalised': False, 'n_realisations': 199, 'n_nodes': 3}
Calling the settings attribute lists all settings used for analysis. Settings include both the settings provided by the user and default values set by the toolbox during analysis, providing a complete log of the analysis performed:
In [2]: results.settings
Out[2]:
{'alph_s1': 2, # alphabet size source 1
'alph_s2': 2, # alphabet size source 2
'alph_t': 2, # alphabet size target
'alpha': 0.1, # critical alpha level for statistical testing (not yet implemented)
'pid_estimator': 'TartuPID', # PID estimator used
'cone_solver': 'ECOS', # cone-solver used by the Tartu-estimator
'ecos_solver_args': {}, # additional arguments passed to cone solver
'lags_pid': array([1, 1]), # lags between sources and target
'verbose': True} # console output
Last, the object offers a method results.combine_results() to combine the current partial result with further results. This is necessary if analyses are split over multiple algorithm calls to speed up analysis through parallelisation over processes.