Skip to content

Commit

Permalink
Added new style documentation for PAMM actions
Browse files Browse the repository at this point in the history
  • Loading branch information
Gareth Aneurin Tribello authored and Gareth Aneurin Tribello committed Nov 1, 2024
1 parent 6aa1e3f commit 4a94e30
Show file tree
Hide file tree
Showing 3 changed files with 119 additions and 53 deletions.
2 changes: 1 addition & 1 deletion src/colvar/PathMSD.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ void PathMSD::registerKeywords(Keywords& keys) {
PathMSDBase::registerKeywords(keys);
keys.addOutputComponent("sss","default","scalar","the position on the path");
keys.addOutputComponent("zzz","default","scalar","the distance from the path");
keys.addDOI("10.1063/1.2432340")
keys.addDOI("10.1063/1.2432340");
}

PathMSD::PathMSD(const ActionOptions&ao):
Expand Down
85 changes: 80 additions & 5 deletions src/pamm/HBPammMatrix.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,36 @@
/*
Adjacency matrix in which two electronegative atoms are adjacent if they are hydrogen bonded
\par Examples
This method allows you to calculate an adjacency matrix and use all the methods discussed in the documentation for
the [CONTACT_MATRIX](CONTACT_MATRIX.md) upon it to define CVs. The $i,j$ element of the matrix that is calculated
by this action is one if there is a hydrogen bond connecting atom $i$ to atom $j$. Furthermore, we determine whether
there is a hydrogen atom between these two atoms by using the PAMM technique that is discussed in the articles from the
bibliography below and in the documentation for the [PAMM](PAMM.md) action.
The example shown below illustrates how the method is used in practice
```plumed
#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
m: HBPAMM_MATRIX GROUP=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm GROUPC=2-192:3,3-192:3
PRINT ARG=m FILE=colvar
```
This example could be used to investigate the connectivity between a collection of water molecules in liquid water. Notice, however,
that the output matrix here is not symmetric.
If you want to investigate whether there are hydrogen bonds between two groups of molecules you can use an input like this:
```plumed
#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
m: HBPAMM_MATRIX GROUPA=1 GROUPB=2-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm GROUPC=2-192:3,3-192:3
PRINT ARG=m FILE=colvar
```
This input outputs a $1\times 63$ matrix in which the $1,i$th element tells you whether or not atom 1 donates a hydrogen bond
to the $i$th element in the group of 63 atoms that was specified using the ACCEPTORS keyword.
In general, it is better to use this action through the [HBPAMM_SA](HBPAMM_SA.md), [HBPAMM_SD](HBPAMM_SD.md) and [HBPAMM_SH](HBPAMM_SH.md)
keywords, which can be used to calculate the number of hydrogen bonds each donor, acceptor or hydrogen atom in your system participates in.
*/
//+ENDPLUMEDOC
Expand All @@ -40,7 +69,22 @@ Adjacency matrix in which two electronegative atoms are adjacent if they are hyd
/*
Calculate the number of hydrogen bonds each acceptor participates in using the HBPamm method
\par Examples
This shortcut action allows you to calculate the number of hydrogen bonds each of the atoms specified using the SITES
keyword accepts from its neighbours. The number of hydrogen bonds that a particular site accepts is determined by using the
PAMM tehcnique that is discussed in the articles from the bibliography below and in the documentation for the [PAMM](PAMM.md) action
The following example shows how you can use this action to calculate how many hydrogen bonds each of the water moecules in
a box of water accepts from its neighbours.
```plumed
#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
acceptors: HBPAMM_SA SITES=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm HYDROGENS=2-192:3,3-192:3 MEAN
DUMPATOMS ARG=acceptors ATOMS=1-192:3 FILE=acceptors.xyz
```
The output here is an xyz with five columns. As explained in the documentation for [DUMPATOMS](DUMPATOMS.md), the first four
columns are the usual columns that you would expect in an xyz file. The fifth column then contains the number of hydrogen bonds
that have been accepted by each of the atoms.
*/
//+ENDPLUMEDOC
Expand All @@ -49,7 +93,22 @@ Calculate the number of hydrogen bonds each acceptor participates in using the H
/*
Calculate the number of hydrogen bonds each donor participates in using the HBPamm method
\par Examples
This shortcut action allows you to calculate the number of hydrogen bonds each of the atoms specified using the SITES
keyword donates to its neighbours. The number of hydrogen bonds that a particular site donates is determined by using the
PAMM tehcnique that is discussed in the articles from the bibliography below and in the documentation for the [PAMM](PAMM.md) action
The following example shows how you can use this action to calculate how many hydrogen bonds each of the water moecules in
a box of water donates to its neighbours.
```plumed
#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
donors: HBPAMM_SD SITES=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm HYDROGENS=2-192:3,3-192:3 MEAN
DUMPATOMS ARG=donors ATOMS=1-192:3 FILE=donors.xyz
```
The output here is an xyz with five columns. As explained in the documentation for [DUMPATOMS](DUMPATOMS.md), the first four
columns are the usual columns that you would expect in an xyz file. The fifth column then contains the number of hydrogen bonds
that have been donated by each of the atoms.
*/
//+ENDPLUMEDOC
Expand All @@ -58,7 +117,22 @@ Calculate the number of hydrogen bonds each donor participates in using the HBPa
/*
Calculate the number of hydrogen bonds each hydrogen participates in using the HBPamm method
\par Examples
This shortcut action allows you to calculate the number of hydrogen bonds each of the atoms specified using the SITES
keyword donates to its neighbours. The number of hydrogen bonds that a particular site donates is determined by using the
PAMM tehcnique that is discussed in the articles from the bibliography below and in the documentation for the [PAMM](PAMM.md) action
The following example shows how you can use this action to calculate how many hydrogen bonds each of the water moecules in
a box of water donates to its neighbours.
```plumed
#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
hyd: HBPAMM_SH SITES=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm HYDROGENS=2-192:3,3-192:3 MEAN
DUMPATOMS ARG=hyd ATOMS=2-192:3,3-192:3 FILE=hydrogens.xyz
```
The output here is an xyz with five columns. As explained in the documentation for [DUMPATOMS](DUMPATOMS.md), the first four
columns are the usual columns that you would expect in an xyz file. The fifth column then contains the number of hydrogen bonds
that each hydrogen atom participates in.
*/
//+ENDPLUMEDOC
Expand Down Expand Up @@ -87,11 +161,12 @@ PLUMED_REGISTER_ACTION(HBPammMatrix,"HBPAMM_MATRIX")
void HBPammMatrix::registerKeywords( Keywords& keys ) {
adjmat::AdjacencyMatrixBase::registerKeywords( keys ); keys.use("GROUPC");
keys.add("compulsory","ORDER","dah","the order in which the groups are specified in the input. Can be dah (donor/acceptor/hydrogens), "
"adh (acceptor/donor/hydrogens) or hda (hydrogens/donor/hydrogens");
"adh (acceptor/donor/hydrogens) or hda (hydrogens/donor/acceptors");
keys.add("compulsory","CLUSTERS","the name of the file that contains the definitions of all the kernels for PAMM");
keys.add("compulsory","REGULARISE","0.001","don't allow the denominator to be smaller then this value");
keys.add("compulsory","GAUSS_CUTOFF","6.25","the cutoff at which to stop evaluating the kernel function is set equal to sqrt(2*x)*(max(adc)+cov(adc))");
keys.needsAction("PAMM"); keys.needsAction("ONES"); keys.needsAction("MATRIX_VECTOR_PRODUCT");
keys.addDOI("10.1063/1.4900655"); keys.addDOI("10.1021/acs.jctc.7b00993");
}

HBPammMatrix::HBPammMatrix(const ActionOptions& ao):
Expand Down
85 changes: 38 additions & 47 deletions src/pamm/PAMM.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,80 +29,70 @@
/*
Probabilistic analysis of molecular motifs.
Probabilistic analysis of molecular motifs (PAMM) was introduced in this paper \cite pamm.
Probabilistic analysis of molecular motifs (PAMM) was introduced in the papers in the bibliography.
The essence of this approach involves calculating some large set of collective variables
for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model.
The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or
secondary structure types.
The assumption within this implementation is that the fitting of the Gaussian mixture model has been
done elsewhere by a separate code. You thus provide an input file to this action which contains the
means, covariance matrices and weights for a set of Gaussian kernels, \f$\{ \phi \}\f$. The values and
means, covariance matrices and weights for a set of Gaussian kernels, $\{\phi\}$. The values and
derivatives for the following set of quantities is then computed:
\f[
$$
s_k = \frac{ \phi_k}{ \sum_i \phi_i }
\f]
$$
Each of the \f$\phi_k\f$ is a Gaussian function that acts on a set of quantities calculated within
a \ref mcolv . These might be \ref TORSIONS, \ref DISTANCES, \ref ANGLES or any one of the many
symmetry functions that are available within \ref mcolv actions. These quantities are then inserted into
the set of \f$n\f$ kernels that are in the the input file. This will be done for multiple sets of values
for the input quantities and a final quantity will be calculated by summing the above \f$s_k\f$ values or
Each of the $\phi_k$ is a Gaussian function that acts on a set in quantities calculated that might be calculated
using a [TORSION](TORSION.md), [DISTANCE](DISTANCE.md) or [ANGLE](ANGLE.md) action for example.
These quantities are then inserted into the set of $n$ kernels that are in the the input file. This will be done for multiple sets of values
for the input quantities and a final quantity will be calculated by summing the above $s_k$ values or
some transformation of the above. This sounds less complicated than it is and is best understood by
looking through the example given below.
looking through the example given below, which can be expanded to show the full set of operations that PLUMED is performing.
\warning Mixing \ref mcolv actions that are periodic with variables that are not periodic has not been tested
\warning Mixing input variables that are periodic with variables that are not periodic has not been tested
\par Examples
## Examples
In this example I will explain in detail what the following input is computing:
\plumedfile
#SETTINGS MOLFILE=regtest/basic/rt32/helix.pdb
MOLINFO MOLTYPE=protein STRUCTURE=M1d.pdb
psi: TORSIONS ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
phi: TORSIONS ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
p: PAMM DATA=phi,psi CLUSTERS=clusters.pamm MEAN1={COMPONENT=1} MEAN2={COMPONENT=2}
```plumed
#SETTINGS MOLFILE=regtest/pamm/rt-pamm-periodic/M1d.pdb INPUTFILES=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp
MOLINFO MOLTYPE=protein STRUCTURE=regtest/pamm/rt-pamm-periodic/M1d.pdb
psi: TORSION ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
phi: TORSION ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
p: PAMM DATA=phi,psi CLUSTERS=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp MEAN1={COMPONENT=1} MEAN2={COMPONENT=2}
PRINT ARG=p.mean-1,p.mean-2 FILE=colvar
\endplumedfile
```
The best place to start our explanation is to look at the contents of the clusters.pamm file
The best place to start our explanation is to look at the contents of the `2D-testc-0.75.pammp` file, which you can do
by clicking on the links in the annotated input above. This files contains the parameters of two two-dimensional Gaussian functions.
Each of these Gaussian kernels has a weight, $w_k$, a vector that specifies the position of its center, $\mathbf{c}_k$, and a covariance matrix, $\Sigma_k$.
The $\phi_k$ functions that we use to calculate our PAMM components are thus:
\auxfile{clusters.pamm}
#! FIELDS height phi psi sigma_phi_phi sigma_phi_psi sigma_psi_phi sigma_psi_psi
#! SET multivariate von-misses
#! SET kerneltype gaussian
2.97197455E-0001 -1.91983118E+0000 2.25029540E+0000 2.45960237E-0001 -1.30615381E-0001 -1.30615381E-0001 2.40239117E-0001
2.29131448E-0002 1.39809354E+0000 9.54585380E-0002 9.61755708E-0002 -3.55657919E-0002 -3.55657919E-0002 1.06147253E-0001
5.06676398E-0001 -1.09648066E+0000 -7.17867907E-0001 1.40523052E-0001 -1.05385552E-0001 -1.05385552E-0001 1.63290557E-0001
\endauxfile
This files contains the parameters of two two-dimensional Gaussian functions. Each of these Gaussian kernels has a weight, \f$w_k\f$,
a vector that specifies the position of its center, \f$\mathbf{c}_k\f$, and a covariance matrix, \f$\Sigma_k\f$. The \f$\phi_k\f$ functions that
we use to calculate our PAMM components are thus:
\f[
$$
\phi_k = \frac{w_k}{N_k} \exp\left( -(\mathbf{s} - \mathbf{c}_k)^T \Sigma^{-1}_k (\mathbf{s} - \mathbf{c}_k) \right)
\f]
In the above \f$N_k\f$ is a normalization factor that is calculated based on \f$\Sigma\f$. The vector \f$\mathbf{s}\f$ is a vector of quantities
that are calculated by the \ref TORSIONS actions. This vector must be two dimensional and in this case each component is the value of a
torsion angle. If we look at the two \ref TORSIONS actions in the above we are calculating the \f$\phi\f$ and \f$\psi\f$ backbone torsional
angles in a protein (Note the use of \ref MOLINFO to make specification of atoms straightforward). We thus calculate the values of our
2 \f$ \{ \phi \} \f$ kernels 3 times. The first time we use the \f$\phi\f$ and \f$\psi\f$ angles in the second residue of the protein,
the second time it is the \f$\phi\f$ and \f$\psi\f$ angles of the third residue of the protein and the third time it is the \f$\phi\f$ and \f$\psi\f$ angles
$$
In the above $N_k$ is a normalization factor that is calculated based on $\Sigma$. The vector $\mathbf{s}$ is a vector of quantities
that are calculated by the input [TORSION](TORSION.md) actions. This vector must be two dimensional and in this case each component is the value of a
torsion angle. If we look at the two TORSION actions in the above we are calculating the $\phi$ and $\psi$ backbone torsional
angles in a protein (Note the use of [MOLINFO](MOLINFO.md) to make specification of atoms straightforward). We thus calculate the values of our
2 $ \{ \phi \} $ kernels 3 times. The first time we use the $\phi$ and $\psi$ angles in the second residue of the protein,
the second time it is the $\phi$ and $\psi$ angles of the third residue of the protein and the third time it is the $\phi$ and $\psi$ angles
of the fourth residue in the protein. The final two quantities that are output by the print command, p.mean-1 and p.mean-2, are the averages
over these three residues for the quantities:
\f[
$$
s_1 = \frac{ \phi_1}{ \phi_1 + \phi_2 }
\f]
$$
and
\f[
$$
s_2 = \frac{ \phi_2}{ \phi_1 + \phi_2 }
\f]
$$
There is a great deal of flexibility in this input. We can work with, and examine, any number of components, we can use any set of collective variables
and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums.
and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums. Furthermore,
by expanding the shortcuts in the example above we can obtain insight into how the PAMM method operates.
*/
//+ENDPLUMEDOC

Expand All @@ -125,6 +115,7 @@ void PAMM::registerKeywords( Keywords& keys ) {
keys.add("compulsory","KERNELS","all","which kernels are we computing the PAMM values for");
multicolvar::MultiColvarShortcuts::shortcutKeywords( keys );
keys.needsAction("KERNEL"); keys.needsAction("COMBINE");
keys.addDOI("10.1063/1.4900655"); keys.addDOI("10.1021/acs.jctc.7b00993");
}

PAMM::PAMM(const ActionOptions& ao) :
Expand Down

1 comment on commit 4a94e30

@PlumedBot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found broken examples in automatic/ANN.tmp
Found broken examples in automatic/CAVITY.tmp
Found broken examples in automatic/CLASSICAL_MDS.tmp
Found broken examples in automatic/CONSTANT.tmp
Found broken examples in automatic/CONVERT_TO_FES.tmp
Found broken examples in automatic/COORDINATIONNUMBER.tmp
Found broken examples in automatic/DISTANCE_FROM_CONTOUR.tmp
Found broken examples in automatic/DUMPCUBE.tmp
Found broken examples in automatic/DUMPGRID.tmp
Found broken examples in automatic/EDS.tmp
Found broken examples in automatic/EMMI.tmp
Found broken examples in automatic/FIND_CONTOUR.tmp
Found broken examples in automatic/FIND_CONTOUR_SURFACE.tmp
Found broken examples in automatic/FIND_SPHERICAL_CONTOUR.tmp
Found broken examples in automatic/FOURIER_TRANSFORM.tmp
Found broken examples in automatic/FUNCPATHGENERAL.tmp
Found broken examples in automatic/FUNCPATHMSD.tmp
Found broken examples in automatic/FUNNEL.tmp
Found broken examples in automatic/FUNNEL_PS.tmp
Found broken examples in automatic/GPROPERTYMAP.tmp
Found broken examples in automatic/HISTOGRAM.tmp
Found broken examples in automatic/INCLUDE.tmp
Found broken examples in automatic/INCYLINDER.tmp
Found broken examples in automatic/INENVELOPE.tmp
Found broken examples in automatic/INTERPOLATE_GRID.tmp
Found broken examples in automatic/LOCAL_AVERAGE.tmp
Found broken examples in automatic/MAZE_OPTIMIZER_BIAS.tmp
Found broken examples in automatic/MAZE_RANDOM_ACCELERATION_MD.tmp
Found broken examples in automatic/MAZE_SIMULATED_ANNEALING.tmp
Found broken examples in automatic/MAZE_STEERED_MD.tmp
Found broken examples in automatic/METATENSOR.tmp
Found broken examples in automatic/MULTICOLVARDENS.tmp
Found broken examples in automatic/PCA.tmp
Found broken examples in automatic/PCAVARS.tmp
Found broken examples in automatic/PIV.tmp
Found broken examples in automatic/PLUMED.tmp
Found broken examples in automatic/PYCVINTERFACE.tmp
Found broken examples in automatic/PYTHONFUNCTION.tmp
Found broken examples in automatic/Q3.tmp
Found broken examples in automatic/Q4.tmp
Found broken examples in automatic/Q6.tmp
Found broken examples in automatic/QUATERNION.tmp
Found broken examples in automatic/REWEIGHT_BIAS.tmp
Found broken examples in automatic/REWEIGHT_METAD.tmp
Found broken examples in automatic/SIZESHAPE_POSITION_LINEAR_PROJ.tmp
Found broken examples in automatic/SIZESHAPE_POSITION_MAHA_DIST.tmp
Found broken examples in automatic/TETRAHEDRALPORE.tmp
Found broken examples in automatic/WHAM_HISTOGRAM.tmp
Found broken examples in automatic/WHAM_WEIGHTS.tmp
Found broken examples in AnalysisPP.md
Found broken examples in CollectiveVariablesPP.md
Found broken examples in MiscelaneousPP.md

Please sign in to comment.