Added new style documentation for PAMM actions

plumed · Nov 1, 2024 · 4a94e30 · 4a94e30 · PlumedBot · Nov 1, 2024
1 parent 6aa1e3f
commit 4a94e30
Show file tree

Hide file tree

Showing 3 changed files with 119 additions and 53 deletions.
diff --git a/src/colvar/PathMSD.cpp b/src/colvar/PathMSD.cpp
@@ -75,7 +75,7 @@ void PathMSD::registerKeywords(Keywords& keys) {
   PathMSDBase::registerKeywords(keys);
   keys.addOutputComponent("sss","default","scalar","the position on the path");
   keys.addOutputComponent("zzz","default","scalar","the distance from the path");
-  keys.addDOI("10.1063/1.2432340")
+  keys.addDOI("10.1063/1.2432340");
 }
 
 PathMSD::PathMSD(const ActionOptions&ao):

diff --git a/src/pamm/HBPammMatrix.cpp b/src/pamm/HBPammMatrix.cpp
@@ -31,7 +31,36 @@
 /*
 Adjacency matrix in which two electronegative atoms are adjacent if they are hydrogen bonded
 
-\par Examples
+This method allows you to calculate an adjacency matrix and use all the methods discussed in the documentation for 
+the [CONTACT_MATRIX](CONTACT_MATRIX.md) upon it to define CVs.  The $i,j$ element of the matrix that is calculated 
+by this action is one if there is a hydrogen bond connecting atom $i$ to atom $j$.  Furthermore, we determine whether 
+there is a hydrogen atom between these two atoms by using the PAMM technique that is discussed in the articles from the 
+bibliography below and in the documentation for the [PAMM](PAMM.md) action.
+
+The example shown below illustrates how the method is used in practice
+
+```plumed
+#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
+m: HBPAMM_MATRIX GROUP=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm GROUPC=2-192:3,3-192:3 
+PRINT ARG=m FILE=colvar
+```
+
+This example could be used to investigate the connectivity between a collection of water molecules in liquid water. Notice, however,  
+that the output matrix here is not symmetric.
+
+If you want to investigate whether there are hydrogen bonds between two groups of molecules you can use an input like this:
+
+```plumed
+#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
+m: HBPAMM_MATRIX GROUPA=1 GROUPB=2-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm GROUPC=2-192:3,3-192:3 
+PRINT ARG=m FILE=colvar
+``` 
+
+This input outputs a $1\times 63$ matrix in which the $1,i$th element tells you whether or not atom 1 donates a hydrogen bond 
+to the $i$th element in the group of 63 atoms that was specified using the ACCEPTORS keyword.
+
+In general, it is better to use this action through the [HBPAMM_SA](HBPAMM_SA.md), [HBPAMM_SD](HBPAMM_SD.md) and [HBPAMM_SH](HBPAMM_SH.md)
+keywords, which can be used to calculate the number of hydrogen bonds each donor, acceptor or hydrogen atom in your system participates in.
 
 */
 //+ENDPLUMEDOC
@@ -40,7 +69,22 @@ Adjacency matrix in which two electronegative atoms are adjacent if they are hyd
 /*
 Calculate the number of hydrogen bonds each acceptor participates in using the HBPamm method
 
-\par Examples
+This shortcut action allows you to calculate the number of hydrogen bonds each of the atoms specified using the SITES
+keyword accepts from its neighbours. The number of hydrogen bonds that a particular site accepts is determined by using the 
+PAMM tehcnique that is discussed in the articles from the bibliography below and in the documentation for the [PAMM](PAMM.md) action
+
+The following example shows how you can use this action to calculate how many hydrogen bonds each of the water moecules in 
+a box of water accepts from its neighbours.
+
+```plumed
+#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
+acceptors: HBPAMM_SA SITES=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm HYDROGENS=2-192:3,3-192:3 MEAN 
+DUMPATOMS ARG=acceptors ATOMS=1-192:3 FILE=acceptors.xyz
+```
+
+The output here is an xyz with five columns. As explained in the documentation for [DUMPATOMS](DUMPATOMS.md), the first four 
+columns are the usual columns that you would expect in an xyz file.  The fifth column then contains the number of hydrogen bonds 
+that have been accepted by each of the atoms.  
 
 */
 //+ENDPLUMEDOC
@@ -49,7 +93,22 @@ Calculate the number of hydrogen bonds each acceptor participates in using the H
 /*
 Calculate the number of hydrogen bonds each donor participates in using the HBPamm method
 
-\par Examples
+This shortcut action allows you to calculate the number of hydrogen bonds each of the atoms specified using the SITES
+keyword donates to its neighbours. The number of hydrogen bonds that a particular site donates is determined by using the 
+PAMM tehcnique that is discussed in the articles from the bibliography below and in the documentation for the [PAMM](PAMM.md) action
+
+The following example shows how you can use this action to calculate how many hydrogen bonds each of the water moecules in 
+a box of water donates to its neighbours.
+
+```plumed
+#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
+donors: HBPAMM_SD SITES=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm HYDROGENS=2-192:3,3-192:3 MEAN 
+DUMPATOMS ARG=donors ATOMS=1-192:3 FILE=donors.xyz
+```
+
+The output here is an xyz with five columns. As explained in the documentation for [DUMPATOMS](DUMPATOMS.md), the first four 
+columns are the usual columns that you would expect in an xyz file.  The fifth column then contains the number of hydrogen bonds 
+that have been donated by each of the atoms. 
 
 */
 //+ENDPLUMEDOC
@@ -58,7 +117,22 @@ Calculate the number of hydrogen bonds each donor participates in using the HBPa
 /*
 Calculate the number of hydrogen bonds each hydrogen participates in using the HBPamm method
 
-\par Examples
+This shortcut action allows you to calculate the number of hydrogen bonds each of the atoms specified using the SITES
+keyword donates to its neighbours. The number of hydrogen bonds that a particular site donates is determined by using the 
+PAMM tehcnique that is discussed in the articles from the bibliography below and in the documentation for the [PAMM](PAMM.md) action
+
+The following example shows how you can use this action to calculate how many hydrogen bonds each of the water moecules in 
+a box of water donates to its neighbours.
+
+```plumed
+#SETTINGS INPUTFILES=regtest/pamm/rt-hbpamm/b3lyp.pamm
+hyd: HBPAMM_SH SITES=1-192:3 CLUSTERS=regtest/pamm/rt-hbpamm/b3lyp.pamm HYDROGENS=2-192:3,3-192:3 MEAN 
+DUMPATOMS ARG=hyd ATOMS=2-192:3,3-192:3 FILE=hydrogens.xyz
+```
+
+The output here is an xyz with five columns. As explained in the documentation for [DUMPATOMS](DUMPATOMS.md), the first four 
+columns are the usual columns that you would expect in an xyz file.  The fifth column then contains the number of hydrogen bonds 
+that each hydrogen atom participates in.
 
 */
 //+ENDPLUMEDOC
@@ -87,11 +161,12 @@ PLUMED_REGISTER_ACTION(HBPammMatrix,"HBPAMM_MATRIX")
 void HBPammMatrix::registerKeywords( Keywords& keys ) {
   adjmat::AdjacencyMatrixBase::registerKeywords( keys ); keys.use("GROUPC");
   keys.add("compulsory","ORDER","dah","the order in which the groups are specified in the input.  Can be dah (donor/acceptor/hydrogens), "
-           "adh (acceptor/donor/hydrogens) or hda (hydrogens/donor/hydrogens");
+           "adh (acceptor/donor/hydrogens) or hda (hydrogens/donor/acceptors");
   keys.add("compulsory","CLUSTERS","the name of the file that contains the definitions of all the kernels for PAMM");
   keys.add("compulsory","REGULARISE","0.001","don't allow the denominator to be smaller then this value");
   keys.add("compulsory","GAUSS_CUTOFF","6.25","the cutoff at which to stop evaluating the kernel function is set equal to sqrt(2*x)*(max(adc)+cov(adc))");
   keys.needsAction("PAMM"); keys.needsAction("ONES"); keys.needsAction("MATRIX_VECTOR_PRODUCT");
+  keys.addDOI("10.1063/1.4900655"); keys.addDOI("10.1021/acs.jctc.7b00993");
 }
 
 HBPammMatrix::HBPammMatrix(const ActionOptions& ao):

diff --git a/src/pamm/PAMM.cpp b/src/pamm/PAMM.cpp
@@ -29,80 +29,70 @@
 /*
 Probabilistic analysis of molecular motifs.
 
-Probabilistic analysis of molecular motifs (PAMM) was introduced in this paper \cite pamm.
+Probabilistic analysis of molecular motifs (PAMM) was introduced in the papers in the bibliography.
 The essence of this approach involves calculating some large set of collective variables
 for a set of atoms in a short trajectory and fitting this data using a Gaussian Mixture Model.
 The idea is that modes in these distributions can be used to identify features such as hydrogen bonds or
 secondary structure types.
 
 The assumption within this implementation is that the fitting of the Gaussian mixture model has been
 done elsewhere by a separate code.  You thus provide an input file to this action which contains the
-means, covariance matrices and weights for a set of Gaussian kernels, \f$\{ \phi \}\f$.  The values and
+means, covariance matrices and weights for a set of Gaussian kernels, $\{\phi\}$.  The values and
 derivatives for the following set of quantities is then computed:
 
-\f[
+$$
 s_k = \frac{ \phi_k}{ \sum_i \phi_i }
-\f]
+$$
 
-Each of the \f$\phi_k\f$ is a Gaussian function that acts on a set of quantities calculated within
-a \ref mcolv .  These might be \ref TORSIONS, \ref DISTANCES, \ref ANGLES or any one of the many
-symmetry functions that are available within \ref mcolv actions.  These quantities are then inserted into
-the set of \f$n\f$ kernels that are in the the input file.   This will be done for multiple sets of values
-for the input quantities and a final quantity will be calculated by summing the above \f$s_k\f$ values or
+Each of the $\phi_k$ is a Gaussian function that acts on a set in quantities calculated that might be calculated
+using a [TORSION](TORSION.md), [DISTANCE](DISTANCE.md) or [ANGLE](ANGLE.md) action for example.
+These quantities are then inserted into the set of $n$ kernels that are in the the input file.   This will be done for multiple sets of values
+for the input quantities and a final quantity will be calculated by summing the above $s_k$ values or
 some transformation of the above.  This sounds less complicated than it is and is best understood by
-looking through the example given below.
+looking through the example given below, which can be expanded to show the full set of operations that PLUMED is performing.
 
-\warning Mixing \ref mcolv actions that are periodic with variables that are not periodic has not been tested
+\warning Mixing input variables that are periodic with variables that are not periodic has not been tested
 
-\par Examples
+## Examples
 
 In this example I will explain in detail what the following input is computing:
 
-\plumedfile
-#SETTINGS MOLFILE=regtest/basic/rt32/helix.pdb
-MOLINFO MOLTYPE=protein STRUCTURE=M1d.pdb
-psi: TORSIONS ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
-phi: TORSIONS ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
-p: PAMM DATA=phi,psi CLUSTERS=clusters.pamm MEAN1={COMPONENT=1} MEAN2={COMPONENT=2}
+```plumed
+#SETTINGS MOLFILE=regtest/pamm/rt-pamm-periodic/M1d.pdb INPUTFILES=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp
+MOLINFO MOLTYPE=protein STRUCTURE=regtest/pamm/rt-pamm-periodic/M1d.pdb
+psi: TORSION ATOMS1=@psi-2 ATOMS2=@psi-3 ATOMS3=@psi-4
+phi: TORSION ATOMS1=@phi-2 ATOMS2=@phi-3 ATOMS3=@phi-4
+p: PAMM DATA=phi,psi CLUSTERS=regtest/pamm/rt-pamm-periodic/2D-testc-0.75.pammp MEAN1={COMPONENT=1} MEAN2={COMPONENT=2}
 PRINT ARG=p.mean-1,p.mean-2 FILE=colvar
-\endplumedfile
+```
 
-The best place to start our explanation is to look at the contents of the clusters.pamm file
+The best place to start our explanation is to look at the contents of the `2D-testc-0.75.pammp` file, which you can do 
+by clicking on the links in the annotated input above.  This files contains the parameters of two two-dimensional Gaussian functions.  
+Each of these Gaussian kernels has a weight, $w_k$, a vector that specifies the position of its center, $\mathbf{c}_k$, and a covariance matrix, $\Sigma_k$.  
+The $\phi_k$ functions that we use to calculate our PAMM components are thus:
 
-\auxfile{clusters.pamm}
-#! FIELDS height phi psi sigma_phi_phi sigma_phi_psi sigma_psi_phi sigma_psi_psi
-#! SET multivariate von-misses
-#! SET kerneltype gaussian
-      2.97197455E-0001     -1.91983118E+0000      2.25029540E+0000      2.45960237E-0001     -1.30615381E-0001     -1.30615381E-0001      2.40239117E-0001
-      2.29131448E-0002      1.39809354E+0000      9.54585380E-0002      9.61755708E-0002     -3.55657919E-0002     -3.55657919E-0002      1.06147253E-0001
-      5.06676398E-0001     -1.09648066E+0000     -7.17867907E-0001      1.40523052E-0001     -1.05385552E-0001     -1.05385552E-0001      1.63290557E-0001
-\endauxfile
-
-This files contains the parameters of two two-dimensional Gaussian functions.  Each of these Gaussian kernels has a weight, \f$w_k\f$,
-a vector that specifies the position of its center, \f$\mathbf{c}_k\f$, and a covariance matrix, \f$\Sigma_k\f$.  The \f$\phi_k\f$ functions that
-we use to calculate our PAMM components are thus:
-
-\f[
+$$
 \phi_k = \frac{w_k}{N_k} \exp\left( -(\mathbf{s} - \mathbf{c}_k)^T \Sigma^{-1}_k (\mathbf{s} - \mathbf{c}_k) \right)
-\f]
-
-In the above \f$N_k\f$ is a normalization factor that is calculated based on \f$\Sigma\f$.  The vector \f$\mathbf{s}\f$ is a vector of quantities
-that are calculated by the \ref TORSIONS actions.  This vector must be two dimensional and in this case each component is the value of a
-torsion angle.  If we look at the two \ref TORSIONS actions in the above we are calculating the \f$\phi\f$ and \f$\psi\f$ backbone torsional
-angles in a protein (Note the use of \ref MOLINFO to make specification of atoms straightforward).  We thus calculate the values of our
-2 \f$ \{ \phi \} \f$  kernels 3 times.  The first time we use the \f$\phi\f$ and \f$\psi\f$ angles in the second residue of the protein,
-the second time it is the \f$\phi\f$ and \f$\psi\f$ angles of the third residue of the protein and the third time it is the \f$\phi\f$ and \f$\psi\f$ angles
+$$
+
+In the above $N_k$ is a normalization factor that is calculated based on $\Sigma$.  The vector $\mathbf{s}$ is a vector of quantities
+that are calculated by the input [TORSION](TORSION.md) actions.  This vector must be two dimensional and in this case each component is the value of a
+torsion angle.  If we look at the two TORSION actions in the above we are calculating the $\phi$ and $\psi$ backbone torsional
+angles in a protein (Note the use of [MOLINFO](MOLINFO.md) to make specification of atoms straightforward).  We thus calculate the values of our
+2 $ \{ \phi \} $  kernels 3 times.  The first time we use the $\phi$ and $\psi$ angles in the second residue of the protein,
+the second time it is the $\phi$ and $\psi$ angles of the third residue of the protein and the third time it is the $\phi$ and $\psi$ angles
 of the fourth residue in the protein.  The final two quantities that are output by the print command, p.mean-1 and p.mean-2, are the averages
 over these three residues for the quantities:
-\f[
+$$
 s_1 = \frac{ \phi_1}{ \phi_1 + \phi_2 }
-\f]
+$$
 and
-\f[
+$$
 s_2 = \frac{ \phi_2}{ \phi_1 + \phi_2 }
-\f]
+$$
 There is a great deal of flexibility in this input.  We can work with, and examine, any number of components, we can use any set of collective variables
-and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums.
+and compute these PAMM variables and we can transform the PAMM variables themselves in a large number of different ways when computing these sums.  Furthermore,
+by expanding the shortcuts in the example above we can obtain insight into how the PAMM method operates.
 */
 //+ENDPLUMEDOC
 
@@ -125,6 +115,7 @@ void PAMM::registerKeywords( Keywords& keys ) {
   keys.add("compulsory","KERNELS","all","which kernels are we computing the PAMM values for");
   multicolvar::MultiColvarShortcuts::shortcutKeywords( keys );
   keys.needsAction("KERNEL"); keys.needsAction("COMBINE");
+  keys.addDOI("10.1063/1.4900655"); keys.addDOI("10.1021/acs.jctc.7b00993");
 }
 
 PAMM::PAMM(const ActionOptions& ao) :