[2] Aromatic fragments #5

fgrunewald · 2024-04-22T13:28:13Z

Pysmiles does not allow partial aromatic fragments (for good reason), yet we still want to support them. Hence this workaround: if two atoms are marked as aromatic, we simply convert them to regular elements and change them back later. It's stupid, but it works.

Example Case Martini3 p-cresol:

cgsmiles_str = "{[#TN6]1[#TC5][#TC5A][#TC5]1}.{#TN6=[$][$]cO,#TC5=[$]cc[$],#TC5A=[$][$]cc}"

The aromatic ring is fragmented into four fragments.
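A minimal sketch of the convert-and-restore idea described at the top, working on the fragment strings only. The helper names `strip_aromatic` and `restore_aromatic` are hypothetical and are not the functions from this PR. The sketch assumes that aromatic atoms appear as lowercase organic-subset letters outside brackets, so bracket atoms such as `[nH]` are left alone.

```python
# Hypothetical helpers for illustration; not the code from this PR.
AROMATIC_ATOMS = set("bcnosp")


def strip_aromatic(smiles):
    """Uppercase unbracketed aromatic atoms.

    Returns the converted string and the character positions that
    were changed, so the conversion can be undone later.
    """
    out = []
    positions = []
    in_bracket = False
    for idx, char in enumerate(smiles):
        if char == "[":
            in_bracket = True
        elif char == "]":
            in_bracket = False
        elif not in_bracket and char in AROMATIC_ATOMS:
            # Remember where the aromatic atom was, then make it regular.
            positions.append(idx)
            char = char.upper()
        out.append(char)
    return "".join(out), positions


def restore_aromatic(smiles, positions):
    """Lowercase the characters at the recorded positions again."""
    chars = list(smiles)
    for idx in positions:
        chars[idx] = chars[idx].lower()
    return "".join(chars)


# The #TN6 fragment from the p-cresol example above.
converted, changed = strip_aromatic("[$][$]cO")
print(converted, changed)
print(restore_aromatic(converted, changed))
```

In the PR itself the restoring step would presumably act on the parsed graph rather than on the string; the string round trip here only illustrates the bookkeeping.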

pckroon

I can't quite remember what pysmiles does with partial aromatics. Anyway, would passing reinterpret_aromatic=False to read_smiles not be an easier workaround and a better idea in general?

pckroon · 2024-05-01T10:27:06Z

cgsmiles/graph_utils.py

@@ -49,6 +49,7 @@ def merge_graphs(source_graph, target_graph, max_node=None):
    for node1, node2 in target_graph.edges:
        if correspondence[node1] != correspondence[node2]:
            attrs = target_graph.edges[(node1, node2)]
+            print(attrs)


Suggested change

print(attrs)

pckroon · 2024-05-01T10:29:32Z

cgsmiles/pysmiles_utils.py

+    organic_subset = 'B C N O P S F Cl Br I * b c n o s p'.split()
+    batom = False
+    for idx, node in enumerate(smiles_str):
+        if node == '[':
+            batom = True
+            start = idx
+
+        if node == ']' and batom:
+            stop = idx+1
+            batom = False
+            yield start, stop
+
+        if node in organic_subset and not batom:
+            yield idx, idx + 1


This won't work for multi-letter organic atoms, such as Br!

fgrunewald · 2024-05-02

Indeed; hence PR #4 🤣
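One way to handle the multi-letter case raised here is to tokenize with a regular expression instead of iterating character by character. This is only a sketch under assumptions: `find_atom_spans` is a hypothetical name, and it is not the fix from PR #4. It yields the same kind of `(start, stop)` slices as the hunk above, treating anything in brackets as a single token.

```python
import re

# Alternation order matters: bracket tokens first, then the two-letter
# symbols, so that "Cl" and "Br" are not split into "C" + "l" or "B" + "r".
ATOM_PATTERN = re.compile(r"\[[^\]]*\]|Cl|Br|[BCNOPSFI*bcnosp]")


def find_atom_spans(smiles_str):
    """Yield (start, stop) slices for every atom token in a SMILES string."""
    for match in ATOM_PATTERN.finditer(smiles_str):
        yield match.span()


print(list(find_atom_spans("CBr")))
print(list(find_atom_spans("[$]cc[$]")))
```

Ring-bond digits, bond symbols and branch parentheses fall through the pattern and are skipped, which matches the behavior of the original loop.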

pckroon · 2024-05-01T10:30:28Z

cgsmiles/pysmiles_utils.py

+
+def strip_aromatic_nodes(smiles_str):
+    """
+    Find all aromatic nodes and change them to lower


Suggested change

Find all aromatic nodes and change them to lower

Find all aromatic nodes and change them to upper

pckroon · 2024-05-01T10:31:49Z

cgsmiles/pysmiles_utils.py

+        if node in organic_subset and not batom:
+            yield idx, idx + 1
+
+def strip_aromatic_nodes(smiles_str):


I don't like the name of the function very much, since it deals with the SMILES string and not with graph nodes.

pckroon · 2024-05-01T10:33:25Z

cgsmiles/read_fragments.py

@@ -102,7 +110,18 @@ def fragment_iter(fragment_str, all_atom=True):
            mol_graph.add_node(0, element="H", bonding=bonding_descrpt[0])
            nx.set_node_attributes(mol_graph, bonding_descrpt, 'bonding')
        elif all_atom:
-            mol_graph = pysmiles.read_smiles(smile)
+            try:
+                mol_graph = pysmiles.read_smiles(smile)


Suggested change

mol_graph = pysmiles.read_smiles(smile)

mol_graph = pysmiles.read_smiles(smile, reinterpret_aromatic=False)

fgrunewald · 2024-05-02

Unfortunately not. The error is already raised at the parsing stage of the cgsmiles string: you categorically do not allow non-cyclic aromatics. We could modify pysmiles, though, to allow this if reinterpret_aromatic is set to False.

pckroon · 2024-05-01T10:34:10Z

cgsmiles/resolve.py

@@ -165,7 +165,8 @@ def edges_from_bonding_descrpt(self):
        bonding descriptors that formed the edge. Later unconsumed
        bonding descriptors are replaced by hydrogen atoms.
        """
-        for prev_node, node in nx.dfs_edges(self.meta_graph):
+        for prev_node, node in self.meta_graph.edges:
+            print(prev_node, node)


Suggested change

print(prev_node, node)

pckroon · 2024-05-01T10:35:07Z

cgsmiles/resolve.py

+            order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0])
+            print(order)


Suggested change

order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0])

print(order)

order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0])

Cool, this got more complicated. At the very least it needs a comment, but we should probably finish the discussion on #1 about this first.
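One possible way to address the request for a comment is to write the same pattern with `re.VERBOSE`, so each piece is annotated in place. The pattern below is the one from the hunk, unchanged; the name `NUMBER` and the input strings are made up for illustration and are not necessarily valid bonding descriptors.

```python
import re

# Same pattern as in the hunk above, split up and commented.
NUMBER = re.compile(
    r"""
    [-+]?              # optional sign
    [.]?               # optional leading decimal point, e.g. ".5"
    [\d]+              # one or more digits
    (?:,\d\d\d)*       # optional thousands groups, e.g. ",000"
    [\.]?              # optional decimal point
    \d*                # optional fractional digits
    (?:[eE][-+]?\d+)?  # optional exponent, e.g. "e-3"
    """,
    re.VERBOSE,
)

# Hypothetical inputs, only to show what the pattern picks up.
for text in ("$1.5", "<2", "$A"):
    print(text, NUMBER.findall(text))
```

Since the pattern has no capturing groups, `findall` returns the full matched substrings.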

fgrunewald · 2024-05-08T16:12:58Z

no need for this anymore; the recent pysmiles changes fix everything

do workaround for aromatic fragments

decba1e

fgrunewald changed the title ~~Aromatic fragments~~ [2] Aromatic fragments Apr 22, 2024

fgrunewald changed the base branch from master to squash-opr April 23, 2024 08:59

pckroon requested changes May 1, 2024

pckroon reviewed May 1, 2024

pckroon requested changes May 1, 2024

fgrunewald closed this May 8, 2024
