Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2] Aromatic fragments #5

Closed
wants to merge 1 commit into from
Closed

Conversation

fgrunewald
Copy link
Collaborator

Pysmiles does not allow partial aromatic fragments (for good reason) yet we still want to do them. Thus here we have a workaround. If two atoms are marked as aromatic we simply convert them to regular elements and later change it back. It's stupid but works.

Example Case Martini3 p-cresol:

cgsmiles_str = "{[#TN6]1[#TC5][#TC5A][#TC5]1}.{#TN6=[$][$]cO,#TC5=[$]cc[$],#TC5A=[$][$]cc}"

The aromatic ring is fragmented into four fragments.

@fgrunewald fgrunewald changed the title Aromatic fragments [2] Aromatic fragments Apr 22, 2024
@fgrunewald fgrunewald changed the base branch from master to squash-opr April 23, 2024 08:59
Copy link
Collaborator

@pckroon pckroon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't quite remember what pysmiles does when partial aromatics. Any way, would passing reinterpret_aromatic=False to read_smiles not be an easier workaround and a better idea in general?

@@ -49,6 +49,7 @@ def merge_graphs(source_graph, target_graph, max_node=None):
for node1, node2 in target_graph.edges:
if correspondence[node1] != correspondence[node2]:
attrs = target_graph.edges[(node1, node2)]
print(attrs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print(attrs)

Comment on lines +11 to +24
organic_subset = 'B C N O P S F Cl Br I * b c n o s p'.split()
batom = False
for idx, node in enumerate(smiles_str):
if node == '[':
batom = True
start = idx

if node == ']' and batom:
stop = idx+1
batom = False
yield start, stop

if node in organic_subset and not batom:
yield idx, idx + 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work for multi-letter organic atoms, such as Br!

Copy link
Collaborator Author

@fgrunewald fgrunewald May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed; hence PR #4 🤣


def strip_aromatic_nodes(smiles_str):
"""
Find all aromatic nodes and change them to lower
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Find all aromatic nodes and change them to lower
Find all aromatic nodes and change them to upper

if node in organic_subset and not batom:
yield idx, idx + 1

def strip_aromatic_nodes(smiles_str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name of the function very much, since it deals with the smiles string, and not with graph nodes

@@ -102,7 +110,18 @@ def fragment_iter(fragment_str, all_atom=True):
mol_graph.add_node(0, element="H", bonding=bonding_descrpt[0])
nx.set_node_attributes(mol_graph, bonding_descrpt, 'bonding')
elif all_atom:
mol_graph = pysmiles.read_smiles(smile)
try:
mol_graph = pysmiles.read_smiles(smile)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mol_graph = pysmiles.read_smiles(smile)
mol_graph = pysmiles.read_smiles(smile, reinterpret_aromatic=False)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, not. The error is raised already at the parsing stage of the cgsmiles string. You categorically do not allow non-cyclic aromatics. We could modify pysmiles though to allow this if 'reinterpret_aromatic' is set to False

@@ -165,7 +165,8 @@ def edges_from_bonding_descrpt(self):
bonding descriptors that formed the edge. Later unconsumed
bonding descriptors are replaced by hydrogen atoms.
"""
for prev_node, node in nx.dfs_edges(self.meta_graph):
for prev_node, node in self.meta_graph.edges:
print(prev_node, node)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print(prev_node, node)

Comment on lines +181 to +182
order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0])
print(order)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0])
print(order)
order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0])

Cool, this got more complicated. At least needs a comment, but we should probably finish the discussion on #1 about this first.

@fgrunewald
Copy link
Collaborator Author

no need for this anymore; the recent pysmiles changes fix everything

@fgrunewald fgrunewald closed this May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants