-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[2] Aromatic fragments #5
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't quite remember what pysmiles does when partial aromatics. Any way, would passing reinterpret_aromatic=False
to read_smiles
not be an easier workaround and a better idea in general?
@@ -49,6 +49,7 @@ def merge_graphs(source_graph, target_graph, max_node=None): | |||
for node1, node2 in target_graph.edges: | |||
if correspondence[node1] != correspondence[node2]: | |||
attrs = target_graph.edges[(node1, node2)] | |||
print(attrs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print(attrs) |
organic_subset = 'B C N O P S F Cl Br I * b c n o s p'.split() | ||
batom = False | ||
for idx, node in enumerate(smiles_str): | ||
if node == '[': | ||
batom = True | ||
start = idx | ||
|
||
if node == ']' and batom: | ||
stop = idx+1 | ||
batom = False | ||
yield start, stop | ||
|
||
if node in organic_subset and not batom: | ||
yield idx, idx + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work for multi-letter organic atoms, such as Br!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed; hence PR #4 🤣
|
||
def strip_aromatic_nodes(smiles_str): | ||
""" | ||
Find all aromatic nodes and change them to lower |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Find all aromatic nodes and change them to lower | |
Find all aromatic nodes and change them to upper |
if node in organic_subset and not batom: | ||
yield idx, idx + 1 | ||
|
||
def strip_aromatic_nodes(smiles_str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the name of the function very much, since it deals with the smiles string, and not with graph nodes
@@ -102,7 +110,18 @@ def fragment_iter(fragment_str, all_atom=True): | |||
mol_graph.add_node(0, element="H", bonding=bonding_descrpt[0]) | |||
nx.set_node_attributes(mol_graph, bonding_descrpt, 'bonding') | |||
elif all_atom: | |||
mol_graph = pysmiles.read_smiles(smile) | |||
try: | |||
mol_graph = pysmiles.read_smiles(smile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mol_graph = pysmiles.read_smiles(smile) | |
mol_graph = pysmiles.read_smiles(smile, reinterpret_aromatic=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, not. The error is raised already at the parsing stage of the cgsmiles string. You categorically do not allow non-cyclic aromatics. We could modify pysmiles though to allow this if 'reinterpret_aromatic' is set to False
@@ -165,7 +165,8 @@ def edges_from_bonding_descrpt(self): | |||
bonding descriptors that formed the edge. Later unconsumed | |||
bonding descriptors are replaced by hydrogen atoms. | |||
""" | |||
for prev_node, node in nx.dfs_edges(self.meta_graph): | |||
for prev_node, node in self.meta_graph.edges: | |||
print(prev_node, node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print(prev_node, node) |
order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0]) | ||
print(order) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0]) | |
print(order) | |
order = re.findall("[-+]?[.]?[\d]+(?:,\d\d\d)*[\.]?\d*(?:[eE][-+]?\d+)?", bonding[0]) |
Cool, this got more complicated. At least needs a comment, but we should probably finish the discussion on #1 about this first.
no need for this anymore; the recent pysmiles changes fix everything |
Pysmiles does not allow partial aromatic fragments (for good reason) yet we still want to do them. Thus here we have a workaround. If two atoms are marked as aromatic we simply convert them to regular elements and later change it back. It's stupid but works.
Example Case Martini3 p-cresol:
The aromatic ring is fragmented into four fragments.