Reputation: 31
I am trying to use RDKit's Chem.FindAllSubgraphsOfLengthN(mol, n) function, but I cannot extract any information from what it returns. The call runs, but I am unable to obtain the substructures.
Does anyone have suggestions for reading the output of this function after it runs?
I am expecting output, as either tuples or strings, that lists all substructures containing N atoms. In my case, I am looking for 4 atoms. Below is the code I ran, along with the output it produced.
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import AllChem
AllChem.SetPreferCoordGen(True)
from rdkit.Chem import rdmolops
mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
mol
structures = Chem.FindAllSubgraphsOfLengthN(mol,4)
structures
print(structures[1])
Output: <rdkit.rdBase._vectint object at 0x0000024C16BBA2E0>
structures[0]
Output: <rdkit.rdBase._vectint at 0x24c16ba62e0>
Upvotes: 1
Views: 434
Reputation: 31
A team member of mine wrote code that accesses the path or subgraph information. To clarify, paths have no branching, whereas subgraphs can be branched.
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import AllChem
AllChem.SetPreferCoordGen(True)
from rdkit.Chem import rdmolops
import csv
mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
mol
# Find all subgraphs of length 3 (three bonds) in the molecule
# Subgraphs may be branched
subgraphs = Chem.FindAllSubgraphsOfLengthN(mol, 3)
# Paths are unbranched; swap in the line below to get paths instead
#subgraphs = Chem.FindAllPathsOfLengthN(mol, 3)
print(len(subgraphs))
# Print out the connected SMILES for each subgraph
for subgraph in subgraphs:
    # Get the subgraph as a new molecule object
    sub_mol = Chem.PathToSubmol(mol, subgraph)
    # Generate the connected SMILES string for the subgraph
    subgraph_smiles = Chem.MolToSmiles(sub_mol, kekuleSmiles=True)
    print(subgraph_smiles)
Output
11
C=CCC
C=CCO
C=CCN
C=C(C)S
CCCS
OCCS
NCCS
CC(C)O
CC(C)N
CC(N)O
CC(N)O
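To make the path/subgraph distinction concrete, here is a minimal sketch that counts both for the same molecule. It assumes the default useBonds=True behaviour of FindAllPathsOfLengthN, so that the length is counted in bonds for both calls; the variable names are only illustrative.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')

# Subgraphs: any connected set of 3 bonds, branched or not
subgraphs = Chem.FindAllSubgraphsOfLengthN(mol, 3)

# Paths: only unbranched chains of 3 bonds
# (assumption: default useBonds=True, so length is in bonds here too)
paths = Chem.FindAllPathsOfLengthN(mol, 3)

print(len(subgraphs), len(paths))

# Bond-index sets that appear as subgraphs but not as paths are the branched ones
path_sets = {frozenset(p) for p in paths}
branched = [sorted(s) for s in subgraphs if frozenset(s) not in path_sets]
print(branched)
```

For this molecule the branched entries should be the "star" shaped fragments centred on one of the two inner carbons, which is why the subgraph count is larger than the path count.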
Upvotes: 0
Reputation: 1869
You have to convert each result with list(). And since FindAllSubgraphsOfLengthN returns bond indices and not atom indices, you have to look for three bonds to get four-atom substructures.
from rdkit import Chem
from rdkit.Chem import rdDepictor
rdDepictor.SetPreferCoordGen(True)
from rdkit.Chem.Draw import IPythonConsole
IPythonConsole.drawOptions.addBondIndices = True
mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')
mol
threebonds = Chem.FindAllSubgraphsOfLengthN(mol, 3)
for n in threebonds:
    print(list(n))
Output:
[0, 2, 5]
[0, 2, 4]
[0, 2, 3]
[0, 2, 1]
[1, 2, 5]
[1, 2, 4]
[1, 2, 3]
[2, 5, 4]
[2, 5, 3]
[2, 4, 3]
[3, 5, 4]
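If atom indices are what you are after, the bond indices above can be mapped back to atoms. This is only a sketch: bonds_to_atoms is a hypothetical helper name, not an RDKit function, and it relies on the standard Bond accessors GetBeginAtomIdx() and GetEndAtomIdx().

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('C=C(S)C(N)(O)C')

def bonds_to_atoms(mol, bond_indices):
    """Hypothetical helper: collect the atom indices touched by the given bonds."""
    atoms = set()
    for idx in bond_indices:
        bond = mol.GetBondWithIdx(idx)
        atoms.add(bond.GetBeginAtomIdx())
        atoms.add(bond.GetEndAtomIdx())
    return sorted(atoms)

# One sorted list of atom indices per three-bond subgraph
atom_sets = [bonds_to_atoms(mol, sg)
             for sg in Chem.FindAllSubgraphsOfLengthN(mol, 3)]

for atoms in atom_sets:
    print(atoms)
```

Note that three bonds give four atoms here only because the molecule has no rings; in a three-membered ring, three bonds would cover just three atoms, so check the length of each atom list if your molecules can contain small rings.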
Upvotes: 1