
An open-source hub for all your molecular featurizers
Discover an unparalleled diversity of molecular featurizers and deploy them directly in your machine learning workflows.


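import datamol as dm
from molfeat.calc import RDKitDescriptors2D

# Sample 500 SMILES strings from the FreeSolv dataset and pick one molecule
data = dm.data.freesolv().sample(500).smiles.values
mol2d = data[83]

# Compute the RDKit 2D descriptors for that molecule
calc = RDKitDescriptors2D()
calc(mol2d)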
What is molfeat?
molfeat is an open-source hub that makes it easy for ML scientists to evaluate and implement a wide range of molecular featurizers. Find the right featurizer for your workflow today.
MolT5
MolT5 is a self-supervised learning framework that pretrains transformer-based models on large amounts of unlabeled natural language text and molecule strings. The pretrained models can then generate high-quality outputs for molecule captioning and text-based molecule generation.
desc3D
3D molecular descriptors are numerical representations of chemical and physical properties of molecules that are based on 3D structures of molecules.
desc2D
2D molecular descriptors are numerical representations of chemical and physical properties of molecules that are based on 2D structures of molecules. We augment the RDKit 2D descriptors with additional optional properties.
mordred
Mordred calculates over 1800 molecular descriptors, including constitutional, topological, electronic, and geometrical descriptors, among others. Both 2D and 3D descriptors are supported and optional.
ecfp-count
ECFP-Count (Extended Connectivity Fingerprints-Count) is essentially the same as ECFP. However, instead of being hashed into a binary vector, the features skip the hashing step and are returned as a count vector.
pcqm4mv2_graphormer_base
Graph Transformer pretrained on PCQM4Mv2 for HOMO-LUMO energy gap prediction using 2D molecular graphs.
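To make the ecfp-count description above more concrete, here is a minimal pure-Python sketch of the difference between a binary fingerprint and a count representation. It is a toy illustration, not molfeat's or RDKit's implementation: the identifiers in `feature_ids`, the helper names `binary_fingerprint` and `count_features`, and the 8-bit vector length are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical substructure identifiers for one molecule (made up for illustration).
# Identifier 101 appears three times, 205 twice, 777 once.
feature_ids = [101, 205, 101, 777, 101, 205]


def binary_fingerprint(ids, n_bits=8):
    """Fold identifiers into a fixed-length bit vector that records presence only."""
    bits = [0] * n_bits
    for identifier in ids:
        bits[identifier % n_bits] = 1  # folding: different identifiers may share a bit
    return bits


def count_features(ids):
    """Keep one count per identifier, with no folding step."""
    return dict(Counter(ids))


binary = binary_fingerprint(feature_ids)
counts = count_features(feature_ids)

print(binary)
print(counts)
```

In this toy input, identifiers 101 and 205 fold onto the same bit, so the binary vector cannot tell them apart or say how often each occurs, while the count representation keeps both pieces of information.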