This repository contains a set of notebooks (organized into separate topics by lab) for an introduction to cheminformatics in the Python programming language. Introductions to programming in Python are included. The notebooks rely on a set of dependencies, both Python and other packages, listed in environment.yml. Thus, installing a Python environment using conda is required. Conda can be installed as described on the conda website. Some notebooks are taken from the incredible Teach Open Computer-aided Drug Design course. However, the notebooks in this repository are geared toward novice programmers.
These notebooks can be run on a server by clicking the Binder tag above. However, changes to the notebooks (e.g., your work) are not saved between sessions, so this is only recommended as a temporary solution.
After conda is properly installed, a new conda environment for these notebooks can be created by the command:
conda env create -f environment.yml
The environment can then be activated by:
conda activate intro-chem
And the notebooks can be opened via:
jupyter lab
or
jupyter notebook
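Before opening the notebooks, it can help to confirm that the activated environment actually provides the main libraries. The following is a minimal sketch, not part of the repository: the package names are assumptions based on the libraries mentioned in the topic list (the authoritative list is environment.yml), and `check_packages` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed top-level package names; environment.yml is the authoritative list.
ASSUMED_PACKAGES = ("rdkit", "pandas", "matplotlib", "jupyterlab")


def check_packages(names=ASSUMED_PACKAGES):
    """Return a dict mapping each package name to True if it can be found."""
    # find_spec looks a top-level module up without importing it.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If any package is reported as missing, the intro-chem environment is probably not the active one.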
- Course Introduction
  #TODO: This is empty
- Introduction to chemical structure annotations: 2D & 3D
  Teach structure annotations: ChemBioOffice, SMILES, SDF, etc. First homework assignment. #TODO: This is empty
- Basic Python, Pandas and matplotlib
  Introduce the basics of programming using the Python language. Introduce popular data science and visualization packages, Pandas and matplotlib.
- Advanced Python and RDKit
  A brief introduction to objects in Python. Introduction to the cheminformatics Python library RDKit.
- Molecular Descriptors
  Explores the basics of generating molecular descriptors and chemical fingerprints.
- Chemical Similarity
  The basics of chemical similarity using different metrics and descriptor spaces.
- Machine Learning Part I - Unsupervised ML
  The basics of unsupervised machine learning, including principal component analysis and chemical clustering.
- Machine Learning Part II - QSAR
  The basics of supervised machine learning and developing quantitative structure-activity relationship (QSAR) models. Develop classification and regression models on a set of benzodiazepines.
- Data science in chemistry
  Learning about web retrieval services and how to access chemistry databases via programmatic interfaces.
- Predictive modeling
  Develop your own QSAR models and make predictions.
- Pharmacophores
  Generating pharmacophores for EGFR using a ligand-based approach.
- Deep Learning
  Develop a deep learning model using data provided by the NICEATM acute oral toxicity challenge.
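
As a taste of the Chemical Similarity topic, the sketch below compares two fingerprints with the Tanimoto and Dice coefficients. It is an illustration under stated assumptions, not code from the notebooks: each fingerprint is represented as a plain Python set of on-bit indices, and the example bit sets are made up rather than computed from real molecules. In the notebooks, fingerprints would come from RDKit instead.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def dice(fp_a, fp_b):
    """Dice coefficient: twice the shared on-bits over the sum of on-bit counts."""
    a, b = set(fp_a), set(fp_b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # same convention as above
    return 2 * len(a & b) / total


if __name__ == "__main__":
    # Hypothetical on-bit indices, for illustration only.
    fp_a = {1, 4, 7, 9}
    fp_b = {1, 4, 8}
    print(f"Tanimoto: {tanimoto(fp_a, fp_b):.3f}")
    print(f"Dice:     {dice(fp_a, fp_b):.3f}")
```

The two fingerprints share 2 on-bits out of 5 distinct ones, so the Tanimoto coefficient is 2/5, while the Dice coefficient is 4/7. The same pair of fingerprints can therefore look more or less similar depending on the metric chosen.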