Name, Formula and SMILES Search

This form searches the chemical component dictionary by molecular name, molecular formula, or SMILES description. A number of search types are supported:

Chemical Component Identifier search options:

Molecular formula search options:

Note: formulas must be entered with spaces, e.g.: C6 H11 N2 O7 P.
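
As a convenience, a compact formula can be converted to the space-separated form before it is pasted into the search box. The following is a minimal sketch, not part of the search form itself; the function name space_formula is hypothetical, and the code only checks the shape of each token (a capital letter, an optional lowercase letter, an optional count), not whether the symbol is a real element.

```python
import re

# One element symbol (capital letter, optional lowercase letter)
# followed by an optional integer count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def space_formula(formula: str) -> str:
    """Return the formula with a single space between element/count pairs.

    Hypothetical helper: accepts input with or without spaces.
    """
    compact = "".join(formula.split())
    tokens = _TOKEN.findall(compact)
    # Reject input that is not fully made up of element/count tokens.
    if "".join(symbol + count for symbol, count in tokens) != compact:
        raise ValueError(f"unrecognised formula: {formula!r}")
    return " ".join(symbol + count for symbol, count in tokens)


if __name__ == "__main__":
    print(space_formula("C6H11N2O7P"))
```

Input that already contains spaces passes through unchanged, so the helper can be applied to any formula before searching.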

Molecular name search options:

SMILES search options:

InChI search options:

Sketch Input and/or Structure Search

This powerful search tool, which uses the MarvinSketch 2D drawing applet, serves three functions:

  1. to upload/build a query molecule for a substructure search of the chemical component dictionary
  2. to build a novel annotated chemical component definition
  3. to convert chemical component descriptions between different formats for external use

To search for a chemical component, either upload its chemical description in one of the following formats (an mmCIF format chemical component, a MOL/SDF file, a REFMAC/PHENIX monomer library file (mmCIF), or PDB formatted coordinates) or input a SMILES description of the molecule. Once MarvinSketch displays the chemical structure, edit it if necessary to match your molecule of interest, and click on the Search button. The substructure search will try to match the target 2D chemical structure with components in the current chemical dictionary.

The following substructure matching options are supported:

Note: Strict and relaxed matches are only performed on molecules with corresponding all/heavy formulas. Close matches are performed on molecules that have similar but larger formulas (up to +3 in any element).
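
The formula pre-filter described in the note can be pictured with a short sketch. This is only one reading of the note, not the server's actual implementation: it assumes that strict matching compares the full (all-atom) formula, that relaxed matching compares the heavy-atom formula with hydrogens ignored, and that close matching accepts a candidate whose count for every element, hydrogen included, is equal to the query's or larger by at most 3. The names parse_formula, heavy, and formula_prefilter are hypothetical.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms per element in a space-separated formula, e.g. 'C3 H7 N O2'."""
    counts = Counter()
    for token in formula.split():
        match = re.fullmatch(r"([A-Z][a-z]?)(\d*)", token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        # A missing count means one atom of that element.
        counts[match.group(1)] += int(match.group(2) or 1)
    return counts


def heavy(counts: Counter) -> Counter:
    """Drop hydrogen, leaving the heavy-atom formula."""
    return Counter({el: n for el, n in counts.items() if el != "H"})


def formula_prefilter(query: str, candidate: str, mode: str, max_extra: int = 3) -> bool:
    """Decide whether a candidate formula qualifies for a given match mode.

    Assumed semantics (see the lead-in): 'strict' = identical all-atom formula,
    'relaxed' = identical heavy-atom formula, 'close' = every element count in
    the candidate exceeds the query's by 0..max_extra.
    """
    q, c = parse_formula(query), parse_formula(candidate)
    if mode == "strict":
        return q == c
    if mode == "relaxed":
        return heavy(q) == heavy(c)
    if mode == "close":
        # Counter returns 0 for elements that are absent from a formula.
        return all(0 <= c[el] - q[el] <= max_extra for el in set(q) | set(c))
    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, an alanine query (C3 H7 N O2) passes the close pre-filter against serine (C3 H7 N O3), which has one extra oxygen, but not against phenylalanine (C9 H11 N O2), which has six extra carbons.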

To sketch a novel chemical component to use as a search query or to save for later, the easiest method is to upload a similar pre-existing chemical component and modify it in MarvinSketch to obtain your target molecule. If necessary, you can draw the 2D chemical structure from scratch by selecting only the size of the drawing window at the bottom of the search form and clicking on the Launch button. Once you have drawn your chemical structure, either select your Search options (below) to find it or related chemical components, or click on the Save File button to export your drawing in mmCIF, CML, MOL, SMARTS, SMILES, SDF, or SYBYL format. You can also use this tool to convert existing chemical components between formats.

Instance Search

An instance search takes a 3-letter component identifier as input and returns a table of all PDB entries containing that component. The Display option presents the information in two ways:

Each instance is individually available for download in PDB, MOL/SDF, and mmCIF formats. You can also launch the MarvinSketch viewer to look at the chemical structure in this instance.

Browse Search

The Browse feature allows the user to explore the content of the wwPDB chemical dictionary in a number of categories. Menus are provided to select amino acids, nucleotides, selected top-selling pharmaceuticals, and common aromatic ring systems. Searches are performed by finding structures containing a SMILES pattern or by comparison to a chemical fingerprint. The chemical fingerprint consists of approximately 1000 individual chemical features such as the presence of common functional groups or ring systems. Fingerprints are considered similar if their Tanimoto similarity score is greater than 0.8. The query results include links to the chemical component listing, downloadable coordinates, and links to search for further analogs by similar name, similar SMILES string, or similar chemical formula.

To measure similarity or distance in chemical space we precompute a chemical fingerprint for each chemical component in our dictionary. Our chemical fingerprint was developed by Christian Laggner for the OpenBabel software system. The fingerprint contains the SMILES patterns for approximately 1000 chemical features such as the presence of common functional groups or ring systems. Each component is tested for the presence of the chemical patterns in the fingerprint. The results are stored in a bit vector in which each bit is set to 1 or 0 to denote the presence or absence of a particular feature. Two fingerprints (A and B) are compared by using the Tanimoto similarity score, a value between 0 and 1, which is defined as:

Tanimoto score = (A .AND. B) / ( A + B - (A .AND. B) )

where:

A
is the number of bits set in fingerprint A
B
is the number of bits set in fingerprint B
(A .AND. B)
is the number of bits set after calculating the bitwise logical AND between A and B

Note that the SMILES and fingerprint comparisons may produce some unanticipated results. The SMILES comparison will match substructures of the target molecule within any other component in our chemical dictionary. For small or simple targets (e.g. alanine), this may result in a large number of matches. Results of the chemical fingerprint comparison reflect the bias of the patterns in our chemical fingerprint. This comparison may be useful for locating molecules that share some common features, though without the selectivity of a substructure match.
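
The Tanimoto score defined above can be illustrated with a small sketch. It assumes that a fingerprint is held as a Python integer used as a bit vector, which is a choice made here for illustration and not necessarily how the dictionary stores fingerprints. The function names make_fingerprint, tanimoto, and is_similar are hypothetical, and returning 0.0 for two empty fingerprints is an assumed convention. The 0.8 threshold is the one quoted for the Browse feature.

```python
def make_fingerprint(feature_flags) -> int:
    """Pack a sequence of present/absent flags into an integer bit vector.

    Bit i is set to 1 when feature i is present and left at 0 otherwise.
    """
    fingerprint = 0
    for position, present in enumerate(feature_flags):
        if present:
            fingerprint |= 1 << position
    return fingerprint


def tanimoto(a: int, b: int) -> float:
    """Tanimoto score = (A .AND. B) / (A + B - (A .AND. B))."""
    bits_a = bin(a).count("1")           # bits set in fingerprint A
    bits_b = bin(b).count("1")           # bits set in fingerprint B
    bits_common = bin(a & b).count("1")  # bits set in the bitwise AND
    denominator = bits_a + bits_b - bits_common
    # Assumed convention: two empty fingerprints score 0.0 rather than
    # raising a division-by-zero error.
    return bits_common / denominator if denominator else 0.0


def is_similar(a: int, b: int, threshold: float = 0.8) -> bool:
    """Fingerprints count as similar when the score is greater than the threshold."""
    return tanimoto(a, b) > threshold


if __name__ == "__main__":
    fp_a = make_fingerprint([True, True, True, True, True, True])
    fp_b = make_fingerprint([True, True, True, True, True, False])
    print(tanimoto(fp_a, fp_b), is_similar(fp_a, fp_b))
```

In the usage example the two fingerprints share five set bits out of six in total, so the score is 5/6, which is above the 0.8 cutoff.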

