
Scientific Databases and Visualization - BioReader

The main focus of this project is the development and application of natural language processing (NLP) methods for handling chemical compound names. A chemical compound can have many different names: several trivial names as well as several systematic names, even when naming recommendations such as those of the International Union of Pure and Applied Chemistry (IUPAC) are followed. Furthermore, underspecified names and class names frequently occur in publications, databases and patents.

This project pursues two different approaches:

ChemHits identifies names of chemical compounds via string normalization. Input names are normalized and subsequently matched against one of several reference databases (ChEBI, KEGG, etc.). Version 1.0 was released in December 2009.

CLP(name2structure) aims at a deep analysis of a given name, resulting in a chemical structure and a classification for that name.
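The normalize-then-match idea described for ChemHits can be illustrated with a minimal Python sketch. This is not the ChemHits implementation: the normalization rules (case folding, removal of spaces and punctuation), the function names and the tiny reference table with its "REF:" identifiers are all assumptions made for illustration. A real system would load names from a reference database such as ChEBI or KEGG and would need more careful rules, since dropping punctuation can merge names that differ only in locants.

```python
import re
import unicodedata

def normalize_name(name):
    """Reduce a compound name to a canonical key (illustrative rules only)."""
    # Unify Unicode variants, then ignore case.
    text = unicodedata.normalize("NFKC", name).casefold()
    # Drop spaces, hyphens, commas, brackets and underscores.
    # Assumption: this is a deliberate simplification, not the ChemHits rule set.
    return re.sub(r"[\W_]+", "", text)

def build_index(reference):
    """Map each normalized name to the set of reference IDs that carry it."""
    index = {}
    for ref_id, names in reference.items():
        for name in names:
            index.setdefault(normalize_name(name), set()).add(ref_id)
    return index

def lookup(name, index):
    """Return the sorted reference IDs whose names match after normalization."""
    return sorted(index.get(normalize_name(name), set()))

# Hypothetical reference entries; the IDs are placeholders, not real accessions.
REFERENCE = {
    "REF:0001": ["D-Glucose", "Dextrose"],
    "REF:0002": ["Acetic acid", "Ethanoic acid"],
}
INDEX = build_index(REFERENCE)

print(lookup(" d glucose ", INDEX))
print(lookup("ACETIC-ACID", INDEX))
print(lookup("unknown compound", INDEX))
```

Because spelling variants collapse to the same key, one dictionary lookup per input name is enough. Names that do not match after normalization return an empty list and would have to be handled by a deeper analysis.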

The methods and tools developed under this project are to be used by curators of the SABIO-RK database for the identification of compounds.


 
page last modified: 15.02.2010,12:04



Project Manager

Group Leader, Priv.-Doz. Dr. Wolfgang Mueller
Email:
Phone: +49 (0)6221 - 533 - 231

Fax: +49 (0)6221 - 533 - 298








© EML Research gGmbH