The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis
journal contribution
posted on 2023-06-08, 11:26 authored by Frances Pearl, Annabel Todd, Ian Sillitoe, Mark Dibley, Oliver Redfern, Tony Lewis, Christopher Bennett, Russell Marsden, Alistair Grant, David Lee, Adrian Akpor, Michael Maibaum, Andrew Harrison, Timothy Dallman, Gabrielle Reeves, Ilhem Diboun, Sarah Addou, Stefano Lise, Caroline Johnston, Antonio Sillero, Janet Thornton, Christine Orengo
The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 43 229 domains classified into 1467 superfamilies and 5107 sequence families. Each structural family is expanded with sequence relatives from GenBank and completed genomes, using a variety of efficient sequence search protocols and reliable thresholds. This extended CATH protein family database contains 616 470 domain sequences classified into 23 876 sequence families. This results in the significant expansion of the CATH HMM model library to include models built from the CATH sequence relatives, giving a 10% increase in coverage for detecting remote homologues. An improved Dictionary of Homologous superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/) containing specific sequence, structural and functional information for each superfamily in CATH considerably assists manual validation of homologues. Information on sequence relatives in CATH superfamilies, GenBank and completed genomes is presented in the CATH-associated DHS and Gene3D resources. Domain partnership information can be obtained from Gene3D (http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/). A new CATH server has been implemented (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl) providing automatic classification of newly determined sequences and structures using a suite of rapid sequence and structure comparison methods. The statistical significance of matches is assessed and links are provided to the putative superfamily or fold group to which the query sequence or structure is assigned.
History
Publication status
- Published
File Version
- Published version
Journal
Nucleic Acids Research
ISSN
0305-1048
Publisher
Oxford University Press
Issue
S1
Volume
33
Article number
D247-D251
Department affiliated with
- Biochemistry Publications
Full text available
- Yes
Peer reviewed?
- Yes
Legacy Posted Date
2012-04-27
First Open Access (FOA) Date
2012-04-27
First Compliant Deposit (FCD) Date
2012-04-26