Caldisericum exile AZM16c01T (= NBRC 104410T), isolated from a hot spring in Japan, was the first cultivated microorganism belonging to the candidate phylum OP5. The new phylum name Caldiserica was therefore proposed for candidate phylum OP5, whose existence had been indicated by environmental clone sequences from various environments such as hot springs and pollutant-associated sites.
C. exile AZM16c01T is a filamentous, Gram-negative, thermophilic bacterium that grows anaerobically by reducing sulfur compounds (thiosulfate, sulfite and elemental sulfur) to hydrogen sulfide, with optimal growth at 65 °C. Yeast extract is essential for growth, but the substrate compounds that serve as electron donors have not been determined experimentally.
The genome of C. exile AZM16c01T consists of a single circular chromosome of 1,558,103 base pairs (bp) with an average G + C content of 35.4%, harboring 1,582 predicted protein-coding sequences. Approximately 60% of the predicted proteins were most similar to proteins from Firmicutes, Dictyoglomi and Thermotogae, especially from members of these phyla that are thermophilic, anaerobic and heterotrophic. More interestingly, approximately 9% of the predicted proteins were most similar to archaeal proteins, even though the genome of C. exile AZM16c01T is small. Genome analysis revealed that most biosynthetic pathways, including those for amino acids, cofactors and nucleotides, have been lost in this microorganism, which suggests that it acts as a scavenger in its environment. Genes encoding various ferredoxin oxidoreductases, a heterodimeric sulfide dehydrogenase (Sud) and the mbx operon may be associated with sulfur reduction in C. exile AZM16c01T, in a manner similar to the single-enzyme respiratory system proposed for the archaeon Pyrococcus.
The genome sequence of C. exile AZM16c01T, a member of the new phylum Caldiserica, should contribute to clarifying biology and evolution in thermophilic environments.
Project history
2012-05-14
Release of the Caldisericum exile AZM16c01T (= NBRC 104410T) genomic data
We published the genomic data of Caldisericum exile AZM16c01T (= NBRC 104410T).
Summary of the genomic data
Genomic size: 1,558,103 bp
G+C content: 35.37 %
Number of ORFs assigned: 1,582
Percentage of the coding regions: 93.16 %
Percentage of the intronic regions: 0.00 %
Number of rRNA genes: 3
  5S: 1, 16S: 1, 23S: 1
Number of tRNA genes: 46
  Ala: 3, Arg: 5, Asn: 1, Asp: 1, Cys: 1, Gln: 2
  Glu: 2, Gly: 3, His: 1, Ile: 1, Leu: 5, Lys: 2
  Met: 3, Phe: 1, Pro: 3, Ser: 4, Thr: 3, Trp: 1
  Tyr: 1, Val: 3
Number of other features (misc_RNA,misc_feature,repeat)
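The figures in the summary above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the variable and function names are hypothetical, and the values are simply copied from the summary. It confirms that the per-type RNA gene counts add up to the stated totals and derives the approximate number of coding bases and the ORF density.

```python
# Values copied from the genome summary; all names here are illustrative.
GENOME_SIZE_BP = 1_558_103
NUM_ORFS = 1_582
CODING_PERCENT = 93.16

RRNA_COUNTS = {"5S": 1, "16S": 1, "23S": 1}
TRNA_COUNTS = {
    "Ala": 3, "Arg": 5, "Asn": 1, "Asp": 1, "Cys": 1, "Gln": 2,
    "Glu": 2, "Gly": 3, "His": 1, "Ile": 1, "Leu": 5, "Lys": 2,
    "Met": 3, "Phe": 1, "Pro": 3, "Ser": 4, "Thr": 3, "Trp": 1,
    "Tyr": 1, "Val": 3,
}


def coding_bases(genome_size_bp: int, coding_percent: float) -> float:
    """Approximate number of bases that fall inside coding regions."""
    return genome_size_bp * coding_percent / 100


def summarize() -> dict:
    """Derive totals and densities from the tabulated values."""
    return {
        "rrna_total": sum(RRNA_COUNTS.values()),
        "trna_total": sum(TRNA_COUNTS.values()),
        "coding_bp": coding_bases(GENOME_SIZE_BP, CODING_PERCENT),
        "orfs_per_kb": NUM_ORFS / (GENOME_SIZE_BP / 1000),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

The ORF density works out to roughly one ORF per kilobase, which is in line with the compact, densely coded genome described in the overview.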
The nucleotide sequence of the Caldisericum exile AZM16c01T genome was determined by the whole-genome shotgun sequencing method, as for other organisms analyzed at the NITE Biotechnology Center.
General Procedure
DNA shotgun libraries
DNA shotgun libraries with inserts of 1.7 and 5 kb in the pUC118 vector (TAKARA) were constructed.
Fosmid library
A Fosmid library with inserts of 38 kb in the pCC1FOS fosmid vector was constructed using the CopyControl Fosmid Library Production Kit (Epicentre).
Nucleotide sequencing
Plasmid and Fosmid clones were end-sequenced using dye-terminator chemistry on an ABI Prism 3730 sequencer (ABI).
Sequence reads were trimmed at a threshold quality value of 20 by Phred and assembled using PHRAP/CONSED software (http://www.phrap.org).
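To illustrate what trimming at a quality threshold of 20 means, here is a minimal sketch that removes low-quality bases from both ends of a read. It is a simplification for illustration only: it is not the trimming algorithm implemented in Phred, and the function name and input layout (a base string plus a parallel list of quality values) are assumptions.

```python
def trim_by_quality(bases, quals, threshold=20):
    """Trim bases from both ends until a base with quality >= threshold is met.

    Simplified end trimming for illustration; NOT Phred's own algorithm.
    Returns the trimmed bases and their quality values.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")

    start, end = 0, len(quals)
    # Advance past low-quality bases at the 5' end.
    while start < end and quals[start] < threshold:
        start += 1
    # Step back over low-quality bases at the 3' end.
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return bases[start:end], quals[start:end]


if __name__ == "__main__":
    read = "ACGTACGT"
    qualities = [5, 10, 25, 30, 12, 40, 22, 8]
    print(trim_by_quality(read, qualities))
```

Note that this sketch keeps an internal low-quality base (quality 12 in the example) as long as it is flanked by acceptable bases; such positions are the kind of region addressed later by re-sequencing.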
Gap closing
Fosmid end sequences were mapped onto the assembled sequence.
Fosmid clones that link two contigs were selected and sequenced by primer walking to close any gaps.
In some cases, Fosmid clones were subcloned by Entranceposon insertion using the Template Generation System II Kit (Finnzymes) and then sequenced.
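The selection of gap-spanning clones described above can be sketched as follows. This is an illustrative outline rather than the pipeline actually used: it assumes that the mapping step has already produced, for each Fosmid clone, the contig hit by each of its two end sequences (or None when an end did not map), and all identifiers are hypothetical.

```python
from collections import defaultdict


def find_linking_clones(end_hits):
    """Group clones whose two end sequences map to different contigs.

    end_hits: dict mapping clone ID -> (contig of one end, contig of other end);
              None marks an end that could not be mapped.
    Returns a dict mapping an ordered contig pair to the clones that link it.
    """
    links = defaultdict(list)
    for clone_id, (contig_a, contig_b) in sorted(end_hits.items()):
        # Skip clones with an unmapped end or with both ends in one contig.
        if contig_a is None or contig_b is None or contig_a == contig_b:
            continue
        pair = tuple(sorted((contig_a, contig_b)))
        links[pair].append(clone_id)
    return dict(links)


if __name__ == "__main__":
    hits = {
        "F001": ("contig1", "contig1"),
        "F002": ("contig1", "contig2"),
        "F003": ("contig2", "contig1"),
        "F004": ("contig3", None),
    }
    print(find_linking_clones(hits))
```

Each clone returned for a contig pair is a candidate template for primer walking across the gap between those two contigs.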
Validation of the assembled sequence data
In constructing the final nucleotide sequence, low-quality regions with a Phrap quality score of less than 40 were re-sequenced and verified. As a result, every base of the genome is supported by a Phrap quality value of at least 40.
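For context on the thresholds of 20 and 40, Phred-scaled quality values are related to the estimated error probability P by Q = -10 log10(P). The short sketch below, with hypothetical function names, converts between the two so the thresholds can be read as error rates.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Estimated probability that a base call with quality q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred-scaled quality corresponding to error probability p."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (20, 40):
        print(f"Q{q}: estimated error probability {phred_to_error_prob(q):.4%}")
```

A quality value of 20 corresponds to an estimated 1 error in 100 bases, and a value of 40 to an estimated 1 error in 10,000 bases.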
Gene identification and annotation
Putative non-translated genes were identified using the Rfam, tRNAscan-SE and ARAGORN programs.
The prediction of open reading frames (ORFs) was performed using Glimmer3.
The initial set of ORFs was manually selected from the prediction result in combination with BLASTP results.
For functional annotation, the non-redundant UniProt database and the protein signature database InterPro were searched, and functions were assigned to the predicted protein sequences on the basis of sequence similarity.
The KEGG database was used for pathway reconstruction.
Signal peptides in proteins were predicted using SignalP and transmembrane helices were predicted using TMHMM.