
Tetragenococcus halophilus NBRC 12172


About this Microorganism

Photo by Noda Institute for Scientific Research
Tetragenococcus halophilus NBRC 12172, a halotolerant lactic acid bacterium, was isolated more than 40 years ago from soy sauce brewing mashes by K. Sakaguchi (Noda Institute for Scientific Research) and has since been maintained by the Institute for Fermentation, Osaka (IFO) and NBRC. A subsequent study showed that T. halophilus makes up the overwhelming majority of the lactic acid bacterial population in the traditional soy sauce brewing process. In modern soy sauce brewing it is also used to lower the pH of the broth and to contribute to flavor production.
The lactic acid bacteria whose genomes have been analyzed so far are mainly derived from animal sources. Plant-derived lactic acid bacteria such as this strain are expected to offer various functionalities, for example the improvement of allergic symptoms, and are thought to be well suited to the physical constitution and eating habits of Japanese people. Nevertheless, their genomes have rarely been analyzed.
The genome analysis of T. halophilus NBRC 12172 revealed a single circular chromosome of 2,562,720 bp. The presence of genes involved in maintaining osmotic balance explains the high salt resistance of T. halophilus NBRC 12172. The genome sequence may facilitate efficient and rational breeding of this bacterium and contribute to improved product quality through fermentation control using methods such as DNA microarrays. In addition, this strain may serve as a safe genetic resource for a large number of enzymes with excellent salt resistance and stability for industrial applications.

Project history

2013-06-18 Release of the Tetragenococcus halophilus NBRC 12172 genomic and proteomic data
We published the genomic and proteomic data of Tetragenococcus halophilus NBRC 12172.

Summary of the genomic data

Genomic size                          2,562,720 bp
G+C content                           36.04 %
Number of ORFs assigned               2,555
Percentage of the coding regions      87.21 %
Percentage of the intronic regions    0.00 %
Number of rRNA genes                  15
Number of tRNA genes                  62
Number of other features
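
The figures in the summary can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis. The function and key names are illustrative, and the mean coding length per ORF assumes that the reported coding percentage refers to the assigned ORFs only, which the summary does not state.

```python
# Values copied from the summary of the genomic data.
GENOME_SIZE_BP = 2_562_720
GC_PERCENT = 36.04
NUM_ORFS = 2_555
CODING_PERCENT = 87.21


def derived_stats(genome_bp, gc_percent, num_orfs, coding_percent):
    """Return quantities derived from the summary figures (illustrative helper)."""
    coding_bp = genome_bp * coding_percent / 100.0
    return {
        # Approximate number of G or C bases in the chromosome.
        "gc_bases": round(genome_bp * gc_percent / 100.0),
        # Approximate number of bases lying in coding regions.
        "coding_bp": round(coding_bp),
        # Gene density: assigned ORFs per 1,000 bp of genome.
        "orfs_per_kb": num_orfs / (genome_bp / 1000.0),
        # Assumption: all coding bases belong to the assigned ORFs.
        "mean_coding_bp_per_orf": coding_bp / num_orfs,
    }


stats = derived_stats(GENOME_SIZE_BP, GC_PERCENT, NUM_ORFS, CODING_PERCENT)
for name, value in stats.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the gene density comes out at roughly one ORF per kilobase, which is typical of compact bacterial genomes.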

General Procedure

The nucleotide sequence of the Tetragenococcus halophilus NBRC 12172 genome was determined by the whole genome shotgun sequencing method as in the case of other organisms analyzed at NITE Biotechnology Center.

General Procedure
  • DNA shotgun libraries
    DNA shotgun libraries with inserts of 1.5 and 5 kb in the pUC118 vector (TAKARA) were constructed.

  • Fosmid library
    A Fosmid library with inserts of 40 kb in the pCC1FOS fosmid vector was constructed using the CopyControl Fosmid Library Production Kit (Epicentre).

  • Nucleotide sequencing
    Plasmid and Fosmid clones were end-sequenced using dye-terminator chemistry on an ABI Prism 3730 sequencer (ABI).
    Sequence reads were trimmed at a threshold quality value of 20 by Phred and assembled using PHRAP/CONSED software.

  • Gap closing
    Fosmid end sequences were mapped onto the assembled sequence.
    Gaps between the assembled sequences were closed by primer walking on gap-spanning fosmid clones or with PCR products from genomic DNA.

  • Validation of the assembled sequence data
    In constructing the final nucleotide sequence, low-quality regions with a Phrap quality score of less than 40 were re-sequenced and verified. As a result, every base of the final genome sequence is supported by a Phrap quality value of at least 40.
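
The quality thresholds of 20 and 40 used above are on the Phred scale, where a quality value Q corresponds to an estimated error probability of 10^(-Q/10). The sketch below illustrates what these thresholds mean in practice. It is a simplified illustration only: the end-trimming rule is an assumption chosen for clarity and is not the trimming algorithm implemented in Phred, and all function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality value to an estimated error probability."""
    return 10 ** (-q / 10)


def trim_ends(seq, quals, threshold=20):
    """Remove bases from both ends while their quality is below the threshold.

    Simplified stand-in for read trimming; Phred itself uses a different rule.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have the same length")
    start, end = 0, len(seq)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return seq[start:end], quals[start:end]


def low_quality_positions(quals, threshold=40):
    """Return 0-based positions whose quality is below the threshold.

    Such positions would be candidates for re-sequencing during validation.
    """
    return [i for i, q in enumerate(quals) if q < threshold]


# Q20 corresponds to about 1 error in 100 bases, Q40 to about 1 in 10,000.
for q in (20, 40):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4%}")

trimmed_seq, trimmed_quals = trim_ends("ACGTACGT", [5, 10, 30, 40, 35, 25, 12, 3])
print(trimmed_seq, trimmed_quals)
```

A threshold of 40 for the finished sequence therefore corresponds to an expected error rate of no more than about one base in 10,000.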

Gene identification and annotation
  • Putative non-translated genes were identified using the Rfam, tRNAscan-SE and ARAGORN programs.

  • The prediction of open reading frames (ORFs) was performed using Glimmer3. The initial set of ORFs was manually selected from the prediction result in combination with BLASTP results.

  • For functional annotation, the non-redundant UniProt database and the protein signature database InterPro were searched, and functions were assigned to the predicted protein sequences on the basis of sequence similarity.

  • The KEGG database was used for pathway reconstruction.

  • Signal peptides in proteins were predicted using SignalP and transmembrane helices were predicted using TMHMM.
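
To illustrate what an open reading frame is in the prediction step above, the following sketch scans both strands of a DNA sequence in all six reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a deliberately naive illustration and not the method used here: Glimmer3 scores candidate ORFs with interpolated Markov models, which this sketch does not attempt, and alternative bacterial start codons such as GTG and TTG are ignored. All names are hypothetical.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons=3):
    """Return (start, end) half-open coordinates of ATG...stop ORFs on one strand.

    min_codons counts every codon of the ORF, including start and stop.
    """
    results = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None:
                if codon == START:
                    orf_start = i  # first ATG opens a candidate ORF
            elif codon in STOPS:
                if (i + 3 - orf_start) // 3 >= min_codons:
                    results.append((orf_start, i + 3))
                orf_start = None  # look for the next ATG in this frame
    return results


def find_orfs(seq, min_codons=3):
    """Scan both strands; '-' coordinates refer to the reverse-complement strand."""
    seq = seq.upper()
    found = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    found += [("-", s, e)
              for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return found


example = "CCATGAAATTTTAGGG"
for strand, start, end in find_orfs(example):
    print(strand, start, end, example[start:end] if strand == "+" else "")
```

A naive scan of this kind reports many spurious candidates on a real genome, which is why the predicted ORFs were curated manually in combination with BLASTP results.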

Related links to external databases