Actinobacteria are Gram-positive bacteria that are ubiquitous and abundant in soil. They develop vegetative hyphae, aerial mycelium and conidial spores during their life cycle, and thus morphologically resemble fungi. An interesting and important property of streptomycetes is their ability to produce a variety of antibiotics, such as streptomycin, erythromycin and tetracycline, through complex secondary metabolic pathways. The antibiotics they produce account for 60% of naturally occurring antibiotics and are used as antibacterial, antifungal, antiviral, antiparasitic, immunosuppressant and antitumor medicines.
Streptomyces avermitilis (or Streptomyces avermectinius) MA-4680T (= NBRC 14893T) is the producer of the anthelmintic macrolide "avermectin". It was isolated by Omura et al. of the Kitasato Institute from a soil sample collected in Ito City, Shizuoka Prefecture. It has an unusually large genome of 9.02 Mb with a high G+C content, which, as in other bacteria belonging to the genus Streptomyces, exists as a linear chromosome. Both ends (telomeres) of the linear chromosome contain terminal inverted repeats and covalently bound terminal proteins (TPs). Analysis of the genome led to the identification of 30 gene clusters involved in secondary metabolite biosynthesis, half of which are located near either end of the chromosome. Of these, twelve were found to be involved in polyketide synthesis. Analysis of these gene clusters and the enzymes involved should provide clues to the synthesis of many secondary metabolites and their physiological roles.
2003-04-08
Release of the Streptomyces avermitilis MA-4680 genomic sequence data
We published the genomic data of Streptomyces avermitilis MA-4680.
Summary of the genomic data
Number of ORFs assigned
Percentage of the coding regions
Percentage of the intronic regions
Number of rRNA genes
Number of tRNA genes
Number of other features (misc_RNA,misc_feature,repeat)
The genome of S. avermitilis MA-4680 was analyzed, as described below, by nucleotide sequencing of whole genome shotgun (WGS) clones, a task made rather difficult by the high G+C content of the genome (ca. 71%). To facilitate ORF assignment, FramePlot(1), a computer program developed specifically for the analysis of high G+C bacterial genomes, was used.
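To illustrate why a high G+C content helps as well as hinders, the following minimal sketch computes the overall G+C fraction of a sequence and the G+C fraction at the third codon position in each of the three forward reading frames. In high G+C genomes the third position of real codons tends to be strongly G+C biased, which is the kind of signal a frame-based analysis can exploit. This is not the FramePlot program itself; the function names and the toy sequence used to exercise them are assumptions made for illustration only.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def gc3_by_frame(seq):
    """Return the third-codon-position G+C fraction for forward frames 0, 1 and 2.

    Only the forward strand is examined; a full analysis would also
    scan the reverse complement and use a sliding window.
    """
    seq = seq.upper()
    fractions = []
    for frame in range(3):
        # Third positions of the codons that start at offset `frame`.
        third_positions = seq[frame + 2::3]
        fractions.append(gc_content(third_positions))
    return fractions


if __name__ == "__main__":
    toy = "ATGATCATG"  # hypothetical toy sequence, not S. avermitilis data
    print(gc_content(toy), gc3_by_frame(toy))
```

In practice the per-frame values would be computed in a sliding window along the genome, so that the frame with a persistently high third-position G+C content can be compared with the predicted ORFs.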
Procedure of Genome Analysis
Construction of the whole genome shotgun (WGS) clones
The genomic DNA of S. avermitilis was hydrodynamically sheared to yield DNA fragments of 1-2 kb in size, which were then treated with T4 DNA polymerase and T4 polynucleotide kinase to generate blunt ends and cloned into pUC118.
Nucleotide sequencing of the WGS clones and data assembly
The DNA in each of the WGS clones was amplified by PCR using the M13 forward and reverse primers, and sequenced using the DYEnamic ET Dye Terminator kit from Amersham. Ten ABI-3700 and two MegaBACE (only in the initial stage) DNA sequencers were used. The collected data were assembled using the Phred/Phrap/Consed software package (http://www.phrap.org/) and SPS Phrap software.
Refinement of the sequence data
A library of cosmid clones, each containing an insert of approximately 40 kbp, was constructed, and the ends of each clone were sequenced to identify the contigs containing them, thereby establishing the correspondence among the contigs that resulted from the assembly of the WGS clone sequence data. PCR analysis between the ends of contigs was also carried out.
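The use of cosmid end sequences to relate contigs can be sketched as follows. Given, for each cosmid clone, the contig hit by each of its two end reads, the sketch counts how many clones connect each pair of distinct contigs. The input format, the clone and contig names, and the function name are hypothetical; the source does not describe the software actually used for this step.

```python
from collections import Counter


def contig_links(end_hits):
    """Count cosmid clones whose two end reads fall in different contigs.

    end_hits maps a clone name to a (contig_of_one_end, contig_of_other_end)
    tuple; None means that end read could not be placed.
    Returns a Counter keyed by sorted contig pairs.
    """
    links = Counter()
    for clone, (left, right) in end_hits.items():
        if left is None or right is None:
            continue  # one end unplaced: no linking information
        if left == right:
            continue  # both ends in the same contig: nothing to bridge
        links[tuple(sorted((left, right)))] += 1
    return links


if __name__ == "__main__":
    # Hypothetical end-read placements, for illustration only.
    hits = {
        "cos001": ("contigA", "contigB"),
        "cos002": ("contigB", "contigA"),
        "cos003": ("contigB", "contigB"),
        "cos004": ("contigC", None),
    }
    for pair, count in contig_links(hits).items():
        print(pair, count)
```

Contig pairs supported by several independent clones are the natural candidates for gap-spanning PCR between contig ends.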
Validation of the sequence data
A physical map of AseI and DraI cleavage sites was constructed and compared with the map deduced from the final assembled nucleotide sequence. The nucleotide sequence data thus established were further analyzed at the Kitasato Institute.
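The map deduced from the sequence amounts to an in-silico digest. The sketch below locates AseI (AT^TAAT) and DraI (TTT^AAA) sites in a linear sequence and returns the resulting fragment lengths, which could then be compared with fragment sizes observed experimentally. Both recognition sequences are palindromic, so scanning one strand is sufficient. The function names and the short test sequence are assumptions for illustration, not part of the original analysis.

```python
# Recognition sequence and cut offset (bases from the start of the site).
SITES = {
    "AseI": ("ATTAAT", 2),  # AT^TAAT
    "DraI": ("TTTAAA", 3),  # TTT^AAA
}


def cut_positions(seq, enzyme):
    """Return 0-based cut positions of `enzyme` along a linear sequence."""
    motif, offset = SITES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + offset)
        # Step by one base so that overlapping sites are also found.
        start = seq.find(motif, start + 1)
    return positions


def fragment_lengths(seq, enzyme):
    """Return fragment lengths, in order, after complete digestion.

    The molecule is treated as linear, so n sites give n + 1 fragments.
    """
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "GGGGATTAATCCCCTTTAAAGG"  # hypothetical toy sequence
    print(fragment_lengths(toy, "AseI"), fragment_lengths(toy, "DraI"))
```

Because both sites consist only of A and T, they are expected to be rare in a genome with about 71% G+C, which is what makes these enzymes suitable for a map of large fragments.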
ORF Assignment and Annotation
ORFs were first identified by using Glimmer trained with a set of genes of Streptomyces avermitilis and Streptomyces coelicolor A3(2). The predicted ORFs were then examined individually with reference to the results of analysis with FramePlot(1). When two genes overlapped, the more likely one was selected manually.
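Finding the overlapping predictions that need a manual decision can be sketched as a simple interval comparison. The sketch below only flags overlapping pairs; it does not choose between them, since that choice was made manually. The tuple layout, the `min_overlap` parameter and the ORF names are hypothetical and chosen for illustration.

```python
def overlapping_pairs(orfs, min_overlap=1):
    """Return (name_a, name_b, overlap_length) for each overlapping ORF pair.

    orfs is a list of (name, start, end) tuples with 1-based inclusive
    coordinates and start <= end. Strand is ignored in this sketch.
    """
    ordered = sorted(orfs, key=lambda orf: orf[1])
    flagged = []
    for i, (name_a, start_a, end_a) in enumerate(ordered):
        for name_b, start_b, end_b in ordered[i + 1:]:
            if start_b > end_a:
                break  # sorted by start, so no later ORF can overlap name_a
            overlap = min(end_a, end_b) - start_b + 1
            if overlap >= min_overlap:
                flagged.append((name_a, name_b, overlap))
    return flagged


if __name__ == "__main__":
    # Hypothetical predictions, for illustration only.
    predictions = [("orf1", 1, 300), ("orf2", 250, 600), ("orf3", 700, 900)]
    for pair in overlapping_pairs(predictions):
        print(pair)
```

Raising `min_overlap` would suppress short overlaps between adjacent genes and leave only the substantial conflicts for inspection.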
Similarity searches against non-redundant databases were conducted using BLAST. Searches for global and local matches against the Pfam database were also performed using HMMER.
Subsequently, each assigned ORF was annotated if it showed a high degree of similarity to genes/ORFs of other organisms.
Finally, the resultant ORFs were manually inspected and corrected if necessary.