
DoBISCUIT (Database of BIoSynthesis cluster CUrated and InTegrated)

Secondary metabolites produced by actinomycetes are important as lead compounds and candidates for drug development. Polyketide synthases (PKSs) and non-ribosomal peptide synthetases (NRPSs) have attracted much attention for their roles in constructing complex compounds. Secondary metabolite biosynthesis gene clusters are often used in biotechnological applications such as heterologous expression and combinatorial biosynthesis. Many papers describing biosynthesis gene clusters are published every year, but information about a particular gene cluster is often scattered across many references and is not described in a comprehensive manner.

We constructed a literature-based database of known PKS and NRPS gene clusters. The database consists of biosynthesis cluster information, a search menu, a retrieve information menu and a KS/A domain sequence menu. It provides easy access to comprehensive information on biosynthesis clusters and lets users browse standardized, up-to-date gene descriptions. The database will thus serve as a useful reference for the analysis of secondary metabolite biosynthetic gene clusters.

0. Interface overview

General view: Top page, Compound list, Cluster information, CDS list, CDS information, Simple text search, Advanced text search, Module search, BLAST search, Category list, Get sequence, Data download, PCR amplified sequences, Novelty chart, PCR sequence BLAST

1. Top page

Welcome to DoBISCUIT!
If you are interested in biosynthesis cluster information, please enter from the compound list in the center.
If you would like to search this database, please enter from the upper-left panel.
If you would like to know what novel sequences exist in NBRC strains, please enter from the center-left panel.
If you would like to retrieve various biosynthesis cluster information, please enter from the lower-left panel.

2. Compound list

This page lists the compounds collected in this database. By clicking the tabs, users can change the sort order of the compounds based on their attributes.
The red icon links to Cluster information, and the green icon links to the CDS list.

3. Cluster information

This page shows integrated information about the biosynthesis cluster. It has 6 sections:
Compound, Origin, Genomic map, PKS/NRPS modules, References and Data download.
3.1. Compound section
The Compound section displays information about the compound produced: its chemical structure, activities and attributes such as the starter unit and sugar unit.
3.2. Origin section
The Origin section displays the strain that produces the compound. You can follow the hyperlink to access the corresponding NBRC strain. The original INSDC entries that report the sequence of the biosynthesis cluster are also displayed.
3.3. Genomic map section
The Genomic map section displays the coordinates of the genes encoded in the biosynthesis cluster.
If the original INSDC sequence was split into multiple entries, relative coordinates were reconstituted based on the descriptions in the references. Each gene is colored according to its biological function.
3.4. PKS/NRPS modules section
The PKS/NRPS modules section displays the domain composition of the PKS/NRPS enzymes encoded in the biosynthesis cluster. Each module is displayed on one line. The deduced substrate of each AT or A domain is shown in the right column. Putative inactive domains are written in lower case.
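The lower-case convention for inactive domains is easy to handle programmatically. The sketch below assumes a module is written as a hyphen-separated string of domain abbreviations such as `KS-AT-dh-KR-ACP`; that text format and the function name `parse_module` are illustrative assumptions, not a format DoBISCUIT is known to export.

```python
from typing import List, Tuple


def parse_module(module: str, sep: str = "-") -> List[Tuple[str, bool]]:
    """Split a module string into (domain, is_active) pairs.

    Assumptions (hypothetical, for illustration only): domains are
    separated by `sep`, and a domain written entirely in lower case
    is a putative inactive domain.
    """
    domains = []
    for token in module.split(sep):
        token = token.strip()
        if not token:
            continue  # skip empty tokens caused by stray separators
        is_active = not token.islower()
        domains.append((token.upper(), is_active))
    return domains
```

For example, `parse_module("KS-AT-dh-KR-ACP")` marks `DH` as inactive and the other four domains as active.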
3.5. Reference section
The Reference section lists references related to the biosynthesis cluster. We collected not only the papers reporting the nucleotide sequence but also functional analyses of tailoring enzymes published afterward.
3.6. Data download section
The Data download section lists the data files that users can download. We provide the nucleotide sequence, CDS nucleotide/amino acid sequences and manually assigned annotations in CSV or GenBank format. Clicking an icon starts the download.

4. CDS list

This page lists all CDSs encoded in the biosynthesis cluster. CDSs are ordered by their relative positions. The enzyme definition, gene name, substrate deduced from the compound structure and functional category are displayed.

5. CDS information

This page shows integrated information about a CDS encoded in the biosynthesis cluster. It has 6 sections: Location, Annotation, Genomic map, PKS/NRPS modules, Sequence and Features.
5.1. Location section
The Location section displays basic information about the CDS: the names of the organism, strain and contig in which the CDS is located.
5.2. Annotation section
The Annotation section displays functional information assigned manually by annotators. The functional category, product name (in standardized vocabulary), substrate assignment and other notes are displayed. The annotation given in the original GenBank entry is also displayed.
The related reference and the corresponding UniProt entry are shown as evidence for the annotation. A digest of the reference is written in Japanese as a comment.
5.3. Genomic map section
Same as the Genomic map section in Cluster information. See the Cluster information page.
5.4. PKS/NRPS modules section
Same as the PKS/NRPS modules section in Cluster information. See the Cluster information page.
5.5. Sequence section
The Sequence section displays the nucleotide or amino acid sequence of the CDS. Use the tab buttons to switch between the two. For PKS/NRPS, each domain region is highlighted in a different color. The signature sequence for the substrate is also shown in a different color.
5.6. Feature section
The Feature section displays the results of functional inference searches. The "Show BLAST table" button links to the result of a homology search (BLASTP) executed against the UniProt database. Domain signature search results against InterPro are also displayed.

6. Simple text search

The Simple text search form is provided in the upper-right corner of all pages. The search target is restricted to Compound name, Organism name and Annotation (product name and gene name). If you wish to search more detailed information, please use Advanced text search.

7. Advanced text search

The Advanced text search menu is linked from the middle-left panel of the Top page. Users can search within DoBISCUIT by entering search words, specifying the target fields and selecting the target clusters. Spaces between words are treated as an 'AND' operator. To search for an exact phrase, enclose the phrase in double quotation marks.

Search results are displayed as a list of clusters containing the search term. You can follow the hyperlink to the biosynthesis cluster page, or check the boxes of the clusters you want to browse and go to the CDS tab to browse their CDSs.
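The query rules above (space-separated words combined with AND, double quotes for an exact phrase) can be sketched as follows. This is a minimal illustration, not DoBISCUIT's actual search code: the function names `parse_query` and `matches` are hypothetical, and case-insensitive substring matching is an assumption.

```python
import re
from typing import List


def parse_query(query: str) -> List[str]:
    """Split a query into terms; double-quoted text stays one phrase."""
    pairs = re.findall(r'"([^"]+)"|(\S+)', query)
    return [phrase or word for phrase, word in pairs]


def matches(text: str, query: str) -> bool:
    """Return True if every term occurs in `text` (AND search).

    Assumption: matching is a case-insensitive substring test.
    """
    haystack = text.lower()
    return all(term.lower() in haystack for term in parse_query(query))
```

With these definitions, the query `polyketide "starter unit"` yields the two terms `polyketide` and `starter unit`, and a text matches only if it contains both.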

8. Module search

We also provide a module search menu to find CDSs whose modules contain a particular domain composition. All module patterns registered in DoBISCUIT are displayed in the upper part of the Module search page. Users can specify the module composition with the auxiliary input boxes displayed in the middle part of the page.
The result of a module search is a list of the CDSs containing the entered domain composition.
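The idea behind module search can be sketched as a containment test over each CDS's modules. The data layout (a dictionary mapping a CDS identifier to a list of modules, each a list of domain names), the identifiers and the function name are all hypothetical. The sketch also assumes that a module matches when it contains every requested domain regardless of order, which may differ from the matching rule DoBISCUIT applies.

```python
from typing import Dict, Iterable, List


def find_cds_with_module(
    cds_modules: Dict[str, List[List[str]]], wanted: Iterable[str]
) -> List[str]:
    """Return IDs of CDSs having at least one module with all wanted domains.

    Domain names are compared case-insensitively, so putative inactive
    (lower-case) domains also count as present in this sketch.
    """
    wanted_set = {domain.upper() for domain in wanted}
    hits = []
    for cds_id, modules in cds_modules.items():
        for module in modules:
            if wanted_set <= {domain.upper() for domain in module}:
                hits.append(cds_id)
                break  # one matching module is enough for this CDS
    return hits


# Hypothetical example data, not taken from DoBISCUIT.
example = {
    "cds_a": [["KS", "AT", "ACP"], ["KS", "AT", "DH", "KR", "ACP"]],
    "cds_b": [["C", "A", "PCP"]],
}
```

Here `find_cds_with_module(example, ["DH", "KR"])` returns only `cds_a`, because only its second module contains both domains.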

9. BLAST search

To search for homologous CDSs, enter your query sequence in FASTA format, specify the target database and select the BLAST program. We provide several BLAST databases: cluster (containing whole cluster sequences), CDS (containing all assigned CDSs) and domain (containing all domains assigned in CDSs).
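Because the query must be in FASTA format, it can help to check a query locally before pasting it into the form. The following is a minimal sketch of a FASTA reader; the function name `parse_fasta` is hypothetical and the parser is deliberately simple (it does not validate residue alphabets).

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) records."""
    records = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

A query that raises `ValueError` here lacks a `>` header line and is not valid FASTA.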

10. Category list

We also provide several ways to access information across clusters. All CDSs registered in DoBISCUIT are classified into categories according to their deduced biological function. The Category list shows the CDSs hierarchically. To see the lower levels, click the line.

11. Get sequence

To retrieve sequences from a specified cluster or contig, select the compound name, the coordinates and the contig (if the biosynthesis cluster is composed of several contigs). A Gene ID used within DoBISCUIT (e.g. Actino_00010) can also be used as the search term.
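Retrieving a region by coordinates amounts to slicing the cluster or contig sequence. The sketch below assumes 1-based, inclusive coordinates, as used in INSDC feature locations; whether the DoBISCUIT form follows exactly this convention is an assumption, and the function name is hypothetical.

```python
def get_subsequence(sequence: str, start: int, end: int) -> str:
    """Return the region from `start` to `end` of `sequence`.

    Assumption: coordinates are 1-based and inclusive at both ends.
    """
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError(
            f"invalid range {start}..{end} for a sequence of length {len(sequence)}"
        )
    # Convert the 1-based inclusive range to Python's 0-based half-open slice.
    return sequence[start - 1:end]
```

For instance, `get_subsequence("ACGTACGT", 2, 5)` returns `"CGTA"`.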

12. Data download

To download the sequence and annotation of a whole biosynthesis cluster, click the corresponding icon.

13. PCR amplified sequences

We have extensively determined KS and A domain sequences in NBRC strains and registered them here. The amplified sequences are listed in a phylogenetic tree. Users can open the lower levels by clicking each organism name. All strains registered here are available from NBRC. For more details, see here...

14. Novelty chart

This diagram displays the novelty of KS and A domain sequences in NBRC strains. Red and blue indicate low similarity to previously reported KS and A domain sequences, respectively, which suggests novelty. Strains with novel sequences may produce novel polyketide and/or peptide compounds. The chart is clickable, and users can jump to the domain sequence list.
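The reasoning behind the chart (low similarity to known sequences suggests novelty) can be sketched as a simple threshold test. The 70% identity cut-off, the sequence identifiers and the function name are placeholders chosen for illustration; the source does not state the criterion DoBISCUIT uses.

```python
from typing import Dict, List


def flag_novel(best_identity: Dict[str, float], threshold: float = 70.0) -> List[str]:
    """Return IDs whose best percent identity to known sequences is below `threshold`.

    `best_identity` maps a sequence ID to its highest percent identity
    against previously reported sequences. The default threshold is an
    arbitrary placeholder, not a value taken from DoBISCUIT.
    """
    return sorted(
        seq_id for seq_id, identity in best_identity.items() if identity < threshold
    )
```

A stricter or looser cut-off can be passed as `threshold` to see how the set of candidate novel sequences changes.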

15. PCR sequence BLAST

If you enter the KS and/or A domain (gene) sequences of your isolates as queries, this program returns the BLAST results of a search against our database, so you can check the novelty of your strain's genes based on their similarity to the large number of sequences we have registered.

XX. Other tips implemented in DoBISCUIT

XX.1. Page top tab
The Page top tab is displayed on the right side of all pages. Click this tab to jump to the top of the page.
XX.2. Section jump menu
A list of sections is displayed at the top of the Cluster information page and the CDS information page. Click a section name to jump to that section. This menu remains at the top of the page while the page is scrolled.
XX.3. Cluster jump window
If users click the orange square icon displayed in the upper-right corner of the Cluster information page, a new window with the compound list opens. Click a compound name to jump to the desired Cluster information page.
XX.4. Auto-select menu
We provide two kinds of auto-select menu. When a sequence is displayed on a page, a blue document icon is also displayed at the upper right of the sequence. Click this icon to select the whole sequence. When users use the search menus, a yellow pen icon is displayed at the upper right of the search result page. Click this icon to highlight the search term in yellow.
XX.5. Jump menu at genomic map
In the Genomic map section, CDS information is displayed on mouse-over. Click the small balloon icon displayed to the right of the Gene ID to jump to the CDS information.