RepetDB User Guide

RepetDB is an InterMine database that provides repeat consensus sequences detected and classified by TEdenovo and used by TEannot to annotate their copies in genomes. TEdenovo and TEannot are part of the REPET software package.


1. Search for repeat consensus sequences

1.1. Consensus search form

The easiest way to search for consensus repeats in RepetDB is to use the consensus search form on the homepage.

Consensus search form

This form searches consensus repeats by organism (Taxon group selection), by classification (Wicker classification), by classification quality (confused, manual validation, unclassified), or by the similarity features found on them (protein profile features or BLAST hits on Repbase transposons).

1.2. Get consensus fasta

To download consensus FASTA sequences, click the export button on a results page (for example, after using the consensus search form).

export

This button opens a dialog box that contains a download tab.

In this dialog box, change the format from “.tsv” to “Fasta sequences”. You can now download your consensus FASTA file.
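Once the file is downloaded, you may want to check its content quickly. The following is a minimal sketch, assuming a standard FASTA layout (a header line starting with ">" followed by sequence lines). The function name read_fasta and the identifiers consensus_A and consensus_B are hypothetical and only serve the illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical stand-in for a downloaded file; real identifiers will differ.
example = StringIO(">consensus_A\nACGTAC\nGT\n>consensus_B\nTTGACA\n")
lengths = {name: len(seq) for name, seq in read_fasta(example)}
print(lengths)
```

To read a real download, replace the StringIO object with an opened file, for example open("your_export.fasta"), where the file name is whatever you chose when saving.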

2. Consensus page

Once you have searched for consensus repeats and clicked on one, you reach the consensus page. This page contains the following parts:

2.1. Header information

The header of the consensus page contains basic information such as the consensus identifier, the consensus length and the Wicker classification. It can also contain information on the genomic annotation of the consensus: the number of copies and full-length copies, the number of fragments and full-length fragments, and the cumulative coverage of the copies on the genome.

2.2. Material and method

This section describes the dataset in which the consensus has been detected, classified and annotated. It gives the organism of origin and the genome assembly. Optionally, this section can also list the software used to generate the dataset, a comment on how the dataset was created and the name of the person to contact if you have any questions about it. The genome assembly FASTA file, the genome TE annotation GFF3 file and the consensus library FASTA file are available for download.
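If you download the TE annotation GFF3 file, each feature line can be read with a few lines of code. The sketch below assumes the standard GFF3 layout of nine tab-separated columns, with the last column holding semicolon-separated key=value attributes. The function name parse_gff3_line and the example line (sequence name, source, identifiers and coordinates) are hypothetical and not taken from a RepetDB file.

```python
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict; return None for comments or blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes are key=value pairs separated by semicolons; "." means none.
    record["attributes"] = dict(
        pair.split("=", 1)
        for pair in record["attributes"].split(";")
        if "=" in pair
    )
    return record


# Hypothetical feature line, for illustration only.
example_line = "chr1\texample_source\tmatch\t100\t350\t.\t+\t.\tID=copy_1;Name=consensus_A"
feature = parse_gff3_line(example_line)
# GFF3 coordinates are 1-based and inclusive, hence the +1.
feature_length = feature["end"] - feature["start"] + 1
print(feature["seqid"], feature["attributes"]["ID"], feature_length)
```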

2.3. Consensus copy statistics

This part contains a table of statistics on the consensus copies annotated on the genome. The table groups statistics on copy length, identity and coverage over the consensus.
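To give an idea of what such a table summarizes, here is a minimal sketch that computes summary statistics from a handful of copies. This guide does not give RepetDB's exact definitions, so the sketch assumes that coverage over the consensus is the copy length expressed as a percentage of the consensus length. All values are hypothetical.

```python
import statistics

# Hypothetical values, for illustration only.
consensus_length = 400
copy_lengths = [120, 250, 300, 330]

# Assumption: coverage = copy length as a percentage of the consensus length.
coverages = [100 * length / consensus_length for length in copy_lengths]

summary = {
    "min_length": min(copy_lengths),
    "max_length": max(copy_lengths),
    "mean_length": statistics.mean(copy_lengths),
    "median_length": statistics.median(copy_lengths),
    "mean_coverage": statistics.mean(coverages),
}
print(summary)
```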

2.4. Feature browser

All the features that have been detected on the consensus can be visualized in an embedded JBrowse browser. The reference sequence in this JBrowse instance is the consensus, on which the similarity features and structural features are located. For details on these features, see the following “Similarity features” and “Structural features” sections.

2.5. Similarity features

If the consensus has any similarity features, such as protein profile hits or transposon BLAST hits, this section lists them in a detailed table. Each type of feature has its own table displaying the positions on the consensus (“Query start” and “Query stop”), the positions on the hit (“Hit start” and “Hit stop”), the hit e-value, the hit identity and details on the hit (its source database, accession, description or classification).

Transposon BLAST hits match Repbase transposon elements, with a link that leads to the Repbase database.

Protein profile hits match GyDB or Pfam profiles, with a link that leads to the profile card on the database website.

2.6. Structural features

As with the similarity features, each type of structural feature has its own table: SSR regions, ORF regions, terminal repeat (TR) regions and poly-A regions.

3. Other InterMine features

Like any InterMine database, RepetDB benefits from standard features such as the query builder, template queries and lists. This section describes these features to help you use them to their full potential.

3.1. Query builder

Query builder menu

The query builder is a complex interface that can be used to build any kind of custom query on the RepetDB database.

If you want to learn more about how to use the query builder, see the following tutorial.

You can also use the query builder to modify an existing query.

To modify a query from the consensus search form:

3.2. Template query

Templates menu

Template queries are parameterized queries that have the advantage of being easily reusable. RepetDB shares public templates, which you can find on the templates page, but anyone can add private (for-your-eyes-only) templates.

To create a template:

3.3. Lists

Lists menu

RepetDB can operate on custom lists of data. You can save lists from results pages or create them by uploading lists of identifiers. Lists can be used when running template queries and analyzed by a series of widgets on a list analysis page. You can merge, subtract and find common members if you have more than one list.
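The list operations mentioned above behave like set operations. The sketch below illustrates them with plain Python sets and does not call RepetDB. The identifiers are hypothetical. Because this guide does not say whether "subtract" keeps only the members of the first list or all members that are not shared, both variants are shown; check the result in the interface to see which one applies.

```python
# Hypothetical identifier lists, for illustration only.
list_a = {"consensus_A", "consensus_B", "consensus_C"}
list_b = {"consensus_B", "consensus_C", "consensus_D"}

merged = list_a | list_b          # members of either list
common = list_a & list_b          # members found in both lists
only_in_a = list_a - list_b       # members of list_a absent from list_b
not_shared = list_a ^ list_b      # members found in exactly one list

print(sorted(merged))
print(sorted(common))
print(sorted(only_in_a))
print(sorted(not_shared))
```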

All lists, public ones as well as personal ones (if you are logged in), can be viewed on the Lists page, where you can search them and perform operations on them. To create a new list yourself, click on ‘Lists’, and then on ‘Upload’ in the toolbar on any RepetDB page. RepetDB’s list creation tool helps you upload a list of identifiers; the list can contain a mix of identifier types.

To save a list from a query:

Descriptions and tags can also be edited after a list is saved.

3.4. MyMine section

All lists and queries you run are saved temporarily in RepetDB for the current session. To save them permanently, you can create a MyMine account. You only need to provide an email address and a password to create an account; no other information is required. Your saved data is always private.

You can then access all your lists, queries and templates via the MyMine page. In MyMine you can save lists and queries you create in the QueryBuilder. You can even use the QueryBuilder to turn queries into new templates of your own. You can export/import queries and templates as XML to share them with others.
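An exported query is a small XML document that you can inspect before sharing it. The sketch below assumes the usual InterMine query layout: a query element with a space-separated view attribute and nested constraint elements. The query name, the model name and the Consensus class and attribute paths are hypothetical; an actual RepetDB export will use its own data model names.

```python
import xml.etree.ElementTree as ET

# Hypothetical exported query; names and paths are illustrative only.
exported = """<query name="example" model="genomic"
       view="Consensus.primaryIdentifier Consensus.length">
  <constraint path="Consensus.length" op="&gt;" value="1000"/>
</query>"""

root = ET.fromstring(exported)

# Output columns requested by the query.
views = root.get("view").split()

# Each constraint as a (path, operator, value) tuple.
constraints = [
    (c.get("path"), c.get("op"), c.get("value"))
    for c in root.findall("constraint")
]

print(views)
print(constraints)
```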