Reference Viral DataBase (RVDB)

The Reference Viral Database (RVDB) is developed by Arifa Khan's group at CBER, FDA to enhance virus detection using high-throughput/next-generation sequencing (HTS/NGS) technologies. Its distinguishing features include the reduction or removal, by manual curation, of phage, misannotated, irrelevant, and non-viral sequences. RVDB is available as Unclustered (U-) and Clustered (C-) nucleotide sequence files. The scripts and steps involved in generating and updating RVDB up to v26.0 have been published in mSphere and are available with instructions on GitHub. An automatic pipeline has been used from RVDBv27.0 onward. Additionally, an automatic pipeline for non-viral annotation has been implemented from U-RVDBv27.0. The terms of use for RVDB are listed on the RVDB website.
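When comparing the Unclustered and Clustered files, a quick count of records can be a useful sanity check. The following is a minimal sketch, assuming the nucleotide sequence files are in FASTA format and may be gzip-compressed; neither detail is stated above, and the function name `count_fasta_records` is hypothetical.

```python
import gzip


def count_fasta_records(path):
    """Count records and total residues in a FASTA file (plain or gzip).

    Assumption: the file is FASTA, i.e. each record starts with a '>' header
    line followed by one or more sequence lines.
    """
    # Pick the opener from the file extension; both return text-mode handles.
    opener = gzip.open if str(path).endswith(".gz") else open
    n_records = 0
    n_residues = 0
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_records += 1  # header line marks a new record
            else:
                n_residues += len(line)  # sequence line
    return n_records, n_residues
```

The file is streamed line by line, so memory use stays small even for large inputs. Running the function on a U- file and on the matching C- file would show how much clustering reduces the record count.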

The SQLite form of U-RVDB is provided using the sqlite3 module available in Python, as described on GitHub. From version 15.1 onward, the RVDB SQL database is converted to SQLite to provide a flexible import format.
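As a starting point for exploring the SQLite form, the sketch below lists the tables in a SQLite file with Python's standard sqlite3 module. The table layout of the RVDB SQLite file is not described here, so the code makes no assumption about it and only queries SQLite's built-in `sqlite_master` catalog; the function name `list_tables` is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the sorted names of all tables in a SQLite database file."""
    # Build a file: URI and open it read-only so the database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a 1-tuple holding the table name.
    return [name for (name,) in rows]
```

Calling `list_tables` with the path to the downloaded SQLite file returns the table names, which can then be inspected further with ordinary SQL queries. Opening in read-only mode is a deliberate choice here, since it raises an error for a missing file instead of silently creating an empty database.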

A protein version of RVDB was developed by Marc Eloit’s group and is available at Institut Pasteur (RVDB-prot and RVDB-prot-HMM). The protein RVDB may be used to complement analyses using these nucleotide databases.

Download Current Release (v28.0, Nov 22, 2023)

Citing RVDB

Goodacre N, Aljanahi A, Nandakumar S, Mikailov M, Khan AS. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection. mSphere. 2018 Mar 14;3(2). pii: e00069-18. doi: 10.1128/mSphereDirect.00069-18. eCollection 2018 Mar-Apr. PubMed PMID: 29564396; PubMed Central PMCID: PMC5853486.


If you have any questions or comments regarding the RVDB nucleotide databases, please contact Arifa Khan. If you have any technical questions or comments regarding the website, please contact the Bioinformatics Core at the University of Delaware.