COVID-19 Datasets: A Brief Overview

Ke Sun1, Wuyang Li1, Vidya Saikrishna2, Mehmood Chadhar2 and Feng Xia2

  1. School of Software, Dalian University of Technology, Dalian 116620, China
  2. Institute of Innovation, Science and Sustainability, Federation University Australia, Ballarat 3353, Australia


The outbreak of the COVID-19 pandemic has affected lives and socio-economic development around the world. Its impact has motivated researchers from different domains to seek effective ways to diagnose, prevent, and estimate the pandemic and to relieve its adverse effects. Numerous COVID-19 datasets have been built in the course of these studies and made available to the public. These datasets can be used for disease diagnosis and case prediction, speeding up the solution of problems caused by the pandemic. To help researchers understand the various COVID-19 datasets, we examine them and provide an overview. We organise the majority of these datasets into three categories based on their applications, i.e., time-series, knowledge base, and media-based datasets. Organising COVID-19 datasets into appropriate categories can help researchers keep their focus on methodology rather than on the datasets. In addition, COVID-19 datasets and their applications suffer from a series of problems, such as privacy and quality. We discuss these issues as well as the potential of COVID-19 datasets.

Key words

COVID-19, Data science, Datasets, Artificial intelligence

Digital Object Identifier (DOI)

Publication information

Volume 19, Issue 3 (September 2022)
Year of Publication: 2022
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Sun, K., Li, W., Saikrishna, V., Chadhar, M., Xia, F.: COVID-19 Datasets: A Brief Overview. Computer Science and Information Systems, Vol. 19, No. 3, 1115-1132. (2022),