Neglected Tropical Diseases (NTDs) form a group of 21 diseases with diverse, multidimensional natures that affect low-income and rural areas around the world. One such disease is Chagas disease, caused by the parasite Trypanosoma cruzi; it is endemic in 21 countries of South and Central America but present worldwide, and is currently estimated to affect more than 7 million people. Besides data on the detection of new cases and transmission routes, which come from official sources (i.e., Ministries of Health or institutions appointed by them), other sources of potentially useful epidemiological information are already in place and can help build a clearer and more complete picture of these diseases worldwide (e.g., vector research databases, outbreak alert systems, health economics, drug distribution, related research publications). In addition, an integrated view of NTD-related data will enable different types of cross-analysis, e.g., analysis of coinfections and comorbidities.
The main goal of this project is to build a system for managing the extraction, storage, and integrated processing of NTD-related data coming from a variety of sources.
In particular, the main objectives of the project are:
- Create extraction drivers that can handle the different data formats coming from the sources (i.e., structured DB, semi-structured Excel files/Web pages, plain text).
- Organize and develop a consolidated database for storing data coming from a variety of data sources, keeping in mind the flexibility needed for further exploitation of this data.
- Develop a system to integrate, transform, and prepare the data stored in the consolidated database for further analysis and use (e.g., descriptive and predictive analysis, visualization, download, and publication).
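As a rough illustration of how these three objectives could fit together, the following minimal sketch assumes a common driver interface, a single raw-record table, and one transformation step. All names in it (`ExtractionDriver`, `CsvDriver`, `raw_record`, `cases_by_country`, the `country` and `cases` fields) are hypothetical and chosen for illustration only; SQLite stands in for the consolidated database, and the sample rows are invented placeholders, not real epidemiological data.

```python
import csv
import io
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Iterator


class ExtractionDriver(ABC):
    """Hypothetical base class: one driver per source format."""

    def __init__(self, source_name: str):
        self.source_name = source_name

    @abstractmethod
    def extract(self) -> Iterator[dict]:
        """Yield one raw record (a flat dict) per source row."""


class CsvDriver(ExtractionDriver):
    """Driver for a tabular export, passed in here as CSV text."""

    def __init__(self, source_name: str, text: str):
        super().__init__(source_name)
        self.text = text

    def extract(self) -> Iterator[dict]:
        yield from csv.DictReader(io.StringIO(self.text))


def create_store(conn: sqlite3.Connection) -> None:
    # Records are kept as JSON so that new sources with different
    # fields can be added without changing the table layout.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_record ("
        "id INTEGER PRIMARY KEY, source TEXT NOT NULL, payload TEXT NOT NULL)"
    )


def load(conn: sqlite3.Connection, driver: ExtractionDriver) -> int:
    """Store every record produced by the driver; return the row count."""
    rows = [(driver.source_name, json.dumps(r)) for r in driver.extract()]
    conn.executemany(
        "INSERT INTO raw_record (source, payload) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)


def cases_by_country(conn: sqlite3.Connection) -> dict:
    """Example transformation: total the assumed 'cases' field per country."""
    totals: dict = {}
    for (payload,) in conn.execute("SELECT payload FROM raw_record ORDER BY id"):
        rec = json.loads(payload)
        totals[rec["country"]] = totals.get(rec["country"], 0) + int(rec["cases"])
    return totals


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    sample = "country,cases\nA,3\nB,5\nA,2\n"  # placeholder rows, not real data
    load(conn, CsvDriver("sample_export", sample))
    print(cases_by_country(conn))
```

Storing the raw payload separately from the transformation step is one possible way to obtain the flexibility mentioned above: a driver for a new format only has to yield dictionaries, and the preparation logic can evolve without re-extracting data.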
This project relates to the third Sustainable Development Goal (SDG), “Good Health and Well-Being”, and in particular to target 3.3, “By 2030, end the epidemics of NTDs”.
The main users of the WIDP system are members of WHO’s department for Neglected Tropical Diseases, who will use the system to analyze the current status of NTDs in the world and take informed actions towards their control and elimination. In addition, the system should allow WHO users to make different kinds of predictions regarding the process of disease elimination (e.g., drug procurement and distribution, dwelling treatment).
The system’s main beneficiaries are the populations currently affected by NTDs and those at risk of infection worldwide.
The project will be deployed in phases:
- Deployment of the consolidated database
- Creation of extraction drivers for an initial set of data sources and start of data extraction.
- Deployment of the data transformation and preparation module.
- Iterative addition of other data sources to the system (i.e., repeating the second phase).