The SAPRIN Harmonised Research Operations System



Health and demographic surveillance systems (HDSS) help researchers understand how factors around health, social and economic wellbeing affect people and the societies they live in. HDSS are an important part of advanced population registration systems and are used to determine whether services are meeting the needs of the population. The value of these longitudinal population data systems increases as data on people’s experience of health and socio-economic wellbeing, or the lack of it, accumulate over time.

SAPRIN consists of several HDSS Nodes, across provinces, in rural, peri-urban and urban settings. There is an enormous difference between nodes that each independently build the best research platform they can manage and a harmonised network of nodes producing data in parallel. Harmonisation enables pooling of data from the different nodes for larger-scale analysis of population trends and their consequences. Comparison of findings between the nodes enables the study of the spatial dimensions of epidemics like HIV/AIDS and COVID-19, and of socio-economic interventions like the provision of electricity and water. While SAPRIN is cognisant of the diversity of research agendas across the HDSS Nodes, a standard, common core HDSS platform exists at each Node within SAPRIN. This is achieved using 10 core HDSS platform operational harmonisation principles, which we describe below.

1.1.  Dynamic and open cohort HDSS; the distinguishing feature of the HDSS platform at each SAPRIN Node is a longitudinal open population cohort in a geographically defined area that grows through birth and in-migration and shrinks through death and out-migration and is flexible enough to accommodate a variety of (nested or standalone) scientific research studies. 
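As a minimal sketch of the open-cohort idea described above, the size of the cohort at the end of a period follows the usual demographic balancing equation: births and in-migrations add to the population, deaths and out-migrations subtract from it. The function name and the numbers are illustrative assumptions, not SAPRIN code or SAPRIN data.

```python
def end_population(start: int, births: int, in_migrations: int,
                   deaths: int, out_migrations: int) -> int:
    """Balancing equation for an open cohort over one observation period."""
    # The cohort grows through birth and in-migration ...
    gains = births + in_migrations
    # ... and shrinks through death and out-migration.
    losses = deaths + out_migrations
    return start + gains - losses


# Illustrative numbers only (hypothetical, not drawn from any HDSS Node).
print(end_population(start=1000, births=20, in_migrations=50,
                     deaths=10, out_migrations=40))  # prints 1020
```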

1.2.  Electronic Data collection system;

Each nodal platform is fully paperless. Currently the HDSS Nodes are using the OpenHDSS, Survey Solutions and REDCap electronic data collection platforms.

1.3.  A 45-week annual data collection schedule

Data is collected from each household 3 times a year, at 4-month intervals, in a 45-week annual schedule (with 2 variations according to an HDSS Node’s settlement pattern), interspersing one (1) field-based computer-assisted personal interviewing (CAPI) data point with two (2) call-centre-based computer-assisted telephone interviewing (CATI) data points. In the nucleated/clustered settlement pattern model (Fig1), the demographic surveillance area (DSA) is divided into 3 Functional Community Areas (FCA), with each FCA in turn divided into 15 Weekblocks (WB), giving 3 × 15 = 45 Weekblocks in total. Each Weekblock is divided into Supervisor Areas (SVA) depending on the number of field teams. Each Supervisor Area is finally divided into Fieldworker Areas (FWA).


Fig1: 45-Week Data Collection Model for a DSA with Nucleated/Clustered Settlement Pattern

In the dispersed/scattered settlement pattern model (Fig2), the DSA is divided into 15 FCA, and the WB, SVA and FWA within each FCA are built following the same principle as described under the nucleated/clustered settlement pattern model.

Fig2: 45-Week Data Collection Model for a DSA with Dispersed/Scattered Settlement Pattern
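The two settlement-pattern models described above can be sketched as a simple enumeration of Weekblocks. This is an illustration under stated assumptions, not the SAPRIN scheduling tool: the names `Weekblock` and `build_schedule` are hypothetical, the FCA-by-FCA ordering of weeks is assumed, and the figure of 3 Weekblocks per FCA in the dispersed model is inferred from the 45-week total rather than stated in the text.

```python
from dataclasses import dataclass
from typing import List

WEEKS_PER_YEAR = 45  # length of the annual schedule stated in the text


@dataclass(frozen=True)
class Weekblock:
    week: int  # position in the annual schedule, 1..45
    fca: int   # Functional Community Area number, 1-based
    wb: int    # Weekblock number within its FCA, 1-based


def build_schedule(n_fca: int, wb_per_fca: int) -> List[Weekblock]:
    """Enumerate Weekblocks FCA by FCA (assumed ordering)."""
    if n_fca * wb_per_fca != WEEKS_PER_YEAR:
        raise ValueError("FCAs x Weekblocks per FCA must equal 45")
    schedule = []
    for fca in range(1, n_fca + 1):
        for wb in range(1, wb_per_fca + 1):
            week = (fca - 1) * wb_per_fca + wb
            schedule.append(Weekblock(week=week, fca=fca, wb=wb))
    return schedule


# Nucleated/clustered model: 3 FCAs of 15 Weekblocks each (from the text).
nucleated = build_schedule(n_fca=3, wb_per_fca=15)

# Dispersed/scattered model: 15 FCAs (from the text);
# 3 Weekblocks per FCA is an assumption made so the total is 45.
dispersed = build_schedule(n_fca=15, wb_per_fca=3)

print(len(nucleated), len(dispersed))  # prints 45 45
```

Supervisor Areas and Fieldworker Areas are left out because their number depends on the field teams at each Node.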

1.4.  Adoption of SAPRIN Core protocol amended to fit the context of each HDSS Node including any additional node specific data collection.

The SAPRIN Core protocol has four components (see Fig3), namely: a) household surveillance, b) verbal autopsy, c) individual health surveillance, and d) linkage to service records.

Fig3: SAPRIN Core protocol components and associated data collection methods

1.5.  Standardised question wording, coding and concept definitions of all SAPRIN Core Data Elements;

All SAPRIN core data elements, specified in an annexure to the SAPRIN Core protocol, are the common minimum variables to be collected by each HDSS Node. They have been standardised into questions and response codes, and are used by all HDSS Nodes without amendment.

1.6.  SAPRIN-level proforma templates for memoranda of agreement (MoA) with government departments;

SAPRIN is in the process of setting up memoranda of agreement with government departments and research councils at national level. These SAPRIN-level MoA templates will be used by HDSS Nodes to negotiate site-specific MoAs for record linkage purposes with their provincial governments.

1.7.  Common ethical framework;

Each HDSS Node’s platform protocol (i.e. the SAPRIN Core protocol amended for contextual fit) will receive ethical approval from the Node’s Ethics Committee of note. Written informed consent is obtained for all data collection activities. All informed consent documents include a standard statement disclosing to participants that their anonymised data will be shared for scientific analysis through, inter alia, the SAPRIN data repository. Further, all informed consent documents and data forms are translated and administered to participants in their local language. Free online translation services are accessed from a portal developed by the South African Centre for Digital Language Resources (SADiLaR), which is also supported by the Department of Science and Innovation (DSI) as part of the new South African Research Infrastructure Roadmap (SARIR).

1.8.  Electronic Data quality assurance/control system;

All HDSS Nodes implement an electronic data QA/QC system to ensure consistent data quality standards across all Nodes. As part of operational harmonisation, SAPRIN Management performs an annual operational quality assurance review at each HDSS Node to assess data quality, implementation of the core protocol, and operational efficiency.

1.9.  Common Database;

SAPRIN provides a standard database to all HDSS Nodes that is built to store longitudinal data. The database is scalable and can be expanded in terms of both tables and variables. The database is designed to track three core entities on which longitudinal data will be collected (Fig4). The starting point is a dwelling unit, which is a physical location: a plot of land belonging to one owner, with buildings, to which geo-coordinates can be assigned. Next is a household, a social group of one or more individual members. Every household must be resident at a dwelling, that is, a physical location. Within each household are household members, who can be resident at the same physical dwelling as the household or resident elsewhere, away from the household. This results in the concept of multiple household membership, whereby an individual is concurrently a member of two or more households that are resident in the DSA. Whereas a dwelling can have no resident household, it is not possible to have a household without at least one member, nor a household member who is not linked to at least one household. Each physical location, household and household member is assigned a unique permanent study identifier that is used to identify them longitudinally over many years.


Fig4: SAPRIN Basic Data Structure
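The three core entities and the constraints described above can be sketched in a few lines. This is a hedged illustration, not the SAPRIN database schema: every class, field and method name here (`Dwelling`, `Household`, `Individual`, `Registry`, `households_of`) is hypothetical, the identifiers are made up, and the coordinates are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Dwelling:
    dwelling_id: str   # unique permanent study identifier
    latitude: float    # geo-coordinates of the physical location
    longitude: float


@dataclass
class Individual:
    individual_id: str
    # Dwelling where the person lives; None means resident elsewhere.
    residence_dwelling_id: Optional[str] = None


@dataclass
class Household:
    household_id: str
    dwelling_id: str   # every household is resident at a dwelling
    member_ids: Set[str] = field(default_factory=set)


class Registry:
    def __init__(self) -> None:
        self.dwellings: Dict[str, Dwelling] = {}
        self.households: Dict[str, Household] = {}
        self.individuals: Dict[str, Individual] = {}

    def add_dwelling(self, dwelling: Dwelling) -> None:
        # A dwelling may exist without any resident household.
        self.dwellings[dwelling.dwelling_id] = dwelling

    def add_household(self, household_id: str, dwelling_id: str,
                      first_member: Individual) -> Household:
        # A household needs a known dwelling and at least one member.
        if dwelling_id not in self.dwellings:
            raise ValueError("household must be resident at a known dwelling")
        household = Household(household_id, dwelling_id,
                              {first_member.individual_id})
        self.households[household_id] = household
        self.individuals[first_member.individual_id] = first_member
        return household

    def add_member(self, household_id: str, person: Individual) -> None:
        # Individuals only enter the registry through a household.
        self.households[household_id].member_ids.add(person.individual_id)
        self.individuals[person.individual_id] = person

    def households_of(self, individual_id: str) -> Set[str]:
        return {h.household_id for h in self.households.values()
                if individual_id in h.member_ids}


reg = Registry()
reg.add_dwelling(Dwelling("D1", 0.0, 0.0))  # placeholder coordinates
reg.add_dwelling(Dwelling("D2", 0.0, 0.0))
person = Individual("I1", residence_dwelling_id="D1")
reg.add_household("H1", "D1", person)
reg.add_household("H2", "D2", Individual("I2", residence_dwelling_id="D2"))
reg.add_member("H2", person)  # multiple household membership
print(sorted(reg.households_of("I1")))  # prints ['H1', 'H2']
```

Keeping the residence of an individual separate from household membership is what allows a member to belong to a household while living elsewhere.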

1.10. Best Practice Workshop Series;

To attain operational harmony across SAPRIN HDSS Nodes, communities of practice were established in research operations, data management and community engagement to foster collaboration and mutual support between the HDSS Nodes. These communities of practice, consisting of the staff leading a workstream at each HDSS Node, meet annually at a 3-day workshop, organised by SAPRIN and held at each HDSS Node in rotation, to discuss new developments and improvements, and to exchange best practices aimed at ensuring cost-efficient and cost-effective implementation of the SAPRIN Core protocol. During the year, these communities of practice serve as networks that foster inter-HDSS exchange visits and day-to-day onsite and remote support between Nodes.