Reliability of real-time RT-PCR tests to detect SARS-Cov-2: A literature review

In the face of the COVID-19 (Coronavirus Disease 2019) pandemic, the World Health Organization (WHO) has urged countries to test the population more widely. Clinical laboratories have been confronted with a huge demand for testing and have had to make urgent preparations for staff training, to establish new analytical processes, reorganize the workspace, and stock up on specific equipment and diagnostic test kits. The reliability of SARS-Cov-2 test results is of critical importance, given the impact it has on patient care and the management of the health crisis. A review of the literature available for the period leading up to and including June 2020 on the reliability of SARS-Cov-2 (Severe Acute Respiratory Syndrome Coronavirus 2) detection methods using real-time RT-PCR (Reverse Transcription Polymerase Chain Reaction) brings together the primary factors that teams of scientists claim or demonstrate to affect the reliability of results. A description is given of the RT-PCR testing method, followed by a presentation of the characteristics and validation techniques used. A summary is provided of data from the literature on the reliability of tests and commercial kits for SARS-Cov-2 detection, including current uncertainties with regard to the molecular targets selected and the genetic diversity of SARS-Cov-2. The limitations and perspectives are then discussed in detail in the light of the bibliographic data available. Many of the questions raised remain unanswered. The lack of knowledge about this novel virus, which appeared at the end of 2019, has a significant impact on the technical capacity to develop reliable, rapid and practical tools for its detection.


Introduction
From a health point of view, the year 2020 has been marked by the onset of the COVID-19 epidemic (Coronavirus Disease 2019) caused by a coronavirus, which began with the first reported death on 31 December 2019 in the city of Wuhan, Central China and then grew into a pandemic [1].
Coronaviruses are a family of viruses that includes MERS-Cov and SARS-Cov-1 (Severe Acute Respiratory Syndrome Coronavirus) and can cause illnesses ranging from a simple cold to more severe pathologies. The illness linked to this novel coronavirus, SARS-Cov-2, has been named COVID-19 by the World Health Organization. The symptomatology is very diverse, sometimes limited to mild symptoms such as fever, cough, a flu-like condition, diarrhea or loss of taste and smell [2]. However, SARS-Cov-2 can also cause pneumonia and lead to hospitalization, intensive care, and ultimately death for those who contract the more serious forms. As of the summer of 2020, no vaccine against SARS-Cov-2 is yet available, but many research laboratories and pharmaceutical companies are working to develop one.
The incubation period of the SARS-Cov-2 coronavirus is five to six days on average, but can be up to 14 days [3,4], and during this incubation period the person may be asymptomatic while being contagious. The disease is primarily transmitted through droplets (invisible secretions expelled when speaking, sneezing or coughing) and by the hands.
To limit the spread of the virus and the number of deaths from it, therefore, it is very important to screen virus carriers early on in order to ensure they are dealt with in an appropriate manner.
For this reason, on 30 January 2020, the WHO (World Health Organization) urged States everywhere to set up swift diagnostic testing for SARS-Cov-2 and implement mass screening of populations. The hope was to rapidly and efficiently isolate carriers likely to spread the virus [5]. Obviously, such wide-scale testing obliges health facilities and medical laboratories to acquire specific equipment and diagnostic test materials.
Since the publication, in January 2020, of the annotated genome of SARS-Cov-2 on the site of the US National Center for Biotechnology Information, or NCBI [6], many test kit suppliers have been working to develop new real-time RT-PCR (Reverse Transcription Polymerase Chain Reaction) methods for the detection of SARS-Cov-2.
In France, clinical laboratories are central to the French strategy for managing the COVID-19 crisis and the policy of mass testing. These laboratories are inter-coordinated by Regional Health Agencies (Agences Régionales de Santé, or ARS) in order to ensure the efficient distribution of nationwide testing. They are therefore expected to meet the demands of the ARS in terms of numbers of tests and deadlines for returning results. The laboratories routinely report results to the prescribers and, in certain cases, patients directly, but also send their epidemiological data to Public Health France (Santé Publique France, or SPF), which collects national data before sending it each day to the French Ministry of Health to enable daily monitoring of the national situation.
Given this epidemic situation and the consequent urgency of the need for actions and resources, the reliability of the virus tests and the actual state of health of patients tested are seriously in question. Several articles in the literature describe various factors that contribute to the large number of false negative results recorded [7][8][9]. Indeed, concerns about diagnostic errors have been openly mentioned to the media by health professionals.
The reliability of a test for SARS-Cov-2 detection depends on the accuracy of the result provided by the laboratory, in other words on the reliability of the PCR SARS-Cov-2 detection methods in use.
The object of this article is to review the literature available, for the period leading up to and including June 2020, on the reliability of SARS-Cov-2 detection methods using real-time RT-PCR. It aims to bring together the primary factors that teams of scientists claim or demonstrate to affect the reliability of results.
In view of the large volume of scientific output on the subject of SARS-Cov-2 (nearly 31 600 references identified via Scopus database as of 03 August 2020 for the key word "COVID 19"), this review is based on a detailed study of around 100 scientific papers published between the beginning of March 2020 and the end of June 2020, and on practical experience gained in a clinical laboratory that carries out real-time RT-PCR tests to detect SARS-Cov-2.

Background
The reliability of real-time RT-PCR tests to detect SARS-Cov-2 depends on a large number of factors, both during the development of PCR methods and kits and in the execution of laboratory tests (from sample collection, transportation and storage, to validation of results).
In order to fully appreciate the information derived from the scientific articles used for this literature review, a few technical background details about the detection of SARS-Cov-2 by real-time RT-PCR are presented.

RT-PCR (Reverse Transcription-Polymerase Chain Reaction) is a molecular biology method that allows the amplification of single-stranded RNA (RiboNucleic Acid) fragments. To achieve this, the RNA must first be reverse transcribed by an enzyme, reverse transcriptase. This gives cDNA (Complementary DeoxyriboNucleic Acid), which is the template needed for the amplification chain reaction (PCR) [10]. The exponential amplification reaction is visualized through fluorescence release, via specific probe systems labeled with fluorochromes, which hybridize with cDNA fragments. These probes emit fluorescent signals in real time, proportional to the quantity of PCR product, which allows an exponential amplification curve to be constructed, as can be seen in Figure 1 [11]. Rn is the intensity of fluorescent emission of the reporter dye divided by the intensity of fluorescent emission of the passive dye (a reference dye incorporated into the PCR master mix to control for differences in master mix volume). ΔRn is calculated as the difference in Rn values of a sample, and thus represents the magnitude of signal generated during PCR [11].
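As a minimal sketch of the two quantities defined above, the following Python snippet computes Rn and ΔRn from raw fluorescence readings. The function names and the example readings are hypothetical, and the baseline subtracted to obtain ΔRn (here, the mean Rn of the first cycles) is an assumption made for illustration, since the text does not specify it; instrument software may define the baseline differently.

```python
from statistics import mean


def normalized_reporter(reporter, passive):
    """Rn per cycle: reporter dye emission divided by passive reference dye emission."""
    return [r / p for r, p in zip(reporter, passive)]


def delta_rn(rn, baseline_cycles=3):
    """Delta Rn per cycle: Rn minus a baseline.

    Assumption: the baseline is the mean Rn of the first `baseline_cycles` cycles.
    """
    baseline = mean(rn[:baseline_cycles])
    return [value - baseline for value in rn]


# Hypothetical readings in arbitrary units, not taken from any cited study.
reporter = [100.0, 100.0, 100.0, 120.0, 160.0, 240.0]
passive = [50.0, 50.0, 50.0, 50.0, 50.0, 50.0]

rn = normalized_reporter(reporter, passive)
d_rn = delta_rn(rn)
print(rn)
print(d_rn)
```

Dividing by the passive dye signal removes well-to-well differences in volume before any baseline is subtracted, which is why the two steps are kept as separate functions.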

General information on SARS-Cov-2 detection by the real-time RT-PCR method
Access to the annotated SARS-Cov-2 genome isolated from contaminated patients has enabled the rapid development of PCR tests: these detect, usually from nasopharyngeal swabs [8][9][10][11][12], the nucleic acid of the virus, which is a positive-sense single stranded RNA [13].
The real-time RT-PCR method used to detect SARS-Cov-2 is very often multiplexed, with co-amplification of several targets. For a test to be considered positive, an amplification curve for each molecular target should be observed. In the majority of real-time PCR methods, a positive-result decision is based on the presence of an exponential amplification curve with a Ct value (Cycle threshold) below a given threshold, and dependent on the total number of cycles programmed for the test [14][15][16]. A direct relationship is generally observed between the Ct values of the real-time RT-PCR and the viral load of the sample, although this relationship is less certain at low viral loads [15]. The models of viral load evolution in time for both symptomatic and asymptomatic people are, in fact, beginning to be known [17,18]. However, the Ct value should never be used as an indicator of the severity of the disease or of an appropriate care plan.
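The decision rule described above can be sketched as follows. This is an illustration under stated assumptions, not the rule of any particular kit: the fluorescence threshold, the Ct cutoff of 37, the target names and the "inconclusive" category are all hypothetical, and the Ct is estimated by simple linear interpolation between the two cycles that bracket the threshold.

```python
def cycle_threshold(delta_rn, threshold):
    """Return the fractional cycle at which the signal first crosses the threshold.

    delta_rn[0] is the reading at cycle 1. Returns None if no crossing is found.
    """
    for i in range(1, len(delta_rn)):
        low, high = delta_rn[i - 1], delta_rn[i]
        if low < threshold <= high:
            # The crossing lies between cycle i and cycle i + 1.
            return i + (threshold - low) / (high - low)
    return None


def call_result(ct_by_target, ct_cutoff):
    """Positive only if every target amplifies with a Ct below the cutoff."""
    values = list(ct_by_target.values())
    if all(ct is not None and ct < ct_cutoff for ct in values):
        return "positive"
    if all(ct is None for ct in values):
        return "negative"
    # Assumption: partial or late amplification is flagged for review.
    return "inconclusive"


# Hypothetical curve and hypothetical Ct values for two targets.
print(cycle_threshold([0.0, 0.1, 0.3, 0.9], threshold=0.2))
print(call_result({"E": 24.1, "RdRP": 25.3}, ct_cutoff=37))
print(call_result({"E": 35.0, "RdRP": None}, ct_cutoff=37))
```

Keeping the cutoff as a parameter reflects the point made in the text that the threshold depends on the total number of cycles programmed for the test.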

Choice of PCR kit
The detection of SARS-Cov-2 is carried out using commercial kits that contain the different reagents. The test kits bought by clinical laboratories must meet French and European Union regulations on In Vitro Diagnostic and CE marking [19], and are thus selected from an official list of kits that have been assessed and validated. This list is published and updated regularly by the French Ministry of Health [20]. The choice of PCR kit by a laboratory, which obviously aims to provide the best possible quality of result to the patient, is influenced by a number of criteria, among which are: -Performance of the kit (which impacts the reliability of the result, particularly when it comes to sensitivity); -Ease of use; -Quality and storage conditions of the reagents; -Associated costs (extra equipment to be purchased, specific consumables, etc.).
Decision makers in laboratories have neither enough data nor the time (or the wish to spend what time they have) to compare kits and decide which is best suited to their needs. In this situation, the choice of PCR kit is often influenced by the practice and choice of other laboratories, in the hope of benefitting from the efforts already made (development of dedicated documentation for the test implementation, method verification model, the management of reagents, etc.).
Given the circumstances and the undercurrent of urgency, it is unlikely laboratories have been able to make completely objective choices from the range of molecular diagnostic kits available.

Test method performance characterization
PCR is a powerful amplification tool for the detection of nucleic acid fragments. Over recent years, molecular biology has become a widely used tool and is indispensable in clinical laboratories, where, for example, it can detect pathogenic agents that are too difficult to cultivate through traditional microbiological methods [21]. However, as with any diagnostic method, the PCR must meet strict performance criteria, which contribute to the reliability of the test results provided by the laboratory. These criteria [22] concern the analytical and diagnostic sensitivity (detection limit or LOD), analytical and diagnostic specificity, amplification efficiency, repeatability and reproducibility of results, etc.
Therefore, it is important to differentiate between: -The accuracy of the method, assessed in specific conditions defined by the method designer; -The conformity of the kit itself, which depends on the quality of the reagents and other materials, their storage conditions and their use within the life cycle of the process. The quality of the PCR kit is the basis for method verification in laboratories.

Validation of a qualitative analysis method
The accuracy of an analytical method is assessed via its technical performance, which is defined during the validation process.
To validate a diagnostic test method, it must first be characterized, and then verified to ensure it meets the performance objectives discussed with the customer or prescriber. Validation of a method is a generic term that designates a series of stages: -Definition of the validation criteria, or desired performance characteristics values, which translate the needs expressed by the customer/prescriber; -Characterization of the method through trials; -Validation stricto sensu.
Before trials can begin, the parameters to be tested and criteria to be met must be defined: only then can the experiment design be constructed for the trials for method evaluations.
The tests to detect SARS-Cov-2 by real-time RT-PCR are based on a qualitative approach since the results, though dependent on a numerical value for Ct, are reported as "presence/absence of the target analyte" [26]. When it comes to validation, qualitative and quantitative methods do not have the same criteria. The different criteria are listed by Belouafa et al. [22].
For the validation of qualitative methods, the comparison between the expected result (actual sample status) and the result returned by the method under evaluation allows the identification of true and false positive/negative situations, as illustrated in Table 1 [27]. An analysis of these situations allows values proper to the method to be defined for the different criteria being evaluated (specificity, sensitivity, trueness, etc.) [27].
The specificity takes into account the risk of false positives. A distinction is made between analytical and diagnostic specificity. The analytical specificity is determined through reference materials and expresses the capacity of the method not to give a signal when the analyte is absent from the sample (i.e. absence of nonspecific signals and cross-reactions). The diagnostic specificity is the capacity of the process to give an account of the actual state of the patient: other viruses (SARS-Cov-2 excluded) should not generate a positive reaction of the method.
The sensitivity expresses the risk of false negatives. The analytical sensitivity describes the smallest quantity of analyte that the system is capable of detecting, while verifying that all strains in the inclusion criteria do indeed give a signal. The diagnostic sensitivity should give an account of the actual state of the patient: if the patient is effectively a carrier of the virus, the system should detect that fact.
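As a short illustration of how the contingency table translates into these two criteria, the sketch below computes diagnostic sensitivity and specificity from the four cell counts. The function name and the counts are hypothetical and are not taken from any cited evaluation.

```python
def diagnostic_metrics(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity and specificity from the four cells of a contingency table."""
    # Sensitivity: share of truly positive samples that the method detects.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of truly negative samples that the method reports as negative.
    specificity = true_neg / (true_neg + false_pos)
    return {"sensitivity": sensitivity, "specificity": specificity}


# Hypothetical counts: 100 known positives and 100 known negatives.
metrics = diagnostic_metrics(true_pos=90, false_pos=5, false_neg=10, true_neg=95)
print(metrics)
```

False negatives appear only in the sensitivity denominator and false positives only in the specificity denominator, which is why the text associates each criterion with one type of error.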
A numerical way of expressing analytical sensitivity is the LOD. This is the lower limit above which an analyte is detected reliably. In a qualitative method, the LOD value is determined by analyzing the linearity of response to a decreasing concentration of analyte. This method for determining the LOD is described in the ISO 16140-2 Standard [28].
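One common way of estimating an LOD from a dilution series is to take the lowest concentration at which a chosen fraction of replicates is still detected. The sketch below illustrates that idea only; it is not presented as the ISO 16140-2 procedure. The 95% hit-rate criterion, the function name and the replicate data are assumptions made for the example.

```python
def lod_from_hit_rates(replicates, min_hit_rate=0.95):
    """Lowest tested concentration whose detection rate meets the criterion.

    replicates maps a concentration (e.g. copies/mL) to a list of booleans,
    one per replicate (True = detected). Levels are scanned from high to low
    and the scan stops at the first level that fails the criterion.
    """
    passing = []
    for concentration in sorted(replicates, reverse=True):
        calls = replicates[concentration]
        if sum(calls) / len(calls) >= min_hit_rate:
            passing.append(concentration)
        else:
            break
    return min(passing) if passing else None


# Hypothetical dilution series with 20 replicates per level.
series = {
    1000: [True] * 20,
    500: [True] * 20,
    250: [True] * 19 + [False],
    125: [True] * 12 + [False] * 8,
}
print(lod_from_hit_rates(series))
```

Stopping at the first failing level avoids reporting a lower concentration that happens to pass by chance after a higher one has already failed.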
An important factor in the validation of a method is the ability to return consistent results over time and independently of analytical conditions. In this context, precision expresses the degree of disparity among sets of values. It is established by means of three comparisons: -Repeatability: same sample analyzed a given number of times in identical conditions; -Intermediate precision (intra-laboratory comparison): same sample analyzed a number of times in different analytical conditions (inter-operator uncertainty, for example); -Reproducibility (inter-laboratory comparison): sample analyzed in several different laboratories.
It is also recommended to carry out inter-laboratory comparisons using reference samples. In this way, other characteristics of the method can be determined, such as its robustness, the practicability or capability of being transferred/subcontracted (which takes into account ease of use) [29].

Validation of PCR methods
Many general guides exist that are specific to the validation of PCR techniques (Codex Committee on Methods of Analysis and Sampling [30], European Analytical Chemistry [31], Food and Agricultural Organization [32], Thompson [38]). The ISO 20395 Standard [39] more specifically covers quantitative real-time RT-PCR test methods, highlighting the particular issues of these methods and detailing explicit requirements for their validation. The objective is to characterize the external factors that influence results in order to guarantee the reliability of the result [40]. Method validation generally consists of two phases, more especially so for PCR methods [29]: -Phase one consists of laboratory trials to determine the LOD, and thus the analytical sensitivity; -Phase two consists of a series of trials concerning the specificity, the inclusion criteria (serogroups that must be detected by the method) and the exclusion criteria (serogroups that must not be detected by the method). For any PCR method used for diagnostic purposes, the inclusivity and exclusivity should be tested for each strain with 100 copies of the genome per reaction (low number of copies) and 10 000 copies of the genome per reaction (high number of copies) respectively [29]. When testing the exclusivity, no amplification should be observed for excluded strains.
Where this is not the case, the manufacturer should state a Ct limit beyond which the amplification is not considered significant and that therefore does not generate a positive result. Two concentration levels are tested with two trials per level. It is also in this phase that the practicability of the method is evaluated (easy storage of samples and equipment needed, duration of the method, inhibitors and interferences etc.) [29].
A further phase of tests, not as such part of the validation of the method, consists in monitoring the characteristics and demonstrating the performance of the method during its duration of use. This is achieved via the traceability of IQC results (Internal Quality Control), in other words positive and negative controls to validate different series [26]. The objective is to detect a potential drift in the PCR system and thus ensure the reliability of the test process during its lifetime. For monitoring of this data, Levey-Jennings charts are very useful graphic tools [41].
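The monitoring of internal quality controls on a Levey-Jennings chart can be sketched as below. The limits at the mean plus or minus 2 and 3 standard deviations are the conventional choice for such charts but are stated here as an assumption, as are the function names and the Ct values of the positive control, which are hypothetical.

```python
from statistics import mean, stdev


def levey_jennings_limits(reference_ct):
    """Center line and control limits from an initial series of control Ct values."""
    center = mean(reference_ct)
    sd = stdev(reference_ct)
    return {
        "mean": center,
        "warning": (center - 2 * sd, center + 2 * sd),  # assumed warning limits
        "action": (center - 3 * sd, center + 3 * sd),  # assumed action limits
    }


def flag_control(ct, limits):
    """Classify a new control result against the chart limits."""
    low_action, high_action = limits["action"]
    low_warn, high_warn = limits["warning"]
    if ct < low_action or ct > high_action:
        return "out of control"
    if ct < low_warn or ct > high_warn:
        return "warning"
    return "in control"


# Hypothetical Ct values of a positive control over five validated series.
limits = levey_jennings_limits([28.0, 28.2, 27.8, 28.1, 27.9])
for new_ct in (28.1, 28.4, 29.0):
    print(new_ct, flag_control(new_ct, limits))
```

A gradual drift of the control Ct toward one of the limits over successive series is the kind of signal the chart is meant to reveal before results become unreliable.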

Validation of PCR methods for the detection of SARS-Cov-2
The published articles consulted on the validation of methods for the detection of SARS-Cov-2 are the subject of a number of controversies and reactions from evaluators and reagent manufacturers, as in the case of the tests of the Ausdiagnostics company [42,43]. While the comparison of reagents is accelerating and evaluation methodology, though not yet standardized [44,45], is making necessary adjustments, the critical points to be controlled for all evaluations are being highlighted [46]. An example of the variation found in evaluation procedures for PCR kits is test specificity, which is sometimes determined "in silico", that is to say, through bioinformatics analysis of the primer specificity [47], complemented by an experimental approach. Specificity is defined as the absence of cross-reactions with other related coronavirus strains (MERS-CoV, human coronavirus HCoV-OC43, HCoV-229E and HCoV-NL63) or other respiratory virus strains frequently found in humans (adenovirus, human metapneumovirus, influenza A virus (H1N1 and H3N2), influenza B virus, influenza C virus, parainfluenza virus types 1 to 4, rhinovirus or the respiratory syncytial virus) [47][48][49][50][51]. For SARS-Cov-2, the number of tests carried out as well as the number of strains tested is variable as not all laboratories have the same biological materials.
Table 1. Summary of results in the form of a contingency table [27].
The quality and diversity of the reference samples for method characterization are very important: they must be representative of the samples routinely tested (same sample type, similar composition, etc.) [27], for example: -Naturally contaminated matrices (samples from patients who are positive, for example). This is the best option, since representative of real samples; -Matrices with a known quantity of analytes. These are matrices that may or may not be representative of real samples but are contaminated with a known quantity of a reference RNA.
One of the main difficulties in validating methods is in fact a lack of reference materials. The articles consulted reflect the great heterogeneity in the types of sample used for method validation, but also the number of samples retained, as for example: -A synthetic RNA containing strategic molecular targets of the PCR (ORF1ab, N and E gene), sold by ATCC for example (reference: VR-3276SD) [52] but not integrated into a matrix that resembles real samples [48]; -Viral isolates obtained from patient samples or viral cultures, without verification of the actual content of the samples [47,48].
Available data on the reliability of real-time RT-PCR tests to detect SARS-Cov-2

Basic data and knowledge of the virus

Performance of the different PCR methods and kits
Real-time RT-PCR methods have become the standard for the detection of SARS-Cov-2 in the context of the current pandemic [53]. Following WHO recommendations, many companies and laboratories have developed PCR kits for the detection of this novel coronavirus. These kits are evaluated by expert bodies (National Reference Centers (CNR), Centers for Disease Control and Prevention (CDC), etc.), through method validation. For this disease and according to current country regulations, this evaluation is compulsory for kits to be put on the market. The differences between one RT-PCR method and another are revealed in the validation data that accompany the kits: there are different molecular targets, variable analysis times and, most notable of all, differences in performance, particularly in terms of sensitivity, which is indicated by the different LOD values. Table 2 gives a summary of this data.
Producing evaluation data for the kits is an ongoing process, with ever more publications adding to the information already summarized in Table 2 [44].
On 13 May 2020, the Canadian Public Health Laboratory Network (CPHLN) Respiratory Virus Working Group published a comparative study of the tests developed in laboratories and the commercial kits used in Canada [66]. Thanks to this study, the LODs of numerous kits and methods are not only known but can be compared. The study also shows that, even with significant differences in sensitivity (certain tests having high detection limits), all the LODs it reports [66] are between 200 and 600 copies/mL. This difference of 400 copies/mL can be critical, however, in the case of low viral loads. Chih-Cheng Lai et al. [9] have carried out a review of the technologies available for the detection of SARS-Cov-2 and present their performance data and molecular targets. Igloi et al. [44] have more recently evaluated 15 commercial kits, the sensitivity of which varies between 3.33 and 330 RNA copies of the initial template to obtain a repeatable result.
The differences in performance between the various methods can be explained by a number of factors, among which are: -Choice of fluorochromes: With multiplex PCR such as those used for the detection of SARS-Cov-2, and depending on the fluorochromes used, signal/background noise ratio can be variable [67]. It is important for background noise to be low enough not to mask fluorescence that is indicative of the detected virus and to set thresholds in such a way as to detect low but significant Ct values; -Quality of the reagents: As has already been mentioned, the quality of the reagents obtained from the supplier has a serious impact on the reliability of the results that will be generated by them. However, suppliers do not communicate the exact composition of the kits they sell, which makes it difficult to evaluate the quality of reagents over time or from one batch to another. Beyond the intrinsic quality of the kit primers and probes or method, the amplification conditions (reaction mix in particular) play a significant role in the effectiveness of the PCR, and optimization via certain additives can improve the performance of detection tests. For example, adding 0.1 mg/ml of bovine serum albumin (BSA) makes the RNA more accessible to the reverse transcriptase and cDNA to DNA polymerase. It limits non-specific pairing of primers in GC-rich zones [68]. In the case of detection of E and RdRP genes, this addition of BSA has enabled a significant reduction in the frequency of non-specific amplifications (from 63.1% to 12%) [68]. However, it is important to evaluate the effect of these additives on the analytical sensitivity, given that the pairing of the primers with the DNA template of the sample is also partially inhibited; -Reaction volume: In a validation study of alternative extraction methods for real-time RT-PCR, the effect of the reaction volume on the performance of SARS-Cov-2 detection tests was evaluated [69]. 
Two different protocols were tested: one with a final volume of 25 µL (20 µL of mix and 5 µL of extraction product), the other with a final volume of 12.5 µL (10 µL of mix and 2.5 µL of extraction product). The results showed an increase in Ct for the second option (lower volume). To quantify this increase, the Ct values were converted into number of copies/ml in order to calculate a %CV, which varies from 6% to 14% depending on the sample type. This shows that while a protocol that demands less reagent allows a considerable reduction in costs for the laboratory, a lower reaction volume could, indirectly, interfere with the sensitivity of the diagnostic test through an increase in Ct if all use and decision conditions are not adjusted accordingly. Nevertheless, these results should be taken with caution since the article that reported them has not yet been peer reviewed; -Enzyme pair used for the RT and PCR: There are a large number of suitable reverse transcription and DNA polymerization enzymes. Some enzyme pairs are more efficient than others. The enzyme pair should therefore be chosen extremely carefully since, in an exponential amplification system, the slightest difference in efficiency or copy error rate causes sensitivity problems [40]; this, however, is a choice often made by the kit developer and not the end user, since kits come ready for use with the reaction mix included; -Extraction method used: The nucleic acid extraction stage not only makes the RNA accessible, but also clears the nucleic acid solution of proteins and cellular debris from sample collection, which are possible sources of PCR inhibitors. Traditional chemical extraction is long and uses a lot of reagents, which can at times be in short supply. Alternative extraction methods have been tested: direct heating without additives, the addition of formamide-EDTA buffer and the use of an RNAsnap™ buffer. 
Using a real-time RT-PCR method targeting the E gene, an increase was observed in the Ct values for the three alternatives compared with a standard extraction method (using lysis, precipitation, washing and elution) of 6.9 cycles (±1.7), 8.5 cycles (±1.1) and 7.8 cycles (±1.7) respectively [70]. The alternatives for the extraction of nucleic acids can thus significantly degrade the sensitivity of the overall analysis process. The impact of the RNA extraction methods on the performance of the PCR should therefore be carefully evaluated in the same way as new SARS-Cov-2 detection kits (nCoV-DK) [71]. The extraction stage must be optimized to ensure it does not reduce the overall sensitivity of the detection method.
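To give a sense of scale for the Ct increases reported above for the alternative extraction methods, a Ct shift can be converted into an approximate fold change in detectable template. The sketch below assumes a perfectly efficient PCR in which the product doubles at every cycle; this efficiency is an assumption not stated in the cited study, and the function name is hypothetical.

```python
def fold_change_from_ct_shift(delta_ct, efficiency=1.0):
    """Approximate fold loss of template implied by a Ct increase.

    efficiency = 1.0 means the product doubles at each cycle (assumed, idealized).
    """
    return (1.0 + efficiency) ** delta_ct


# Mean Ct increases reported in [70] for the three alternative extraction methods.
reported_shifts = {
    "direct heating": 6.9,
    "formamide-EDTA buffer": 8.5,
    "RNAsnap buffer": 7.8,
}
for method, shift in reported_shifts.items():
    print(method, round(fold_change_from_ct_shift(shift)))
```

Under this idealized assumption, a shift of about 7 to 8 cycles corresponds to a loss of roughly two orders of magnitude in effective template, which is consistent with the conclusion that these alternatives can significantly degrade sensitivity.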

Molecular targets selected
At the time of writing this article, annotation of the SARS-Cov-2 genome has made it possible to define certain molecular targets, based on specific gene sequences, which also serve as PCR targets [14]. Among these molecular targets are: -Structural genes: envelope protein (S and E), transmembrane (M), helicase (H) and nucleocapsid (N) genes [49,72]; -Accessory genes involved in the enzymatic machinery: RNA polymerase (RdRp), hemagglutinin esterase (HE), and the open reading frame ORF1ab [54,73,74].
The molecular targets selected for detection tests are mostly conserved among the strains of SARS-Cov-2 and are present in different numbers of copies in the genome, a fact that has consequences for the efficiency of the diagnostic PCR. In order to optimize the performance of a SARS-Cov-2 detection test and in particular its sensitivity, the concomitant detection of several genes is often to be preferred. Even if detection of the E gene alone is not recommended by the European Center for Disease Prevention and Control (ECDC) because of specificity problems and its vulnerability to sample contamination [75], it is nevertheless recommended to use the E gene as a target, along with the RdRP gene for confirmation [54]. Detection of the E gene can also be combined with detection of the N gene. In the study carried out by Ishige et al. [67], a comparison is made between a multiplex NIID-N and E_Sarbeco methodology and a simplex method targeting the E gene. The multiplex method allows detection of samples with 2-5 copies/reaction, which is a much higher level of sensitivity than with the simplex method. Moreover, the multiplexing method limits the number of doubtful cases at the first amplification and avoids confirmation through detection of the RdRP gene, which offers savings in terms of time and reagents [67]. The Sheffield Teaching Hospitals NHS Foundation Trust in the United Kingdom targeted the E and RdRP genes in 12015 clinical respiratory samples [75]. Extraction was carried out on the MagnaPure96 platform (Roche Diagnostics Ltd, Burgess Hill, United Kingdom) and amplification on the ABI Thermal Cycler (Applied Biosystems, Foster City, United States). Out of 12015 samples, 2593 were positive for at least one of the two targets. The combination of the two genes therefore significantly increases the diagnostic sensitivity of the test compared with detection of the RdRP gene alone (+11.9%).
Targeting the M gene might be another possibility worth exploring for new diagnostic kits. Toptan et al. [76] found this gene (coding for a membrane protein) to be very useful in the detection of SARS-Cov-2 in viral cultures. It appears that the gene is efficiently transcribed in host cells and that already-existing primers and probes can bind to the synthesized mRNA.
The results of studies to date generally highlight the importance of combining several molecular targets for a test to be reliable, sensitive and specific [53]. Diagnostic tools that simultaneously target the E and RdRP/Hel genes seem to offer the best analytical sensitivity [14,17,77,78].

Genetic diversity of SARS-Cov-2
In terms of sensitivity and specificity, the reliability of SARS-Cov-2 detection tests by PCR depends largely on the quality of the primers used. Primers must be constructed to amplify all strains of SARS-Cov-2 present in the environment, while excluding all other viruses. However, the known genetic diversity of SARS-Cov-2 makes the construction of primers for PCR problematic. Coronaviruses, like all RNA viruses, are characterized by a fairly high mutation rate, related to the lack of proofreading of the polymerase [47,79]. In a study by Shen et al. [79] of 110 sequences collected between 24 December 2019 and 09 February 2020, the mutation rate of SARS-Cov-2 was evaluated at 0.80-2.38 × 10⁻³ nucleotide substitutions per site per year. A more recent Colombian study [80] based on 31 000 complete sequences conducted before 24 May 2020 gives a mutation rate range of 1.67-4.67 × 10⁻³ substitutions per site per year. These sequences illustrate the genetic diversity of SARS-Cov-2 around the world.
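To make these per-site rates more concrete, they can be scaled to the whole genome. The sketch below multiplies each reported rate by the genome length; the length of 29 903 nucleotides (the Wuhan-Hu-1 reference genome) is an assumption brought in for the example and is not stated in the text, and the function name is hypothetical.

```python
# Assumption: length of the Wuhan-Hu-1 reference genome, not given in the text.
GENOME_LENGTH_NT = 29_903


def substitutions_per_genome_per_year(rate_per_site_per_year, genome_length=GENOME_LENGTH_NT):
    """Expected number of substitutions per genome per year for a per-site rate."""
    return rate_per_site_per_year * genome_length


# Ranges reported in the two studies cited above (substitutions per site per year).
reported_ranges = {
    "Shen et al. [79]": (0.80e-3, 2.38e-3),
    "Colombian study [80]": (1.67e-3, 4.67e-3),
}
for study, (low, high) in reported_ranges.items():
    print(
        study,
        round(substitutions_per_genome_per_year(low)),
        "to",
        round(substitutions_per_genome_per_year(high)),
    )
```

Under this assumption the two ranges correspond to roughly 24 to 71 and 50 to 140 substitutions per genome per year, spread over the whole genome rather than concentrated in primer-binding regions.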
This genetic variability is related both to intrinsic polymorphism and to selective pressure exerted, notably, by the human immune system, which forces the virus to mutate in order to thwart the defense system of its host through the rules of natural selection [79].
Even if the true extent of this diversity and its effect on the viral phenotype are not yet fully described, current advances do allow us to distinguish between well-conserved regions and more variable regions, although we do not as yet have the experience or statistical data to refine the genome map of SARS-Cov-2. An experiment was conducted in a Wuhan hospital in January 2020 by Shen et al. [79], using 110 meta-transcriptome sequences of SARS-Cov-2 obtained from BAF samples (Bronchoalveolar fluid) from eight patients who were carriers. The number of variants of SARS-Cov-2 was evaluated from 0 to 51, with a median of 4. This shows not only that viral sequences evolve very fast, but also that a very large number of variants can be found from the same patient at the same time [79].
What we need to know is the impact of this diversity on PCR detection tests and the capacity of the tests to recognize all existing variants. The number of mutations is not all that matters: their position in the genome is highly relevant and, notably, whether or not they are in conserved regions of the genome. A high mutation rate in a known variable region is not a major problem since these are not sequences that will be selected for PCR primers. On the other hand, a low number of mutations can have serious consequences in a region identified as conserved, and hence a potential choice for PCR primers. This situation could create a serious problem with regard to sensitivity.
Work carried out by Alvarez-Diaz et al. [80] on 31 000 SARS-Cov-2 genome sequences taken from the nasopharyngeal samples of 30 patients showed that, among all the sequences found, 99% were identical in the regions targeted by the primer included in detection methods [80]. On the other hand, the 1% of heterogeneous sequences presented discrepancies, notably a mismatch between the genome of SARS-Cov-2 and the commercial primer, including with genes selected in real-time PCR detection tests supported by the WHO. For example, two sites of genetic variability were identified in the sequence of RdRP gene primers from the method recommended by the US CDC. This observed variability has a critical impact on the reliability of the test [80]. Some discrepancies may have little effect on primer pairing, whereas others are critical and can accentuate over time, increasing the risk of false negatives. This creates even more difficulties if the mutations occur in the 3' region (involved in the hybridization of primers), causing a primer mismatch and the absence of amplification and leading to false negative results [80]. We also know that the third nucleotide of a codon is the one with the highest rate of mutation, so it is not recommended to terminate the primer sequence in 3' with the last nucleotide of a codon [80].
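The kind of primer check described above can be sketched in a few lines. This is a minimal illustration, not the method used in [80]: the sequences are invented, the function names are hypothetical, and the choice of a 5-nucleotide window at the 3' end as "critical" is an assumption made for the example. The target site is written in the same 5' to 3' orientation as the primer, so a perfect match means identical strings.

```python
def find_mismatches(primer, target_site):
    """List mismatches between a primer and an equally long target site.

    Both sequences are given 5' to 3' in the same orientation.
    """
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must have the same length")
    length = len(primer)
    mismatches = []
    for index, (p, t) in enumerate(zip(primer.upper(), target_site.upper())):
        if p != t:
            mismatches.append({
                "position_from_5p": index + 1,           # 1-based position
                "distance_from_3p": length - 1 - index,  # 0 = last base at the 3' end
                "primer_base": p,
                "target_base": t,
            })
    return mismatches

def has_critical_3p_mismatch(primer, target_site, window=5):
    """True if any mismatch lies within `window` bases of the 3' end (assumed threshold)."""
    return any(m["distance_from_3p"] < window for m in find_mismatches(primer, target_site))

# Hypothetical sequences for illustration only (not real SARS-CoV-2 primers).
PRIMER = "ACGTTGCAAGCTTAGGCTAC"
VARIANT_INTERNAL = "ACATTGCAAGCTTAGGCTAC"  # mismatch near the 5' end
VARIANT_3PRIME = "ACGTTGCAAGCTTAGGCTAT"    # mismatch at the last 3' base

for name, site in [("internal", VARIANT_INTERNAL), ("3' end", VARIANT_3PRIME)]:
    print(name, find_mismatches(PRIMER, site), has_critical_3p_mismatch(PRIMER, site))
```

Run against a large alignment of genomes, a check of this kind would flag the small fraction of sequences whose primer binding sites differ, and separate the mismatches likely to be tolerated from those near the 3' end.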
The M gene seems less polymorphic than the RdRP gene, but that target contains an SNP (Single Nucleotide Polymorphism) in position 27 046 of the genome, which indicates the possibility of a diversity that could interfere with the effectiveness of the PCR [76].
It is clear that knowledge and exploitation of the genetic diversity of SARS-Cov-2 remain the keys to furthering the sensitivity and reliability of detection tests.

Limitations of SARS-Cov-2 diagnostic tests
Among the various accounts and syntheses that have appeared in the scientific press, a good number deal with the problems that complicate testing for SARS-Cov-2 and limit the reliability of the results produced. Tang et al. [14] have examined many of the factors developed below while also addressing another aspect, which is the biosafety of laboratory operators.

Choice of anatomical sample site
The sensitivity of the tests to detect SARS-Cov-2 has a direct bearing on the reliability of the diagnostic process. The choice of sample collection site is also a delicate matter. The higher the viral load, the higher the probability of isolating viral particles in the sample and so the better the diagnostic sensitivity will be. The WHO recommends taking samples from the upper and the lower respiratory tract, especially if a sample from the upper respiratory tract appears to be negative when there is a strong suspicion of infection [81]. It is recommended to collect samples from both anatomical sites (i.e. upper and lower respiratory tract) in order to improve the reliability of the diagnosis [8,13,82]. However, the choice of anatomical sample site involves logistical problems, biosafety issues for the health professionals who take the samples, and of course time and cost for the laboratory.
A study was carried out in a Beijing hospital in China in conjunction with the Chinese CDC [82] on 1070 samples taken from 205 symptomatic patients. It was found that BAL (BronchoAlveolar Lavage) gave the sample that was the most often positive (14/15, 93%), leading to the assumption that the viral load was higher there. Unfortunately, BAL is a complex medical procedure and may be appropriate for a person who is hospitalized with severe symptoms but is impossible for a policy of mass screening. Furthermore, it creates a biosafety risk for the staff performing the procedure. Of the various anatomical sample collection sites possible, expectoration is very good (72% for a group of 104 patients), with significantly more viral load than certain other samples [15]. In this same study, nasal swabs from both nostrils gave 63% of positives (5/8) [82]. Sputum gave a higher viral load than nasopharyngeal sample collection, while collection from the throat is simply not recommended [8]. Once again, samples taken from expectoration and sputum pose problems for the biosafety of health professionals involved in the procedure. As with BAL, this type of sample collection creates fine aerosol droplets containing virions that spread through the environment, putting people nearby in danger [12]. To conclude, viral load is greater in the lower respiratory tract [14]. Nasopharyngeal sample collection is nevertheless the most common choice for mass testing, despite not being the region where the most significant viral load is typically found [14,83]. The procedure is easy to carry out and not particularly demanding from a logistics point of view. It is worth noting that SARS-Cov-2 can also be found in stools and blood [8,82].
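Some of the positivity rates above rest on very small numbers of samples (14/15 for BAL, 5/8 for nasal swabs), so the percentages carry wide uncertainty. As a rough illustration, the sketch below computes a 95% Wilson score interval for each proportion. The choice of the Wilson method is ours and is not taken from the cited study.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denominator = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denominator
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denominator
    return center - half_width, center + half_width

# Counts as quoted in the text.
samples = {"BAL": (14, 15), "Nasal swabs": (5, 8)}

for site, (positives, total) in samples.items():
    low, high = wilson_interval(positives, total)
    print(f"{site}: {positives}/{total} = {positives / total:.0%}, 95% interval {low:.0%} to {high:.0%}")
```

The intervals overlap substantially, which suggests caution when ranking collection sites on the basis of these counts alone.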

Evolution of viral load versus time and risk of false negatives
Among the parameters that influence the reliability of SARS-Cov-2 testing of the population is the evolution of viral load versus time. The diagnostic method for SARS-Cov-2 available at time of writing suffers from a lack of sensitivity. This means that low viral loads in samples may give rise to false negative results.
Moreover, the scientific community has been astonished to see patients go from a positive result to a negative result and back again in just a few days.
In a study of 610 hospitalized patients carried out in Wuhan between 02 February 2020 and 17 February 2020 [84], each patient was tested by real-time RT-PCR at least twice, a few days apart. After the first tests, the results gave 384 negatives (63.0%). Those patients who were initially considered to be negative were retested one or two days later and the following results were obtained: 48 positives (12.5%), 27 doubtful positives (7.0%), 280 definite negatives (72.9%), and results unavailable for 29 individuals (7.6%). These figures demonstrate the fact that patients can go from a negative to a positive result in a few days and in significant numbers (12.5% not counting the doubtful cases).
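The percentages in this paragraph can be checked directly from the counts reported. The short sketch below uses only the figures quoted above; the variable and function names are illustrative.

```python
# Counts as reported in the text for the Wuhan study [84].
total_patients = 610
initial_negatives = 384
retest_outcomes = {
    "positive": 48,
    "doubtful positive": 27,
    "negative": 280,
    "unavailable": 29,
}

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# The four retest categories should account for every initially negative patient.
retest_total = sum(retest_outcomes.values())
print(f"Retested: {retest_total} of {initial_negatives} initial negatives")
print(f"Initial negatives: {percent(initial_negatives, total_patients)}% of {total_patients} patients")

for outcome, count in retest_outcomes.items():
    print(f"{outcome}: {count} ({percent(count, initial_negatives)}%)")

# Upper bound on conversions if doubtful positives are also counted.
possible_conversions = retest_outcomes["positive"] + retest_outcomes["doubtful positive"]
print(f"Positive or doubtful: {possible_conversions} ({percent(possible_conversions, initial_negatives)}%)")
```

If the doubtful cases were counted as positives, the proportion of initial negatives that changed status on retesting would rise from 12.5% to roughly 19.5%.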
Furthermore, patients who went from a positive to a negative result following treatment have sometimes seen their result turn positive again after five consecutive sample collections and two successive negative results. The persistence of SARS-Cov-2 in the nasopharyngeal passages was evaluated in both symptomatic and asymptomatic individuals. The overall median persistence of the virus for both symptomatic and asymptomatic individuals was found to be 9.18 days (95% confidence interval: 8.04 to 10.48). However, a significant proportion of asymptomatic people (around 25%) gave a positive result after 2 weeks of tests, which is indicative of the persistence of SARS-Cov-2 in the nasal passages and thus the potential for transmission during that period [85]. These findings have implications for public health: how and when to come out of isolation needs to be carefully regulated, given the evidence that a single negative result is insufficient.
Models have been developed to map the effect of the course of the illness on the evolution of viral load in symptomatic patients. The incubation period seems consistently to be 5 to 6 days on average [3], with the duration of symptoms 12 days on average [3]. The viral load increases over time in sick people, reaching its maximum from 5 to 10 days after the first contact with the virus, here referred to as the time of infection. A study combining viral load and the LOD averages of the SARS-Cov-2 detection methods has allowed the construction of a graph that shows two zones where the low viral load of a patient can induce falsely negative results (cf. Fig. 2) [17].
According to a literature review by the Novel Coronavirus Research Compendium at the Johns Hopkins School of Public Health [18], based on the variation of viral load over time in symptomatic individuals, it is possible to estimate approximately the rate of false negatives for each day from exposure to the virus to the 21st day following, assuming symptoms disappear. This was demonstrated through statistical analysis of a large-scale study of family contacts, assuming an analytical specificity of real-time RT-PCR of 90% and an incubation period of 5 days [3]. The results of the study for the rate of false negatives with real-time RT-PCR versus number of days following exposure are shown in Figure 3.
This graphic shows that the probability of a false negative is close to 100% in the 1 to 3 days following exposure [18]. This is due to a low viral load during this early period of infection [17].
By contrast, the rate of false negatives drops to a minimum at around 7 days: this is the symptomatic period where viral load is at its peak [17]. Finally, the gradual upward slope in the third part of the curve shows the increasing rate of false negatives as the viral load in the patient again reduces, falling below the detection limits of real-time RT-PCR [17].
It is important to note that this study is essentially based on the analysis and statistical data of other publications. The graphical representation gives only a general idea of how the rate of false negatives evolves over time.
These combined studies [17,18] reveal that the best time to take a sample from the upper respiratory tract is from 7 to 10 days following contact with the virus, since that is the moment when the viral load is at its highest, and this is confirmed in another article reviewed [12]. This is very significant data, highlighting the necessity for reliable detection tests to ensure correct patient management towards the end of infection, especially with a view to avoiding transmission of the virus.

Quality of pre-analytical process: sample collection and transportation
The pre-analytical stages (collection, transportation and storage of samples) play a major role in the reliability of the overall diagnosis, representing a serious risk of detection error through the following:
- Collecting the sample too quickly, without plunging deep enough (lack of sensitivity for the collection of the virus and consequently for the detection of the virus);
- Poor transport conditions (rupture of the cold chain, transit time too long);
- Presence of interfering substances;
- Patient identification failure;
- Contamination of samples by virion or its RNA;
- Failure to take antiviral treatment into account.
The Centers for Disease Control and Prevention in the United States (CDC) have published precise recommendations for the collection of samples from the respiratory tract (type of sample, equipment to be used, how to go about it) and their handling (storage and transportation) [86]. In the case of nasopharyngeal swabs, it is important to ensure the swab reaches the posterior wall of the nasopharynx, where the viral load is highest. This anatomical region may be difficult to reach if the person has nasal obstructions or deviated nasal passages. The CDC therefore recommends inserting the swab "through the nostril parallel to the palate (not upwards) until resistance is encountered or the distance is equivalent to that from the ear to the nostril of the patient, indicating contact with the nasopharynx", adding that the "swab should reach depth equal to distance from nostrils to outer opening of the ear." The swab must then be gently rolled and rubbed and "left in place for several seconds to absorb secretions". This can be done in both nostrils using the same swab [86].
Good knowledge of the anatomy of the upper respiratory tract is highly beneficial, if not essential, to the quality of the sample. In fact, a study carried out by Piras et al. [83] has confirmed that otorhinolaryngologist doctors carrying out nasopharyngeal sample collection offer a distinct advantage over less experienced staff (nurses and non-specialist doctors, etc.), going so far as to state that diagnostic sensitivity is superior when sample collection is carried out by ENT (Ear, Nose, and Throat) specialists. This of course highlights the fact not only that the choice of anatomical sampling site is crucial but also that the level of skill, training and anatomical knowledge of the person taking the sample is important, and needs to be underpinned by good initial as well as ongoing training that reflects the recommendations in force.
Moreover, the CDC recommends that samples be transferred to the laboratory as quickly as possible to allow rapid storage between 2°C and 8°C. At these temperatures they can be kept for up to 72 h after collection. If they are to be kept longer than that, samples should be stored at -18°C or, ideally, -70°C according to CDC and WHO recommendations [81,86]. This initial storage stage is all the more important given the evidence that storage of samples with a low viral load at 4°C for a longer period (several days) can cause an increase in Ct values during real-time RT-PCR [16]. Finally, if a sample has not been delivered in an inactivation transport medium, it must be inactivated upon arrival at the laboratory. Viral inactivation by means of a lysis buffer containing guanidinium is preferable to viral inactivation through heat as it better preserves the RNA in the sample. In a study by Pan et al. [16] of different viral inactivation methods, based on 23 confirmed cases of COVID-19 in Beijing, 23 samples of different types (throat swabs, expectoration, bronchoalveolar lavage fluid, stools and blood) were tested by real-time RT-PCR both with and without prior heat inactivation. The results showed that the average Ct values were higher for the inactivated samples (33.07 ± 5.00 SD) than for the non-inactivated samples (32.69 ± 4.92 SD). Next, a comparison of the different inactivation methods, chemical (lysis buffer) and thermal (heat), revealed an increase of 1.08 Ct for the samples inactivated by heat (p < 0.001) [16]. The conclusion is that inactivation by heat degrades part of the RNA initially present in the sample, making it undetectable. Since lysis buffers appear to do less damage, they would at present seem to be the better option.
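To put the Ct shifts reported above in perspective, a difference in Ct can be translated into an approximate fold change in detectable template. The sketch below assumes the usual idealized model in which the amount of product is multiplied by (1 + efficiency) at each cycle. The efficiency of 100% is an assumption for illustration, not a value reported in [16], and the function names are hypothetical.

```python
def fold_change_from_delta_ct(delta_ct, efficiency=1.0):
    """Approximate fold reduction in detectable template for a given Ct increase.

    Assumes product is multiplied by (1 + efficiency) per cycle;
    efficiency=1.0 corresponds to perfect doubling (an idealization).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return (1 + efficiency) ** delta_ct

def remaining_fraction(delta_ct, efficiency=1.0):
    """Fraction of the initial detectable template implied by a Ct increase."""
    return 1 / fold_change_from_delta_ct(delta_ct, efficiency)

# Ct differences taken from the figures quoted in the text.
delta_heat_vs_lysis = 1.08          # heat inactivation versus lysis buffer
delta_heat_vs_none = 33.07 - 32.69  # heat-inactivated versus non-inactivated averages

for label, delta in [("heat vs lysis buffer", delta_heat_vs_lysis),
                     ("heat vs no inactivation", delta_heat_vs_none)]:
    fold = fold_change_from_delta_ct(delta)
    print(f"{label}: delta Ct = {delta:.2f}, about {fold:.2f}-fold less template "
          f"({remaining_fraction(delta):.0%} remaining)")
```

Under this idealized model, a shift of about 1 Ct corresponds to roughly half of the detectable RNA being lost, which can matter for samples whose viral load is already close to the limit of detection.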

Limitations and perspectives
This study may cover a short period, but it also covers a large quantity of work carried out by groups of scientists whose common aim is to increase knowledge of the SARS-Cov-2 virus and develop and improve the reliability of tools for its detection. This literature survey brings to light the limitations that exist within the system established in the first half of 2020 for the diagnosis of SARS-Cov-2, which are of two types: methodological and conceptual.

Methodological limitations of reliability evaluations of SARS-Cov-2 diagnostic tests
From a methodological point of view, these limitations are an insufficiently harmonized evaluation methodology, a limited and inadequate access to biological reference materials and manufacturers' captive in vitro diagnostics systems. Scientific rigor and reproducibility of information are of the essence if these technical barriers are to be surmounted, and quickly. To this end, scientific articles dealing with the subject of PCR method evaluation should follow the MIQE (Minimum Information for Publication of Quantitative Real-Time PCR Experiments) recommendations [87,88] and provide full details of experimental parameters in order to facilitate the work of the professionals who read them. Comprehensive documentation on reproducible experimental protocols would undoubtedly help the end users, namely clinical laboratories, in their choice of diagnostic tools.
It is worth noting that the reproducibility of investigations and conclusions drawn from them is seriously limited by the lack of access to sufficient quantities of sufficiently diverse biological materials. Given the degree of urgency, numerous initiatives have sought to produce reference materials that are only roughly standardized, the quality of which (homogeneity, construction methods, storage conditions, and so on) is difficult to evaluate.
Nevertheless, some useful initiatives have emerged over the last few months, such as the working group that is drawing up standards, standardization guidelines and validation guides for PCR methods for the detection of SARS-Cov-2 under the auspices of the JIMB (Joint Initiative for Metrology in Biology). The declared aim is to create, in conjunction with international laboratories, a controlled information base so as to harmonize practices and increase the reliability of PCR tests, which certainly have room for improvement. These efforts could eventually lead to the availability of reference control samples for method validation [40,89]. Another initiative worth mentioning is the European Virus Archive - GLOBAL (EVA-GLOBAL) project, which hopes to share reference materials for the validation of methods for SARS-Cov-2 detection by PCR [90].
Currently, inter-laboratory comparisons for the detection of SARS-Cov-2 are at an early stage of development because of the problems related to sample homogeneity, storage and transportation. In the absence of EQA (External Quality Assessments) of laboratories engaged in SARS-Cov-2 diagnosis, inter-laboratory comparisons would be a useful way to learn more about the disparity of practices and the reliability of detection tools in practical terms [91]. Attention should now begin to focus seriously on these inter-laboratory comparisons since they are a path towards consistently reliable diagnoses across different regions.
Because of lack of access to information about the composition of commercial kits, it is difficult to appreciate and compare the performance of the different PCR kits on the market. The laboratories that supply them do not share the exact composition of their reagents (construction of primers and sensors, reagent concentrations, etc.). Consequently, it is very difficult to adapt or optimize a commercial method within a clinical laboratory. The only option would be to develop an internal method and carry out a comprehensive validation of it, but this would require time and resources that clinical laboratories simply do not have for that purpose. It therefore falls to the scientific studies on this subject to standardize the transmission of information on the performance of real-time RT-PCR methods, and by this means facilitate the bibliographic research that is a necessary step in choosing a PCR kit.

Conceptual limitations of reliability evaluations of SARS-Cov-2 diagnostic tests
In addition to the technical limitations, conceptual limitations due to insufficient knowledge of the virus and its biology also affect the characterization of the reliability of detection tests.
To begin with, it is important to realize that SARS-Cov-2 has never been isolated in accordance with standard practice, as Crowe points out [92], having studied the SARS-Cov1 epidemic in 2003. The diversity of symptoms among patients makes it impossible to associate specific symptoms with the presence of SARS-Cov-2 RNA or to isolate the virus with precision [92]. Today, adding impurities from patients to cell cultures to provoke cytopathic effects is considered sufficient, whereas this does not in any way enable isolation of viral particles or characterization of their genetic material. In January 2020, viral particles were observed through electron spectroscopy from human epithelial tissues [93], without any viable demonstration that they corresponded to SARS-Cov-2 [92]. The RNA sequence obtained was from impure samples from patients with pneumonia. There is therefore no concrete proof that the RNA studied was actually from the SARS-Cov-2 virus [92]. The original hypothesis on which the diagnostic test and its PCR targets are based, that the detection of this particular RNA is proof of the presence of SARS-Cov-2 in a sample, has not been validated. In other words, significant uncertainty surrounds the isolation of this virus and the characterization of its genome, which are notwithstanding the basis upon which PCR diagnostic tests have been developed. In such circumstances, the point of departure uncertainty has a direct bearing on the general diagnostic reliability [92].
In the light of this, a degree of uncertainty automatically attaches to the molecular targets and the primers and probes of the PCR kits that are available today. It would appear that we do not have enough knowledge or experience to state with certainty which regions of the genome are conserved and which are not, even if data exists for other coronaviruses: given the genetic diversity of the SARS-Cov-2 virus, a very large number of viral genomes would have to be sequenced, over a long period of time and in several regions of the world, to ensure statistically robust data for this one. One of the important jobs to be done as things stand today is to continue to analyze SARS-Cov-2 sequences over time and in different regions of the world, and acquire as much data as possible in order to refine the notion of "conserved region" and develop PCR primers for more reliable diagnosis. New and highly effective sequencing methods have been validated (MinION, MiSeq) that can further this end [94]. Inter and intra individual comparison of SARS-Cov-2 genome studies by sequencing and analysis of the whole genome are in progress [95]. The PCR primers currently used in clinical laboratories need to be constantly questioned and prove their validity in view of the multiplicity of variants of SARS-Cov-2. Moreover, it would be useful, in order to anticipate future mutations of SARS-Cov-2, to make use of bioinformatics tools to predict mutations and broaden the specificity of primers in conserved regions of the viral genome [47].
While the quality of construction of the diagnostic tool is obviously important, so is the quality of the sample to be analyzed. In theory, samples should be stored at 4°C immediately after collection. In practice, it is very difficult for clinical laboratories to comply with recommendations for the storage of samples, in particular during collection and transportation to the laboratory. However, there is no data to describe the effects of storing samples at ambient temperature when no possibility of a proper cold chain exists, nor any definition of effective and practicable actions to compensate the situation (more regular courier trips, urgent transit of samples, etc.) and limit the degradation of samples before they reach the laboratory for analysis.
Finally, it is not easy to estimate the true overall sensitivity of each diagnostic test by taking into account all the relevant factors from both the pre-analytical and the analytical stages. Doing so requires a posteriori statistics based on clinical, epidemiological and serological information which, for the moment, are simply not available [77].
Consequently, in view of all the limitations described, the result of a diagnostic test for SARS-Cov-2 by PCR should not be the only factor taken into consideration when it comes to deciding whether or not an individual is in need of specific or intensive medical care [8,17,96]. The result must be interpreted within the broader clinical context, taking into account symptoms, the medical state of the person (chronic diseases, etc.), thoracic CT scans, etc. Generally speaking, the strategy of wide scale diagnostic testing is a difficult matter, where "Balancing the increased use of laboratory tests, risk of testing errors, need for tests, burden on healthcare systems, benefits of early diagnosis, and risk of unnecessary exposure is a significant and persistent challenge in diagnosing COVID-19." [9].
There is general awareness of the limitations of real-time RT-PCR, and other methods based on different technical principles are being developed and validated by regulatory bodies with a view to release onto the market. Among them is a detection technique based on the CRISPR-Cas12 principle developed by Broughton et al. [97]. It is called SARS-CoV-2 DNA Endonuclease-Targeted CRISPR Trans Reporter (DETECTR) and can be carried out using nasopharyngeal or oropharyngeal swabs transported in a universal viral transport medium. The method consists of isothermal amplification using LAMP technology with detection of the E and N genes, followed by CRISPR detection with a colored line on a fluorescent plate reader to indicate a positive test result. 83 clinical samples were tested, and the method gave a diagnostic sensitivity of 95% and specificity of 100% (in relation to other respiratory viruses). With this method, a result was obtained in 30 to 40 minutes, which is an appreciable advantage for clinical laboratories. However, this method can currently only offer a LOD of 10 copies/mL, which is roughly 10 times higher than other real-time RT-PCR methods, such as that developed by the CDC (CDC test of California Department of Public Health) [55].

Conclusion
The analysis of scientific data published during the first half of 2020 on the detection of SARS-Cov-2 clearly reflects the emergence of a new pathology and the many challenges that this implies. Many questions that have been asked still remain unanswered. The lack of knowledge about this virus, new to humans, which appeared at the end of 2019, has a significant impact on the technical capacity to develop reliable, rapid and practical tools for its detection. The immediate deficiencies that certain articles bring to the fore (a lack of biological materials, of systematic and harmonized methodology, etc.) are clearly areas for improvement in the management of the current crisis, but also a means of preparing for the crises of the future. This bibliographical study makes it possible to identify both conceptual and practical limits: real-time RT-PCR tests for the detection of SARS-Cov-2 involve an intrinsic uncertainty linked to genetic issues. In a context where SARS-Cov-2 has never really been isolated, the genetic diversity of the virus, which may be present in a large number of variants, should drive the selection of molecular targets and primer sequences used for PCR tests. Based on these fundamental genetic aspects, the development of PCR kits to date has engendered a certain heterogeneity of performance related to the choice of sequences and the kit production quality.
In addition, data has been reported on the performance of screening tests. For reliable screening, it is important to take into account the evolution of viral load in relation to time to avoid a high risk of false negative results. The sampling procedure and the pre-analytical conditions are critical control points for clinical laboratories, which can, however, rely on existing recommendations.
It is worth noting that the difficulties and limitations described with respect to pathogen detection through molecular testing (real-time RT-PCR) are for the most part just as relevant to serological detection methods. If detection by RT-PCR aims to define the presence or absence of a virus at a given moment in time, a serological test analysis considers the exposure of the patient in the past and the presence of a still detectable immune reaction. In other words, the scope of the challenge with serological detection is even broader than with molecular detection, and the reliability of these tests must also be confirmed for any large-scale use.
Six months on from the start of the COVID-19 epidemic, it is important to highlight the necessity for coordination between the continuous improvement of scientific knowledge and the tailoring of strategies and policies for managing the health crisis with a view to effective screening.