Meeting Report: The Diagnostic Landscape for COVID-19

Written by: Dr. Simon Goodman

Bioinsider presented a one-day Webcast meeting on June 18, 2020, entitled “The Diagnostic Landscape for COVID-19”, where leading experts gave an update on the situation in the United States. A recording of the meeting is available from the Bioinsider website. The COVID-19 pandemic has generated a dramatic and rapid scientific response, and has had massive public health, social, and economic consequences. Many unreviewed scientific preprints have also been posted online, so this meeting provided valuable clarity on several critical topics.

Presentations

Karl V. Voelkerding, MD, University of Utah; ARUP Laboratories, as chair, noted that SARS-CoV-2 is likely zoonotic. Humans can be infected with seven coronaviruses, closely related to those from pangolin and bat. The virus was first seen in Wuhan, China, at the end of 2019, and by the end of January 2020 there were 2700 local infections and 80 deaths. By June 19, 2020, there were 8.2 million infections in 182 countries and 445,000 deaths. The pandemic is uncontained.

Timothy J. O’Leary, MD, Adjunct Professor, University of Maryland School of Medicine, detailed the many statistical issues generated by SARS-CoV-2 analyses. He noted that there is no “best test” for the virus and that it is important to understand the biases involved. He highlighted spectrum bias, verification bias, and imperfect reference bias. Of these, he noted that spectrum bias, where specimens selected to evaluate tests do not represent the population where tests will be applied, and imperfect reference bias are perhaps the most serious. SARS-CoV-2 tests evaluated on patients positive for several symptoms may be used on people with only one, while negative samples are from populations never exposed to the virus. As the virus level decreases over the first weeks of infection, the precise timing of sampling is critical for comparison of assays.

A major issue is correctly defining a gold-standard positive – a common bias in SARS-CoV-2 testing. Discrepant analysis, where agreeing tests are considered definitive while disagreeing tests are compared to a third test, is always biased unless all three tests are perfect. Finally, incorporation bias presumes that one test in a series is a gold standard, but this is not appropriate for determining test superiority.

Statistical impartiality demands that the question being asked of the analysis be defined before the samples are gathered, using either McNemar’s or Liddell’s test on paired comparisons (one patient, two specimens); both require reliable gold-standard samples. Tests like chi-squared or Fisher’s exact test are not appropriate for such samples. But for highly specific and selective tests, like PCR, very large sample numbers are needed to assess differences between analytical methods.
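The paired comparison described above can be sketched in a few lines. This is a minimal illustration of an exact McNemar test, not the speaker's own analysis; the function name and the discordant-pair counts are hypothetical and chosen only to show the arithmetic.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b = pairs where assay A is positive and assay B is negative
    c = pairs where assay A is negative and assay B is positive
    Concordant pairs say nothing about a difference and are ignored.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows a Binomial(n, 0.5) distribution.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, invented for illustration only.
p_value = mcnemar_exact(b=12, c=3)
print(f"exact McNemar p = {p_value:.4f}")
```

Only the discordant pairs enter the calculation, which is one way to see why very large sample numbers are needed when two highly concordant assays are compared: most paired specimens agree and contribute nothing to the test.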

It is necessary to decide in advance the desired trade-off between sensitivity and specificity, and so the desired P-values that the test should be expected to reach. And then there is the critical issue of deciding how much of a wrong conclusion you are willing to accept, whether type 1 error (finds a difference where none exists) or type 2 error (finds no difference when one is there). Many pathology studies pay no attention to type 2 error, which means that failing to detect a difference between positive and negative values doesn’t mean that no difference exists.
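The type 2 error point can be made concrete by computing the power of a paired comparison. The sketch below assumes an exact McNemar test and hypothetical discordance rates (8% of pairs positive only on assay A, 2% positive only on assay B); none of these figures come from the presentation.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def mcnemar_power(n_pairs: int, p_b: float, p_c: float, alpha: float = 0.05) -> float:
    """Probability that the exact test rejects at level alpha.

    p_b and p_c are the assumed true probabilities of the two kinds of
    discordant pair. Power is 1 minus the type 2 error rate.
    """
    p_disc = p_b + p_c
    if p_disc == 0:
        return 0.0
    share = p_b / p_disc
    power = 0.0
    for d in range(n_pairs + 1):          # d = number of discordant pairs
        w_d = binom_pmf(d, n_pairs, p_disc)
        for b in range(d + 1):            # b = discordant pairs favouring assay A
            if mcnemar_exact(b, d - b) < alpha:
                power += w_d * binom_pmf(b, d, share)
    return power


# Hypothetical scenario: the same true difference, two study sizes.
power_small = mcnemar_power(50, 0.08, 0.02)
power_large = mcnemar_power(200, 0.08, 0.02)
print(f"power with 50 pairs:  {power_small:.2f}")
print(f"power with 200 pairs: {power_large:.2f}")
```

Under these assumptions the smaller study will often fail to reach significance even though a real difference exists, which is the situation the speaker warned against interpreting as evidence of equivalence.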

David Hillyard, MD, Medical Director, Molecular Infectious Diseases; Professor, University of Utah School of Medicine, considered “Multiplatform Comparisons of Molecular Tests for SARS-CoV-2: Predicting Differences in Case Detection”. He noted that during the coming flu season COVID tests that function robustly in the presence of flu and other respiratory viruses will be necessary. The sensitivity of SARS-CoV-2 molecular tests varies greatly depending on the time point at which they are performed. Furthermore, due to severe limitations in supply chains, 57% of academic and commercial reference laboratories are simultaneously running three or more molecular test methods to minimize the effects of reagent shortage, which raises the question of the relative sensitivity of the tests. Despite widespread availability of commercial analysis instruments, some 21% of U.S. laboratories are using in-house developed molecular-testing methods, under submissions for FDA Emergency Use Authorization (EUA), as primary tests (slide 8; AMP survey SARS-CoV-2 molecular tests).

Thus, SARS-CoV-2 molecular tests, generally quantitative PCR, are less rigorously characterized than is normal for FDA diagnostic assays, with a remarkable three orders-of-magnitude difference in the reported sensitivities (expressed as a limit of detection, LOD, in virus particles per milliliter) between the available tests, and uncertainty about the relationship between such reported sensitivities and commercially defined units (e.g., the tissue culture infective doses quoted by Roche and Hologic). This raises obvious issues of false negative results caused by high LODs (discussed in a preprint by Arnaout et al. 2020) – and the question of when to retest.

Data from a large test series of symptomatic patients in Utah analyzed using the Hologic and Roche analysis platforms suggested that, given the often high viral load in patients, reliable qPCR crossing thresholds (CTs) can be used to select samples for confirmatory re-testing.

Teng Peng, PhD, Technical Application Manager, ACROBiosystems, described the company's efforts to develop reliable recombinant SARS-CoV-2 protein for rapid sero-testing. ACROBiosystems has developed a fully glycosylated form of the viral spike protein, expressed in HEK 293 cells and trimerized using the foldon domain of T4 phage fibritin. Stability of the essentially mono-disperse protein was improved by mutating the dibasic furin cut site. Cross-neutralizing antibodies bound this form of the protein, indicating near-native folding and antigenicity.

Kirsten St. George, PhD, Clinical Professor, School of Public Health, Biomedical Sciences, SUNY, described the challenges that the New York Public Health Care System has faced during the evolving COVID-19 pandemic. She noted that each state public health lab is essentially independent and that the New York laboratory was aided by support from ThermoFisher and Zeptometrix.

Her laboratory monitored assay developments in advance of the Centers for Disease Control and Prevention (CDC) assay, and their EUA submission was the second for SARS-CoV-2 and the first from a public health laboratory. The EUA was accepted and, indicating the seriousness of the situation, the FDA authorized its expansion to other state hospital clinical laboratories, and also gave them permission to oversee other EUA tests in New York state.

The laboratory needed an enormous increase in capacity of automated extractors, liquid handlers, qPCR, serotests using magnetic immunosphere and dried blood spot assay, and plaque neutralization assays. They initiated a rapid staff increase from 29 to 130, double shifts, and training initiatives, but the general virology screening program still needed to be shut down.

Supply chain logistics are a major issue. The flu epidemic of 2009 generated 200 samples per day and Zika 400, but SARS-CoV-2 currently involves 2600 samples per day.

The laboratory has found that analysis of five-sample pools works well, but that deconvolution logistics are difficult. The qPCR CT signal from patient saliva, alone or in viral transport medium, even after storage at room temperature, remained unchanged at seven days, but no indication of virus was found in patient cerebrospinal fluid, despite clear encephalitis (n=30). Retrospective data analysis showed clear evidence of the efficacy of shelter-in-place (quarantine) orders.
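The trade-off behind five-sample pooling can be sketched with a simple two-stage scheme, in which every member of a positive pool is retested individually. The talk did not specify the deconvolution scheme used, so this model (often called Dorfman pooling), the assumption of a perfect assay, and the prevalence values below are all illustrative assumptions.

```python
def expected_tests_per_sample(prevalence: float, pool_size: int) -> float:
    """Expected number of assays per specimen under two-stage pooling.

    Assumes independent infections and a perfect assay: one test per pool,
    plus one test per member whenever the pool is positive.
    """
    if pool_size <= 1:
        return 1.0
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + p_pool_positive


# Hypothetical positivity rates, for illustration only.
for prevalence in (0.01, 0.05, 0.20, 0.30):
    tests = expected_tests_per_sample(prevalence, pool_size=5)
    print(f"prevalence {prevalence:.0%}: {tests:.2f} tests per sample")
```

In this simplified model the saving shrinks as positivity rises, and every positive pool triggers a second round of sample retrieval and retesting, which is where the deconvolution logistics mentioned above come in.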

Brian Krueger, Associate Vice President, Technical Director, R&D at LabCorp, discussed how LabCorp became the first diagnostics company to get an EUA for SARS-CoV-2 RT-PCR testing and the at-home collection of test samples. The timeline of assay development ran from publication in Nature of the virus RNA sequence (GenBank accession number MN908947, January 12, 2020), through validation and release of a manual assay for nucleocapsid sequences (March 4-5), to one-step RT-PCR and FDA EUA (mid-March) for the TaqMan 7500 Dx low-throughput system. The current assay can run on a Quantstudio-7 384-well in multiplex format, using Roche MagnaPure sample extraction. Test validation included a negative signal for 26 other respiratory viruses and for negative samples from 177 patients, while 95/96 confirmed positive samples gave a positive signal.

The assay has now migrated to a Hamilton liquid handling system (mid-April) with 1152 runs per instrument per hour, and LabCorp has capacity for over 100,000 tests per day, which exceeds current demand. They have developed a sampling kit for at-home use.

Patricia Slev, D(ABCC), Associate Professor, University of Utah and Elitza Theel, PhD, Associate Professor at Mayo Clinic described the evolving situation and use of re-testing for anti-SARS-CoV-2 antibodies in the population.

SARS-CoV-2 is a novel virus, so the population has no existing antibodies to it, although more than 90% of adults over the age of 15 have antibodies to the four common coronaviruses. Most COVID-19 patients develop antibodies 1 to 2 weeks after developing symptoms. More than 95% of patients are antibody-positive after the second week of infection.

Original sero-tests were not intended to be diagnostic, but were designed for CLIA high-complexity labs to reveal prevalence: they had low oversight without need for an EUA. The situation changed on March 15, 2020, when it became necessary to notify the FDA of intent-to-market, and some general guidelines were provided, by which time there were over 200 tests on the U.S. market. Some were stated to be point-of-care tests, some were home tests. All in all there was much confusion. Some stated that they were “FDA-approved”, but they weren’t.

On May 4th the FDA issued new guidelines and it became obligatory to submit validation data for EUA within 10 days of FDA notification. The FDA defined performance-threshold requirements and provided a template for EUA submission for transparency. They also announced an independent “umbrella assay evaluation” through the National Institutes of Health and the National Cancer Institute.

As yet there are no “FDA-approved” assays, only EUAs. Some 190 sero-assays are commercially available, of which 15 have EUAs. In all, 41 manufacturers submitted no data or didn’t receive an EUA. The tests granted EUAs and the manufacturers that were refused are listed on the sero-tests pages of the FDA website. No antibody tests are approved for point-of-care or at-home testing. Laboratory oversight is necessary for all.

There are three classes of sero-tests: lateral flow assays; enzyme-linked immunosorbent assays; and chemiluminescence immunoassays. They target the nucleocapsid protein, spike protein subunit 1 or 2, or the spike protein receptor binding domain, and are mainly designed to work on serum or plasma, although there is also a finger stick test from ChemBio for use on whole blood. CDC guidelines (issued May 23rd) see no analytical advantage in tests targeting IgG, IgM or both antibodies, while IgA testing is not recommended.

Despite relatively high test sensitivity and specificity, the low prevalence of SARS-CoV-2 means that the positive predictive value of serological tests is relatively low, so the CDC has recommended that: 1) assays with >99.5% specificity be used; 2) orthogonal two-assay tests be used, preferably using a different antigen in each test; and 3) serological screens be performed only in high risk/high prevalence regions.
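The effect of prevalence on positive predictive value follows directly from Bayes' rule and can be sketched as below. The sensitivity, specificity, and prevalence figures are hypothetical, and the two-assay combination assumes that the assays err independently, which is the rationale for using a different antigen in each but is unlikely to hold exactly in practice.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value: P(truly positive | test positive)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        return 0.0
    return true_pos / (true_pos + false_pos)


def both_positive(sens1: float, spec1: float, sens2: float, spec2: float):
    """Sensitivity and specificity of 'positive only if both assays agree'.

    Assumes the two assays are conditionally independent (an assumption).
    """
    sensitivity = sens1 * sens2
    specificity = 1.0 - (1.0 - spec1) * (1.0 - spec2)
    return sensitivity, specificity


prevalence = 0.02  # hypothetical seroprevalence of 2%
single = ppv(0.95, 0.95, prevalence)
high_spec = ppv(0.95, 0.995, prevalence)
sens2, spec2 = both_positive(0.95, 0.95, 0.95, 0.95)
orthogonal = ppv(sens2, spec2, prevalence)

print(f"single assay, 95% specificity:   PPV = {single:.2f}")
print(f"single assay, 99.5% specificity: PPV = {high_spec:.2f}")
print(f"two orthogonal 95% assays:       PPV = {orthogonal:.2f}")
```

With these illustrative inputs, most positives from a single 95%-specific assay are false, while raising specificity or requiring agreement between two assays restores a usable predictive value at the cost of some sensitivity.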

No reliable tests for neutralizing antibodies exist, and the level of neutralizing antibody required for protection is still unknown. For common viruses and for SARS-1, protective antibodies have been found to disappear within a few years.

Ralph Rogers, MD, Assistant Professor of Medicine, Warren Alpert Medical School of Brown University, described some of the demanding diagnostic challenges emerging during the pandemic.

COVID-19 patients often present with non-canonical symptoms, and may even have negative PCR tests for virus. This makes an appropriate clinical response challenging – for the individual, the institution, and the community. This applies especially to immune-compromised patients, and ambulatory patients. Physicians are often faced with difficult diagnoses, but COVID-19 is unique in that the analytical performance of available diagnostic tests is poorly characterized compared to an FDA-approved test. Furthermore, the breadth of impact of an inaccurate diagnosis may be severe.

We need to know the diagnostic sensitivity of SARS-CoV-2 molecular tests in various populations to support clinical interpretation of negative assay results. The best current data, from a large population in New York City, suggest an overall negative predictive value of slightly over 81% and a local diagnostic sensitivity in the range of 58 – 95%, depending on the false negative rate in the population. For high pre-test probability with a negative initial test, a same-day repeat or a repeat two days later is advisable. Viral load in the positive patients routinely diminishes to zero 15 to 21 days after infection, so the clinical relevance of occasional persistently positive test results remains unclear.
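The reasoning behind repeat testing can be sketched as a Bayesian update of the pre-test probability after each negative result. The pre-test probability, sensitivity, and specificity below are hypothetical, and treating the repeat as independent of the first test is an optimistic assumption, since the same patient factors (sampling site, timing, low viral load) can cause both to be negative.

```python
def probability_after_negative(pretest: float, sensitivity: float,
                               specificity: float) -> float:
    """P(infected | negative result), i.e. 1 minus the negative predictive value."""
    false_neg = (1.0 - sensitivity) * pretest
    true_neg = specificity * (1.0 - pretest)
    if false_neg + true_neg == 0:
        return 0.0
    return false_neg / (false_neg + true_neg)


# Hypothetical high-suspicion patient and assay characteristics.
pretest = 0.50
sensitivity = 0.70
specificity = 0.99

after_first = probability_after_negative(pretest, sensitivity, specificity)
# The first post-test probability becomes the prior for the repeat test
# (assumes the two results are independent, which is optimistic).
after_second = probability_after_negative(after_first, sensitivity, specificity)

print(f"after one negative:  {after_first:.2f}")
print(f"after two negatives: {after_second:.2f}")
```

Under these assumptions a single negative result leaves a substantial residual probability of infection in a high-suspicion patient, which is consistent with the advice to repeat the test rather than rely on one negative.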

Karen A. Heichman, PhD, Senior Program Officer, Global Health Division, Bill & Melinda Gates Foundation focused on their efforts to develop simplified home-sampling tools suitable for use in low- to middle-income countries, permitting more widespread testing. One challenge was to find alternatives to the technically demanding and physically distressing nasopharyngeal (NP) swabbing techniques currently used to obtain samples.

She highlighted six barriers to specimen collection: 1) transport of symptomatic individuals for testing; 2) supply of NP and oropharyngeal (OP) swabs; 3) expert health care personnel for NP collection; 4) personal protective equipment for limiting exposure to virus; 5) transport media for swabs; and 6) cold packs for swab transport.

Recent studies from the Everett Clinic comparing nasal home-sampling with the NP swabs, and from Stanford University School of Medicine, comparing nasal home sampling with OP swabs, showed over 94% positive and negative agreement, suggesting that nasal swabs might be an acceptable alternative to NP sampling.

In a fascinating demonstration that even simple logistical technologies have been brought to their limits by the pandemic, Dr. Heichman described how nasal swab studies were conducted with “Puritan” foam swabs, where supplies are limited. Spun polyester OP swabs were demonstrated to have an equivalent performance to foam, and can be manufactured at much higher capacity. However, the polyester swab-manufacturers cannot sterilize and pack the swabs – and that must be done by FDA-registered third-party providers.

In an important comparison of viral transport medium (VTM), saline, or dry swabs for transport, conducted by the Everett Clinic and Quantigen, saline was superior to VTM, while dry swabs had better stability than saline up to three days, even with highly diluted specimens and during elevated-temperature stress tests. These data enabled the company Everlywell to obtain an EUA for simple dry or in-saline home-sample collection for molecular testing – without the need for a cold chain. This may prove extremely valuable in low- and middle-income countries – and may also be useful during screening for other viral pathogens, for example during the forthcoming flu season. There may also be a potential for such collection for direct antigen-testing using lateral flow assays, which are a vital low-cost analytic platform (for example for malaria tests, at a cost under 50¢ per test) currently used in low- and middle-income countries.

Finally, she noted that dry-swab sampling may open SARS-CoV-2 testing to agricultural biotech industry technologies, capable of millions of molecular tests per day. At a cost per test of around 10¢, these could be used for mass repeated screening of schools or high-density industrial sites, for example. The necessary supporting technologies for large-scale accession, logistics, and sample elution, which remain a challenge, are currently being researched.

Panel Discussion

The main meeting closed with Dr. Voelkerding moderating a panel discussion about “Considerations for All Tests and How to Properly Handle the Pandemic with the Current Testing Capacity”. Given the scale of the pandemic and the low cost of the individual tests, mass repeated screening might be funded at the Federal level.

How should clinics prepare for the forthcoming flu and respiratory disease-season during the COVID-19 pandemic? Drive-through vaccination facilities both for flu and, when and if it becomes available, for SARS-CoV-2, are preferable to mass attendance at clinics. Rapid and reliable multiplexed diagnostics covering SARS-CoV-2, flu and respiratory syncytial virus, combined with more experience of non-canonical presentation of COVID-19 will be essential.

Dr. Krueger described how every component of available PCR test systems was currently limiting. Even sterile swabs, of which LabCorp had several million in storage at the start of the pandemic, were sold within a few days – finding alternative swabs and an alternative VTM that works without a cold chain was and is a challenge. He noted that there are major logistics challenges in handling millions of samples per day, and doubted that any provider could set up the necessary technology within the likely time-scale of the current pandemic.

Dr. Heichman noted that extraction from sample swabs was likely to be rate-limiting and that agricultural biotechnology companies have many insights into dealing with intransigent biological specimens. Her team's goal is to develop solutions to such problems, which will be made publicly available. Dr. O’Leary noted that current logistical technology used during poultry management might be appropriate for SARS-CoV-2 sample handling and that pre-analytical issues, like correct NP sampling, although challenging, will be vital for rigorous epidemiology.

What proportion of testing was performed by public health laboratories in comparison to private commercial reference laboratories and individual hospital laboratories? Dr. Tu described how the Everett Clinic, north of Seattle, had the first community-acquired patient and COVID-19 hospitalization in the U.S. The State Public Health Laboratory is between Everett and Seattle. In the last week of February and the first two weeks of March, all the testing was performed in the state lab. By Tuesday March 10, 2020, the University of Washington had set up the CDC EUA SARS-CoV-2 test, and on Thursday (the 12th) they agreed to run the Everett trial, but by Saturday (the 16th) their capacity was also overrun by demand. Everett turned to Quest Diagnostics to run the tests, but they too were rapidly overrun, such that their high-throughput PCR machines routinely broke down. Much of the reported usage of multiple platforms by clinical test laboratories is due to such availability issues. The state public health laboratory was vital, as they had the only testing facilities available at the start of the pandemic. Dr. Rogers noted that the situation varied between states, and that his institute has provided support for clinics across the state. A coordinated central U.S.-wide testing mechanism would be highly desirable, but doesn’t exist.

Developing a rigorous algorithm for multiplex testing in the forthcoming respiratory disease season was agreed to be critical.

Five breakout discussion groups followed considering the translational diagnostic pipeline; biomarkers to predict COVID-19; what we know about serology testing; on-going challenges with COVID-19 diagnostics; and Rapid Testing.