Our aim is to make rigorous analysis of AIRR-Seq data easy.
The purpose of our working group is to encourage practices that enable software tools to work, and to work with one another. A key first priority is to assemble data sets that people can use to test and compare the functionality of various programs. To advance this, we have defined summary statistics that can be used to characterize simulated data sets and compare them to real-world data sets. The result is a software tool, Sumrep, which is described in a recent paper. We will use this tool to assist in the collection of simulated and real-world data sets for testing and benchmarking.
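As a rough illustration of what comparing a simulated data set to a real-world one through a summary statistic can look like, the sketch below summarizes each repertoire by its sequence-length distribution and scores the difference with the Jensen-Shannon divergence. The choice of statistic, the divergence measure, the function names, and the toy sequences are all assumptions made for this example; it does not use Sumrep and is not meant to represent its interface.

```python
from collections import Counter
from math import log2


def length_distribution(sequences):
    """Return the relative frequency of each sequence length."""
    counts = Counter(len(s) for s in sequences)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Both arguments map an outcome to its probability. The result is 0 for
    identical distributions and 1 for distributions with disjoint support.
    """
    support = set(p) | set(q)
    # Mixture distribution over the union of both supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in support}

    def kl_to_mixture(a):
        return sum(a[k] * log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical toy repertoires (junction amino-acid sequences).
observed = ["CARDYW", "CARGGYW", "CARDGYFDYW", "CARW"]
simulated = ["CARDYW", "CARGGYW", "CARGGYW", "CAW"]

score = js_divergence(
    length_distribution(observed),
    length_distribution(simulated),
)
print(f"Length-distribution divergence: {score:.3f}")
```

A score near 0 would suggest the simulated lengths resemble the observed ones on this single statistic; a realistic comparison would look at many statistics together.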
We have also defined a standard for AIRR-Seq software tools. Over the next year we will promote it by providing community support and publicity to compliant tools, as a way of encouraging interoperability and adoption of AIRR standard protocols. In collaboration with the Germline Database Working Group (GLDB WG), we have also started work on an initiative to assess the biological credibility of an AIRR-Seq repertoire and to identify common technical errors that can occur during its preparation. Such errors can be hard to spot from read quality annotations and other technical measures commonly available today.
Plans for 2019/2020 include:
- Encourage better simulation via summary statistics
– Finish the initial release of Sumrep and publish a paper applying it to selected datasets. Now complete.
- Evaluate annotation tools, using simulated and real-world data
– Identify simulated and real-world datasets that are useful for evaluation.
– Develop a framework for the comparison of results.
- Encourage standard interchange formats
– Encourage tool providers to submit their tools for review against the guidelines. Promote by issuing a ‘badge’ that providers can use to confirm compatibility.
- A tool to assess biological quality/credibility of a repertoire – with GLDB WG
– Assess genome usage/coverage
– Hybridisation (existing work in GLDB WG)
– Identify other common technical errors
Co-Leaders: William Lees and Chaim Schramm
Members: Bryan Briney, Christian Busse, Brian Corrie, Azahara Fuentes Trillo, Susanna Marquez, Erick Matsen, Enkelejda Miho, Pejvak Moghimi, Nima Nouri, Mats Ohlin, Branden Olson, Adrian Shepherd, Mikhail Shugay, and Jason Vander Heiden