Develop a set of metadata standards (MiAIRR) for the submission of adaptive immune receptor repertoire sequencing (AIRR-seq) datasets. Develop standardized file formats, schemas and data field names to represent MiAIRR metadata, annotated antibody and T cell receptor sequences, and any downstream data representations. These standards are defined in formal machine-readable specifications, allowing interoperability between software from different developers.
Plans for 2022/23:
The next cycle will focus primarily on refining the experimental schemas for release as production-ready versions, along with an accompanying manuscript. Planned work includes:
- Releasing AIRR Standards v1.4, which is scheduled to include:
  - Experimental release of the germline database schemas
  - Experimental release of the single-cell schemas
  - Experimental release of the receptor schema
  - Updates to abundance fields to account for new technologies
  - Support for additional schemas in the R and Python libraries
  - Removal of Python 2 support
  - Various minor improvements to field definitions and documentation
- Releasing AIRR Standards v2.0, which is scheduled to include:
  - Production release of the germline database schemas
  - Production release of the single-cell schemas
  - Production release of the receptor schemas
  - Production release of the lineage schemas
  - Experimental release of a file manifest schema, a repertoire grouping schema, and a persistent identifier definition
  - Several small but backwards-incompatible changes
- Drafting a manuscript to accompany the v2.0 release describing new standards development since the original Minimal Standards and Data Representations publications in 2017 and 2018, respectively.
Long-term vision and how WG products integrate with the AIRR-C mission:
The Standards WG aims to facilitate data sharing and interoperability of analysis tools within the AIRR-seq field through common data and metadata standards and documentation.
Co-leaders: Christian Busse and Jason Vander Heiden
Members: Scott Christley, Chaim Schramm, Brian Corrie, William Lees, Felix Breden, Florian Rubelt, Lindsay Cowell, Nina Luning Prak, Veronique Giudicelli, Eli Harkins, Kenneth Hoehn, Susanna Marquez, Ulrik Stervbo, Kira Neller, Aditi Jain, Jingyun Li, Adrien Six, Artur Roca, Edward Lee, Marco Oliveira, Bjorn Peters, Francisco Arcila, Katharina Imkeller, Nicole Knoetze, Enkelejda Miho
Products:
- Machine-readable, open source schemas for AIRR-seq data.
- Reference API libraries in R and Python providing read, write and validation operations for finalized schemas.
- Detailed documentation for Standards WG products, products of other WGs, public data submission, and a listing of compliant community tools.
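As a rough illustration of the read, write and validation operations mentioned above, the following minimal sketch uses only the Python standard library and does not call the reference library itself. The function names are hypothetical, and the field list is an illustrative subset chosen for this example; the published schemas remain the authoritative definition of required fields.

```python
import csv
import io

# Illustrative subset of column names (an assumption for this sketch);
# the machine-readable schema is the authoritative source.
REQUIRED_FIELDS = ("sequence_id", "sequence", "v_call", "j_call", "junction")


def write_rearrangements(rows, handle, fields=REQUIRED_FIELDS):
    """Write an iterable of dicts to a tab-separated handle with a header row."""
    writer = csv.DictWriter(
        handle, fieldnames=list(fields), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def read_rearrangements(handle):
    """Yield one dict per data row from a tab-separated handle."""
    yield from csv.DictReader(handle, delimiter="\t")


def missing_fields(handle, required=REQUIRED_FIELDS):
    """Return the required column names that are absent from the header."""
    header = csv.DictReader(handle, delimiter="\t").fieldnames or []
    return [name for name in required if name not in header]


if __name__ == "__main__":
    buffer = io.StringIO()
    example = [{
        "sequence_id": "seq1",
        "sequence": "ACGT",
        "v_call": "IGHV1-2*02",
        "j_call": "IGHJ4*02",
        "junction": "TGT",
    }]
    write_rearrangements(example, buffer)
    buffer.seek(0)
    print(missing_fields(buffer))
```

A header check of this kind catches only missing columns. Validation against the full schema would also need to cover field types and controlled vocabularies.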