arXiv:2005.08290v1 [q-bio.GN] 17 May 2020



Longitudinal high-throughput TCR repertoire profiling reveals the dynamics of T cell memory formation after mild COVID-19 infection

Anastasia A. Minervina1, Ekaterina A. Komech1,2, Aleksei Titov3, Meriem Bensouda Koraichi4, Elisa Rosati5, Ilgar Z. Mamedov1,2,6,7, Andre Franke5, Grigory A. Efimov3, Dmitriy M. Chudakov1,2,6,8, Thierry Mora4, Aleksandra M. Walczak4, Yuri B. Lebedev1,9, Mikhail V. Pogorelyy1,2

1 Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
2 Pirogov Russian National Research Medical University, Moscow, Russia
3 National Research Center for Hematology, Moscow, Russia
4 Laboratoire de physique de l'École normale supérieure, ENS, PSL, Sorbonne Université, Université de Paris, and CNRS, Paris, France
5 Institute of Clinical Molecular Biology, Kiel University, Kiel, Germany
6 Masaryk University, Central European Institute of Technology, Brno, Czech Republic
7 V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology and Perinatology, Moscow, Russia
8 Center of Life Sciences, Skoltech, Moscow, Russia
9 Moscow State University, Moscow, Russia

COVID-19 is a global pandemic caused by the SARS-CoV-2 coronavirus. The T cell response is a critical part of both individual and herd immunity to SARS-CoV-2 and the efficacy of developed vaccines. However, neither the dynamics and cross-reactivity of the SARS-CoV-2-specific T cell response nor the diversity of resulting immune memory are well understood. In this study we use longitudinal high-throughput T cell receptor sequencing to track changes in the T cell repertoire following two mild cases of COVID-19 infection. In both donors we identified SARS-CoV-2-responding CD4+ and CD8+ T cell clones. We describe characteristic motifs in TCR sequences of COVID-19-reactive clones, suggesting the existence of immunodominant epitopes. We show that in both donors the majority of infection-reactive clonotypes acquire memory phenotypes. Certain CD4+ clones were detected in the memory fraction at the pre-infection timepoint, suggesting participation of pre-existing cross-reactive memory T cells in the immune response to SARS-CoV-2.

INTRODUCTION

COVID-19 is a global pandemic caused by the novel SARS-CoV-2 betacoronavirus [1]. T cells are crucial for clearing respiratory viral infections and providing long-term immune memory [2, 3]. Two major subsets of T cells participate in the immune response to viral infection in different ways: activated CD8+ T cells directly kill infected cells, while subpopulations of CD4+ T cells produce signaling molecules that regulate myeloid cell behaviour, drive and support the CD8 response and the formation of long-term CD8 memory, and participate in the selection and affinity maturation of antigen-specific B cells, ultimately leading to the generation of neutralizing antibodies. In SARS-1 survivors, antigen-specific memory T cells were detected up to 11 years after the initial infection, when virus-specific antibodies were undetectable [4, 5]. The T cell response was shown to be critical for protection in SARS-1-infected mice [6]. Patients with X-linked agammaglobulinemia, a genetic disorder associated with a lack of B cells, have been reported to recover from symptomatic COVID-19 [7, 8], suggesting that in some cases T cells are sufficient for viral clearance.

Thevarajan et al. showed that activated CD8+HLA-DR+CD38+ T cells in a mild case of COVID-19 significantly expand following symptom onset, reaching their peak frequency of 12% of CD8+ T cells on day 9 after symptom onset, and contract thereafter [9]. Given the average time of 5 days from infection to the onset of symptoms [10], the dynamics and magnitude of the T cell response to SARS-CoV-2 are similar to those observed after immunization with live vaccines [11]. The exact immunodominant CD8+ and CD4+ SARS-CoV-2 epitopes are not yet known. However, SARS-CoV-2-specific T cells were detected in COVID-19 survivors by activation following stimulation with SARS-CoV-2 proteins [12], or by viral protein-derived peptide pools [13, 14]. Some of the T cells activated by peptide stimulation were shown to have a memory phenotype [13], and some potentially cross-reactive CD4+ T cells were found in healthy donors [14, 15].

T cells recognise short pathogen-derived peptides presented on the cell surface by the Major Histocompatibility Complex (MHC) using hypervariable T cell receptors (TCR). TCR repertoire sequencing allows for the quantitative tracking of T cell clones in time, as they go through the expansion and contraction phases of the response. It was previously shown that quantitative longitudinal TCR sequencing is able to identify antigen-specific expanding and contracting T cells in response to yellow fever vaccination with high sensitivity and specificity [16–18]. Not only clonal expansion but also significant contraction from the peak of the response is a distinctive trait of T cell clones specific to the virus [17].

In this study we use longitudinal TCRalpha and TCRbeta repertoire sequencing to quantitatively track T cell clones that significantly expand and contract after mild COVID-19, and determine their phenotype. We reveal the dynamics and the phenotype of the memory cells formed after infection, identify pre-existing T cell memory clones participating in the response, and describe public TCR sequence motifs of SARS-CoV-2-reactive clones, suggesting a response to immunodominant epitopes.

RESULTS

On March 14th (day 0) donor W, a 27-year-old female, and donor M, a 28-year-old male, returned to Russia from the Rhône-Alpes region of France, one of the centers of the COVID-19 outbreak in France at the time. Upon arrival, according to local regulations, they were put into strict self-quarantine for 14 days. On day 3 of self-isolation both developed low-grade fever, fatigue and myalgia, which lasted 4 days and was followed by a temporary loss of smell for donor M. On days 15, 30, 37 and 45 we collected peripheral blood samples from both donors. The presence of SARS-CoV-2-specific antibodies in the plasma was measured at all timepoints using a SARS-CoV-2 S-RBD domain-specific ELISA (Fig. S1). From each blood sample we isolated PBMCs (peripheral blood mononuclear cells, in two biological replicates), CD4+ T cells, and CD8+ T cells. Additionally, on days 30 and 45 we isolated four T cell memory subpopulations (Fig. S2): Effector Memory (EM: CCR7-CD45RA-), Effector Memory with CD45RA re-expression (EMRA: CCR7-CD45RA+), Central Memory (CM: CCR7+CD45RA-), and Stem Cell-like Memory (SCM: CCR7+CD45RA+CD95+). From all samples we isolated RNA and performed TCRalpha and TCRbeta repertoire sequencing as previously described [19]. For both donors, TCRalpha and TCRbeta repertoires had been obtained for other projects one and two years prior to infection. Additionally, TCR repertoires of multiple samples for donor M, including sorted memory subpopulations, are available from a published longitudinal TCR sequencing study after yellow fever vaccination (donor M1 samples in [16]).
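The marker definitions of the four sorted memory subsets can be written down compactly. The following is only an illustrative sketch of the gating logic stated above; the function name `memory_subset` and the label `"ungated"` are hypothetical and not part of the study's sorting protocol.

```python
def memory_subset(ccr7: bool, cd45ra: bool, cd95: bool = False) -> str:
    """Map marker positivity to the memory subsets named in the text.

    Hypothetical helper: True means the marker is expressed (+).
    """
    if not ccr7 and not cd45ra:
        return "EM"    # Effector Memory: CCR7-CD45RA-
    if not ccr7 and cd45ra:
        return "EMRA"  # Effector Memory re-expressing CD45RA: CCR7-CD45RA+
    if ccr7 and not cd45ra:
        return "CM"    # Central Memory: CCR7+CD45RA-
    # CCR7+CD45RA+ cells count as SCM only when CD95+; CD95- cells fall
    # outside the four sorted gates (label chosen here for illustration).
    return "SCM" if cd95 else "ungated"


if __name__ == "__main__":
    for markers in [(False, False, False), (False, True, False),
                    (True, False, False), (True, True, True)]:
        print(markers, memory_subset(*markers))
```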

Based on previously described activated T cell dynamics for SARS-CoV-2 [9] and on immunization with live vaccines [11], the peak of the T cell expansion is expected around day 15 post-infection, and responding T cells are expected to contract significantly by day 45. However, Weiskopf et al. [13] report an increase of SARS-CoV-2-reactive T cells at later timepoints, peaking in some donors after 30 days following symptom onset. To identify groups of T cell clones with similar dynamics in an unbiased way, we used Principal Component Analysis (PCA) in the space of T cell clonal trajectories (Fig. 1b and c). In both donors, and in both TCRalpha and TCRbeta repertoires, we identified three clusters of clones with distinct dynamics. The first cluster (Fig. 1b,c, purple) corresponded to abundant TCR clonotypes which had constant concentrations across timepoints, the second cluster (Fig. 1b,c, green) showed contraction dynamics from day 15 to day 45, and the third cluster (Fig. 1b,c, yellow) showed an unexpected clonal expansion from day 15 with a peak on day 37 followed by contraction. The clustering and dynamics are similar in both donors and are reproduced in TCRbeta (Fig. 1b,c) and TCRalpha (Fig. S3) repertoires.

We next used edgeR, a software package for differential gene expression analysis [20], to specifically detect changes in clonotype concentration between pairs of timepoints in a statistically reliable way. edgeR uses biological replicate samples collected at each timepoint to train a negative-binomial noise model. We identified 374 TCRalpha and 373 TCRbeta clonotypes in donor W, and 797 TCRalpha and 672 TCRbeta clonotypes in donor M, that significantly contracted from day 15 to day 45 (largely overlapping with cluster 2 of clonal trajectories). 438 TCRalpha and 533 TCRbeta clonotypes for donor W, and 531 TCRalpha and 599 TCRbeta clonotypes for donor M, were significantly expanded from day 15 to day 37 (corresponding to cluster 3 of clonal trajectories). Reproducing the analysis using NoisET, a Bayesian differential expansion model [21], yielded similar results (see Fig. S4), suggesting that our results are robust to the choice of statistical model and parameter inference procedure.

Note that, to identify putatively SARS-CoV-2-reactive clones, we only used post-infection timepoints, so that our analysis can be reproduced in other patients and studies where pre-infection timepoints are unavailable. However, tracking the identified responding clones back to pre-infection timepoints reveals strong clonal expansions from pre- to post-infection (Fig. 1d,e, Fig. S3c,d). For brevity, we further refer to clonotypes significantly contracted from day 15 to day 45 as contracting clones and to clonotypes significantly expanded from day 15 to day 37 as expanding clones. For each contracting and expanding clone we determined the CD4/CD8 phenotype using separately sequenced repertoires of CD4+ and CD8+ subpopulations (see Methods). Both CD4+ and CD8+ subsets participated actively in the response (Fig. 1d,e). Interestingly, clonotypes expanding after day 15 were significantly biased towards the CD4+ phenotype, while contracting clones had balanced CD4/CD8 phenotype fractions in both donors (Fisher exact test, p < 0.0001 for both donors).
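To make the last comparison concrete, a two-sided Fisher exact test on a 2x2 table (expanding vs. contracting clones, CD4+ vs. CD8+ phenotype) can be sketched as below. This is a minimal pure-Python illustration, not the code used in the study; the function name is our own, and the clone counts in the usage example are hypothetical placeholders rather than values from the paper.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables with the same margins that are at least as
    # unlikely as the observed one (small tolerance for float ties)
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts: rows = expanding / contracting clones,
    # columns = CD4+ / CD8+ phenotype.
    p = fisher_exact_two_sided(400, 100, 250, 250)
    print(f"two-sided p-value: {p:.3g}")
```

Summing the probabilities of all tables no more likely than the observed one is the usual convention for the two-sided test; other conventions exist and can give slightly different p-values.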

[Figure 1. Panels b, c: PC1 vs. PC2, and average clone trajectory vs. days (15, 30, 37, 45). Panels d, e: fraction of reactive cells (log scale, 10^-4 to 10^-2) vs. timepoint after SARS-CoV-2 infection (2 yrs and 1 yr before; days 0, 15, 30, 37, 45), shown for Total, CD4 and CD8 cells in donors W and M.]

FIG. 1. Longitudinal tracking of T cell clones after mild COVID-19. a, Study design. Peripheral blood of two donors was sampled longitudinally on days 15, 30, 37, 45 after arrival in Russia. At each timepoint, we evaluated SARS-CoV-2-specific antibodies using ELISA (Fig. S1) and isolated PBMCs in two biological replicates. Additionally, CD4+ and CD8+ cells were isolated from a separate portion of blood, and EM, CM, EMRA, SCM memory subpopulations were FACS sorted on days 30 and 45. For each sample we sequenced TCRalpha and TCRbeta repertoires. For both donors pre-infection PBMC repertoires were sampled in 2018 and 2019 for other projects. b,c, PCA of clonal temporal trajectories identifies three groups of clones with distinctive behaviours. Left: first two principal components of the 1000 most abundant TCRbeta clonotype normalized frequencies in PBMC at post-infection timepoints. Right: mean ± 2.96 SE of the normalized trajectory of TCR clones in each cluster. Contracting (d) and expanding (e) clones include both CD4+ and CD8+ T cells, and are less abundant in pre-infection repertoires. T cell clones significantly contracted from day 15 to day 45 (d) and significantly expanded from day 15 to day 37 (e) were identified using edgeR in both donors. The fraction of contracting (d) and expanding (e) TCRbeta clonotypes in the total repertoire (corresponding to the fraction of responding cells of all T cells) is plotted in log-scale for all reactive clones (left), and for reactive clones with the CD4 (middle) and CD8 (right) phenotypes. Similar dynamics were observed in TCRalpha repertoires (Fig. S3), and for significantly expanded/contracted clones identified with the NoisET Bayesian differential expansion statistical model (Fig. S4).

On days 30 and 45 we identified both contracting (Fig. 2a,b) and expanding (SI Fig. 5a-c) T cell clones in the memory subpopulations of peripheral blood. Both CD4+ and CD8+ responding clones were found in the CM and EM subsets; however, CD4+ clones were more biased towards CM, and CD8+ clones were much more represented in the EMRA subset. A small number of both CD4+ and CD8+ responding clonotypes were also identified in the SCM subpopulation, which was previously shown to be a long-lived T cell memory subset [22]. Intriguingly, a number of responding CD4+ clones, but very few CD8+ clones, were also represented in the repertoires of both donors 1 and 2 years before the infection. Pre-existing clones expanded after infection and contracted afterwards in both donors (SI Fig. 6). For donor M, for whom we had previously sequenced memory subpopulations before the infection [16], we were able to identify pre-existing SARS-CoV-2-reactive CD4+ clones in the CM subpopulation 1 year before the infection. Interestingly, on day 30 after infection the majority of pre-infection CM clones were detected in the EM subpopulation, suggesting recent T cell activation and a switch of the phenotype from memory to effector. These clones might represent memory T cells cross-reactive for other infections, e.g. other human coronaviruses. A search for the TCRbeta amino acid sequences of responding clones in VDJdb [23], a database of TCRs with known specificity, resulted in essentially no overlap: a total of 3 matches, all of which corresponded to the EBV (Epstein-Barr virus) epitope presented by the HLA-A*03 MHC allele, which is absent in both donors (SI Table 1). At the time of the analysis there were no SARS-1, SARS-CoV-2 or any other coronavirus-specific TCR sequences in VDJdb, so the absence of matches suggests that contracting and expanding clones are unlikely to be specific for immunodominant epitopes of common pathogens covered in VDJdb.

[Figure 2. Panels a-c: fraction of reactive clones (0 to 0.6) for CD4 and CD8 clonotypes in donors M and W, shown for 2 years before, 1 year before, memory day 30 and memory day 45, and split by CM, EM, EMRA and SCM subsets. Panels d-g: similarity networks for CD4 TCR alpha, CD4 TCR beta, CD8 TCR alpha and CD8 TCR beta clonotypes of donors W and M, with sequence logos (bits vs. CDR3 position) for TRAV35/TRAJ42 (n=39), TRAV9-2/TRAJ42 (n=8), TRBV3-1/TRBJ1-1 (n=12) and TRBV5-1/TRBJ1-5 (n=12). Logo colours: aromatic; negatively charged; nonpolar, aliphatic; polar, uncharged; positively charged.]

FIG. 2. Memory phenotypes and TCR sequence motifs of responding clonotypes contracting after day 15. a, A large fraction of contracting clonotypes is identified in T cell memory subsets after infection. Fraction of contracting CD4+ and CD8+ TCRbeta clonotypes present in 1-year and 2-year pre-infection PBMC and in at least one of the memory subpopulations sampled on day 30 and day 45 post infection. b, Responding clones are found in different memory subsets. For both W (left) and M (right) donors, CD4+ clonotypes were more frequently found in the Central Memory (CM) subset than CD8+ T cells, and both CD4+ and CD8+ contracting clonotypes were present in the Effector Memory (EM) compartment. c, For donor M (right), CD4+ (but few CD8+) contracting clonotypes are also identified in memory subsets 1 year before the infection, with a bias towards the CM subpopulation. d-g, Analysis of TCR amino acid sequences of contracting clones reveals distinctive motifs. Each vertex in the similarity network corresponds to a contracting clonotype. An edge indicates 2 or fewer amino acid mismatches in the CDR3 region (and identical V and J segments). Networks are plotted separately for CD4alpha (d), CD4beta (e), CD8alpha (f) and CD8beta (g) contracting clonotypes. Clonotypes without neighbours are not shown. Sequence logos corresponding to the largest clusters are shown under the corresponding network plots.

It was previously shown that TCRs recognising the same antigens frequently have highly similar TCR sequences [24, 25]. To identify motifs in TCR amino acid sequences, we plotted similarity networks for significantly contracted (Fig. 2d-g) and expanded (Fig. S5d-g) clonotypes. In both donors we found clusters of highly similar clones in both CD4+ and CD8+ subsets for expanding and contracting clonotypes. Clusters were largely donor-specific, as expected, since our donors have dissimilar HLA alleles (SI Table 1) and thus each is likely to present a non-overlapping set of T cell antigens. The largest cluster, described by the motif TRAV35-CAGXNYGGSQGNLIF-TRAJ42, was identified in donor M's CD4+ contracting alpha chains. Clones from this cluster constituted 15.3% of all of donor M's CD4+ responding cells on day 15, suggesting a response to an immunodominant CD4+ epitope in the SARS-CoV-2 proteome. The high similarity of the TCR sequences of responding clones in this cluster allowed us to independently identify motifs from donor M's CD4 alpha contracting clones using the ALICE algorithm [26] (Fig. S8). While the time-dependent methods (Fig. 1) identify abundant clones, the ALICE approach is complementary to both edgeR and NoisET as it identifies clusters of T cells with similar sequences independently of their individual abundances. Finding highly similar responding TCR sequences in both donors, despite their different HLA types, suggests that some of the identified responding clones could be public, and are likely to be found in other SARS-CoV-2-infected individuals. Using a mechanistic model of TCR recombination [27, 28] we predicted the probability of occurrence of the reactive clones (see Methods). For TCRbeta we also looked for the identified contracting and expanding clones in a large public dataset of TCRbeta repertoires from [29]. The TCRbeta occurrence probability correlated well with the frequency of contracting and expanding clones in this public dataset (Fig. S7), suggesting that at least a fraction of the SARS-CoV-2-responding clones are public and are likely to be shared between patients.
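The edge rule used for the similarity networks (identical V and J segments, at most 2 amino acid mismatches in CDR3) can be sketched as follows. This is a minimal illustration rather than the code used for the figures: we assume that "mismatches" means Hamming distance between CDR3s of equal length, the function names are our own, and the example CDR3 sequences are hypothetical instances of the reported motif, not clonotypes from the data.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def similarity_edges(clonotypes, max_mismatches=2):
    """Return index pairs of clonotypes joined by an edge.

    clonotypes: list of (v_segment, j_segment, cdr3_amino_acids) tuples.
    Assumption: only CDR3s of equal length are compared (no indels).
    """
    edges = []
    for (i, (v1, j1, s1)), (k, (v2, j2, s2)) in combinations(
            enumerate(clonotypes), 2):
        if (v1 == v2 and j1 == j2 and len(s1) == len(s2)
                and hamming(s1, s2) <= max_mismatches):
            edges.append((i, k))
    return edges


if __name__ == "__main__":
    # Hypothetical clonotypes; the first two fit CAGXNYGGSQGNLIF.
    clones = [
        ("TRAV35", "TRAJ42", "CAGQNYGGSQGNLIF"),
        ("TRAV35", "TRAJ42", "CAGLNYGGSQGNLIF"),
        ("TRAV9-2", "TRAJ42", "CAGQNYGGSQGNLIF"),  # different V segment
    ]
    print(similarity_edges(clones))
```

Clusters then correspond to connected components of the resulting graph, and clonotypes with no edges are the ones omitted from the network plots.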

DISCUSSION

Using longitudinal repertoire sequencing, we identified a group of CD4+ and CD8+ T cell clones that contract after recovery from a coronavirus infection. Our response timelines agree with the T cell dynamics reported by Thevarajan et al. [9] for mild COVID-19, as well as with the dynamics of the T cell response to live vaccines [11]. Surprisingly, in both donors we also identified a group of predominantly CD4+ clonotypes which expanded from day 15 to day 37 after the infection. One possible explanation for this second wave of expansion is the priming of CD4+ T cells by antigen-specific B cells, but there might be other mechanisms such as the migration of SARS-CoV-2-specific T cells from lymphoid organs. It is also possible that later expanding T cells are triggered by another infection, simultaneously and asymptomatically occurring in both donors around day 30. The major limitation of our study is that we do not know the specificity of contracted and expanded clones observed after infection. However, accumulation of TCR sequences from SARS-CoV-2-specific T cells will allow us to connect the TCR motifs we described to particular epitopes of the virus and estimate their relative contribution to the immune response. Databases of SARS-CoV-2-associated TCR sequences combined with sequencing of patients' TCR repertoires will allow for the tracking of the adaptive immune response in the early stages of the disease, and possibly identify features correlating with clinical outcomes. We showed that a large fraction of putatively SARS-CoV-2-reactive T cell clones are later found in memory subpopulations, and a subset of CD4+ clones were identified in pre-infection central memory subsets. The presence of SARS-CoV-2 cross-reactive CD4+ T cells in healthy individuals was recently demonstrated [14, 15]. Our data further suggest that cross-reactive CD4+ T cells can participate in the response in vivo. It is interesting to ask if the presence of cross-reactive T cells before infection is linked to the mildness of the disease. Larger studies with cohorts of severe and mild cases with pre-infection timepoints are needed to address this question.

METHODS

Donors and blood samples

Peripheral blood samples from two healthy volunteers, donor W (female, 27 years old) and donor M (male, 28 years old), were collected with written informed consent in a certified diagnostics laboratory. Both donors gave written informed consent to participate in the study under the Declaration of Helsinki. HLA alleles of both donors (SI Table 1) were determined by an in-house cDNA high-throughput sequencing method.

SARS-CoV-2 S-RBD domain specific ELISA

An ELISA assay kit developed by the National Research Center for Hematology was used for detection of anti-S-RBD IgG according to the manufacturer's protocol. The relative IgG level was calculated by dividing the OD (optical density) values by the mean OD value of the cut-off positive control serum supplied with the kit. OD values of 2 samples (d37 and d45) for donor M exceeded the limit of linearity for the kit. In order to properly compare the relative IgG levels between d30, d37 and d45, these samples were diluted 1:400 instead of 1:100; the ratios d37:d30 and d45:d30 were then calculated and used to derive the relative IgG levels of d37 and d45.
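The arithmetic behind the relative IgG level can be illustrated with a short sketch. It reflects our reading of the procedure above, namely that for out-of-range samples the OD ratio to d30 at the higher dilution is multiplied by the d30 relative level; this interpretation, the function names, and all OD values below are assumptions made for illustration and are not taken from the kit protocol or the measured data.

```python
def relative_igg(od: float, cutoff_control_ods: list[float]) -> float:
    """Relative IgG level: sample OD over the mean OD of the cut-off control."""
    mean_cutoff = sum(cutoff_control_ods) / len(cutoff_control_ods)
    return od / mean_cutoff


def rescaled_relative_igg(rel_reference: float,
                          od_sample_diluted: float,
                          od_reference_diluted: float) -> float:
    """Relative IgG for a sample above the linear range.

    Assumed interpretation: the sample and the in-range reference (d30)
    are re-measured at the higher dilution, and their OD ratio scales
    the reference's relative IgG level.
    """
    return rel_reference * (od_sample_diluted / od_reference_diluted)


if __name__ == "__main__":
    # Hypothetical OD readings, for illustration only.
    rel_d30 = relative_igg(1.2, [0.29, 0.31])            # measured at 1:100
    rel_d37 = rescaled_relative_igg(rel_d30, 0.9, 0.3)   # both at 1:400
    print(rel_d30, rel_d37)
```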

Isolation of PBMCs and T cell subpopulations

PBMCs were isolated with the Ficoll-Paque density gradient centrifugation protocol. CD4+ and CD8+ T cells were isolated from PBMCs with Dynabeads CD4+ and CD8+ positive selection kits respectively. For isolation of EM, EMRA, CM and SCM memory subpopulations we stained PBMCs with the following antibody mix: anti-CD3-FITC (UCHT1, eBioscience), anti-CD45RA-eFluor450 (HI100, eBioscience), anti-CCR7-APC (3D12, eBioscience), anti-CD95-PE (DX2, eBioscience). Cell sorting was performed on a FACS Aria III; all four isolated subpopulations were lysed with Trizol reagent immediately after sorting.

TCR library preparation and sequencing

TCRalpha and TCRbeta cDNA library preparation was performed as previously described [19]. RNA was isolated from each sample using Trizol reagent according to the manufacturer's instructions. A universal primer binding site, sample barcode and unique molecular identifier (UMI) sequences were introduced using the 5'RACE technology with TCRalpha and TCRbeta constant segment-specific primers for cDNA synthesis. cDNA libraries were amplified in two PCR steps, with introduction of the second sample barcode and Illumina TruSeq adapter sequences at the second PCR step. Libraries were sequenced using the Illumina NovaSeq platform (2x150 bp read length).

TCR repertoire data analysis

Raw data preprocessing. Raw sequencing data were demultiplexed and UMI-guided consensuses were built using migec v.1.2.7 [30]. Resulting UMI consensuses were aligned to V and J genomic templates of the TRA and TRB loci and assembled into clonotypes with mixcr v.2.1.11 [31]. See SI Table 2 for the number of cells, UMIs and unique clonotypes for each sample.

Identification of clonotypes with active dynamics. Principal component analysis (PCA) of clonal trajectories was performed as described before [16]. First we selected clones which were among the 1000 most abundant in any of the post-infection PBMC repertoires. Next, for each clone we calculated its frequency at each post-infection timepoint and divided this frequency by the maximum frequency of this clone for normalization. Then we performed PCA on the resulting normalized clonal trajectory matrix and identified three clusters of trajectories by hierarchical clustering with complete linkage, using Euclidean distances between trajectories. We identified statistically significant contractions and expansions with edgeR as previously described [17], using an FDR-adjusted p < 0.01 and a log2 fold change threshold of 1. NoisET implements the Bayesian detection method described in [21]. Briefly, a two-step noise model accounting for cell sampling and expression noise is inferred from replicates, and a second model of expansion is learned from the two timepoints to be compared. The procedure outputs the posterior probability of expansion or contraction, and the median estimated log2 fold change, whose thresholds are set to 0.05 and 1 respectively.
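The normalization and clustering steps can be sketched in a few lines. This is a simplified pure-Python illustration, not the pipeline used in the study: the PCA projection is omitted and complete-linkage agglomerative clustering is applied directly to the normalized trajectories with Euclidean distances; the function names and the example frequencies (days 15, 30, 37, 45) are hypothetical.

```python
from math import dist


def normalize(trajectory):
    """Divide each frequency by the clone's maximum frequency."""
    peak = max(trajectory)
    return [x / peak for x in trajectory]


def complete_linkage(points, n_clusters):
    """Agglomerative clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Complete linkage: the largest pairwise Euclidean distance
        return max(dist(points[a], points[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest complete-linkage distance
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


if __name__ == "__main__":
    # Hypothetical clone frequencies on days 15, 30, 37 and 45.
    trajectories = [
        [10, 10, 10, 10], [5, 5, 5, 5],   # stable
        [8, 4, 2, 1], [16, 8, 5, 2],      # contracting after day 15
        [1, 3, 8, 4], [2, 5, 12, 6],      # expanding, peak on day 37
    ]
    normalized = [normalize(t) for t in trajectories]
    print(complete_linkage(normalized, 3))
```

With real data one would typically rely on an established implementation of PCA and hierarchical clustering; the naive loop here scales poorly but is sufficient for the 1000-clone matrices described above.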

TCR generative probability estimation. Models of V(D)J recombination for the TCRalpha and TCRbeta loci were inferred for each individual using IGoR [32]. Selection models were then inferred and evaluated using SONIA [28] to estimate TCR occurrence probabilities.

Data availability

Raw sequencing data are deposited in the Short Read Archive (SRA), accession PRJNA633317. Processed TCRalpha and TCRbeta repertoire datasets, resulting repertoires of SARS-CoV-2-reactive clones, and raw data preprocessing instructions can be accessed from: https://github.com/pogorely/Minervina_COVID.

ACKNOWLEDGMENTS

The study was supported by RSF 20-15-00351. E.R. and A.F. are supported by the DFG Excellence Cluster Precision Medicine in Chronic Inflammation (Exc2167) and the DFG grant n. 4096610003. M.B.K., T.M., and A.M.W. are supported by European Research Council COG 724208. I.Z.M. is supported by RFBR 19-54-12011 and 18-29-09132. D.M.C. is supported by the grant from the Ministry of Science and Higher Education of the Russian Federation 075-15-2019-1789.

[1] Vabret N, et al. (2020) Immunology of COVID-19: current state of the science. Immunity p S1074761320301837.

[2] Schmidt ME, Varga SM (2018) The CD8 T Cell Response to Respiratory Virus Infections. Frontiers in Immunology 9:678.

[3] Swain SL, McKinstry KK, Strutt TM (2012) Expanding roles for CD4+ T cells in immunity to viruses. Nature Reviews Immunology 12:136–148.

[4] Ng OW, et al. (2016) Memory T cell responses targeting the SARS coronavirus persist up to 11 years post-infection. Vaccine 34:2008–2014.

[5] Oh HLJ, et al. (2011) Engineering T Cells Specific for a Dominant Severe Acute Respiratory Syndrome Coronavirus CD8 T Cell Epitope. Journal of Virology 85:10464–10471.

[6] Zhao J, Zhao J, Perlman S (2010) T Cell Responses Are Required for Protection from Clinical Disease and for Virus Clearance in Severe Acute Respiratory Syndrome Coronavirus-Infected Mice. Journal of Virology 84:9318–9325.


[7] Quinti I, et al. (2020) A possible role for B cells in COVID-19? Lesson from patients with agammaglobulinemia. Journal of Allergy and Clinical Immunology p S0091674920305571.

[8] Soresina A, et al. (2020) Two X-linked agammaglobulinemia patients develop pneumonia as COVID-19 manifestation but recover. Pediatric Allergy and Immunology p pai.13263.

[9] Thevarajan I, et al. (2020) Breadth of concomitant immune responses prior to patient recovery: a case report of non-severe COVID-19. Nature Medicine 26:453–455.

[10] Bi Q, et al. (2020) Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. The Lancet Infectious Diseases p S1473309920302875.

[11] Miller JD, et al. (2008) Human effector and memory CD8+ T cell responses to smallpox and yellow fever vaccines. Immunity 28:710–722.

[12] Ni L, et al. (2020) Detection of SARS-CoV-2-Specific Humoral and Cellular Immunity in COVID-19 Convalescent Individuals. Immunity p S1074761320301813.

[13] Weiskopf D, et al. (2020) Phenotype of SARS-CoV-2-specific T-cells in COVID-19 patients with acute respiratory distress syndrome. medRxiv.

[14] Braun J, et al. (2020) Presence of SARS-CoV-2 reactive T cells in COVID-19 patients and healthy donors. medRxiv.

[15] Grifoni A, et al. (2020) Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals. Cell.

[16] Minervina AA, et al. (2020) Primary and secondary anti-viral response captured by the dynamics and phenotype of individual T cell clones. eLife 9:e53704.

[17] Pogorelyy MV, et al. (2018) Precise tracking of vaccine-responding T cell clones reveals convergent and personalized response in identical twins. Proceedings of the National Academy of Sciences of the United States of America 115:12704–12709.

[18] DeWitt WS, et al. (2015) Dynamics of the cytotoxic T cell response to a model of acute viral infection. Journal of Virology 89:4517–4526.

[19] Pogorelyy MV, et al. (2017) Persisting fetal clonotypes influence the structure and overlap of adult human T cell receptor repertoires. PLoS Computational Biology 13:e1005572.

[20] Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics (Oxford, England) 26:139–140.

[21] Puelma Touzel M, Walczak AM, Mora T (2020) Inferring the immune response from repertoire sequencing. PLOS Computational Biology 16:e1007873.

[22] Fuertes Marraco SA, et al. (2015) Long-lasting stem cell-like memory CD8+ T cells with a naïve-like profile upon yellow fever vaccination. Science Translational Medicine 7:282ra48.

[23] Bagaev DV, et al. (2020) VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Research 48:D1057–D1062.

[24] Dash P, et al. (2017) Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547:89–93.

[25] Glanville J, et al. (2017) Identifying specificity groups in the T cell receptor repertoire. Nature 547:94–98.

[26] Pogorelyy MV, et al. (2019) Detecting T cell receptors involved in immune responses from single repertoire snapshots. PLOS Biology 17:e3000314.

[27] Sethna Z, Elhanati Y, Callan CG, Walczak AM, Mora T (2019) OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs. Bioinformatics 35:2974–2981.

[28] Sethna Z, et al. (2020) Population variability in the generation and thymic selection of T-cell repertoires. arXiv:2001.02843 [q-bio].

[29] Emerson RO, et al. (2017) Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nature Genetics 49:659–665.

[30] Shugay M, et al. (2014) Towards error-free profiling of immune repertoires. Nature Methods 11:653–655.

[31] Bolotin DA, et al. (2015) MiXCR: software for comprehensive adaptive immunity profiling. Nature Methods 12:380–381.

[32] Marcou Q, Mora T, Walczak AM (2018) High-throughput immune repertoire analysis with IGoR. Nature Communications 9:561.


[Figure S1 plot. x-axis: days after SARS-CoV-2 infection (0, 15, 30, 37, 45); y-axis: relative IgG level; one series per donor (W, M).]

FIG. S1. Both donors developed anti-SARS-CoV-2 IgG response by day 30 post infection. The relative level of SARS-CoV-2 S-RBD domain specific IgG (y-axis) is plotted against time. Solid black line shows the threshold for positive testing.

[Figure S2 flow cytometry plots. Axes: FSC-A, SSC-A, FSC-W, FSC-H, SSC-W, SSC-H, CD3, CD45RA, CCR7, CD95. Gates: Lymphocytes, Singlets, CD3+, Naive, CM, EM, EMRA, TSCM.]

FIG. S2. Memory subpopulation gating strategy. Three populations of memory T cells (EM, CM and EMRA) are defined by the CCR7 and CD45RA markers; SCM are distinguished from naive CCR7+ CD45RA+ T cells by CD95 expression.


[Figure S3 panels. a, b (donors W, M): PC1 versus PC2 scatter plot, and average clone trajectory versus days (15, 30, 37, 45). c, d: fraction of reactive cells (log scale) versus timepoint after SARS-CoV-2 infection (2 yrs and 1 yr before; days 0, 15, 30, 37, 45), shown for Total, CD4 and CD8, donors W and M.]

FIG. S3. Longitudinal tracking of T cell clones after mild COVID-19 with TCRalpha repertoire sequencing. a,b, PCA of clonal temporal trajectories identifies three groups of clones with distinctive behaviours. Left: first two principal components of the normalized frequencies of the 1000 most abundant TCRalpha clonotypes at post-infection PBMC timepoints. Right: mean ± 2.96 SE of the normalized trajectory of TCR clones in each cluster. c,d, Contracting and expanding clones include both CD4+ and CD8+ T cells and are less abundant in pre-infection repertoires. T cell clones significantly contracted from day 15 to day 45 (c) and significantly expanded from day 15 to day 37 (d) were identified using edgeR in both donors. The fraction of contracting (c) and expanding (d) TCRalpha clonotypes in the total repertoire (corresponding to the fraction of responding cells of all T cells) is plotted in log-scale for all reactive clones (left), reactive clones with the CD4 (middle) and the CD8 phenotype (right).
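The plotted quantity, the fraction of the total repertoire occupied by reactive clonotypes at one timepoint, can be sketched as follows. This is an illustration only: the function name, the clonotype labels and the counts are invented, and using per-clonotype UMI counts as a proxy for cell numbers is an assumption of this sketch.

```python
import math


def reactive_fraction(clone_counts, reactive_clones):
    """Share of all counts at one timepoint that belongs to reactive clonotypes.

    clone_counts: dict mapping clonotype label -> count (e.g. UMIs; assumption).
    reactive_clones: set of clonotype labels called contracting or expanding.
    """
    total = sum(clone_counts.values())
    if total == 0:
        raise ValueError("repertoire is empty")
    reactive = sum(
        count for clone, count in clone_counts.items() if clone in reactive_clones
    )
    return reactive / total


# Invented toy repertoire for a single timepoint (not study data).
toy_counts = {"cloneA": 30, "cloneB": 10, "cloneC": 960}
toy_reactive = {"cloneA", "cloneB"}
toy_fraction = reactive_fraction(toy_counts, toy_reactive)
print(toy_fraction, math.log10(toy_fraction))  # fraction and its log10, as plotted on a log scale
```

Repeating the calculation per timepoint, and separately for clones assigned the CD4 or CD8 phenotype, gives curves of the kind shown in panels c and d.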

[Figure S4 panels. a, d: fraction of reactive cells (log scale) versus timepoint after SARS-CoV-2 infection (2 yrs and 1 yr before; days 0, 15, 30, 37, 45) for donor W and donor M, with series for NoisET and edgeR. b, c, e, f: overlap diagrams for donor W and donor M; counts as extracted: 166, 207, 90; 177, 595, 350; 243, 290, 50; 121, 478, 566.]

FIG. S4. Comparison of clonal expansion detection procedures. The fraction (plotted in log-scale) of contracting (a) and expanding (d) TCRbeta clonotypes in the total repertoire was estimated using subsets of expanded and contracted clones called by the edgeR (green) and NoisET (purple) models. Overlaps for contracted clones (b,c) and expanded clones (e,f) identified by both models are shown on the right.


[Figure S5 panels. a: fraction of reactive clones (CD4, CD8) for donors M and W, with categories 2 years before, 1 year before, memory day 30 and memory day 45. b, c: fraction of reactive clones in the CM, EM, EMRA and SCM subsets for CD4 and CD8, at memory day 30 and memory day 45 (b) and additionally at memory 1 year before (c). d-g: similarity networks for CD4 TCR alpha, CD4 TCR beta, CD8 TCR alpha and CD8 TCR beta with donor legend (W, M), and sequence logos (y-axis: bits; x-axis: CDR3 position) for clusters TRAV12-3/TRAJ6 (n=12), TRBV2/TRBJ2-1 (n=7) and TRBV4-1/TRBJ2-3 (n=9). Logo colour classes: aromatic; negatively charged; nonpolar, aliphatic; polar, uncharged; positively charged.]

FIG. S5. Memory phenotypes and TCR sequence motifs of responding clonotypes expanding from day 15 to day 37. a, A large fraction of expanding clonotypes is identified in T cell memory subsets after infection. The fraction of expanding CD4+ and CD8+ TCRbeta clonotypes present in 1-year and 2-year pre-infection PBMC and in at least one of the memory subpopulations sampled on day 30 and day 37 post infection is shown. b, Responding clones are found in different memory subsets. For both W (left) and M (right) donors, CD4+ clonotypes were more frequently found in the Central Memory (CM) subset than CD8+ T cells, and both CD4+ and CD8+ expanding clonotypes were present in the Effector Memory (EM) compartment. c, For donor M (right), CD4+ (but few CD8+) expanding clonotypes are also identified in memory subsets 1 year before the infection, with a bias towards the CM subpopulation. d, Analysis of TCR amino acid sequences of expanding clones revealed distinctive motifs. Each vertex in the similarity network corresponds to an expanding clonotype. An edge indicates 2 or fewer amino acid mismatches in the CDR3 region (the same V and J segments are required for connected vertices). Networks are plotted separately for CD4alpha (d), CD4beta (e), CD8alpha (f) and CD8beta (g) expanding clonotypes. Clonotypes without neighbours are not shown. Sequence logos corresponding to the largest clusters are shown under the corresponding network plots.
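The edge rule used for the similarity networks (same V and J segments, at most 2 amino acid mismatches in CDR3) can be sketched as below. This is a hedged illustration, not the authors' code: the function names are hypothetical, the CDR3 sequences are invented, and counting mismatches as Hamming distance between CDR3s of equal length is an assumption of this sketch.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def similarity_edges(clonotypes, max_mismatches=2):
    """Return index pairs of clonotypes that are neighbours in the network.

    clonotypes: list of (cdr3_amino_acids, v_segment, j_segment) tuples.
    Two clonotypes are connected if they share V and J segments, have CDR3s
    of equal length (assumption) and differ at no more than max_mismatches.
    """
    edges = []
    for (i, a), (j, b) in combinations(enumerate(clonotypes), 2):
        same_vj = a[1] == b[1] and a[2] == b[2]
        same_length = len(a[0]) == len(b[0])
        if same_vj and same_length and hamming(a[0], b[0]) <= max_mismatches:
            edges.append((i, j))
    return edges


# Invented CDR3 sequences for illustration only (not clonotypes from the study).
toy_clonotypes = [
    ("CASSLAGGTDTQYF", "TRBV4-1", "TRBJ2-3"),
    ("CASSLSGGTDTQYF", "TRBV4-1", "TRBJ2-3"),  # 1 mismatch to clonotype 0
    ("CASSQAGATDTQYF", "TRBV4-1", "TRBJ2-3"),  # 2 mismatches to 0, 3 to 1
    ("CASSLAGGTDTQYF", "TRBV2", "TRBJ2-1"),    # same CDR3 as 0, different V and J
    ("CASSPRDEQFF", "TRBV2", "TRBJ2-1"),       # different CDR3 length
]
toy_edges = similarity_edges(toy_clonotypes)
toy_connected = sorted({i for edge in toy_edges for i in edge})
print(toy_edges, toy_connected)
```

Clonotypes that appear in no edge (indices 3 and 4 in the toy data) correspond to the vertices without neighbours that are left out of the plots.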


[Figure S6 panels. a, b: fraction of reactive cells (log scale) versus timepoint after SARS-CoV-2 infection (2 yrs and 1 yr before; days 0, 15, 30, 37, 45), shown for Total, CD4 and CD8, donors W and M.]

FIG. S6. Dynamics of pre-existing SARS-CoV-2 responding clones. The fraction of pre-existing (identified in the -1 yr and/or -2 yr pre-infection timepoint) contracting (a) and expanding (b) TCRbeta clonotypes in the total repertoire (corresponding to the fraction of responding cells of all T cells) is plotted in log-scale for all reactive clones (left), reactive clones with the CD4 (middle) and the CD8 phenotype (right).


[Figure S7 panels. a: number of donors (y-axis) versus Ppost (x-axis, log scale). b: density (y-axis) versus Ppost (x-axis, log scale).]

FIG. S7. TCR occurrence probability predicts public responding clonotypes. a, The probability of TCR generation and selection (Ppost) of TCRbeta clones with active dynamics (both contracted and expanded, from both donors) is strongly correlated with the occurrence of these clones in the large public database of TCRbeta repertoires from Emerson et al. [29]. b, Distribution of the TCR occurrence probability of SARS-CoV-2 responding TCRbeta clonotypes. Dashed line shows the TCR occurrence probability distribution for all clones in the total repertoire. Some identified clones are public and have large recombination probabilities, and thus are expected to be shared with additional SARS-CoV-2 patients.

[Figure S8 plot. Similarity network with a sequence logo (y-axis: bits; x-axis: CDR3 position) for the TRAV35/TRAJ42 cluster.]

FIG. S8. ALICE algorithm output for the TCRalpha PBMC repertoire of donor M on day 15. The similarity network shows ALICE hits (clones in the repertoire with more neighbours than expected by chance), which differ by 2 mismatches or fewer in the TCRalpha amino acid sequence. Darker colors indicate a larger frequency of the clone in the repertoire; vertex size indicates degree. The majority (54%, 99/183) of hits identified by the algorithm correspond to a single large TRAV35/TRAJ42 cluster of CD4+ contracting clones also seen in Fig. 1d.