Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J...

55
1 Revealing the impact of recurrent and rare structural 1 variants in multiple myeloma 2 3 Even H Rustad 1 , Venkata D Yellapantula 1 , Dominik Glodzik 2 , Kylee H Maclachlan 1 , 4 Benjamin Diamond 1 , Eileen M Boyle 3 , Cody Ashby 4 , Patrick Blaney 3 , Gunes Gundem 2 , 5 Malin Hultcrantz 1 , Daniel Leongamornlert 5 , Nicos Angelopoulos 6 , Daniel Auclair 7 , 6 Yanming Zhang 8 , Ahmet Dogan 9 , Niccolò Bolli 10-11 , Elli Papaemmanuil 2 , Kenneth C. 7 Anderson 12 , Philippe Moreau 13 , Herve Avet-Loiseau 14 , Nikhil Munshi 12,15 , Jonathan 8 Keats 16 , Peter J Campbell 5 , Gareth J Morgan 3 , Ola Landgren 1 and Francesco Maura 1 9 10 1 Myeloma Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, 11 New York, NY, USA; 12 2 Epidemiology & Biostatistics, Department of Medicine, Memorial Sloan Kettering 13 Cancer Center, New York, NY, USA; 14 3 NYU Perlmutter Cancer Center, New York, NY, USA; 15 4 Myeloma Center, University of Arkansas for Medical Sciences, Little Rock, AR, USA; 16 5 The Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, 17 Hinxton, Cambridgeshire, United Kingdom; 18 6 School of Computer Science and Electronic Engineering, University of Essex, 19 Colchester, United Kingdom; 20 7 Multiple Myeloma Research Foundation (MMRF), Norwalk, US-CT; 21 8 Cytogenetics Laboratory, Department of Pathology, Memorial Sloan Kettering Cancer 22 Center, New York, NY, USA; 23 9 Hematopathology Service, Department of Pathology, Memorial Sloan Kettering Cancer 24 Center, New York, NY, USA; 25 10 Department of Medical Oncology and Hemato-Oncology, University of Milan, Milan, 26 Italy; 27 11 Department of Medical Oncology and Hematology, Fondazione IRCCS Istituto 28 Nazionale dei Tumori, Milan, Italy; 29 12 Jerome Lipper Multiple Myeloma Center, Dana–Farber Cancer Institute, Harvard 30 Medical School, Boston, MA; 31 13 CRCINA, INSERM, CNRS, Université d’Angers, Université de Nantes, Nantes, 32 France; 33 14 IUC-Oncopole, and CRCT INSERM U1037, 31100, Toulouse, France; 34 15 Veterans Administration Boston Healthcare System, West Roxbury, MA; 35 16. Translational Genomics Research Institute (TGen), Phoenix, AZ, USA; 36 37 Running Title: Impact of structural variants in myeloma 38 39 Key words: Multiple Myeloma, structural variants, hotspots, chromothripsis, templated 40 insertion, chromoplexy. 41 42 43 44 45 (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint this version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086 doi: bioRxiv preprint

Transcript of Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J...

Page 1: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

1

Revealing the impact of recurrent and rare structural 1

variants in multiple myeloma 2 3 Even H Rustad1, Venkata D Yellapantula1, Dominik Glodzik2, Kylee H Maclachlan1, 4 Benjamin Diamond1, Eileen M Boyle3, Cody Ashby4, Patrick Blaney3, Gunes Gundem2, 5 Malin Hultcrantz1, Daniel Leongamornlert5, Nicos Angelopoulos6, Daniel Auclair7, 6 Yanming Zhang8, Ahmet Dogan9, Niccolò Bolli10-11, Elli Papaemmanuil2, Kenneth C. 7 Anderson12, Philippe Moreau13, Herve Avet-Loiseau14, Nikhil Munshi12,15, Jonathan 8 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 9 10 1 Myeloma Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, 11 New York, NY, USA; 12 2 Epidemiology & Biostatistics, Department of Medicine, Memorial Sloan Kettering 13 Cancer Center, New York, NY, USA; 14 3 NYU Perlmutter Cancer Center, New York, NY, USA; 15 4 Myeloma Center, University of Arkansas for Medical Sciences, Little Rock, AR, USA; 16 5 The Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, 17 Hinxton, Cambridgeshire, United Kingdom; 18 6 School of Computer Science and Electronic Engineering, University of Essex, 19 Colchester, United Kingdom; 20 7 Multiple Myeloma Research Foundation (MMRF), Norwalk, US-CT; 21 8 Cytogenetics Laboratory, Department of Pathology, Memorial Sloan Kettering Cancer 22 Center, New York, NY, USA; 23 9 Hematopathology Service, Department of Pathology, Memorial Sloan Kettering Cancer 24 Center, New York, NY, USA; 25 10 Department of Medical Oncology and Hemato-Oncology, University of Milan, Milan, 26 Italy; 27 11 Department of Medical Oncology and Hematology, Fondazione IRCCS Istituto 28 Nazionale dei Tumori, Milan, Italy; 29 12 Jerome Lipper Multiple Myeloma Center, Dana–Farber Cancer Institute, Harvard 30 Medical School, Boston, MA; 31 13 CRCINA, INSERM, CNRS, Université d’Angers, Université de Nantes, Nantes, 32 France; 33 14 IUC-Oncopole, and CRCT INSERM U1037, 31100, Toulouse, France; 34 15 Veterans Administration Boston Healthcare System, West Roxbury, MA; 35 16. Translational Genomics Research Institute (TGen), Phoenix, AZ, USA; 36 37 Running Title: Impact of structural variants in myeloma 38 39 Key words: Multiple Myeloma, structural variants, hotspots, chromothripsis, templated 40 insertion, chromoplexy. 41 42 43 44 45

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 2: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

2

Corresponding Author: 46 Francesco Maura, MD, 47 Myeloma Service, Department of Medicine 48 Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065 49 T: +1 646.608.3934 | F: +1 646.277.7116 50 [email protected] 51 52

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 3: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

3

Summary 53

The landscape of structural variants (SVs) in multiple myeloma remains poorly 54

understood. Here, we performed comprehensive classification and analysis of SVs in 55

multiple myeloma, interrogating a large cohort of 762 patients with whole genome and 56

RNA sequencing. We identified 100 SV hotspots involving 31 new candidate driver 57

genes, including drug targets BCMA (TNFRSF17) and SLAMF7. Complex SVs, 58

including chromothripsis and templated insertions, were present in 61 % of patients and 59

frequently resulted in the simultaneous acquisition of multiple drivers. After accounting 60

for all recurrent events, 63 % of SVs remained unexplained. Intriguingly, these rare SVs 61

were associated with up to 7-fold enrichment for outlier gene expression, indicating that 62

many rare driver SVs remain unrecognized and are likely important in the biology of 63

individual tumors. 64

65

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 4: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

4

Introduction 66

Structural variations (SVs) are increasingly recognized as key drivers of cancer 67

development1-3. Functional implications of SVs are diverse, including large-scale gain 68

and loss of chromosomal material, gene regulatory effects such as super-enhancer 69

hijacking, and gene fusions4. The basic unit of SVs are pairs of breakpoints, classified as 70

either deletion, tandem duplication, translocation or inversion5. Recently, whole genome 71

sequencing (WGS) studies have revealed complex patterns where multiple SVs are 72

acquired together in a single event, often involving multiple chromosomes, with a 73

potentially important role in cancer initiation and progression2,6-12. 74

In multiple myeloma, chromosomal translocations are considered to be initiating 75

events in ~40 % of patients, where the immunoglobulin heavy chain (IGH) locus on 76

chromosome 14 is juxtaposed with a class-defining oncogene13. Translocations involving 77

MYC have been identified in 15-42 % of newly diagnosed multiple myeloma patients and 78

are associated with progression to multiple myeloma from its precursor stage, smoldering 79

multiple myeloma14-17. Recurrent translocations involving the immunoglobulin lambda 80

(IGL) locus and other genes have recently been reported, expanding the catalogue of 81

potential drivers and confirming the critical role of SVs in multiple myeloma 82

pathogenesis14. 83

The genomic landscape of multiple myeloma is characterized by multiple 84

aneuploidies, which fall into two main categories13,18,19. First, whole chromosome gains 85

of more than two odd-number chromosomes is referred to as hyperdiploidy (HRD) and 86

present in ~60 % of patients20. HRD is considered to be the initiating event in most of the 87

patients without a canonical IGH-translocation, with some patients acquiring additional 88

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 5: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

5

whole chromosome gains later in tumor evolution13,20. The second category is made up of 89

all other recurrent copy number alterations (CNAs), one or more of which are present in 90

virtually all patients19,20. Among these, gain of chromosome 1q21 often occurs early in 91

tumor evolution, while loss of tumor suppressor genes tends occur later20-23. The 92

structural basis of these established drivers remains poorly understood, and there is a lack 93

of data about the role of SVs beyond the most recurrent aberrations. 94

We recently reported the first comprehensive SV characterization using WGS of 95

sequential samples from 30 multiple myeloma patients20. Despite the small sample set, 96

SVs emerged as key drivers during all evolutionary phases of disease, and we identified a 97

high prevalence of three main classes of complex SVs: chromothripsis, templated 98

insertions and chromoplexy20. In chromothripsis, chromosomal shattering and random 99

rejoining results in a pattern of tens to hundreds of breakpoints with oscillating copy 100

number across one or more chromosomes24. Templated insertions are characterized by 101

focal gains bounded by translocations, resulting in concatenation of amplified segments 102

from two or more chromosomes into a continuous stretch of DNA, which is inserted back 103

into any of the involved chromosomes2,20. Chromoplexy similarly connects segments 104

from multiple chromosomes, but the local footprint is characterized by copy number 105

loss25. Importantly, complex SVs represent large-scale genomic alterations acquired by 106

the cancer cell at a single point in time, potentially driving subsequent tumor evolution. 107

Here, we present the first comprehensive study of SVs in a large series of 762 108

multiple myeloma patients. We show that recurrent and rare SVs are critical in shaping 109

the genomic landscape of multiple myeloma, including complex events simultaneously 110

causing multiple drivers. 111

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 6: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

6

112

Results 113

Genome-wide landscape of structural variation in multiple myeloma 114

To define the landscape of simple and complex SVs in multiple myeloma, we 115

investigated 762 newly diagnosed patients from the CoMMpass study (NCT01454297) 116

who underwent low coverage long-insert WGS (median 4-8X) (Table S1). RNA 117

sequencing was also available from 592 patients. As a validation cohort, we compiled 118

previously published WGS data from 52 patients20,26,27. For each patient sample, we 119

integrated the genome-wide somatic copy number profile with SV data and assigned each 120

pair of SV breakpoints as either simple or part of a complex event according to the three 121

main classes previously identified in multiple myeloma (Methods)20. Templated 122

insertions involving more than two chromosomes were considered complex. Events 123

involving more than three breakpoint pairs which did not fulfill the criteria for a specific 124

class of complex event were classified as unspecified “complex”20. 125

Overall, we identified a median of 15 SVs per patient in the CoMMpass cohort 126

(interquartile range, IQR 7-28) (Figure 1A). Deletion was the most common SV type (33 127

%), followed by inversion and translocation (24 % each), and tandem duplication (18 %) 128

(Figure S1A). Fifty-one percent of SVs (8842/17185) were defined as part of 931 129

complex events (Figure 1A). Chromothripsis, chromoplexy and templated insertions 130

involving >2 chromosomes were observed in 21%, 11% and 21 % of patients, 131

respectively; 32 % of patients had an unspecified complex event. One or more complex 132

events were identified in 61 % of patients (median 1; range 0-14); 16 % had two complex 133

SVs; 14 % had three or more. The distribution across simple and complex classes varied 134

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 7: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

7

for each SV type: deletions were most commonly simple events (71 %), followed by 18 135

% attributed to chromothripsis; while translocations were simple events in 28 % (e.g. 136

reciprocal or unbalanced) and part of templated insertions or chromothripsis in 31 % each 137

(Figure S1A). In the validation cohort we identified a slightly higher overall SV burden, 138

with a median of 26 SVs per patient (IQR 10-36), as expected from the higher sequencing 139

coverage. Reassuringly, the distribution across simple and complex SV classes was 140

similar across the datasets (Figure S1A), suggesting that low-coverage long-insert WGS 141

provides a representative view of the SV landscape. 142

SV breakpoints tend to be enriched within specific genomic contexts, depending 143

on the underlying mechanism. Our findings from multiple myeloma corresponded with 144

previous reports from pan-cancer studies2,11. A higher density of SV breakpoints was 145

generally associated with early replication, chromatin accessibility (DNAse seq), highly 146

expressed genes and active enhancer regions as defined by histone H3K27 acetylation 147

(H3K27ac) (Figure S1B; Methods). These associations were particularly strong for the 148

breakpoints of tandem duplications and templated insertions, which also showed a strong 149

correlation with each other (Spearman’s rho = 0.33, p < 0.001). Turning to the SV burden 150

in individual patients, the number of templated insertion breakpoints was inversely 151

correlated with all other SV classes, except tandem duplications, which showed a striking 152

positive correlation (Spearman’s rho = 0.46, p <0.001) (Figure S1C). Taken together, 153

templated insertions and tandem duplications tended to occur in the same genomic 154

contexts, and to be enriched in the same tumors. 155

In patients with newly diagnosed multiple myeloma, complex SV classes showed 156

distinct patterns of co-occurrence, mutual exclusivity and association with recurrent 157

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 8: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

8

molecular alterations (Figures 1A and S2). High APOBEC mutational burden, a feature 158

associated with aggressive disease and poor outcomes28,29, was enriched in patients with 159

chromothripsis (OR = 2.8, 95% CI 1.8-4.3; p < 0.001) and unspecified complex events 160

(OR = 2.6, 95% CI 1.7-4.0; p < 0.001), but was mutually exclusive with templated 161

insertions (OR = 0.49, 95% CI 0.29-0.80; p = 0.003). Chromothripsis was the only SV 162

class with clear prognostic implications after adjustment for molecular and clinical 163

features, resulting in adverse progression free survival (PFS, adjusted HR = 1.57; 95% CI 164

1.13-2.22; p = 0.008) and overall survival (OS, adjusted HR = 2.4; 95% CI 1.5-3.83; p < 165

0.001) (Figure 1B-D; Methods)30. 166

Bi-allelic inactivation of TP53 was identified in 23 patients in our cohort, defining 167

a patient population with very poor prognosis21. Interestingly, these patients showed 168

strong enrichment for chromothripsis (15 out of 24; OR 7.8, p < 0.001) and high 169

APOBEC mutational burden (OR 6.6, p < 0.001) (Figure 1A and S2). In 3 of 15 cases, 170

the TP53 deletion was caused by a chromothripsis event; and the association remained 171

highly significant after excluding these cases (OR 6.2, p < 0.001). This relationship did 172

not reach statistical significance for monoallelic TP53 events. Consistent with our 173

findings, an association between chromothripsis and TP53 inactivation has been reported 174

in several other cancers31-33. 175

176

Structural basis of recurrent translocations and copy number variants 177

To define the structural basis of canonical translocations in multiple myeloma, we 178

identified all translocation-type events (single and complex) with one or more 179

breakpoints involving the immunoglobulin loci (i.e. IGH, IGK and IGL) or canonical 180

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 9: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

9

IGH-partners (e.g. CCND1, MMSET and MYC)13. Templated insertions emerged as the 181

cause of CCND1 and MYC translocations in 34 % and 73 % of cases, respectively 182

(Figure 2A). This is particularly important given that templated insertions connect and 183

amplify distant genomic segments, often involving several oncogenes and regulatory 184

regions (e.g. super-enhancers). Indeed, templated insertions of CCND1 and MYC were 185

associated with focal amplification in 63 % and 90 % of cases, respectively; and involved 186

more than two chromosomes in 24 % and 41 % of cases. Although rare, we also found 187

examples of chromothripsis and chromoplexy underlying canonical IGH translocations, 188

resulting in overexpression of the partner gene consistent with a driver event (Figure 189

2B). Twenty-four patients (3.2%) had a translocation involving an immunoglobulin locus 190

(IGH = 11, IGL = 12 and IGK = 1) and a non-canonical oncogene partner, most of which 191

are key regulators of B-cell development (e.g. PAX5 and CD40) (Figure 2B)34,35. One of 192

the patients with non-canonical IGH translocation also had an IGH-MMSET 193

translocation; while the remaining 10 patients either had HRD (n = 8) or no known 194

initiating event (n=2), suggesting the disease may have been initiated by a non-canonical 195

IGH translocation. Taken together, we show that different mechanisms of SV converge to 196

aberrantly activate key driver genes in multiple myeloma. Importantly, SVs with strong 197

biological implications and a potential role in tumor initiation are not necessarily 198

recurrent, even in a series as large as the one analyzed here. 199

Next, we addressed the structural basis of known recurrent CNAs (Table S2). 200

Aneuploidies involving a whole chromosome arm were most common (60 % of 2926 201

events); 24 % could be attributed to a specific SV and 14 % were intrachromosomal 202

events without a clear causal SV, classified as “unknown” (Figure 2C). The proportion 203

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 10: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

10

of CNAs explained by a SV was similar in the validation dataset (29 % of 450 events; 204

Fisher’s test p= 0.41; Table S3). There was considerable variation in the proportion and 205

class of SVs causing gains and losses between different loci, indicating the presence of 206

distinct underlying mechanisms being active at these sites (Figure 2C). Loss of 207

CDKN2C, FAM46C, 6q21, 6q25.3, MAX and TP53 could be attributed to SVs in ~50 % 208

of cases. SVs accounted for 78 % of BIRC2/BIRC3 losses, and 62 % of MYC gains, of 209

which 34 % were caused by templated insertions. Interestingly, unbalanced translocations 210

were responsible for 5.5 % of all recurrent deletions (n = 132), up to 8 % for 17p and 16 211

% for 6q. Finally, 44.5 % of all chromothripsis events resulted in the acquisition of at 212

least one recurrent driver CNA (n = 94); the corresponding numbers for chromoplexy and 213

templated insertions involving >2 chromosomes were 41.6 % (n = 40) and 19.9 % (n = 214

42), respectively. 215

In 12.5% of patients (n=95), two or more seemingly independent recurrent CNAs 216

or canonical translocations were caused by the same simple or complex SV (Figure 2D). 217

The most common event classes were chromothripsis (n = 33), chromoplexy (n = 22), 218

unbalanced translocation (n = 21) and templated insertion (n = 21) (Figure 3A-C). 219

Intriguingly, translocation of MYC occurred as part of the initiating IGH-translocation in 220

five patients, together with CCND1 (n=3) or CCND2 (n=2). Another interesting 221

phenomenon was observed on chromosome 6, where gain of one arm (p) and loss of the 222

other (q) resulted from a single inversion event in seven patients (Figure 3D). 223

224

Hotspots of structural variation 225

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 11: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

11

Twenty recurrently translocated regions have been previously reported in multiple 226

myeloma, defined by a translocation prevalence of >2 % within 1 Mb bins across the 227

genome14. Recurrent regions included the immunoglobulin loci, canonical partner 228

oncogenes, as well recurrent MYC partners, such as BMP6/TXNDC5, FOXO3 and 229

FAM46C14,17,36. We were motivated to expand the known catalogue of genomic loci 230

where SVs play a driver role in multiple myeloma and are therefore positively selected 231

(i.e. SV hotspots), considering all classes of single and complex SVs. To accomplish this, 232

we applied the Piecewise Constant Fitting (PCF) algorithm, comparing the local SV 233

breakpoint density to an empirical background model (Methods; Figure S3A-E; Data 234

S1)8,37. Overall, we identified 100 SV hotspots after excluding the immunoglobulin loci 235

(i.e., IGH, IGL and IGK), 85 of which have not been previously reported (Figure 4A; 236

Table S4). Each patient had a median of three hotspots involved by a SV (IQR 1-5); and 237

the number of hotspots involved was strongly associated with the overall SV burden 238

(Spearman’s Rho = 0.51; p < 0.001; Figure S3F). Two of the previously reported regions 239

of recurrent translocation were not recapitulated by our hotspot analysis: a locus of 240

unknown significance on 19p13.3, and the known oncogene MAFB on 20q12. This may 241

be explained by the behavior of the PCF algorithm, which favors the identification of loci 242

where breakpoints are tightly clustered compared with neighboring regions as well as the 243

expected background. Translocations involving MAFB were identified in 1.6 % of 244

patients in our series (n = 12), but the breakpoints were widely spread out across 600 Kb 245

in the intergenic region upstream of MAFB38. Similarly, a 2 Mb region on 19p13.3 was 246

involved in 44 patients (5.7 %), predominantly by templated insertions and tandem 247

duplications, but the breakpoints did not form a distinct cluster. 248

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 12: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

12

Given that SVs and CNAs reflect the same genomic events, we hypothesized that 249

functionally important SV hotspots would be associated with a cluster of CNAs. We 250

therefore performed independent discovery of driver CNAs using GISTIC (genomic 251

identification of significant targets in cancer)39. This algorithm identifies peaks of copy 252

number gain or loss which are likely to contain driver genes and/or regulatory elements 253

based on the frequency and amplitude of observed CNAs (Figure 4A, Table S5-6). In 254

addition, we generated cumulative copy number profiles for the patients involved by SV 255

at each hotspot. Finally, we evaluated the impact of SV hotspots on the expression of 256

nearby genes. By integrating SV, CNA, and expression data, we went on to determine the 257

most likely consequence of each hotspot in terms of gain-of-function, loss-of-function, 258

and potential involvement of driver genes and regulatory elements (Figures 4A-D and 259

S4). 260

Gain-of-function hotspots were defined by copy number gains and/or 261

overexpression of a putative driver gene (Table S4 and Figures 4A-C, 5A and S4). 262

Hotspots defined by these criteria showed overwhelming dominance of tandem 263

duplication, translocation and templated insertion SV breakpoints (median 73 % of 264

breakpoints [IQR 63-87 %] in gain-of-function hotspots vs 21 % [11-28 %] in other 265

hotspots; Wilcoxon rank sum test p < 0.001; Figure 5B). By contrast, chromothripsis, 266

chromoplexy and unspecified complex events had a similar contribution to gain- and 267

loss-of-function hotspots (Wilcoxon rank sum test, Bonferroni-Holm adjusted p-values > 268

0.05). Strikingly, gain-of-function hotspots showed 5.5-fold enrichment of super-269

enhancers as compared with the remaining mappable genome (1.6 vs 0.3 super-enhancers 270

per Mb, Poisson test p < 0.001)(Methods)40. Enhancer peaks (H3K27ac) clustered 271

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 13: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

13

around the SV breakpoints were observed in 46 out of 62 gain-of-function hotspots (74 272

%) (Figure 5A). High-confidence driver genes supported by both overexpression and an 273

amplification peak: CCND1, CCND2, MYC, NSD2 (WHSC1), P2RY2/FCHSD2, FOXO3, 274

MAP3K14, CD40, LTBR, TNFRSF17, PRKCD, SEC62/SAMD7, SOX30 and GLI3 275

(Figure 5A). Of particular interest, TNFRSF17 was involved by SVs in 5 % of patients (n 276

= 38) and encodes B-Cell Maturation Antigen (BCMA), a therapeutic target of chimeric 277

antigen receptor T-cells (CAR-T) and monoclonal antibodies (Figure 4B)41. We also 278

confirmed previous reports that virtually all SVs involving MYC resulted in its 279

overexpression, including deletions and inversions, which acted by repositioning MYC 280

next to the super-enhancers of NSMCE2, roughly 2 Mb upstream42. Numerous genes 281

were involved by gain-of-function hotspots without clear evidence of overexpression, 282

many of which are known to be important in multiple myeloma, such as IRF4, KLF2, 283

KLF13 and SLAMF7 (Table S4 and Figure 4C). Overall, translocation-type SVs 284

involving an immunoglobulin locus (i.e., IGH, IGL or IGK), either directly or as part of a 285

complex event, were associated with the strongest gene expression effects (pairwise 286

Wilcoxon rank sum test, Bonferroni-Holm adjusted p-values < 0.0001) (Figure S5). This 287

was true both for canonical driver genes (median Z-score 1.7, IQR 1.1-2.4) and other 288

putative drivers (median Z-score 1.83, IQR 0.85-2.5). Non-immunoglobulin 289

translocations had the second highest impact, followed by tandem duplications and other 290

SVs, all of which were associated with significantly higher expression than patients 291

without SV (adjusted p-values < 0.01) (Figure S5). 292

Loss-of-function hotspots, defined by copy number loss and/or reduced 293

expression of a putative driver gene, were highly enriched for deletion breakpoints 294

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 14: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

14

compared to the remaining hotspots (median 55 [IQR 40-68] % vs 13 [7-23] %; 295

Wilcoxon rank sum test p < 0.001; Table S4 and Figures 4D, 5A-B and S4). High-296

confidence drivers defined by copy number loss and significantly reduced gene 297

expression were FAM46C, FAF1, RB1, MAX, SP140L, SCAF8, BIRC2 and CYLD 298

(Figure 5A). Additional genes identified to be associated with loss of copy number, but 299

without a clear effect on expression, included established tumor suppressor genes such as 300

TP53 and CDKN2C, as well as novel candidate drivers, e.g., SP3 and BCL2L11 (Table 301

S4). Some genes with particular enrichment of intragenic deletion breakpoints were large 302

and late-replicating, suggesting they may be fragile sites in multiple myeloma, prone to 303

breaking during replication stress (Figure 5C; Table S4 and S7)43. The top fragile gene 304

candidate in our data was PTPRD, which is suspected to be a tumor suppressor gene in 305

several cancers (Figure 5C)27,43,44. Additional large genes involved by recurrent 306

intragenic deletions in this study included CCSER1, FHIT and WWOX, all of which are 307

known fragile sites in cancer43,45-47. However, in a recent pan-cancer study, the 308

accumulation of complex nested deletion SVs (termed rigma) in fragile site-associated 309

genes including FHIT and WWOX could not be fully explained by the classical features 310

of fragility, indicating that they may be positively selected6. Careful validation of the role 311

played by these hotspots will be necessary. 312

Consistent with recent findings in breast cancer37, in our series, SV hotspots 313

showed seven-fold enrichment for multiple myeloma germline predisposition SNPs as 314

compared with the remaining mappable genome (6 SNPs in 125 Mb of hotspots vs 17 in 315

the remaining genome, Poisson test p = 0.0002; Figure S6)48,49. The involved SNPs were 316

on 3q26.2 (SV hotspot involving SEC62/SAMD7), 6p21.3 (unknown SV target), 6q21 317

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 15: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

15

(ATG5/PRDM1), 9p21.3 (CDKN2A/CDKN2B), 19p13.11 (KLF2) and 2q31.1 (SP3) 318

(Table S8). One additional SNP on 8q24.21 (CCAT1) was less than 1 Mb upstream of 319

MYC, between two SV hotspots centered around NSMCE2 and MYC, respectively. 320

Assuming that multiple myeloma predisposition SNPs and driver SVs affecting the same 321

locus would have similar consequences, this co-occurrence may provide novel insights 322

into the target genes and functional implications. 323

In our validation dataset of 52 genomes, 35 out of 100 hotspots showed 324

significant enrichment of SV breakpoints compared with the remaining mappable 325

genome (Poisson test, FDR < 0.1; Table S9 and Figure S7A). Hotspots that were 326

confirmed in the validation dataset showed significantly higher prevalence in CoMMpass 327

as compared with hotspots that were not confirmed (median 4 [IQR 2.6-5.4] % vs 2.2 328

[1.5-3] %; Wilcoxon rank sum test p<0.001). 329

Taken together, we identified 62 gain of function and 28 loss of function hotspots; 330

with 10 hotspots being of unknown function (Figures 5A, Table S4). Thirty-three 331

hotspots involved genes with established tumor suppressor or oncogene function in 332

multiple myeloma; 31 additional hotspots involved novel putative driver genes (Table 333

S4; all novel putative driver hotspots are shown in Figures 4B-C and S4).50 334

335

Templated insertions are highly clustered and often involve multiple hotspots 336

Templated insertion breakpoints were highly clustered across the genome (Figure 337

6A) and associated with copy number gain in 76 % of cases (95 % CI 74-78 %), but only 338

rarely with copy number loss (8.3 %; 95 % CI 7.0-9.6 %). Gains were almost exclusively 339

single copy (92.5 % of 1200 gains), with less than one percent involving three or more 340

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 16: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

16

copies above the baseline (n = 9), highlighting the stability of these events. Strikingly, out 341

of 449 templated insertion events, 73 % involved one or more SV hotspots (including the 342

immunoglobulin loci) (Figure 6B-C). Half of the templated insertions (n = 210) involved 343

more than two chromosomes and 45 (10 %) involved more than four chromosomes. 344

Among the 210 events spanning more than two chromosomes, 39 % involved MYC or 345

CCND1; 34 % involved IGH, IGL or IGK; and 30 % involved none of the above, but one 346

or more other hotspots (Figure 6C). Overall, 54 % of multi-chromosomal templated 347

insertions involved two or more hotspots (Figure 6B-C). In summary, we show that 348

templated insertions in multiple myeloma are highly clustered and often string together 349

amplified segments from multiple known drivers and SV hotspots in a single event. 350

351

Chromothripsis is associated with wide-spread gain and loss of gene function 352

mechanisms 353

In contrast to templated insertions, breakpoints of chromothripsis did not show 354

strong focal clustering (Figure 7A). Instead, running the PCF algorithm for 355

chromothripsis revealed enrichment in wider regions (Table S10). (Figure 7A). The 356

most commonly involved loci included a 32 Mb region on chromosome 20q11-13 (n = 22 357

patients involved); 40 Mb spanning the MYC locus on 8q22-24 (n = 18); 23 Mb on 358

17q12-23 (n = 16); and 12 Mb on 12p12-13 (n=16). An important feature of 359

chromothripsis is its ability to cause both gain- and loss-of-function as part of the same 360

event51. Indeed, the breakpoints of chromothripsis were associated with chromosomal 361

loss in 50.1 % of cases (95 % CI 48.8-52.5%) and gain in 39.6 % (95 % CI 37.7-41.5 %). 362

High-level focal copy number gains (>5 copies) were predominantly caused by 363

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 17: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

17

chromothripsis (75 % of segments) and present in only 6 % of patients overall (n = 46) 364

(Figure 7B-C). In contrast to solid tumors, where chromothripsis may result in double 365

minute chromosomes with >50 copies2,6,10,52, we observed no segments with more than 9 366

copies in this series. Similarly, in the validation cohort, the highest copy number 367

associated with chromothripsis was 7 (Figure S7B). Genes involved by high-level focal 368

gains had a two standard deviations higher expression, on average, as compared with the 369

same genes in patients with normal copy number (median Z-score 2.2 vs -0.12, Wilcoxon 370

rank sum test, p < 0.0001) (Figure 7D). Interestingly, despite the profound overall impact 371

on gene expression, only four events (2.4 %) caused amplification and overexpression (Z-372

score > 2) of one or more established or putative drivers in multiple myeloma (i.e., 373

CCND1, CD40, NFKB1, MAP3K14 and TNFRSF17)13. For example, one patient 374

(MMRF_2330) had amplification and overexpression of MAP3K14 by chromothripsis, 375

along with multiple genes that have not been reported as genomic drivers in multiple 376

myeloma (Figure 7B). This included amplification and overexpression of the oncogene 377

RAD51C, which is commonly amplified in breast cancer and may play a drive role in this 378

one patient with multiple myeloma53. Two additional patients had >6 copies and outlier 379

overexpression of RPS6KB1, encoding the 70 kDa ribosomal protein S6 kinase 380

(p70S6K), a crucial regulator of cell cycle, growth and survival in the PI3K/mTOR 381

pathways (Figure S7C)54. These observations raise questions regarding the impact of rare 382

SVs in deregulating potential drivers, and their contribution to multiple myeloma biology 383

overall. 384

385

Functional implications of recurrent and rare structural variants 386

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 18: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

18

To determine the functional implications of rare SVs, we first identified all simple 387

and complex SVs where at least one breakpoint was potentially involved in a recurrent 388

event. SVs were assigned to categories of recurrent events in a hierarchical fashion, at 389

each step considering the SVs that had not yet been assigned. First, 9 % of SVs were 390

assigned as canonical translocations, on the basis of one or more breakpoint involving an 391

immunoglobulin locus or a canonical partner gene; 6.2 % were the cause of a known 392

recurrent CNA; 4.7 % were involved an SV hotspot; and 17.4 % had one or more 393

breakpoints within a wide GISTIC peak. Thirty-seven percent of all simple and complex 394

SVs had one or more breakpoint in a recurrently involved region, leaving 63 % (n = 395

5577) as rare events (Figure 8A). Given the permissive criteria we used to define a 396

recurrent event, this is likely to be a conservative estimate of their true contribution. 397

There were striking differences in the proportion of rare and recurrent events between SV 398

classes: the vast majority of single tandem duplications, deletions and inversions were 399

rare, while > 75 % of chromothripsis, chromoplexy and templated insertions involved at 400

least one recurrent region (Figure 8A-B). While this may be expected to some extent 401

from the number of breakpoints involved by each event, it also supports the notion that 402

complex events involve regions that are important in multiple myeloma and are therefore 403

positively selected. Considering the individual breakpoints of each complex event, a 404

greater proportion of chromothripsis breakpoints were rare as compared with templated 405

insertions (Figure 8B). Interestingly, 29 % of complex events remained orphan. 406

Determining the biological implications of rare events is challenging because 407

genomic drivers are typically defined based on recurrence above some background level. 408

To overcome this challenge, we evaluated the impact of rare events in aggregate, asking 409

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 19: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

19

the question whether genes near rare SVs were more or less likely to show outlier 410

expression levels than what would be expected if all rare SVs were passenger events55. 411

Outlier expression was defined for each gene as a z-score of +/- 2. Consistent with the 412

heterogenous nature of multiple myeloma, each patient had a median of 124 genes 413

showing outlier expression (IQR 52-436), of which a median of 3 (IQR 1-6) and 2 (IQR 414

1-5) outliers were within 1 Mb of recurrent and rare SV breakpoints, respectively (Figure 415

S8A). Conversely, each SV was associated with a median of 1 expression outlier (IQR 1-416

2), similar for recurrent and rare events. Chromothripsis and templated insertions were 417

associated with more outliers overall (Figure S8B-C), consistent with the complexity of 418

these events, sometimes associated with dramatic effects on gene expression (as shown in 419

Figure 7D). Structural variants showed striking enrichment for outlier over- and under-420

expression of nearby genes when compared with a permutation background model 421

(Methods). Recurrent and rare events showed similar overall trends, although the 422

magnitude of effect was greater for recurrent events (Figure 8C-D, respectively). The 423

directions of gene expression effects were consistent with the mechanism of each SV 424

class. Tandem duplications and templated insertions were strongly enriched for outlier 425

overexpression, with fewer nearby genes downregulated than expected by chance. 426

Tandem duplications exerted their effects only when involving the gene body; while 427

templated insertions maintained their effect up to 1 Mb away from the gene, indicating a 428

major effect on gene regulatory mechanisms, e.g., super-enhancer hijacking. 429

Chromothripsis and unspecified complex events were associated with both up- and down-430

regulation, through direct gene involvement as well as regulatory effects. 431

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 20: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

20

Based on the net excess of observed outliers over the background, we could 432

estimate the number of outliers likely caused by SVs (Figure S8D). Recurrent templated 433

insertions and chromothripsis events were associated with a net excess of 1334 outliers 434

(95 % CI 1168-1490); while recurrent events of the other classes combined were 435

associated with an excess of 595 (95 % CI 416-759) outliers. For rare SVs, the 436

corresponding numbers were 197 (95 % CI 155-233) and 427 (95 % CI 299-547), 437

respectively. These numbers certainly underestimate the overall impact of SVs, as we 438

only considered direct effects on gene expression, applying a strict definition of outlier 439

expression. Although the specific target genes often remain elusive, these data 440

demonstrate that rare SVs strongly influence gene expression, which may contribute to 441

the biology of individual tumors. 442

443

Discussion 444

We describe the first comprehensive analysis of SVs in a large series of multiple 445

myeloma patients with paired whole genome and RNA sequencing. Our findings reveal 446

how simple and complex SVs shape the driver landscape of multiple myeloma, with 447

events ranging from common CNAs and canonical translocations to a large number of 448

non-recurrent SVs seemingly able to exert significant biological effects. The results focus 449

attention on the importance of SVs in multiple myeloma and on the use of whole genome 450

analyses in order to fully understand its driver landscape. 451

Previous studies of SVs in multiple myeloma have focused on translocations 452

without consideration of complex events14,38, and our previous WGS study of 30 patients 453

lacked the power to perform comprehensive driver discovery20. Here, applying a robust 454

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 21: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

21

statistical approach37, we identified 100 SV hotspots, 85 of which have not previously 455

been reported. Integrated analysis of copy number changes, gene expression and the 456

distribution of SV breakpoints revealed 31 new potential driver genes, including the 457

emerging drug targets TNFRSF17 (BCMA) and SLAMF741,56. Overall, our 458

comprehensive catalogue of SV hotspots expands the current understanding of driver 459

gene deregulation by recurrent CNAs and SVs and identified promising new driver 460

candidates. 461

From a pan-cancer perspective, the SV landscape of multiple myeloma is 462

characterized by a lower SV burden and less genomic complexity than in many solid 463

tumors2,6,11. For example, we did not find any classical double minute chromosomes with 464

tens to hundreds of amplified copies, nor did we find any of the recently proposed 465

complex SV classes pyrgo, rigma and tyfonas6. Nonetheless, we found that complex SVs 466

play a crucial role in shaping the genome of multiple myeloma patients. There are 3 467

major types of complex events seen in multiple myeloma, each of which is characterized 468

by distinct junctional features and distribution patterns. Chromothripsis breakpoints are 469

distributed across the genome in a chaotic fashion and are a common cause of recurrent 470

deletions and gains involving tumor suppressor genes and oncogenes. While 471

chromoplexy accounts for only a small proportion of SV breakpoints it emerged as an 472

important cause of chained events associated with deletions potentially inactivating a 473

range of tumor suppressor genes distributed across multiple chromosomes. In contrast, 474

templated insertions result in amplification and concatenation of multiple SV hotspots, 475

involving oncogenes and/or super-enhancers that may result in gene overexpression. The 476

frequency of templated insertions emphasizes the importance of enhancer hijacking in the 477

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 22: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

22

pathogenesis of multiple myeloma and the relevance of transcriptional control in respect 478

to both its biology and therapy2,14. 479

A common feature of these 3 classes of SVs is the simultaneously deregulation of 480

multiple driver genes as part of a single event. Such multi-driver events are of particular 481

importance in myeloma progression as they can provide an explanation for the rapid 482

changes in clinical behavior that are frequently seen in the clinic57. Such changes may be 483

the result of “punctuated evolution” that occurs as a result of catastrophic intracellular 484

events leading to the complex SVs we describe. 485

Gene deregulation by SV is a major contributor to the biology of multiple 486

myeloma constituting a hallmark feature of its genome. However, the majority of SVs do 487

not involve known drivers, SV hotspots or regions of recurrent CNV, implying they may 488

simply be passenger variants irrelevant to tumor biology. However, we show a 489

remarkable enrichment for gene expression outliers in proximity to rare SVs, suggesting 490

that rare SVs may make a significant contribution to the biological diversity of multiple 491

myeloma, including the mechanisms of disease progression and drug resistance. 492

493

Methods 494

Patients and samples 495

As a discovery cohort, we analyzed data from 762 patients with newly diagnosed 496

multiple myeloma enrolled in the CoMMpass study (NCT01454297). As a validation 497

cohort, we combined two multiple myeloma WGS datasets with a total of 52 498

patients20,27,58. 499

500

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 23: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

23

Whole genome sequencing and analysis 501

CoMMpass samples underwent low coverage long-insert WGS (median 4-8X). 502

Paired-end reads were aligned to the human reference genome (HRCh37) using the 503

Burrows Wheeler Aligner, BWA (v0.7.8). Somatic variant calling was performed using 504

Delly (v0.7.6)5 for SVs and tCoNuT for CNAs 505

(https://github.com/tgen/MMRF_CoMMpass/tree/master/tCoNut_COMMPASS). The 506

validation dataset was analyzed and comprehensively annotated as previously 507

described20,26. 508

509

RNA sequencing analysis and fusion calling 510

CoMMpass samples underwent RNA sequencing to a target coverage of 100 511

million reads. Paired-end reads were aligned to the human reference genome (HRCh37) 512

using STAR v2.3.1z59. Transcript per million (TPM) gene expression values were 513

obtained using Salmon v7.260. 514

515

Classification of structural variants 516

Each pair of structural variant breakpoints (i.e. deletion, tandem duplication, 517

inversion or translocation) was classified as a single event, or as part of a complex event 518

(i.e. chromothripsis, chromoplexy or unspecified complex)2,20. Translocation-type events 519

were classified as single when involving no more than two breakpoint pairs and two 520

chromosomes, subdivided into reciprocal translocations, unbalanced translocations, 521

templated insertion or unspecified translocation as previously described2,20. Templated 522

insertions could be either simple or complex, depending on the number of breakpoints 523

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 24: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

24

and chromosomes involved, but was always defined by translocations associated with 524

copy number gain. Chromothripsis was defined by more than 10 interconnected SV 525

breakpoint pairs associated with oscillating copy number across one or more 526

chromosomes. The thresholds of 10 breakpoints was imposed as a stringent criterion to 527

avoid overestimating the prevalence of chromothripsis. Chromoplexy was defined by <10 528

interconnected SV breakpoints across >2 chromosomes associated with copy number 529

loss. Patterns of three or more interconnected breakpoint pairs that did not fall into either 530

of the above categories were classified as unspecified “complex”20. 531

532

Mutational signature analysis 533

SNV calls from whole exome sequencing were subjected to mutational signature 534

fitting, using the previously described R package mmsig26. High APOBEC mutational 535

burden was defined by an absolute contribution of APOBEC mutations (mutational 536

signatures 2 and 13) in the 4th quartile among patients with evidence of APOBEC 537

activity26. 538

539

Structural basis for recurrent CNAs in multiple myeloma 540

We applied the following workflow to determine the structural basis for each 541

recurrent CNA in multiple myeloma (Table S2). First, we identified in each patient every 542

genomic segment involved by recurrent copy number gain or loss. Gains were defined by 543

total copy number (CN) >2; loss as a minor CN = 0. Second, we called whole arm events 544

in cases where the CN was equal to the 2 Mb closest to the centromere and telomere, and 545

>90 % of the arm had the same CN state. Third, for segments that did not involve the 546

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 25: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

25

whole arm, we searched for SV breakpoints responsible for the CNA within 50 kb of the 547

CN segment ends. Finally, and intrachromosomal CNAs without SV support were 548

classified as unknown. We applied Fisher’s exact test to compare the proportion of CNAs 549

explained by an SV for each of the recurrent events, requiring Bonferroni-Holm adjusted 550

p-values < 0.05 for statistical significance. 551

Multi-driver events were defined by two or more independent driver copy number 552

segments and/or canonical driver translocations caused by the same SV (simple or 553

complex). 554

555

Copy number changes associated with structural variant breakpoints 556

To determine the genome-wide footprint of copy number changes resulting from 557

SVs, we employed an “SV-centric” workflow, as opposed to the CNA-centric workflow 558

described above. For each SV breakpoint, we searched for a change in copy number 559

within 50 kb. If more than one CNA was identified, we selected the shortest segment. 560

Deletion and amplification CNAs were defined as changes from the baseline of that 561

chromosomal arm. As a baseline, we considered the average copy number of the two Mb 562

closest to the telomere and centromere, respectively. This is important because deletions 563

are often preceded by large gains, particularly in patients with HRD. In those cases, we 564

are interested in the relative change caused by deletion, not the total CN of that segment 565

(which may still be > 2). We estimated the proportion of breakpoints associated with 566

copy number gain or loss across patients, collapsing the data in 2 Mb bins across the 567

genome. Confidence intervals were estimated using bootstrapping and the quantile 568

method. For the purposes of plotting (Figures 4A, 6A and 7A), we divided the SV-569

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 26: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

26

associated CNAs into bins of 2 Mb. The resulting cumulative CNA plot shows the 570

number of patients with an SV-associated deletion or amplification. 571

572

Hotspots of structural variation breakpoints 573

To identify regions enriched for SV breakpoints, we employed the statistical 574

framework of piecewise constant fitting (PCF). In principle, the PCF algorithm identifies 575

regions where SVs are positively selected, based on enrichment of breakpoints with short 576

inter-breakpoint distance compared to the expected background. We used the 577

computational workflow previously described by37. In brief, negative binomial regression 578

was applied to model local SV breakpoint rates under the null hypothesis (i.e. absence of 579

selection), taking into account local features such as gene expression, replication time, 580

non-mapping bases and histone modifications. The PCF algorithm can define hotspots 581

without the use of binning, based on a user-defined smoothing parameter and threshold of 582

fold-enrichment compared to the background. This allows for hotspots of widely different 583

sized to be identified, depending on the underlying biological processes. To avoid calling 584

hotspots driven by highly clustered breakpoints in a few samples, we also set a minimum 585

threshold of 8 samples involved (~ 1 % of the cohort) to be considered hotspot, as 586

previously reported37. Despite this threshold, we found that complex SVs with tens to 587

hundreds of breakpoints in a localized cluster (particularly chromothripsis) came to 588

dominate the results. To account for this, we ran the PCF algorithm twice on different 589

subsets of the data: 1) considering all breakpoints of non-clustered SVs (simple classes 590

and templated insertions); and 2) including all SV classes, but randomly downsampling 591

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 27: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

27

the data to include only one breakpoint per 500 kb per patient. Final output from both 592

approaches was merged for downstream analysis. 593

The full SV hotspot analysis workflow is attached as Data S1, drawing on generic 594

analysis tools that we have made available on github 595

(https://github.com/evenrus/hotspots/tree/hotornot-mm). The workflow is easily portable 596

to other datasets and includes utilities for data preparation and background model 597

generation as well as hotspot discovery and visualization. 598

599

Functional classification of structural variation hotspots 600

SV hotspots were classified based on local copy number and gene expression data 601

as gain of function, loss of function, or unknown. 602

Copy number data was integrated from two complimentary approaches. First, we 603

applied the GISTIC v2.0 algorithm to identify wide peaks of enrichment for 604

chromosomal amplification or deletion (FDR < 0.1), using standard settings39. Second, 605

we considered the cumulative copy number profiles of each hotspot, considering only the 606

patients with SV breakpoints within the region, looking for more subtle patterns of 607

recurrent CNA that was not picked up in the genome-wide analysis. 608

To determine the effects of SV hotspot involvement on gene expression, we 609

performed multivariate modeling using the limma package in R61. Gene expression count 610

data were first normalized using the DESeq/vst package, which allows the use of linear 611

models62. We went on to estimate the effect SV hotspot involvement on every expressed 612

gene after adjustment for the primary molecular classes, MYC translocation and 613

gain1q21. Significantly altered expression (Bonferroni-Holm adjusted p-values < 0.01) of 614

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 28: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

28

a putative driver gene within 5 Mb of the hotspot was considered as supporting evidence 615

for a driver role. 616

617

Identification of putative driver genes involved by SV hotspots 618

Multiple lines of evidence were considered to identify driver genes involved by 619

SV hotspots. Evidence of a putative driver gene included: 1) involved by driver SNVs in 620

multiple myeloma19,20; 2) included in the COSMIC cancer gene census 621

(https://cancer.sanger.ac.uk/census); 3) designated as putative driver gene in The Cancer 622

Genome Atlas63-66; 4) enrichment of SV breakpoints in or around the gene; 5) nearby 623

peak of SV-related copy number gain or loss; 6) SV classes and recurrent copy number 624

changes corresponding to a known role of that gene in cancer (i.e. oncogene or tumor 625

suppressor); and 7) differential gene expression. Having identified candidate driver genes 626

involved by SV hotspots, we reviewed the literature for evidence of a role in multiple 627

myeloma (Table S4). 628

629

Histone H3K27ac and super-enhancers 630

Active enhancer (H3K27ac) and super-enhancer data from primary multiple 631

myeloma cells was obtained from40. Enrichment of super-enhancers in hotspots was 632

assessed using a Poisson test, comparing the super-enhancer density within 100 kb of 633

hotspots with the remaining mappable genome. 634

635

Enrichment of germline predisposition loci 636

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 29: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

29

Germline predisposition SNPs for multiple myeloma were obtained from a recent 637

meta-analysis of GWAS-studies48,49. Poisson test for enrichment in hotspots was 638

performed as for super-enhancers. We also simulated 1000 sets of hotspots of identical 639

size using a bed shuffle implementation in R67, estimating the probability under the null 640

hypothesis to obtain a more extreme result than the observed data. As a final control 641

measure, we repeated the same analysis for loci associated with development of a 642

completely unrelated condition. We selected schizophrenia as a control condition because 643

the pathogenesis is distinct from multiple myeloma, with abundant data on germline risk 644

loci68. 645

646

External validation of hotspots 647

Hotspots identified by the PCF algorithm were validated in our external cohort of 648

52 patients. We applied one-sided poisson tests for enrichment of SV breakpoints within 649

each defined hotspot region as compared with the mappable non-hotspot genome (FDR < 650

0.1). To avoid bias from highly clustered events, we first downsampled the data as 651

described for the main hotspot analysis, randomly selecting a single breakpoint per 500 652

kb per patient. 653

654

Chromothripsis hotspot analysis 655

Empirical background models showed very poor ability to predict the distribution 656

of chromothripsis breakpoints, as may be expected if DNA breaks in chromothripsis tend 657

to be random. To identify regions enriched for chromothripsis, we applied the PCF 658

algorithm with a uniform background, only adjusting for non-mapping bases. 659

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 30: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

30

660

Gene expression analysis of rare structural variants 661

To determine whether rare SVs are functionally relevant, we assessed their 662

aggregated effect on gene expression outliers, adopting an approach previously published 663

for germline SVs55. First, we selected all genes in our cohort requiring >0 TPM 664

expression in >25 % of patients and a median expression level of > 1. Second, we defined 665

expression outliers with a Z-score of +/- 2 for each gene. Genes involved by SVs was 666

defined separately for deletion/tandem duplication type SVs and translocation/inversion 667

types. For deletions and tandem duplications, only focal events were considered (size < 3 668

Mb), and genes were considered directly involved if the gene was between the 669

breakpoints, or if a breakpoint transected the gene. Genes within 100kb or 100kb – 1 Mb 670

of a breakpoint were considered as separate categories. For translocations and inversions, 671

direct involvement was considered when the breakpoint transected the gene. Genes 672

within 100 Kb and 100 Kb - 1 Mb were considered as separate categories. Having 673

defined gene expression outliers and genes involved by SVs, we compared the observed 674

number of outliers within each category with the numbers expected by chance. In the 675

background permutation model, we randomly shuffled the outlier status (up, down or 676

non-outlier) of each gene. The median frequency across permutations was used as 677

background estimates, with 95 % confidence intervals estimated by the quantile method. 678

Fold enrichment and net excess of outliers was based on the observed frequency divided 679

by the background frequency and the upper and lower bounds of its 95 % confidence 680

interval. 681

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 31: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

31

Data and software availability 682

All the raw data used in the study are already publicly available (dbGap: 683

phs000748.v1.p1, dbGap: phs000348.v2.p1 and European Genome phenome Archive: 684

EGAD00001001898 . Analysis was carried out in R version 3.6.1. The full analytical 685

workflow in R to identify hotspots of structural variants is provided in Data S1. All other 686

software tools used are publicly available. Analysis code will be provided by the Lead 687

Contact upon request. 688

689

Acknowledgements 690

This work is supported by the Memorial Sloan Kettering Cancer Center NCI Core Grant 691

(P30 CA 008748), the Multiple Myeloma Research Foundation (MMRF), and the 692

Perelman Family Foundation.693

K.H.M. is supported by the Haematology Society of Australia and New Zealand New 694

Investigator Scholarship and the Royal College of Pathologists of Australasia Mike and 695

Carole Ralston Travelling Fellowship Award. 696

G.J.M is supported by The Leukemia Lymphoma Society. 697

698

Author contributions 699

F.M. designed and supervised the study, collected and analyzed data and wrote the paper. 700

E.H.R. designed the study, collected and analyzed data and wrote the paper. G.J.M. and 701

O.L. collected and analyzed data and wrote paper. D.G., V.Y, K.M. and B.D.D, analyzed 702

data and wrote the paper. P.J.C., E.P., G.G., D.L. and N.A., analyzed data. E.M.B., C.A., 703

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 32: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

32

M.H., A.D., Y.Z., P.B., D.A., K.C.A., P.M., N.B., H.A.L., N.M., J.K., G.M., collected 704

data. 705

706

Declarations of interests 707

No conflict of interests to declare. 708

709

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 33: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

33

Figure legends 710

Figure 1: Simple and complex classes of structural variants in multiple myeloma. A: 711

Genome-wide burden of structural variant types (horizontal panels) in each patient, 712

grouped according to primary multiple myeloma molecular subgroup. Bars within each 713

panel are colored according to the contribution of complex SV classes (i.e. chromoplexy, 714

chromothripsis, templated insertions or other ‘complex’ variants grouped together) versus 715

simple SVs (DEL, deletion; DUP, tandem duplication; INV, inversion and TRA, 716

translocation). Heatmap below shows the presence of complex structural variant classes 717

together with key genomic drivers of multiple myeloma, including non-canonical 718

immunoglobulin (IG) translocations. B-C: Kaplan-meier plots for progression free 719

survival (PFS) (B) and overall survival (OS) (C) in patients with and without 720

chromothripsis (shown in blue and red respectively). D: Hazard ratio for PFS and OS by 721

SV type, estimated using multivariate Cox regression. (Line indicates 95 % CI from 722

multivariate cox regression models, statistically significant features indicated by asterisks 723

(* p<0.05; ** p < 0.01). For templated insertions, chromothripsis, chromoplexy and 724

unspecified complex events, we compared patients with 0 versus 1 or more events. The 725

remaining SVs were considered by their simple class (i.e. DUP, DEL, TRA and INV), 726

comparing the 4th quartile SV burden with the lower three quartiles. The multivariate 727

models included all SV variables as well the following clinical and molecular features: 728

age, sex, ECOG status, ISS-stage, induction regimen, gain1q21, delFAM46C, delTRAF3, 729

delTP53, delRB1, high APOBEC mutational burden, hyperdiploidy and canonical 730

translocations involving CCND1, MMSET, MAF, MAFA, MAFB and MYC. 731

732

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 34: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

34

Figure 2: Structural basis of recurrent translocations and copy number changes. A: 733

Relative contribution (y-axis) of simple and complex SV classes to translocations (TRA) 734

involving the immunoglobulin loci and MYC with canonical and non-canonical partners 735

(x-axis). The category “Other” contains non-MYC translocations for IGK and IGL, while 736

for IGH it includes CCND2, CCND3, MAFA and MAFB. B: Transcript per million 737

(TPM) gene expression of canonical and non-canonical partners of translocations 738

involving the immunoglobulin (IG) loci. Each point represents a sample, colored by the 739

SV class involved or absence of SV (gray). Boxplots shows the median and interquartile 740

range (IQR) of expression across all patients, with whiskers extending to 1.5 * IQR. The 741

templated insertion of IGH and MAF with low expression was part of a multi-742

chromosomal event involving and causing the overexpression of CCND1. C) Structural 743

basis of recurrent multiple myeloma CNA drivers, showing the relative contribution of 744

whole arm events and CNAs associated with a specific SV. Intrachromosomal events 745

without a clear causal SV were classified as “unknown”. D) Co-occurrence matrix 746

showing all cases where two or more recurrent CNAs and/or canonical driver 747

translocations were caused by the same simple or complex SV. Each CNA segment was 748

only counted once, even if including more than one driver gene. For example, a deletion 749

involving both CDKN2C and FAM46C will be reported only if connected to another 750

CNA caused by the same SV. BIRC: BIRC2/BIRC3. 751

752

Figure 3: Multiple recurrent aberrations caused by the same SV. A-C) Genome plots 753

showing the SV breakpoints of a single complex SV in each panel (colored lines), with 754

bars around the plot circumference indicating copy number changes. The outer track 755

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 35: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

35

represents copy number loss (red), the inner represents copy number gains (blue). A) 756

Chromothripsis involving IGH and 9 recurrent driver aberrations across 10 different 757

chromosomes (sample MMRF_1890_1_BM). B) Chromoplexy involving chromosomes 758

11, 13, and 14, simultaneously causing deletion of key tumor suppressor genes on each 759

chromosome (sample MMRF_2194_1_BM). C) Templated insertion involving 5 760

different chromosomes, causing a canonical IGH-CCND1 translocation and involving 761

MYC in the same chain of events. Focal amplifications can be seen corresponding to the 762

translocation breakpoints (sample MMRF_1862_1_BM). D) Top: Unbalanced inversion 763

on chromosome 6, simultaneously causing gain of 6p and loss of 6q (sample 764

MMRF_1740_1_BM). Bottom: Copy number profile of chromosome 6 in sample 765

MMRF_2266_1_BM, generated through several independent structural variants. The 766

main events were: 1) whole chromosome gain; 2) unbalanced inversion causing an 767

additional gain of 6p and loss of one of the two duplicated 6q alleles; and 3) chromoplexy 768

involving chromosome 14 and 15, with loss of an interstitial segment containing SCAF8 769

and ARID1B on 6q. This patient also had an independent chained translocation event 770

involving the TXNDC5/BMP6 locus on 6p. 771

772

Figure 4: Genome-wide distribution of structural variation breakpoints and 773

hotspots. A) Top: Rainfall plot showing the log-distance of each SV breakpoint (points) 774

to its closest neighbor (y-axis) across the genome (x-axis), colored by SV class (legend 775

above figure). Middle: Distribution of SV hotspots (green) and recurrent copy number 776

changes (red/blue) identified by the GISTIC algorithm. Bottom: all copy number changes 777

caused by SV breakpoints, showing cumulative plots for gains (blue) and losses (red). B-778

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 36: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

36

D) Zooming in on three SV hotspots, showing the chromosomal region (top), cumulative 779

copy number and GISTIC peaks (middle), and the breakpoint density of relevant SV 780

classes (colors indicated in legend above A) around the hotspot (bottom). The lower plots 781

are annotated with enhancer marks (histone H3K27ac) in brown and genes in gray, with 782

putative driver gene names in black. B) Gain of function hotspot centered around 783

TNFRSF17 (BCMA), dominated by highly clustered templated insertions, associated 784

with focal copy number gain of TNFRSF17. C) Gain of function hotspot involving four 785

genes in the Signaling Lymphocyte Activation Molecule (SLAM) family of 786

immunomodulatory receptors, including the gene encoding the monoclonal antibody 787

target SLAMF7. The SV hotspot region was supported by a GISTIC peak of copy number 788

gain. D) Deletion hotspot associated with copy number loss centered on the cyclin 789

dependent kinase inhibitors CDKN2A/CDKN2B, supported by a peak of copy number 790

loss identified by GISTIC. 791

792

Figure 5: Gain- and loss- of function consequences of structural variant hotspots. A) 793

Summary of all 100 SV hotspots, showing (from the top): absolute and relative 794

contribution of SV classes within 100 Kb of the hotspot; involvement of active enhancers 795

in multiple myeloma, copy number changes and differential expression of putative driver 796

genes; known and candidate driver genes. B) Distribution of SV classes involving each 797

hotspot, colored by gain-of-function (green), loss-of-function (orange) or unknown 798

functional status (blue) as defined by copy number and gene expression data. Percentage 799

of templated insertion, simple translocation and tandem duplication SVs on the y-axis 800

and deletion SVs x-axis. C) Potential fragile sites in multiple myeloma, characterized by 801

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 37: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

37

deletions (y-axis) within the body of large genes (size) that are late replicating (x-axis). 802

Fragile sites indicated by the color scale have been validated in pan-cancer studies. 803

804

Figure 6: Templates insertions are highly clustered and string together multiple 805

hotspots. A) Distribution of templated insertions across the genome (above) and 806

associated cumulative copy number changes (below). Vertical black lines indicate SV 807

hotspots containing >3 patients with a templated insertion. Selected gene names are 808

annotated. Numbers are annotated where peaks extend outside of the plotting area B) Co-809

occurrence matrix showing the number of cases where pairs of SV hotspots occur as part 810

of the same templated insertion, involving two or more chromosomes. C) Example of a 811

templated insertion chain (brown lines) spanning three SV hotspots, involving the IGL, 812

MYC, and a hotspot on chromosome 15q24 (sample MMRF_1550_1_BM). Copy number 813

profile shown in blue, with an active enhancer track below (H3K27Ac). 814

815

Figure 7: Wide-spread involvement of chromothripsis with gain and loss of function 816

effects. A) Distribution of chromothripsis involvement across the genome (above), with 817

associated cumulative copy number changes (below). Black lines indicate hotspots of 818

chromothripsis. B) Example of chromothripsis causing high-level focal gains on 819

chromosome 17 (sample MMRF_2330_1_BM). The horizontal black line indicates total 820

copy number; the dashed orange line minor copy number. Vertical lines represent SV 821

breakpoints, color-coded by SV class. Selected overexpressed genes (Z-score >2) are 822

annotated in red, including the established multiple myeloma driver gene MAP3K14, and 823

RAD51C, an oncogene commonly amplified in breast cancer (6 copies). C) Number of 824

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 38: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

38

focal segments with >5 copies across chromosomes, colored by SV class. D) Violin plot 825

showing gene expression Z-scores of genes involved by high-level gains (>5 copies) 826

compared with expression of the same genes in patients where they have normal copy 827

number. Points with error bars show median and interquartile range. 828

829

Figure 8: Distribution and impact of recurrent and rare structural variations. A) 830

Proportion of single and complex SVs involved in one or more known or novel region of 831

recurrent SV and/or CNA. Canonical translocation loci were prioritized (i.e. IGH, IGK, 832

IGL, MYC, CCND1, CCND2, CCND3, MAF, MAFA, MAFB and MMSET), followed by 833

known recurrent aneuploidies, SV hotspots and GISTIC peaks (See also Tables S5-S6). 834

The leftover category “rare SV” constitutes those events where every breakpoint fell 835

outside of the above-mentioned recurrently involved regions. For example, a 836

chromothripsis event with 100 breakpoints will be classified as “Canonical TRA” if one 837

breakpoint falls within the IGH locus, even if the remaining 99 breakpoints are in non-838

recurrent regions. B) Heatmaps showing the proportion of breakpoints falling into each of 839

the categories in A), displaying chromothripsis events (above) and templated insertion 840

event involving more than two chromosomes (below). Each column in the heatmaps 841

represent one complex event, arranged and color-coded (above the heatmap) according to 842

the classification in A). Rows in the heatmap show the proportion of breakpoints 843

belonging to each complex event that fall into each of the categories, e.g., Canonical 844

TRA, Recurrent CNA, etc. The sum of proportions may exceed 1 for each event, because 845

many of the recurrent events overlap, e.g. the same region may be both recurrently 846

translocated, an SV hotspot, and a GISTIC peak. Breakpoints falling in that region will 847

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 39: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

39

be counted towards each of the categories. C-D) Enrichment analysis of outlier gene 848

expression (Z-score +/- 2) by proximity to recurrent (C) and rare (D) SVs, showing the 849

log10 fold-enrichment (y-axis) for each SV class (x-axis), compared with a permutation-850

based background model. Enrichment for overexpression outliers above and outlier 851

reduced expression below. Columns show outlier enrichment according to the distance 852

from SV breakpoints to involved genes. 853

854

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 40: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

40

Supplementary materials 855

Table S1: Cohort summary. Baseline clinical variables and availability of sequencing 856

data from patients in the CoMMpass cohort. 857

858

Table S2: Recurrent CNAs in multiple myeloma. Chromosomal coordinates used to 859

define recurrent CNAs in multiple myeloma. 860

861

Table S3: Proportion of CNAs explained by an SV in CoMMpass and the validation 862

dataset. For each of the recurrent CNAs in multiple myeloma, we report the proportion 863

of events explained by an SV in the CoMMpass and the validation dataset. Odds ratios 864

(OR) are from Fisher’s exact test, with Bonferroni-Holm adjusted p-values. 865

866

Table S4: SV hotspots in multiple myeloma. Summary table of all SV hotspots. 867

Detailed explanations for each column are provided as a second sheet in the table excel 868

file. 869

870

Table S5: GISTIC amplification peaks. Wide peaks of chromosomal gains as defined 871

by the GISTIC algorithm (FDR < 0.1). 872

873

Table S6: GISTIC deletion peaks. Wide peaks of chromosomal loss as defined by the 874

GISTIC algorithm (FDR < 0.1). 875

876

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 41: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

41

Table S7: Evaluation of fragile sites in multiple myeloma. Shows all genes with three 877

or more patients harboring deletion breakpoints within the gene body. Genes are assigned 878

as known fragile sites if their genomic coordinates intersect with previously reported 879

fragile sites in cancer, or as potential fragile sites if their size exceeds 500 Kb and they 880

are late replicating. Late replicating was defined here as below the mean (i.e. normalized 881

replication time below zero) across the genome of lymphoblastoid cells. 882

883

Table S8: Germline predisposition loci in multiple myeloma and associated SV 884

hotspots. Shows all previously described germline predisposition loci for multiple 885

myeloma, including the genes involved (SNP_gene) and suspected target genes 886

(SNP_target), as previously reported. We go on to report the overlapping SV hotspot (see 887

also Table S4). 888

889

Table S9: Validation of SV hotspots. Number and proportion of patients with an SV in 890

each hotspot across CoMMpass and the validation dataset, with results from Poisson-tests 891

for hotspot enrichment in the validation dataset. 892

893

Table S10: Hotspots of chromothripsis. Hotspot regions for chromothripsis identified 894

by the PCF algorithm when run exclusively with chromothripsis breakpoints. 895

896

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 42: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

42

References 897

1 Zhang, Y. et al. A Pan-Cancer Compendium of Genes Deregulated by Somatic 898 Genomic Rearrangement across More Than 1,400 Cases. Cell Rep 24, 515-527, 899 doi:10.1016/j.celrep.2018.06.025 (2018). 900

2 Li, Y. et al. Patterns of structural variation in human cancer. bioRxiv, 181339, 901 doi:10.1101/181339 (2017). 902

3 Yang, L. et al. Diverse Mechanisms of Somatic Structural Variations in Human 903 Cancer Genomes. Cell 153, 919-929, 904 doi:https://doi.org/10.1016/j.cell.2013.04.010 (2013). 905

4 Maciejowski, J. & Imielinski, M. Modeling cancer rearrangement landscapes. 906 Current Opinion in Systems Biology 1, 54-61, 907 doi:https://doi.org/10.1016/j.coisb.2016.12.005 (2017). 908

5 Rausch, T. et al. DELLY: structural variant discovery by integrated paired-end 909 and split-read analysis. Bioinformatics 28, i333-i339, 910 doi:10.1093/bioinformatics/bts378 (2012). 911

6 Hadi, K. et al. Novel patterns of complex structural variation revealed across 912 thousands of cancer genome graphs. bioRxiv, 836296, doi:10.1101/836296 913 (2019). 914

7 Mitchell, T. J. et al. Timing the Landmark Events in the Evolution of Clear Cell 915 Renal Cell Cancer: TRACERx Renal. Cell 173, 611-623.e617, 916 doi:10.1016/j.cell.2018.02.020 (2018). 917

8 Glodzik, D. et al. Mutational mechanisms of amplifications revealed by analysis 918 of clustered rearrangements in breast cancers. Annals of oncology : official 919 journal of the European Society for Medical Oncology / ESMO 29, 2223-2231, 920 doi:10.1093/annonc/mdy404 (2018). 921

9 Lee, J. J. et al. Tracing Oncogene Rearrangements in the Mutational History of 922 Lung Adenocarcinoma. Cell 177, 1842-1857.e1821, 923 doi:10.1016/j.cell.2019.05.013 (2019). 924

10 Zhang, C.-Z. et al. Chromothripsis from DNA damage in micronuclei. Nature 925 522, 179-184, doi:10.1038/nature14493 (2015). 926

11 Wala, J. A. et al. Selective and mechanistic sources of recurrent rearrangements 927 across the cancer genome. bioRxiv, 187609, doi:10.1101/187609 (2017). 928

12 Maciejowski, J., Li, Y., Bosco, N., Campbell, P. J. & de Lange, T. 929 Chromothripsis and Kataegis Induced by Telomere Crisis. Cell 163, 1641-1654, 930 doi:10.1016/j.cell.2015.11.054 (2015). 931

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 43: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

43

13 Manier, S. et al. Genomic complexity of multiple myeloma and its clinical 932 implications. Nat Rev Clin Oncol 14, 100-113, doi:10.1038/nrclinonc.2016.122 933 (2017). 934

14 Barwick, B. G. et al. Multiple myeloma immunoglobulin lambda translocations 935 portend poor prognosis. Nature communications 10, 1911, doi:10.1038/s41467-936 019-09555-6 (2019). 937

15 Bolli, N. et al. Genomic patterns of progression in smoldering multiple myeloma. 938 Nature communications 9, 3363, doi:10.1038/s41467-018-05058-y (2018). 939

16 Misund, K. et al. MYC dysregulation in the progression of multiple myeloma. 940 Leukemia, doi:10.1038/s41375-019-0543-4 (2019). 941

17 Walker, B. A. et al. Translocations at 8q24 juxtapose MYC with genes that harbor 942 superenhancers resulting in overexpression and poor prognosis in myeloma 943 patients. Blood Cancer J 4, e191, doi:10.1038/bcj.2014.13 (2014). 944

18 Walker, B. A. et al. A compendium of myeloma-associated chromosomal copy 945 number abnormalities and their prognostic value. Blood 116, e56-e65, 946 doi:10.1182/blood-2010-04-279596 (2010). 947

19 Walker, B. A. et al. Identification of novel mutational drivers reveals oncogene 948 dependencies in multiple myeloma. Blood 132, 587-597, doi:10.1182/blood-2018-949 03-840132 (2018). 950

20 Maura, F. et al. Genomic landscape and chronological reconstruction of driver 951 events in multiple myeloma. Nature communications 10, 3835, 952 doi:10.1038/s41467-019-11680-1 (2019). 953

21 Walker, B. A. et al. A high-risk, Double-Hit, group of newly diagnosed myeloma 954 identified by genomic analysis. Leukemia 33, 159-170, doi:10.1038/s41375-018-955 0196-8 (2019). 956

22 Weinhold, N. et al. Clonal selection and double-hit events involving tumor 957 suppressor genes underlie relapse in myeloma. Blood 128, 1735-1744, 958 doi:10.1182/blood-2016-06-723007 (2016). 959

23 Jones, J. R. et al. Clonal evolution in myeloma: the impact of maintenance 960 lenalidomide and depth of response on the genetics and sub-clonal structure of 961 relapsed disease in uniformly treated newly diagnosed patients. Haematologica, 962 doi:10.3324/haematol.2018.202200 (2019). 963

24 Korbel, J. O. & Campbell, P. J. Criteria for inference of chromothripsis in cancer 964 genomes. Cell 152, 1226-1236, doi:10.1016/j.cell.2013.02.023 (2013). 965

25 Baca, S. C. et al. Punctuated evolution of prostate cancer genomes. Cell 153, 666-966 677, doi:10.1016/j.cell.2013.03.021 (2013). 967

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 44: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

44

26 Rustad, E. H. et al. Timing the Initiation of Multiple Myeloma. Sneak Peak (pre-968 print), doi:https://dx.doi.org/10.2139/ssrn.3409453 (2019). 969

27 Lohr, J. G. et al. Widespread genetic heterogeneity in multiple myeloma: 970 implications for targeted therapy. Cancer Cell 25, 91-101, 971 doi:10.1016/j.ccr.2013.12.015 (2014). 972

28 Maura, F. et al. Biological and prognostic impact of APOBEC-induced mutations 973 in the spectrum of plasma cell dyscrasias and multiple myeloma cell lines. 974 Leukemia 32, 1044-1048, doi:10.1038/leu.2017.345 (2018). 975

29 Walker, B. A. et al. APOBEC family mutational signatures are associated with 976 poor prognosis translocations in multiple myeloma. Nature communications 6, 977 6997, doi:10.1038/ncomms7997 (2015). 978

30 Magrangeas, F., Avet-Loiseau, H., Munshi, N. C. & Minvielle, S. Chromothripsis 979 identifies a rare and aggressive entity among newly diagnosed multiple myeloma 980 patients. Blood 118, 675-678, doi:10.1182/blood-2011-03-344069 (2011). 981

31 Rausch, T. et al. Genome sequencing of pediatric medulloblastoma links 982 catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59-71, 983 doi:10.1016/j.cell.2011.12.013 (2012). 984

32 Quigley, D. A. et al. Genomic Hallmarks and Structural Variation in Metastatic 985 Prostate Cancer. Cell 174, 758-769.e759, doi:10.1016/j.cell.2018.06.039 (2018). 986

33 Grobner, S. N. et al. The landscape of genomic alterations across childhood 987 cancers. Nature 555, 321-327, doi:10.1038/nature25480 (2018). 988

34 Nutt, S. L., Hodgkin, P. D., Tarlinton, D. M. & Corcoran, L. M. The generation of 989 antibody-secreting plasma cells. Nat Rev Immunol 15, 160-171, 990 doi:10.1038/nri3795 (2015). 991

35 Boothby, M. R., Hodges, E. & Thomas, J. W. Molecular regulation of peripheral 992 B cells and their progeny in immunity. Genes & development 33, 26-48, 993 doi:10.1101/gad.320192.118 (2019). 994

36 Mikulasova, A. et al. Microhomology-mediated end joining drives complex 995 rearrangements and over expression of MYC and PVT1 in multiple myeloma. 996 Haematologica, haematol.2019.217927, doi:10.3324/haematol.2019.217927 997 (2019). 998

37 Glodzik, D. et al. A somatic-mutational process recurrently duplicates germline 999 susceptibility loci and tissue-specific super-enhancers in breast cancers. Nat Genet 1000 49, 341-348, doi:10.1038/ng.3771 (2017). 1001

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 45: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

45

38 Walker, B. A. et al. Characterization of IGH locus breakpoints in multiple 1002 myeloma indicates a subset of translocations appear to occur in pregerminal 1003 center B cells. Blood 121, 3413-3419, doi:10.1182/blood-2012-12-471888 (2013). 1004

39 Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of 1005 the targets of focal somatic copy-number alteration in human cancers. Genome 1006 Biology 12, R41, doi:10.1186/gb-2011-12-4-r41 (2011). 1007

40 Jin, Y. et al. Active enhancer and chromatin accessibility landscapes chart the 1008 regulatory network of primary multiple myeloma. Blood 131, 2138-2150, 1009 doi:10.1182/blood-2017-09-808063 (2018). 1010

41 Cho, S. F., Anderson, K. C. & Tai, Y. T. Targeting B Cell Maturation Antigen 1011 (BCMA) in Multiple Myeloma: Potential Uses of BCMA-Based Immunotherapy. 1012 Frontiers in immunology 9, 1821, doi:10.3389/fimmu.2018.01821 (2018). 1013

42 Affer, M. et al. Promiscuous MYC locus rearrangements hijack enhancers but 1014 mostly super-enhancers to dysregulate MYC expression in multiple myeloma. 1015 Leukemia 28, 1725-1735, doi:10.1038/leu.2014.70 (2014). 1016

43 Glover, T. W., Wilson, T. E. & Arlt, M. F. Fragile sites in cancer: more than 1017 meets the eye. Nature reviews. Cancer 17, 489-501, doi:10.1038/nrc.2017.52 1018 (2017). 1019

44 Veeriah, S. et al. The tyrosine phosphatase PTPRD is a tumor suppressor that is 1020 frequently inactivated and mutated in glioblastoma and other human cancers. 1021 Proceedings of the National Academy of Sciences 106, 9435, 1022 doi:10.1073/pnas.0900571106 (2009). 1023

45 Hussain, T., Liu, B., Shrock, M. S., Williams, T. & Aldaz, C. M. WWOX, the 1024 FRA16D gene: A target of and a contributor to genomic instability. Genes, 1025 chromosomes & cancer 58, 324-338, doi:10.1002/gcc.22693 (2019). 1026

46 Patel, K. et al. FAM190A deficiency creates a cell division defect. The American 1027 journal of pathology 183, 296-303, doi:10.1016/j.ajpath.2013.03.020 (2013). 1028

47 Waters, C. E., Saldivar, J. C., Hosseini, S. A. & Huebner, K. The FHIT gene 1029 product: tumor suppressor and genome "caretaker". Cell Mol Life Sci 71, 4577-1030 4587, doi:10.1007/s00018-014-1722-0 (2014). 1031

48 Went, M. et al. Identification of multiple risk loci and regulatory mechanisms 1032 influencing susceptibility to multiple myeloma. Nature communications 9, 3707, 1033 doi:10.1038/s41467-018-04989-w (2018). 1034

49 Janz, S. et al. Germline Risk Contribution to Genomic Instability in Multiple 1035 Myeloma. Frontiers in genetics 10, doi:10.3389/fgene.2019.00424 (2019). 1036

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 46: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

46

50 Hu, G., Wei, Y. & Kang, Y. The Multifaceted Role of MTDH/AEG-1 in Cancer 1037 Progression. Clinical Cancer Research 15, 5615, doi:10.1158/1078-0432.CCR-1038 09-0049 (2009). 1039

51 Cortés-Ciriano, I. et al. Comprehensive analysis of chromothripsis in 2,658 1040 human cancers using whole-genome sequencing. bioRxiv, 333617, 1041 doi:10.1101/333617 (2018). 1042

52 Stephens, P. J. et al. Massive Genomic Rearrangement Acquired in a Single 1043 Catastrophic Event during Cancer Development. Cell 144, 27-40, 1044 doi:10.1016/j.cell.2010.11.055 (2011). 1045

53 Barlund, M. et al. Multiple genes at 17q23 undergo amplification and 1046 overexpression in breast cancer. Cancer research 60, 5340-5344 (2000). 1047

54 Bahrami, B. F., Ataie-Kachoie, P., Pourgholami, M. H. & Morris, D. L. p70 1048 Ribosomal protein S6 kinase (Rps6kb1): an update. Journal of clinical pathology 1049 67, 1019-1025, doi:10.1136/jclinpath-2014-202560 (2014). 1050

55 Chiang, C. et al. The impact of structural variation on human gene expression. 1051 Nature Genetics 49, 692, doi:10.1038/ng.3834 (2017). 1052

56 Campbell, K. S., Cohen, A. D. & Pazina, T. Mechanisms of NK Cell Activation 1053 and Clinical Activity of the Therapeutic SLAMF7 Antibody, Elotuzumab in 1054 Multiple Myeloma. Frontiers in immunology 9, 2551, 1055 doi:10.3389/fimmu.2018.02551 (2018). 1056

57 Maura, F. et al. Moving From Cancer Burden to Cancer Genomics for 1057 Smoldering Myeloma: A Review. JAMA oncology, 1058 doi:10.1001/jamaoncol.2019.4659 (2019). 1059

58 Chapman, M. A. et al. Initial genome sequencing and analysis of multiple 1060 myeloma. Nature 471, 467-472, doi:10.1038/nature09837 (2011). 1061

59 Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 1062 15-21, doi:10.1093/bioinformatics/bts635 (2013). 1063

60 Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon 1064 provides fast and bias-aware quantification of transcript expression. Nature 1065 methods 14, 417-419, doi:10.1038/nmeth.4197 (2017). 1066

61 Gerstung, M. et al. Combining gene mutation with gene expression data improves 1067 outcome prediction in myelodysplastic syndromes. Nature communications 6, 1068 5901, doi:10.1038/ncomms6901 (2015). 1069

62 Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and 1070 dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550, 1071 doi:10.1186/s13059-014-0550-8 (2014). 1072

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 47: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

47

63 Sanchez-Vega, F. et al. Oncogenic Signaling Pathways in The Cancer Genome 1073 Atlas. Cell 173, 321-337.e310, doi:10.1016/j.cell.2018.03.035 (2018). 1074

64 Seiler, M. et al. Somatic Mutational Landscape of Splicing Factor Genes and 1075 Their Functional Consequences across 33 Cancer Types. Cell Reports 23, 282-1076 296.e284, doi:https://doi.org/10.1016/j.celrep.2018.01.088 (2018). 1077

65 Ge, Z. et al. Integrated Genomic Analysis of the Ubiquitin Pathway across Cancer 1078 Types. Cell Reports 23, 213-226.e213, 1079 doi:https://doi.org/10.1016/j.celrep.2018.03.047 (2018). 1080

66 Knijnenburg, T. A. et al. Genomic and Molecular Landscape of DNA Damage 1081 Repair Deficiency across The Cancer Genome Atlas. Cell Reports 23, 239-1082 254.e236, doi:https://doi.org/10.1016/j.celrep.2018.03.076 (2018). 1083

67 Riemondy, K. A. et al. valr: Reproducible genome interval analysis in R. 1084 F1000Research 6, 1025-1025, doi:10.12688/f1000research.11997.1 (2017). 1085

68 Huo, Y., Li, S., Liu, J., Li, X. & Luo, X.-J. Functional genomics reveal gene 1086 regulatory mechanisms underlying schizophrenia risk. Nature communications 1087 10, 670, doi:10.1038/s41467-019-08666-4 (2019). 1088

1089

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 48: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

DEL

DU

PIN

VTR

A

0 100 200 300 400 500 600 700

0

50

100

0

50

100

0

50

100

0

50

100

Patients

SVs

(n)

SV classChromoplexyChromothripsisComplexDELDUPINVTemplated insTRA

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++++ +

+++++++++++++++++++++++ ++++ ++++++++++++++++++++++++++++++++++++++ +

+p = 0.0033

0.00

0.25

0.50

0.75

1.00

0 1 2 3 4 5Time (years)

PFS

prob

abilit

y

Chromothripsis + +0 1 or more

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++

+++++++++++++++++++++++++++++ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+++++ ++

p = 0.00018

0.00

0.25

0.50

0.75

1.00

0 1 2 3 4 5Time (years)

OS

prob

abilit

y

Chromothripsis + +0 1 or more

MMSET

MAFMA

FACCN

D-1CCN

D-2Nei

ther

HRDA

CCND-3

MAFB

absent1/present>1

ChromoplexyChromothripsisComplexDELDUPINVTemplated insTRA

SV class

Heatmap

B

Chromothripsisabsentpresent

Hazard ratio (95 % CI)

C DPFS OS

**

*

**

1 2 3 1 2 3

Chromoplexy

Chromothripsis

Complex

Templated ins

DEL

DUP

INV

TRA

Hazard ratio (95 % CI)

ChromothripsisTemplated insTemplated ins >2chrComplexChromoplexyt_MYCNon−canonical t_IGAPOBEC highTP53gain 1q21del FAM46Cdel CDKN2Cdel CYLDdel RB1del 13q34del TRAF3del MAXdel BIRC2/BIRC3del 4p15gain 6p22del 6q21del 6q25.3del 8p22del 12p13del 12q21del 20p12del 20q13

Figure 1

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 49: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

del F

AM46

Cde

l CD

KN2C

gain

1q2

1t_

WH

SC1

del 4

p15

gain

6p2

2de

l 6q2

5.3

del 6

q21

t_M

YCde

l 8p2

2am

p M

YCde

l BIR

Ct_

CC

ND

1de

l 12p

13de

l 12q

21t_

CC

ND

2de

l RB1

del 1

3q34

del M

AXde

l TR

AF3

t_M

AFde

l CYL

Dde

l TP5

3de

l 20p

12t_

MAF

Bde

l 20q

13de

l UTX

del CDKN2Cgain 1q21t_WHSC1

del 4p15gain 6p22

del 6q25.3del 6q21

t_MYCdel 8p22

amp MYCdel BIRCt_CCND1del 12p13del 12q21t_CCND2

del RB1del 13q34

del MAXdel TRAF3

t_MAFdel CYLDdel TP53

del 20p12t_MAFB

del 20q13del UTX

del Xq22

012581520

Co−

occu

rrenc

e

C

A

D

BMYCIGLIGKIGH

MMSE

TMA

FCC

ND1

Othe

rMY

C

Othe

rMY

C

Othe

rMY

C

Non−

IG0.00

0.25

0.50

0.75

1.00

Translocation partner

Prop

ortio

nPr

opor

tion

1.00

0.75

0.50

0.25

0.00

del C

DKN2

Cde

l FAM

46C

gain

1q2

1de

l 4p1

5de

l 6q2

1de

l 6q2

5.3

gain

6p2

2ga

in M

YCde

l 8p2

2de

l BIR

Cde

l 12p

13de

l 12q

21de

l 13q

34de

l RB1

del M

AXde

l TRA

F3de

l CYL

Dde

l TP5

3de

l 20p

12

Prop

ortio

n

1.00

0.75

0.50

0.25

0.00

MM

SET

MAF

CCND

1O

ther

MYC

Oth

erM

YC

Oth

erM

YC

Non-

IG

IGLIGKIGH Rare IG

Expr

essio

n (lo

g TP

M+1

)

0

2

4

6

8

CCND

1CC

ND2

CCND

3M

AFM

AFA

MAF

BM

MSE

TM

YC

BCL2

A1CD

40FC

HSD2

FOXO

3M

AP3K

14NF

KB1

NRAS

P2RY

2P2

RY6

PAX5

TNFR

SF13

BTN

FRSF

17TN

FRSF

13TR

IM33

CCND

1CC

ND2

CCND

3M

AFM

AFA

MAF

BM

MSE

TM

YC

CCND

1CC

ND2

CCND

3M

AFM

AFA

MAF

BM

MSE

TM

YC

del C

DKN2

Cga

in 1

q21

del 4

p15 de

l 6q2

1de

l 6q2

5.3

gain

6p2

2

gain

MYC

del B

IRC

del 1

2p13

del 1

2q21 de

l 13q

34de

l RB1 de

l MAX

del T

RAF3

del C

YLD

del T

P53

del 2

0p12

del 8

p22

del CDKN2C

del F

AM46

C

gain 1q21

del 4p15

del 6q21del 6q25.3gain 6p22

del 8p22gain MYCdel BIRC

del 12p13del 12q21

del RB1del 13q34

del MAX

del CYLD

del TRAF3

del TP53

t_MAFB

del FAM46C

del CDKN2C

gain 1q21

t_WHSC1

del 4p15

gain 6p22

del 6q25.3

del 6q21

t_MYC

del 8p22

amp MYC

del BIRC

t_CCND1

del 12p13

del 12q21

t_CCND2

del RB1

del 13q34

del MAX

del TRAF3

t_MAF

del CYLD

del TP53

del 20p12

del CDKN2Cgain 1q21t_WHSC1

del 4p15gain 6p22del 6q25.3

del 6q21t_MYC

del 8p22amp MYC

del BIRCt_CCND1del 12p13del 12q21t_CCND2

del RB1del 13q34

del MAXdel TRAF3

t_MAFdel CYLDdel TP53del 20p12

t_MAFB

01234510

Co−occurrence

01258

1520

Co-o

ccur

renc

e

del 20p12

t_MAF

t_CCND2

t_CCND1

t_MYC

t_MMSET

t_M

MSE

T

t_M

YC

t_CC

ND1 t_CC

ND2

t_M

AF

Reciprocal TRA

ChromoplexyChromothripsisComplex

Unbalanced TRA

INV

Templated ins

DUPDELUnknownWhole arm

Unclassified TRA

0.00

0.25

0.50

0.75

1.00

del C

DK

N2C

del F

AM

46C

gain

1q2

1de

l 4p1

5de

l 6q2

1de

l 6q2

5.3

gain

6p2

2am

p M

YC

del 8

p22

del B

IRC

del 1

2p13

del 1

2q21

del 1

3q34

del R

B1

del M

AX

del T

RA

F3

del C

YLD

del T

P53

del 2

0p12

del 2

0q13

prop

ortio

n

del 2

0q13

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

● ●●

●●●

● ●

●● ●

●●

●●

● ●

●●

●●

●●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

IGH IGK IGL Rare IG

CCND

1CC

ND2

CCND

3M

AFM

AFA

MAF

BM

MSE

TM

YC

CCND

1CC

ND2

CCND

3M

AFM

AFA

MAF

BM

MSE

TM

YC

CCND

1CC

ND2

CCND

3M

AFM

AFA

MAF

BM

MSE

TM

YC

BCL2

A1CD

40FC

HSD2

FOXO

3M

AP3K

14NF

KB1

NRAS

P2RY

2P2

RY6

PAX5

TNFR

SF13

BTN

FRSF

17TN

FSF1

3TR

IM33

0

2

4

6

8

Expr

essio

n (lo

g(TP

M+1

))

del 20q13

del Xq22del UTX

del 2

0q13

del U

TX

t_M

AFB

Reciprocal TRA

ChromoplexyChromothripsisComplexUnbalanced TRA

Templated ins

Figure 2

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 50: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

0MB 30MB 60MB90MB

120MB150MB

180MB210M

B240MB

2

0MB

30MB

60MB

90MB

120M

B

150M

B

180MB

4

0MB

30MB

60MB

90MB

120MB

150MB7

0MB30MB60MB

90MB

120MB

8

0MB

30MB

60MB

90MB

14

0MB

30MB

60MB

90MB

16

0MB

30MB

60MB

170MB30MB

60MB

18

0MB30M

B190MB

30MB 60MB20A B

C

MMSET

IGH

4p15

MYC

MAF

20p12

TP53

CYLD

MAXTRAF3

0MB 10MB20MB

30MB40MB

50MB

60MB

70MB

80MB

90MB

100M

B

110M

B

120MB

130MB

11

0MB

10MB

20MB

30MB40MB50MB60MB

70MB

80MB

90MB

100MB

110MB

13

0MB

20MB

30MB

40MB

50MB

60MB

70MB

80MB

90MB

10MB

100MB

14RB1

BIRC2BIRC3

TRAF3

0MB10MB 20MB 30MB 40MB 50MB60MB

70MB80MB

90MB100M

B110M

B120M

B130M

B

140M

B

8

0MB

10MB

20MB

30MB

40MB

50MB

60MB

70MB

80MB

90MB100MB110MB120MB

130MB

110MB

10MB

20MB

30MB

40MB50MB

60MB

70MB

80MB

90MB

100MB

14

0MB10M

B20MB

30MB

40MB50MB60MB

70MB

18

0MB10

MB20M

B 30MB40MB50MB22

IGH

CCND1

MYC

DChromosome 6

DEL DUP INV TRASV classesLossGainCopy number

IGLC

opy

num

ber

TXN

DC

5BM

P6

PRD

M1

SCAF

8AR

ID1B

MMRF_1740_1_BM − Chromosome 6

024

Cop

y N

umbe

r

MMRF_2266_1_BM − Chromosome 6

0246

Cop

y N

umbe

r

1 141519

Figure 3

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 51: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

Chromosome 1

159 mb

160 mb

161 mb

162 mb

0

2

4

6

8

10

12

Cum

ulat

ive

CN

V

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X

0

2

4

6

SV in

ter−

dist

(log

10)

120

80

40

0

40

80

no. p

atie

nts

Hotspot

GISTIC

A

Chromosome 16

10 mb

11 mb

12 mb

13 mb

14 mb

0

5

10

Cum

ulat

ive

CN

V

Chromosome 9

21 mb

22 mb

23 mb

24 mb

0

5

10

15

Cum

ulat

ive

CN

V

Cum

ulat

ive

copy

num

ber

H3K27AcGenes

B D

MTAP

CDKN

2ACDKN

2B

GISTIC peak

SV b

reak

poin

t den

sity

SLAM

F1

CD

KN2A

CD

KN2B

Templated insDUP ChromothripsisChromoplexyComplexDEL INVCopy number LossGain SVs TRA

TNFR

SF17

TNFR

SF17 SLAM

F1CD

48SLAM

F7LY9

SLAMF1

CD48

SLAMF7

LY9

SLAM

F1CD

48SLAM

F7LY9

CD

48SL

AMF7

LY9

Chr 16 Chr 1 Chr 9C

TNFR

SF17

10

5

0 0

4

8

12

0

5

10

15

Chromosome 9

21 mb

22 mb

23 mb

24 mb

0

5

10

15

Cum

ulat

ive

CN

V

Figure 4(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 52: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

CDKN

2C /

FAF1

FAM

46C

CD48

/ SL

AMF7

BTG

2 / F

MO

D

SPTB

N1PC

BP1

BCL2

L11

SP3

CASP

8 / C

FLAR

SP14

0KA

T2B

PRKC

D

RASA

2 / Z

BTB3

8SE

C62

/ SAM

D7TB

L1XR

1W

HSC1

/ FG

FR3

FBXW

7

SOX3

0CP

EB4

NSD1

/ FG

FR4

IRF4

BMP6

/ TX

NDC5

NEDD

9

ATG

5 / P

RDM

1FO

XO3

SCAF

8G

LCCI

1HN

RNPA

2B1

GLI

3

TBXA

S1 /

HIPK

2

MTD

H

NSM

CE2

/ MYC

MYC

PTPR

DCD

KN2A

/ CD

KN2B

ECHD

C3

NFKB

2

CCND

1P2

RY2

/ FCH

SD2

BIR

C2

POU2

AF1

CCND

2LT

BR /

LAG

3AS

UNDE

NND5

B

PAG

M5

/ PO

LERB

1M

AXTS

HR /

CEP1

28TR

AF3

KLF1

3BU

B1B

USP3

/ HE

RC1

TNFR

SF17

CYLD

HERP

UD1

MAF

/ W

WOX

TP53

MAP

3K14

KLF2

CD40

PTPN

1CH

EK2

/ XBP

1

KDM

6A

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20 22 X

p36.

13p3

2.3

p13.

3p1

2q2

1.1

q23.

2;q2

3.3

q32.

1q4

3p1

3.3

p16.

2p1

4q1

3q3

1.1

q33.

1q3

7.1

p24.

3p2

1.1

p14.

2p1

2.2

q23

q26.

2q2

6.32

p16.

3q2

2.1;

q22.

2;q2

2.3

q31.

1;q3

1.21

q31.

3q3

3.1

q11.

2q2

2.2

q33.

3q3

5.1;

q35.

2q3

5.3

p24.

3p2

5.3

p24.

2p2

2.1;

p21.

33;p

21.3

2q1

2;q1

3q2

1q2

1_2

q25.

2p2

1.3

p15.

3;p1

5.2

p14.

1q1

1.22 q3

4p2

3.3;

p23.

2q2

2.1

q22.

2;q2

2.3

q22.

3q2

4.13

q24.

21p2

1.3

p24.

1;p2

3p1

4q2

5.2

q24.

32q2

2.1

q13.

3q1

3.4

q14.

1q2

2.3;

q23.

1p1

3.32

p13.

31p1

1.23

p11.

21q2

4.31

q24.

33q1

4.2

q23.

2;q2

3.3

q31.

1q3

2.31

;q32

.32

q11.

2;q1

2q1

3.2;

q13.

3q1

4;q1

5.1

q21.

2;q2

1.3

q22.

2;q2

2.31

q24.

1;q2

4.2

q25.

3p1

3.13

;p13

.12

q12.

1q1

2.2;

q13;

q21

q23.

1;q2

3.2

p13.

2p1

3.1

q21.

2;q2

1.31

q21.

31q2

5.1

p12

p13.

2p1

3.12

;p13

.11

q13.

12;q

13.1

3p1

2.1;

p11.

23q1

1.22

;q11

.23

q13.

12q1

3.13

q12.

1p2

2.31

p11.

3q2

1.33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20 22 X

p36.13

p32.3

p13.3

p12

q21.1

q23.2;q23.3

q32.1

q43

p13.3

p16.2

p14

q13

q31.1

q33.1

q37.1

p24.3

p21.1

p14.2

p12.2

q23

q26.2

q26.32

p16.3

q22.1;q22.2;q22.3

q31.1;q31.21

q31.3

q33.1

q11.2

q22.2

q33.3

q35.1;q35.2

q35.3

p24.3

p25.3

p24.2

p22.1;p21.33;p21.32

q12;q13

q21

q21_2

q25.2

p21.3

p15.3;p15.2

p14.1

q11.22 q34

p23.3;p23.2

q22.1

q22.2;q22.3

q22.3

q24.13

q24.21

p21.3

p24.1;p23

p14

q25.2

q24.32

q22.1

q13.3

q13.4

q14.1

q22.3;q23.1

p13.32

p13.31

p11.23

p11.21

q24.31

q24.33

q14.2

q23.2;q23.3

q31.1

q32.31;q32.32

q11.2;q12

q13.2;q13.3

q14;q15.1

q21.2;q21.3

q22.2;q22.31

q24.1;q24.2

q25.3

p13.13;p13.12

q12.1

q12.2;q13;q21

q23.1;q23.2

p13.2

p13.1

q21.2;q21.31

q21.31

q25.1

p12

p13.2

p13.12;p13.11

q13.12;q13.13

p12.1;p11.23

q11.22;q11.23

q13.12

q13.13

q12.1

p22.31

p11.3

q21.33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20 22 X

p36.13

p32.3

p13.3

p12

q21.1

q23.2;q23.3

q32.1

q43

p13.3

p16.2

p14

q13

q31.1

q33.1

q37.1

p24.3

p21.1

p14.2

p12.2

q23

q26.2

q26.32

p16.3

q22.1;q22.2;q22.3

q31.1;q31.21

q31.3

q33.1

q11.2

q22.2

q33.3

q35.1;q35.2

q35.3

p24.3

p25.3

p24.2

p22.1;p21.33;p21.32

q12;q13

q21

q21_2

q25.2

p21.3

p15.3;p15.2

p14.1

q11.22 q34

p23.3;p23.2

q22.1

q22.2;q22.3

q22.3

q24.13

q24.21

p21.3

p24.1;p23

p14

q25.2

q24.32

q22.1

q13.3

q13.4

q14.1

q22.3;q23.1

p13.32

p13.31

p11.23

p11.21

q24.31

q24.33

q14.2

q23.2;q23.3

q31.1

q32.31;q32.32

q11.2;q12

q13.2;q13.3

q14;q15.1

q21.2;q21.3

q22.2;q22.31

q24.1;q24.2

q25.3

p13.13;p13.12

q12.1

q12.2;q13;q21

q23.1;q23.2

p13.2

p13.1

q21.2;q21.31

q21.31

q25.1

p12

p13.2

p13.12;p13.11

q13.12;q13.13

p12.1;p11.23

q11.22;q11.23

q13.12

q13.13

q12.1

p22.31

p11.3

q21.33

0

100

200

13 22

Enhancer

Gain/UpLoss/DownBoth

ExpressionCNA

Even

ts (n

)

Templated insTRA DUPChromothripsisChromoplexyComplexDELA

B

Prop

INV

C

●●

●●

●●

●●

● ●●

●●

●●

●●

● ●

0

25

50

75

100

0 25 50 75Deletions (%)

Tem

plat

ed in

serti

ons,

tran

sloc

atio

ns a

nd d

uplic

atio

ns (%

)

Hotspot class●

gainlossunknown

no. events●

1050100250

Events (n)

Gain of function

UnknownLoss of function

Hotspot class

DIAPH2

CNTN5WWOX

FAF1WHSC1

CCSER1

NSMCE2AUTS2

BCL2L11

PTPRD

RB1 TRAF3FHIT TBL1XR1SP140

MAX

0

10

20

30

40

−1.0 −0.5 0.0 0.5 1.0 1.5Replication Time (normalized)

Dele

tion

brea

kpoi

nts

with

in g

ene

(no.

of p

atie

nts)

Size (Mb)0.1

1.0

2.0

Gene in validated fragile site

Gene annotations

Tumor suppressor & fragileTumor suppressor gene

Neither

Size (Mb)●

●●

●●

●●

●●●

● ●

●●

●●

●●

● ●

●●

● ●

0

25

50

75

100

0 25 50 75Deletions (%)

Tem

plat

ed in

serti

ons,

trans

loca

tions

and

tand

em d

uplic

atio

ns (%

)

no. events●

1050100

Hotspot class●

gainlossunknown

FAM46CTNFRSF17

ATG5/PRDM1CCND1

IRF4

KDM6A

CD40

MAF/WWOX

CSMD1

BCL2L11SP140

SCAF8

PTPRDNFKB2

CHEK2/XBP1

Tand

em d

uplic

atio

ns, t

empl

ated

in

serti

ons

and

simpl

e tra

nslo

catio

ns (%

)

Deletions (%)

0

10

20

30

40

−1.0 −0.5 0.0 0.5 1.0 1.5Replication Time (normalized)

Del

etio

n br

eakp

oint

with

in g

ene

(num

ber o

f pat

ient

s)

Size (Mb)0.1

1.0

2.0

PTPRD

DIAPH2CCSER1

BCL2L11

WWOXCNTN5

FHIT

MAX

TRAF3RB1

SP140AUTS2PDE4DIMMP2L

KDM6A

Replication Time (normalized)

Dele

tion

brea

kpoi

nt w

ithin

gen

e (n

umbe

r of p

atie

nts)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20 22 Xp36.13

p32.3

p13.3

p12

q21.1

q23.2;q23.3

q32.1

q43

p13.3

p16.2

p14

q13

q31.1

q33.1

q37.1

p24.3

p21.1

p14.2

p12.2

q23

q26.2

q26.32

p16.3

q22.1;q22.2;q22.3

q31.1;q31.21

q31.3

q33.1

q11.2

q22.2

q33.3

q35.1;q35.2

q35.3

p24.3

p25.3

p24.2

p22.1;p21.33;p21.32

q12;q13

q21

q21_2

q25.2

p21.3

p15.3;p15.2

p14.1

q11.22 q34

p23.3;p23.2

q22.1

q22.2;q22.3

q22.3

q24.13

q24.21

p21.3

p24.1;p23

p14

q25.2

q24.32

q22.1

q13.3

q13.4

q14.1

q22.3;q23.1

p13.32

p13.31

p11.23

p11.21

q24.31

q24.33

q14.2

q23.2;q23.3

q31.1

q32.31;q32.32

q11.2;q12

q13.2;q13.3

q14;q15.1

q21.2;q21.3

q22.2;q22.31

q24.1;q24.2

q25.3

p13.13;p13.12

q12.1

q12.2;q13;q21

q23.1;q23.2

p13.2

p13.1

q21.2;q21.31

q21.31

q25.1

p12

p13.2

p13.12;p13.11

q13.12;q13.13

p12.1;p11.23

q11.22;q11.23

q13.12

q13.13

q12.1

p22.31

p11.3

q21.33

0.0

0.5

1.0

Figure 5

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 53: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

MYC IG

LIG

HCC

ND1

BMP6

/ TX

NDC5

chr2

0q11

KLF2

MAP

3K14

P2RY

2 / F

CHSD

2FB

XW7

HNRN

PA2B

1SO

X30

IGK

PRKC

DFA

M46

CFO

XO3

chr6

p21−

22CC

ND2

CD40

chr1

7q21

chr1

3q14

.2SE

C62

/ SAM

D7TN

FRSF

17ch

r11q

14.1

chr1

7q25

.1IR

F4BT

G2

/ FM

OD

CD48

/ SL

AMF7

chr1

4q23

chr1

5q25

.3ch

r2p1

4CP

EB4

ECHD

C3LT

BR /

LAG

3ch

r15q

24

IGLIGH

CCND1BMP6 / TXNDC5

chr20q11KLF2

MAP3K14P2RY2 / FCHSD2

FBXW7HNRNPA2B1

SOX30IGK

PRKCDFAM46CFOXO3

chr6p21−22CCND2

CD40chr17q21

chr13q14.2SEC62 / SAMD7

TNFRSF17chr11q14.1chr17q25.1

IRF4BTG2 / FMOD

CD48 / SLAMF7chr14q23

chr15q25.3chr2p14CPEB4

ECHDC3LTBR / LAG3

chr15q24POU2AF1

015102040

Co−o

ccur

renc

e

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X40

20

0

20

40

no. p

atie

nts

Copy

number

ENSEMBL

MYC

128.7 mb

128.8 mb

128.9 mb

129 mb

129.1 mb

129.2 mb

129.3 mb

129.4 mb

Copy

number

ENSEMBL NEIL1

MAN2C1

SIN3A

75.1 mb

75.2 mb

75.3 mb

75.4 mb

75.5 mb

75.6 mb

75.7 mb

CB

A

Copy number gain (cumulative)Templated insertions

n =

100

n =

201

n =

61

n =

111

CCND

1

MYC IG

H

IGL

BMP6

FAM

46C

CCND

2

MAP

3K14

KLF2

FBXW

7

FOXO

3

Copy

number

ENSEMBL

BCR

22.8 mb

22.9 mb

23 mb

23.1 mb

23.2 mb

23.3 mb

23.4 mb

23.5 mb

23.6 mb

23.7 mb

23.8 mb

Chr 8q24 - MYC

Chr 15q24

Chr 22q11 - IGL

H3K27ac

CNA

Genes

TNFR

SF17

PRKC

D

MYC IGLIGH

CCND1

BMP6|TXNDC5

chr20q11.22;q11.23KLF2

MAP3K14

P2RY2|FCHSD2

FBXW7

HNRNPA2B1

SOX30IGK

PRKCD

FAM46C

FOXO3

chr6p22.1;p21.33;p21.32

CCND2CD40

chr17q21.2;q21.31

chr13q14.2

SEC62|SAMD7

TNFRSF17

chr11q14.1

chr17q25.1IRF4

BTG2|FMOD

CD48|SLAMF7

chr14q23.2;q23.3

chr15q25.3

chr2p14

CPEB4

ECHDC3

LTBR|LAG3

MAN2C1

POU2AF1

MYCIGLIGH

CCND1BMP6|TXNDC5

chr20q11.22;q11.23KLF2

MAP3K14P2RY2|FCHSD2

FBXW7HNRNPA2B1

SOX30IGK

PRKCDFAM46CFOXO3

chr6p22.1;p21.33;p21.32CCND2CD40

chr17q21.2;q21.31chr13q14.2

SEC62|SAMD7TNFRSF17chr11q14.1chr17q25.1

IRF4BTG2|FMOD

CD48|SLAMF7chr14q23.2;q23.3

chr15q25.3chr2p14CPEB4

ECHDC3LTBR|LAG3MAN2C1POU2AF1

0 1 5 10 20 40Co−occurrence

0 1 5 10 20 40Co-occurrence

Copy number loss (cumulative)

SLAM

F7

H3K27ac

CNA

Genes

H3K27ac

CNA

GenesMYC IG

LIG

HCC

ND1

BMP6

/ TX

NDC5

chr2

0q11

KLF2

MAP

3K14

P2RY

2 / F

CHSD

2FB

XW7

HNRN

PA2B

1SO

X30

IGK

PRKC

DFA

M46

CFO

XO3

chr6

p21−

22CC

ND2

CD40

chr1

7q21

chr1

3q14

.2SE

C62

/ SAM

D7TN

FRSF

17ch

r11q

14.1

chr1

7q25

.1IR

F4BT

G2

/ FM

OD

CD48

/ SL

AMF7

chr1

4q23

chr1

5q25

.3ch

r2p1

4CP

EB4

ECHD

C3LT

BR /

LAG

3ch

r15q

24

IGLIGH

CCND1BMP6 / TXNDC5

chr20q11KLF2

MAP3K14P2RY2 / FCHSD2

FBXW7HNRNPA2B1

SOX30IGK

PRKCDFAM46CFOXO3

chr6p21−22CCND2

CD40chr17q21

chr13q14.2SEC62 / SAMD7

TNFRSF17chr11q14.1chr17q25.1

IRF4BTG2 / FMOD

CD48 / SLAMF7chr14q23

chr15q25.3chr2p14CPEB4

ECHDC3LTBR / LAG3

chr15q24POU2AF1

015102040

Co−o

ccur

renc

e

Figure 6

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 54: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

0

10

20

30

1 2 3 4 5 6 7 8 11 13 14 15 16 17 19 20 22

−5

0

5

10

>5 copies normal

Gen

e ex

pres

sion

(Z−s

core

)

0

5

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X15

10

5

0

5

10

15no

. pat

ient

s

B

A

Copy number gains (cumulative)

DCopy number loss (cumulative)Chromothripsis

Patie

nts

(n)

INVDUP

Foca

l gai

n >

5 co

pies

(n)

C

TRADEL Complex No SV

Templated insChromothripsisINVDUP

MAP

3K14

SKAP

1IT

GA3

RAD

51C

10

21

34

6789

Chromosome 17

Cop

y nu

mbe

r

Gen

e ex

pres

sion

(Z-s

core

)

MMRF_2330_1_BM − Chromosome 17

0123456789

1011

Copy

Num

ber

MAP3K

14SK

AP1

ITGA

3RA

D51C

p < 0.0001

Figure 7

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint

Page 55: Revealing the impact of recurrent and rare structural …...2019/12/18  · 9 Keats16, Peter J Campbell5, Gareth J Morgan3, Ola Landgren1 and Francesco Maura1 10 11 1 Myeloma Service,

0.00

0.25

0.50

0.75

1.00

Tem

plat

ed in

sCh

rom

oplex

yCh

rom

othr

ipsis

Unba

lanc

ed T

RARe

cipro

cal T

RACo

mpl

exUn

class

ified

TRA

INV

DUP

DEL

Prop

ortio

n

D

A

C

Canonical TRA

10.80.60.40.20

Canonical TR

A

Recurrent C

NV

SV hotspot

GISTIC

Rare SV

0 0.2

0.4

0.6

0.8

1

Recurrent CNASV hotspotGISTICRare SV

SV breakpoints (prop)

Canonical TRA Recurrent CNA SV hotspot GISTIC Rare SV

Chr

omot

hrip

sis

Tem

plat

ed in

s>2

chr

omos

omes

B

Recurrent CNACanonical TRA

SV hotspotGISTICRare SV

Canonical TRA

Recurrent CNV

SV hotspot

GISTIC

Rare SV0

0.2

0.4

0.6

0.8

1

Canonical TRA

Recurrent CNV

SV hotspot

GISTIC

Rare SV0

0.2

0.4

0.6

0.8

1

Canonical TRARecurrent CNASV hotspotGISTICRare SV

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

Unb

alan

ced

TRA

Rec

ipro

cal T

RA

Com

plex

Unc

lass

ified

TR

A

INV

DU

PD

EL

1.0

0.75

0.50

0.25

0.00

Prop

ortio

n

● ●

● ●

●●

● ●●

● ●

Gene body 100 Kb flanking 1 Mb flanking

Up

Dow

n

Chr

omop

lexy

Chr

omot

hrip

sis

Com

plex

DEL

DU

PIN

VTe

mpl

ated

ins

TRA

Chr

omop

lexy

Chr

omot

hrip

sis

Com

plex

DEL

DU

PIN

VTe

mpl

ated

ins

TRA

Chr

omop

lexy

Chr

omot

hrip

sis

Com

plex

DEL

DU

PIN

VTe

mpl

ated

ins

TRA

0.1

1.0

10.0

0.1

1.0

10.0Fo

ld e

nric

hmen

t (lo

g10)●

●●

● ●

●●

●●

● ●

● ●●

● ●

●● ●

●●

Gene body 100 Kb flanking 1 Mb flanking

Up

Dow

n

Chr

omop

lexy

Chr

omot

hrip

sis

Com

plex

DEL

DU

PIN

VTe

mpl

ated

ins

TRA

Chr

omop

lexy

Chr

omot

hrip

sis

Com

plex

DEL

DU

PIN

VTe

mpl

ated

ins

TRA

Chr

omop

lexy

Chr

omot

hrip

sis

Com

plex

DEL

DU

PIN

VTe

mpl

ated

ins

TRA

0.1

1.0

10.0

0.1

1.0

10.0

Fold

enr

ichm

ent (

log1

0)

10.0

1.0

0.110.0

1.0

0.1

Fold

enr

ichm

ent o

f out

liers

(log

10)

10.0

1.0

0.110.0

1.0

0.1

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

TRA

Com

plex INV

DU

PD

EL

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

TRA

Com

plex INV

DU

PD

EL

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

TRA

Com

plex INV

DU

PD

EL

Dow

n

Dow

n

Up

Up

1 Mb flanking100 Kb flankingGene body

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

TRA

Com

plex INV

DU

PD

EL

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

TRA

Com

plex INV

DU

PD

EL

Chr

omop

lexy

Tem

plat

ed in

s

Chr

omot

hrip

sis

TRA

Com

plex INV

DU

PD

EL

1 Mb flanking100 Kb flankingGene body

Recurrent SVs Rare SVs

Fold

enr

ichm

ent o

f out

liers

(log

10)

Figure 8

(which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprintthis version posted December 19, 2019. . https://doi.org/10.1101/2019.12.18.881086doi: bioRxiv preprint