Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa...

28
1 Modeling and docking antibody structures with Rosetta Brian D. Weitzner, 1* Jeliazko R. Jeliazkov, 2* Sergey Lyskov, 1* Nicholas Marze, 1 Daisuke Kuroda, 1,3 Rahel Frick, 4 Naireeta Biswas, 1 and Jeffrey J. Gray 1,5,6,7 1 Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA. 2 T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA. 3 Department of Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation, Department of Biosciences, University of Oslo, Oslo, Norway; Centre for Immune Regulation, Department of Immunology, Oslo University Hostpital Rikshospitalet, University of Oslo, Oslo, Norway. 5 Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, USA. 6 Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, Maryland, USA. 7 Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland, USA. Correspondence should be addressed to J.J.G. ([email protected]). * Equal contribution authors ABSTRACT We describe Rosetta-based computational protocols for predicting the three-dimensional structure of an antibody from sequence and then docking the antibody–antigen complexes. Antibody modeling leverages canonical loop conformations to graft large segments from experimentally-determined structures as well as (1) energetic calculations to minimize loops, (2) docking methodology to refine the V L –V H relative orientation, and (3) de novo prediction of the elusive CDR H3 loop. To alleviate model uncertainty, antibody–antigen docking resamples CDR loop conformations and can use multiple models to represent an ensemble of conformations for the antibody, the antigen or both. These protocols can be run fully- automated via the ROSIE web server or manually on a computer with user control of individual steps. For best results, the protocol requires roughly 2,500 CPU-hours for antibody modeling and 250 CPU-hours for antibody–antigen docking. Both tasks can be completed in under a day by using public supercomputers. INTRODUCTION The vertebrate adaptive immune system is capable of promoting cells to degranulate or phagocytose nearly any foreign pathogen by producing immunoglobulin G (IgG) proteins (antibodies) that recognize a specific region (epitope) of a pathogenic molecule (antigen). The ability to bind diverse antigens requires a diverse population of antibodies, which is achieved through complex processes in bone marrow and lymphatic tissues, namely V(D)J recombination and somatic hypermutation. The diversity of antibodies is astonishing; the size of the theoretical naïve antibody repertoire is estimated to be > 10 13 in humans 1 . In addition to their biological importance, antibodies are routinely used in biotechnology as probes and diagnostics, and there are dozens of antibodies approved as therapeutics 2 . . CC-BY-NC 4.0 International license certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which was not this version posted August 16, 2016. . https://doi.org/10.1101/069930 doi: bioRxiv preprint

Transcript of Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa...

Page 1: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

1

ModelinganddockingantibodystructureswithRosettaBrianD.Weitzner,1*JeliazkoR.Jeliazkov,2*SergeyLyskov,1*NicholasMarze,1DaisukeKuroda,1,3RahelFrick,4NaireetaBiswas,1andJeffreyJ.Gray1,5,6,7

1DepartmentofChemicalandBiomolecularEngineering,JohnsHopkinsUniversity,Baltimore,Maryland,USA.2T.C.JenkinsDepartmentofBiophysics,JohnsHopkinsUniversity,Baltimore,Maryland,USA.3DepartmentofAnalyticalandPhysicalChemistry,ShowaUniversitySchoolofPharmacy,Tokyo142-8555,Japan.4CentreforImmuneRegulation,DepartmentofBiosciences,UniversityofOslo,Oslo,Norway;CentreforImmuneRegulation,DepartmentofImmunology,OsloUniversityHostpitalRikshospitalet,UniversityofOslo,Oslo,Norway.5PrograminMolecularBiophysics,JohnsHopkinsUniversity,Baltimore,Maryland,USA.6InstituteforNanoBioTechnology,JohnsHopkinsUniversity,Baltimore,Maryland,USA.7SidneyKimmelComprehensiveCancerCenter,JohnsHopkinsSchoolofMedicine,Baltimore,Maryland,USA.CorrespondenceshouldbeaddressedtoJ.J.G.([email protected]).*Equalcontributionauthors

ABSTRACT

WedescribeRosetta-basedcomputationalprotocolsforpredictingthethree-dimensionalstructureofanantibodyfromsequenceandthendockingtheantibody–antigencomplexes.Antibodymodelingleveragescanonicalloopconformationstograftlargesegmentsfromexperimentally-determinedstructuresaswellas(1)energeticcalculationstominimizeloops,(2)dockingmethodologytorefinetheVL–VHrelativeorientation,and(3)denovopredictionoftheelusiveCDRH3loop.Toalleviatemodeluncertainty,antibody–antigendockingresamplesCDRloopconformationsandcanusemultiplemodelstorepresentanensembleofconformationsfortheantibody,theantigenorboth.Theseprotocolscanberunfully-automatedviatheROSIEwebserverormanuallyonacomputerwithusercontrolofindividualsteps.Forbestresults,theprotocolrequiresroughly2,500CPU-hoursforantibodymodelingand250CPU-hoursforantibody–antigendocking.Bothtaskscanbecompletedinunderadaybyusingpublicsupercomputers.

INTRODUCTION

ThevertebrateadaptiveimmunesystemiscapableofpromotingcellstodegranulateorphagocytosenearlyanyforeignpathogenbyproducingimmunoglobulinG(IgG)proteins(antibodies)thatrecognizeaspecificregion(epitope)ofapathogenicmolecule(antigen).Theabilitytobinddiverseantigensrequiresadiversepopulationofantibodies,whichisachievedthroughcomplexprocessesinbonemarrowandlymphatictissues,namelyV(D)Jrecombinationandsomatichypermutation.Thediversityofantibodiesisastonishing;thesizeofthetheoreticalnaïveantibodyrepertoireisestimatedtobe>1013inhumans1.Inadditiontotheirbiologicalimportance,antibodiesareroutinelyusedinbiotechnologyasprobesanddiagnostics,andtherearedozensofantibodiesapprovedastherapeutics2.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 2: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

2

Next-generationsequencingtechniqueshaveenabledrapiddeterminationoflargenumbersofantibodysequences1.Alimitationoftheseapproachesisthatnoinformationaboutthespecificatomiccontactsbetweentheantibodyandantigencanbegleanedfromthesedatasets.Atomicdetailisrequiredtoconsiderspecificantibody–antigeninteractions,forexample,inordertodeveloptherapeuticantibodiesorvaccinesthataremimeticsofextremelyinfectiousantigens3.Althoughthereareexperimentalmethodscapableofgeneratingstructuralmodelsinatomicdetail(X-raycrystallography,NMR,neutrondiffraction,cryo-EM),notallproteinstructurescanbedeterminedwiththesemethods,andlimitedresourcesmakeitimpossibletodeterminethestructuresofallofthesequencesidentifiedinhigh-throughputsequencingexperiments.Tobridgethesequence–structuregap,onemustemploycomputationalstructurepredictionmethods.Perhapsmoreimportantly,structurepredictionmethodsareusefulindiagnosticsanddrugdiscoverytodefineepitopesandhelpinferbiologicalortherapeuticmechanisms.

Thefunctionofanantibodyarisesfromitsthree-dimensionalstructure.TheIgGisoform,themostcommontypeofnaturallyoccurringantibodies,consistsoftwoidenticalsetsofheavyandlightchainsarrangedintoa“Y”shape,withthefourpolypeptidechainsjoinedbydisulfidelinkages.Theheavychaincontainsfourdomains,threeadjacentconstantdomains(CH1,CH2,CH3)andonevariabledomain(VH),andthelightchainconsistsofasingleconstantdomain(CL)andavariabledomain(VL).TheCH1andVHdomainsinteractwiththeCLandVLdomainstoformtheantigen-bindingfragment(Fab)orthe“arms”oftheY.WithintheFab,bothvariabledomainsaredirectedawayfromtheremainingheavychainconstantdomainsandmakeupthevariablefragment(FV).AtthetipoftheFVarethreecomplementaritydeterminingregion(CDR)loopsoneachchain(CDRL1–3andCDRH1–3)thatformtheregionoftheantibody,calledtheparatope,thatrecognizesitstarget.ThisFvstructureiscommontootherantibodyisoforms(IgA,IgE,etc.).

Antibodyhomologymodeling

TheFVisthefocalpointoftherecombinationandhypermutationevents;assuch,theprimarydifferenceamongantibodiesistheconformation,structuralcontext,andchemicalidentityoftheirCDRloops.Forthisreason,antibodystructurepredictionmethodsfocusonmodelingtheFV.TheFVcanbesplitintotworegions:frameworkregions,andCDRloops.Theframeworkregionshaveahighdegreeofstructuralconservation,makingitpossibletogenerateaccuratemodelsofframeworkregionsfromtemplatestructures.

Similarly,analysisofantibodycrystalstructureshasrevealedthatfiveofthesixCDRloops(CDRL1–3,H1,H2)adoptalimitednumberofdistinctstructures,referredtoascanonicalloopconformations4.ThecanonicalconformationofaparticularCDRloopcantypicallybeidentifiedfromitslengthandsequence.Liketheframeworkregions,theCDRsL1–3,H1,andH2arealsomodeledusingtemplatestructures.

TheremainingCDRloop,H3,doesnotadoptcanonicalconformationsandmustbemodeleddenovo.Additionally,theH3loopliesattheinterfaceofthetwodomains(VHandVL)andcaninteractwithresiduesoneitherchain.Toaccountfortheseinteractionsaswellastheoverallgeometryoftheparatope,theVL–VHorientationisoptimizedduringH3modeling.Accurately

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 3: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

3

modelingCDRH3andtheVL–VHorientationaretypicallythemostchallengingaspectsofantibodystructureprediction.

Protein–proteindocking

Whileaccuratepredictionsofunboundantibodystructuresareinformative,theyarevoidofanimportantbiologicalcontext:theantibody–antigen(Ab–Ag)interaction.High-resolutionstructuresofAb–Agcomplexesgiveinsighttothemolecularmechanismbywhichantibodiesfunction,anecessityforrationaldesignofvaccinesorantibodytherapeutics.StructuresofAb–Agcomplexescanbedeterminedthroughexperimentalmethods,however,justaswithunboundantibodies,thesemethodsarelimitedbytheirthroughputandexpenseandarenotviableforallproteins.Whenexperimentalmethodscannotbeusedtodeterminecomplexstructures,computationalprotein–proteininterfaceprediction(docking)providesanalternativeapproach.

Ingeneral,computationaldockingapproachesstrivetosampleallpossibleinteractionsbetweentwoproteinstodiscernthebiologically-relevantinteraction.Predictingaprotein–proteininteractiondenovoischallengingduetothesheernumberofpossibledockedconformations.However,thesamplespacecanbemadetractablewithinformationabouttheinteraction.InthecaseofAb–Aginteractions,thesearchspaceislimitedbecausetheantibodyparatope,comprisedofthesixCDRloops,isthebindingsiteforthecognateantigenepitope.

TheRosettaSnugDockalgorithmleveragestheinformationabouttheflexibleand/oruncertainregionsoftheantibodytoperformrobustAb–Agdocking5.SnugDocksimulatestheinduced-fitmechanismthroughsimultaneousoptimizationofseveraldegreesoffreedom.Itperformsrigid-bodydockingofthemulti-body(VL–VH)–Agcomplex,aswellasre-modelingoftheCDRH2andH3loops,thelatterofwhichtypicallycontributesapluralityofatomiccontactstotheAb–Aginteraction.SnugDockcanalsosimulateconformerselectionbyswappingeithertheantibodyortheantigenwithanothermemberofapre-generatedstructuralensemble.BecauseSnugDocksamplesmostoftheconformationspaceavailabletoantibodyparatopes,itcanrefineantibodyhomologymodelswithinaccuraciesinthedifficult-to-predictVL–VHorientationandCDRH3loop.

Whendockinghomologymodels,itisbestifthereisexperimentalevidencetosuggestthegenerallocationoftheepitope(within~10Å,approximatelythecorrectsideoftheantigendomain),andinthisprotocolpaper,wedescribethelocaldockingprocedureindetail.Ifnoinformationisavailableabouttheepitope,thereareseveralprogramsthatperformglobaldockingorepitopeprediction6.Inparticular,therearetwofast-Fouriertransform(FFT)rigid-bodydockingapproachesthatimplementantibody-specificenergypotentials:PIPER7withtheantibody-ADARSpotential8,andZDOCK9withtheAntibodyi-Patchpotential10.FFTrigid-bodyapproachesarefast,buttheycannotaccountforantibodymotionsuponantigenbindingorcompensateforerrorsintheinitialhomologymodel;SnugDockistheonlyflexible-backboneantibodydockingmethod.Itcanprovideaglobal-antigendockingalternativebutitisslowerand,likeothers,canproducefalse-positiveepitopepredictions5.Forlocaldocking,SnugDockhasbeendemonstratedtoproducehigh-qualitymodelswhenusinganantibodyhomology

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 4: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

4

modelorcrystalstructureandtheunboundantigencrystalstructureasinput5.Inaddition,SnugDockapproachesusedintheCAPRIblinddockingchallenge11producedthebeststructureamongallpredictorsforaflexible-looptarget.CAPRIusesstar-basedrankings(***=highquality,**=medium,*=acceptable,0=incorrect)12.ExaminingthehighestattainedCAPRIqualityamongthetenlowest-scoringdockedmodels(startingwithahomologymodeledantibody),SnugDockcurrentlyproduces1***,10**,and4*modelsinatestsetof15antibody-antigentargets(Table1).TheseperformancedataareimprovedsincetheoriginalSnugDockpublication5duetoupdatesintheenergyfunction13andaswitchtothekinematicloopclosure(KIC)loopmodelingmethod.14–16

Protocoloverview

Theprotocoldescribedinthispaperenablesausertogenerateastructuralmodelofanantibodyfromitssequenceandastructuralmodelofanantibody–antigencomplexfromstructuresoftheantibodyanditsantigen(Fig.1).

Protocoloverview:Antibodyhomologymodeling(steps1–8)

GeneratingastructuralmodelofanantibodyfromsequenceinRosettaAntibodyuseshomologymodelingtechniques,thatis,itusessegmentsfromknownstructureswithsimilarsequences.Asdescribedindetailbelow,theinputsequenceissplitintoseveralcomponents.Foreachcomponent,RosettaAntibodysearchesacurateddatabaseofknownstructuresfortheclosestmatchbysequenceandthenassemblesthosestructuralsegmentsintoamodel.ThatmodelisthenusedastheinputforthenextstageinwhichtheCDRH3loopismodeledandtheVL–VHorientationisoptimized.

Numberingtheresiduesinthesequence

TheRosettaAntibodyprotocolidentifiestheCDRsoftheinputantibodysequencethroughregularexpressionmatchingtotheKabatCDRdefinition17,anditnumberstheantibodyresiduesaccordingtotheChothiascheme4.

Templateselection

Foreachstructuralcomponentconsidered(FRL,FRH,CDRsL1–3,H1–3),templatesareselectedbymaximumsequencesimilarityusingaBLAST-basedmethodwithcustomdatabasesconstructedfromhigh-qualitystructuresinthePDB.CanonicalCDRconformationsarebasedonlength,soweuseseparatedatabasesforeachloop–lengthcombination.Forexample,ten-residueH1loopsandeleven-residueH1loopsareseparateBLAST-formatteddatabases.

TheresultsforeachstructuralcomponentaresortedbyBLASTbitscore,andthesequencewithbestscoreisselectedasthetemplate.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 5: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

5

InitialVH–VLorientations

TheinitialVL–VHorientationisselectedinmuchthesameway,withtheexceptionthattenVL–VHtemplatesareselectedratherthanasingleone18.Startingfromthelistofallpossibletemplatesorderedbybitscore,thebestmatchisselectedasthefirsttemplate.TodiversifytheinitialVL–VHorientations,alltemplateswithsimilarVL–VHorientations(0.5OCD,seeMarze&Gray18)tothistemplateareprunedfromthelist.Thebestmatchremaininginthelistisselectedasthesecondtemplate,andcandidatetemplatessimilartothesecondtemplatearenowremovedfromthelist.Thiswinnowingisrepeatedtocreatetendistincttemplates.OnegraftedmodelwillbecreatedfromeachoftheseteninitialVL–VHorientations.

GraftingCDRtemplates

OncetheinitialVL–VHorientationsareset,theCDRtemplatesaregraftedontoeachframeworkregionbysuperposingthetwooverlappingresiduesoneithersideoftheloopwiththeircorrespondingresiduesontheframeworkregions.ThegraftpointsarethenadjustedusingCyclicCoordinateDescent(CCD)19,20topreventunphysicalbondlengthsandanglesfrombeingincorporatedintothemodel.Finally,thestructureisrelaxed21,22viaiterationsofside-chainoptimizationandgradient-basedminimizationwhileconstrainingthebackboneandside-chainheavyatomstofindanative-likeconformationatalocalenergyminimuminRosetta’sscorefunction.

All-atomrefinementofCDRH3andtheVL–VHorientation

Thegraftedmodelsarecrudeandmustberefined,particularlyintheCDRH3loopandtheVL–VHorientation.TheH3loopisfirstcompletelyremodeledinthecontextoftheantibodyframeworkusingthenext-generationKIC(NGK)loopmodelingprotocol16.Forspeed,theH3loopsidechainsareeachreducedtoasinglelow-resolutionpseudo-atom,andtoensuresamplingoftheC-terminalkinkconformation,atomicconstraintsareappliedtothegoverningscorefunction23.Forsubsequenthigh-resolutionrefinement,theall-atomCDRH3sidechainsarerecovered,allCDRsidechainsarerepacked,andtheCDRsidechainsandbackbonesareminimized.TheVLandtheVHdomainsarere-dockedwitharigid-backboneRosettaDockprotocol24,25toremoveanyclashescreatedbythenewH3conformation,andtheantibodysidechainsareagainrepacked.UsingNGK,H3isrefinedagaininthecontextoftheupdatedVL–VHorientation.TheCDRsarepackedandminimizedagain,andthemodelissavedasacandidatestructure,ordecoy.Thefirstgraftedmodelisusedasthestartingpointfor1,000refinedmodels,ordecoys,andtheothergraftedmodelsareeachusedasthestartingpointfor200decoys,foratotalof2,800decoys.ThedecoysaresortedbyRosettascore,andthelowest-scoringonesaregivenasthefinalmodels.

Protocoloverview:antibody–antigendocking(steps11–16)

ComputationaldockingcanbeusedtogeneratemodelsofAb–Agcomplexes.Ingeneral,dockingentails(1)roughlyidentifying(within8Å)theinteractinginterfacethrougheither

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 6: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

6

experimentorglobaldockingand(2)refiningtheinitialmodelthroughlocaldocking.BelowwedescribelocaldockingwithSnugDockindetail.

Generatingthestartingmodel

SnugDockrequires,asaninput,aputativeAb–Agcomplexthatcontainsareasonableinterface26.Thecomplexcanbecomposedofsinglestructuresorsetsofstructures(ensembles,seeBox1).Theinterfacedefinesthelocalsearch,betweentheantibodyCDRsandtheantigen.InitialmodelsareoftenbasedonexperimentalresultsthatidentifyinteractingresiduesattheAb–Aginterface,suchasmutagenesisorchemicalcrosslinkingassays.Intheabsenceofexperimentalresults,aglobaldockingapproachsuchasZDOCK/iPatch10orPIPER/ADARS8cangenerateputativecomplexesforrefinement.GlobaldockingcanalsobeachievedwithSnugDock,albeitatahighercomputationalexpense.

AntigenorantibodystructuresthathavenotbeengeneratedbyaRosettaprotocolneedtoberefinedbeforebeingplacedincontact.Refinement,commonlyreferredtoastheRelaxprotocol21,22,entailsiterationsofside-chainoptimizationandgradient-basedminimizationinRosetta’sscorefunction.TheRelaxprotocolsampleslocalconformationalspacearoundthestartingstructuretoidentifyanenergeticminimuminthescorefunction.Throughthisprocess,Rosetta-identifiednon-idealities(suchasvanderWaalsbumps)areabated.Oncethepartnershavebeenrefined,aputativecomplexcanbeassembledandprepacked.Prepackingoptimizesside-chainconformationstopreventbiasingtowardtheinputcomplexmodel’sside-chainconformations,ensuringuniformscoringofallpotentialboundcomplexstates.

Performingdocking

SnugDockiterativelyperformsmulti-bodydockingofboththeAb–AgandVL–VHorientationsandremodelingoftheH2andH3CDRloops.Priortodocking,theprepackedstartingAb–Agcomplexissubjecttothreerigid-bodyperturbations:(1)arandomizedrotationabouttheAb–Agprimaryaxis,(2)asmall-magnituderandomtranslation,and(3)asmall-magnituderandomrotation.Dockingoperatesintwophases:low-resolutionmode,wheresidechainsarerepresentedbyasinglepseudoatomlocatedatthecentroidoftheside-chainheavyatoms,andhigh-resolutionmode,whereallproteinatomsareexplicit.Low-resolutionmodeconsistsoftwotypesofinterspersedMonteCarlomoves:rigid-bodyAb–Agtranslationandrotation,andbackboneensembleconformerswaps.Additionally,attheendoflow-resolutionmode,theH2&H3loopsarerefined.High-resolutionmodeconsistsofa50-stepMonteCarlotrajectorywhereeachmoveisselectedfromasetoffivepossiblemoves:rigidbodyAb–Agdocking(40%),rigidbodyVL–VHdocking(40%),CDRminimization(10%),H2looprefinement(5%),andH3looprefinement(5%),wherethepercentagesindicatetheprobabilitiesofselectingeachmove.Eachtrajectoryresultsinonedecoy.Typically,SnugDockisusedtogenerateatotalof1,000decoys,withthelow-scoringdecoysmostlikelytobenearthenativeconformation.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 7: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

7

Incorporatingexperimentaldataintothesimulation

TwomaintypesofexperimentaldatathatinformtheAb–AgbindingmodecanbeincorporatedintoSnugDock.First,knowledgeaboutspecificresiduesorpairsofresiduesthatinteractacrosstheinterfacecanbeusedtoguidedocking.Thisinformationcould,forexample,bederivedfromalaninescanningorothermutagenesisexperiments.Second,knowledgeabouttheepitopeandtheoverallAb-Agorientationcanbeincorporated.Bindingpatchdatamaybederivedfromdifferentexperiments,includinghydrogen/deuteriumexchangeorchemicalcrosslinkingofthebindingpartnerswithsubsequentanalysisbymassspectrometry.Othermethodsforepitopemappingmayalsobesuitable.

Dependingonthetypeofexperimentaldataavailable,therearedifferentwaysofincorporatingitintothedockingsimulation.High-confidenceresidue–residueinteractionscanbepreservedwiththeuseofatompairconstraints.Less-specificandpoorly-characterizedinteractions(hydrophobicpockets,ambiguousH-bonds)canbelooselyconstrainedwithambiguousandsiteconstraints.PredictedepitopesandbindingpatchescanbesampledbyproperlyplacingtheSnugDockinputstructureandadjustingthesizeoftheinitialstartingmove.Forfurtherinformationonincorporatingexperimentalconstriants,seetheRosettadocumentation27.

Caveats,challengesandpitfalls

Thereareseveralcaveatsassociatedwithcomputationalmodelingofantibodiesanddockingofantibodiesandantigens.Keepingthesecaveatsinmind,theusershouldcriticallyassesseachprediction(seeBox2).RosettaAntibodyisahomologymodelingapproachandcanbehamperedbytemplateavailability.Forexample,challengingtargetsincludeheavilyengineeredantibodiesorantibodiesderivedfromaspeciesthatdiversifiesitsantibodiesthroughgeneconversion,suchaschickensorrabbits.ErrorsintheFRandCDRL1–3,H1,H2loopsaretypicallysmall(nogreaterthan1ÅbackboneRMSDtonative)28.TheVL–VHorientation,correctlycapturedbyRosettaAntibodyin43of46benchmarkantibodytargets18.TheCDRH3loop,ontheotherhand,ismodeleddenovo,andloopmodelqualitydecreaseswithlooplength.IntheKICloopbenchmark,16,29loopsof12–17residuesaremodeledtonear1ÅbackboneRMSDrelativetothenativestructure—theaveragehumanCDRH3fallswithinthatrangewithanaveragelengthof15residues(IMGTdefinition)30.However,thebenchmarkismeasuredbymodelingloopsoncrystallographicframeworks,whereasinablindcontext,CDRH3loopsaremodeledonahomologyframeworks,whichintroducesuncertaintyintheloopenvironment.Nevertheless,inarecentassessment23RosettaAntibodyproducedmodelswithCDRH3loopswithin1.59ÅbackboneRMSDtonativeandsub-angstromaccuracyinallotherregions.

WhileSnugDockexplicitlysamplestheCDRH2orH3loopconformationandVL–VHorientationtoaccountformodeluncertaintyintroducedduringhomologymodelingandtosampletheregionsmostlikelytoundergoconformationalchangesuponantigenbinding,itdoesnotexplicitlysamplebackbonedegreesoffreedomoftheantigenorofnon-CDR-H2orH3regionsoftheantibody.Thus,iftheunboundandboundconformationsdiffersubstantiallyorifthehomologymodelsarepoor,itcouldbedifficultorimpossibletomodelthedockedcomplex

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 8: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

8

accurately31.Despitethiscomplication,SnugDockhassuccessfullypredictedAb–Agcomplexesfromhomologymodels5.

Availability

RosettaAntibodyandSnugDockcanberunviaapublicwebserver(http://rosie.rosettacommons.org),pythonbindings(PyRosetta,http://www.pyrosetta.org)andthroughlocalinstallationsofRosetta.RosettaisdistributedassourcecodeandlicensesareavailablefromtheRosettaCommons(http://www.rosettacommons.org)freeofchargeforacademicandnon-profitusers.RosettacanbeinstalledonUNIX-likeoperatingsystems(includingMacOSX).

MATERIALS

EQUIPMENT

Homologymodelingdata

• Primaryamino-acidsequenceofthevariabledomainofthelightandheavychains.

Dockingdata

• PDB-formattedfileoftheantigenstructure.• PDB-formattedfileoftheantibodystructure,fromthehomologymodelingoutput.• Bothofthesecanbesinglestructuresoranensembleofstructures.

SoftwareforrunningsimulationsviaROSIEwebserver

• Modernwebbrowser

Hardwareforrunningsimulationsmanually(optional)

• Workstationwithmulti-coreCPU(s) runningaPOSIXcompliantoperatingsystem(e.g.,GNU/Linux,OSX)

OR

• a Linux-based cluster. Several public facilities are available. For example, the U.S.National Science Foundation’s provides clusters like Stampede through the ExtremeScience and Engineering Discovery Environment (XSEDE, www.xsede.org). In Europe,thePartnershipforAdvancedComputinginEurope(PRACE,www.prace-ri.eu)providesaccess to clusters like JUQUEEN. Resources like the Norwegian Metacenter forComputational Science (Notur, www.notur.no) or Japan’s supercomputer facilities ofNational Institute ofGenetics (sc.ddbj.nig.ac.jp) andofHumanGenomeCenter at theUniversityofTokyo(hgc.jp)arealsosuitable.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 9: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

9

Softwareforrunningsimulationslocally(optional)

• TheRosettasoftwaresuite,availableatwww.rosettacommons.org/softwareo Compilationinstructionsavailableatwww.rosettacommons.org/build.

?TROUBLESHOOTINGo Supportforanyissuesencounteredthatarenotcoveredinthismanuscriptcan

beaddressedontheRosettauserforums:www.rosettacommons.org/forum• BLAST+(version2.2.28orlater),availableat

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/• Texteditor(e.g.,vim,emacs,nano)• Optional:Python(www.python.org)orR(www.r-project.org)foranalyzingresults• Optional:Amolecularvisualizationpackageforviewingresultsandcustomizingstarting

structuresfordocking.RecommendedpackagesincludePyMOL(www.pymol.org)32,UCSFChimera(www.cgl.ucsf.edu/chimera)33,andKinemage(kinemage.biochem.duke.edu)34

PROCEDURE

Thesimplestwaytocreateantibodyandantibody-antigencomplexstructuresisthroughtheuseoftheROSIEwebserver(rosie.rosettacommons.org)35.OnROSIE,theAntibodyappusestheinputantibodysequencetogenerateahomologymodel,andtheSnugDockappusestheantibodymodel(s)andanantigenstructuresfordocking.Bothoperationsareentirelyautomatedwithaminimumofuserinput.

Forgreatercontroloftheoperation,wedescribebelowthestepstoruntheprotocolsmanually,includingthekeypointsforcheckingintermediatedataandinterveningwithalternatechoices.Userswithstructuresoftheunboundantibodyandantigencanskiptodockingstage(step11).

I.AntibodyHomologyModeling

ConstructionofagraftedFvmodelTIMING75-90minutes

1. Setupyourterminal.AfterinstallingBLAST+andRosetta(seeMaterials),launchaninteractiveterminal(e.g.,TerminalonmacorxtermonLinux)andsetpathvariablestotheexecutableprogramsneededasfollows(bashsyntax):

export ROSETTA=~/Rosetta export ROSETTA3_DB=$ROSETTA/main/database export ROSETTA_BIN=$ROSETTA/main/source/bin export PATH=$PATH:$ROSETTA_BIN

Inthefirstlineabove,replace“~”withtheparentdirectorywhereyouinstalledRosettaonyourmachine.Similarly,besurethePATHvariableincludestheblastpprogram(e.g.export PATH=$PATH:/path/to/blastpwhere/path/to/blastpisreplacedwiththedirectorycontainingtheblastpexecutable.Thesepathsettingsmaybeaddedtoa

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 10: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

10

configurationfilesuchas.bashrcsotheyareautomaticallyseteachtimeaterminalisopen(loggedinto).

2. Createaworkingdirectoryandnavigatetoit:

mkdir /path/to/my_dir cd /path/to/my_dir

3. Obtaintheaminoacidsequencesforthevariabledomainofyourantibody(lightchainandheavychain)andsavetheminFASTAformat(inyourworkingdirectory)withtheheavyandlightchainsnotedinthecommentlines,asfollows:

> heavy VKLEESGGGLVQPGGSMKLSCATSGFRFADYWMDWVRQSPEKGLEWVAEIRNKANNHATYYAESVKGRFTISRDDSKRRVYLQMNTLRAEDTGIYYCTLIAYBYPWFAYWGQGTLVTVS > light DVVMTQTPLSLPVSLGNQASISCRSSQSLVHSNGNTYLHWYLQKPGQSPKLLIYKVSNRFSGVPDRFSGSGSGTDFTLKISRVEAEDLGVYFCSQSTHVPFTFGSGTKLEIKR

4. UseRosetta’sgraftingapplicationtofindsuitabletemplatesandgraftthemtogethertoobtainacrudemodeloftheantibody.Executetheapplicationwiththelinebelow.

antibody.macosclangrelease \ -fasta antibody_chains.fasta | tee grafting.log

Theapplicationwilloutputadirectorycalledgrafting.ThePDB-formattedfilesnamedmodel-0.relaxed.pdb,model-1.relaxed.pdb,…,model-9.relaxed.pdbwillbeyourinputfortheH3modeling.The“| tee grafting.log”partofthecommandrecordsalltheprogramoutputinthefilegrafting.logforlaterreview.The“\”permitsthecommandtobespreadacrossmultiplelinesratherthanjustone.

?TROUBLESHOOTING

(Optional)CheckgraftedtemplatestructuresTIMING:10minutes–2hours

5. AssigntheCDRloopsinyourmodelstotheCDRloopclustersdescribedbyNorthetal.36andcheckwhetherthechosentemplatesaresuitable.

Runtheclusteridentificationapplicationasfollows:

identify_cdr_clusters.macosclangrelease \ –s grafting/model-*.relaxed.pdb \ –out:file:score_only north_clusters.log

Northetal.clusteredallCDRloopstructuresbytheirbackbonedihedralanglesandnamedthembyCDRtype,looplengthandclustersize(e.g.“H1-13-10”isthe10thmostcommonconformationfor13-residueH1loops).Occasionally,RosettachoosestemplatesthatarerareorinconsistentwiththesequencepreferencesobservedbyNorthetal.Forexample,ifRosettarecommendstheH1-13-10cluster,theusermight

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 11: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

11

alsoconsidertheH1-13-1cluster.Tables3-7ofNorthetal.presentconsensussequencesforeachclusterthatcaninformthisdecision.Loopsandclusterswithprolineresiduesarealsoworthamanualexamination.SeveralclustersofNorthetal.arecontingentonthepresenceofprolinesinparticularlocations(e.g.L3-9-cis7-1hasacis-prolineatposition7).BecauseRosettaAntibodyreliesonBLASTtochooselooptemplates,occasionallyaloopfromanuncommonnon-cis-prolinecluster(e.g.L3-9-2)ischosen.Insuchcasesitisbesttomanuallyselectalooptemplatefromthewell-populatedcis-prolinecluster.

6. If desired, rerun grafting to replace a template with one from a manually-specifiedsourcestructure.Usetheantibodycommandlineasabovewithanextraflagtospecifyatemplate.Followthebelowexample:toforceRosettatousetheCDRH1loopfromthePDB1RZI as the template in themodel, add the flag–antibody:h1_template 1rzi.Selecttemplatesforotherregionsaccordingly:

antibody.macosclangrelease \ -fasta antibody_chains.fasta \ -antibody:h1_template 1rzi | tee graft.log

Flag region-antibody:l1_template -antibody:l2_template -antibody:l3_template

lightchainCDRloops

-antibody:h1_template -antibody:h2_template –antibody:h3_template

heavychainCDRloops

-antibody:light_heavy_template -antibody:n_multi_templates 1

VL–VHorientation

-antibody:frl_template -antibody:frh_template

Frameworkregionofthelightorheavychain

H3modelingTIMING1hourto4days

7. Copy the set of standard H3 modeling flags to your working directory and create adirectoryfortheH3modelingoutput:

cp $ROSETTA/tools/antibody/abH3.flags . mkdir H3_modeling

8. Run Rosetta’s antibody_H3 application on the 10 models generated during grafting.This step requires 2,500CPUhours and is oftenperformed in parallel on a computercluster(seeBox3).ForaMacworkstation,usethefollowingcommandline:

$ROSETTA_BIN/antibody_H3.macosclangrelease \ @abH3.flags \ -s grafting/model-0.relaxed.pdb \ -nstruct 1000 \ -antibody:auto_generate_kink_constraint \

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 12: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

12

-antibody:all_atom_mode_kink_constraint \ -multiple_processes_writing_to_one_directory \ -out:file:scorefile H3_modeling_scores.fasc \ -out:path:pdb H3_modeling > h3_modeling-0.log 2>&1 &

• -sspecifiestheinputfile(oneofthegraftedmodelsgeneratedinstep2).• -nstruct specifies the number of structures generated, which should be 1000 for

model.0.pdband200eachforallothergraftedmodels.

TheexpectedoutputisthespecifiednumberofPDBfilesaswellasascorefilenamedH3_modeling_scores.fasc.AllthesefileswillappearinanoutputdirectorynamedH3_modeling/.

Totriviallyruninparallel,simplyrepeatedlyexecutetheabovecommand(changinginputmodels,numberofstructures,andtheoutputlogasyouwish).Eachtimethecommandisexecuted,anantibody_H3processisruninthebackground.

!Caution:Generatingthe2,800antibodystructurestakesapproximately2,500CPUhours.Running24processesinparallel,onamodern24-CPUworkstation,expect~4daysofruntime.Distributingtheworkovernodesonasupercomputercanreducethistimetohours(seeMaterials).

(Optional)CheckVL–VHorientationTIMING:5min

9. Check whether the VL–VH orientations of the antibody models are close to theorientationsobservedinantibodycrystalstructuresfoundinthePDB.Runthepythonscriptplot_LHOC.pyusingthefollowingcommandline:

python $ROSETTA/main/source/scripts/python/public/plot_VL_VH_orientational_coordinates/plot_LHOC.py

Thisscriptwillcreateasubfolder(lhoc_analyis)withseparateplotsforeachofthefourLHOCmetrics.EachplotshowsthenativedistributionofVL–VHorientations(grey),theorientationssampledbyRosetta(blackline)aswellasthetop10models(labeleddiamonds)andthe10differenttemplatestructuresgeneratedduringstep2(dots).Antibodymodelsthatareoutsidethenativedistributionsareunlikelytobecorrect.

ChoosefinalantibodymodelsTIMING10min

10. Choose 10of the antibodymodels as an ensemble for docking. The following criteriamaybeuseful toconsiderasdockingwithensemblesaimsto increaseconformationaldiversityandsampling:

a. Selectmodelswiththelowesttotalscore–thesearepurportedlynative-likeb. Select models with natural VL–VH orientation, falling within the observed

distribution(grey).c. Selectmodelsderivedfromdifferenttemplatestomaintaindiversity.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 13: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

13

Ifalltentop-scoringmodelsareoutsidethenativedistribution,considerreturningtostep6andmanuallyselectnewtemplatesfortherelativeorientationoftheVLandVHchainsbyusingthe-antibody:light_heavy_template flag(e.g.,antibody.macosclangrelease -antibody:light_heavy_template 1ABC)

II.Antibody-AntigenDockingTIMING1hour

11. Preparetheantigenandantibodyfordocking.Formatyourantigen(andantibodyifyouarenotusingahomologymodelproducedbyRosettaAntibody)PDB file so it canbereadbyRosetta.Runthefollowingscript:

$ROSETTA/tools/protein_tools/scripts/clean_pdb.py antigen.pdb C

Where antigen.pdb is a PDB file of your antigen and C is the one-letter chainidentifier(s)fortheantigenchain(s)inthePDBfile.

(Optional)RefineantibodyinRosetta’sscorefunctionTIMING10min

12. If you are not using an antibody model produced by Rosetta, you must refine theantibodystructurebyrunningtherelaxapplication.Thecommandlineis:

relax.macosclangrelease \ -s antibody.pdb \ -relax:constrain_relax_to_start_coords \ -relax:ramp_constraints false \ -ex1 \ -ex2 \ -use_input_sc \ -flip_HNQ \ -no_optH false

Youmayalsowishtogenerateanensembleofantibodystructures,seeBox2.

PrepackingTIMING10min

13. GenerateaPDBfilethatcontainsbothyourantibodyandyourantigeninthefollowingorder:lightchainofyourantibody(L),heavychainofyourantibody(H),andantigen(A).ThereareseveralwaystocreateandmodifyaPDBfile.Forexample,withPyMOL:

a. LoadtheantibodyinaPyMOLsession.i. IfitisamodelfromRosettaAntibody,thechainswillalreadybelabeled

asHandL.Otherwise,usethealtercommandtochangethechainIDofaselection:

alter chain A, chain=’H’ alter chain B, chain=’L’

b. LoadtheantigenintothesamePyMOLsession.ChangetheantigenchainIDinasimilar fashion.!Warning: if antigen chains share an IDwith the antibody, youwill have tobe

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 14: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

14

more specific with your selections (e.g., alter chain H and antigen,

chain=’A’).c. Reorient the antibody and antigen using the RotO, MovO, and MvOZ editing

commands. Alternatively, one can also use the translate command (i.e.translate [x,y,z], selection). If you know an approximate bindinglocation,adjusttheorientationaccordingly.

d. SavebothobjectsinthesamePDBfile:

save antibody_antigen_start.pdb, chains L+H+A

14. Toensurelow-energystartingside-chainconformations,prepackthemonomers:

docking_prepack_protocol.macosclangrelease \ -in:file:s antibody_antigen_start.pdb \ -ex1 \ -ex2 \ -partners LH_A \ -ensemble1 antibody_ensemble.list \ -ensemble2 antigen_ensemble.list \ -docking:dock_rtmin

antibody_ensemble.listisatextfilethatcontainsfilenameswithabsolutepathstothetenantibodymodelsselectedafterantibodymodeling.Inthecasethatyouhaveasinglecrystalstructure,youcanomitthe–ensemble1flag.

Ifantigenflexibilityisexpected,afamilyofstructurescanbecreatedwithotherRosettaapplications(seeBox1).Thetextfileantigen_ensemble.listwillcontainthefilenamesofyourantigen(usingabsolutepaths).NMRstartingstructuresmustbesplit(i.e.eachmodelshouldbeinitsownPDBfile).Touseasingleantigenstructure,omitthe–ensemble2flag.

DockingTIMING15min

15. Docktheantibodytotheantigen.Asinstep8,thisisanexpensivecomputationalstepand you have the option of running a single process, multiple processes on onemachine,orsplittingthejobacrossprocessorsonasupercomputer(seeBox3).Usingtheexecutable foranMPI-basedcomputingclusterasanexample, thecommand linefordockingis:

snugdock.mpi.linuxgccrelease \ -s prepack/antibody_antigen_start.prepack.pdb \ -ensemble1 antibody_ensemble.list \ -ensemble2 antigen_ensemble.list \ -antibody:auto_generate_kink_constraint \ -antibody:all_atom_mode_kink_constraint -nstruct 1000

?TROUBLESHOOTING

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 15: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

15

TIMING

Herewereportthetimetogenerateasingle,dockedmodelfromantibodysequenceandantigencrystalstructure.Typically,however,thousandsofmodelsaregenerated,soweindicateinparenthesesthetimingforthefull,recommendedsimulations.Thesetimeestimateswerecomputedona2x2.4GHzQuad-CoreIntelXeonprocessor;timingwillvaryforothercomputerconfigurations.

Step HumanTime CPUTimeperModel TotalCPUTime(1–4)ConstructionofgraftedFvmodels 5 min 20 min 200 min

(5)Checkgraftedmodels 10 min 1 min 10 min (7–8)H3modeling 5 min 50 min 2500 hrs

(9)CheckVL–VHorientation 5 min 1 min 10 min (10)Choosemodels 10 min 5 min 5 min

(11)Prepareantibodyandantigenfordocking 5 min 15 min 15 min (12)RefineantibodyinRosetta’sscorefunction 5 min 20 min 20 min

(13–14)Prepacking 5 min 10 min 10 min (15)Docking 5 min 15 min 250 hrs

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 16: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

16

TROUBLESHOOTING

STEP PROBLEM POSSIBLEREASON SOLUTION

0 Rosettadoesnotcompile. Likelytoberelatedtothespecificcomputeroperatingsystemandconfiguration

SeekhelpontheRosettaforums,www.rosettacommons.org/forum

4 RosettaAntibodyencounterserror“sh: blastp: command not found”

Theblastpexecutableisnotinstalledornotininyour$PATH

Onthecommandline,try‘which blastp’tocheckifyoursystemhasitinstalled.Ifneeded,downloadandinstallBLASTor/andaddblastptoyourPATH(export PATH=$PATH:/path/to/blastp/).Youcanalsospecifythepathusingthecommandlineflag-antibody:blastp /my/path

4 RosettaAntibodyencountersencounters“BLAST Database error”

Theblastpdatabaseisnotspecified,andRosettaAntibodyisnotfindingitinthedefaultlocation($ROSETTA/tools/antibody/blast_database/)

Specifythegraftingdatabaselocationwith-antibody:grafting_database /database/location

4 RosettaAntibodyproducesBLASToutput(e.g.grafting/orientation.align)butdoesnotproducestructuralmodels(e.g.model.0.pdb)

YourversionofBLAST+maybeoutofdate.

DownloadacompatibleversionofBLAST+(version2.2.28orlater).SeeMaterialssection.

4 RegularexpressionfailureforCDRidentification

MutationsinregionsofthechainthatRosettaexpectstobeconservedpreventthesequencefrombeingsplitintostructuralsegmentscorrectly.

Checkyourantibodysequenceagainstotherknownantibodysequences,focusingontheregiongivenintheerrormessage.SeekhelpontheRosettaforums,www.rosettacommons.org/forum.

15 SnugDockreports“ERROR:Couldnotfinddisulfidepartnerforresidue23”

Adisulfidebondwasdisruptedduringdocking.

Youcandisabledisulfidebonddetectionwiththeflag-detect_disulf false

15 snugdock reportserror “chains are not named correctly or are not in the expected order”

InputPDBdoesnotcontainchainsincorrectorder(light,heavy,thenantigen)orchainIDsare

AdjustchainorderininputPDBorspecifychainIDswiththe–partners AB_C flag,whereA,BandCarethelight,heavy,and

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 17: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

17

notL,H,andA. antigenchainIDs,respectively.2–15 UnknownRosettaerror. SeekhelpontheRosettaforums,

www.rosettacommons.org/forum2–15 Commonfixes • Checkformisspellings

• Checkpathsarecorrect• CheckFASTAformatting• CheckPDBformatting

ANTICIPATEDRESULTS

Theantibodystructurepredictionanddockingmethodsdescribedinthispapereachproduceasetofstructuralmodelsthathavebeenevaluatedbyascorefunction.Inthecaseofantibodystructureprediction,wehavefoundthroughbenchmarkingandparticipationintheAMAthattheaccuracyofframeworksandnon-H3CDRloopscantypicallybeexpectedtobewithin1.0ÅRMSDofthecoordinatesinacrystalstructure.Whenthemodeldeviatesmorethan1.0ÅinRMSDfromcrystallographiccoordinatesitisusuallybecausethereisnotasuitableknowntemplateinthePDB.ThesesituationsshouldbecomeincreasinglyrareasmorestructuresaredepositedintothePDB,althoughheavilyengineeredantibodiesshouldalwaysbemodeledwithcare.

TheH3loopaccuracyisvariableanddependsbothonlengthandVL–VHorientation.Looplengthisanimportantfactorintheaccuracyofdenovoloopmodelingmethodsbecausethesearchspaceincreasesexponentiallywitheachadditionalresidueintheloop.WeexpectaccuratemodelsofCDRH3loopsoflength14orless23,butthetop-scoringmodelmaynotbethemostaccurate.Wethereforerecommendusingalltenmodelsfordownstreamanalysis.InAMA-II,wefoundthatnon-nativeVL–VHorientationscanleadtoexplicitinteractionsbetweenthelightchainandtheCDRH3loopthatareindistinguishablefromnativeinteractions28.UsingmultipleVL–VHorientationtemplates18allowsbroaderexplorationofconformationalspace,samplingmorelow-scoringwells.ModelsgeneratedfromatleastthreedifferenttemplatesshouldbeusedtomaximizethechanceofcapturingthenativeVL–VHorientation.

ThroughbenchmarkingAb–Agdocking,wehavefoundthattheaccuracyofacomplexmodeldependsonthestartingconfigurationofthepartnersandtheaccuracyofthemodelsforeachpartner.SnugDocksampleslocalconformationspace,thusagoodstartingstructure(within8Å)generallyresultsinsamplinganear-nativeconformation.Equallyimportantisthequalityoftheinitialunboundmodels;near-nativemodelsenableincreaseddockingperformance(seeTable1:B-Brigidbody-dockingvs.U-Urigid-bodydocking).Wehavefoundthatdockingahomologymodeledantibodytothecrystalstructureoftheunboundantigentypicallyresultsinatleastonemodelofacceptablequalityinthetentop-scoringmodels(Table1).

AUTHORCONTRIBUTIONS

BDW,NM,SL,DK,JRJ,andJJGdevelopedthecurrentversionofRosettaAntibody.SLdevelopedROSIEandimplementedtheRosettaAntibodyandSnugDockserverapps.BDWimplementedSnugDockinRosetta3,JRJbenchmarkedSnugDock’sperformance.RFandNBwrotethe

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 18: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

18

procedure,codifiedthemanualinterventionstepsdevelopedbyBDW,NM,andDK,andrecordedtiminginformation.BDW,NM,SL,JRJ,DK,RF,NBandJJGwrotethemanuscript.

ACKNOWLEDGMENTS

TheauthorswishtothankArvindSivasubramanian,AroopSircar,andSidharthaChaudhuryfortheirdevelopmentoftheoriginalRosettaAntibody,SnugDock,andEnsembleDockmethods.JianqingXurefactoredtheantibodycode.WealsothankthemembersoftheRosettaCommonsforthecontinueddevelopmentoftheRosettaSoftwareSuite.ROSIEsimulationsarecarriedout,inpart,withintheExtremeScienceandEngineeringDiscoveryEnvironment(XSEDE),whichissupportedbyNationalScienceFoundationgrantnumberACI-1053575.BW,NM,JRJandJJGaresupportedbyNationalInstitutesofHealthGrantR01GM078221.SLissupportedbyNationalInstitutesofHealthGrantR01GM73151.DKissupportedbytheDARPAAntibodyTechnologyProgram(HR-0011-10-1-0052)andtheJapanSocietyforthePromotionofScience(grantnumber15H06606).RFissupportedbytheSouth-EasternNorwayRegionalHealthAuthority(grantnumber850703-6051-39788).

COMPETINGFINANCIALINTERESTS

Theauthorsdeclarenocompetingfinancialinterests.AllrevenuegeneratedbylicensingRosettatofor-profitentitiesisinvestedintothecontinueddevelopmentofthesoftware.

REFERENCES

1. Georgiou,G.,Ippolito,G.C.,Beausang,J.,Busse,C.E.,Wardemann,H.&Quake,S.R.Thepromiseandchallengeofhigh-throughputsequencingoftheantibodyrepertoire.Nat.Biotechnol.32,158–168(2014).

2. Reichert,J.M.Antibodiestowatchin2016.MAbs8,197–204(2016).

3. Correia,B.E.,Bates,J.T.,Loomis,R.J.,Baneyx,G.,Carrico,C.,Jardine,J.G.,Rupert,P.,Correnti,C.,Kalyuzhniy,O.,Vittal,V.,etal.Proofofprincipleforepitope-focusedvaccinedesign.Nature507,201–6(2014).

4. Al-Lazikani,B.,Lesk,A.M.&Chothia,C.Standardconformationsforthecanonicalstructuresofimmunoglobulins.J.Mol.Biol.273,927–948(1997).

5. Sircar,A.&Gray,J.J.SnugDock:Paratopestructuraloptimizationduringantibody-antigendockingcompensatesforerrorsinantibodyhomologymodels.PLoSComput.Biol.6,e1000644(2010).

6. Ponomarenko,J.V&Bourne,P.E.Antibody-proteininteractions:benchmarkdatasetsandpredictiontoolsevaluation.BMCStruct.Biol.7,64(2007).

7. Kozakov,D.,Brenke,R.,Comeau,S.R.&Vajda,S.PIPER:AnFFT-basedproteindocking

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 19: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

19

programwithpairwisepotentials.ProteinsStruct.Funct.Genet.65,392–406(2006).

8. Brenke,R.,Hall,D.R.,Chuang,G.Y.,Comeau,S.R.,Bohnuud,T.,Beglov,D.,Schueler-Furman,O.,Vajda,S.&Kozakov,D.Applicationofasymmetricstatisticalpotentialstoantibody-proteindocking.Bioinformatics28,2608–2614(2012).

9. Chen,R.,Li,L.&Weng,Z.ZDOCK:Aninitial-stageprotein-dockingalgorithm.ProteinsStruct.Funct.Genet.52,80–87(2003).

10. Krawczyk,K.,Baker,T.,Shi,J.&Deane,C.M.Antibodyi-Patchpredictionoftheantibodybindingsiteimprovesrigidlocalantibody-antigendocking.ProteinEng.Des.Sel.26,621–629(2013).

11. Sircar,A.,Chaudhury,S.,Kilambi,K.P.,Berrondo,M.&Gray,J.J.AgeneralizedapproachtosamplingbackboneconformationswithRosettaDockforCAPRIrounds13-19.ProteinsStruct.Funct.Bioinforma.78,3115–3123(2010).

12. Méndez,R.,Leplae,R.,Lensink,M.F.&Wodak,S.J.AssessmentofCAPRIpredictionsinrounds3-5showsprogressindockingprocedures.ProteinsStruct.Funct.Bioinforma.60,150–169(2005).

13. O’Meara,M.J.,Leaver-Fay,A.,Tyka,M.D.,Stein,A.,Houlihan,K.,Dimaio,F.,Bradley,P.,Kortemme,T.,Baker,D.,Snoeyink,J.,etal.Combinedcovalent-electrostaticmodelofhydrogenbondingimprovesstructurepredictionwithRosetta.J.Chem.TheoryComput.11,609–622(2015).

14. Coutsias,E.A.,Seok,C.,Jacobson,M.P.&Dill,K.A.Akinematicviewofloopclosure.J.Comput.Chem.25,510–528(2004).

15. Mandell,D.J.,Coutsias,E.A.&Kortemme,T.Sub-angstromaccuracyinproteinloopreconstructionbyrobotics-inspiredconformationalsampling.Nat.Methods6,551–552(2009).

16. Stein,A.&Kortemme,T.ImprovementstoRobotics-InspiredConformationalSamplinginRosetta.PLoSOne8,e63090(2013).

17. Johnson,G.&Wu,T.T.Kabatdatabaseanditsapplications:30yearsafterthefirstvariabilityplot.NucleicAcidsRes.28,214–8(2000).

18. Marze,N.A.&Gray,J.J.ImprovedpredictionofantibodyVL–VHorientation.ProteinEng.Des.Sel.AdvancedAccess,gzw013(2016).

19. Canutescu,A.A.&Dunbrack,R.L.Cycliccoordinatedescent:Aroboticsalgorithmforproteinloopclosure.ProteinSci.12,963–72(2003).

20. Wang,C.,Bradley,P.&Baker,D.Protein–ProteinDockingwithBackboneFlexibility.J.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 20: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

20

Mol.Biol.373,503–519(2007).

21. Bradley,P.,Misura,K.M.S.&Baker,D.TowardHigh-ResolutiondeNovoStructurePredictionforSmallProteins.Science.309,1868–1871(2005).

22. Misura,K.M.S.&Baker,D.Progressandchallengesinhigh-resolutionrefinementofproteinstructuremodels.ProteinsStruct.Funct.Genet.59,15–29(2005).

23. Weitzner,B.D.&Gray,J.J.AccuratestructurepredictionofCDRH3loopsenabledbyanovelstructure-basedC-terminal‘kink’constraint.InReview,(2016).

24. Gray,J.J.,Moughon,S.,Wang,C.,Schueler-Furman,O.,Kuhlman,B.,Rohl,C.A.&Baker,D.Protein–ProteinDockingwithSimultaneousOptimizationofRigid-bodyDisplacementandSide-chainConformations.J.Mol.Biol.331,281–299(2003).

25. Chaudhury,S.&Gray,J.J.ConformerSelectionandInducedFitinFlexibleBackboneProtein–ProteinDockingUsingComputationalandNMREnsembles.J.Mol.Biol.381,1068–1087(2008).

26. Kuroda,D.&Gray,J.J.Shapecomplementarityandhydrogenbondpreferencesinprotein-proteininterfaces:implicationsforantibodymodelingandprotein-proteindocking.Bioinformaticsbtw197(2016).doi:10.1093/bioinformatics/btw197

27. RosettaConstraintsDocumentation.https://www.rosettacommons.org/docs/wiki/rosetta_basics/Incorporating-Experimental-Data

28. Weitzner,B.D.,Kuroda,D.,Marze,N.,Xu,J.&Gray,J.J.BlindpredictionperformanceofRosettaAntibody3.0:Grafting,relaxation,kinematicloopmodeling,andfullCDRoptimization.ProteinsStruct.Funct.Bioinforma.82,1611–1623(2014).

29. ÓConchúir,S.,Barlow,K.A.,Pache,R.A.,Ollikainen,N.,Kundert,K.,O’Meara,M.J.,Smith,C.A.&Kortemme,T.AWebresourceforstandardizedbenchmarkdatasets,metrics,androsettaprotocolsformacromolecularmodelinganddesign.PLoSOne10,e0130433(2015).

30. Zemlin,M.,Klinger,M.,Link,J.,Zemlin,C.,Bauer,K.,Engler,J.A.,Schroeder,H.W.&Kirkham,P.M.ExpressedMurineandHumanCDR-H3IntervalsofEqualLengthExhibitDistinctRepertoiresthatDifferintheirAminoAcidCompositionandPredictedRangeofStructures.J.Mol.Biol.334,733–749(2003).

31. Kuroda,D.&Gray,J.J.Pushingthebackboneinprotein-proteindocking.InReview,(2016).

32. Schrödinger,L.ThePyMOLMolecularGraphicsSystem,Version1.8.(2015).

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 21: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

21

33. Pettersen,E.F.,Goddard,T.D.,Huang,C.C.,Couch,G.S.,Greenblatt,D.M.,Meng,E.C.&Ferrin,T.E.UCSFChimera--avisualizationsystemforexploratoryresearchandanalysis.JComputChem25,1605–1612(2004).

34. Chen,V.B.,Davis,I.W.&Richardson,D.C.KiNG(Kinemage,NextGeneration):Aversatileinteractivemolecularandscientificvisualizationprogram.ProteinSci.18,2403–2409(2009).

35. Lyskov,S.,Chou,F.-C.,Conchúir,S.Ó.,Der,B.S.,Drew,K.,Kuroda,D.,Xu,J.,Weitzner,B.D.,Renfrew,P.D.,Sripakdeevong,P.,etal.ServerificationofMolecularModelingApplications:TheRosettaOnlineServerThatIncludesEveryone(ROSIE).PLoSOne8,e63906(2013).

36. North,B.,Lehmann,A.&Dunbrack,R.L.AnewclusteringofantibodyCDRloopconformations.J.Mol.Biol.406,228–256(2011).

37. RosettaPrepackProtocolDocumentation.https://www.rosettacommons.org/docs/wiki/application_documentation/docking/docking-prepack-protocol

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 22: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

22

Table1.LocalAb–Agdockingbenchmarkresults.Co-crystalPDBIDsindicatethenativecomplex.PDBIDslistedunderthe“Type”columnalsoindicatetheuseofunbound(U)orbound(B)componentstructures(asavailable).ModelqualityisdefinedbytheCAPRIrankingcriteriarepresentedbyanumberofstarsorazero(0).Three,two,andonestar(s)indicatehigh,medium,andacceptablequality,respectively,andazeroindicatesincorrectmodels.Forthe“Rigid-Body”and“SnugDock”columns,thequalityofthelowest-scoringmodel,byinterfaceenergy,isreported(an“f”indicatesastrongenergyfunneldefinedasfiveormoreofthetenlowest-scoringmodelsbeingmediumqualityorbetter).EnsembleSnugDocksimulationswererunwithmulti-templategraftingandaCDRH3kinkconstraint.CAPRISummarylinessummarizemodelqualityforalltargets.CAPRISummaryTop10takesthehighest-qualitymodelfromthetenlowest-scoringmodels.

Co-crystal(PDBID) Type(Ab-Ag)

CDRH3Length

Rigid-BodyDockXtal

Rigid-BodyDockModel SnugDock

EnsembleSnugDock

1mlc U(1mlb)-U(1lza) 7 0 0 ** *1ahw U(1fgn)-U(1boy) 8 0 * * **1jps U(1jpt)-U(1tfh) 8 ** * * **1wej U(1qbl)-U(1hrc) 8 0 0 0 *1vfb U(1vfa)-U(8lyz) 8 * * 0 *1bql B-U(1dkj) 7 ***f 0 * 01k4c B-U(1jvm) 9 **f 0 * **2jel B-U(1poh) 9 ***f 0 ** *1jhl B-U(1ghl) 9 ***f 0 0 **1nca B-U(7nn9) 11 ***f 0 0 *2bdn B-B 8 ***f * * *1ynt B-B 9 ***f ***f **f ***f2aep B-B 9 ***f ** * 02b2x B-B 10 ***f * 0 **1ztx B-B 10 ***f * **f 0

CAPRISummaryTopDecoy 9***/2**/1* 1***/1**/6* 0***/4**/6* 1***/5**/6*CAPRISummaryTop10Decoys 9***/4**/2* 1***/6**/8* 1***/10**/4* 2***/11**/2*

No.ofFunnels 10 1 2 1

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 23: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

23

Figure1.Aschematicofthemodelingprotocols.ThestructureontheleftshowstheFvantibodydomainspredictedbyhomologymodeling(heavychainindarkbluewithCDRH1andH2loopsinorangeandCDRH3loopinred;lightchaininyellowwithitsCDRloopsinlightblue).Thestructureontherightdepictsanantibody–antigenstructureoutputbydocking(antigeningreen).

HOMOLOGY MODELLING

ANTIBODY SEQUENCES

CRYSTAL STRUCTURE OF THE ANTIBODY

STRUCTURE OF THE ANTIGEN

ANTIBODY- ANTIGEN DOCKING

FINAL DOCKED MODELS

>H|antibody EVQLLESDGGLV... >L|antibody TDIMQSPSSLAV...

H2 H1

H3

L3L1

L2

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 24: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

24

Figure2:Exampleoutputofplot_LHOC.py.ThetwoplotsshowdistributionsoftheHeavyOpeningAngle18asobtainedbyplot_LHOC.pyfortwodifferentantibodies.The10distinctlight-heavyorientationtemplatesarerepresentedbythecircles.Thetentop-scoringmodelsafterH3loopmodelingarerepresentedbythediamondswiththefillcolorcorrespondingtothestartingtemplate;inthelegend,thesepointsareorderedfromsmallesttolargestmetricvalue.ForAntibody_1,theanglessampledbyRosettaoverlapwiththeanglesobservedinantibody

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 25: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

25

crystalstructures.Thetentop-scoringmodelsareclosetothecenterofthedistribution.InAntibody_2,mostoftheanglessampledarefoundrarelyornotatallinantibodycrystalstructures.Thetentop-scoringmodelsarealsoshiftedtolargeranglesthantypicallyfoundinantibodies.ForAntibody_2,theusermightconsidertryingalternatelight-heavyorientationtemplates(Step10).

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 26: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

26

Box1|Increasingsamplingduringdockingbyincorporatingbackbonestructuralensembles.

InRosetta,anensembleisasetofdiscreteconformationsofaproteinstructure.SnugDockusesensemblestoapproximatebackboneconformationalflexibilitybysamplingconformationsfromtheensembleduringdocking.Throughthisapproach,notonlydoestheprotocolexploremoreconformationalspacethanstandarddocking,butitcanalsocompensateformodelerror,forexamplebyusinganensembleofmodelsproducedbyamodelingapproachinapreviousstepsuchasRosettaAntibody.

RosettaensemblescanbeconverteddirectlyfromNMRensembles,ortheycanbegeneratedusinganymethodthatinducesstructuraldiversity,suchasmoleculardynamicsorvariousRosettarefinementprotocols.Theensemblestypicallyspansmallstructuralvariationsof1-2ÅbackboneRMSD25.Rosetta'srelaxprotocol(unconstrained)21,22orKICprotocol14–16aresuggestedtogeneratedockingensemblesforantigens.Inaddition,RosettaAntibodycreatesensemblesofantibodiesbydefault.MoreonhowtogeneratinganddockingensemblescanbefoundinChauduryandGray25andinRosetta’sdocumentation37.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 27: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

27

Box2|Assessingantibodymodelingandantibody–antigendockingresults

Theusermustcriticallyanalyzecomputationalmodels.ModelsoutputbyRosettashouldberankedaccordingtoscore.Inmostsimulations,approximately90%ofthemodelswillbenon-native-like,andtheywilloccupyabulkscorerangeof30-50RosettaEnergyUnits(REU).Onlyabout1-5%ofmodelswillbenative-like,andtheywillhavescoresrangingfromwithinthebulkscorerangetoawellof5-10REUbelowthebulkscorerange.Ifaclusterofstructurallyrelatedmodelslieinthewellbelowthebulkscorerange,thesearelikelynative-like,withdeeperscoringandmorepopulatedwellsprovidinghigherconfidence.

Antibodymodelassessment.

First,assessthephysicalfeasibilityofthelowest-scoringmodelsbyeyeinamolecularvisualizationpackagesuchasPyMOL.Itisimportanttocheckforobviousflawsthatcanoccurinrarecircumstancessuchaspolypeptidechainbreaksorbackboneclashes,particularlywithintheCDRsandattheirgraftpoints.Theaccuracyofthenon-H3CDRloopsshouldbefurtherassessedbycomparingtheCDRclusterofthegraftedloopwiththeclusteroftheinputsequenceasidentifiedbyNorthetal.36(seestep5).Next,ensurethecomponentsoftheVL–VHorientationliewithinnature’sdistribution(step9),asmodelswithoutlyingorientationsarenotlikelytobenative-like;anexceptiontothisrulecanbemadeiftheVL–VHorientationgraftingtemplatesandRosettasamplingallliefartowardtheedgeofnature’sdistribution.

Ab–Agdockingmodelassessment.

Makesurethatthelowest-scoringmodelsmakegoodcontactsbetweentheantigenandtheantibodyparatope.Higherconfidencecanbeassignedtomodelswithlarge(~1200Å2),complementaryinterfaces,26aswellasthoseinwhichtheH3CDRloopmakesseveralspecificcontacts.Ifexperimentaldatasuggestanantigenbindingsite,ensuretheparatopecontactsatthissite.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint

Page 28: Modeling and docking antibody structures with Rosetta · Analytical and Physical Chemistry, Showa University School of Pharmacy, Tokyo 142-8555, Japan. 4 Centre for Immune Regulation,

28

Box3|UsingRosettaondifferentplatformsandrunninginparallel.

Rosettaondifferentplatforms.Throughoutthisprotocolexecutablesaresuffixedbytheplatformandmodeforwhichtheywerecompiled(i.e.antibody.macosclangreleaseindicatesthattheantibodyexecutablewascompiledonaMacOSoperatingsystemusingtheClangcompileranditwascompiledinreleasemode).Thesuffixishighlightedinorangethroughout(.macosclangrelease).Onotherplatformsyouwillreplacethisstringwithyouroperatingsystemandcompiler(forexample,GNU/Linuxplatformswithgccasthecompilerwilldefaultto.linuxgccrelease).Additionally,thesuffixisprefixedby.mpi (.mpi.linuxgccrelease)whentheexecutableisbuiltforthemessagepassinginterface(MPI)byanMPIcompiler.MPI-compatibleexecutablescancommunicatewithoneanotherforparallelprocessing,andsomeRosettaexecutablesuseMPInon-trivially.However,moststandardRosettaapplicationsaretriviallyparallelizable(“embarrassinglyparallel”)andthuscapableofrunningonbothMPIandnon-MPIsystems.Runninginparallel.Anexampleofhowtolocallyrunanon-MPIexecutableinparallelisgiveninstep8.Ingeneral,addthe-multiple_processes_writing_to_one_directoryflagtoyourcommandline,andthenexecutemultipleinstancesoftheprocess.ThisprocedureworksonasingledesktopcomputerwithmultipleCPUsorremotelyonasupercomputercluster.However,runningaRosettaexecutableonaclusterstronglydependsonthehardwareconfigurationandavailablesoftware(e.g.workloadmanagementsoftware).

Forexample,torunanon-MPIexecutableviaHTCondor:(1)savethestandardcommandlineasanexecutablebashscript,(2)writeasubmitdescriptionfilespecifyingtheexecutablebashscriptandthenumberofprocessestoexecute,and(3)usethecondor_submitcommandwiththedescriptionfileasanargumenttosubmityourjobstothecluster.

Ontheotherhand,MPIexecutablescanberuninparallellocallybyprependingthecommandlinewiththempirun –n XX command,whereXXisthenumberofprocessestorun,ifyourmachineisconfiguredtousetheOpenMPIlibrary.Again,theexactdependonthespecificclusterconfiguration.Forexample,torunanMPIexecutableonStampedeviatheslurmworkloadmanager:(1)savethestandardcommandlineasanexecutablebashscript,(2)writeaslurmbatchscriptspecifyingtheexecutablebashscriptandthenumberoftasks,and(3)usethesbatchcommandwiththebashscriptasanargumenttosubmityourjobstothecluster.

.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted August 16, 2016. . https://doi.org/10.1101/069930doi: bioRxiv preprint