# A 3D structural SARS-CoV-2–human interactome to explore genetic and drug perturbations – Nature.com

### Generation and validation of SARS-CoV-2 homology models

Homology-based modeling of all 29 SARS-CoV-2 proteins was performed in Modeller95 using a multiple template modeling procedure consistent with previous high-profile homology modeling resources96. In brief, candidate template structures for each query protein were selected by running BLAST97 against all sequences in the PDB64, retaining only templates with at least 30% identity. Remaining templates were ranked using a weighted combination of percent identity and coverage described previously96. The final set of overlapping templates was first seeded with the top-ranked template, and additional templates were added iteratively if (1) the overall coverage increase from the template was at least 10% and (2) the percent identity of the new template was no less than 75% of the identity of the initial seed template (that is, if the seed template showed 80% identity, additional templates with percent identity as low as 60% could be included). Query-template pairwise alignments were generated in Modeller using default settings and were manually trimmed to remove large gaps (five or more gaps in a ten-residue window). Finally, modeling was carried out using the Modeller automodel function.
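The iterative template selection described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Template` class, the function name and the reading of "coverage increase" as newly covered query residues divided by query length are all assumptions, and the ranking step (weighted identity and coverage) is taken as already done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical record for one BLAST hit against the PDB."""
    name: str
    identity: float  # percent identity to the query (0-100)
    start: int       # first query residue covered (1-based, inclusive)
    end: int         # last query residue covered (inclusive)


def select_templates(ranked, query_len, min_identity=30.0,
                     min_gain=0.10, rel_identity=0.75):
    """Greedy selection; `ranked` must already be sorted best-first."""
    # Discard templates below the 30% identity floor.
    candidates = [t for t in ranked if t.identity >= min_identity]
    if not candidates:
        return []
    seed = candidates[0]
    chosen = [seed]
    covered = set(range(seed.start, seed.end + 1))
    for t in candidates[1:]:
        new = set(range(t.start, t.end + 1)) - covered
        # Assumption: coverage gain is measured as a fraction of query length.
        gain = len(new) / query_len
        if gain >= min_gain and t.identity >= rel_identity * seed.identity:
            chosen.append(t)
            covered |= new
    return chosen
```

With a seed at 80% identity, the second condition admits templates down to 60% identity, matching the worked example in the text.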

This approach generated homology models for 18 out of 29 proteins. Based on manual inspection of the template quality and sources, homology models were further filtered to 12 models for which a high-quality template from a SARS-CoV-1 homolog was available. Moreover, during revision of this manuscript, newly deposited PDB structures for many SARS-CoV-2 proteins (https://rcsb.org/covid19) allowed independent validation of homology model quality based on the RMSD following alignment and refinement in PyMol98. Visual representations of these alignments between modeled and solved structures are provided in Extended Data Fig. 1. For all analyses SARS-CoV-2 PDB structures were prioritized where available and only the homology model for nsp14 was retained.

### Interface prediction using ECLAIR

Interface predictions for all 332 interactions reported previously20 were made in two phases. In phase one, we leveraged our previously validated ECLAIR framework44 to perform initial residue-level predictions across all interactions. ECLAIR compiles five sets of features: biophysical, conservation, coevolution, structural and docking. In brief, biophysical features are compiled using a windowed average of several ExPASy ProtScales99, conservation features are derived from the Jensen–Shannon divergence100,101 from known homologs for each protein, coevolution features between interacting proteins are derived from direct coupling analysis102 and statistical coupling analysis103 among paired homologs, structural features are obtained by calculating the solvent-accessible surface area of available PDB64 or ModBase65 models using NACCESS104 and docking features are the average interchain distance and surface occlusion per residue from a consensus of independent Zdock105 trials.

Slight alterations were made to accommodate SARS-CoV-2–human predictions. First, construction of the multiple sequence alignments (MSAs) for statistical coupling analysis and direct coupling analysis calculations requires at least 50 species containing homologs of both interacting proteins. Therefore, coevolution features could not be calculated for interspecies interactions. Second, MSAs for conservation features typically only allow one homolog per species. Because viral species classifications are less precise and are often subdivided into unique strains (and because all higher-order ECLAIR classifiers require protein conservation features), we modified the MSAs for viral proteins to include homologs from various strains in a single species. The initial prediction results from ECLAIR are provided in Supplementary Table 1.

### Interface prediction using guided HADDOCK docking

Interface predictions for all 332 interactions reported previously20 were made in two phases. In phase two, we leveraged high-confidence interface predictions from ECLAIR to perform guided docking in HADDOCK45,46. An introduction to protein–protein docking in HADDOCK is provided at https://www.bonvinlab.org/education/HADDOCK-protein-protein-basic/.

In brief, HADDOCK is designed to perform data-driven docking using (traditionally experimentally derived) priors about the interface. These data (for example scanning mutagenesis) often indicate sets of residues involved in the interface but no pairwise information linking interface residues between each protein. These residues (termed active residues) are used in conjunction with any neighboring surface residues (termed passive residues) to drive rigid body docking, by introducing a scoring penalty for any active residue on one protein not in proximity of an active or passive residue on the other. This approach is formalized as a set of ambiguous interaction restraints (AIRs) that evaluate the distances of each active residue to the active or passive residues on the other protein. The approach ensures that experimental priors about interface composition are enforced, but leaves the exact orientation and pairing of residues flexible to HADDOCK’s energy-based scoring function.

To incorporate computational interface predictions from ECLAIR we use the standard HADDOCK protein–protein docking framework. Active residues are encoded as all high-confidence ECLAIR predictions at the surface (≥15% solvent-accessible surface area (SASA)). Passive residues are identified as all surface residues (≥40% SASA) within 6 Å of an active residue. For definition of surface residues, the 15% SASA cutoff provides consistency with our definition of interface residues, whereas the 40% SASA cutoff provides consistency with the typical recommendation in HADDOCK. All SASA calculations were carried out using NACCESS104 and neighboring residues were selected using PyMol98. Following HADDOCK recommendations to reduce the computational burden from using many restraints, we defined our AIRs using only the α-carbons and increased the upper distance limit from 2 Å to 3 Å. All other HADDOCK run parameters were left at the default. In total, 1,000 rigid body docking trials were performed and the top-200-scored orientations were retained for subsequent iterations, refinement and analysis.
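As a rough illustration of the active and passive residue definitions above, the sketch below works from per-residue values that are assumed to be precomputed. The function name and inputs are hypothetical, and the 6 Å test uses one representative coordinate per residue, whereas the neighbor selection in the text was done in PyMol and may be atom-based.

```python
import math


def haddock_residues(rel_sasa, high_conf, coords,
                     active_cut=15.0, passive_cut=40.0, radius=6.0):
    """Return (active, passive) residue lists.

    rel_sasa:  residue id -> relative SASA in percent
    high_conf: set of residue ids with high-confidence ECLAIR predictions
    coords:    residue id -> (x, y, z) representative coordinate (assumption)
    """
    # Active: high-confidence predictions that are at the surface (>=15%).
    active = {r for r in high_conf if rel_sasa.get(r, 0.0) >= active_cut}
    # Passive: other surface residues (>=40%) within 6 Å of any active residue.
    passive = {
        r for r, sasa in rel_sasa.items()
        if r not in active
        and sasa >= passive_cut
        and any(math.dist(coords[r], coords[a]) <= radius for a in active)
    }
    return sorted(active), sorted(passive)
```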

For each interaction we identified available PDB or homology model structures to determine whether the interaction should be eligible for docking. Previous benchmark evaluations show that HADDOCK performs well using homology models, but that performance drops off for models produced from low sequence identity templates106. In all cases PDB models were prioritized over homology models. We next evaluated risks of using low-coverage structures for protein–protein docking; using structure fragments that completely exclude the true interface residues will produce false interface predictions. We aimed to minimize this risk while maximizing the dockable interactome by setting two conditions for determining structure eligibility. First, protein structures covering at least 33% of the total protein length were considered sufficiently large for docking. Second, protein structures at least 50 residues in length and containing at least one high-confidence ECLAIR-predicted interface residue to use as an active residue were made eligible. Inclusion of an ECLAIR-defined active residue gives us reasonable confidence that part of the interface is covered and therefore, true docked interface predictions should be possible. When multiple structures were available for one protein, ranking was based on the sum of ECLAIR scores for all residues covered by each structure; we always selected the available structure most likely to include the true interface.
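The two eligibility conditions and the ranking rule above can be expressed compactly. This is a sketch under stated assumptions: the function names and input shapes are hypothetical, and the prioritization of PDB structures over homology models is not modeled here.

```python
def is_eligible(struct_len, protein_len, n_high_conf_covered):
    """A structure is dockable if it meets either condition from the text."""
    # Condition 1: covers at least 33% of the full protein length.
    covers_third = struct_len / protein_len >= 0.33
    # Condition 2: at least 50 residues and at least one high-confidence
    # ECLAIR interface residue that can serve as an active residue.
    has_anchor = struct_len >= 50 and n_high_conf_covered >= 1
    return covers_third or has_anchor


def pick_structure(structures, eclair_scores):
    """Choose the structure with the largest summed ECLAIR score.

    structures:    name -> set of covered residue positions
    eclair_scores: residue position -> ECLAIR interface score
    """
    return max(
        structures,
        key=lambda name: sum(eclair_scores.get(p, 0.0) for p in structures[name]),
    )
```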

In total we performed guided HADDOCK docking on 138 out of 332 interactions. The remaining 194 interactions did not have reliable 3D models for both interactors. The top-scored docked conformation from each HADDOCK run was retained. The final docked interface annotations are provided in Supplementary Table 2.

### Definition of interface residues

We annotated interface residues from atomic-resolution docked models, using an established definition for interface residues44. The SASA for both bound and unbound docked structures was calculated using NACCESS104. We defined an interface residue as any residue that is both (1) at the surface of a protein (defined as ≥15% relative accessibility) and (2) in contact with the interacting chain (defined by a ≥1.0 Å² decrease in absolute accessibility).
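The two-part definition above reduces to a simple filter once the accessibility values are available. The sketch below assumes the NACCESS output has already been parsed into dictionaries (the parsing is not shown) and assumes the 15% relative accessibility is evaluated on the unbound chain, which the text does not state explicitly.

```python
def interface_residues(unbound_rel, unbound_abs, bound_abs,
                       min_rel=15.0, min_loss=1.0):
    """Return residue ids that satisfy both interface criteria.

    unbound_rel: residue id -> relative accessibility (%) of the unbound chain
    unbound_abs: residue id -> absolute SASA (Å^2) of the unbound chain
    bound_abs:   residue id -> absolute SASA (Å^2) within the complex
    """
    return sorted(
        r for r in unbound_rel
        if unbound_rel[r] >= min_rel                   # (1) surface residue
        and unbound_abs[r] - bound_abs[r] >= min_loss  # (2) buried on binding
    )
```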

### Human–pathogen co-crystal structure benchmark set

We constructed a benchmark set of experimentally determined co-crystal structures to evaluate the performance of both our ECLAIR and guided HADDOCK docking interface predictions on interspecies interactions (Fig. 2a). First, we parsed 165,567 PDB structures, identified all interacting chains by interface residue calculation and mapped PDB chains to UniProt protein IDs using SIFT74 to identify a total of 33,242 unique protein–protein interactions. Using taxonomic lineages from UniProt we filtered this set to 7,738 interactions involving human proteins, of which 6,256 represented human–human intraspecies interactions and 1,482 represented interspecies interactions between humans and some other species. Finally, to provide the most relevant set of interactions that would be biologically similar to SARS-CoV-2–human interactions, we considered only interactions between human and viral proteins (346) or between human and bacterial proteins (163). We refer to this collective set of 509 co-crystal structures as our human–pathogen PDB benchmark set. The full list of structures and interface annotations for this benchmark set is provided in Supplementary Table 3.

To validate performance of ECLAIR predictions on the human–pathogen PDB benchmark, ECLAIR predictions were run as described above for SARS-CoV-2–human interactions. Evaluation of raw prediction probabilities was performed by AUROC in Python using scikit-learn and was compared against ECLAIR’s original test set containing 200 intraspecies interactions44. Precision and recall metrics were calculated based on ECLAIR’s binary definition for high-confidence versus non-interface predictions.

To validate HADDOCK guided docking performance using our human–pathogen PDB benchmark, we compared performance with a raw HADDOCK docking protocol. Guided docking was performed as described for SARS-CoV-2–human interactions. No PDB protein chains from the human–pathogen benchmark were used during docking. For raw HADDOCK docking no experimental constraints (AIRs) were provided and the ranair and surfrest parameters in the run.cns were set to true. Using these parameters, each rigid dock generates one random AIR between one surface residue from each protein A and B, which is used to ensure that the two protein chains slide together during docking. Overall performance of protocols was evaluated based on precision and recall of the true interface (Fig. 2c). Secondary evaluation was conducted based on RMSD in PyMol before refinement between the docked and co-crystal structures (Fig. 2d). When multiple co-crystal structures were used to define the interfaces, the RMSD was reported as the average RMSD against all co-crystal structures.

### Compilation of sequence variation sets

For analysis of genetic variation that may impact the viral–human interactome, two sets of mutations were compiled: (1) viral mutations and (2) human population variants.

For viral mutations, we identified sequence divergences between SARS-CoV-1 and SARS-CoV-2 versions of each protein based on alignment. Representative sequences for 16 SARS-CoV-1 proteins were obtained from UniProt (Proteome ID UP000000354)107,108. Sequences for 29 SARS-CoV-2 proteins were reported previously20 and based on GenBank accession code MN985325 (refs. 109,110). Notably, UniProt accession codes for the SARS-CoV-1 proteome report two sequences for the uncleaved ORF1a and ORF1a-b, which correspond to NSP1 through NSP16 in SARS-CoV-2. Sequence divergences were reported after pairwise Needleman–Wunsch alignment111,112 (using the Blosum62 scoring matrix, a gap open penalty of 10 and a gap extension penalty of 0.5) between the corresponding protein sequences from each species. A total of 1,003 missense variants were detected among 23 SARS-CoV-2 proteins. No suitable alignment from a SARS-CoV-1 sequence was available for orf3b, orf8 or orf10. Additionally, orf7b, nsp3 and nsp16 were excluded because they were not involved in any viral–human interactions. The full list of SARS-CoV-2 mutations is reported in Supplementary Table 5.
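Once a pairwise global alignment is in hand, the missense divergences can be read off column by column. The sketch below covers only that last step: it assumes the two aligned strings (with `-` for gaps) were produced elsewhere with the parameters given above, and it numbers positions along the SARS-CoV-2 sequence, which is an assumption about the reporting convention.

```python
def missense_divergences(aligned_cov1, aligned_cov2):
    """List (cov1_residue, cov2_position, cov2_residue) substitutions.

    Both inputs are rows of one pairwise alignment, so they must have
    equal length. Gap columns (insertions or deletions) are skipped.
    """
    if len(aligned_cov1) != len(aligned_cov2):
        raise ValueError("aligned sequences must have equal length")
    position = 0  # 1-based position in the ungapped SARS-CoV-2 sequence
    divergences = []
    for res1, res2 in zip(aligned_cov1, aligned_cov2):
        if res2 != "-":
            position += 1
        if res1 != "-" and res2 != "-" and res1 != res2:
            divergences.append((res1, position, res2))
    return divergences
```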

We obtained human population variants for all 332 human proteins interacting with SARS-CoV-2 proteins from gnomAD61. We used gnomAD’s graphQL API to run programmatic queries to fetch all missense variants per gene. Details on performing gnomAD queries in this manner are available at https://github.com/broadinstitute/gnomad-browser/tree/master/projects/gnomad-api. We used the Ensembl Variant Effect Predictor113 to map gnomAD DNA-level single-nucleotide polymorphisms (SNPs) to equivalent protein-level UniProt annotations. After Variant Effect Predictor mapping, variants were parsed to ensure the reported reference amino acid and position agree with the UniProt sequence and roughly 4.4.6% of variants that did not match were dropped from our dataset because they could not reliably be mapped to UniProt coordinates. In total 127,528 human population variants were curated. The full list of human population variants from gnomAD is reported in Supplementary Table 4.
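The consistency check against UniProt described above amounts to confirming that each variant's reference amino acid is found at the stated position. A minimal sketch, assuming variants are held as hypothetical `(ref_aa, position, alt_aa)` tuples with 1-based positions:

```python
def matches_uniprot(sequence, ref_aa, position):
    """True if the UniProt sequence has `ref_aa` at 1-based `position`."""
    return 1 <= position <= len(sequence) and sequence[position - 1] == ref_aa


def filter_variants(sequence, variants):
    """Keep variants whose reference residue agrees with the sequence.

    Returns the retained variants and the number that were dropped.
    """
    kept = [v for v in variants if matches_uniprot(sequence, v[0], v[1])]
    return kept, len(variants) - len(kept)
```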

### Log odds enrichment calculations

To determine enrichment or depletion, odds ratios (ORs) were calculated as described previously114:

$$\mathrm {OR} = \frac{{a/c}}{{b/d}}$$

where a, b, c and d are the counts in a contingency table between case and exposure criteria. For example, when we are interested in the enrichment of viral mutations or human population variants (case, variant versus nonvariant) along predicted interaction interfaces (exposure, interface versus non-interface), we would have:

$${a} = {\mathrm{number}}\,{\mathrm{of}}\,{\mathrm{variant}}\,{\mathrm{interface}}\,{\mathrm{residues}}$$

$${b} = {\mathrm{number}}\,{\mathrm{of}}\,{\mathrm{nonvariant}}\,{\mathrm{interface}}\,{\mathrm{residues}}$$

$${c} = {\mathrm {number}}\,{\mathrm{of}}\,{\mathrm{variant}}\,{\mathrm{noninterface}}\,{\mathrm{residues}}$$

$${d} = {\mathrm {number}}\,{\mathrm{of}}\,{\mathrm{nonvariant}}\,{\mathrm{noninterface}}\,{\mathrm{residues}}$$

Statistical tests for enrichment or depletion were performed by calculating the z-statistic and corresponding two-sided P value for the OR (unadjusted for multiple hypothesis testing).

$$z = \frac{{\ln OR}}{{\sqrt {\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} }}$$

All reported ORs were log2 transformed to maintain interpretable symmetry between enriched and depleted values. To avoid arbitrary OR inflation or depletion from missing data, in all cases where the interface residues were predicted by molecular docking, the OR was altered to only account for positions that were included in the structural models used for docking.
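The OR, z-statistic and two-sided P value defined above can be computed with the standard library alone. In this sketch the function name is hypothetical, the P value is taken from the standard normal distribution, and zero counts are not handled (they would raise `ZeroDivisionError`).

```python
import math


def odds_ratio_test(a, b, c, d):
    """Return (log2 OR, z, two-sided P) for a 2x2 contingency table.

    a: variant interface residues      b: nonvariant interface residues
    c: variant noninterface residues   d: nonvariant noninterface residues
    """
    odds_ratio = (a / c) / (b / d)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = math.log(odds_ratio) / standard_error
    # Two-sided tail probability of a standard normal variate.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.log2(odds_ratio), z, p_value
```

Because the OR is log2 transformed, swapping the interface and noninterface rows flips the sign of the first returned value without changing its magnitude, which is the symmetry the text refers to.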

### Curation of disease-associated variants

To explore whether human proteins interacting with SARS-CoV-2 proteins were enriched for disease or trait-associated variants, three datasets were curated: HGMD68, ClinVar69 and the NHGRI-EBI GWAS catalog70. Disease annotations for HGMD and ClinVar were downloaded directly from these resources and mapped to UniProt. To calculate enrichment of individual disease terms, we reconstructed the disease ontology from NCBI MedGen term relationships (https://ftp.ncbi.nlm.nih.gov/pub/medgen/MGREL.RRF.gz) and propagated counts up through all parent nodes up to a singular root node. A meaningful subset of significantly enriched terms were reported using the most general term with no more significant ancestor term (Supplementary Table 7, sheet 1). Raw enrichment values for all terms are also provided (Supplementary Table 7, sheet 2).
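Propagating annotations up an ontology can be sketched as below. This is an illustration only: the term names in any example are hypothetical, the parent map is assumed to have been parsed from the MedGen relationship file already, and items are tracked as sets so that a variant reachable through two child terms is counted once at a shared ancestor, which is an assumption about how counts were combined.

```python
def ancestors(term, parents):
    """All ancestors of `term` in a parent map (term -> iterable of parents)."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate_counts(parents, direct):
    """Count annotated items per term, including items from descendants.

    direct: term -> set of items annotated directly to that term
    """
    totals = {}
    for term, items in direct.items():
        for node in {term} | ancestors(term, parents):
            totals.setdefault(node, set()).update(items)
    return {node: len(items) for node, items in totals.items()}
```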

For curation of disease and trait associations from the NHGRI-EBI GWAS catalog (http://www.ebi.ac.uk/gwas/)70, lead SNPs (P value < 5 × 10⁻⁸) for all diseases/traits were retrieved on 16 June 2020. Proxy SNPs in high linkage disequilibrium (LD) (parameters, R² > 0.8; pop, ‘ALL’) for individual lead SNPs were obtained through programmatic queries to the LDproxy API115, which used phase 3 haplotype data from the 1000 Genomes Project as reference for calculating pairwise metrics of LD. Both lead SNPs and proxy SNPs were filtered to retain only missense variants.

### In silico scanning mutagenesis and ΔΔG estimation

To explore the importance of each SARS-CoV-2–human interface residue and the impact of all possible mutations along the interface, we performed in silico scanning mutagenesis. We used a setup provided by the PyRosetta documentation (https://graylab.jhu.edu/pyrosetta/downloads/scripts/demo/D090_Ala_scan.py) designed around an approach previously benchmarked to correctly identify nearly 80% of interface hotspot mutations59. For consistency, we replaced the PyRosetta implementation’s definition of interface residues (≤ 8.0 Å away from partner chain), with our definition described above.

We encourage reference to the original well-documented demo for details, but in brief, we considered all interface residue positions and began by estimating the wild-type binding energy for the interaction. The complex state energy is calculated following a PackRotamersMover operation to optimize the side chains of residues within 8.0 Å of the interface residue to be mutated. The chains are separated 500.0 Å to eliminate any interchain energy contributions and energy for the unbound state is calculated the same way. The difference between these two values provides the binding energy for the wild-type (WT) structure.

$$\Delta G_{\mathrm{WT}} = E_{\mathrm{complex}} - E_{\mathrm{unbound}}$$

To estimate the binding energy for all 19 amino acid mutations possible at the given position, each mutation is made iteratively and the mutant binding energy (ΔG_Mut) is calculated in the same way as for the wild type, using the mutated structures. Finally, the change in binding energy from each mutation is the difference between these two binding energies.

$$\Delta\Delta G = \Delta G_{\mathrm{Mut}} - \Delta G_{\mathrm{WT}}$$

The scoring function used for these calculations is as described previously59 using the following weights: fa_atr = 0.44, fa_rep = 0.07, fa_sol = 1.0, hbond_bb_sc = 0.5, hbond_sc = 1.0. To account for stochasticity of the PackRotamersMover optimization between trials, all ΔΔG values are reported from an average of ten independent trials. To test whether a mutation had a significantly nonzero impact on binding energy, a two-sided z-test between the ten independent trials was performed. To account for average impact of other same amino acid mutations at other positions along the interface, each average ΔΔG was z-normalized relative to the rest of the interface and outliers were called at ≥1 × s.d. away from the mean. Mutations that passed both criteria were identified as interface binding affinity hotspots. No adjustments were made for multiple hypothesis corrections.
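The two hotspot criteria above can be sketched as follows, starting from per-trial ΔΔG values that are assumed to have been computed already. Several choices here are assumptions rather than details from the text: the significance threshold `alpha=0.05`, the use of a normal approximation for the test on the trial mean, and the inclusion of the mutation itself in the group used for z-normalization (all mutations to the same amino acid across interface positions).

```python
import math
from statistics import mean, stdev


def nonzero_p(trials):
    """Two-sided z-test P value that the mean trial ΔΔG differs from zero."""
    m, s = mean(trials), stdev(trials)
    if s == 0:
        return 0.0 if m != 0 else 1.0
    z = m / (s / math.sqrt(len(trials)))
    return math.erfc(abs(z) / math.sqrt(2))


def call_hotspots(ddg_trials, alpha=0.05, z_cut=1.0):
    """Return (position, mutant_aa) keys that pass both criteria.

    ddg_trials: (position, mutant_aa) -> list of ΔΔG values, one per trial
    """
    means = {key: mean(values) for key, values in ddg_trials.items()}
    hotspots = []
    for (pos, aa), avg in means.items():
        # Compare against the same substitution at all interface positions.
        peers = [v for (_, other_aa), v in means.items() if other_aa == aa]
        if len(peers) < 2:
            continue
        mu, sd = mean(peers), stdev(peers)
        if sd == 0:
            continue
        z_norm = (avg - mu) / sd
        if abs(z_norm) >= z_cut and nonzero_p(ddg_trials[(pos, aa)]) < alpha:
            hotspots.append((pos, aa))
    return sorted(hotspots)
```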

### Predicting ΔΔG from SARS-CoV-1 and SARS-CoV-2 divergences

Estimates of the overall impact of the cumulative set of mutations between SARS-CoV-1 and SARS-CoV-2 were made based on the in silico mutagenesis framework modified to introduce multiple mutations at a time. We generated interaction models using the SARS-CoV-1 protein by applying all amino acid substitutions between the two viruses to initial docked models containing the SARS-CoV-2 protein. A minority of mutations that comprised insertions or deletions could not be modeled under this framework. The ΔΔG calculation here was identical to the single mutation ΔΔG described above, except that side-chain rotamer optimization involved all residues within 8.0 Å of any of the mutated residues. The ΔΔG values were calculated considering the SARS-CoV-1 as the wild-type such that a negative ΔΔG indicates that the interaction is more stable (lower binding energy) in the SARS-CoV-2 version of the interaction compared to the SARS-CoV-1 version of the interaction:

$$\Delta\Delta G = \Delta G_{\mathrm{SARSCoV2}} - \Delta G_{\mathrm{SARSCoV1}}$$

To account for stochasticity between trials for these predictions (which notably had a larger impact likely due to the decreased constraints on rotamer optimization in these cases), this set of ΔΔG values was reported as an average of 50 trials. Outliers for overall binding affinity change from SARS-CoV-1 to SARS-CoV-2 were called based on similar criteria to the individual mutations, except the z score normalization was performed relative to all other interactions.

### Protein–ligand docking using smina

To further prioritize 76 previously reported candidate drugs targeting human proteins in the SARS-CoV-2–human interactome20, we performed protein–ligand docking for 30 interaction–drug pairs (involving 25 unique drugs) that were amenable to docking. For docking, we excluded any human protein targets whose structures were below 33% coverage. To prepare for docking, 3D structures for all ligands were first generated using Open Babel116 and the command:

obabel -:"[SMILES_STRING]" --gen3d -opdb -O [OUT_FILE] -d

Protein–ligand docking was executed using smina87 with the following parameters. The autobox_ligand option was turned on and centered around the receptor PDB file with an autobox_add border size of 10 Å. To increase the number of independent stochastic sampling trajectories and increase the likelihood of identifying a global minimum, the exhaustiveness was set to 40 and the num_modes was set to retain the top 1,000 ranked models. To reduce real wall time, each docking process was run using five CPU cores (no impact on net CPU time). The final smina command used was as follows:

smina -r [RECEPTOR] -l [LIGAND] --autobox_ligand [RECEPTOR] --autobox_add 10 -o [OUT_FILE] --exhaustiveness 40 --num_modes 1000 --cpu 5 --seed [SEED]

Each protein–ligand docking command was repeated ten times (essentially the same as one trial with exhaustiveness set to 400) with a unique seed to saturate the ligand binding search space as thoroughly as possible. We note that a single run with exhaustiveness ranging from 30–50 is considered sufficient for most applications87. To retain candidate poses covering different low-energy binding sites, a final set of up to ten of the best-scoring poses with centers at least 1 Å away from one another was selected. Results described in this manuscript are reported based on the top-ranked pose. Protein residues involved in drug binding sites were annotated using the same criteria used to define interface residues. The record type for all ligand atoms was first manually changed from HETATM to ATOM because NACCESS otherwise excluded ligand atoms from the solvent-accessible surface area calculations.
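The repeated runs and the pose selection described above could be scripted along the following lines. This is a sketch, not the authors' pipeline: the output file naming, the seed values and the pose representation as `(score, center)` pairs are hypothetical, lower scores are assumed to be better, and the selection is a simple greedy pass, which is one plausible reading of the text.

```python
import math


def smina_commands(receptor, ligand, out_prefix, seeds):
    """Build one smina argument list per seed, using the flags shown above."""
    return [
        ["smina", "-r", receptor, "-l", ligand,
         "--autobox_ligand", receptor, "--autobox_add", "10",
         "-o", f"{out_prefix}_seed{seed}.sdf",  # hypothetical naming scheme
         "--exhaustiveness", "40", "--num_modes", "1000",
         "--cpu", "5", "--seed", str(seed)]
        for seed in seeds
    ]


def select_poses(poses, min_separation=1.0, max_poses=10):
    """Greedily keep the best-scoring poses whose centers are well separated.

    poses: iterable of (score, (x, y, z)) pooled across all runs.
    """
    kept = []
    for score, center in sorted(poses, key=lambda pose: pose[0]):
        if all(math.dist(center, c) >= min_separation for _, c in kept):
            kept.append((score, center))
            if len(kept) == max_poses:
                break
    return kept
```

Each argument list can be passed to `subprocess.run` as is; the first element of the list returned by `select_poses` corresponds to the top-ranked pose.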

### Validation of smina docking to identify drug binding sites

Past evaluation of smina shows competitive performance across numerous Community Structure-Activity Resources87,88. However, traditional docking evaluation tasks focus on sampling and correctly scoring docked conformations within a single known binding site and may frequently restrict the docking space to a bounding box of a few angstroms around the known ligand conformation. The focus is on recovering precisely how a ligand orients within a binding site rather than identifying the binding site from the whole protein surface.

Because this performance metric may not provide sufficient confidence in smina’s ability to identify a binding site from scratch (our application in this manuscript) we re-benchmarked smina’s performance using an established drug docking benchmark set containing 4,399 protein–ligand complexes representing 95 protein targets88. We defined true ligand binding site residues from the available crystal structure and evaluated the fraction correctly recovered by smina’s top-ranked dock across the full protein surface.

Docking was performed as above and evaluated based on both re-docking (ligand docked back into the exact receptor structure it came from) and cross-docking (ligand docked into an alternate conformation of the receptor it came from) conditions. Because the conformation of the binding pocket from an alternate receptor may not perfectly accommodate the ligand, cross-docking is considered more difficult, but also more representative of real conditions when making new predictions.

To provide a reference for whether smina selectively recovered the true binding site we calculated a baseline random expectation. Artificial binding sites were defined by selecting a single surface residue and its N nearest neighbors, where N is the number of binding site residues in the true binding site. The average recovery of the true binding site from all such artificial binding sites was used as the null expectation for each drug–target pair.
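The null expectation above can be approximated with a short routine. The sketch makes several assumptions not stated in the text: each surface residue is represented by a single coordinate, nearest neighbors are drawn from surface residues only, and recovery is the fraction of true binding site residues contained in each artificial site.

```python
import math


def null_recovery(surface_coords, true_site):
    """Average recovery of the true site over all artificial binding sites.

    surface_coords: residue id -> (x, y, z) for every surface residue
    true_site:      set of residue ids in the true binding site
    """
    n = len(true_site)
    recoveries = []
    for seed, xyz in surface_coords.items():
        # Rank all other surface residues by distance to the seed residue.
        neighbors = sorted(
            (r for r in surface_coords if r != seed),
            key=lambda r: math.dist(xyz, surface_coords[r]),
        )
        # Artificial site: the seed plus its N nearest neighbors.
        site = {seed, *neighbors[:n]}
        recoveries.append(len(site & true_site) / n)
    return sum(recoveries) / len(recoveries)
```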

### Construction of plasmids for Y2H and co-IP

Clones of all human proteins tested were picked from the hORFeome 8.1 library117. Clones for all SARS-CoV-1 and SARS-CoV-2 proteins tested were designed to match GenBank entries AY357076 and MN908947, respectively. To construct plasmids for testing by Y2H, viral genes were PCR amplified and cloned into pDEST-AD and pDEST-DB vectors. For co-IP, Gateway LR reactions were used to transfer bait SARS-CoV-2 nsp1 protein into a pQXIP (ClonTech, 631516) vector modified to include a Gateway cassette featuring a carboxy-terminal 3× FLAG.

### Yeast two-hybrid screens

Y2H experiments were carried out as previously described76,81,118 to (1) confirm that SARS-CoV-2–human interactions previously detected by IP–mass spectrometry could be recapitulated in Y2H, (2) compare the occurrence of interactions using SARS-CoV-1 versus SARS-CoV-2 viral baits and (3) profile the disruption of SARS-CoV-2–human interactions by human population variants. In brief, human and viral clones were transferred into Y2H vectors pDEST-AD and pDEST-DB by Gateway LR reactions and then transformed into MATa Y8800 and MATα Y8930, respectively. For comparisons of interest, the viral–human interactions were screened in both orientations; namely, viral DB-ORF MATα transformants were mated against corresponding human AD-ORF MATa transformants and vice versa. All DB-ORF yeast cultures were also mated against MATa yeast transformed with an empty pDEST-AD vector to screen for autoactivators. Mated transformants were incubated overnight at 30 °C, before being plated onto selective Synthetic Complete agar medium lacking leucine and tryptophan (SC-Leu-Trp) to select for mated diploid yeast. After another overnight incubation at 30 °C, diploid yeast were plated onto two sets of SC-Leu-Trp agar selection plates: one lacking histidine and supplemented with 1 mM of 3-amino-1,2,4-triazole (SC-Leu-Trp-His+3AT), the other lacking adenine (SC-Leu-Trp-Ade). After overnight incubation at 30 °C, plates were replica-cleaned and incubated again for 3 d at 30 °C for final interaction calling.

### Cell culture, co-immunoprecipitation and western blotting

HEK 293T cells (ATCC, CRL-3216) were maintained in complete DMEM supplemented with 10% FBS. Cells were seeded onto six-well dishes and incubated until 70–80% confluency. Cells were then transfected with 1 µg of empty vector, SARS-CoV-1 nsp1 or SARS-CoV-2 nsp1, combined with 10 µl of 1 mg ml⁻¹ PEI (Polysciences, 23966) and 150 µl OptiMEM (Gibco, 31985-062). After 24 h incubation, cells were gently washed three times in 1× PBS and then resuspended in 200 µl cell lysis buffer (10 mM Tris-Cl, pH 8.0, 137 mM NaCl, 1% Triton X-100, 10% glycerol, 2 mM EDTA and 1× EDTA-free Complete Protease Inhibitor tablet (Roche)) and incubated on ice for 30 min. Extracts were cleared by centrifugation for 10 min at 16,000g at 4 °C. For co-IP, 100 µl cell lysate per sample was incubated with 5 μl EZ view Red Anti-FLAG M2 Affinity Gel (Sigma, F2426) for 2 h at 4 °C under gentle rotation. After incubation, bound proteins were washed three times in cell lysis buffer and then eluted in 50 μl elution buffer (10 mM Tris-Cl pH 8.0, 1% SDS) at 65 °C for 10 min. Cell lysates and co-IP samples were then treated in 6× SDS protein loading buffer (10% SDS, 1 M Tris-Cl, pH 6.8, 50% glycerol, 10% β-mercaptoethanol and 0.03% bromophenol blue) and subjected to SDS–PAGE. Proteins were then transferred from gels onto PVDF (Amersham) membranes. Anti-FLAG (Sigma, F1804) and anti-PRIM2 (abcam, ab241990) at 1:3,000 dilutions were used for immunoblotting analysis.

### Cloning human population variants through site-directed mutagenesis

Mutant clones containing human population variants were generated using site-directed mutagenesis as described previously83. In brief, wild-type G3BP2 was picked from the hORFeome 8.1 library117 and used as a template for site-directed mutagenesis. Site-specific mutagenesis primers (Eurofins) for mutagenesis were designed using the webtool primer.yulab.org. To minimize sequencing artifacts, PCR was limited to 18 cycles using Phusion polymerase (NEB, M0530). PCR products were digested overnight with DpnI (NEB, R0176) then transformed into competent bacteria cells to isolate single colonies. To confirm successful mutagenesis single colonies were then hand-picked, incubated for 21 h at 37 °C under constant vibration and submitted for Sanger sequencing to ensure the desired single base-pair mutation (and no other mutations) had been introduced.

### Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Source of this news: https://www.nature.com/articles/s41592-021-01318-w
