
SARS-CoV 3CL protease cleaves its C-terminal autoprocessing site by novel subsite cooperativity

Tomonari Muramatsu a,b,1, Chie Takemoto a,c, Yong-Tae Kim a,2, Hongfei Wang a,3, Wataru Nishii a,b, Takaho Terada a,b, Mikako Shirouzu a,c, and Shigeyuki Yokoyama a,b,1

a RIKEN Systems and Structural Biology Center, Tsurumi, Yokohama 230-0045, Japan; b RIKEN Structural Biology Laboratory, Tsurumi, Yokohama 230-0045, Japan; and c Division of Structural and Synthetic Biology, RIKEN Center for Life Science Technologies, Tsurumi, Yokohama 230-0045, Japan

Edited by Masaru Tanokura, University of Tokyo, Bunkyo-ku, Japan, and accepted by Editorial Board Member Gregory A. Petsko September 29, 2016 (received for review February 3, 2016)

The 3C-like protease (3CLpro) of severe acute respiratory syndrome coronavirus (SARS-CoV) cleaves 11 sites in the polyproteins, including its own N- and C-terminal autoprocessing sites, by recognizing P4–P1 and P1′. In this study, we determined the crystal structure of 3CLpro with the C-terminal prosequence and the catalytic-site C145A mutation, in which the enzyme binds the C-terminal prosequence of another molecule. Surprisingly, Phe at the P3′ position [Phe(P3′)] is snugly accommodated in the S3′ pocket. Mutations of Phe(P3′) impaired the C-terminal autoprocessing, but did not affect N-terminal autoprocessing. This difference was ascribed to the P2 residue, Phe(P2) and Leu(P2), in the C- and N-terminal sites, as follows. The S3′ subsite is formed by Phe(P2)-induced conformational changes of 3CLpro and the direct involvement of Phe(P2) itself. In contrast, the N-terminal prosequence with Leu(P2) does not cause such conformational changes for the S3′ subsite formation. In fact, the mutation of Phe(P2) to Leu in the C-terminal autoprocessing site abolishes the dependence on Phe(P3′). These mechanisms explain why Phe is required at the P3′ position when the P2 position is occupied by Phe rather than Leu, which reveals a type of subsite cooperativity. Moreover, the peptide consisting of P4–P1 with Leu(P2) inhibits protease activity, whereas that with Phe(P2) exhibits a much smaller inhibitory effect, because Phe(P3′) is missing. Thus, this subsite cooperativity likely exists to avoid the autoinhibition of the enzyme by its mature C-terminal sequence, and to retain the efficient C-terminal autoprocessing by the use of Phe(P2).

SARS | 3CL protease | specificity | subsite cooperativity | crystal structure

Severe acute respiratory syndrome coronavirus (SARS-CoV) produces several functional proteins in infected human cells by cleaving them from its two overlapping “polyproteins,” pp1a (486 kDa) and pp1ab (790 kDa) (1). Papain-like protease 2 (PL2pro) and 3C-like protease (3CLpro, also referred to as the main protease, Mpro) are included among these polyproteins. PL2pro cleaves three sites and 3CLpro cleaves 11 sites in the polyproteins to generate individual functional proteins, including an RNA-dependent RNA polymerase, a helicase, a single-stranded RNA-binding protein, an exoribonuclease, an endoribonuclease, and a 2′-O-ribose methyltransferase (1). 3CLpro is a cysteine protease that is excised from polyproteins by its own proteolytic activity (1) and forms a homodimer with one active site per subunit (2). 3CLpro reportedly recognizes the residues from P4 to P1 on the N-terminal side and P1′ on the C-terminal side of the cleavage sites, based on the consensus sequences around the processing sites in the SARS-CoV polyproteins and the extensive mutagenesis analyses of the N-terminal autoprocessing site (3, 4).

Three types of crystal structures of SARS-CoV 3CLpro have been reported (as reviewed in ref. 5): the wild-type active dimer (wt-dimer) (2, 6); the monomeric forms of the G11A, R298A, and S139A mutants, which cannot dimerize (7–10); and the super-active octamer (11). The 3CLpro subunit consists of the N-terminal finger (residues 1–8), the catalytic domain (residues 8–184), and the C-terminal domain (residues 201–306) (2), and the overall domain structures are the same among all of the reported 3CLpro structures. 3CLpro requires dimerization for the proteolytic activity (12), as suggested by the structure of the wt-dimer (2, 6).

As for the C-terminal autoprocessing mechanism of 3CLpro, the structure of the “C-terminal product”-bound form has been reported for the mature-type 3CLpro C145A mutant, in which the C-terminal portion of one subunit is bound to a subunit in an adjacent asymmetric unit (13). In the present study, we crystallized the dimer of the 3CLpro proform, containing a 10-residue C-terminal prosequence, with the C145A mutation of the catalytic cysteine residue. As expected, we found that one of these prosequences is bound, as a substrate, to the active site of a subunit from an adjacent asymmetric unit. Based on this structure and biochemical experiments, we conclude that Phe at the P3′ position is required when the P2 residue is Phe. This recognition mode appears only for the C-terminal autoprocessing of 3CLpro in the SARS-CoV polyproteins.

Results and Discussion

The Crystal Structure of the C-Terminal Proform of SARS-CoV 3CLpro Revealed Unexpected Binding of the P3′ Residue to the Enzyme. The “C-terminal proform” of 3CLpro was designed with a 10-residue

Significance

The SARS-CoV protease (3CLpro) has “noncanonical” substrate specificity for its C-terminal autoprocessing. Phe is required at both the second position upstream of the cleavage point (P2) and the third downstream position (P3′). This finding is surprising, given that 3CLpro reportedly requires Leu at the P2 position with no preference at the P3′ position. The conventional “consensus sequence” cannot explain this noncanonical specificity. Crystallography revealed that Phe at the P2 position changes the conformation of the substrate-binding pocket, and thereby creates the subsite for Phe at the P3′ position. This noncanonical specificity avoids the autoinhibition due to the mature C-terminal sequence of 3CLpro, which should be serious if Leu exists at the P2 position.

Author contributions: T.M. and S.Y. designed research; T.M., C.T., Y.-T.K., H.W., W.N., T.T., and M.S. performed research; T.M. and S.Y. analyzed data; and T.M. and S.Y. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. M.T. is a Guest Editor invited by the Editorial Board.

Freely available online through the PNAS open access option.

Data deposition: The atomic coordinates and structure factors of the mature form and the proform of SARS-CoV 3CLpro have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 2DUC and 5B6O).

1 To whom correspondence may be addressed. Email: [email protected] or [email protected].

2 Present address: Department of Food Science and Biochemistry, Kunsan National University, Kunsan 573-701, Jeonbuk, Korea.

3 Present address: Institute of Opto-Electronics and Institute of Molecular Science, Shanxi University, Shanxi 030006, China.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1601327113/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1601327113 PNAS | November 15, 2016 | vol. 113 | no. 46 | 12997–13002


C-terminal prosequence, together with a cleavage-defective active-site mutation (C145A) (Fig. 1A). We determined the crystal structure of the proform at 2.2-Å resolution (Fig. 1B) and that of the mature form at 1.7-Å resolution (Fig. S1A), as summarized in Table S1. Their overall homodimer structures are almost the same as the mature-form structures reported previously (2, 13).

In the crystal of the proform, the C-terminal portion of one of the subunits is bound to the active site of the subunit from an adjacent asymmetric unit (Fig. 1C and Fig. S2), as in the structure of the C-terminal product-bound form (13). The interaction of the C-terminal portion with the adjacent molecule is facilitated by the inherent flexibility of the C-terminal heptapeptide CSGVTFQ (residues 300–306; Fig. S1B), which corresponds to the P7–P1 residues. The P4, P3, P2, and P1 residues interact with the substrate recognition sites (S4–S1) (Fig. 1C and Fig. S2). Their binding modes are almost the same as those in the C-terminal product-bound structure (13) and the N-terminal substrate-bound structure (17), which clarifies the substrate preferences for the P4–P1 positions, as determined by statistical and experimental methods (1, 3, 4, 18) (Fig. 1 D and E).

Our structure contains the P1′–P4′ residues of the prosequence, and thus represents the “C-terminal substrate”-bound form (Fig. 1C and Fig. S2). Statistical and experimental analyses (1, 3, 4, 18) revealed that the P1′ position has a preference for small amino acids, but no preferences have been reported for the P2′ and P3′ positions (Fig. 1 D and E). Surprisingly, we found that the side chain of the P3′ residue, Phe, is also accommodated in a specifically complementary pocket (Fig. 1C and Fig. S2). Therefore, we tested whether the P3′ position is actually recognized by the pocket, and whether it is important for autoprocessing.
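The residue numbers and P-position labels used above (for example, Val303 as P4 and Phe309 as P3′ of the C-terminal site, with Gln306 as P1) follow a simple offset rule. The following is a minimal sketch of that bookkeeping, not part of the original analysis; the function name `subsite_label` is hypothetical, and the handling of the N-terminal numbering (which runs ..., −2, −1, 1, 2, ... with no residue 0) is an assumption based on the residue names used in the text (Leu−2 as P2, Phe3 as P3′).

```python
def subsite_label(residue: int, p1_residue: int) -> str:
    """Return the P-position label of `residue` for a site cleaved right after `p1_residue`.

    Hypothetical helper for illustration only. Residue numbering is assumed to
    skip 0, as in the N-terminal prosequence numbering (..., -2, -1, 1, 2, ...).
    """
    if residue == 0 or p1_residue == 0:
        raise ValueError("residue numbering has no position 0")

    def index(r: int) -> int:
        # Map numbering that skips 0 onto consecutive integers.
        return r if r < 0 else r - 1

    offset = index(residue) - index(p1_residue)
    if offset <= 0:
        # Residues at or upstream of the scissile bond: P1, P2, P3, ...
        return f"P{1 - offset}"
    # Residues downstream of the scissile bond: P1', P2', P3', ...
    return f"P{offset}'"


# C-terminal autoprocessing site: Gln306 is P1.
for res in (300, 303, 305, 306, 309):
    print(res, subsite_label(res, 306))

# N-terminal autoprocessing site: residue -1 is P1.
for res in (-2, -1, 1, 3):
    print(res, subsite_label(res, -1))
```

The plain apostrophe in the returned labels stands in for the prime symbol used in the text.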

The P3′ Residue of the C-Terminal Processing Position Is Recognized by the Enzyme, in Contrast to the N-Terminal Processing Site. We previously established an analysis system for 3CLpro autoprocessing (19) by duplicating and differentiating the participating molecules as the enzyme and the substrate (Fig. 2A). As the enzyme, the 3CLpro moiety with the N- and C-terminal prosequences, accompanied by an S-tag (a 15-residue tag derived from RNase S, which is detected by the S protein, the other part of RNase S) and a His-tag, respectively, was synthesized by an Escherichia coli cell-free protein synthesis method (20) (Fig. 2A). Note here that the mature 3CLpro is formed during the protein synthesis reaction (30 °C, 4 h), because both the N- and C-terminal processing sites are completely cleaved by the activity of the 3CLpro moiety (19). The substrate molecule was differentiated from the above construct by replacing the core region of the 3CLpro moiety with green fluorescent protein (GFP) (Fig. 2A), and was prepared by the same cell-free method. It has the S-tag, the N-terminal prosequence, the N-terminal 10 residues of 3CLpro, GFP, the C-terminal 10 residues of 3CLpro, the C-terminal prosequence, and the 6× His tag, in that order. This substrate molecule lacks the catalytic site, the substrate-binding site, and dimerization ability. The enzyme and the substrate were combined immediately after their synthesis, and the apparent kcat/KM value (i.e., cleavability) was obtained as described previously (19) (Fig. S3). Using this analysis system, we separately examined the effects of various mutations of the substrate and the enzyme on the cleavage reactions (Fig. S4).

The C-terminal cleavability (i.e., apparent kcat/KM), with the mature-form wild-type enzyme, was reduced to 1/5 by the alteration of Phe309 of the substrate (Fig. 2B, enzyme:wt, substrate:wt vs. F309A/F309N). This means that the P3′ position for C-terminal processing is actually recognized by the enzyme. In contrast, for N-terminal processing, the alteration of Phe3 of the substrate to Ala/Asn (F-to-A/N) did not affect the kcat/KM value (Fig. 2C, wt). Therefore, the P3′ position is recognized in C-terminal processing, in good agreement with the crystal structure of the proform with the C-terminal prosequence (Fig. 1C). In contrast, the P3′ position is not recognized in N-terminal processing, in agreement with the previous study using the N-terminal processing sequence as a substrate (4).

In addition to the mature enzyme, its proforms with the N- and/or C-terminal prosequences may be involved in autoprocessing. Because this enzyme prefers Gln at the P1 position (4), the Gln(Q)-to-Asn(N) mutations were used to cause the loss of cleavage at its own N- and/or C-terminal processing site(s), which should result in the proforms. In fact, the Q-to-N mutation(s) at the P1 position(s) (position –1 and/or position 306 of the enzyme for autoprocessing of the N and C termini, respectively) abolished

Fig. 1. The crystal structure of the C-terminal proform of SARS-CoV 3CLpro reveals unexpected binding of the P3′ residue to the enzyme. (A) The SARS-CoV 3CLpro proform used for the 3D structure analyses. This proform contains the mutated catalytic site C145A, with the 10-aa residue C-terminal prosequence. (B) Crystal structure of the proform homodimer (green and blue ribbon models). Residues 1–310 and 1–301, respectively, can be observed in the electron density map. Ser1, His41, Ala145, and Gln306 are indicated by red sticks. (C) The C-terminal sequence (the P4, P3, P2, and P1 residues; blue stick models) and the prosequence (the P1′, P2′, P3′, and P4′ residues; pink stick models) are bound to the enzyme as a substrate in the adjacent asymmetric unit (surface model with charge). The unbiased Fo–Fc difference electron density map for omitted residues 303–310 (P4–P5′) is contoured at 3.0σ. The map was developed using PHENIX (14). The replaced active site residue (Ala145) is also indicated. (D) The 11 sequences that are cleaved by SARS-CoV 3CLpro in the SARS-CoV polyproteins. The first and second sequences are the C- and N-terminal cleavage sequences, respectively, of 3CLpro itself (1). (E) The consensus sequence for cleavage by 3CLpro, analyzed by the sequence logo program (15), using the sequences shown in D and WebLogo version 2.8.2 (weblogo.berkeley.edu/) (16).


the autocleavage (19). These proforms (Q–1N, Q306N, and Q–1N&Q306N) actually exhibited lower activities compared with the mature form for the C- and N-terminal processing of the wild-type substrate (Fig. 2 B and C, respectively) (19). Nevertheless, the patterns of the relative activities for the wt substrate and the P3′ F-to-A/N variant substrate are quite similar between the wt and mutant enzymes (Fig. 2 B and C). Consequently, neither the N- nor the C-terminal prosequence in the enzyme affects the specificities for the N- and C-terminal P3′ positions in the substrates.

The Specificity for the P3′ Position in C-Terminal Processing Depends on the Phe Residue in the P2 Position. Why does the specificity for the P3′ residue differ between the N- and C-terminal autoprocessing reactions? In the crystal structure (Fig. 1C), the active site of the enzyme interacts with the residues from P4 to P3′ (from Val303 to Phe309), and this region must include the structural feature that causes the P3′ specificity. Actually, Phe at the P2 position is involved in the S3′ subsite formation (Fig. 3D). Usually, the terms “pocket” and “subsite” both refer to some region in the enzyme; however, hereinafter we use the term “pocket” to indicate a hollow consisting only of enzyme residues, complementary to (a part of) the substrate, and the term “subsite” regardless of whether it consists only of parts of the enzyme or contains some part(s) of the substrate. Among the 11 3CLpro processing sites in the SARS-CoV polyproteins, only the C-terminal autoprocessing site of 3CLpro has Phe at the P2 position, whereas the rest have aliphatic amino acid residues (eight Leu, one Val, and one Met) (Fig. 1D). When the P2 Phe (position 305) in the C-terminal autoprocessing site of the substrate was converted to Leu, the F-to-A/N conversions at the P3′ position (position 309) reduced the apparent kcat/KM to a much smaller extent compared with the parent P2 Phe substrate (FF to FA/FN vs. LF to LA/LN in Fig. 2D).

The overall mechanism of this dual specificity can be explained based on the 3D structures of the enzyme. Our proform structure is that of the C-terminal substrate-bound enzyme with Phe at the P2 position (Fig. 3A). Xue et al. (17) reported the structure of the N-terminal substrate-bound enzyme with Leu at the P2 position (Fig. 3B). When P2 is Phe, in addition to the S2 pocket, the S3′ subsite is formed to accommodate the P3′ Phe (Fig. 3 A and D). The N-terminal processing sequence of 3CLpro also has Phe at the P3′ position, but because it has Leu at the P2 position, the side chain of the P3′ Phe is not recognized (Fig. 3 B and F). In both structures, Gln at the P1 position is tightly bound in the S1 pocket, as reported previously (16, 17).

Formation of the S2 and S3′ Subsites and Their Interactions with the P2 and P3′ Residues in the C-Terminal Processing. The structural differences in the C- and N-terminal substrate-bound structures are caused mainly by the large variations in (i) the side chain χ1 and χ2 angles of Gln189, (ii) the ψ angle of Asp48 in the “TAEDMLN loop” (positions 45–51), and (iii) the ψ angle of the P2′ residue (Fig. S5 A and B), as described in detail below. When Phe is in the P2 position (Fig. 3A and Fig. S5A), the side chains of Gln189 and Met49 are shifted outward, and thus there is sufficient space to accommodate the P2 Phe (Fig. 3 A and C and Fig. S5 A and C). In contrast, when Leu is in the P2 position (Fig. 3B and Fig. S5B), the side chain of Met49 is extended and lies along the side chain of the P2 Leu, and thus is tightly held by the resulting narrow S2 pocket (Fig. 3 B and E and Fig. S5 B and E).

The detailed mechanisms for the formation of the S2 and S3′ subsites are as follows. The Gln189 side chain movement is local (green in Fig. S5 A and B). In contrast, the Met49 side chain movement is coupled with large movements of the TAEDMLN residues (orange in Fig. S5 A and B), resulting in the formation of a short 3₁₀ helix. The TAEDMLN loop is considered inherently flexible and tends to form a 3₁₀ helix, with such conformational differences observed among previously reported crystal structures (2, 5–12). Even in the present crystal structure of the mature form without any substrate peptide, this TAEDMLN loop region forms a short 3₁₀ helix in one subunit (yellow in Fig. S1B), but is not well ordered in the other subunit (red in Fig. S1B). This conversion is induced mainly by the rotation about a single bond, i.e., the change in the ψ angle of Asp48 (Fig. S5 G and H). This results in formation of the S2 subsite for the P2 Phe (Fig. 3C and Fig. S5C). The S2 subsite for the P2 Phe consists of His41, Met49–Tyr54, His164, Met165, Asp187–Gln189, and the P3′ residue of the substrate (Fig. 3C and Fig. S5C). The binding of the P2 Phe to the S2 subsite results in formation of the S3′ pocket, consisting of Thr25, His41, Cys44–Ala46, Met49, Phe (P2), and Gly (P1′) (Fig. 3D and Fig. S5D). Of note, two residues of the peptide substrate,

Fig. 2. Dual specificity of SARS-CoV 3CLpro. (A) The enzyme and substrate molecules prepared by the E. coli cell-free protein synthesis method. Arrowheads indicate sites that are cleaved by the proteolytic activity of 3CLpro; the circled C indicates the catalytically active Cys145. Figure reprinted from ref. 19. (B) Effect of the P3′ position (residue 309) on the cleavability (apparent kcat/KM) of the C-terminal processing site. Because residue 3 has no effect on C-terminal processing, the F3A and F3N mutations are negative controls. (C) Effect of the P3′ position (residue 3) on the cleavability (apparent kcat/KM) of the N-terminal processing site. Because residue 309 has no effect on the N-terminal processing, the F309A and F309N mutations are negative controls. (D) Effect of the P2 position on P3′ dependency. When P2 = Phe, effective cleavage occurs only when P3′ = Phe; compare FF and FA/FN. However, when P2 = Leu, effective cleavage occurs regardless of the P3′ position; no significant difference in the apparent kcat/KM is seen among LF, LA, and LN. (E) Effect of residue 305 on the proteolytic activity toward the N-terminal processing site. Values are mean ± SD from technical replicates (n = 3).


Phe (P2) and Gly (P1′), are components of the S3′ subsite, where the location of Gly (P1′) is fixed by Gln (P1) binding. In this manner, the P2 Phe and the P3′ Phe are recognized and bound to the enzyme. In both positions, CH–π interactions (21) occur between the CH3 group of Met49 and the phenyl group of Phe (P2) and between the CH3 group of Ala46 and the phenyl group of Phe (P3′) (Fig. 3A and Fig. S5A).

A comparison of Fig. 3A and Fig. 3B reveals that the conformations of the Phe (P2) and Leu (P2) substrates, respectively, are quite different; the C-terminal portion of the Phe (P2) substrate is bent, allowing the P3′ Phe to bind to the S3′ pocket. In fact, the major conformational difference of the Phe (P2) substrate is simply the rotation about the ψ angle of the P2′ residue (Fig. S5 A and I), compared with the Leu (P2) substrate (Fig. S5 B and J).
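Because the differences described in this section come down to rotations about single backbone bonds (the ψ angles of Asp48 and of the P2′ residue), it may help to recall how such an angle is obtained from coordinates. The backbone ψ angle of residue i is the dihedral defined by the atoms N(i), Cα(i), C(i), and N(i+1). The following is a minimal sketch of a generic dihedral calculation; the function name `dihedral` is hypothetical, and the coordinates are toy points chosen for illustration, not atoms taken from the deposited structures.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points given as (x, y, z) tuples."""

    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0],
        )

    b0 = sub(p1, p0)  # first bond vector
    b1 = sub(p2, p1)  # central bond vector (the rotation axis)
    b2 = sub(p3, p2)  # last bond vector

    n1 = cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = cross(b1, b2)  # normal of the plane through p1, p2, p3

    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates only (hypothetical, not from any PDB entry).
near, axis_start, axis_end = (1, 0, 0), (0, 0, 0), (0, 0, 1)
print(dihedral(near, axis_start, axis_end, (1, 0, 1)))   # eclipsed (cis) arrangement
print(dihedral(near, axis_start, axis_end, (0, 1, 1)))   # quarter turn
print(dihedral(near, axis_start, axis_end, (-1, 0, 1)))  # anti (trans) arrangement
```

For a ψ angle, the four points would be the N, Cα, and C atoms of the residue followed by the N atom of the next residue, in that order.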

Dual Specificity. As described above, SARS-CoV 3CLpro recognizes the N- and C-terminal autoprocessing sites with different manners and specificities. On the basis of this dual specificity of SARS-CoV 3CLpro, along with previous reports on the N-terminal autoprocessing site (3, 4), the different roles of Phe at the P3′ position between the N- and C-terminal processing sites are discussed in SI Text. The dual specificity cannot be expressed as a single “consensus sequence,” which is based on the assumption of the “AND” linkage of the recognition of each substrate residue (Pi) by the corresponding subsite (Si) (i = …, 4, 3, 2, 1, 1′, 2′, 3′, 4′, …). In contrast, this cooperativity between subsites S2 and S3′ of SARS-CoV 3CLpro can be expressed as another logical operation, “IMP” (implication), rather than as “AND,” as described in detail in Table 1 and SI Text.
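The implication can be illustrated with the six P2/P3′ combinations tested in Fig. 2D. The following is a minimal sketch under one reading of the text, namely "if P2 is Phe, then P3′ must be Phe"; Table 1 and SI Text are not reproduced here, so the function names and the Boolean simplification are assumptions made for illustration. In particular, the measured effect is a reduction in apparent kcat/KM, not an all-or-nothing outcome.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Logical implication: false only when a is true and b is false."""
    return (not a) or b


def efficiently_cleaved(p2: str, p3_prime: str) -> bool:
    """Hypothetical Boolean summary of the P2/P3' cooperativity (one-letter codes).

    Only P2 = F (Phe) or L (Leu) is considered, matching the variants in Fig. 2D.
    """
    return implies(p2 == "F", p3_prime == "F")


# Enumerate the substrate variants FF, FA, FN, LF, LA, LN.
for p2, p3_prime in product("FL", "FAN"):
    print(p2 + p3_prime, efficiently_cleaved(p2, p3_prime))
```

Under this reading, only FA and FN come out as poorly cleaved, which matches the pattern described for Fig. 2D. A pure "AND" rule over fixed residues at each position could not accept both FF and LA while rejecting FA.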

Strategy of the Virus. What is the advantage of this “alternative” specificity of SARS-CoV 3CLpro? The P1 and P2 positions are considered the major recognition sites of 3CLpro (Fig. 4A, Left) (1, 4, 18). In this study, we found a substrate recognition mode unique to the C-terminal autoprocessing of 3CLpro (Fig. 4A, Right), in which the P3′ Phe is also recognized, owing to the P2 Phe (Fig. 4A). In the canonical specificity, this protease recognizes residues on the N-terminal side of the cleavage site (P1–P4), so that the C-terminal portion of the N-terminal-side product retains the main recognition sites. Therefore, the N-terminal side products may compete with the substrates, and better substrates would become stronger inhibitors after cleavage. There are 11 3CLpro cleavage sites in the SARS-CoV polyproteins (1). The 3CLpro cleavage products could become inhibitors of 3CLpro, but except for the mature 3CLpro, they can escape from the enzyme, and thus their inhibitory activities are inconsequential (Fig. 4B). If the C-terminal portion of the mature 3CLpro had Leu at the P2

Fig. 3. Structural basis for the dual specificity.(A) Binding of the substrate peptide portion con-taining the C-terminal processing site. The substrateis depicted by a stick model, and the enzyme isshown as a surface presentation. The carbon atomsare colored as indicated, and the nitrogen and oxy-gen atoms are in blue and red, respectively. The cyanstick model represents the C-terminal portion of3CLpro, and the magenta stick model depicts theC-terminal prosequence. The surface presentation ofthe flexible heptapeptide TAEDMLN is in orange.The surface presentation of Gln189 is in green,whereas that of the rest of the enzyme (C145A) is ingray. Dotted lines represent CH–π interactions (18)between Met49 and P2 Phe and between Ala46 andP3′ Phe. Although the active Cys145 was mutated toAla, it is shown in yellow in this figure. (B) Binding ofthe substrate containing the N-terminal processingsite to the H41A mutant enzyme (17) (PDB ID code2Q6G). (C and D) The positions of P2 Phe (C) and P3′Phe (D) in the C-terminal processing site on the en-zyme. (E and F) The positions of P2 Leu (E) and P3′Phe (F) in the N-terminal processing site on the en-zyme. The surfaces of the P2 and P3′ residues aredepicted by space-filling models. (G and H) Sche-matic drawings of the interactions between the S2pocket/ P2 position and the S3′ pocket/ P3′ position.(G) When P2 = Phe, the side chains of Gln189 andMet49 move outward, thereby providing sufficientspace to accommodate P2 Phe. (H) When P2 = Leu,the side chains of Met49 and Gln189 move to narrowthe S2 pocket, and thus the S3′ pocket is not formed.

Fig. 4. Strategy for reducing the inhibitory effect of the C-terminal portion of the mature protease. (A) Two types of substrate recognition by SARS-CoV 3CLpro. (B) Strategy of the virus for reducing the inhibitory effect of the C-terminal portion of its mature protease.

13000 | www.pnas.org/cgi/doi/10.1073/pnas.1601327113 Muramatsu et al.


position, then the strong self-inhibitory activity would be a serious problem. Consequently, it seems reasonable that the alternative recognition pattern (P2 = Phe/P3′ = Phe) is used for the C-terminal processing of 3CLpro (Fig. 4B), to minimize the inhibitory activity. After the cleavage, the P3′ Phe residue is separated from the P2 Phe residue, and thus the C-terminal portion of the enzyme no longer has sufficient binding affinity for the active site of the enzyme. This is the only site that uses this recognition pattern (P2 = Phe/P3′ = Phe) in the SARS-CoV polyproteins. This unique property is advantageous because it provides sufficient autoprocessing activities from the polyproteins while minimizing the inhibitory activities of the autoprocessed products.

Competitive Inhibition of Proteolytic Activity by the Autoprocessed C-Terminal Region. Indeed, the mature enzyme with the Phe305-to-Leu alteration (the P2 position of the C-terminal processing site) exhibited reduced cleavage activities for both the C terminus (Phe305/Phe309) and N terminus (Leu-2/Phe3) [F305L vs. wt in Fig. 2D (FF) and Fig. 2E, respectively]. This reduced activity likely occurs because the F305L mutant has a stronger inhibitory sequence (P2 = Leu) than the wild-type enzyme (P2 = Phe). In contrast to the mature form, the proforms with the C-terminal prosequence show no difference between P2 = Leu and P2 = Phe in C- and N-terminal proteolytic activities [F305L&Q306N vs. Q306N in Fig. 2D (FF) and Fig. 2E, respectively]. Thus, for the C-terminally unprocessed enzymes, there is no difference between the two systems, P2 = Leu and P2 = Phe/P3′ = Phe, with regard to both the cleavability (apparent kcat/KM) as a substrate (LF vs. FF in Fig. 2D) and the inhibitory effect against the proenzyme. After cleavage at this site, the mature P2 = Phe enzyme (wt) shows higher proteolytic activity than the mature P2 = Leu enzyme (F305L) (Fig. 2E), indicating that the C-terminal portion of the P2 = Phe (wt) enzyme has a lower inhibitory effect compared with the C-terminal portion of the P2 = Leu (F305L) variant (Fig. 4B).

We directly measured the inhibitory effect of the tetrapeptide derived from the C-terminal portion of the enzyme in a peptidase assay using an 11-aa residue fluorogenic peptide (Table 2). The tetrapeptide Val-Thr-Leu-Gln (P2 = Leu) had an 11-fold stronger inhibitory effect (Ki = 11.5 ± 6.0 μM) than the tetrapeptide Val-Thr-Phe-Gln (wt; P2 = Phe) (Ki = 126 ± 42 μM). This result provides further evidence of the reduced inhibitory effect of the C-terminal portion of the mature 3CLpro by the presence of Phe instead of Leu at position P2. This difference might be crucially important in the maturation process of the polyprotein in the cell. Two polyprotein molecules produce one 3CL protease dimer, which cleaves 11 positions for one polyprotein (1). In this stoichiometry, it matters when the two inhibitory moieties at the two C termini of the protease dimer bind more strongly (lower Ki) than the substrates (KM), and this is the case if this moiety has the P2 = Leu sequence (Ki = 11.5 ± 6.0 μM vs. KM = 39 ± 5.4 μM; Table 2). In contrast, the C-terminal moiety of the wild-type enzyme has lower affinity for the binding site than the substrate does (Ki = 126 ± 42 μM > KM = 39 ± 5.4 μM), which ensures the reduced inhibitory effect of this region.
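The comparison of Ki with KM above can be made concrete with the textbook model of simple competitive inhibition, in which the apparent kcat/KM is reduced by the factor 1/(1 + [I]/Ki). The following is a minimal sketch under that assumption. The Ki values are those reported in Table 2, whereas the inhibitor concentration `ASSUMED_INHIBITOR_UM` and the function name are hypothetical and chosen only for illustration. The Ki values were measured for free tetrapeptides, so the tethered C terminus of the mature enzyme may behave differently.

```python
def remaining_fraction(inhibitor_um: float, ki_um: float) -> float:
    """Fraction of the apparent kcat/KM that remains under simple
    competitive inhibition: 1 / (1 + [I]/Ki)."""
    return 1.0 / (1.0 + inhibitor_um / ki_um)


# Ki values from Table 2 (uM).
KI_UM = {"VTFQ (P2 = Phe, wt)": 126.0, "VTLQ (P2 = Leu)": 11.5}

# Hypothetical local inhibitor concentration (uM); not taken from the paper.
ASSUMED_INHIBITOR_UM = 50.0

for peptide, ki in KI_UM.items():
    fraction = remaining_fraction(ASSUMED_INHIBITOR_UM, ki)
    print(f"{peptide}: {fraction:.2f} of the uninhibited kcat/KM remains")
```

At any given inhibitor concentration, the P2 = Leu sequence leaves a smaller fraction of the activity than the P2 = Phe sequence, which is the qualitative point made in the text.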

As shown in Fig. 2D, the difference in Ki value causes an approximate twofold increase in the apparent kcat/KM (wt vs. F305L) even at low enzyme concentrations (i.e., low [I]) in this experiment (Figs. S3 and S4). The difference in the inhibitory effect between Phe and Leu at the P2 position of the C terminus on the apparent kcat/KM (cleavability) must be greater in the cells. When the polyproteins are expressed in mammalian cells, the 3CLpro moiety (or NSP5) exists between two membrane proteins, NSP4 and NSP6. The 3CLpro moiety and the C and N termini of NSP4 and NSP6, respectively, are on the cytoplasmic side of the endoplasmic reticulum (22, 23). Moreover, 3CLpro forms a dimer. Therefore, the local concentration of the C-terminal inhibitory sequence of 3CLpro on the endoplasmic reticulum likely would be high.

This mechanism appears to exist only in the SARS-CoV 3CL protease. However, there may be different types of mechanisms for reducing the inhibitory effects of the C-termini of the mature autoprocessing proteases in other viruses.

Methods

Cell-Free Syntheses of SARS-CoV 3CLpro Species and Their Substrates, and Assays of the Proteolytic Activities (Cleavabilities). The mature forms and proforms of SARS-CoV 3CLpro (the enzyme in Fig. 2A) and their substrates (the substrate in Fig. 2A) were synthesized by the E. coli cell-free protein synthesis method (19, 20). The proteolytic activities (trans-processing assays) were measured as reported previously (19) and as described briefly in SI Text and Fig. S3.

Purification of the Mature 3CLpro. The wild-type 3CLpro was synthesized and automatically processed in the E. coli cell-free protein synthesis system, and was purified by successive chromatography steps on columns of Econo-Pack High Q (two tandemly connected 5-mL cartridges; Bio-Rad), Mono P 5/50 (GE Healthcare), and HiPrep 16/60 Sephacryl S-300 HR (GE Healthcare). From the 9-mL cell-free synthesis solution, approximately 5.8 mg of the purified 3CLpro was obtained.

Preparation of the Proform of 3CLpro with the C-Terminal Prosequence. Using a QuikChange Mutagenesis Kit (Stratagene), the plasmid encoding wild-type 3CLpro was modified at three points: the active site Cys145 was changed to Ala, the S-tag portion and the 10-aa extension at the N terminus were removed so that the first translated methionine was followed directly by the first Ser residue of 3CLpro, and then the His-tag was removed. The resulting plasmid was introduced into E. coli strain BL21(DE3), and protein expression was induced by isopropyl β-D-1-thiogalactopyranoside. After ultrasonic disruption of the cells, the proform of 3CLpro with the C-terminal extension was purified in the same manner as the mature form of 3CLpro. The first translated methionine residue was completely removed in the E. coli cells, as determined by N-terminal amino acid sequencing and MALDI-TOF mass spectrometry, as described previously (19). From 200 mL of the E. coli culture, approximately 15.9 mg of the purified protein was obtained.

Crystallization and Data Collection. Well-diffracting crystals of the mature form of SARS-CoV 3CLpro were obtained in drops composed of 2 μL of the protein (11 mg/mL in 10 mM Tris·HCl pH 7.5 buffer containing 0.1 mM EDTA and 1 mM DTT) and 2 μL of reservoir solution, by the hanging drop vapor diffusion technique at 20 °C. The reservoir solution contained 0.1 M MES buffer (pH 5.8), 4% (wt/vol) PEG 6000, 3% (vol/vol) DMSO, and 1 mM DTT. Large, single crystals (0.3 mm in the longest dimension) appeared within 2 wk. The crystals were flash-cooled in a final cryoprotectant solution composed of the mother liquor and 25% (vol/vol) glycerol. Diffraction data were collected on an ADSC Quantum210 detector at beamline AR-NW12 in the

Table 1. The logical operation between the P2 and P3′ positions

A (P2 = Phe?)    B (P3′ = Phe?)    A IMP B (Cleavage?)
True             True              True
True             False             False
False            True              True
False            False             True

A: Amino acid residue in P2 position. True: Phe; false: Leu (and presumably Met, Cys).

B: Amino acid residue in P3′ position. True: Phe; false: Asn, Ala (presumably any amino acid).

A IMP B: efficient cleavage. True: yes; false: no.
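The logical operation in Table 1 is material implication (A IMP B), which is false only when A is true and B is false. A minimal sketch that regenerates the truth table is given below. The function and argument names are illustrative and do not come from the paper.

```python
from itertools import product


def efficient_cleavage(p2_is_phe: bool, p3prime_is_phe: bool) -> bool:
    """Material implication A IMP B, equivalent to (not A) or B.

    A: P2 is Phe. B: P3' is Phe. The result is False only when
    P2 is Phe and P3' is not Phe.
    """
    return (not p2_is_phe) or p3prime_is_phe


# Print the rows in the same order as Table 1.
for a, b in product([True, False], repeat=2):
    print(a, b, efficient_cleavage(a, b))
```

Read this way, a Phe at P2 makes efficient cleavage conditional on a Phe at P3′, whereas a Leu at P2 is cleaved efficiently whichever residue occupies P3′.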

Table 2. Kinetic parameters obtained using peptide substrates and peptide inhibitors

Parameter                              Value
KM (NMA-TSAVLQSGFRK(DNP)-NH2)          39 ± 5.4 μM
kcat (NMA-TSAVLQSGFRK(DNP)-NH2)        0.37 ± 0.040 s−1
Ki (VTFQ)                              126 ± 42 μM
Ki (VTLQ)                              11.5 ± 6.0 μM
Ki (VTAQ)                              18.4 ± 9.3 μM

Muramatsu et al. PNAS | November 15, 2016 | vol. 113 | no. 46 | 13001


Photon Factory (Tsukuba, Japan). The data were indexed, integrated, and scaled using HKL2000 (24). The crystal belongs to the space group P21, with unit cell dimensions a = 52.3 Å, b = 96.3 Å, and c = 67.8 Å, and diffraction data were obtained up to 1.7-Å resolution.

The crystal of the proform of 3CLpro with the mutated active site C145A and the C-terminal prosequence was grown in drops composed of 2 μL of protein solution (10 mg/mL in 10 mM Tris·HCl pH 7.5 buffer, containing 0.1 mM EDTA and 1 mM DTT) and 2 μL of reservoir solution [0.1 M sodium chloride, 0.1 M HEPES-Na buffer pH 7.3, and 12% (wt/vol) PEG 4000] by the same method. Large, single crystals measuring 0.5 mm in the longest dimension appeared within 1 wk. Diffraction data were collected on an ADSC Quantum210 detector at beamline BL-5A in the Photon Factory. The data were indexed, integrated, and scaled using HKL2000 (24). This crystal belongs to the space group P212121, with unit cell dimensions a = 63.7 Å, b = 90.2 Å, and c = 110.3 Å, and diffraction data were obtained up to 2.2-Å resolution.

Structure Determination and Refinement. General processing of the scaled data was performed with the programs in the CCP4 suite (25). The phases were determined by the molecular replacement method with the Molrep program in the suite. As search models, the reported structure (PDB ID code 1UJ1) (2) was used for the mature form (PDB ID code 2DUC), and the mature-form structure was then used for the proform (PDB ID code 5B6O). The models were rebuilt manually using O (26), and were refined with CNS (27) and PHENIX (14). The structures were refined to an R-factor of 19.4% (Rfree = 22.1%) at 1.7-Å resolution for SARS-3CLpro-wild type (PDB ID code 2DUC) and to an R-factor of 21.8% (Rfree = 25.9%) at 2.2-Å resolution for 3CLpro-C145A-10aa (PDB ID code 5B6O). Structural alignments were accomplished with DEJAVU (28), LSQMAN (29), and LSQKAB (30). The protein secondary structure was defined by the DSSP algorithm (31). The quality of the models was inspected with the PROCHECK program (32). Graphic figures were created with PyMOL (33).

Kinetic Parameters Using Peptides. The fluorogenic peptide NMA-TSAVLQSGFRK(DNP)-NH2, which has the amino acid sequence of the N-terminal processing site of SARS-CoV 3CLpro, was used as a substrate. Various concentrations of this substrate (6.25, 12.5, 25, or 50 μM) were cleaved with 25 nM 3CLpro at 30 °C in 0.2 mL of a solution consisting of 20 mM Tris·HCl pH 7.5, 200 mM NaCl, 1 mM EDTA, 1 mM DTT, and 1% DMSO. Cleavage of the peptide bond between the Gln (Q) and Ser (S) residues was monitored with a Tecan Infinite F200 fluorescence microplate reader, with excitation at 380 nm and emission at 465 nm. From the double-reciprocal plot of the substrate concentration vs. the initial velocity of the cleavage reaction, the KM and kcat values were calculated.
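The double-reciprocal analysis described above can be sketched as follows, assuming Michaelis–Menten kinetics so that 1/v = (KM/Vmax)(1/[S]) + 1/Vmax and kcat = Vmax/[E]. The substrate and enzyme concentrations are those given in this section. The velocities are synthetic, noise-free values generated from the Table 2 parameters purely to check that the fit recovers them, and are not measured data. The function name is hypothetical.

```python
def lineweaver_burk(substrate_um, velocity_um_per_s, enzyme_um):
    """Fit a least-squares line to (1/[S], 1/v).

    Returns (KM in uM, kcat in s^-1), using slope = KM/Vmax and
    intercept = 1/Vmax.
    """
    xs = [1.0 / s for s in substrate_um]
    ys = [1.0 / v for v in velocity_um_per_s]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    return slope * vmax, vmax / enzyme_um


SUBSTRATE_UM = [6.25, 12.5, 25.0, 50.0]  # concentrations used in the assay
ENZYME_UM = 0.025                        # 25 nM 3CLpro

# Synthetic velocities from the Table 2 values (KM = 39 uM, kcat = 0.37 s^-1).
velocities = [0.37 * ENZYME_UM * s / (39.0 + s) for s in SUBSTRATE_UM]

km_fit, kcat_fit = lineweaver_burk(SUBSTRATE_UM, velocities, ENZYME_UM)
print(f"KM = {km_fit:.1f} uM, kcat = {kcat_fit:.2f} s^-1")
```

With real data, the double-reciprocal transform weights the low-concentration points heavily, so a direct nonlinear fit to the Michaelis–Menten equation is a common alternative.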

The inhibition assay was performed with the same system, using the tetrapeptides VTFQ (ValThrPheGln), VTLQ (ValThrLeuGln), and VTAQ (ValThrAlaGln) at 0, 2.0, 2.2, 2.4, 2.6, and 2.8 mM concentrations, in 0.2 mL of a solution consisting of 1.5 μM (i.e., 1/26 of the KM value) NMA-TSAVLQSGFRK(DNP)-NH2, 20 mM Tris·HCl pH 7.5, 200 mM NaCl, 1 mM EDTA, 1 mM DTT, and 8% (vol/vol) DMSO. The ratio of the rates with/without an inhibitor tetrapeptide, v0/vi, was measured. From the slope of the plot of (v0/vi) − 1 against [I]0, the 1/Ki(app) value was obtained.
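The slope analysis described above can be sketched as follows. For competitive inhibition, (v0/vi) − 1 = [I]0/Ki(app), with Ki(app) = Ki(1 + [S]/KM), so at [S] = KM/26 the apparent and true values differ by about 4%. This is a minimal sketch with several assumptions: the line is fitted through the origin, the competitive correction is applied (the text does not state whether it was), and the rate ratios are synthetic values generated from a hypothetical Ki(app) of 100 μM rather than measured data. The function names are illustrative.

```python
def ki_app_from_ratios(inhibitor_um, v0_over_vi):
    """Least-squares slope through the origin of (v0/vi - 1) vs [I]0.

    The slope equals 1/Ki(app); returns Ki(app) in uM.
    """
    ys = [ratio - 1.0 for ratio in v0_over_vi]
    slope = sum(x * y for x, y in zip(inhibitor_um, ys)) / sum(
        x * x for x in inhibitor_um
    )
    return 1.0 / slope


def ki_from_ki_app(ki_app_um, substrate_um, km_um):
    """Competitive-inhibition correction: Ki = Ki(app) / (1 + [S]/KM)."""
    return ki_app_um / (1.0 + substrate_um / km_um)


# Inhibitor concentrations listed in the text, converted from mM to uM.
INHIBITOR_UM = [0.0, 2000.0, 2200.0, 2400.0, 2600.0, 2800.0]

# Synthetic rate ratios for a hypothetical Ki(app) of 100 uM.
ratios = [1.0 + i / 100.0 for i in INHIBITOR_UM]

ki_app = ki_app_from_ratios(INHIBITOR_UM, ratios)
ki = ki_from_ki_app(ki_app, substrate_um=1.5, km_um=39.0)
print(f"Ki(app) = {ki_app:.1f} uM, Ki = {ki:.1f} uM")
```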

ACKNOWLEDGMENTS. We thank Kunihiro Ohta (University of Tokyo) for helpful discussions and Hideaki Tanaka (RIKEN Systems and Structural Biology Center) for technical support. We also thank the AR-NW12 beamline staff at the Photon Factory (Tsukuba, Japan) for assistance with data collection. This work was supported by grants from the RIKEN Structural Genomics/Proteomics Initiative; the National Project on Protein Structural and Functional Analyses from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan; the Targeted Proteins Research Program from MEXT of Japan; and the Platform Project for Supporting Drug Discovery and Life Science Research (Platform for Drug Discovery, Informatics, and Structural Life Science) from MEXT and the Japan Agency for Medical Research and Development (to S.Y.) and by a Grant-in-Aid for Scientific Research (20570115) from the Japan Society for the Promotion of Science (to T.M.).

1. Thiel V, et al. (2003) Mechanisms and enzymes involved in SARS coronavirus genome expression. J Gen Virol 84(Pt 9):2305–2315.

2. Yang H, et al. (2003) The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor. Proc Natl Acad Sci USA 100(23):13190–13195.

3. Goetz DH, et al. (2007) Substrate specificity profiling and identification of a new class of inhibitor for the major protease of the SARS coronavirus. Biochemistry 46(30):8744–8752.

4. Chuck C-P, et al. (2010) Profiling of substrate specificity of SARS-CoV 3CL. PLoS One 5(10):e13197.

5. Xia B, Kang X (2011) Activation and maturation of SARS-CoV main protease. Protein Cell 2(4):282–290.

6. Xue X, et al. (2007) Production of authentic SARS-CoV M(pro) with enhanced activity: Application as a novel tag-cleavage endopeptidase for protein overproduction. J Mol Biol 366(3):965–975.

7. Chen S, et al. (2008a) Mutation of Gly-11 on the dimer interface results in the complete crystallographic dimer dissociation of severe acute respiratory syndrome coronavirus 3C-like protease: Crystal structure with molecular dynamics simulations. J Biol Chem 283(1):554–564.

8. Chen S, et al. (2008b) Residues on the dimer interface of SARS coronavirus 3C-like protease: Dimer stability characterization and enzyme catalytic activity analysis. J Biochem 143(4):525–536.

9. Shi J, Sivaraman J, Song J (2008) Mechanism for controlling the dimer-monomer switch and coupling dimerization to catalysis of the severe acute respiratory syndrome coronavirus 3C-like protease. J Virol 82(9):4620–4629.

10. Hu T, et al. (2009) Two adjacent mutations on the dimer interface of SARS coronavirus 3C-like protease cause different conformational changes in crystal structure. Virology 388(2):324–334.

11. Zhang S, et al. (2010) Three-dimensional domain swapping as a mechanism to lock the active conformation in a super-active octamer of SARS-CoV main protease. Protein Cell 1(4):371–383.

12. Li C, et al. (2010) Maturation mechanism of severe acute respiratory syndrome (SARS) coronavirus 3C-like proteinase. J Biol Chem 285(36):28134–28140.

13. Hsu M-F, et al. (2005) Mechanism of the maturation process of SARS-CoV 3CL protease. J Biol Chem 280(35):31257–31266.

14. Adams PD, et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 66(Pt 2):213–221.

15. Schneider TD, Stephens RM (1990) Sequence logos: A new way to display consensus sequences. Nucleic Acids Res 18(20):6097–6100.

16. Crooks GE, Hon G, Chandonia J-M, Brenner SE (2004) WebLogo: A sequence logo generator. Genome Res 14(6):1188–1190.

17. Xue X, et al. (2008) Structures of two coronavirus main proteases: Implications for substrate binding and antiviral drug design. J Virol 82(5):2515–2527.

18. Chu L-HM, Choy W-Y, Tsai S-N, Rao Z, Ngai S-M (2006) Rapid peptide-based screening on the substrate specificity of severe acute respiratory syndrome (SARS) coronavirus 3C-like protease by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Protein Sci 15(4):699–709.

19. Muramatsu T, et al. (2013) Autoprocessing mechanism of severe acute respiratory syndrome coronavirus 3C-like protease (SARS-CoV 3CLpro) from its polyproteins. FEBS J 280(9):2002–2013.

20. Kigawa T, et al. (2004) Preparation of Escherichia coli cell extract for highly productive cell-free protein expression. J Struct Funct Genomics 5(1-2):63–68.

21. Brandl M, Weiss MS, Jabs A, Sühnel J, Hilgenfeld R (2001) C-H...π-interactions in proteins. J Mol Biol 307(1):357–377.

22. Oostra M, et al. (2007) Localization and membrane topology of coronavirus nonstructural protein 4: Involvement of the early secretory pathway in replication. J Virol 81(22):12323–12336.

23. Oostra M, et al. (2008) Topology and membrane anchoring of the coronavirus replication complex: Not all hydrophobic domains of nsp3 and nsp6 are membrane spanning. J Virol 82(24):12392–12405.

24. Otwinowski Z, Minor W (1997) Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 276:307–326.

25. Winn MD, et al. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr 67(Pt 4):235–242.

26. Jones TA, Zou J-Y, Cowan SW, Kjeldgaard M (1991) Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr A 47(Pt 2):110–119.

27. Brünger AT, et al. (1998) Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr 54(Pt 5):905–921.

28. Kleywegt GJ, Jones TA (1997) Detecting folding motifs and similarities in protein structures. Methods Enzymol 277:525–545.

29. Kleywegt GJ (1996) Use of non-crystallographic symmetry in protein structure refinement. Acta Crystallogr D Biol Crystallogr 52(Pt 4):842–857.

30. Kabsch W (1976) Solution for the best rotation to relate two sets of vectors. Acta Crystallogr A 32:922–923.

31. Kabsch W, Sander C (1983) Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22(12):2577–2637.

32. Laskowski RA, MacArthur MW, Moss DS, Thornton JM (1993) PROCHECK: A program to check the stereochemical quality of protein structures. J Appl Cryst 26:283–291.

33. DeLano WL (2002) The PyMOL Molecular Graphics System (DeLano Scientific, San Carlos, CA).

34. Chen S, Jonas F, Shen C, Hilgenfeld R (2010) Liberation of SARS-CoV main protease from the viral polyprotein: N-terminal autocleavage does not depend on the mature dimerization mode. Protein Cell 1(1):59–74.

35. Ng NM, Pike RN, Boyd SE (2009) Subsite cooperativity in protease specificity. Biol Chem 390(5-6):401–407.

36. Kontijevskis A, Petrovska R, Yahorava S, Komorowski J, Wikberg JES (2009) Proteochemometrics mapping of the interaction space for retroviral proteases and their substrates. Bioorg Med Chem 17(14):5229–5237.

37. Wallace AC, Laskowski RA, Thornton JM (1995) LIGPLOT: A program to generate schematic diagrams of protein-ligand interactions. Protein Eng 8(2):127–134.
