Supported by the Howard Hughes Medical Institute (D.B.) and the
Natural Sciences and Engineering Research Council of Canada
(C.H.A.). G.J.R. is a Merck Fellow of the Life Sciences Research
Foundation. C.H.A. holds a Canada Research Chair in Structural
Genomics. We thank S. Rettie for mass spectrometry support;
C. Lee for deep sequencing support; S. Ovchinnikov for assistance
quantifying sequence conservation; V. Nguyen, A. Yehdego,
T. Howard, and K. Lau for assistance with protein purification; and
H. Gelman and many other members of the Baker lab for helpful
discussions. This work was facilitated by the Hyak supercomputer at
the University of Washington and by donations of computing time
from Rosetta@Home participants. The Structural Genomics
Consortium is a registered charity (number 1097737) that receives
funds from AbbVie; Bayer Pharma AG; Boehringer Ingelheim; Canada
Foundation for Innovation; Eshelman Institute for Innovation; Genome
Canada through Ontario Genomics Institute grant OGI-055; Innovative
Medicines Initiative (EU/EFPIA) through ULTRA-DD grant 115766;
Janssen Pharmaceuticals; Merck & Co.; Novartis Pharma AG; Ontario
Ministry of Research, Innovation and Science (MRIS); Pfizer; São
Paulo Research Foundation–FAPESP; Takeda; and the Wellcome
Trust. The RosettaScripts code and blueprint files used for protein
design are provided in the supplementary materials. The data for this
work (designed sequences and structures, deep sequencing counts,
EC50 values, stability scores, and structural analysis of the designed
models) are also provided in supplementary materials. The Python
code for inferring EC50 values and for fitting the unfolded state model
is provided at https://github.com/asford/protease_experimental_analysis.
G.J.R. and D.B. are inventors on provisional patent
application no. 62/491,518 filed 28 April 2017 by the University of
Washington that covers (i) the method described in this work for
computationally designing and experimentally verifying stable
miniproteins, and (ii) the 4000 most stable protein sequences
designed in the work. Author contributions: G.J.R. designed the
research, the experimental approach, and the proteins; G.J.R.,
T.M.C., I.G., S.H., L.C., R.R., and A.C. performed experiments; all
authors analyzed data; G.J.R., A.F., and V.K.M. contributed new
computational tools; C.H.A. and D.B. supervised research; and
G.J.R. and D.B. wrote the manuscript.
Materials and Methods
Figs. S1 to S12
Tables S1 to S3
28 February 2017; accepted 9 June 2017
Snap deconvolution: An informatics
approach to high-throughput
discovery of catalytic reactions
Konstantin Troshin1,2 and John F. Hartwig1,2*
We present an approach to multidimensional high-throughput discovery of catalytic
coupling reactions that integrates molecular design with automated analysis and
interpretation of mass spectral data. We simultaneously assessed the reactivity of three
pools of compounds that shared the same functional groups (halides, boronic acids,
alkenes, and alkynes, among other groups) but carried inactive substituents having
specifically designed differences in masses. The substituents were chosen such that the
products from any class of reaction in multiple reaction sets would have unique differences
in masses, thus allowing simultaneous identification of the products of all transformations in
a set of reactants. In this way, we easily distinguished the products of new reactions from
noise and known couplings. Using this method, we discovered an alkyne hydroallylation
and a nickel-catalyzed variant of alkyne diarylation.
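The mass-difference idea summarized above can be illustrated with a minimal sketch. It is not the authors' software. The class names, the substituent mass shifts, and the peak lists are hypothetical values chosen for illustration, and the matching rule (a fixed tolerance on two pairwise differences) is an assumption. Each reactant class carries a different inactive substituent in each of three pools, so a product built from a given combination of classes should appear in all three spectra with a characteristic pair of mass differences.

```python
from itertools import combinations

# Hypothetical mass shifts (Da) of the inactive substituent carried by each
# reactant class in pools 1, 2, and 3. Illustrative values, not from the paper.
SHIFTS = {
    "aryl_halide": (0.0, 14.0, 30.0),
    "boronic_acid": (0.0, 28.0, 58.0),
    "alkyne": (0.0, 42.0, 72.0),
}


def signatures(shifts, sizes=(2, 3)):
    """Map each combination of reactant classes to the expected mass
    differences (pool 2 - pool 1, pool 3 - pool 1) of its product."""
    sigs = {}
    for n in sizes:
        for combo in combinations(sorted(shifts), n):
            d2 = sum(shifts[c][1] - shifts[c][0] for c in combo)
            d3 = sum(shifts[c][2] - shifts[c][0] for c in combo)
            sigs[combo] = (d2, d3)
    return sigs


def find_products(peaks1, peaks2, peaks3, sigs, tol=0.01):
    """Return (combo, m1) for every pool-1 peak m1 that has partner peaks in
    pools 2 and 3 at the mass differences expected for that combination."""
    hits = []
    for m1 in peaks1:
        for combo, (d2, d3) in sigs.items():
            in2 = any(abs(m - (m1 + d2)) <= tol for m in peaks2)
            in3 = any(abs(m - (m1 + d3)) <= tol for m in peaks3)
            if in2 and in3:
                hits.append((combo, m1))
    return hits


# Hypothetical peak lists (m/z) from the three pooled reactions.
sigs = signatures(SHIFTS)
hits = find_products(
    [250.10, 310.15, 400.00],
    [292.10, 333.30],
    [338.10, 500.00],
    sigs,
)
for combo, mass in hits:
    print(" + ".join(combo), "product candidate at m/z", mass)
```

The design only works if every combination of classes yields a distinct pair of differences, which is why the shifts must be chosen deliberately. Peaks that lack partners at the expected offsets in the other two pools are treated as noise in this sketch.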
High-throughput experimentation (HTE) is one of the essential tools in drug discovery (1), but the potential of these methods to influence chemical reaction discovery has been limited (2). The most common
application of HTE in reaction development is
for rapid assessment of the effect of reaction parameters on yield or selectivity (3–10). In this context, HTE methods have been applied to search
for conditions to improve known reactions, including one recent example of making them compatible with reactions on nanomoles of material
for improving synthetic routes to druglike molecules (11). HTE also has been used recently to identify conditions for the late-stage functionalization
of complex molecules, including a photoredox-based functionalization of biologically active heterocycles (12) and peptide-catalyzed site-selective
modification of natural products (13).
The application of HTE to discover unknown
classes of reactions has been more limited and
often relies on customized use of analytical techniques such as colorimetry (14, 15), microscopy
(16), fluorescence (17, 18), label-assisted matrix-assisted laser desorption/ionization–time of flight
spectroscopy (MALDI-TOF) (19), self-assembled
monolayer/MALDI mass spectrometry (SAMDI)
(20), microfluidic reactors (21), immunoassays
(22), and DNA-templated methods (23). Despite
the value of these analytical techniques, their application requires one or more of the reactants to
contain certain functional groups or markers, such
as conjugated esters, halides (14, 15), bifunctional
linkers (16), fluorescence dyes, complex mass tags,
or DNA strands (17–23). These requirements limit
the scope of reactants that can be used in this type
of experimentation, thereby limiting the breadth
of reactions that can be discovered. Strategies
that do not require the introduction of functional
tags into the reactants impose the least restriction
on the scope of reactants and reaction conditions
(2). The most straightforward of such strategies
involves identification of products by mass spectrometry (MS).
SCIENCE sciencemag.org 14 JULY 2017 • VOL 357 ISSUE 6347 175
1Department of Chemistry, University of California, Berkeley,
CA 94720, USA. 2Chemical Sciences Division, Lawrence
Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA
*Corresponding author. Email: firstname.lastname@example.org