controlfield tag 001 002069314
subfield code a E14-SFE0003195
Computer simulations of apomyoglobin folding [electronic resource] / by Mariangela Dametto.
[Tampa, Fla] : University of South Florida
Title from PDF of title page.
Document formatted into pages; contains 95 pages.
Dissertation (Ph.D.)--University of South Florida, 2009.
Includes bibliographical references.
Text (Electronic dissertation) in PDF format.
ABSTRACT: The differences between the refolding mechanisms of sperm whale apomyoglobin subsequent to three different unfolding conditions have been examined by atomistic-level computer simulations. The three unfolding conditions used in this work are high temperature, low temperature and low pH. The folding of this protein has been extensively studied experimentally, providing a large database of folding parameters which can be probed using simulations. The crystal structure of sperm whale myoglobin was taken from the Protein Data Bank; the heme unit was then removed and an energy minimization was performed in order to generate the native apomyoglobin form. Thus, the native conformation of apomyoglobin is the same in all three refolding simulations done in the present work. The difference lies in the way the initial unfolded conformations were obtained. The refolding trajectories were obtained at room temperature using the Stochastic Difference Equation in Length algorithm. The results reveal differences between the three refolding routes. In contrast to previous molecular simulations that modeled low pH denaturation, an extended intermediate with large helical content was not observed in the refolding simulations from the high-temperature unfolded state. Instead, a structural collapse occurs without formation of helices or native contacts. Once the protein structure is more compact (radius of gyration less than 18 angstroms), secondary and tertiary structures appear. The low pH simulations show some agreement with the low pH experimental data and previous molecular dynamics simulations, such as the formation of a conformation having a radius of gyration around 20 angstroms and large helical content. The refolding simulations after the low temperature unfolding show differences in the properties of the apomyoglobin folding route compared to the other two conditions.
The collapse of the protein during folding occurs later in the simulation when compared with high-temperature denaturing state, but earlier when compared to low pH simulations. These differences strongly suggest that a protein can follow different folding routes, depending on the nature and the structure of the unfolded state.
Mode of access: World Wide Web.
System requirements: World Wide Web browser and PDF reader.
Co-advisor: Randy Larsen, Ph.D.
Co-advisor: Alfredo Cardenas, Ph.D.
Long time dynamics
USF Electronic Theses and Dissertations.
Computer Simulations of Apomyoglobin Folding
by
Mariangela Dametto

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Chemistry
College of Arts and Sciences
University of South Florida

Co-Major Professor: Randy Larsen, Ph.D.
Co-Major Professor: Alfredo Cardenas, Ph.D.
Brian Space, Ph.D.
Wayne Guida, Ph.D.
Susana K. Lai-Yuen, Ph.D.

Date of Approval: November 10, 2009

Keywords: protein folding, molecular simulations, denaturation conditions, classical action, long time dynamics

Copyright 2009 Mariangela Dametto
Acknowledgments

I would like to (try to) express my immense gratitude to my parents, Maria Osoria Roberti Dametto and Antonio Dametto. Their incredibly deep and unconditional love and guidance throughout my entire life have been invaluable. The dedicated support they have given me and their belief in me have been a true source of much happiness and inspiration in my life. Words cannot begin to express my gratitude and the deep respect that I have for them and have always received from them. They gave me infinite knowledge about human respect and all the wonderful qualities that brought me to this stage of my life, in which I am greatly and completely satisfied to be. Thus, I extend my deepest and most sincere gratitude, because they made me the person that I definitely would like to be, and I only hope that my support of them continues to be as meaningful as what they have given to me.

My brothers, Pedro Dametto Neto and Antonio Henrique Dametto, have been an immeasurable support during all these years, always encouraging me to keep to the track that I decided to trek. These thanks are also extended to my sister-in-law, Beatriz Curi Dametto, my nephew, Lucas Curi Dametto, and my nieces, Laura Curi Dametto and Raquel Curi Dametto. Their love and deep affection have made me strong enough to overcome the obstacles of being far from home.

Very special thanks go to my best friend, Patricia Butarello Cassini. Our friendship is completely beyond what I can describe here. Support, encouragement and words of comfort in the difficult times always came from her.
Another friend that I have to thank here is Claudia Amira Fiaschitello. Her happiness with all my achievements during my research life until now has been very valuable to me. Another special person who is certainly part of this dissertation is Pedro Izao Ando Junior. I would never have come this far without his great efforts to help and encourage me.

Many thanks also to all the friends I made at the Chemistry Department at the University of South Florida, and to my current and former group members. They have been great friends and colleagues.

I cannot leave the University of South Florida without thanking very much the people who work so hard behind the scenes. Without them, it would be very difficult to "keep the machinery working" in the Chemistry Department. They are (please note, in no order of importance): Roberto Avergonzado, Kimberly Read, Lorene Hall-Jennings, Cheryl Graham, Adrienne McCain, John Connaughton, Linda Lowe, Venice Osteen, and Sarala Rao.

I would also like to thank my committee: Dr. Brian Space, Dr. Randy Larsen, Dr. Wayne Guida and Dr. Susana Lai-Yuen. I am deeply thankful to Dr. Brian Space and Dr. Randy Larsen for being with me on this journey since the beginning of my studies, and for always being truly accessible to talk and help with whatever need I had.

My enormous thanks go to a person who holds great significance in my process of learning during the Ph.D., Dr. Alfredo Cárdenas. I have been guided and oriented by this distinguished person, and his help and advice through all these five years have been of incredible importance to me, since the beginning, when he encouraged me to come from Brazil to do my Ph.D. here in the USA. He has opened up and broadened a world of opportunity to me, and I will always carry with me his very positive and important influence on my research life.

Finally, I would like to thank the University of South Florida, the National Science Foundation and Research Computing at the University of South Florida for all the services and financial support provided.
TABLE OF CONTENTS

LIST OF FIGURES
ABSTRACT
1 INTRODUCTION
1.1 Protein folding process
1.2 Possible mechanisms of the protein folding process
1.3 Intermediates in the apoMb folding route(s)
2 OVERVIEW
2.1 Myoglobin and its physiological function
2.2 Apomyoglobin
2.3 Apomyoglobin and its folding pathway
3 THEORETICAL BACKGROUND
4 REFOLDING AFTER HIGH TEMPERATURE DENATURATION
4.0 Introduction
4.1 Materials and Methods
4.2 Results and Conclusions
5 REFOLDING AFTER LOW PH DENATURATION
5.0 Introduction
5.1 Materials and Methods
5.2 Results and Conclusions
6 REFOLDING AFTER LOW TEMPERATURE DENATURATION
6.0 Introduction
6.1 Materials and Methods
6.2 Results and Conclusions
7 CONCLUSIONS COMPARING THE THREE UNFOLDING CONDITIONS
8 FINAL REMARKS
REFERENCES
APPENDICES
Appendix A Figures
ABOUT THE AUTHOR
LIST OF FIGURES

Figure 1. Representation of the folding energy landscape
Figure 2. Myoglobin amino acid sequence
Figure 3. Native structure of apomyoglobin (apoMb), showing helices A to H
Figure 4. Contour plot of the radius of gyration (Rg) and helicity for the high temperature denaturing condition
Figure 5. Graphic of number of contacts and Rg for the high temperature condition
Figure 6. Plot showing amino acids contact between the helices (A-H) in the native structure of apoMb for the high temperature condition
Figure 7. Plots showing amino acids contact between the helices (A-H) along the four folding high temperature trajectories
Figure 8. The variation of the helical content for each specific helix of apoMb as a function of the normalized path length (high temperature condition)
Figure 9. Molecular snapshots along one of the high temperature folding trajectories of apoMb
Figure 10. Contour plot showing the population distribution of conformation having different values of radius of gyration and helical content for the four low pH folding trajectories computed
Figure 11. Contour plot representing the population distribution of structures for the four low pH folding trajectories as a function of the number of native contacts and Rg
Figure 12. Plots showing amino acids contact between the helices (A-H) along the four folding low pH trajectories
Figure 13. The variation of the helical content for each specific helix of apoMb as a function of the normalized path length
Figure 14. Molecular snapshots along one of the low pH folding trajectories of apoMb
Figure 15. Contour plot showing the population distribution of conformation having different values of radius of gyration and helical content for the four low temperature folding trajectories computed
Figure 16. Contour plot representing the population distribution of structures for the four low temperature folding trajectories as a function of the number of native contacts and radius of gyration
Figure 17. Plots showing amino acids contact between the helices (A-H) along the four folding low temperature trajectories
Figure 18. The variation of the helical content for each specific helix of apoMb as a function of the normalized path length (low temperature condition)
Figure 19. Molecular snapshots along one of the low temperature folding trajectories of apoMb
Figure 20. Comparison of the behavior of the Rg as a function of the helical content for the folding trajectories obtained in the three different conditions
Figure 21. Plot showing the total number of contacts as a function of the formation of the eight helices present in the apoMb (high temperature condition)
Figure 22. Plot showing the total number of contacts as a function of the formation of the eight helices present in the apoMb (low temperature condition)
Figure 23. Plot showing the total number of contacts as a function of the formation of the eight helices present in the apoMb (low pH condition)
Figure 24. Relation between the amino acids belonging to each of the specific contacts associated with each specific helix or loop
Figure 25. Representation of the contact number 4 (HIS24-HIS119) from figure 24 (native state)
Figure 26. Figure showing the distance between the contact number 4 from figure 24 (unfolded from high temperature condition)
Figure 27. Figure showing the distance between the contact number 4 from figure 24 (unfolded from low pH condition)
Figure 28. Figure showing the distance between the contact number 4 from figure 24 (unfolded from low temperature condition)
Figure 29. Molecular representation of the native structure of apoMb showing the contact number 12 (HIS116-GLY124) from figure 24 (native state)
Figure 30. Figure showing the distance between the contact number 12 from figure 24 (unfolded from high temperature condition)
Figure 31. Figure showing the distance between the contact number 12 from figure 24 (unfolded from low pH condition)
Figure 32. Figure showing the distance between the contact number 12 from figure 24 (unfolded from low temperature condition)
Computer Simulations of Apomyoglobin Folding
Mariangela Dametto

Abstract

The differences between the refolding mechanisms of sperm whale apomyoglobin (apoMb) subsequent to three different unfolding conditions have been examined by atomistic-level computer simulations. The three unfolding conditions used in this work are high temperature, low temperature and low pH. The folding of this protein has been extensively studied experimentally, providing a large database of folding parameters which can be probed using simulations.

The crystal structure of sperm whale myoglobin was taken from the Protein Data Bank; the heme unit was then removed and an energy minimization was performed in order to generate the native apoMb form. Thus, the native conformation of apoMb is the same in all three refolding simulations done in the present work. The difference lies in the way the initial unfolded conformations were obtained. The refolding trajectories were obtained at room temperature using the Stochastic Difference Equation in Length algorithm.

The results reveal differences between the three refolding routes. In contrast to previous molecular simulations that modeled low pH denaturation, an extended intermediate with large helical content was not observed in the refolding simulations from the high-temperature unfolded state. Instead, a structural collapse occurs without formation of helices or native contacts. Once the protein structure is more compact (radius of gyration < 18 Å), secondary and tertiary structures appear. The low pH simulations show some agreement with the low pH experimental data and previous molecular dynamics simulations, such as the formation of a conformation having a radius of gyration around 20 Å and large helical content. The refolding simulations after the low temperature unfolding show differences in the properties of the apoMb folding route compared to the other two conditions. The collapse of the protein during folding occurs later in the simulation than in the high-temperature denatured state, but earlier than in the low pH simulations. These differences strongly suggest that a protein can follow different folding routes, depending on the nature and the structure of the unfolded state.
CHAPTER 1
INTRODUCTION

1.1 Protein folding process

Proteins achieve their stable three-dimensional conformation through an important process known as protein folding. The correct folding of a chain of amino acids into a protein is of fundamental physiological importance, since the functionality of a protein, meaning its particular biological activity, is directly related to its overall conformation. The importance of protein conformation to physiological function has been confirmed by experiments [1-3] in which the denaturing of a protein induced by temperature, pH changes or the addition of denaturant species produces a loss of function. However, the conformations of the denatured state can be very different depending on the denaturing conditions used, and may still contain residual structure or native topology [4-5].

The mechanisms through which a protein folds to different conformations have been the focus of extensive experimental and theoretical investigation. Overall, these studies have led to significant advances in our understanding of this vital cellular process. They have also provided insights into the occurrence of certain ailments, such as Alzheimer's and prion diseases, that occur due to the formation of misfolded proteins.

In order to function correctly, a newly synthesized amino acid sequence in the
cell must fold into a unique, native, tertiary structure that is specific to each individual protein. The covalent structure of a protein is determined by the structure of the twenty natural amino acids and the order in which they are linked together into a polypeptide chain. In general terms, folding is considered a progressive collapse of the polypeptide chain followed by the formation and stabilization of secondary and tertiary structure.

Several terms are used to characterize the states of folding. The folded (native) state is the conformation of the protein that is biologically active, and corresponds to the structure with the lowest free energy. The latter is obtained by taking into account the potential energies involved in the bonds and the interactions present in the protein. The bond variations can be due to bond length and rotation of flexible angles along the protein sequence. These variations are found mostly in the side chains of the amino acids and in the psi (ψ) and phi (φ) angles in the backbone of the polypeptide chain.

The native state of a protein is also acquired through the formation of many non-covalent interactions, such as hydrogen bonds, hydrophobic interactions and electrostatic interactions. The hydrophobic effect is important in the folding process: the nonpolar residues of a protein tend to cluster together in order to reduce their contacts with water molecules. In general terms, the formation of local native contacts, which also gives rise to local secondary elements such as α-helices and β-sheets, is what initiates the folding process. Since proteins have a compact hydrophobic core, it is very likely that after this initiation of secondary structure a protein undergoes a phenomenon known as hydrophobic collapse.

The ensemble of the states with lowest free energy favors the decrease of the
entropy, compensated by a favorable change in enthalpy, of what is considered the most common native conformation of a particular protein. This idea is summarized in the funnel model of the protein's conformational search for its native, biologically active structure. The many possible unfolded conformations are narrowed down along the correct folding pathway until the unique folded conformation of a functional protein is achieved. This process still intrigues many researchers in the field, and considerable effort has been made to find patterns in the sequence of events that explain how proteins find their way to the correct fold.

The analysis of the conformational changes that proteins undergo during the folding process has challenged scientists over the last decades. The investigation of this process can increase our knowledge of protein synthesis and improve the understanding of the development of some diseases related to the misfolding and aggregation of certain proteins. Alzheimer's, Parkinson's, some types of cancer, type-II diabetes, and prion diseases are examples of illnesses caused by the aggregation and tissue deposition of proteins that were misfolded during their cellular synthesis.

The study of changes in the conformational stability of proteins through the analysis of their unfolding is a powerful technique to help understand the folding process. A better comprehension of how proteins fold can in turn facilitate the discovery and production of drugs designed specifically to prevent misfolding and aggregation. The objective of studying the folding pathway of proteins is to comprehend the patterns and forces that are involved in this process.

The folding of the polypeptides into their native conformations (under
physiological conditions) is very stable, leading to a low concentration of unfolded states. The folding process is therefore studied by shifting the equilibrium reaction (Native ⇌ Unfolded) through the introduction of a perturbation, such as a variation in pH or temperature, the addition of a denaturing agent such as urea or guanidinium chloride, an increase in hydrostatic pressure, or a combination of these factors. The reversibility of the folding reaction is an important requirement for this kind of approach, and it has been observed that proteins are able to refold after undergoing a mild denaturing perturbation. Thus, "unfolded state" is a term that refers to conformations obtained under some reversible denaturing conditions (high/low temperature, low pH).

Obtaining basic knowledge of the molecular mechanism responsible for the folding of these flexible polypeptide chains is still one of the biggest challenges of structural biology: the relation between the polypeptide sequence and the folding behavior of a protein is not yet fully understood. The relation between structure and function is also essential to determining functional and structural domains, as well as their relevance to protein stability. The crucial role of the amino acid sequence is supported by the theory that all the information needed for the folding of a protein is contained in this sequence [6, 7].

Many questions have been raised regarding the mechanisms and the paths that a protein takes to reach its native three-dimensional conformation. A protein can adopt different conformations, which are specified by the rotations that can take place about the single bonds of its covalent structure, and the number of possible conformations is enormous. The folding process has to be quick rather than being a trial-and-error process, in
order for the protein to start its functionality for the organism. One of the questions that arises from this observation is how a protein finds the lowest free energy structure in a reasonable time.

Many efforts have been made to explain the mechanism of this important and intriguing cellular process. Many studies have addressed the topic, since this understanding can fill the gap between genetic information and the three-dimensional structure that the genes encode [8-11]. The knowledge gained from these studies helps to answer the complicated question of how the folding process occurs. It also helps us comprehend how misfolding occurs, resulting in an incorrect overall shape of the protein. The hope is to influence this process so that misfolded proteins fold into the correct shape, in order to avoid (or decrease the effects of) the diseases related to proteins that were not correctly folded.

1.2 Possible mechanisms of the protein folding process

Protein folding cannot be a random process, because it would take an astronomically long time for a single protein to test all the possible conformations. According to Levinthal, this time would rise exponentially with the length of the amino acid chain, which makes a random search improbable given the rate at which folding occurs in nature. This suggests the existence of specific mechanisms that simplify the formation of folded, stable and functional protein structures. In the search for these mechanisms, several protein folding models have been proposed, which are going to be
presented in this section.

Anfinsen [6, 7] showed that RNase A unfolds and loses its biological function in the presence of 8 M urea and β-mercaptoethanol. After the removal of these denaturing agents, however, the protein refolds and recovers its biological activity. This work supported the theory that the information necessary for the correct protein fold is contained in the primary sequence (the amino acid sequence of a protein) [6, 7].

These early studies launched the protein folding field, and many diverse models were proposed and debated in order to explain how proteins find their way to the correct fold. The main models proposed for the folding mechanism were nucleation and condensation [13-15], framework [16-17], diffusion-collision, and hydrophobic collapse.

Advances in experimental studies, combined with theoretical tools in which statistical mechanics and minimum energy models (the conformational state of a protein is associated with its minimum free energy state) gained credibility, led to the introduction of a new view of protein folding: the energetic funnel model [20-24] (Figure 1 in the Appendix section). This view suggests that folding starts in a large conformational space where a single protein can follow different, parallel routes, all capable of leading to the same native conformation. Along the folding pathway, however, some specific and well-defined intermediate states may be formed (trapped in a local minimum of free energy), depending on the route the protein follows to fold.
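The exponential growth behind the Levinthal argument can be made concrete with a rough back-of-the-envelope calculation. The sketch below is only illustrative: the number of conformations per residue (3) and the sampling rate (10^13 conformations per second) are assumed round numbers chosen for this example, not values taken from this work, and the function name is hypothetical. The chain length of 153 residues is that of sperm whale myoglobin.

```python
import math

SECONDS_PER_YEAR = 3.156e7  # approximate length of one year in seconds


def levinthal_search_time_years(n_residues,
                                states_per_residue=3,      # assumed value
                                samples_per_second=1e13):  # assumed value
    """Estimate the time (in years) for an exhaustive random search.

    The number of conformations is states_per_residue ** n_residues.
    The arithmetic is done in log10 space so that very long chains
    do not overflow a floating-point number.
    """
    log10_conformations = n_residues * math.log10(states_per_residue)
    log10_seconds = log10_conformations - math.log10(samples_per_second)
    return 10 ** (log10_seconds - math.log10(SECONDS_PER_YEAR))


if __name__ == "__main__":
    # Search time for a few chain lengths; 153 is the length of myoglobin.
    for n in (50, 100, 153):
        print(n, f"{levinthal_search_time_years(n):.2e} years")
```

Under these assumptions every additional residue multiplies the search time by the number of states per residue, so even a modest chain far exceeds any biologically relevant time scale, which is the point of the paradox.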
1.3 Intermediates in the apoMb folding route(s)

Several proteins present stable intermediate conformations between the native and the denatured states [25-30], indicating a second order phase transition explained by a three-state folding mechanism. Ptitsyn suggested a folding mechanism for apoMb that occurs in three steps: formation of the secondary structure, on a time scale of microseconds to milliseconds; collapse of this structure into a compact form without a rigid tertiary conformation; and formation of the native compact structure. The equilibrium reaction Native ⇌ Unfolded (N ⇌ U) mentioned before would then not be a two-state process. The presence of intermediate conformations would be a solution to the Levinthal paradox, since the partially folded structures would restrict the conformational space.

Comprehending this process, however, requires a better understanding of the pathways involved in the mechanism. As said before, intermediate states have been discovered for certain proteins, so that their folding mechanism is not a simple two-state process involving only the native and the unfolded states, but a more complex pathway that includes intermediate states. To understand protein folding, it is desirable to identify and analyze the structure of these possible intermediates.

The presence of intermediate states is usually detected in kinetic studies. Some properties of these intermediates can be determined from kinetic experiments, but not much structural information can be extracted from them. Intermediate structures were observed during kinetic studies of the folding of small proteins [25, 26, 30].
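To illustrate how an intermediate changes the picture of the N ⇌ U equilibrium, the following sketch computes Boltzmann populations for a native (N), an intermediate (I) and an unfolded (U) state. It is a minimal example under stated assumptions: the free energy values are made-up placeholders, not data for apoMb, and the function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def state_populations(free_energies_kj_mol, temperature_k=298.15):
    """Return equilibrium populations from relative free energies.

    Each state s with free energy G_s gets the Boltzmann weight
    exp(-G_s / RT); the weights are normalized so that they sum to 1.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    g_min = min(free_energies_kj_mol.values())  # shift for numerical stability
    weights = {state: math.exp(-(g - g_min) / rt)
               for state, g in free_energies_kj_mol.items()}
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical free energies (kJ/mol) relative to the native state.
    native_like = {"N": 0.0, "I": 10.0, "U": 20.0}
    # Hypothetical perturbed condition in which the intermediate is favored.
    perturbed = {"N": 8.0, "I": 0.0, "U": 6.0}
    for label, energies in (("native-like", native_like),
                            ("perturbed", perturbed)):
        print(label, state_populations(energies))
```

With the first set of values nearly all molecules are native, which mirrors the low concentration of unfolded states under physiological conditions; changing the relative free energies, as a denaturing perturbation would, shifts the populations toward I or U.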
The practical problem in studying the intermediates is that the folding reactions of these proteins are highly cooperative, i.e., the intermediates are formed and persist only for a short period of time [25, 26, 30]. However, α-lactalbumin and apoMb [27-29] are examples of longer proteins that display intermediates at equilibrium. Studies of intermediate states have helped to answer questions regarding the speed and the efficiency of protein folding. More details about the structure and formation of intermediates are given in section 2.3, which describes apoMb and its folding pathway.

In this dissertation, the folding of the apoMb protein has been studied by computer simulations. The folding pathway was simulated through an approximate computational approach. The method provides a folding trajectory in which the initial point is the unfolded configuration of the protein and the end point is its native configuration. This approximate methodology allowed the calculations to be carried out with a large step size, which makes it possible to study a molecular process on a long time scale, as is the case for the folding of a protein the size of apoMb.

The relevance of this work is that understanding the mechanism by which a protein folds can improve the production of drugs against diseases associated with misfolded proteins. Another application of this study is in the area of nanomaterials. ApoMb was chosen for this study because of the large amount of available data that can guide a theoretical study like this one.
CHAPTER 2
OVERVIEW: APOMYOGLOBIN IN THE STUDY OF PROTEIN FOLDING

2.1 Myoglobin and its physiological function

Myoglobin is a heme protein present in the skeletal muscles of mammals, where it has the very important function of storing oxygen in the muscle cells. Oxygen binds reversibly to myoglobin, and the protein can also bind other molecules, such as CO2, NO and CO. The ability of myoglobin to bind oxygen depends on the presence of a prosthetic group (the heme group). This is a non-polypeptide group that consists of a porphyrin (organic part) and a central iron atom (inorganic part).

The crystal structure of sperm whale (Physeter catodon) myoglobin (Mb) has been determined. It is a globular protein of approximately 17 kDa molecular mass and 153 amino acids. It is a monomeric protein and contains eight α-helices, named A to H. The amino acid sequence of the protein is shown in Figure 2 of the Appendix section.

2.2 Apomyoglobin

The apo form of myoglobin (apomyoglobin, apoMb) is obtained by removing the heme group. It still conserves properties similar to those of myoglobin, such as the
solubility in water (though at lower concentrations) and some of its structural characteristics, such as the helical content. In this work we address the folding process of the apo form of Mb.

ApoMb is a globular protein at pH 6, and its secondary structure is still composed of eight α-helices, labeled with letters A-H (Figure 3 in the Appendix section). It has an extensive hydrophobic core comprising 61 residues (~40% of the protein), mainly located in helices A, G and H, although there is considerable tertiary structure involving helices B and E. The radius of gyration (Rg) of the native structure of apoMb is approximately 15 Å, and its helical content is about 55% [27, 29]. It is believed that in its native state apoMb mainly loses the structure of the F-helix compared to the holo form, which reduces the stability of the apo form [29, 35]. The C and D helices are the smallest ones, and they also present a significant degree of disorder in the apo-protein. It is also believed that the unfolding of apoMb differs from that of the holo form and occurs with the formation of an intermediate state (the low pH intermediate state).

The study of apoMb can be particularly useful for understanding folding patterns in helical globular proteins. In addition, there are many experimental studies of this protein that can guide molecular simulations. The polypeptide chain of globular proteins is folded into a compact form, and this native form is the conformation that is important for their biological functions, as described before. Globular proteins are a complex type of condensed matter, containing a solid core and a flexible surface. This configuration allows a protein to have the correct motions in order to function.
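The radius of gyration quoted above measures how compact a structure is. A minimal sketch of the standard mass-weighted definition, Rg = sqrt(Σ m_i |r_i − r_cm|² / Σ m_i), is given below. The function name and the toy coordinates are assumptions made for illustration; this is not the analysis code used in this work, and real input would be atomic coordinates in Å read from a structure file.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords: sequence of (x, y, z) tuples, e.g. atomic positions in angstroms.
    masses: optional sequence of weights; equal weights are used if omitted.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("at least one coordinate is required")
    if masses is None:
        masses = [1.0] * n
    total_mass = sum(masses)
    # Center of mass, computed one Cartesian component at a time.
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
              for k in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    mean_sq = sum(m * sum((c[k] - center[k]) ** 2 for k in range(3))
                  for m, c in zip(masses, coords)) / total_mass
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Toy example: two equal masses 2 units apart.
    print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

A more extended chain places its atoms farther from the center of mass and therefore has a larger Rg, which is why this quantity is used to follow the collapse of the protein along a folding trajectory.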
2.3 Apomyoglobin and its folding pathway

ApoMb is considered a good model for studying the folding pathway of globular helical proteins, since it retains the helices of the holo form (i.e., even after the removal of the heme group). Its folding has been extensively studied by experiments [29, 36-40] and by some computer simulations [27, 28]. Even with the advances in computer capabilities, experimental data are needed to aid protein folding simulations. Since experimental data on the folding of apoMb are available, they can guide the computational study. ApoMb has been one of the most extensively studied proteins with regard to its folding mechanisms. However, a clear understanding of the apoMb folding pathway has not yet been reached, and knowledge of the details of the molecular events that occur during the folding process remains limited.

Protein folding can only be studied by initiating folding from an unfolded state, but there are different ways to unfold the protein. Because proteins are sensitive to the environment in which they are located in the cell, some perturbations (such as increasing or decreasing the temperature, lowering the pH or using denaturants) can lead to the unfolding of the protein.

The folding process of apoMb has been studied extensively, mostly driven by variations in the pH of the solution containing the protein. Experimental and computational modeling results [27, 28] have identified an intermediate state when the unfolding of apoMb is induced by a rapid drop in solution pH. This acidic intermediate, called the I-state, has been characterized recently by Uzawa et al. using continuous flow, time-resolved circular dichroism, and SAXS (Small-Angle X-ray
Scattering). The results suggested that the I-intermediate has a radius of gyration (Rg) close to 23 Å and a helical content of around 33%. These authors proposed that the helices present in the I-state are helices A, G, and H. It is believed that during low-pH-induced unfolding, apoMb forms this equilibrium acidic intermediate at pH 4.2, with properties of both the folded state (pH 7) and the unfolded state (pH 2) [29, 39-40]. Uzawa et al. also observed a second intermediate, with an Rg of 23.1 Å and a helical content of 44%, that had not been described before. However, according to their model, this first folding intermediate (predicted within 300 μs) could not be resolved with the time resolution used in that work.

Two key molecular dynamics studies of apoMb unfolding have been carried out to simulate pH-induced unfolding. The first unfolding simulations, with a duration of 500 ps, were computed at a temperature of 358 K and pH 4 with sperm whale apoMb. These studies suggested that the F helix is already disordered in the native structure of the protein. At pH 4, an intermediate state was also observed with some of the same characteristics described experimentally, i.e., a helical content of around 33% and formation of the AGH core. However, the intermediate described in this work has a radius of gyration of 16.7 Å, more compact than the intermediate characterized by Uzawa et al.

An intermediate state during the unfolding process was also observed in a different molecular dynamics simulation of the acid-induced unfolding of sperm whale apoMb. In that case the unfolding trajectory was obtained at 300 K and its duration was 1.6 ns. The unfolding process at pH 2 produced an intermediate state with a preserved A(B)GH core and a radius of gyration of 21.6 Å, similar to the experimental result of Uzawa et al.
The data available on apoMb unfolding are very broad. The puzzle is that some of them describe an unfolding pattern based on the application of more than one perturbation, such as lowering the pH and increasing the temperature, or adding small amounts of denaturants. That said, a compilation of the results on apoMb to date shows that, with regard to the denaturing conditions used to unfold the protein, the main features are: loss of secondary structure, with helices A, G and H [29, 41, 42] and a part of the B helix [29, 42, 43, 44] remaining more structured, and a tertiary structure that is less compact than the native structure. A more detailed characterization of this intermediate was done with experiments such as deuterium exchange [41, 45, 46]. These works show some evidence that helices A, G and H are more structured than the other helices. They also suggested that the formation of this intermediate is an important step for apoMb to reach its native state.

The participation of helices A, B, G and H in the formation of the pH 4 intermediate of apoMb was also studied in works that used site-directed mutagenesis [47, 48, 49]. The evidence from these studies supports the idea of sequential folding, with a subsequent incorporation of the helices that are still unstructured, resulting in a protein with a stable and compact structure.

Some works suggest the presence of two intermediates in the folding landscape of this protein under low pH unfolding conditions [29, 35, 43-45, 50-51]. The rest of the B helix is formed after the formation of this first intermediate. The sequential incorporation of the B helix (pH 4) is supported by deuterium exchange experiments in the presence of 20 mM TCA (trichloroacetic acid). Adding TCA increases the helicity
and stability of the intermediate, which has high protection factors for helices A, G and H and for the B helix. An increase in helical propensity and in the relative stability of the pH 4 intermediate is also seen when the B helix is stabilized by the mutations G23A and G25A. Jamin and Baldwin (1998) and Uzawa et al. (2004) identified two intermediate forms, Ia and Ib, that coexist at pH 4 in an equilibrium that depends on pH, on urea concentration and on the presence of anions. The evidence for the two different intermediates comes from the difference in the fluorescence emission spectra of the protein in the pH range of 3.4 to 4.2 and from the increase in fluorescence after the addition of 1 M urea. The conclusion was that Ia did not yet have the B helix formed and would be the only form at pH 3.4. At pH 4.2, the predominant form would be Ib, with the B helix totally or partially formed. This helix would be the first to unfold in the presence of 0-1 M urea, where only Ia would be present. Eliezer et al. (1998) showed the B helix to be more structured in the intermediate at pH 4, suggesting that A[B]GH is the nucleus of this intermediate. These results were corroborated by a simulation by Onufriev et al. (2003). Nishimura et al. (2003) studied hydrogen exchange, evaluated by mass spectrometry, in apoMb mutants and showed different protection factors for different residues of the B helix, suggesting that this helix can be partially folded in the intermediate state. Taking all these results into consideration, they demonstrated that the C-terminus of the B helix is more structured than the N-terminus.

Other low pH experimental studies provide a model for the presence of intermediate state(s) in the unfolding pathway of some proteins [41, 52]. This also
suggests that lowering the pH is a very common way to achieve a denatured state of a protein. But it is still not clear whether the denatured states obtained under different denaturing conditions follow the same folding route. This is the main goal of this work: comparing three different denaturing conditions used to unfold apoMb in order to investigate the importance of the nature of the unfolded state in the folding process of apoMb.

The refolding of apoMb has also been examined after changes in temperature. Sabelko et al. studied the folding of apoMb after denaturation by cooling or heating. Interestingly, these authors did not observe a stable intermediate during the refolding of the protein after cold denaturation. It was suggested that the exposure of the inner core of helix A to the solvent upon cold denaturation makes the formation of the AGH core (observed during the refolding of acid-unfolded conformations of apoMb) more difficult. In a separate study, the refolding of apoMb after cold denaturation was examined using a T-jump/fluorescence technique. Experiments conducted at pH 5.2 also did not show any evidence of the existence of the intermediate state.

Thus, another way to achieve the unfolded state is by applying low temperature to the system. However, experimental and theoretical works using low temperature to unfold apoMb or other proteins are not common. One reason may be the difficulty of performing cooling jumps, because of the dissipation of heat within the relaxation time of the protein. But, as mentioned in the previous paragraph, some experimental works have studied the cold denaturation of globular proteins [53, 54-57].
The few works that applied low temperature to study the unfolding of proteins indicate a two-state process for the cold denaturation of globular proteins from the native state to the unfolded state. Even for apoMb [53, 57-58], there is no evidence of an intermediate state after cold denaturation, and its folding process represents a first-order phase transition, in contrast to the works that studied the unfolding of apoMb by lowering the pH of the solution. Theoretical works have also been done on the cold denaturation of different proteins, employing different techniques [59, 60, 61]. One specific work simulated the cold denaturation of apoMb, although the authors did not study the whole unfolding process (because of time restrictions) and intended only to probe the initial stages of the low temperature unfolding of apoMb.

Another complication in simulating cold denaturation is the complexity of the interactions between the solvent (water) and the protein, which makes it more difficult to understand the mechanistic details of the cold perturbation of the system. In order to overcome the properties of water at low temperatures (its density behavior near the freezing point), Manuel I. Marqués (2007) applied high pressure simultaneously with low temperature in his simulations. The main conclusion drawn from this work was that the hydrophobic forces that make the protein structure more compact are due to the hydrogen bonds organized in a low-density network of water molecules. At higher pressures (from 200 MPa to 700 MPa), this network is not formed and the solvent water molecules adopt a denser structural configuration, which decreases the repulsive interactions between the non-polar residues of the protein and the water molecules. Thus, in this scenario, the water molecules can more easily enter the core of
the protein and promote the unfolding of the native conformation of the protein. Nishii et al. showed a thermal unfolding of apoMb, but simultaneously applied two other conditions: lowering the pH to 2 and promoting a salt-induced unfolding. They observed the stabilization of what they called the molten globule of apoMb, but this molten globule has the same characteristics as the I-state described in the previous works. The main point that we want to make here is that they observed a complete unfolding of apoMb after applying low pH and adding salt to the system.

Thus, to date, no computational simulation applying low temperature in order to obtain a complete unfolding of apoMb has been reported. One study on the topic has already been cited, but it used the range of 265-278 K in order to extract some information about the first steps of the low temperature unfolding of apoMb.

Attempts have been made to explain the mechanism of each denaturing condition for each unfolding case, where the denatured states are expected to be different as well. These explanations have examined the ensembles of configurations obtained after cold denaturation or heat denaturation [53, 57], or even after pressure-induced denaturation.

The main goal of this dissertation was to compare the folding routes of apoMb from three different denaturing conditions (high temperature, low temperature and low pH). These three different ways of achieving the denatured state of apoMb were studied by computer simulations, with no combination of different unfolding conditions: in each simulation, only one of the unfolding conditions was applied at a time. The expectation after doing this work was to obtain a
better idea of the importance of the nature of the unfolded state to the folding process of apoMb. The principal objective was to determine, through numerical computer simulations, whether there is an obligatory intermediate in the apoMb folding route or whether there could be an alternative parallel pathway that the protein could follow to fold into its native state.

Kinetic folding intermediates have been observed in many proteins. However, it is interesting to investigate whether there are local minima in the free energy surface between the unfolded and the native states, and whether some transient population could accumulate in these local minima, which would suggest that static intermediates (thermodynamic intermediates under steady-state conditions) also exist in the folding pathway.

The present results were compared to each other and to previous experimental and theoretical works on this topic. In this way, some conclusions were drawn about the folding pathway of this helical globular protein, and it is believed that they help to understand how a protein can find its way to its native and functional conformation.
CHAPTER 3
THEORETICAL BACKGROUND: DESCRIPTION OF THE COMPUTER SIMULATION METHODS USED

In this work, the folding process of the protein apoMb was studied by molecular computer simulations. A computational approach that has been commonly used to study the folding process of a protein is to simulate the reverse process, unfolding, via molecular dynamics simulations.

Classical molecular simulation techniques have their origins mainly in the work of Newton and in the work of Gibbs and Boltzmann, who identified the laws that govern the motion of bodies and the way to relate microscopic states to macroscopic properties. Knowledge of the equations of motion for the system under study is the basis of molecular dynamics simulations. The molecular dynamics algorithm is based on the numerical solution of those equations of motion (obtained by integrating the equations), providing a trajectory in which the coordinates and velocities of the system are given as a function of time. From this trajectory, equilibrium properties (such as temperature, total energy, pressure) and dynamic quantities (coordinates and velocities of the atoms in the molecule) can be computed. These parameters also need to be controlled in order to obtain a reasonable and reliable trajectory for the system.
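As an illustration of what numerically integrating the equations of motion means, the sketch below propagates a single particle with the velocity Verlet scheme, a common initial-value integrator. It is a minimal example under stated assumptions: the one-dimensional harmonic force, the reduced units and the function names are hypothetical choices for illustration, and the source does not state which integrator the simulation package uses.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m*a = F(x) for one degree of freedom.

    Returns the trajectory as a list of (position, velocity) pairs,
    one entry per time step, including the initial state.
    """
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory

def harmonic_force(x, k=1.0):
    """Illustrative force for the potential U(x) = k*x**2/2."""
    return -k * x

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

# Toy run in reduced units: x(0) = 1, v(0) = 0, time step 0.01.
traj = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)
energies = [total_energy(x, v) for x, v in traj]
```

With a small time step the total energy stays close to its initial value, which is the kind of numerical stability check that limits the step size in conventional molecular dynamics.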
Molecular dynamics programs combine classical mechanics and thermodynamics (statistical mechanics). A thermodynamic state has to be specified at the beginning so that the trajectory can start to be built, and a statistical ensemble (NVT, NpT, NVE) is chosen in which the system evolves with time. Hence, obtaining atomic trajectories had to wait for the advent of high-performance computers combined with more powerful algorithms. These trajectories have become possible to obtain in recent years, and computational simulation has become a potent research technique with a broad spectrum of applications [66-72].

One of the reasons for studying unfolding processes rather than folding processes with computer models is that the force field parameters used in these simulations are not accurate enough to correctly predict the native state of a protein. Another reason is that the native structure of a protein is better understood experimentally than the structures of the unfolded state. In fact, the ensemble of denatured states is very large and diverse, making experimental determination of all possible unfolded conformations impossible. However, the unfolding process can be too slow to be studied by molecular dynamics simulations, which require small time steps to preserve numerical stability. Therefore, increasing the temperature is a common strategy to accelerate a process such as unfolding in molecular dynamics simulations [74-75].

In this dissertation, different simulation techniques were used in order to obtain the folding trajectories of apoMb. In particular, one of the algorithms developed by the group of Ron Elber at Cornell University (SDEL, Stochastic Difference Equation in Length)
will be discussed, showing its power and versatility, mainly with respect to the diversity of the microscopic information that can be extracted. Another important point is that detailed microscopic information can be obtained via computer simulation that cannot be reached experimentally.

The present work entailed the study of the dynamics of the folding mechanism through the application of the SDEL method, which is a boundary value formalism based on the classical action principle. The folding trajectories computed with this algorithm are paths that start from ensembles of denatured conformations and end at the native conformation of apoMb. This classical action-based algorithm allows the study of slow biological processes because the trajectories are obtained using large step sizes; these approximate solutions can be made more accurate by systematically decreasing the step size, approaching the classical trajectories. Thus, SDEL was used to simulate the refolding process of apoMb starting from each of the three different unfolded states of the protein (high temperature, low temperature and low pH).

The refolding trajectories obtained with SDEL are room temperature trajectories. This is an advantage of this method relative to the computation of unfolding trajectories at high temperatures. However, it has been shown for several molecular systems that results from these high-temperature trajectories seem to agree with experimental results under room temperature conditions. This suggests that the effect of high temperatures on trajectory space can be minor, at least for the few molecular systems for which simulations and experimental results have been compared. However, for a protein system such as apoMb, different unfolding conditions produce observable
changes in the unfolding process [39, 76-77]. This indicates that unfolding conditions do matter for this system, and the simulations should model the molecular environment accordingly. SDEL allows the study of the slow refolding process under physiological thermal conditions (300 K), and we seek equilibrium folding conditions once the unfolded state has been reached. The trajectories obtained are approximate, since a large step size is required to study these long-time processes. The folding of apoMb occurs on a submillisecond time scale. In recent decades, considerable research effort has been devoted to bringing protein folding within accessible timescales (milliseconds). The filtering of high-frequency motions caused by the large step size also prevents an accurate determination of the rates of these processes. This is a disadvantage of SDEL relative to initial-value molecular dynamics algorithms, which provide more accurate trajectories, although on a shorter time scale. Because SDEL is a boundary-value formulation, two boundary conditions are needed to compute the trajectory. In the present work, these two states are representative of the folded and the unfolded states of apoMb, and their configurations are given as inputs to the calculation. This method has been applied to the study of protein folding for three smaller proteins: protein A, cytochrome c [79] and barstar.

SDEL is a boundary value formalism based on the least-action principle of classical mechanics. The starting point of the algorithm is the classical action parameterized as a function of the length of the trajectory:
$$S = \int_{X_u}^{X_f} \sqrt{2\,\bigl(E - U(X)\bigr)}\; dl \qquad (1)$$

In this work, $X_u$ and $X_f$ are the coordinates of the initial unfolded and final folded conformations of the protein, $E$ is the total energy and $U$ is the potential energy of the system. The path (from the unfolded to the folded state) is connected by a trajectory with infinitesimal path element $dl$. In Eq. (1), the action is parameterized as a function of the length. An approximate trajectory connecting the two boundary states is generated using a large step size ($\Delta l \gg dl$). This discretized action is:

$$S \cong \sum_{i=0}^{N-1} \sqrt{2\,\bigl(E - U(X_i)\bigr)}\; \Delta l_{i,i+1} \qquad (2)$$

where $N$ is the number of intermediate configurations in the trajectory connecting the two boundary structures, and $\Delta l_{i,i+1}$ is the distance between two consecutive configurations ($\Delta l_{i,i+1} = |X_i - X_{i+1}|$). According to classical mechanics, if $\Delta l$ is infinitesimal, a classical trajectory connecting the two boundary conformations is a stationary solution of the action. In SDEL the same "least action" principle is used to obtain an approximate trajectory (with large step sizes) connecting the boundary states. In SDEL the value of $\Delta l$ is fixed between consecutive configurations in the trajectory to keep a uniform distribution of conformations along the pathway.

The target for optimization in SDEL is not Eq. (2), since a classical trajectory can be a minimum or a saddle point of the action. To remove this ambiguity, the target for optimization in SDEL is $\sum_i \left(\partial S / \partial X_i\right)^2$. Thus, the algorithm computes a trajectory that minimizes this functional (subject to additional constraints to remove overall rigid
body motions). Cartesian coordinates are used in this study to represent a relatively large molecule such as apoMb. The initial guess for the trajectory is a minimum energy path computed with the program Chmin in MOIL, based on a self-penalty walk algorithm. In this algorithm, an initial trajectory, to be optimized later by SDEL, is generated by interpolating coordinate sets between the two fixed boundary structures (native and unfolded). The rigid body motions that are eliminated through the constraints could otherwise affect the distance between the interpolated structures, since apoMb has many degrees of freedom. These constraints keep the distances between two consecutive monomers along the path constant; the other constraint applied here is a repulsion that keeps the chain from aggregating. Chmin applies a line integral in order to produce an average value of the potential energy V(R) along a trajectory l(R):

$$\bar{V} = \frac{1}{L} \int_{R_a}^{R_b} V(R)\; dl(R) \qquad (3)$$

where $R_a$ and $R_b$ are the Cartesian coordinates of the unfolded and native states, respectively, $L$ is the total length of the trajectory, and $dl(R)$ is the line element.

Each trajectory, with 1500 structures, starts from a different unfolded conformation generated by one of the high temperature, low temperature or low pH molecular dynamics simulations described later, and ends at the same native conformation (Figure 3 in the Appendix). A trajectory that minimizes the SDEL functional is then obtained with a simulated annealing protocol, using a parallel implementation of the algorithm. The value of the total energy of the system, E, was
determined by molecular dynamics equilibrium runs at 300 K. The SDEL calculations for the high-temperature runs were performed using 10 nodes of a 2.2 GHz AMD Opteron 246, and it took approximately 72 hours to optimize a single trajectory with the SDEL program. The SDEL calculations for the low pH and low temperature denaturing conditions were performed using 20 nodes of a 2 x 2.8 GHz Dual-Core AMD Opteron 2220, taking approximately 96 hours for a single trajectory. These calculations required more time due to the explicit water molecules present in the low pH and low temperature runs.
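The quantities introduced in this chapter can be made concrete with a small numerical sketch. The Python code below evaluates, for a discretized path, (i) the length-parameterized action of Eq. (2) as a sum of sqrt(2(E - U)) times the step length, (ii) the sum of squared action derivatives used as the optimization target, here estimated by finite differences, and (iii) the path-averaged potential energy of Eq. (3), here with a trapezoidal rule. Everything in it is an assumption made for illustration: the function names, the two-dimensional toy paths and the test potentials are hypothetical, the constraints on step length and rigid body motion are omitted, and it is not the MOIL implementation.

```python
import math

def distance(a, b):
    """Euclidean distance between two configurations."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def discretized_action(path, potential, total_energy):
    """Sum of sqrt(2*(E - U(X_i))) * |X_i - X_{i+1}| over consecutive points."""
    action = 0.0
    for x_i, x_next in zip(path, path[1:]):
        kinetic = total_energy - potential(x_i)
        if kinetic < 0.0:
            raise ValueError("path visits a region with U > E")
        action += math.sqrt(2.0 * kinetic) * distance(x_i, x_next)
    return action

def sdel_target(path, potential, total_energy, h=1e-5):
    """Sum over interior points of (dS/dX_i)^2, by central differences.

    The two end points are the fixed boundary structures and are not varied.
    """
    target = 0.0
    for i in range(1, len(path) - 1):
        for k in range(len(path[i])):
            plus = [list(p) for p in path]
            minus = [list(p) for p in path]
            plus[i][k] += h
            minus[i][k] -= h
            ds = (discretized_action(plus, potential, total_energy)
                  - discretized_action(minus, potential, total_energy)) / (2.0 * h)
            target += ds * ds
    return target

def path_averaged_potential(path, potential):
    """(1/L) * integral of V dl along the path, trapezoidal approximation."""
    length = 0.0
    integral = 0.0
    for a, b in zip(path, path[1:]):
        dl = distance(a, b)
        integral += 0.5 * (potential(a) + potential(b)) * dl
        length += dl
    return integral / length

# Toy paths between the same two boundary points (hypothetical data).
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
kinked = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
free = lambda x: 0.0  # free particle: U = 0 everywhere
```

For the free particle the straight path is a stationary solution, so its target value is numerically zero, while the kinked path gives a clearly positive value. This is the property the optimization exploits: minimizing the squared derivatives locates stationary paths whether they are minima or saddle points of the action.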
CHAPTER 4
REFOLDING AFTER HIGH TEMPERATURE DENATURATION

4.0 Introduction

In this chapter, the results of the refolding of apoMb using SDEL, starting from high-temperature unfolded conformations, are described. Done carefully, this unfolding process represents another way to unfold a protein: after a higher temperature is applied, the protein unfolds, but it refolds if it is returned to room temperature. This is what was done in the first part of this work in order to reach an unfolded state for a subsequent study of the refolding pathway of apoMb. The folding trajectories computed in this work confirm the experimental suggestions [53-54] that after thermal stress the refolding of apoMb occurs without the formation of an extended intermediate structure.

4.1 Materials and Methods

The structure of apoMb used is derived from the crystal structure of myoglobin (Protein Data Bank: 2mb5) by removing the heme group and then subjecting the structure to an energy minimization protocol to obtain the native conformation used in the
simulations (Figure 3 in the Appendix section). This was the same procedure used by Onufriev et al. In this work, all of the simulations were computed using the molecular simulation package MOIL and were performed using implicit solvent (Generalized Born model). The force fields in MOIL are a combination of AMBER and OPLS. The minimization procedure was carried out using a conjugate gradient method for 5000 steps.

After minimization of the apo structure, high-temperature molecular dynamics simulations were performed to produce unfolded conformations of the protein to be used as initial conformations in the refolding simulations. Since no experimental data on the unfolded state of apoMb under high temperature conditions were available, the temperatures used were high enough (~2000 K) to generate extended unfolded conformations in a short period of time. Four unfolded structures were extracted from these simulations. They were subjected to a short minimization to remove any close contacts. The four minimized conformations have a radius of gyration between 26 and 29 Å.

With the native and unfolded structures known, the folding trajectories were computed with the SDEL algorithm.

4.2 Results and Conclusions

The results presented here are an average over the four different folding trajectories computed. Each trajectory contains 1500 configurations. The folding process of apoMb was described through variations of various structural properties such as radius
of gyration, the number of native contacts and helical content.

Figure 4 (Appendix) displays the population distribution of conformations as a function of the number of helical residues and the radius of gyration. This contour plot shows an increase of the helical content only when the radius of gyration of apoMb is smaller than 16 Å (close to the radius of gyration of the native structure of the protein), suggesting that the formation of secondary structure in apoMb is preceded by a collapse of the unfolded structure. A corresponding contour plot showing the population distribution of conformations with different values of the radius of gyration and number of native contacts is displayed in figure 5. This plot shows that most native contacts are formed subsequent to the collapse of the unfolded structure (Rg 17.5 Å).

A contact map for the native structure of apoMb is shown in figure 6. The contacts between each specific residue for every single helix were computed, and this figure displays their distribution for the native conformation of apoMb. Contact maps at different stages of the folding process of apoMb are displayed in figure 7. From these maps, it is observed that the A helix in the native state makes contact with the H helix and the EF loop. In the native structure, Gly5 (A helix) interacts with Gly80 (EF loop), and this contact forms late in the simulations (figure 7c and d). Also forming late in the folding simulations are contacts between Trp7 (A helix) and Met131 and Ala134 (H helix). The late formation of the A-H contacts suggests that a nucleation core involving the AGH helices is not present in these folding simulations. The B helix also forms contacts with the GH loop later in the simulation (figure 7d and e) involving
residues Glu83-Phe138, Lys87-Arg139, Lys87-Ile142, Lys87-Ala143, Pro88-Tyr146, and Pro88-Leu14, while the F and G helices form some native contacts with the H helix early in the folding (figure 7b, specifically residues Glu105-Ala143, Glu105-Ala144, Glu109-Arg139, Glu109-Lys140, His113-Gly129, His113-Lys133, His113-Glu136, His116-Gly124, Ser117-Gly124, and Ser117-Gly129). These contact maps show that the B-GH and B-E contacts start to form during the second half of the folding trajectories.

Figure 8 shows the formation of each of the eight helices as a function of the path length for the folding trajectories computed. The helices form mainly in the second half of the simulation, after the protein has collapsed, and they form in a cooperative manner. This plot gives no indication that helices A, G or H form first, which is characteristic of the intermediate state described in experimental studies and in simulations of the acid-induced unfolding of apoMb [27, 28]. The A helix is observed to form late. This result agrees with the suggestion of Sabelko et al. that the A helix is more exposed to the solvent upon cold denaturation, preventing the formation of the AGH core observed in low pH experiments.

Representative snapshots of the structures obtained during the folding trajectory are shown in figure 9. From these molecular snapshots it is clear that the collapse of the protein occurs first (top three conformations) and is followed by helix formation (bottom three conformations). A large intermediate structure with high helical content is not observed. Thus, the results observed in the present folding trajectories at neutral pH, starting from high-temperature unfolded structures, do not agree with experiments or simulations
under low pH conditions that show an intermediate state for the folding/unfolding of apoMb.

Some studies have suggested the existence of different folding pathways under various mild unfolding conditions [39, 76]. The pH dependence of the unfolding pathways might be the result of the protonation of histidine residues, which stabilizes the AGH core observed at pH 4. It is important to note that in the present study apoMb is kept neutral, and none of the acidic amino acids are in their protonated form. Studies of different proteins show that the acid-base equilibrium of histidine residues can affect the conformational stability of the polypeptide chain. McNutt et al. observed a strong effect of the substitution of Glu58 with Ala on the pKa values of two histidines of RNase T1, which affects the overall stability of the protein. Thus, it is not surprising that changes in structural stability can produce differences in the pathways followed during folding under various conditions.

In the computer simulation carried out by Onufriev et al., all the histidines, aspartic acids, and glutamic acids were protonated in order to model the pH 2 environment used to study the unfolding of apoMb. In the other simulation, by Tirado-Rives et al., all the histidines were protonated to mimic pH 4.2, at which the intermediate state was observed. However, the intermediates characterized in these two simulations have very different radii of gyration, as pointed out before. Tirado-Rives et al. also carried out an unfolding MD trajectory at neutral pH and 358 K. They observed a compact intermediate with the same properties (Rg ~ 16.5 Å and helicity ~ 33%) as the intermediate seen in their simulation at pH 4 and 358 K. The Rg of the intermediate found in the latter work, however, is not similar to that of the other works, in which it was
observed that the Rg of the intermediate state is around 21-22 Å. In this way, our simulations agree with the results of Tirado-Rives et al., suggesting that helices form when the structure is compact.

Other studies have shown distinct structural changes during the refolding of apoMb that depend on the conditions used. Gast et al. observed the formation of an intermediate state even when the pH was lowered below 4 under high ionic strength conditions. Goto and Fink observed similar refolding dynamics for apoMb at low pH with different ionic strengths and salts. Structurally distinct partly folded configurations have also been identified at low pH when anions were added to stabilize the I forms. Considering all these observations, it is suggested that the folding pathways can be significantly affected by the overall solution environment. Structural differences due to variations in temperature have been noted as well. Gast et al. described the existence of different structural conformations for apoMb in solutions at low (1 °C) or high (30 °C) temperature.

Interestingly, similar pH effects on folding pathways have been described for other helical proteins. For example, Gorski et al. explored the effect of pH variation on the stability of different states (intermediate states) in the folding of two helical bacterial immunity proteins. They observed that an intermediate state is populated at low pH, and that the stability of that state is pH-dependent. This result was in disagreement with two earlier investigations that did not detect any intermediate during the refolding of the same protein at neutral pH [91-93].

A possible reason why an extended but partially folded I state is not observed
under the present conditions, i.e., neutral pH, is that the A helix is less stabilized (figures 7 and 8), and this prevents the formation of the early nucleation site formed by helices A, G and H observed at low pH. This destabilization of the A helix was suggested as the cause of the lack of intermediate structure in the cold denaturation unfolding of apoMb.

Another aspect that may be important is the influence of the initial unfolded ensemble on the folding pathways. A recent investigation suggests that thermodynamic features of the folded state are related to the properties of the unfolded state, and that the kinetic properties of proteins during folding depend on the properties of the denatured state. Gruebele et al. also suggested a relation between protein refolding and the nature of the initial unfolded ensemble. They analyzed the dependence of apoMb refolding on the properties of the initial cold denatured states and showed that, under cold unfolding, apoMb behaves differently than under acid-induced unfolding. The ensembles of states generated by different denaturation mechanisms are not similar, suggesting that the dynamics of the folding process may be influenced by the nature of the initial ensemble. Thus, variations in the folding pathways of apoMb when the unfolded ensemble is prepared under different conditions can be expected, as our results and the other works cited here have shown with respect to the folding pathways calculated previously for pH-induced unfolding [27, 28].

As stated in the first Chapter of this dissertation, there are different ways to unfold a protein. Lowering the pH, denaturants, temperature jumps, force-induced (steered molecular dynamics) and pressure-induced denaturation are some of them. These approaches can have different mechanisms
to unfold a protein. Pressure and pH denaturation may correspond to water penetrating into the protein through the formation of hydrogen bonds, while other types of denaturation may involve charged groups interacting with water by charge transfer.

These were our first results on the apoMb folding process, and they added to the puzzle of the folding landscape of this protein. After examining publications on the topic, which showed controversial points regarding the presence of intermediate states, together with our own results, we decided to analyze the apoMb folding process after obtaining the unfolded state via different denaturing conditions. The next two chapters show the results for the low pH and low temperature unfolding conditions, and the final chapters compare the conditions and provide reasons for the differences observed in the apoMb folding route.
CHAPTER 5
REFOLDING AFTER LOW pH DENATURATION RESULTS

5.0 Introduction

As stated in the previous Chapter, the simulations that used high temperature to unfold apoMb and reach the denatured state showed some disagreement with previous results on the folding route of this protein. The absence of any trace of an intermediate state in the high temperature simulations described in the present work raised the possibility of different folding routes, depending on the condition of the unfolding process. This was the reason why other unfolding conditions were applied to simulate the folding process. This chapter describes folding simulations of apoMb carried out after low pH denaturation.

Experimentally, the folding/unfolding process of apoMb induced by changes in the pH of the solution has been widely described [see references in Chapter 4], as also stated in previous chapters. Molecular dynamics calculations have also simulated the low pH condition through the protonation of amino acid residues [27, 28]. In this chapter, the methodology used to simulate the low pH unfolding condition and the results of those simulations are described. The results of this Chapter agree with the experimental data described before on the acidic unfolding of apoMb. The conclusions extracted from these
data are also drawn, but the main conclusions are left to Chapter 7 of this dissertation, where the three different unfolding conditions are compared and their differences are addressed.

5.1 Materials and Methods

The same apoMb structure used in the simulations of the previous Chapter was used here to start the low pH and the low temperature calculations (next Chapter). The 2mb5 Protein Data Bank entry provided the initial coordinates, after removing the heme unit and performing an energy minimization to obtain the stable apo form of myoglobin.

In order to simulate a low pH environment, all the titratable amino acids of apoMb were protonated. Acidic hydrogen atoms were added to all the aspartic and glutamic acid residues and to all the histidine residues of apoMb. This protonation scheme follows the one used in the theoretical study to simulate a solution at pH 2, at which apoMb is considered to be completely unfolded. The protonated apoMb was generated through NAMD (Nanoscale Molecular Dynamics) scripts, and the simulations were carried out in an aqueous environment, considering the importance of water as a solvent when dealing with changes in the pH of a solution. The solvated protein-water system was generated using VMD (Visual Molecular Dynamics program, version 1.8.6). The water model used was TIP3P. A total of 16009 water molecules were added, and the dimensions of the cubic water box were chosen so that each side of the protein was 20 Å from the edge of the box. The protein-water system was neutralized with 35 Cl ions using the autoionize plugin of VMD.
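As an illustration of how a protonation scheme determines the number of counter-ions needed for neutralization, the following minimal Python sketch sums formal side-chain charges for a sequence under two schemes. It is not the NAMD/VMD procedure used in this work: the function names, the charge table, the ion labels and the toy peptide are assumptions introduced only for illustration, and terminal charges are ignored.

```python
# Formal side-chain charges under two protonation schemes (illustrative only).
# "low_pH":  Asp and Glu protonated (neutral), His protonated (+1).
# "neutral": Asp and Glu deprotonated (-1), His neutral.
CHARGES = {
    "low_pH": {"K": 1, "R": 1, "H": 1, "D": 0, "E": 0},
    "neutral": {"K": 1, "R": 1, "H": 0, "D": -1, "E": -1},
}


def net_charge(sequence, scheme):
    """Sum formal side-chain charges; chain termini are ignored (assumption)."""
    table = CHARGES[scheme]
    return sum(table.get(residue, 0) for residue in sequence)


def counter_ions(sequence, scheme):
    """Return (ion_label, count) needed to neutralize the net charge."""
    q = net_charge(sequence, scheme)
    if q > 0:
        return ("CL", q)   # positive protein charge, add anions
    if q < 0:
        return ("NA", -q)  # negative protein charge, add cations
    return (None, 0)


# Hypothetical one-letter peptide, NOT the apoMb sequence.
toy_peptide = "MKRHDEHKA"
print(counter_ions(toy_peptide, "low_pH"))
print(counter_ions(toy_peptide, "neutral"))
```

For the toy peptide, the low pH scheme leaves only positive side chains, so more anions are required than under the neutral scheme, where the acidic residues partly cancel the basic ones.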
The entire system consisted of 50554 atoms. The next simulation steps, done using NAMD, were the minimization of the energy of the whole system, followed by a heating run (the temperature was gradually increased to 298 K over 600 ps) and an equilibration run at 298 K for another 200 ps. The production run (10 ns) was then performed at constant pressure (1 atm). Temperature and pressure were controlled by Langevin dynamics and by the Nosé-Hoover Langevin piston pressure control, as implemented in NAMD. A steered molecular dynamics run (carried out by the constant force restraint method) was needed in order to speed up the unfolding process. Therefore, a constant pulling of two randomly chosen amino acids was added in the production run. A harmonic constraint force constant of 10 kJ/mol/ was used. In the production run, the protein unfolds to a structure with an Rg of about 30 Å, similar to the experiments [29], and this was achieved in 5 ns. Four different trajectories were obtained by pulling two different amino acids in each trajectory.

After the unfolded structures were obtained in each of the four low pH runs, those structures were used to search for the folding route. This was done by the same refolding procedure carried out in Chapter 4.

5.2 Results and Conclusions

The results shown here are the average of the data collected in the four different trajectories that were obtained by pulling two different amino acids each time. Each trajectory contains 1500 conformations describing the folding route.

The overall results of the folding pathway obtained in this Chapter are different
from the results of Chapter 4, in which high temperature was used to obtain the unfolded structure of apoMb. This is the main conclusion reached after the low pH simulations were done. It is observed that apoMb follows different routes of folding, depending on the unfolding condition used to reach the denatured state. This suggests that the folding pathway strongly depends on the conformational structure of the unfolded state.

In the contour plot of Rg vs. helical content (figure 10), the distribution is shifted towards the right when compared to the high-temperature results, showing that there is no exponential decay of the number of helical residues for conformations having a compact radius of gyration. The formation of the helices starts when the radius of gyration is about 20 Å, and the experimental results reported the presence of an intermediate with a radius of gyration between 21 and 23 Å. In the region of radius of gyration around 20 Å, the increase in the number of helical residues starts to become slightly exponential, but not as much as in the corresponding high-temperature plot. Our conclusion here is that there is likely an intermediate state population in the low pH simulations, having Rg ~ 20-22 Å and helicity ~ 33-38%, in agreement with the experiments using low pH to unfold apoMb. This assumption is also based on the analysis of another plot that will be discussed in Chapter 7 (figure 23). This plot was made in order to better examine the region where the number of helical residues starts to increase. It shows a flat region, where the number of contacts does not change very much although the helices continue to form, and it shows that the contacts present in the native structure start to form in this flat region.
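For reference, the radius of gyration used throughout these plots is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below is a minimal Python illustration of that definition; it is not the analysis code used in this dissertation, and the function name and the unit-mass default are assumptions.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords: sequence of (x, y, z) tuples, in any length unit (e.g. angstroms).
    masses: optional sequence of weights; unit masses are assumed if omitted.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    mean_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    ) / total_mass
    return math.sqrt(mean_sq)


# Two unit-mass points 2 length units apart: each lies 1 unit from the center.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

Applied to every conformation of a trajectory, such a function gives the Rg axis of the contour plots; the result is in the same length unit as the input coordinates.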
The plot representing the population distribution as a function of the number of native contacts and the radius of gyration (figure 11) does not show a strongly exponential decay of the Rg either. This means that the native contacts start to form when the structure is still extended, in contrast to the high-temperature runs.

The formation of the contacts between the helices was also analyzed (figure 12). It shows a smaller number of non-native contacts from the beginning of the simulation. This means that, after apoMb is unfolded under low pH, most native contacts are still present, which does not occur in the conformations unfolded under the high-temperature condition. However, this figure still gives no evidence of AGH core formation early in the simulation. The contacts between helices A, G, and H start to be prominent later in the simulation (figures 12d and 12e). There is a prominent native interaction between HIS24 (helix B) and HIS119 (helix G). It forms before the first half of the simulation (figure 12c), and it is one of the relevant contacts for forming the native apoMb. Another contact that forms concomitantly with the previous one is between HIS24 (helix B) and ARG118 (helix G). These two contacts provide a strong interaction between the B and G helices. Figures 12a, 12b, and 12c show the formation of contacts between helix A and the EF loop (residues HIS12-GLY80), and between helix B and the GH loop (residues GLY25-PRO120). In both figures 12d and 12e, practically the same contacts between helices are present. Only the contacts between helices G and H are not totally formed yet (figure 12d). Figure 12d also shows that the number of contacts between the CD loop and helix D (LYS50-LYS56) is higher than that of other contacts.
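The distinction between native and non-native contacts made above can be sketched as follows. This is only a minimal Python illustration: the contact definition actually used in this work is not restated in this chapter, so the distance cutoff, the minimum sequence separation and the use of one representative point per residue are assumptions, as are the function names.

```python
import math


def contact_pairs(coords, cutoff=6.5, min_separation=3):
    """Residue index pairs (i, j), j - i >= min_separation, closer than cutoff.

    coords: one representative (x, y, z) point per residue (assumption).
    cutoff and min_separation are illustrative defaults, not values from the text.
    """
    pairs = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                pairs.add((i, j))
    return pairs


def classify_contacts(frame_coords, native_coords, cutoff=6.5, min_separation=3):
    """Return (n_native, n_non_native) contacts present in one conformation."""
    native = contact_pairs(native_coords, cutoff, min_separation)
    current = contact_pairs(frame_coords, cutoff, min_separation)
    return len(current & native), len(current - native)


# Toy three-residue example (hypothetical coordinates).
native = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(classify_contacts(frame, native, cutoff=2.0, min_separation=1))
```

A contact present in a conformation counts as native only if the same residue pair is also in contact in the reference (native) structure; every other contact counts as non-native.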
The presence of important contacts between specific residues in the low pH simulations (at the turning point observed in the contour plot of figure 10) is discussed later in this dissertation (figure 23 and Table 1).

A plot of the formation of the individual helices along the trajectory (figure 13) also presents slight differences when compared to the high-temperature plot (figure 8). Under the low pH condition, the helices ABEGH are formed early in the simulation. Some experimental and computational works observed that the intermediate state still maintains the AGH core of secondary structure [27-29]. Other experiments also claim the presence of the AGH core [35, 53, 54]. Our low pH simulations suggest the presence of an ABEGH core, since these helices start to form first during the simulation.

Figure 14 gives the molecular view of what was described in the previous paragraph. The representation of the helices along the trajectory in this figure shows that the first low pH-unfolded conformations still have considerable secondary structure, consistent with figure 12, which shows that native contacts (the contacts between helices are native) are also present in the first conformations during the folding of apoMb.
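The order in which individual helices form along a trajectory, as discussed above, could be extracted with a sketch like the following. It assumes that each conformation has already been reduced to a per-residue secondary-structure string (with "H" marking helical residues) by some external tool. The helix boundaries, the 50% threshold and the function names are hypothetical and are not taken from this dissertation.

```python
def helix_fractions(ss_string, helices):
    """Fraction of residues marked 'H' within each helix segment.

    ss_string: per-residue secondary-structure codes; index 0 is residue 1.
    helices: dict mapping helix name -> (first_residue, last_residue),
             1-based and inclusive.
    """
    fractions = {}
    for name, (first, last) in helices.items():
        segment = ss_string[first - 1:last]
        fractions[name] = segment.count("H") / len(segment)
    return fractions


def first_formed_frame(ss_frames, helices, threshold=0.5):
    """First frame index at which each helix reaches the threshold, else None."""
    formed = {name: None for name in helices}
    for frame_index, ss_string in enumerate(ss_frames):
        for name, fraction in helix_fractions(ss_string, helices).items():
            if formed[name] is None and fraction >= threshold:
                formed[name] = frame_index
    return formed


# Hypothetical nine-residue chain with two helical segments, X and Y.
toy_helices = {"X": (1, 4), "Y": (6, 9)}
toy_frames = ["---------", "HHH------", "HHHH-HH--", "HHHH-HHHH"]
print(first_formed_frame(toy_frames, toy_helices))
```

Sorting the helices by the returned frame index gives a simple formation order; the result depends on the chosen threshold, so it is worth checking more than one value.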
CHAPTER 6
REFOLDING OF THE LOW TEMPERATURE UNFOLDED STATE RESULTS

6.0 Introduction

Cold denaturation of proteins is considered to occur just as high temperature or low pH denaturation does. Cold denaturation is believed to follow changes in the nature of the interactions between water and the non-polar groups of the protein. However, there is not much research using low temperature to study protein folding/unfolding. One experimental work applied a temperature close to 265 K in order to observe the unfolding of apoMb. Its conclusions about intermediate formation were not very clear, and it was not possible to obtain a good characterization of the unfolded states. In one computational study, the unfolding of apoMb was attempted in the range of 265-278 K to analyze the initial stages of unfolding. The authors found that it was also difficult to unfold apoMb by lowering the temperature, owing to the time restrictions imposed by molecular dynamics simulations. In this dissertation, it was not possible to unfold apoMb by lowering the temperature either. We had to use a different methodology to obtain the unfolded state, which is described in the next section. However, the method used to refold the protein once the unfolded structure was obtained was the same procedure used in the previous two Chapters.
6.1 Materials and Methods

In this Chapter, the same apoMb structure used in Chapters 4 and 5 was used to perform the low temperature simulations.

Because it was not possible to obtain an unfolded structure via molecular dynamics simulations at low temperature, the unfolded structures obtained in Chapter 4 (in which high temperature was applied to unfold apoMb) were used for the study of low temperature unfolding. This was done by taking the unfolded structure from the high temperature unfolding condition of Chapter 4, placing it in a water box, and applying low temperature using the NAMD program. The low temperature results obtained through this procedure were not biased, since (as shown below) they display differences between the apoMb folding route after high-temperature unfolding and after low temperature unfolding. Almost the same procedure used in Chapter 5 to prepare the apoMb-water system was followed here. The only difference is that a neutral protein was used for the low temperature unfolding, meaning that the glutamic acid, aspartic acid and histidine residues were not protonated. This neutral system was composed of 50416 atoms (15981 water molecules). Subsequently, minimization, heating (to 265 K) and equilibration runs were performed as already described in Chapter 5. The production run was also carried out at constant pressure (1 atm) for 10 ns. The unfolded structure generated was then taken, and the same SDEL refolding protocol used in Chapters 4 and 5 was applied to it.
6.2 Results and Conclusions

We show the average results of four different trajectories obtained through the cold simulations described above. Each trajectory also contains 1500 conformations, the same number used for the other two unfolding conditions.

The results of the folding pathway obtained in this Chapter are different from those in Chapters 4 (high temperature denaturing condition) and 5 (low pH denaturing condition). The low temperature results appear intermediate between the high-temperature and low pH results. The low temperature results also do not show any type of intermediate state populated along the trajectories simulated here. This is the overall conclusion reached after comparing the three different denaturing conditions studied in this dissertation to understand the folding pathway of apoMb. It is concluded that apoMb can fold via different routes, which depend on the unfolding condition used to obtain the denatured state. Consequently, the suggestion made in the previous Chapter that the folding pathway strongly depends on the conformational structure of the unfolded state is supported by the low temperature simulations done in this Chapter.

Figure 15 shows a contour plot of the population distribution with respect to Rg and the number of helical residues. The plot shows a region around Rg close to 20 Å where the helical content starts to increase with a mild exponential slope. This increase is not as steep as in the high-temperature results (figure 4), but it is sharper than in the results for the low pH condition (figure 10). There is an
interesting similarity between the population distributions of the low temperature and the low pH simulations. For both conditions, helices start to form when Rg ~ 20 Å. However, the analysis of the number of contacts against the relative number of helices populated during the refolding simulations after low temperature unfolding (shown later in figure 22) reveals a linear relation between these two properties. To anticipate this result, the plot in figure 22 does not have the flat region that we believe is related to the I-state population in the low pH simulations. Given that there are not many studies of the low temperature unfolding/folding of apoMb, this evidence suggests that even though there is a change in the distribution of conformations (figure 15) around Rg ~ 20 Å, we do not observe the I-state characterized in the low pH denaturation experiments and in our low pH denaturation simulations.

The population distribution of conformations as a function of the Rg and the number of native contacts (figure 16) shows similarities to the results at low pH. These two contour plots show that there was no bias in taking structures from the high-temperature runs from MOIL to produce the low temperature unfolded structures, because the results are closer to the low pH than to the high-temperature simulations. Thus, there is no reason to believe that computing the unfolded structure at low temperature starting from the high-temperature structures had an impact on the folding route obtained after low temperature unfolding. Our results indicate that native contacts form along the entire folding trajectory after the low temperature denaturation, although many of those contacts form early in the trajectories.

A similar conclusion can be drawn from figure 17, where the contacts between the
different helices of apoMb are followed during the simulation. These plots are also similar to the low pH results, in which there are few non-native contacts at the beginning of the folding. Most of the contacts observed initially in the trajectory are native contacts. However, the formation of the contacts within the AGH core occurs later in the simulation, as in the low pH denaturing condition. The contact maps (figure 17) are more similar to the low pH than to the high temperature denaturing condition. Figure 17a shows more contacts between the A helix and the H helix (VAL17-LYS140, GLU18-LYS140), and these contacts remain present in the native apoMb. Many native contacts between the B and E helices are also present from the beginning of the simulation (figure 17a). Contacts between the CD loop (ARG45) and the D helix (GLU54 and LYS56) appear at the beginning (figure 17a) and then disappear for a while during the simulation (figure 17c). They start to form again in figures 17d and 17e. They are also contacts present in the native apoMb. The D helix is small, and its contacts fluctuate more. Figures 17d and 17e are similar, with most native contacts already formed when compared to figure 6. The contacts between the B helix and the GH loop start to form at the end (figure 17e). The same is observed for the contacts between the C and G helices.

Figure 18 shows the formation of the individual helices along the trajectory. It displays more fluctuations than for the other two denaturing conditions. This occurs mainly for the C helix, one of the smallest helices in apoMb, which is considered to be very unstable until the end of the folding process. We do not see a pattern of early formation of the AGH core as we noticed in the same plot at low pH. Figure 18 is more
similar to the high-temperature case (figure 8), where the presence of the ABGH core was not as clear as in the low pH plot. In these low temperature results most helices, except for the C helix, are formed towards the end of the refolding trajectories.

A pictorial representation of the low temperature refolding is given in figure 19, where molecular snapshots along the folding trajectory are shown. Comparing this figure to figure 9, it is possible to see that the initial unfolded structure at low temperature is not as unfolded as the denatured structures of our high temperature runs.

These results make evident the main conclusion of this work: unfolded conformations derived from different denaturing conditions can have a different influence on the apoMb folding route. Explanations and reasons for those differences are given in the final two Chapters.
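The population distributions compared in this and the previous two Chapters (Rg against the number of helical residues) amount to a normalized two-dimensional histogram pooled over all conformations of all trajectories. The following Python sketch shows one way to build such a map. The bin widths, the input layout and the function name are assumptions made for illustration, not the settings used to produce the figures.

```python
from collections import Counter


def population_map(trajectories, rg_bin=1.0, helix_bin=5):
    """Fraction of conformations falling in each (Rg, helical residues) bin.

    trajectories: list of trajectories, each a list of
                  (rg, n_helical_residues) tuples, one per conformation.
    rg_bin, helix_bin: bin widths (illustrative defaults).
    Returns a dict mapping (rg_bin_index, helix_bin_index) -> fraction.
    """
    counts = Counter()
    total = 0
    for trajectory in trajectories:
        for rg, n_helical in trajectory:
            key = (int(rg // rg_bin), int(n_helical // helix_bin))
            counts[key] += 1
            total += 1
    return {key: n / total for key, n in counts.items()}


# Two short hypothetical trajectories (values are made up for the example).
toy_trajectories = [
    [(20.3, 12), (20.7, 14)],
    [(16.1, 60), (20.9, 11)],
]
toy_map = population_map(toy_trajectories)
print(toy_map)
```

Contouring such a map gives a plot of the same kind as the population distributions discussed above; replacing the helical-residue count with the number of native contacts gives the second type of plot.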
CHAPTER 7
CONCLUSIONS COMPARING THE THREE UNFOLDING CONDITIONS

The overall results of the folding pathway obtained in the previous two Chapters are rather different from our published results described in Chapter 4, in which high temperature was used to obtain the unfolded structure of apoMb. This is the main conclusion reached after the low pH and the low temperature simulations were done. It is observed that apoMb follows different routes of folding, depending on the unfolding condition used to generate the denatured state. Figure 20 shows a comparison between the variation of the Rg and the helical content for each of the three different initial unfolded states. It is evident that these properties evolve in different ways depending on the unfolding condition used to obtain the denatured state. The low pH curve (black line) of figure 18 is more similar to the results obtained in the low pH experiment done by Uzawa et al. (2004). The increase of helical content with the decrease of the radius of gyration is more sigmoidal, compared to the more exponential behavior at high temperature. The cold denaturation result (blue line) also shows a sigmoidal shape, but slightly shifted to the left (towards the high temperature curve).

Our results show a two-state process for the refolding of the high temperature denatured initial state of apoMb; a more rapid collapse of the protein with no
formation of helices arose when compared to the other two denaturing conditions. It is claimed that a hydrophobic collapse is an important event in apoMb folding, but it is not possible to identify the most important contacts forming the helices during the folding pathway from our high temperature results (figure 21). This plot shows the contacts during the formation of the native helices of apoMb, and the fluctuation in the data can be observed.

Table 1 (Appendix page A-24) shows important contacts formed mainly between hydrophobic residues (contacts 2, 3, 6, 7, and 8 in Table 1) during the refolding of the low pH denatured structures. These contacts are important since they start to form in the flat region of the plot of figure 23 and keep increasing until the native structure of apoMb is reached. This is evidence that the hydrophobic collapse is also an important step in apoMb folding. There are also contacts formed between polar amino acids (contacts 1, 4, and 12; see Table 1) and two contacts containing charged residues (contacts 10 and 11; see Table 1). All of these contacts are formed during apoMb folding and are present in the native protein.

A variety of interactions take place during the folding process. Dipolar bonds are important for the specific structural interactions that stabilize the α-helices. Hydrophobic interactions are also important, since they stabilize the native structure by driving the overall collapse of the chain, and most of the important contacts between amino acids seen in Table 1 (Appendix page A-24) are hydrophobic.

In the low temperature denaturing condition, the above specific important contacts
present during the low pH folding (red dots in figure 23) are also more difficult to delineate (figure 22), because the formation of the contacts relative to the formation of native helices is linear. However, the native contacts present towards the end of the simulation are the same for all three denaturing conditions. This means that the contacts important for forming the native apoMb are present in figures 21-23. The difference is when they start to form and how fast they are formed. Figure 21 shows that these contacts are formed towards the end of the simulation. Figures 22 (low temperature unfolding condition) and 23 (low pH unfolding condition) display these contacts being formed earlier when compared to figure 21 (high temperature unfolding condition). These three plots provide further evidence of structural differences in apoMb folding starting from different initial states.

Figures 24 to 31 (see Appendix) show the location of two of the contacts described in Table 1 (Appendix page A-24), to give an idea of how different they are for each unfolded structure coming from a different denaturing condition. The figures display these contacts for each of the three denaturing conditions applied in this dissertation. Figure 24 shows the contact HIS24-HIS119 in the native apoMb. The native distance of this contact is 0.9. After the unfolding, the distances between these residues are 4.4, 2.2, and 5.6 for the high temperature, low pH and low temperature denaturing conditions, respectively (figures 25, 26 and 27). The other contact shown is between HIS116 and GLY124. In the native structure, the distance between these two amino acids is 1.2. For the high temperature, low pH
and low temperature unfolded structures, these distances are 1.3, 1.9, and 3.8, respectively. The other distances are shown in Table 1 (Appendix page A-24). An important point here is the puzzle in the protein folding process regarding the role of local interactions (between residues that are relatively close in sequence) compared to interactions between amino acids that are farther apart in sequence (medium-range and long-range interactions). It is expected that the long-range interactions are important for the tertiary structure, whereas the local interactions define the secondary structure. However, the specific role of each type of interaction in the folding process of proteins remains unknown.

So far we have described the structural analysis extracted from the simulations done in this dissertation and provided explanations for the collected data. In the next paragraphs, general ideas are discussed, partially based on other works on apoMb.

In the low pH and low temperature simulations, one possible explanation for a different unfolding of the protein is that the hydrophobic core of apoMb is surrounded by water (explicit water molecules), and the water molecules can structurally affect the apoMb atoms. It is believed that the backbone hydrogen bonds of the protein core are minimized while a few water molecules form hydrogen bonds with the protonated side chains of the protein, penetrating into the interior of the native apoMb and unfolding the protein less than the high temperature perturbation does. The microscopic unfolding mechanism of the protein can be related to the interaction between water and the protein residues. The low pH simulations do not show the same folding
behavior considering the distribution of populations over the number of helices and Rg (figure 10 and figure 4). One possible reason is that the pH condition can depend on the interactions with the solvent (water molecules), and the electrostatic repulsion that accompanies the decrease in pH makes these interactions weaker. However, more studies on the role of water molecules in the protein folding process are needed, since there are unsolved questions regarding the structural nature of the interactions between the protein and the water molecules. Moreover, water-induced effects in the simulations, such as desolvation, cannot be captured by the implicit solvent models that we used.

The conformations of the unfolded macrostate of the low pH condition are thus rather different from the high-temperature unfolded ensemble. This could be sufficient for the protein to follow another folding route on the funneled energy landscape surface (figure 1, Appendix).

Hence, our results reinforce the suggestion that the folding pathway depends strongly on the conformational structure of the unfolded state, which has substantial conformational freedom. Each side chain of the residues can rotate about its single bonds, so that the groups move relative to one another and the structure is not rigid. In our investigation, it was observed that different non-specific interactions occur in each unfolding condition, making it difficult to establish a definite pattern for apoMb folding.

Despite the fact that the data provided in the trajectories generated in this dissertation can only be studied using structural analysis, it was briefly discussed in the
first chapter of this dissertation (and thermodynamic explanations were also provided above regarding the differences in the folding pathways for the three unfolding conditions studied in this work) about the current view of the protein folding process, the complex energy landscape surface. Figure 1 (Appendix) illustrates the intrinsically high conformational entropy of the denatured state, which presents a large number of accessible states. In contrast, the native state carries low entropy and is restricted to the configurations that the structure can adopt in its folded state. Nevertheless, a protein folds despite the thermodynamically unfavorable loss of entropy. Therefore, this entropy loss has to be balanced by a simultaneous decrease in the enthalpy of the system, which is the current explanation of the thermodynamics of protein folding. It is also known that it is not only the thermodynamics of the protein that is important in this process. The environment surrounding the polypeptide chain, i.e., the solvent, also contributes to the changes in entropy and enthalpy during the folding process.

The findings of this study are relevant in that they provide a better understanding of the mechanism by which a protein folds. This may improve the production of drugs against diseases associated with misfolded proteins. Another application of this study is in the nanomaterials area. ApoMb was chosen for this study because of the large amount of available data that can guide a theoretical study like this one. The relevance of this work is the finding that more attention must be paid to the denatured state of proteins. As seen in the results of this work, the significance of the unfolded state in understanding the folding process is associated to the mechanism of this
process. The denatured state should play a dominant role in the folding process and in the stability of the native protein.

It is important to identify the folding-initiation sites in order to understand the protein folding mechanism. These sites present some residual structure in unfolded proteins, and a better definition of the unfolded state seems to be needed, since it is the starting point of the folding process. For many proteins, the unfolded state is not random and has a defined structure. The detection of residual structure in unfolded proteins may lead to important hints about those initiation sites in protein folding. A large number of proteins have hydrophobic regions clustered in the protein core as a residual structure in the denatured state. These regions may act as seeds in the folding process. The existence of these hydrophobic regions is one of the homogeneities found in the very heterogeneous ensemble of protein structures in the denatured state.
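The entropy-enthalpy balance invoked in this Chapter can be written compactly with the standard Gibbs relation. This is a textbook expression added here for reference, not a quantity computed from the simulations in this dissertation, and the subscripts are notation introduced for this note.

```latex
\Delta G_{\mathrm{fold}} = \Delta H_{\mathrm{fold}} - T\,\Delta S_{\mathrm{fold}},
\qquad
\Delta G_{\mathrm{fold}} < 0 \ \text{ requires } \ \Delta H_{\mathrm{fold}} < T\,\Delta S_{\mathrm{fold}} .
```

Here all changes refer to the full system, protein plus solvent. The chain contribution to the entropy change is negative on folding, so folding is favorable only when the enthalpy decrease, together with any entropy gained by the solvent, outweighs the loss of conformational entropy.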
CHAPTER 8 FINAL REMARKS

The overall results for each specific denaturing condition described before are not similar to each other, and the high temperature results do not resemble results from previous works, because the trajectories do not show an intermediate state. We observed differences in the folding pathway that depend on the nature of the initial unfolded state. This has three possible explanations. The first is that the conformational ensembles of the three different unfolding conditions are determinant in the way the protein folds. The second concerns the statistics of the computer simulations: the number of folding trajectories computed may not be enough to reproduce the averaged results obtained in the experiments. The third possible explanation is that most of the papers on this topic analyzed the variation of properties during the unfolding process, while the perturbations were still being applied to the system (as in the pH experiments and simulations, for example [27-29]). The T-jump experiments are somewhat different, because the T-jump was applied first and the relaxation of the protein was studied afterwards. Even so, these papers measured different properties of the refolding process, and there was no clear formation of any type of intermediate state in the apoMb folding route in these T-jump experiments.

The evidence found in this work shows a path-dependent folding of apoMb, suggesting a more complex energy landscape for this protein. Thus, the existence of different folding routes according to the nature of the denatured state reinforces the theory of an energy landscape consisting of multiple local minima. An unfolded structure that starts to fold in a different area of the energy surface can follow a different folding route, passing through each local energy minimum the system finds on the way to the native state (global energy minimum). This agrees with recent theories of protein folding, which state that there is conformational heterogeneity of the energy surface, and that an individual protein having a large ensemble of unfolded states may follow different routes in conformation space from the denatured state towards the native state and vice versa. The kinetics experiments indicate a two-state mechanism of folding, evidenced by laser-induced temperature jumps that show an exponential behavior for small jumps. However, for larger jumps, this behavior becomes increasingly non-exponential, implying that there can be multiple folding pathways. Thus, as the main result of this dissertation, it can be speculated that different microstates of the same ensemble of a protein can fold through a three-state (or multistate) folding route downhill towards the minimum of the native state energy, or can even cross the energy barrier between the two states (unfolded and native). These processes can occur simultaneously in the same ensemble, for each denaturing condition.

The starting point for protein folding is the unfolded state. Thus, unfolded states are especially important for the folding reaction and must act to guide the folding process. Understanding the nature of the denatured state should provide important clues to decipher the route a protein follows towards the folded state. The existence of native-like interactions in the unfolded state should increase the probability that a protein reaches the native state. One obstacle to the determination of the folding route of proteins is the lack of characterization of the denatured state, leading to a poor understanding of the denatured state ensemble. There is no solved tertiary structure of the unfolded state that could show the most important interactions present in this starting state. This information would be valuable for identifying the differences in the initial interactions due to the different conditions under which a protein was unfolded. The physical interactions that may stabilize the native protein structure are not easy to identify, because proteins do not unfold to a simple state, as said before. They display a poorly understood ensemble of denatured states (the polypeptide chains do not become random coils after their folded structure breaks down). It is difficult to determine the structure of the unfolded state, because it represents a multitude of different structures consisting of rapidly interchanging conformations. Consequently, it has not been possible to identify the stabilizing interactions that could induce the ensemble of chain conformations that are more populated in the denatured state. Another difficulty is that some of the analyzed interactions retained in the residual structure of the unfolded state are not seen in the native state. Thermodynamic studies are also helpful, since the thermodynamic properties are affected by the way a protein was unfolded.

Therefore, the main conclusion of this work is that the nature of the initial denatured ensemble critically affects the dynamics of the folding process.
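The link drawn in this chapter between non-exponential relaxation and multiple folding pathways can be illustrated with a toy kinetic model. The sketch below assumes parallel first-order pathways with hypothetical rates and weights (none of these numbers come from the simulations or from the T-jump experiments). It checks whether the logarithm of the relaxing signal is linear in time, which holds only for a single exponential.

```python
import math


def relaxation(t, rates, weights):
    """Unfolded population at time t for parallel first-order pathways.

    rates and weights are hypothetical illustration values.
    """
    return sum(w * math.exp(-k * t) for k, w in zip(rates, weights))


def log_curvature(rates, weights, t=1.0, dt=0.5):
    """Second finite difference of ln S(t).

    It vanishes for a single exponential (two-state kinetics) and is
    positive when several pathways with different rates contribute.
    """
    logs = [math.log(relaxation(t + i * dt, rates, weights)) for i in (-1, 0, 1)]
    return logs[0] - 2.0 * logs[1] + logs[2]


# One pathway: ln S(t) is a straight line.
single = log_curvature(rates=[2.0], weights=[1.0])
# Two pathways with different rates: ln S(t) bends upward.
mixed = log_curvature(rates=[0.5, 5.0], weights=[0.5, 0.5])
```

A nonzero curvature by itself does not identify the pathways; it only signals that a single two-state rate is not enough to describe the decay.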
REFERENCES

[1] Nelson, D. L., Cox, M. M. Lehninger Principles of Biochemistry; 4th ed.; New York: Worth Publishers, 2005.
[2] Mei, G.; DiVenere, A.; Buganza, M.; Vecchini, P.; Rosato, N.; FinazziAgro, A. Biochemistry 1997, 36, 10917.
[3] Salvucci, M. E.; Osteryoung, K. W.; Crafts-Brandner, S. J.; Vierling, E. Plant Physiology 2001, 127, 1053.
[4] Shortle, D.; Ackerman, M. S. Science 2001, 293, 487.
[5] Cho, J. H.; Raleigh, D. P. Journal of Molecular Biology 2005, 353, 174.
[6] Anfinsen CB, Haber E, Sela M, White FH. Proceedings of the National Academy of Sciences of the United States of America 1961, 47, 1309.
[7] Anfinsen CB. Science 1973, 181, 223.
[8] Yon JM. Journal of Cellular and Molecular Medicine 2002, 6, 307.
[9] Uversky VN. Cellular and Molecular Life Sciences 2003, 60, 1852.
[10] Daggett V, Fersht AR. Trends in Biochemical Sciences 2003, 28, 18.
[11] Eaton WA, Munoz V, Hagen SJ, Jas GS, Lapidus LJ, Henry ER, Hofrichter J. Annual Review of Biophysics and Biomolecular Structure 2000, 29, 327.
[12] Levinthal C. Journal De Chimie Physique Et De Physico-Chimie Biologique 1968, 65, 44.
[13] Fersht AR. Current Opinion in Structural Biology 1997, 7, 3.
[14] Fersht AR. Proceedings of the National Academy of Sciences of the United States of America 1995, 92, 10869.
[15] Itzhaki LS, Otzen DE, Fersht AR. Journal of Molecular Biology 1995, 254, 260.
[16] Kim PS, Baldwin RL. Annual Review of Biochemistry 1982, 51, 459.
[17] Ptitsyn OB, Rashin AA. Biophysical Chemistry 1975, 3, 1.
[18] Karplus M, Weaver DL. Nature 1976, 260, 404.
[19] Dill KA. Biochemistry 1985, 24, 1501.
[20] Plotkin SS, Onuchic JN. Proceedings of the National Academy of Sciences of the United States of America 2000, 97, 6509.
[21] Leeson DT, Gai F, Rodriguez HM, Gregoret LM, Dyer RB. Proceedings of the National Academy of Sciences of the United States of America 2000, 97, 2527.
[22] Onuchic JN, LutheySchulten Z, Wolynes PG. Annual Review of Physical Chemistry 1997, 48, 545.
[23] Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG. Proteins-Structure Function and Genetics 1995, 21, 167.
[24] Onuchic JN, Wolynes PG. Current Opinion in Structural Biology 2004, 14, 70.
[25] Kim PS, Baldwin RL. Annual Review of Biochemistry 1982, 51, 459.
[26] Kim PS, Baldwin RL. Annual Review of Biochemistry 1990, 59, 631.
[27] Onufriev A, Case DA, Bashford D. Journal of Molecular Biology 2003, 325, 555.
[28] Tiradorives J, Jorgensen WL. Biochemistry 1993, 32, 4175.
[29] Uzawa T, Akiyama S, Kimura T, Takahashi S, Ishimori K, Morishima I, Fujisawa T. Proceedings of the National Academy of Sciences of the United States of America 2004, 101, 1171.
[30] Ptitsyn OB. Current Opinion in Structural Biology 1995, 5, 74.
[31] Ptitsyn OB. FEBS Letters 1991, 285, 176.
[32] Kuwata K, Shastry R, Cheng H, Hoshino M, Batt CA, Goto Y, Roder H. Nature Structural Biology 2001, 8, 151.
[33] Cheng X, Schoenborn BP. Acta Crystallogr., Sect. B 1990, 46, 195.
[34] Jamin M. Protein and Peptide Letters 2005, 12, 229.
[35] Nishimura C, Wright PE, Dyson HJ. Journal of Molecular Biology 2003, 334, 293.
[36] Wright, P. E. and Baldwin, R. L. Case study 1: The folding process of apomyoglobin. In: Mechanisms of Protein Folding, 2nd Ed. Pain, R. (Ed.). Oxford University Press, New York, p. 309, 2000.
[37] Wittenberg JB, Wittenberg BA. Journal of Experimental Biology 2003, 206, 2011.
[38] Brunori M. Biophysical Chemistry 2000, 86, 221.
[39] Gast K, Damaschun H, Misselwitz R, Mullerfrohne M, Zirwer D, Damaschun G. European Biophysics Journal with Biophysics Letters 1994, 23, 297.
[40] Griko YV, Privalov PL, Venyaminov SY, Kutyshenko VP. Journal of Molecular Biology 1988, 202, 197.
[41] Hughson FM, Wright PE, Baldwin RL. Science 1990, 249, 1544.
[42] Eliezer D, Jennings PA, Wright PE, Doniach S, Hodgson KO, Tsuruta H. Science 1995, 270, 487.
[43] Eliezer D, Yao J, Dyson HJ, Wright PE. Nature Structural Biology 1998, 5, 148.
[44] Loh SN, Kay MS, Baldwin RL. Proceedings of the National Academy of Sciences of the United States of America 1995, 92, 5446.
[45] Jennings PA, Wright PE. Science 1993, 262, 892.
[46] Miksovska J, Larsen RW. Journal of Protein Chemistry 2003, 22, 387.
[47] Kay MS, Ramos CHI, Baldwin RL. Proceedings of the National Academy of Sciences of the United States of America 1999, 96, 2007.
[48] Luo Y, Baldwin RL. Biochemistry 2001, 40, 5283.
[49] Sirangelo I, Tavassi S, Irace G. Biochim. Biophys. Acta 2000, 1476, 173.
[50] Jamin M, Baldwin RL. Nature Structural Biology 1996, 3, 613.
[51] Kiefhaber T, Baldwin RL. Journal of Molecular Biology 1995, 252, 122.
[52] Kay MS, Baldwin RL. Nature Structural Biology 1996, 3, 439.
[53] Sabelko J, Ervin J, Gruebele M. Journal of Physical Chemistry B 1998, 102, 1806.
[54] Ballew RM, Sabelko J, Gruebele M. Proceedings of the National Academy of Sciences of the United States of America 1996, 93, 5759.
[55] Smeller, L. Biochim. Biophys. Acta 2002, 1595, 11.
[56] Privalov, P. L. Crit. Rev. Biochem. Mol. Biol. 1990, 25, 281.
[57] Ervin, J.; Larios, E.; Osvath, S.; Schulten, K.; Gruebele, M. Biophysical Journal 2002, 83, 473.
[58] Nishii I, Kataoka M, Tokunaga F, Goto Y. Biochemistry 1994, 33, 4903.
[59] Jacobs, D. J.; Wood, G. G. Biopolymers 2004, 75, 1.
[60] Robinson, G. W.; Cho, C. H. Biophysical Journal 1999, 77, 3311.
[61] Choi, H. S.; Huh, J.; Jo, W. H. Biomacromolecules 2004, 5, 2289.
[62] Lopez CF, Darst RK, Rossky PJ. J. Phys. Chem. B 2008, 112, 5961.
[63] Marqués MI. Phys. Stat. Sol. (A) 2006, 203, 1487.
[64] Nishii I, Kataoka M, Goto Y. 1995, 250, 223.
[65] Hummer G, Garde S, Garcia AE, Paulaitis ME, Pratt LR. Proceedings of the National Academy of Sciences of the United States of America 1998, 95, 1552.
[66] Hansson T, Oostenbrink C, van Gunsteren WF. Current Opinion in Structural Biology 2002, 12, 190.
[67] Karplus M. Accounts of Chemical Research 2002, 35, 321.
[68] Norberg J, Nilsson L. Quarterly Reviews of Biophysics 2003, 36, 257.
[69] Vendruscolo M, Paci E. Current Opinion in Structural Biology 2003, 13, 82.
[70] Elber R, Ghosh A, Cardenas A. Accounts of Chemical Research 2002, 35, 396.
[71] Cardenas AE, Elber R. Biophysical Journal 2003, 85, 2919.
[72] Elber R, Cardenas A, Ghosh A, Stern HA. Advances in Chemical Physics 2003, 126, 93.
[73] Tobi D, Shafran G, Linial N, Elber R. Proteins-Structure Function and Genetics 2000, 40, 71.
[74] Day R, Bennion BJ, Ham S, Daggett V. Journal of Molecular Biology 2002, 322, 189.
[75] Daggett V. Accounts of Chemical Research 2002, 35, 422.
[76] Gorski SA, Capaldi AP, Kleanthous C, Radford SE. Journal of Molecular Biology 2001, 312, 849.
[77] Goto Y, Takahashi N, Fink AL. Biochemistry 1990, 29, 3480.
[78] Ghosh A, Elber R, Scheraga HA. Proceedings of the National Academy of Sciences of the United States of America 2002, 99, 10394.
[79] Cardenas AE, Elber R. Proteins-Structure Function and Genetics 2003, 51, 245.
[80] Yunger J. Physica A-Statistical Mechanics and Its Applications 2007, 386, 791.
[81] Landau, L. D., and Lifshitz, E. M. Mechanics; Pergamon Press, Oxford, 1984.
[82] Gray CG, Taylor EF. American Journal of Physics 2007, 75, 434.
[83] Czerminski R, Elber R. Journal of Chemical Physics 1990, 92, 5580.
[84] Czerminski R, Elber R. International Journal of Quantum Chemistry 1990, 167.
[85] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. Nucleic Acids Research 2000, 28, 235.
[86] Elber R, Roitberg A, Simmerling C, Goldstein R, Li HY, Verkhivker G, Keasar C, Zhang J, Ulitsky A. Computer Physics Communications 1995, 91, 159.
[87] Tsui V, Case DA. Biopolymers 2000, 56, 275.
[88] Weiner SJ, Kollman PA, Case DA, Singh UC, Ghio C, Alagona G, Profeta S, Weiner P. Journal of the American Chemical Society 1984, 106, 765.
[89] Jorgensen WL, Tiradorives J. Journal of the American Chemical Society 1988, 110, 1657.
[90] McNutt M, Mullins LS, Raushel FM, Pace CN. Biochemistry 1990, 29, 7572.
[91] Gilmanshin R, Gulotta M, Dyer RB, Callender RH. Biochemistry 2001, 40, 5127.
[92] Ferguson N, Capaldi AP, James R, Kleanthous C, Radford SE. Journal of Molecular Biology 1999, 286, 1597.
[93] Ferguson N, Li W, Capaldi AP, Kleanthous C, Radford SE. Journal of Molecular Biology 2001, 307, 393-405.
[94] Amatori A, Tiana G, FerkinghoffBorg J, Broglia RA. Proteins 2008, 1047.
[95] Gruebele M, Sabelko J, Ballew R, Ervin J. Accounts of Chemical Research 1998, 31, 699.
[96] Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kale L, Schulten K. Journal of Computational Chemistry 2005, 26, 1781.
[97] Humphrey W, Dalke A, Schulten K. "VMD Visual Molecular Dynamics", J. Molec. Graphics 1996, 14, 33.
[98] Dametto M, Cárdenas AE. Journal of Physical Chemistry B 2008, 112, 9501.
[99] Levy Y, Onuchic JN. Annu. Rev. Biophys. Biomol. Struct. 2006, 35, 389.
[100] Zhuravlev PI, Materese CK, Papoian GA. Journal of Physical Chemistry B 2009, 113, 8800.
Figure 1. Representation of the folding energy landscape. The rugged, funneled energy landscape is the prevailing view in the modern statistical mechanical theory of protein folding. (J. Phys. Chem. B, 2009, 113, 26, 8800).
No.  Residue    No.  Residue    No.  Residue
  1  VAL         52  GLU        103  TYR
  2  LEU         53  ALA        104  LEU
  3  SER         54  GLU        105  GLU
  4  GLU         55  MET        106  PHE
  5  GLY         56  LYS        107  ILE
  6  GLU         57  ALA        108  SER
  7  TRP         58  SER        109  GLU
  8  GLN         59  GLU        110  ALA
  9  LEU         60  ASP        111  ILE
 10  VAL         61  LEU        112  ILE
 11  LEU         62  LYS        113  HIS
 12  HIS         63  LYS        114  VAL
 13  VAL         64  HIS        115  LEU
 14  TRP         65  GLY        116  HIP
 15  ALA         66  VAL        117  SER
 16  LYS         67  THR        118  ARG
 17  VAL         68  VAL        119  HIS
 18  GLU         69  LEU        120  PRO
 19  ALA         70  THR        121  GLY
 20  ASP         71  ALA        122  ASP
 21  VAL         72  LEU        123  PHE
 22  ALA         73  GLY        124  GLY
 23  GLY         74  ALA        125  ALA
 24  HIS         75  ILE        126  ASP
 25  GLY         76  LEU        127  ALA
 26  GLN         77  LYS        128  GLN
 27  ASP         78  LYS        129  GLY
 28  ILE         79  LYS        130  ALA
 29  LEU         80  GLY        131  MET
 30  ILE         81  HIP        132  ASN
 31  ARG         82  HIS        133  LYS
 32  LEU         83  GLU        134  ALA
 33  PHE         84  ALA        135  LEU
 34  LYS         85  GLU        136  GLU
 35  SER         86  LEU        137  LEU
 36  HIP         87  LYS        138  PHE
 37  PRO         88  PRO        139  ARG
 38  GLU         89  LEU        140  LYS
 39  THR         90  ALA        141  ASP
 40  LEU         91  GLN        142  ILE
 41  GLU         92  SER        143  ALA
 42  LYS         93  HIS        144  ALA
 43  PHE         94  ALA        145  LYS
 44  ASP         95  THR        146  TYR
 45  ARG         96  LYS        147  LYS
 46  PHE         97  HIS        148  GLU
 47  LYS         98  LYS        149  LEU
 48  HIS         99  ILE        150  GLY
 49  LEU        100  PRO        151  TYR
 50  LYS        101  ILE        152  GLN
 51  THR        102  LYS        153  GLY

Helix      From (amino acid)   To (amino acid)
Helix A      4                  18
Helix B     22                  34
Helix C     38                  41
Helix D     52                  56
Helix E     60                  77
Helix F     83                  96
Helix G    102                 117
Helix H    125                 149

Figure 2. Myoglobin amino acid sequence, with the range of amino acid numbers that corresponds to each helix in the protein.
Figure 3. Native structure of apomyoglobin, showing helices A to H. This molecular view was extracted from the crystal structure of myoglobin (PDB ID 2mb5) after the removal of the heme group. The picture was made using VMD (Visual Molecular Dynamics), version 1.8.6.
Figure 4. Contour plot of the radius of gyration and helicity for the high temperature condition. It shows the population distribution of conformations having different values of radius of gyration and helical content for the four folding trajectories computed. A residue is considered to have a helical conformation if its dihedral angles fall in the ranges -140.0 to -37.5 and -69.0 to -21.0 degrees. Axes: radius of gyration (angstroms) and number of helical residues.
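The two quantities plotted in this figure can be computed from a single conformation as sketched below. This is a minimal illustration and not the analysis code used for the dissertation: the radius of gyration is taken here as unweighted over whatever coordinates are supplied, the bounds are treated as inclusive, and the assignment of the first window to phi and the second to psi is an assumption, since the symbols are missing from the caption.

```python
import math

# Dihedral windows in degrees, read from the figure captions as ranges.
# Which window belongs to phi and which to psi is assumed here.
PHI_WINDOW = (-140.0, -37.5)
PSI_WINDOW = (-69.0, -21.0)


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)


def count_helical_residues(dihedrals):
    """Count residues whose (phi, psi) pair lies inside both windows."""
    return sum(
        1
        for phi, psi in dihedrals
        if PHI_WINDOW[0] <= phi <= PHI_WINDOW[1]
        and PSI_WINDOW[0] <= psi <= PSI_WINDOW[1]
    )


# Toy inputs: two points 2 units apart, and three (phi, psi) pairs.
rg = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
n_helical = count_helical_residues([(-60.0, -45.0), (-120.0, 130.0), (-57.0, -47.0)])
```

Collecting the pair (radius of gyration, helical count) for every saved configuration and histogramming the pairs would give a population map of the kind shown in the contour plot.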
Figure 5. Number of native contacts and Rg for the high temperature condition. Contour plot representing the population distribution of structures for the four folding trajectories as a function of the number of native contacts and radius of gyration. Axes: number of native contacts and radius of gyration (angstroms).
Figure 6. Plot showing amino acid contacts between the helices (A-H) in the native structure of apomyoglobin. Two amino acids are considered in contact if they are at least five residues apart in the sequence and the distance between the centers of mass of their side chains is within 6.5 Å. Both axes run over residue number (20 to 160), with helices A to H marked.
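The contact criterion in this caption can be sketched as a short function. The sketch assumes that "five residues away" means a sequence separation of at least five and that the 6.5 cutoff is in angstroms; the function name and the input format (one side-chain center of mass per residue) are hypothetical choices made for illustration.

```python
import math


def find_contacts(centers, min_separation=5, cutoff=6.5):
    """Return residue index pairs (i, j) that satisfy the contact rule.

    centers: list of (x, y, z) side-chain centers of mass, one per residue.
    min_separation: minimum sequence separation j - i (assumed "at least").
    cutoff: maximum distance between the two centers (assumed angstroms).
    """
    pairs = []
    for i in range(len(centers)):
        for j in range(i + min_separation, len(centers)):
            if math.dist(centers[i], centers[j]) <= cutoff:
                pairs.append((i, j))
    return pairs


# Toy chain along the x axis: only residues 0 and 5 are far enough apart
# in sequence and still within the distance cutoff.
toy_centers = [(float(x), 0.0, 0.0) for x in (0, 1, 2, 3, 4, 5, 20)]
toy_contacts = find_contacts(toy_centers)
```

Applying the same function to the native structure gives the reference set of native contacts; a contact found in a trajectory frame counts as native when its pair is in that set.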
Figure 7. Plots showing amino acid contacts between the helices (A-H) along the four high temperature folding trajectories. (a) Sum of contacts for the first 300 configurations in the trajectory, (b) contacts from structure 301 to 600, (c) from structure 601 to 900, (d) from 901 to 1200, (e) from 1201 to 1500. Both axes of each panel run over residue number (20 to 160), with helices A to H marked.
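The per-panel sums described in this caption amount to counting, for each residue pair, how many configurations in a block of 300 contain that contact. A minimal sketch follows, assuming each configuration has already been reduced to a set of contact pairs; the function name and data layout are hypothetical and only one trajectory is handled.

```python
def windowed_contact_counts(frames, window=300):
    """Count contact occurrences in consecutive blocks of configurations.

    frames: list of sets of (i, j) residue pairs, one set per configuration.
    Returns one dict per block, mapping each pair to the number of
    configurations in that block where the pair is in contact.
    """
    blocks = []
    for start in range(0, len(frames), window):
        counts = {}
        for contacts in frames[start:start + window]:
            for pair in contacts:
                counts[pair] = counts.get(pair, 0) + 1
        blocks.append(counts)
    return blocks


# Toy trajectory of three configurations, grouped in blocks of two.
toy_frames = [{(0, 5)}, {(0, 5), (1, 7)}, {(1, 7)}]
toy_blocks = windowed_contact_counts(toy_frames, window=2)
```

With 1500 configurations and the default block size, the function returns five dictionaries, one for each of panels (a) to (e).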
Figure 8. The variation of the helical content for each specific helix (A to H) of apomyoglobin as a function of the normalized path length. This is an average over the four high temperature folding trajectories computed. Axes: path length and % right helix.
Figure 9. Molecular snapshots along one of the high temperature folding trajectories of apomyoglobin. The last structure (Rg = 14.48 Å) is the native conformation. The snapshots taken were equally spaced along the trajectory.
Figure 10. Contour plot showing the population distribution of conformations having different values of radius of gyration and helical content for the four low pH folding trajectories computed. A residue is considered to have a helical conformation if its dihedral angles fall in the ranges -140.0 to -37.5 and -69.0 to -21.0 degrees.
Figure 11. Contour plot representing the population distribution of structures for the four low pH folding trajectories as a function of the number of native contacts and radius of gyration.
Figure 12. Plots showing amino acid contacts between the helices (A-H) along the four low pH folding trajectories. (a) Sum of contacts for the first 300 configurations in the trajectory, (b) contacts from structure 301 to 600, (c) from structure 601 to 900, (d) from 901 to 1200, (e) from 1201 to 1500. Both axes of each panel run over residue number (20 to 140), with helices A to H marked. Panel titles: 300, 600, 900, 1200 and 1500 (smd9_low_pH).
Figure 13. The variation of the helical content for each specific helix (A to H) of apomyoglobin as a function of the normalized path length. This is an average over the four low pH folding trajectories computed. Axes: path length and % right helix.
Figure 14. Molecular snapshots along one of the low pH folding trajectories of apomyoglobin. The last structure (Rg = 14.48 Å) is the native conformation. The snapshots taken were equally spaced along the trajectory. Snapshot labels: Rg = 24.60, 22.98, 20.85, 17.84, 15.68, 14.48.
Figure 15. Contour plot showing the population distribution of conformations having different values of radius of gyration and helical content for the four low temperature folding trajectories computed. A residue is considered to have a helical conformation if its dihedral angles fall in the ranges -140.0 to -37.5 and -69.0 to -21.0 degrees.
Figure 16. Contour plot representing the population distribution of structures for the four low temperature folding trajectories as a function of the number of native contacts and radius of gyration.
Figure 17. Plots showing amino acid contacts between the helices (A-H) along the four low temperature folding trajectories. (a) Sum of contacts for the first 300 configurations in the trajectory, (b) contacts from structure 301 to 600, (c) from structure 601 to 900, (d) from 901 to 1200, (e) from 1201 to 1500. Both axes of each panel run over residue number (20 to 140), with helices A to H marked. Panel titles: 300, 600, 900, 1200 and 1500 (cold265_2200).
Figure 18. The variation of the helical content for each specific helix (A to H) of apomyoglobin as a function of the normalized path length. This is an average over the four low temperature folding trajectories computed. Axes: path length and % right helix.
Figure 19. Molecular snapshots along one of the low temperature folding trajectories of apomyoglobin. The last structure (Rg = 14.48 Å) is the native conformation. The snapshots taken were equally spaced along the trajectory. Snapshot labels: Rg = 25.55, 22.65, 20.53, 18.40, 16.41, 14.48.
Figure 20. Comparison of the behavior of the radius of gyration (Rg) as a function of the helical content for the folding trajectories obtained from the three different denaturing conditions analyzed in this work. This is an average over the different trajectories computed at each condition. Axes: Rg and % right helix (total). Curves: high temperature unfolding, low temperature unfolding and low pH unfolding.
Figure 21. Plot showing the total number of contacts as a function of the formation of the eight helices present in apoMb, for the high temperature condition. The points are widely scattered when these two properties are considered together. The red dots are the same contacts that form intensively in Figure 23. Axes: helical content and number of contacts.
Figure 22. The information in this plot is the same as in the previous one, but for the low temperature condition. It shows the development of the total number of contacts versus the formation of all eight helices of apoMb. A linear accumulation of contacts with the formation of the helices can be observed. The red dots are the same contacts that are displayed in Figure 23. Axes: helical content and number of contacts.
Figure 23. Total number of contacts versus helical content for the low pH condition. The distribution in this plot is more suggestive of the accumulation of an intermediate state. The evolution of the contacts with the formation of the helices shows a flat region, indicative of another state in the folding route of apoMb. The red dots are the contacts that start to form intensively when the helical content is about 40%. These contacts keep increasing during the simulation, and they are present in native apoMb (in relatively large numbers compared with other contacts). Axes: helical content and number of contacts.
No.  Helices or loops  Residues        Native distance  Unfolded distance
                                                        high-temp  low pH  low temp
 1   A-(EF)            HIS12-GLY80     1.1              2.2        3.3     3.1
 2   A-(GH)            VAL17-PRO120    0.8              3.5        5.2     6.4
 3   A-(GH)            VAL17-PHE123    0.7              4.9        5.0     4.7
 4   B-(GH)            HIS24-HIS119    0.9              3.9        2.2     5.6
 5   B-(GH)            GLY25-PRO120    0.9              3.8        1.9     2.2
 6   B-G               LEU29-ILE112    0.8              3.1        2.4     7.6
 7   B-G               LEU32-ILE111    0.7              3.2        2.1     7.6
 8   B-G               LEU32-LEU115    0.7              3.5        2.2     7.4
 9   B-G               PHE33-SER108    0.6              3.2        1.8     7.6
10   (BC)-D            SER35-GLU52     0.5              2.1        2.8     6.3
11   (CD)-D            LYS50-LYS56     0.7              1.2        1.6     7.3
12   G-(GH)            HIS116-GLY124   1.2              1.3        1.9     3.8
13   G-(GH)            SER117-GLY124   1.4              1.2        1.4     4.2

Figure 24. Relation between the amino acids belonging to each of the specific contacts and the helix or loop associated with them. The loops are indicated in parentheses.
Figure 25. Representation of contact number 4 (HIS24-HIS119) from Table 1 in the native structure. The figure shows the side chains of the two histidine residues. The distance between the α-carbons of each residue is 0.9. Residue labels: HIS 24, HIS 119.
Figure 26. Distance for contact number 4 (HIS24-HIS119) from Table 1, after the native structure of apoMb was unfolded through high temperature. The distance between the α-carbons of each residue is 4.4. Residue label: HIS 119.
Figure 27. Distance for contact number 4 (HIS24-HIS119) from Table 1, after the native structure of apoMb was unfolded by lowering the pH to obtain the denatured state. The distance between the α-carbons of each residue is 2.2. Residue label: HIS 119.
Figure 28. Distance for contact number 4 (HIS24-HIS119) from Table 1, after the native structure of apoMb was unfolded through low temperature. The distance between the α-carbons of each residue is 5.6. Residue label: HIS 119.
Figure 29. Molecular representation of the native structure of apoMb showing contact number 12 (HIS116-GLY124) from Table 1. The figure shows the side chains of the two residues. The distance between the α-carbons of each residue is 1.2. Residue labels: HIS 116, GLY 124.
Figure 30. Distance for contact number 12 (HIS116-GLY124) from Table 1, after the native structure of apoMb was unfolded through high temperature. The distance between the α-carbons of each residue is 1.3. Residue label: GLY 124.
Figure 31. Distance for contact number 12 (HIS116-GLY124) from Table 1, after the native structure of apoMb was unfolded by lowering the pH. The distance between the α-carbons of each residue is 1.9.
Figure 32. Distance for contact number 12 (HIS116-GLY124) from Table 1, after the native structure of apoMb was unfolded through the low temperature condition. The distance between the α-carbons of each residue is 3.8. Residue label: GLY 124.
ABOUT THE AUTHOR

Mariangela Dametto majored in biomedical science and received a Bachelor's Degree from Universidade Federal de São Paulo (Federal University of São Paulo), São Paulo, Brazil, in December of 2000. She received her Master's Degree in Oncology from Hospital do Câncer (Cancer Hospital), São Paulo, Brazil, in November of 2002. She entered the Doctoral program, studying computational chemistry under the supervision of Professor Alfredo E. Cárdenas, at the University of South Florida in Fall 2004. In addition to her research and formal coursework, she attended and made presentations at national and local conferences.