Harris et al.

Molecular Dynamics Simulations of the Glucocorticoid Receptor Protein in Complex With a Glucocorticoid Response Element in a 10 Angstrom Water Layer

Lester F. Harris, Michael R. Sullivan, Pamela D. Popken-Harris and David F. Hickok

Abbott Northwestern Hospital Cancer Research Laboratory
800 E. 28th St., Minneapolis, MN 55407

Correspondence should be addressed to Lester F. Harris Ph.D.
Email:
editor@epress.com
Submitted for publication: July 1995



ABSTRACT

We investigated protein/DNA interactions, using molecular dynamics simulations in solvent computed for 600 picoseconds in a 10 Angstrom water layer, between the glucocorticoid receptor (GR) DNA binding domain (DBD) amino acids and DNA of a glucocorticoid receptor response element (GRE) consisting of 29 nucleotide base pairs. Hydrogen bonding interactions were monitored. In addition, van der Waals and electrostatic interaction energies were calculated. Amino acids of the GR DBD DNA recognition helix formed both direct and water mediated hydrogen bonds at cognate codon/anticodon nucleotide base and backbone sites within the GRE DNA right major groove halfsite. Likewise, amino acids in a beta strand structure adjacent to the DNA recognition helix formed both direct and water mediated hydrogen bonds at cognate codon/anticodon nucleotide base and backbone sites within both the GRE right and left major groove halfsites. In addition, amino acids within a predicted alpha helix located on the carboxyl terminus of the GR DBD interacted at codon/anticodon nucleotide sites on the DNA backbone of the GRE right major groove flanking nucleotides. These interactions together induced breakage of Watson-Crick nucleotide base pairing hydrogen bonds, resulting in significant structural changes and bending of the DNA into the protein.


INTRODUCTION

Is there a code for recognition between DNA regulatory proteins and cognate DNA binding sites? Biological experiments and molecular models of both prokaryotic and eukaryotic regulatory protein/DNA interactions have described a very specific event (1-15). However, a code for DNA site-specific recognition was not detected. In fact, these findings have resulted in a debate regarding the existence of a recognition code (16-19).

Our laboratory has long been interested in the origin of the genetic code and DNA site specific recognition by DNA regulatory proteins and has made several key observations. We observed and reported that genetic information is conserved between prokaryotic and eukaryotic DNA regulatory proteins' DNA binding domains and their cognate sites on DNA to which they specifically bind, operators or response elements (18-19). As an example, we reported that genetic information is conserved between the DNA sequence of a well characterized glucocorticoid response element (GRE) (GenBank locus MMTPRGR1) and its flanking nucleotides and the c-DNA encoding the glucocorticoid receptor (GR) DNA binding domain (DBD) (GenBank locus HUMGCRA) (19). The GR DBD consists of 150 amino acids which fold into a structural motif of two "zinc finger" modules (20). Using genetic sequence search techniques, we were the first to locate and describe the GR DNA recognition alpha helix on the carboxyl terminus of the first zinc finger. We discovered the GR DNA recognition helix by observing that its encoding c-DNA shares genetic information with a GRE (19). By model building, we observed that amino acids of the DNA recognition helix were aligned with their cognate codon-anticodon nucleotides within the GRE DNA right major groove halfsite. This conservation of genetic information allowed us to hypothesize a code for DNA site specific recognition based on a stereochemical relationship between functional sites on amino acids and their codon-anticodon nucleotides.

The genomic structure of the human GR gene has been determined to consist of ten exons (21). The two zinc fingers of the DBD are separately encoded by two of the ten exons, 3 and 4. The DNA recognition helix is encoded in exon 3 at the splice junction site of exons 3 and 4. Adjacent to the DNA recognition helix is a beta strand structure encoded in exon 4. The recognition helix and beta strand structures are spliced at a conserved Gly residue and serve as a bridge which joins the two zinc fingers. Earlier, we observed that the carboxyl terminus of the GR DBD contains a predicted alpha helix structure encoded in exon 5 at the exon 4 and 5 splice junction site. We compared separately the nucleotide sequences of GR DBD exons 3, 4 and 5 with a nucleotide sequence (GenBank locus MMTPRGR1) known to contain GRE sites upstream of the mouse mammary tumor virus gene transcription start site (22). We observed nucleotide subsequence similarity between a well characterized GRE and its flanks and nucleotide sequences on the ends of exons 3, 4 and 5 at their splice junction sites. These sequences encode the DNA recognition helix in exon 3, a beta strand in exon 4 and a structure predicted to be an alpha helix in exon 5. By model building, we observed that amino acids located within the DNA recognition helix, the beta strand, and the predicted alpha helix of the GR DBD as described above are spaced so that they align with trinucleotides identical to cognate codon/anticodon nucleotides within the GRE major groove halfsites and flanking regions (22). These findings suggested that these GR DBD amino acids may interact with their codon-anticodon nucleotides within the GRE and its flanks.

Recently, using molecular dynamics simulations in solvent, we investigated protein/DNA interactions between the GR DBD amino acids and the GRE and its flanking nucleotides. We compared findings from a fully solvated 80 Angstrom water droplet GR DBD/GRE model with those from a 10 Angstrom water layer GR DBD/GRE model. Our findings indicated that the interactions between the GR DBD amino acids and the nucleotides of the GRE were independent of the hydration shell (23).

In the present study, we conducted 600 picoseconds of molecular dynamics simulations in a 10 Angstrom water layer model, investigating interactions between the GR DBD amino acids and the GRE and its flanking nucleotides. Hydrogen bonding interactions were monitored. In addition, van der Waals and electrostatic interaction energies were calculated. The findings indicate that GR DBD amino acids of the DNA recognition helix have preferential electrostatic attraction toward their cognate codon-anticodon nucleotides and form both direct and water mediated hydrogen bonds at these nucleotide base and backbone sites within the GRE right major groove halfsite. Likewise, amino acids of the beta strand and the predicted alpha helix, described above, form hydrogen bonds with nucleotide base and DNA backbone sites at cognate codon/anticodon nucleotides within the GRE major groove halfsites and GRE flanking regions, respectively. These interactions together induce breakage of Watson-Crick nucleotide base pairing hydrogen bonds, resulting in significant structural changes and bending of the DNA toward the protein.


MATERIALS AND METHODS

Model Building:

The model of the GR DBD dimer used in this study (see figure 7) was derived from NMR atomic coordinates of the GR DBD (personal communication, Kaptein) (20). However, residues following Arg 510 in the NMR GR DBD structural determination were disordered, and no coordinates were reported. The amino acid sequence ranging from Arg 510 to Lys 517 contained a predicted alpha helix encoded by exon 5 which we reported earlier to have genetic similarity to the GRE flanking nucleotide regions (22). Therefore, in order to study potential interactions between amino acids of this predicted alpha helix and nucleotides flanking the GRE, it was necessary to use the QUANTA program (24) from Molecular Simulations Inc. to create an alpha helix of the exon 5 encoded amino acids ranging from 511 to 517 and attach this structure to Arg 510. This modified GR DBD structure was used in a 10 Angstrom water layer model to study GR DBD amino acid interactions with a GRE and its flanking nucleotides. A model of B-form DNA of a naturally occurring MMTV GRE from GenBank locus MMTPRGR1, in which we observed genetic similarity with the c-DNA encoding the GR DBD (19, 22), was likewise created using the NUCLEIC ACID BUILDER module from the QUANTA program (24). Solvated molecular dynamics simulation of the NMR GR DBD/GRE model is described below.

Dynamics Parameters:

The solvated molecular dynamics simulations were run on a CRAY YMP C-90 supercomputer using a specially optimized version of CHARMm (release version 22.1) which has an atom limit of 15,000. The 10 Angstrom water layer model required 6717 water atoms, and the GR/GRE protein/DNA complex consisted of 2908 atoms, resulting in a model of 9625 total atoms. The molecular dynamics simulation required 0.6 CRAY C-90 CPU hours of computational resources per picosecond of simulation.
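The system size and computational cost quoted above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function name is hypothetical, the three-atoms-per-water figure assumes a three-site water model (which the text does not state), and the CPU estimate covers only the 600 picosecond production run.

```python
# Illustrative bookkeeping for the figures quoted in the text.
WATER_ATOMS = 6717        # from the text
COMPLEX_ATOMS = 2908      # GR/GRE protein/DNA complex, from the text
ATOM_LIMIT = 15_000       # CHARMm build limit, from the text
CPU_HOURS_PER_PS = 0.6    # CRAY C-90 CPU hours per picosecond, from the text
PRODUCTION_PS = 600       # production dynamics length, from the text


def system_budget(atoms_per_water=3):
    """Return atom totals and an estimated production CPU cost.

    atoms_per_water=3 is an assumption (three-site water model).
    """
    total_atoms = WATER_ATOMS + COMPLEX_ATOMS
    return {
        "total_atoms": total_atoms,
        "headroom": ATOM_LIMIT - total_atoms,
        "water_molecules": WATER_ATOMS // atoms_per_water,
        "production_cpu_hours": CPU_HOURS_PER_PS * PRODUCTION_PS,
    }


if __name__ == "__main__":
    for key, value in system_budget().items():
        print(f"{key}: {value}")
```

Under these assumptions the model sits well inside the 15,000 atom limit, and the production run alone corresponds to roughly 360 CPU hours.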

The solvated model was minimized for 200 cycles using the Steepest Descents method. Then the structure was minimized for 100 cycles using the Adopted Basis Newton-Raphson method. Heating was run for 600 cycles, at 0.001 ps per cycle for a total of 0.6 picoseconds, resulting in a 0.5 degree K temperature increase per cycle (from 0 to 300 degrees K). Equilibration was run for 1000 cycles (1 picosecond), resulting in an overall temperature RMS deviation of approximately 3 degrees K. Finally, molecular dynamics were run with a step size of 0.001 picoseconds for an additional 600 picoseconds (600,000 cycles) using velocity scaling. A constant dielectric potential with an e value of 1.00 was used. A non-bonded cutoff of 15.00 Angstroms was used. Non-bonded parameters were updated every 20 cycles and all energy terms were computed. For a detailed discussion of the CHARMm potential energy function see reference (25) and for a review of molecular dynamics implementation in the biological sciences see reference (26).
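The timed stages of this protocol can be laid out as a simple schedule. The sketch below is not CHARMm input; it only reproduces the cycle counts, step size and heating increment given above, and the names used are hypothetical.

```python
# Timed stages of the protocol as described in the text (cycle counts
# and step size are from the text; minimization has no time dimension).
TIME_STEP_PS = 0.001
STAGES = [("heating", 600), ("equilibration", 1000), ("production", 600_000)]
HEATING_INCREMENT_K = 0.5   # temperature increase per heating cycle
HEATING_CYCLES = 600


def stage_durations_ps():
    """Simulated time per stage, in picoseconds."""
    return {name: round(cycles * TIME_STEP_PS, 6) for name, cycles in STAGES}


def heating_target_k(cycle):
    """Target temperature after a given heating cycle (0 to 300 K ramp)."""
    cycle = max(0, min(cycle, HEATING_CYCLES))
    return cycle * HEATING_INCREMENT_K


if __name__ == "__main__":
    print(stage_durations_ps())
    print(heating_target_k(600))
```

The ramp reaches 300 degrees K at the last heating cycle, and the three timed stages add up to 601.6 picoseconds of simulated time.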

Explicit sodium counter-ions were used in the DNA model, based on geometry provided by Don Gregory Ph.D. from Molecular Simulations Inc. Zinc atoms were placed in the GR structure and tetrahedrally coordinated with the sulfur atoms from the "zinc-finger" cysteines. The residue topology file (RTF) for the "zinc-finger" cysteines was altered and a new residue type was created 'ZCY' (for zinc binding cysteine) in which the negative charges on the sulfur atoms were increased from -0.19 to -0.50 so that the charges from the four tetrahedrally coordinated cysteine sulfur atoms would neutralize the +2.0 charge on the zinc atom. In addition, the charges on the zinc binding cysteine beta carbons were increased from +0.19 to +0.40 and the charges on the alpha carbons were increased from +0.10 to +0.20 in order to maintain the ZCY residue at a net 0.0 charge.
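The charge bookkeeping for the ZCY residue can be verified directly from the values given above. The following Python sketch uses exact decimal arithmetic; the atom labels SG, CB and CA are conventional names assumed here for the sulfur, beta carbon and alpha carbon, and only the three modified charges are considered.

```python
from decimal import Decimal

# Partial charges before and after the ZCY modification, from the text.
# Atom labels (SG, CB, CA) are assumed conventional names.
ORIGINAL = {"SG": Decimal("-0.19"), "CB": Decimal("0.19"), "CA": Decimal("0.10")}
MODIFIED = {"SG": Decimal("-0.50"), "CB": Decimal("0.40"), "CA": Decimal("0.20")}
ZINC_CHARGE = Decimal("2.0")
CYSTEINES_PER_ZINC = 4


def net_charge_shift():
    """Sum of the charge changes; zero means the residue net charge is unchanged."""
    return sum(MODIFIED[a] - ORIGINAL[a] for a in ORIGINAL)


def zinc_plus_sulfurs():
    """Zinc charge plus the four coordinating sulfur charges."""
    return ZINC_CHARGE + CYSTEINES_PER_ZINC * MODIFIED["SG"]


if __name__ == "__main__":
    print(net_charge_shift(), zinc_plus_sulfurs())
```

Both quantities come out to zero: the extra -0.31 on sulfur is offset by +0.21 on the beta carbon and +0.10 on the alpha carbon, and four sulfurs at -0.50 balance the +2.0 zinc.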

DNA Groove Geometry Calculations:

The conformational changes of the DNA during dynamics were evaluated using the CURVES 4.1 program provided by Richard Lavery of Laboratoire de Biochimie Theorique CNRS (personal communication). The documentation provided describes CURVES as "an algorithm for calculating a helical parameter description for any irregular nucleic acid segment with respect to an optimal, global helical axis. The solution is obtained by minimizing a function which represents the variations in helical parameters between successive nucleotides as well as quantifying the kinks and dislocations which exist between successive helical axis segments". For more detailed information regarding the CURVES 4.1 program see references (27-28).

Interaction Energy Calculations:

Graphs of initial interaction energy between GRE nucleotides and selected GR DBD amino acids were calculated using CHARMm (25-26). In all graphs, interaction energy was calculated using a constant dielectric potential with an e value of 1.00. "Total Energy" is the sum of electrostatic interaction energy and van der Waals interaction energy. The values given for the interaction of particular amino acid and nucleotide residues are the sum of the interaction energies of all atoms in those residues.
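The residue-level summation described above can be sketched as follows. This is a minimal illustration, not the CHARMm implementation: it assumes a plain Coulomb term with a constant dielectric and a generic 12-6 Lennard-Jones term with simple combination rules, applies no cutoff or switching, and the conversion constant, class name and per-atom parameters are assumptions introduced for the example.

```python
import math
from dataclasses import dataclass

# Assumed conversion factor giving kcal/mol for charges in e and distances in Angstroms.
COULOMB_KCAL = 332.0716


@dataclass
class Atom:
    xyz: tuple         # Cartesian coordinates in Angstroms
    charge: float      # partial charge in units of e
    epsilon: float     # well depth in kcal/mol (positive, hypothetical values)
    rmin_half: float   # half the minimum-energy distance in Angstroms


def pair_energies(a, b, dielectric=1.0):
    """Electrostatic and van der Waals energy for one atom pair."""
    r = math.dist(a.xyz, b.xyz)
    elec = COULOMB_KCAL * a.charge * b.charge / (dielectric * r)
    eps = math.sqrt(a.epsilon * b.epsilon)
    rmin = a.rmin_half + b.rmin_half
    ratio6 = (rmin / r) ** 6
    vdw = eps * (ratio6 * ratio6 - 2.0 * ratio6)
    return elec, vdw


def residue_interaction(res_a, res_b, dielectric=1.0):
    """Sum pair energies over all atoms of two residues."""
    elec = vdw = 0.0
    for a in res_a:
        for b in res_b:
            e, v = pair_energies(a, b, dielectric)
            elec += e
            vdw += v
    return {"electrostatic": elec, "van_der_waals": vdw, "total": elec + vdw}
```

With this form, a negative total indicates net attraction between the two residues, which is the sense in which "maximal attractive energy potential" is used in the results.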

Hydrogen Bond Calculations:

The hydrogen bond interactions for the 10 Angstrom water layer GR/GRE model were recorded at 1.0 picosecond (1000 cycle) intervals. Frequencies of H-bonding (see table 1) interactions greater than 600 reflect multiple hydrogen bonds (i.e. when two or more of the grouped atoms from one residue interact at the same atom from another residue) for a given amino acid/nucleotide interaction. The hydrogen bonding interactions between amino acids encoded by exons 3, 4 and 5 of the GR DBD and nucleotides of the GRE and flanking regions were monitored. We used a distance-angle algorithm to compute hydrogen bonds which was based on the results of analysis of hydrogen bonding in proteins (29). The value used for the maximum distance allowed between the hydrogen atom and the acceptor was 2.5 angstroms. The value used for the maximum distance allowed between the atom bearing the hydrogen and the acceptor was 3.3 angstroms. The minimum angle at the acceptor was 90 degrees (limit = 0 to 180 degrees). The minimum angle at the hydrogen was 90 degrees (limit = 0 to 180 degrees). The minimum angle at the atom bearing the hydrogen was 90 degrees (limit = 0 to 180 degrees).
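A distance-angle test using the cutoffs listed above might look like the sketch below. The text does not specify which atoms define each angle, so the choices here are assumptions: the angle at the hydrogen is taken as donor-hydrogen-acceptor, the angle at the acceptor as hydrogen-acceptor-antecedent, and the angle at the donor as antecedent-donor-acceptor. The two antecedent atoms are optional, and their angle checks are skipped when they are not supplied.

```python
import math


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     acceptor_antecedent=None, donor_antecedent=None,
                     max_h_acc=2.5, max_don_acc=3.3, min_angle=90.0):
    """Apply the distance and angle cutoffs quoted in the text."""
    if math.dist(hydrogen, acceptor) > max_h_acc:
        return False
    if math.dist(donor, acceptor) > max_don_acc:
        return False
    if angle_deg(donor, hydrogen, acceptor) < min_angle:
        return False
    if acceptor_antecedent is not None:
        if angle_deg(hydrogen, acceptor, acceptor_antecedent) < min_angle:
            return False
    if donor_antecedent is not None:
        if angle_deg(donor_antecedent, donor, acceptor) < min_angle:
            return False
    return True
```

Applied to each saved frame at 1.0 picosecond intervals, a test of this kind yields the per-interaction counts summarized in table 1.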


RESULTS AND DISCUSSION

Molecular Dynamics:

Recently, we reported that nucleotide subsequence similarity exists between a well characterized GRE and its flanking nucleotides and the c-DNA which encodes amino acids of the GR DBD (22). We also observed by model building that amino acids encoded at the splice junctions of exons 3, 4 and 5 of the GR DBD are aligned with their cognate codon/anticodon nucleotides within the GRE right and left major groove halfsites and flanks. This includes amino acids of the GR DNA recognition helix encoded in exon 3, a beta strand encoded in exon 4, adjacent to the DNA recognition helix, and amino acids of a predicted alpha helix encoded in exon 5 at the exon 4 and 5 splice junction site (see figure 1 A-H). These findings suggested that the amino acids within the above structures may interact with their cognate codon/anticodon nucleotides within the GRE and its flanks. To investigate this possibility, we docked the GR DBD dimer at H-bonding distance within the DNA major groove halfsites of the GRE. Using the CHARMm program, we conducted 600 picoseconds of molecular dynamics. A GR DBD/29 bp GRE model, without the water molecules, is shown in figure 2. In this model the GR DBD is docked at approximately 10 Angstroms from the 29 bp GRE and flanking nucleotides for visual clarity. This model is to be used as a key for locating interactions found between the GR DBD amino acids and nucleotides of the GRE and its flanks during molecular dynamics (see table 1).

Amino Acid-Nucleotide Hydrogen Bonding Interactions:

Hydrogen bonding interactions between amino acids encoded by exons 3, 4 and 5 of the NMR GR DBD and nucleotides of the GRE and flanking regions were monitored. A summary of H-bonding interactions is shown in table 1. Equivalent functional sites on the amino acids are grouped: Lysine hydrogen bond donor sites HZ1, HZ2 and HZ3 are combined as HZ. Arginine hydrogen bond donor sites HH11 and HH12 are combined as HH1 and hydrogen bond donor sites HH21 and HH22 are combined as HH2. Glutamine hydrogen bond donor sites HE21 and HE22 are combined as HE2. Asparagine hydrogen bond donor sites HD21 and HD22 are combined as HD2. Glutamic acid hydrogen bond acceptor sites OE1 and OE2 are combined as OE. Likewise, the DNA backbone phosphate group hydrogen bond acceptor sites O1P and O2P are combined as OP. First and last occurrences of DNA/protein hydrogen bonds which includes minimization, heating and equilibration steps followed by the 600 picosecond production dynamics simulation are given in picoseconds of dynamics with their frequency of occurrence. Individual hydrogen bonds are labeled "C" for amino acid-nucleotide codon interactions, "AC" for amino acid-nucleotide anticodon interactions, "C*" and "AC*" for amino acid-nucleotide codon and anticodon interactions when the codon or anticodon sequence is present reading 3' to 5'. It can be seen in table 1 that the majority of H-bonding interactions for exon 3 encoded amino acids of the DNA recognition helix occur at codon/anticodon nucleotide sites within the GRE right major groove halfsite. Likewise, H-bonding interactions between the GRE right major groove halfsite and its flanking nucleotides and the exon 4 and exon 5 encoded amino acids of the beta strand and predicted alpha helix respectively occur at codon/anticodon nucleotide sites. 
In contrast, in the left major groove halfsite, with the exception of amino acid V 468 of the DNA recognition helix encoded in exon 3, just those amino acids of the exon 4 encoded beta strand form H-bonds at codon/anticodon nucleotide sites see table 1.
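The grouping of equivalent functional sites described above amounts to a name mapping applied before counting. The sketch below shows one way to do this in Python; the record layout and function names are hypothetical, while the atom-name groups are those listed in the text.

```python
from collections import Counter

# Equivalent protein sites combined as described in the text.
PROTEIN_GROUPS = {
    ("LYS", "HZ1"): "HZ", ("LYS", "HZ2"): "HZ", ("LYS", "HZ3"): "HZ",
    ("ARG", "HH11"): "HH1", ("ARG", "HH12"): "HH1",
    ("ARG", "HH21"): "HH2", ("ARG", "HH22"): "HH2",
    ("GLN", "HE21"): "HE2", ("GLN", "HE22"): "HE2",
    ("ASN", "HD21"): "HD2", ("ASN", "HD22"): "HD2",
    ("GLU", "OE1"): "OE", ("GLU", "OE2"): "OE",
}
# DNA backbone phosphate oxygens combined as OP.
DNA_GROUPS = {"O1P": "OP", "O2P": "OP"}


def tally_hbonds(records):
    """Count grouped interactions.

    Each record is a hypothetical tuple:
    (time_ps, residue_name, residue_number, protein_atom, nucleotide, dna_atom)
    """
    counts = Counter()
    for _time, resname, resid, p_atom, nucleotide, d_atom in records:
        p_site = PROTEIN_GROUPS.get((resname, p_atom), p_atom)
        d_site = DNA_GROUPS.get(d_atom, d_atom)
        counts[(f"{resname} {resid}", p_site, nucleotide, d_site)] += 1
    return counts
```

Because grouped atoms are counted together, two lysine HZ hydrogens bonded to the same acceptor in one frame contribute two counts, which is how a frequency can exceed the number of recorded frames.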

Electrostatic and van der Waals interactions:

We calculated van der Waals and electrostatic interaction energies between amino acids of the GR DBD and nucleotides on the sense and antisense strands of the GRE and its flanks. Calculations were performed on the minimized, heated and equilibrated structures at the beginning of the dynamics simulation and after 600 picoseconds in order to analyze the attractive forces between GR DBD DNA recognition helix amino acids and GRE nucleotides. A total energy (kcal/mol) interaction consisting of both van der Waals and electrostatic energy was determined. Total energy values were recorded for the hydrophilic amino acids of the GR DNA recognition helix and nucleotide base pairs within the GRE DNA right major groove halfsite. The maximal attractive energy potential for Lys 461, Lys 465, Arg 466 and Glu 469 sidechains was with their cognate codon or anticodon nucleotide base pairs found within a palindromic sequence, 5'-AAGAA-3'-5'-TTCTT-3', which has codons for Lys (AAG), Arg (AGA) and Glu (GAA) in both directions, 5'-to-3' and 3'-to-5', in the GRE DNA right major groove halfsite (see figure 3A, C-E). In addition, Val 462 showed a strong van der Waals interaction at the middle nucleotide of its codon GTT on the sense strand (see figure 3B). The van der Waals interaction of Val 462 at the middle nucleotide of its codon site in the right major groove halfsite agrees with our original prediction for this amino acid (19) which was recently confirmed by the findings of Luisi et al. (14). At the beginning of dynamics (0 picoseconds), the maximal attractive energy potential for Arg 466 was not directed toward its codon/anticodon nucleotide base pair (see figure 3D). However, during molecular dynamics, Arg 466 showed strong attractive energy potential for its codon nucleotide G38, AGA (see figure 3D). Our results show global electrostatic attraction for GR DNA recognition helix amino acids toward their cognate codon/anticodon nucleotides within the GRE right major groove halfsite. 
In addition, Gln 471 of the exon 4 encoded beta strand has maximal attractive energy potential for its codon nucleotide on the sense strand, CAA, reading 3'-to- 5', see figure 3F.
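The overlapping codons in the 5'-AAGAA-3' sequence discussed above can be enumerated by sliding a three-nucleotide window along the sequence in both reading directions. The Python sketch below does this with a small codon table restricted to the amino acids named in the text; the function name and the "forward"/"reverse" labels are illustrative.

```python
# Subset of the standard genetic code covering the codons named in the text.
CODONS = {"AAG": "Lys", "AGA": "Arg", "GAA": "Glu", "GTT": "Val", "CAA": "Gln"}


def overlapping_codons(seq_5to3):
    """List codon matches in every overlapping frame, in both directions.

    "forward" reads the sequence 5'-to-3' as written;
    "reverse" reads the same strand 3'-to-5'.
    """
    hits = []
    readings = (("forward", seq_5to3), ("reverse", seq_5to3[::-1]))
    for direction, text in readings:
        for start in range(len(text) - 2):
            triplet = text[start:start + 3]
            if triplet in CODONS:
                hits.append((direction, start, triplet, CODONS[triplet]))
    return hits


if __name__ == "__main__":
    for hit in overlapping_codons("AAGAA"):
        print(hit)
```

Because AAGAA reads the same in either direction, the three overlapping frames give Lys, Arg and Glu codons both 5'-to-3' and 3'-to-5'.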

Specific Amino Acid-Nucleotide Interactions:

The GR DBD is reported to preferentially and specifically bind to the GRE right major groove halfsite containing the 5'-TGTTCT-3'-5'-AGAACA-3' recognition sequence as a monomer which in turn facilitates cooperative dimerization and subsequent non specific interaction with nucleotides of the adjacent left major groove halfsite (30). We reported earlier that genetic information is conserved within the GRE right major groove halfsite for amino acids of the exon 3 encoded DNA recognition helix (22). In addition, we also reported that genetic information is conserved within both the GRE left and right major groove halfsites and flanking regions for amino acids of the exon 4 encoded beta strand and the exon 5 encoded amino acids of a putative DNA binding alpha helix, respectively (23). Amino acids Lys 461, Lys 465 and Arg 466 of the GR DNA recognition helix are conserved at similar positions within the DNA recognition helices of the steroid receptor family; these amino acids of the GR DNA recognition helix have been reported to specifically bind DNA at GRE sites and regulate gene transcription (31). We observed that amino acids Lys 461, Lys 465 and Arg 466 form both direct and water mediated multidentate H-bonds at cognate codon/anticodon nucleotide base sites within the GRE right major groove halfsite, as shown in table 1 (see figures 1A-H and 2 for reference). Close up views of specific amino acid-nucleotide H-bonding interactions are shown in figure 4A-J. In figure 4A, water mediated H-bonds between the sidechain of Lys 461 and its codon nucleotide A36 at the N7 base site are shown; water mediated H-bonding between Lys 461 and anticodon nucleotide C21 at the H41 site is also shown. Our molecular dynamics simulations also indicate that Val 462 of the DNA recognition helix has van der Waals interaction with its codon nucleotide T19 within the GRE right major groove halfsite (see figure 4B). H-bonding interactions for amino acid Lys 465 are shown in figure 4C. 
Lysine 465 forms direct H-bonds with its codon nucleotide G38 at the N7 and O6 base sites. Lysine 465 also forms a direct H-bond with the N7 base site of its codon nucleotide A37, as well as a direct H-bond with the O4 base site of nucleotide T20. These interactions, in concert, disrupt the Watson-Crick (WC) H-bonds between C21-G38 as can be seen in figure 4C. It is interesting to note that methylation of G38 has been reported to inhibit site specific DNA binding by the GR protein (32). In figure 4D, H-bonding interactions are shown for Arg 466; direct H-bonds are formed with its codon nucleotide A39 at the OP backbone site and with its codon nucleotide G38 at the O5' and OP backbone sites. Glutamic acid 469 forms a water mediated H-bond with its codon nucleotide G38 at the phosphate backbone (see figure 4E). Within the right major groove halfsite, methylation of G18 has also been reported to inhibit site specific DNA binding by the GR protein (32). Our results show that amino acids Gln 471 and Asn 473 of the beta strand encoded in exon 4 at the splice junction site of exons 3 and 4 form both direct and water mediated H-bonds with nucleotide base sites O6 and N7, respectively, on their anticodon nucleotide G18 (see table 1 and figure 4F and G).

In addition, we recently observed that flanking the GRE major groove halfsites are sequences rich in purines, 5'-TAAAACGA-3' on the right and 3'-TCAAAAAC-5' on the left. These sequences contain codons/anticodons for a cluster of hydrophilic amino acids (Arg 510, Lys 511, Thr 512, Lys 513, Lys 514, Lys 515, Ile 516 and Lys 517) located within the predicted alpha helix on the carboxyl end of the GR DBD (22) (see figures 1 and 2). It is interesting to note that the GRE and flanking nucleotide sequence in which we observed maximal nucleotide subsequence similarity to the GR DBD (22) is identical in sequence and location within the MMTV 5' LTR to that described by Scheidereit et al. as a GR binding site using nuclease footprinting (32-33) (see figure 2). It is also interesting that the GR amino acids ranging from 510-517 are related in sequence to the nuclear localization signal of the simian virus SV40 T-antigen: Pro, Pro, Lys, Lys, Lys, Arg, Lys and Val (34). We report herein that amino acids Arg 510, Lys 513 and Lys 517 of the predicted alpha helix within the NMR GR DBD right monomer form both direct and water mediated H-bonds on the DNA backbone at codon/anticodon sites of the GRE right major groove halfsite and flanking nucleotide region (see table 1 and figure 4H-J). These interactions together induce DNA bending into the protein.

Hydrogen bonding interactions between exon 3 encoded amino acids of the left GR DBD monomer of the dimer and nucleotides of the GRE left major groove halfsite involve the same amino acids as seen in the right GR DBD monomer and occur at equivalent dyad symmetrical nucleotide positions as in the GRE right major groove halfsite. However, the wild type GRE major groove halfsites consist of an imperfect palindrome of the 5'-TGTTCT-3' recognition sequence which occurs in the right major groove halfsite; the sequence 5'-TGTAAC-3' occurs in the left major groove halfsite. Therefore codon/anticodon nucleotide sites for Lys 461, Lys 465, Arg 466 and Glu 469 of the DNA recognition helix are not present in the GRE left major groove halfsite (see figure 1A, C-D) and interactions occur at non-codon nucleotide sites (see table 1). However, methylation of G47 in the left major groove halfsite is reported to inhibit site specific DNA binding by the GR protein dimer (32). Our results show that Gln 471 and Tyr 474 form both direct and water mediated H-bonds with their codon/anticodon nucleotide base pair C12-G47 (see table 1). This observation, along with the specific atomic interactions which take place between the GR DNA recognition helix amino acids and their cognate codon/anticodon nucleotide bases within the GRE right major groove halfsite as described above (see table 1, figure 1A-D, figures 3A-F and 4A-J), supports our hypothesis that conservation of genetic information is a determinant of site specific DNA recognition and binding. 
Furthermore, the overall richness in codon/anticodon nucleotides for amino acids of the GR DBD DNA recognition helix encoded in exon 3, the beta strand in exon 4 and predicted alpha helix in exon 5, coupled with the atomic interactions by these amino acids at their conserved codon/anticodon sites in the GRE major groove halfsites and flanking regions (see figure 1A-H, table 1, figure 3A-F and figure 4A-J), offers an explanation for the DNA binding preference reported for the GR at this particular GRE site (36) as opposed to the other GRE sites available in the LTR upstream of the MMTV gene initiation site.
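The relationship between the two halfsite sequences and between the two strands of the recognition motif can be checked with a short script. The sketch below compares the halfsite sequences exactly as they are written in the text and computes reverse complements; the function names are illustrative.

```python
# Watson-Crick complement table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq_5to3):
    """Return the opposite strand, also written 5'-to-3'."""
    return seq_5to3.translate(COMPLEMENT)[::-1]


def mismatches(seq_a, seq_b):
    """List (position, base_a, base_b) where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    print(reverse_complement("TGTTCT"))     # opposite strand of the right halfsite
    print(mismatches("TGTTCT", "TGTAAC"))   # right versus left halfsite
```

The right halfsite 5'-TGTTCT-3' pairs with 5'-AGAACA-3' on the opposite strand, and the left halfsite 5'-TGTAAC-3' differs from it at the last three positions, which is why the palindrome is imperfect.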

Nucleotide-Nucleotide Hydrogen Bonding Interactions for the 29 BP GRE DNA:

Hydrogen bonding interactions between sense and antisense strand GRE nucleotides during 600 picoseconds of molecular dynamics on the GR DBD/GRE 29 BP DNA model are shown in table 2. A loss of one or more canonical Watson-Crick (WC) H-bonds can be seen occurring predominantly in the right major groove halfsite at nucleotide base pairs: C16-G43, T17-A42, G18-C41, T20-A39, C21-G38, T22-A37 and T23-A36. The majority of the loss in canonical H-bonds for these nucleotide base pairs can be accounted for by amino acid-nucleotide interactions at WC sites as shown in table 1 and by the non-canonical nucleotide-nucleotide H-bonding interactions shown in table 2. In the left major groove halfsite a complete loss of canonical WC H-bonds, during the entire dynamics simulation, occurs between nucleotide base pair A11-T48 due largely to the H-bonding interactions occurring between amino acid Arg 466 at WC sites on A11-T48 as shown in table 1; water mediated H-bonding interactions also occurred between amino acid Gln 471 and nucleotide A11 at the H62 WC site, table 1. The WC H-bonding for the other nucleotide base pairs of the GRE left major halfsite are predominantly canonical throughout the dynamics simulation.

Structural Changes in GR DBD Protein/GRE DNA Complex:

During molecular dynamics of the GR DBD/GRE complex, structural changes occur in both the DNA and protein. The DNA appears to wrap around the GR DBD DNA recognition alpha helices. In addition, nucleotides flanking the GRE major groove halfsites are drawn into amino acids of the exon 5 encoded predicted alpha helix. The minor groove between the left and right GRE major groove halfsite is compressed. The nucleotides in particular within the GRE right major groove halfsite show a loss of canonical WC base pairing H-bonds (see table 2). In addition, nucleotides flanking the right major groove halfsite show a decrease in minor groove width (see figure 5A-D). A closeup view of the GRE right major groove nucleotide sequence is shown in figure 5D. A loss of WC canonical H-bonding is apparent at nucleotide pairs G18-C41 and C21-G38. Methylation of guanine at these sites has been shown to inhibit binding of the GR protein (32). To further illustrate the geometric changes in the GRE DNA, using the CURVES program (27-28), GRE DNA major and minor groove width was analyzed after 600 picoseconds of molecular dynamics for the GR DBD/GRE model compared to GRE DNA at 0 picoseconds (see figure 6). The DNA major and minor groove widths determined at 0 picoseconds, 11.4 and 5.6 Angstroms, respectively are in close agreement with values reported for canonical B-DNA duplexes (28). These values were used to monitor changes in DNA major and minor groove width during molecular dynamics. An increase in width in the GRE right major groove halfsite can be seen. This observation is in agreement with results from GR DBD/GRE co-crystal findings (14). A decrease in minor groove width between the GRE DNA major groove halfsites was also observed. Similar findings have also been reported for certain prokaryotic DNA regulatory protein/DNA complexes (28). 
In addition, nucleotides of the minor groove flanking the GRE right major groove showed a marked decrease in width within the poly A/T sequence reflecting bending into the GR DBD protein. Similar findings of DNA bending have been reported for other DNA regulatory protein/DNA complexes (15, 36-37). Interactive molecular models of the GR/GRE complex before and after 600 picoseconds of molecular dynamics are shown in figure 7. In addition an MPEG movie showing the structural changes occurring in the GR DBD/GRE model during 600 picoseconds of molecular dynamics is shown in figure 8.


CONCLUSIONS

Our findings, reported herein, show that amino acids Lys 461, Lys 465, and Arg 466 of the GR DNA recognition helix encoded in exon 3 and Gln 471 of the beta strand encoded at the splice junction site of exons 3 and 4 adjacent to the GR DNA recognition helix specifically form both direct and water mediated H-bonds at their cognate codon/anticodon nucleotide base sites within the 5'-CTGTTCTT-3'-5'-AAGAACAG-3' recognition motif. In addition, Val 462 interacts by van der Waals with the middle nucleotide of its codon, GTT, and Glu 469 has strong electrostatic attraction toward its codon nucleotide A39, GAA. Therefore recognition of codon-anticodon nucleotides within the GRE DNA right major groove halfsite by amino acids of the GR DNA recognition helix offers an explanation for the GR DNA binding preference to the GRE major groove halfsite which contains the 5'-TGTTCT-3'-5'-AGAACA-3' recognition motif (30).

Our findings indicate that GR site specific DNA recognition involves overlapping reading frames. In addition, our findings also suggest that site specific DNA recognition may be bi-directional, that is, amino acids may recognize their cognate codon/anticodon nucleotides reading 5'-to-3' or 3'-to-5'. This appears to be the case in the naturally occurring GRE right major groove halfsite palindrome sequence 5'-AAGAA-3' on the antisense strand which has codon nucleotides for hydrophilic amino acids Lys (AAG), Arg (AGA), and Glu (GAA) of the GR DNA recognition helix in overlapping reading frames in both directions. These observations offer an explanation as to why more than one amino acid can interact with the same nucleotide and vice-versa (11) and still satisfy site specific DNA recognition according to our hypothesis. Unlike the 5'-TGTTCT-3'-5'-AGAACA-3' recognition motif which is conserved within the right major groove halfsite of GREs, the nucleotide sequences of the GRE flanking regions are not conserved (38). However, we detected conservation of genetic information between both flanks of a GRE and the GR DBD exon 5 encoded predicted alpha helix (22), see figure 1A, C. It is significant that this same GRE site, among the several located within the LTR nucleotide sequence upstream of the transcription start of the MMTV gene, has been reported to preferentially bind GR and have the highest transcription enhancing activity (35). Therefore, our findings indicate that conservation of genetic information (19,22) and the corresponding atomic interactions of amino acids of the GR DBD DNA recognition helix, beta strand and predicted alpha helix with cognate codon/anticodon nucleotides within a GRE and its flanking DNA sequence as reported herein are correlated with both DNA site specific recognition and transcription enhancement.

Our findings described herein and elsewhere (18-19, 22-23) strongly support the idea of a stereochemical basis for the origin of the genetic code (39-47), because amino acids within the DNA recognition helices of regulatory proteins are consistently found aligned with cognate codon/anticodon nucleotides within their specific DNA binding sites. These findings also suggest that these structures may have been template dependent in their evolution, i.e. peptides acting as templates for nucleotide polymerization or vice versa (48-50). Our observation that genetic information is conserved between the GRE and its flanking nucleotides and the nucleotide sub-sequences at the splice junction sites of exons 3, 4 and 5, which encode the DNA recognition helix, beta strand and predicted alpha helix, respectively, of the GR DBD, implies that these structures are primordial molecular recognition modules which have been conserved. Therefore, we propose that prebiotic, template directed autocatalytic synthesis of mutually cognate peptides and polynucleotides resulted in their amplification and evolutionary conservation in a contemporary eukaryotic organism as a modular genetic regulatory apparatus. Finally, the amino acid-nucleotide atomic interactions described herein confirm our original prediction that conservation of genetic information is a determinant of site specific DNA recognition for DNA regulatory proteins (18-19, 22-23).


ACKNOWLEDGEMENTS

We thank Don Gregory of Molecular Simulations Inc. for providing geometry for explicit sodium counter-ions used in all simulations and for Zn atom placement and charge parameters for Zn binding cysteines in the "zinc fingers" of the GR DBD structures. We also thank the Molecular Simulations Inc. staff for software support with QUANTA; Michael Fenton of Fentonnet.com for data reduction programs; Barry Bolding of Cray Research Inc. for CHARMm software optimization on the CRAY C-90; Don Truhlar, Scientific Director of the Minnesota Supercomputer Institute, for support and encouragement; the Minnesota Supercomputer Center user services representatives for technical support on the CRAY-2 and C-90; R. Kaptein for personal communication of GR NMR structural coordinates; and R. Lavery for providing CURVES 4.1 software. Special thanks are due to Charlie Larson of Silicon Graphics Inc. for hardware support with the IRIS 4D 320-GTX workstation. This work was supported in part by a research grant from the Minnesota Supercomputer Institute, Minneapolis, MN, and by a research fellowship in memory of William Lang Jr.


REFERENCES

  1. Ptashne, M. Specific binding of Lambda phage repressor to Lambda DNA. Nature 214, 232-234 (1967).

  2. McKay, D., Weber, I. and Steitz, T. Structure of catabolite gene activator at 2.9 angstroms resolution. Incorporation of amino acid sequence and interactions with cyclic AMP. J. Biol. Chem. 257, 9518-9524 (1982).

  3. Takeda, Y., Ohlendorf, D., Anderson, W. and Matthews, B. DNA-binding proteins. Science 221, 1020-1026 (1983).

  4. Pabo, C. and Sauer, R. Protein-DNA recognition. Annu. Rev. Biochem. 53, 293-321 (1984).

  5. Marx, J. A crystalline view of protein-DNA binding. Science 229, 846-848 (1985).

  6. Schleif, R. DNA binding by proteins. Science 241, 1182-1187 (1988).

  7. Otwinowski, Z., Schevitz, R., Zhang, R., Lawson, C., Joachimiak, A., Marmorstein, R., Luisi, B. and Sigler, P. Crystal structure of trp repressor/operator complex at atomic resolution. Nature 335, 321-329 (1988).

  8. Aggarwal, A., Rodgers, D., Drottar, M., Ptashne, M. and Harrison, S. Recognition of a DNA operator by the repressor of phage 434: A view at high resolution. Science 242, 899-907 (1988).

  9. Harrison, S., Anderson, J., Koudelka, G., Mondragon, A., Subbiah, S., Wharton, R., Wolberger, C. and Ptashne, M. Recognition of DNA sequences by the repressor of bacteriophage 434. Biophys. Chem. 29, 31-37 (1988).

  10. Brenowitz, M., Senear, D. and Ackers, G. Flanking DNA-sequences contribute to the specific binding of cI-repressor and OR1. Nucleic Acids Res. 17, 3747-3755 (1989).

  11. Harrison, S. and Aggarwal, A. DNA recognition by proteins with the helix-turn-helix motif. Annu. Rev. Biochem. 59, 933-969 (1990).

  12. Schwabe, J., Neuhaus, D. and Rhodes, D. Solution structure of the DNA-binding domain of the oestrogen receptor. Nature 348, 458-461 (1990).

  13. Baleja, J. and Sykes, B. Comparison of the structures of operator DNA free and in complex with Lambda repressor. Biochemistry and Cell Biology 69, 202-205 (1991).

  14. Luisi, B., Xu, W., Otwinowski, Z., Freedman, L., Yamamoto, K. and Sigler, P. Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature 352, 497-505 (1991).

  15. Schultz, S., Shields, G. and Steitz, T. Crystal structure of a CAP-DNA complex: The DNA is bent by 90 degrees. Science 253, 1001-1007 (1991).

  16. Beato, M. Modulation of gene expression through DNA binding proteins: Is there a regulatory code? Haemat. Blood Transf. 29, 217-223 (1985).

  17. Matthews, B. No code for recognition. Nature 335, 294-295 (1988).

  18. Harris, L., Sullivan, M. and Hickok, D. Conservation of genetic information between regulatory protein DNA binding alpha helices and their cognate operator sites: A simple code for site-specific recognition. Comp. Math. with Appl. 20, 1-23 (1990).

  19. Harris, L., Sullivan, M. and Hickok, D. Genetic sequences of hormone response elements share similarity with predicted alpha helices within DNA binding domains of steroid receptor proteins: A basis for site-specific recognition. Comp. Math. with Appl. 20, 25-48 (1990).

  20. Hard, T., Kellenbach, E., Boelens, R., Maler, B., Dahlman, K., Freedman, L., Carlstedt-Duke, J., Yamamoto, K., Gustafsson, J. and Kaptein, R. Solution structure of the glucocorticoid receptor DNA-binding domain. Science 249, 157-160 (1990).

  21. Encio, I. and Detera-Wadleigh, S. The genomic structure of the human glucocorticoid receptor. J. Biol. Chem. 266, 7182-7188 (1991).

  22. Harris, L., Sullivan, M. and Hickok, D. Conservation of genetic information: A code for site-specific DNA recognition. Proc. Natl. Acad. Sci. USA 90, 5534-5538 (1993).

  23. Harris, L., Sullivan, M., Popken-Harris, P. and Hickok, D. Molecular dynamics simulations in solvent of the glucocorticoid receptor protein in complex with a glucocorticoid response element DNA sequence. J. Biomol. Struct. Dyn. 12, 249-270 (1994).

  24. QUANTA is a molecular modeling and display tool developed by Molecular Simulations Inc. (200 Fifth Avenue, Waltham, Massachusetts 02254) which allows the construction of molecular models of DNA sequences, point mutations of existing models and the modeling of small peptides with a selected secondary structure.

  25. Brooks, B., Bruccoleri, R., Olafson, B., States, D., Swaminathan, S. and Karplus, M. CHARMm: A program for macromolecular energy, minimization and dynamics calculations. J. Comput. Chem. 4, 187-217 (1983).

  26. Karplus, M. and Petsko, G. Molecular dynamics simulations in biology (review). Nature 347, 631-639 (1990).

  27. Lavery, R. and Sklenar, H. The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dyn. 6, 63-91 (1988).

  28. Stofer, E. and Lavery, R. Measuring the geometry of DNA grooves. Biopolymers 34, 337-346 (1994).

  29. Baker, E. and Hubbard, R. Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97-179 (1984).

  30. Tsai, S., Carlstedt-Duke, J., Weigel, N., Dahlman, K., Gustafsson, J-A., Tsai, M-J. and O'Malley, B. Molecular interactions of steroid hormone receptor with its enhancer element: Evidence for receptor dimer formation. Cell 55, 361-369 (1988).

  31. Hollenberg, S. and Evans, R. Multiple and cooperative trans-activation domains of the human glucocorticoid receptor. Cell 55, 899-906 (1988).

  32. Scheidereit, C. and Beato, M. Contacts between hormone receptor and DNA double helix within a glucocorticoid regulatory element of mouse mammary tumor virus. Proc. Natl. Acad. Sci. USA 81, 3029-3034 (1984).

  33. Scheidereit, C., Geisse, S., Westphal, H. and Beato, M. The glucocorticoid receptor binds to defined nucleotide sequences near the promoter of mouse mammary tumor virus. Nature 304, 749-752 (1983).

  34. Picard, D. and Yamamoto, K. Two signals mediate hormone-dependent nuclear localization of the glucocorticoid receptor. EMBO J. 6, 3333-3340 (1987).

  35. Buetti, E. and Kuhnel, B. Distinct sequence elements involved in the glucocorticoid regulation of the mouse mammary tumor virus promoter identified by linker scanning mutagenesis. J. Mol. Biol. 190, 379-389 (1986).

  36. Leidig, F., Baxter, J. and Eberhardt, N. Thyroid hormone receptors induce DNA bending: potential importance for receptor action. Transactions of the Association of American Physicians 103, 154-162 (1990).

  37. Nardulli, A. and Shapiro, D. Binding of the estrogen receptor DNA-binding domain to the estrogen response element induces DNA bending. Mol. Cell. Biol. 12, 2037-2042 (1992).

  38. Chalepakis, G., Postma, J. and Beato, M. A model for hormone receptor binding to the mouse mammary tumour virus regulatory element based on hydroxyl radical footprinting. Nucleic Acids Res. 16, 10237-10247 (1988).

  39. Woese, C. Models for the evolution of codon assignments. J. Mol. Biol. 43, 235-240 (1969).

  40. Woese, C. The fundamental nature of the genetic code: Prebiotic interactions between polynucleotides and polyamino acids or their derivatives. Proc. Natl. Acad. Sci. USA 59, 110-117 (1968).

  41. Hendry, L., Bransome Jr., E., Hutson, M. and Campbell, L. A newly discovered stereochemical logic in the structure of DNA suggests that the genetic code is inevitable. Perspect. Biol. Med. 27, 623-651 (1984).

  42. Hendry, L., Mahesh, V., Bransome Jr., E., Hutson, M. and Campbell, L. A stereochemical rationale for the genetic code derived from complementary fit of amino acids into cavities formed in codon/anticodon sequences in double stranded DNA: Further evidence based upon noncomplementarity of untranslated amino acids. The World Wide Web Journal of Biology 1, (1995).

  43. Lacey, J. and Mullins Jr., D. Experimental studies related to the origin of the genetic code and the process of protein synthesis - a review. Origins of Life 13, 3-42 (1983).

  44. Lacey, J. and Mullins Jr., D. The case for the anticode. Origins of Life 14, 505-511 (1984).

  45. Lacey, J., Wickramasinghe, N. and Cook, G. Experimental studies related to the origin of the genetic code and the process of protein synthesis - a review update. Origins of Life 22, 243-275 (1992).

  46. Yarus, M. and Christian, E. Genetic code origins. Nature 342, 349-350 (1989).

  47. Yarus, M. An RNA-amino acid complex and the origin of the genetic code. New Biologist 3, 183-189 (1991).

  48. Nelsestuen, G. Amino acid-directed nucleic acid synthesis. A possible mechanism in the origin of life. J. Mol. Evol. 11, 109-120 (1978).

  49. Nelsestuen, G. Amino acid catalyzed condensation of purines and pyrimidines with 2-deoxyribose. Biochemistry 18, 2843-2846 (1979).

  50. Lacey, J., Staves, M. and Thomas, K. Ribonucleic acids may be catalysts for the preferential synthesis of L-amino acid peptides: a minireview. J. Mol. Evol. 31, 244-248 (1990).

  51. Dayhoff, M. Atlas of protein sequence and structure. National Biomedical Research Foundation, Silver Spring, MD (1978).

  52. Miesfeld, R., Godowski, P., Maler, B. and Yamamoto, K. Glucocorticoid receptor mutants that define a small region sufficient for enhancer activation. Science 236, 423-425 (1987).

  53. Payvar, F., DeFranco, D., Firestone, G., Edgar, B., Wrange, O., Okret, S., Gustafsson, J. and Yamamoto, K. Sequence-specific binding of glucocorticoid receptor to MTV DNA at sites within and upstream of the transcribed region. Cell 35, 381-392 (1983).

  54. Carson, M. and Bugg, C. E. Algorithm for ribbon models of proteins. J. Mol. Graphics 4, 121-122 (1986).


© 1995 Epress Inc.