Structural organization of the mouse cytosolic malate dehydrogenase gene: Comparison with that of the mouse mitochondrial malate dehydrogenase gene


J. Mol. Biol. (1988) 202, 355-364

Chiaki Setoyama, Tadashi Joh, Teruhisa Tsuzuki and Kazunori Shimada†

Department of Biochemistry, Kumamoto University Medical School, 2-2-1 Honjo, Kumamoto 860, Japan

(Received 25 September 1987)

We cloned and characterized a mouse cytosolic malate dehydrogenase (cMDHase) (EC 1.1.1.37) gene, which is about 14 × 10³ base-pairs long and is interrupted by eight introns. The 5’ and 3’ flanking regions and the exact sizes and boundaries of the exon blocks, including the transcription-initiation sites, were determined. The 5’ end of the gene lacks the TATA and CAAT boxes characteristic of eukaryotic promoters, but contains G+C-rich sequences, one putative binding site for a cellular transcription factor, Sp1, and at least two major transcription-initiation sites. The sequences around the transcription-initiation sites are compatible with the formation of a number of potentially stable stem-loop structures. We compared the structural organization of the mouse cMDHase gene with that of the previously characterized mouse mitochondrial MDHase (mMDHase) gene, and found that the conservation of intron positions spreads across much of the two genes. This result suggests that a common ancestral gene for the cytosolic MDHase and the mitochondrial MDHase was broken up by introns before the divergence. We also compared the nucleotide sequence of the promoter region of the mouse cytosolic MDHase gene with that of the other three mouse genes coding for isoenzymes participating in the malate-aspartate shuttle, i.e. mitochondrial MDHase, and cytosolic and mitochondrial aspartate aminotransferases (cAspATase and mAspATase). We found that highly conserved regions are present in the promoter region of the cAspATase gene.

1. Introduction

Malate dehydrogenase (MDHase‡; L-malate : NAD+ oxidoreductase, EC 1.1.1.37), an NAD+-dependent dehydrogenase, occurs in two distinct isoenzymic forms in animal cells, one located in the cytosol (cMDHase) and the other in the mitochondrial matrix (mMDHase). These two isoenzymes, in co-operation with aspartate aminotransferase (AspATase) isoenzymes, play a crucial role in the malate-aspartate shuttle, which operates in metabolic co-ordination between the cytosol and mitochondria in various mammalian tissues (Williamson et al., 1973). The genes coding for the cMDHase and mMDHase are believed to have originated from a common ancestral gene during the course of evolution (Birktoft & Banaszak, 1983). We reported that the homology of the amino acid sequences between the mouse cMDHase and thermophilic bacterial MDHase, as well as the homology between the mouse mMDHase and Escherichia coli MDHase, markedly exceeds the intraspecies sequence homology between cMDHase and mMDHase from mice (Joh et al., 1987a). On the basis of these findings, we proposed a possible mode of evolution of the MDHase genes (Joh et al., 1987a).

To study the molecular mechanism(s) of transcription of mammalian isoenzyme genes and to observe the structural and evolutionary relationships, we investigated the gene organizations of cytosolic and mitochondrial isoenzymes, such as cMDHase, mMDHase, cytosolic AspATase (cAspATase) and mitochondrial AspATase (mAspATase). Since the mouse seems to be the most suitable species for such studies, we first isolated and sequenced mouse cDNAs for cMDHase (Joh et al., 1987a), mMDHase (Joh et al., 1987b), cAspATase and mAspATase (Obaru et al., 1986). Subsequently, we isolated and characterized mouse genomic DNAs for mAspATase (Tsuzuki et al., 1987), cAspATase (Obaru et al., 1988) and mMDHase (Takeshima et al., 1988). We have since attempted to isolate the cMDHase gene.

We report in this paper the structural organization of the mouse cMDHase gene and compare its 5’ flanking region with that of the mouse mMDHase, cAspATase and mAspATase genes. The presence of several highly homologous regions between the mouse cMDHase and cAspATase genes suggests that these two cytosolic isoenzyme genes might be regulated co-ordinately at the transcriptional level.

† Author to whom all correspondence should be addressed.

‡ Abbreviations used: MDHase, malate dehydrogenase; cMDHase, cytosolic MDHase; mMDHase, mitochondrial MDHase; AspATase, aspartate aminotransferase; mAspATase, mitochondrial AspATase; cAspATase, cytosolic AspATase; kb, 10³ base-pairs; bp, base-pairs; SSC is 0.15 M-NaCl, 0.015 M-trisodium citrate, pH 7.0.

© 1988 Academic Press Limited

2. Materials and Methods

(a) Enzymes and chemicals

Enzymes and chemicals were purchased from the following sources: restriction enzymes from Takara Shuzo Co., Ltd (Kyoto, Japan), New England Biolabs and Toyobo Co., Ltd (Osaka, Japan); Escherichia coli DNA polymerase I from New England Biolabs; bacteriophage T4 polynucleotide kinase, T4 DNA ligase, Klenow enzyme, and dideoxy DNA sequencing reagents from Takara Shuzo; reverse transcriptase from Seikagaku Kogyo Co., Ltd (Tokyo); calf intestinal alkaline phosphatase from Boehringer-Mannheim; [γ-32P]ATP (7000 Ci/mmol) and [α-32P]dCTP (3000 Ci/mmol) from New England Nuclear and Amersham, respectively.

(b) DNAs used as hybridization probes

A mouse cMDHase cDNA cloned in pmcMDH-5 has been characterized (Joh et al., 1987a). The following DNA fragments were prepared from pmcMDH-5 and were used as probes: a mixture of 200 bp PstI, 610 bp PvuII and 630 bp PvuII fragments as a total probe, and a 140 bp PstI-PvuII fragment as a 5’ end probe. These DNA fragments were radioactively labeled with [α-32P]dCTP by nick-translation, as described by Rigby et al. (1977).

(c) Isolation of clones containing the mouse cMDHase gene

Mouse total DNA was extracted from the liver of a male C3H/He mouse, as described by Blin & Stafford (1976). A bacteriophage lambda L47.1 (Loenen & Brammar, 1980)/mouse genomic DNA library was constructed from BamHI partial digests of the mouse total genomic DNA, as described by Maniatis et al. (1982). This library was screened using either the total or the 5’ end probe of the cMDHase cDNA. Positive clones were rescreened at least twice, and the DNAs prepared from the positive clones were characterized by restriction endonuclease mapping.

(d) Restriction enzyme mapping and Southern blotting

Enzyme digests of phage-cloned DNAs and total genomic DNAs were electrophoresed on 0.4% agarose gels, transferred onto nitrocellulose filters, and hybridized to one of the cDNA probes (10” cts/min per µg), as described (Southern, 1975; Tsuzuki et al., 1983). Restriction enzyme mapping of phage-cloned DNAs as well as selected DNA fragments was carried out as described by Tsuzuki et al. (1985).

(e) DNA sequence analysis

Exon-containing BamHI fragments were subcloned from the phage clones. Exon regions were then isolated from the subclones, inserted into pUC18, and sequenced by the dideoxyribonucleotide chain-termination method of Sanger (1981). For exons containing unique restriction endonuclease sites present in the cDNA, the corresponding enzymes were used to generate DNA fragments so that the exons would be adjacent to the priming site in pUC18 for sequence analysis. Entry, editing and analyses of the sequence data were carried out with the GRASE and GENIAS programs purchased from Mitsui Knowledge Industry (Tokyo, Japan), using a personal computer, NEC PC-9801 VX. All the cloning procedures were carried out in accordance with the guidelines for research involving recombinant DNA molecules, issued by the Ministry of Education, Science and Culture of Japan.

(f) Primer extension

The primer extension reaction was carried out according to Agarwal et al. (1981). A 5’-end-labeled synthetic 20-base oligodeoxyribonucleotide, complementary to a part of the mouse cMDHase cDNA sequence, from nucleotide positions -39 to -20 (Joh et al., 1987a), was used as a primer. A 5 µg sample of mouse heart poly(A)+ RNA and 5 × 10⁵ cts/min of the primer (spec. act., 2 × 10⁶ cts/min per pmol) were annealed under the conditions described by Agarwal et al. (1981). After the annealing procedures, the reaction mixture was adjusted to 50 mM-Tris·HCl (pH 8.3), 5 mM-dithiothreitol, 15 mM-MgCl2, 60 mM-NaCl, 100 µg bovine serum albumin/ml, and 0.5 mM each of the 4 unlabeled deoxynucleotide triphosphates. With the addition of reverse transcriptase to 500 units/ml, the samples were reverse-transcribed for 90 min at 37°C and loaded onto a 6% urea-containing gel adjacent to sequence ladders, synthesized according to the method of Sanger (1981) and using the same primer.

3. Results

(a) Isolation and restriction mapping of the mouse cMDHase gene

We have reported the isolation and sequencing of a cDNA clone for mouse cMDHase, the nucleotide sequence of which corresponds to the almost full-length mRNA sequence (Joh et al., 1987a). Using this cDNA as a probe, we isolated two different genomic clones from a lambda L47.1/mouse BamHI partial DNA library (LmmcMDH-1 and -2). Southern blot analysis (Southern, 1975) of the BamHI and BamHI/HindIII digests of mouse total genomic DNA, examined using the total mouse cMDHase cDNA as a probe, gave bands of essentially the same size as those estimated for the cloned genomic DNA (Fig. 1). These two clones carried overlapping portions of a chromosomal segment (Figs 1 and 2). These findings support a single chromosomal locus for the mouse cMDHase gene and indicate that no DNA rearrangement occurred during the cloning procedure.

The restriction map of the mouse cMDHase gene, deduced from the analysis of cloned DNA fragments, is shown in Figure 2, and indicates that the gene spans about 14 kb. Exons in the gene were first roughly located by Southern blot hybridization of various restriction fragments of the cloned DNA, using nick-translated cDNA as a probe, and then determined by sequence analysis with reference to the cDNA sequence (Joh et al., 1987a). We thus located nine exons, as shown in Figure 2. We then determined the 5’ and 3’ flanking regions and the exact sizes and boundaries of the exon blocks (Fig. 3).

[Figure 1: Southern blot autoradiograph, panels (a) and (b), lanes 1 to 3; marked fragment sizes 12.3, 4.6 and 4.4 kb.]

Figure 1. Southern blot analyses of the mouse genomic DNA and the phage-cloned DNAs. DNAs were digested with (a) BamHI and (b) BamHI and HindIII, electrophoresed on 0.4% agarose gels, and transferred to nitrocellulose filters. The blots were hybridized with the mouse total cMDHase probe (see Materials and Methods). Lane 1 corresponds to the mouse genomic DNA, and lanes 2 and 3 to the 2 different phage clones, LmmcMDH-1 and -2. Sizes of the restriction fragments are shown in kb.

(b) Structure of the 5’ end region and transcription-initiation site of the mouse cMDHase gene

The nucleotide sequence analysis of 539 bp upstream from the translation-initiation site of mouse cMDHase reveals several characteristic features of the structure. First, there is neither a TATA box nor a CAAT box-like sequence. Second, the 5’ flanking region of the cMDHase gene, spanning nucleotide positions -350 to -50, has a high content of G and C residues (60.6%). Third, there are many CG or GC dinucleotide sequences. Similar clusters of CG or GC sequences have been found in the promoter regions of the mouse mAspATase (Tsuzuki et al., 1987), cAspATase (Obaru et al., 1988) and mMDHase (Takeshima et al., 1988) genes. Fourth, within this region, there is a putative binding site for a cellular transcription factor, Sp1 (Dynan & Tjian, 1983a,b;

[Figure 2: restriction map showing the genomic DNA with exons 1 to 9, the two phage clones LmmcMDH-1 and LmmcMDH-2, and a 1 kb scale bar.]

Figure 2. Restriction maps of the mouse cMDHase gene. The top line shows the cMDHase gene structure and filled boxes indicate approximate locations and sizes of exons. The translation-initiation (○) and stop (×) sites are indicated. The genomic DNA region covered by the 2 overlapping phage clones is also shown. BamHI sites (B) and HindIII sites (H) are indicated.

-400

-300

-250

-350

-150

-50

INTRON 1 (4.8 kb) ----------TAAACTAGTGGTCTTTGTCATTACAG

+1 Met ATG GTGAGG

Ser Glu Pro Ile Arg Val Leu Val Thr Gly Ala TCT GAA CCA ATC AGA GTC CTT GTG ACT GGA GCA

1

INTRON 2 (0.8 kb) ----------GCCTGCTGTCCTTGCTCTTTGGCAG

sp Val Ile Ala Thr Asp Lys Glu Glu Ile Ala Phe Lys Asp Leu Asp Val Ala Val Leu Val Gly Ser Met Pro Ar AT GTC ATT GCA ACG GAC AAA GAA GAG ATT GCC TTC AAA GAC CTG GAT GTG GCT GTC CTA GTG GGC TCC ATG CCA AG

INTRON 4 (0.9 kb) ----------TCTGCTCTGTGCCTCCACCATCTAG

Val Ile Val Val Gly Asn GTC ATT GTT GTG GGA AAC

Pro Ala Asn Thr Asn Cys Leu Thr Ala Ser Lys Ser Ala Pro Ser Ile Pro Lys Glu Asn Phe Ser Cys Leu Thr Arg Leu Asp His Asn CCA GCC AAT ACG AAC TGC CTG ACA GCC TCC AAG TCA GCG CCA TCG ATC CCC AAG GAG AAT TTC AGT TGC CTG ACT CGC TTG GAC CAC AAC

ys Ser Val Lys AA TCA GTT AAG GTGACTCACACAGATTTCATGGGGT----------

g Arg Glu Gly Met Glu Arg Lys Asp Leu Leu Lys Ala Asn Val Lys Ile Phe Lys Ser Gln Gly Thr Ala Leu Glu Lys Tyr Ala Lys L A AGG GAA GGC ATG GAG AGG AAG GAC CTA CTG AAA GCC AAT GTG AAA ATC TTC AAA TCC CAG GGC ACA GCC TTG GAG AAA TAC GCC AAG A

INTRON 3 (1.6 kb) ----------TGTGTGTTGTTTGCCATGTCCATAG

Pro Ile Ile Leu Val Leu Leu Asp Ile Thr Pro Met Met Gly Val Leu Asp Gly CCC ATC ATT CTT GTG CTG TTG GAC ATC ACC CCC ATG ATG GGT GTT CTG GAC GGT

Val Leu Met Glu Leu Gln Asp Cys Ala Leu Pro Leu Leu Gln A GTC CTG ATG GAA CTG CAA GAC TGT GCC CTT CCC CTT CTG CAG G GTGAGTTGGAAGTCAAAGAAAACAG----------

--

Ala Gly Gln Ile Ala Tyr Ser Leu Leu Tyr Ser Ile Gly Asn Gly Ser Val Phe Gly Lys Asp Gln GCT GGT CAA ATT GCA TAT TCA CTG TTG TAC AGT ATT GGA AAT GGA TCT GTC TTT GGG AAA GAC CAG GTAGGGGCATGTTCTTATAAATAC--------

TGGGCTCTGGAACTCACAC----------

1

2 2 -2 u* TTGCGGGCCAGCCCCGGTTCTCTCCCAGAGTCTGTTCCGCTGTAGAGGTGACCTGACTGCTGGAGACTGCCTTTTGCj\GGTGCAGAGATCGGCC~GT~GCAATA

-100

TAGGAAGAAGGGGTTTGGGGGAATTGTAGTTGTAGTTTAGCACTG~AGG~TGCACG~GGTGGGCGCCAGAGGTCGCGGAAGkACTACACTTCCCAGAAAGGGGCCGTGTCTCCAG~CG~GCCT

-200

CTCTCCTGCCAATTGCTGAGCGCCATCAGGCAGGCGCCTCACTCAAAGCACCAACCCTCTGCTCACAGACGCGCTCCAATCACCGAGGCTCAGCCCGGGACTACTTTGCAGCGAGGCGCG

.

GGCTTTAAGCAACGGAAGGTCTCTTA~~~TGTTTAGTCTTGGGGAGGATAGATTCTCGTGGAGCGACGTGTGTGTCGCTCAGGGGTCGGTTTCTCCTCCCTCGAGTTkACGCCTC

-450

GTTCTTCCGCAAGCGTCAATTCCTCCCGCCTCTGAGAGAGTTTTT~GGTTTGTTTCCGGGTCGAGCG

-500

TGGCTTTTAGATTTA----------

INTRON 5 (3.4 kb) ----------TGATATGATGTTTTACATGAACTAG

Ile Ala Leu Lys Leu ATT GCT CTT AAA CTC

INTRON 6 (0.7 kb) ----------ACTGTCTCTCTGTTGTCCCACCCAG

Thr Val Gln Gln Arg Gly Ala Ala Val Ile Lys Ala Arg Lys Leu Ser ACT GTG CAA CAG CGT GGT GCT GCT GTC ATC AAG GCT CGG AAG CTG TCC

Ile ATC

Lys AAG GTGGGTACATGGAGAG----------

INTRON 8 (1.4 kb) ----------AGCTCTCGCCCTTGTCCCCTGACAG

Asn AAT

Gly Glu Phe Val Ser Met Gly Val Ile Ser Asp Gly Asn Ser Tyr Gly Val Pro Asp GGA GAG TTC GTG TCG ATG GGT GTT ATC TCT GAT GGC AAC TCC TAT GGT GTC CCT GAT

Figure 3. Nucleotide sequences of the mouse cMDHase gene. The sequence of 5’ and 3’ flanking regions, all of the exons and exon-intron boundaries are shown. Nucleotide position +1 corresponds to the first nucleotide of the initiation codon and the nucleotides upstream from the initiation codon are negatively numbered. The amino acid sequences coded by the exons are given above the nucleotide sequence. ***, stop codon. The putative Sp1 binding sequence is boxed. The transcription-initiation sites are indicated by arrows (see Fig. 4). The polyadenylation site is indicated by an arrowhead and the heavy underlines indicate the AATAAA signals. The horizontal arrows with numbers above and below the line indicate direct and inverted repeats, respectively, and identical numbers indicate that the sequences are identical.

v GAAAATCTCTCAGACTCTGTTTCTACTTTATATTTAGTATCTTCAGGAAAACAAGTTTGGCCCAATTATT

Glu Thr Ala Phe Glu Phe Leu Ser Ser Ala *** GAG ACC GCT TTT GAG TTT CTC TCC TCT GCG TGA CTAGACACTCGTTTTGACATCAGCAGACAGCCGAAGGCTG

Lys Thr Trp Lys Phe Val Glu Gly Leu Pro Ile Asn Asp Phe Ser Arg Glu Lys Met Asp Leu Thr Ala Lys Glu Leu Thr Glu Glu Lys AAG ACC TGG AAG TTT GTT GAA GGC CTC CCC ATT AAT GAC TTC TCC CGT GAA AAG ATG GAC CTG ACA GCA AAG GAG CTG ACC GAG GAA AAG

Asp Leu Leu Tyr Ser Leu Pro Val Val GAC CTG CTC TAC TCA CTC CCT GTC GTG

INTRON 7 (0.3 kb)----------GCTATGATAATGTAAACTTTTTCAG

Ser Ala Met Ser Ala Ala Lys Ala Ile Ala Asp His Ile Arg Asp Ile Trp Phe Gly Thr Pro Glu AGT GCA ATG TCT GCT GCG AAA GCC ATC GCA GAC CAC ATC AGA GAC ATC TGG TTT GGA ACC CCA GAG GTGAGGGTTCTCATTTGTACTGGCC-------

G-------w--

Leu Gln Gly Lys Glu Val Gly Val Tyr Glu Ala Leu Lys Asp Asp Ser Trp Leu Lys Gly Glu Phe Ile Thr CTG CAA GGA AAG GAA GTC GGT GTG TAT GAA GCC CTG AAA GAC GAC AGC TGG CTG AAG GGA GAG TTC ATC ACG GTAAGAAGGATGTGAACCCTCTGA

Gly Val Thr Ala Asp Asp Val Lys Asn Val Ile Ile Trp Gly Asn His Ser Ser Thr Gln Tyr Pro Asp Val Asn His Ala Lys Val Lys GGT GTA ACC GCT GAT GAT GTA AAG AAT GTC ATT ATC TGG GGA AAT CAT TCA TCG ACC CAti TAT CCA GAT GTC AAT CAT GCC AAG GTG AAA

Arg Ala Lys Ser Gln CGA GCA AAA TCT CAA GTAAGAAAAA

Figure 4. Primer extension analysis of the 5’ end portion of cMDHase mRNA. Primer extension was carried out as described in the text. A 5’-end-labeled synthetic 20-base oligodeoxyribonucleotide, complementary to a part of the mouse cMDHase cDNA sequence, from nucleotide positions -39 to -20, was used as a primer. Lane 1, 5 µg of poly(A)+ RNA from mouse heart plus the 20-nucleotide primer labeled with 32P at the 5’ end and reverse transcriptase; lane 2, tRNA was added instead of the poly(A)+ RNA. Sizes were determined from a dideoxy nucleotide sequencing ladder extended from the same primer (lane 3, G reaction; lane 4, A; lane 5, T; lane 6, C). The products are indicated by arrowheads.
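The relation between the length of a primer-extension product and the template position of its 3’ end, which is how a gel such as the one in Figure 4 is read, can be sketched in a few lines of Python. This is only an illustrative calculation, not the authors' procedure: the function names are hypothetical, and the product lengths it prints are derived from the reported primer coordinates (-39 to -20) and initiation sites (-82 and -81) rather than taken from the paper.

```python
# Illustrative sketch (hypothetical helper names, not from the paper).
# Coordinates follow the paper's numbering: negative positions lie upstream
# of the initiation codon. All positions used here are negative, so the
# missing "position 0" never has to be crossed.

PRIMER_5P_END = -20  # template position paired with the primer's 5' end
PRIMER_3P_END = -39  # template position paired with the primer's 3' end


def product_length_for_site(site: int, primer_5p_end: int = PRIMER_5P_END) -> int:
    """Length (nt, primer included) of the extension product whose 3' end
    pairs with the given upstream template position."""
    if site > PRIMER_3P_END:
        raise ValueError("site must lie at or upstream of the primer's 3' end")
    return primer_5p_end - site + 1


def site_for_product_length(length: int, primer_5p_end: int = PRIMER_5P_END) -> int:
    """Inverse mapping: template position reached by a product of this length."""
    return primer_5p_end - (length - 1)


if __name__ == "__main__":
    for site in (-82, -81):
        print(site, product_length_for_site(site))
```

Under these assumptions an unextended primer corresponds to a 20-nucleotide product, and each additional nucleotide moves the mapped position one step further upstream.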

Gidoni et al., 1984). The hexanucleotide sequence CCGCCC is located between nucleotide positions -442 and -437. Fifth, there is one pair of 9 bp long inverted repeats at nucleotide positions -126 to -118 and -51 to -43, and three 5 bp long direct repeats at nucleotide positions -36 to -32, -15 to -11, and -8 to -4. Sixth, using a computer program (GENIAS) to search for dyads of symmetry in this region, we found a number of potentially stable stem-loop structures.

To search for the 5’ boundary of the gene, we tried to locate the transcription-initiation site of the gene (Fig. 4). For this purpose, a synthetic 20 nucleotide long oligodeoxyribonucleotide, complementary to the mouse cMDHase cDNA sequence at nucleotide positions -39 to -20 (Joh et al., 1987a), was prepared and used in the primer extension experiment. This primer was labeled at the 5’ end, hybridized with mouse heart poly(A)+ RNA and extended by reverse transcriptase. We chose mouse heart poly(A)+ RNA for this analysis because we found that the levels of the cMDHase mRNAs are high in heart, brain and muscle, but relatively low in kidney, liver, spleen and testes, and there is no significant difference in size (Joh et al., 1987a). The exact sizes of the extended products were determined by running them on a denaturing gel, together with the sequencing products of a genomic fragment, starting from the same primer. Two major initiation sites are located at nucleotide positions -82 and -81 (Fig. 4).

As the first step towards elucidating the regulatory mechanisms of transcription of the cMDHase gene, we compared the sequence of its 5’ flanking region with that of the mouse mMDHase, cAspATase and mAspATase genes, all participating in the malate-aspartate shuttle. We found that the sequence of the mouse cMDHase 5’ flanking region, i.e. 350 nucleotides preceding the initiation codon (ATG), shows 25 to 45% overall homology with that of the corresponding regions of the mouse mMDHase, cAspATase and mAspATase genes. However, three highly conserved regions were present between the mouse cMDHase and cAspATase 5’ flanking regions (Obaru et al., 1988) (Fig. 5): the first region is located between the mouse cMDHase nucleotide positions -285 to -263 and the cAspATase nucleotide positions -312 to -290, and shares a 66.7% homology; the second region is located between the mouse cMDHase nucleotide positions -209 to -200 and the cAspATase nucleotide positions -216 to -205, and shares an 83.3% homology; the third region is located between the mouse cMDHase nucleotide positions -116 to -94 and the cAspATase nucleotide positions -117 to -99, and shares a 78.3% homology. It is of particular interest that one of the highly conserved regions is located close to both of the transcription-initiation sites of the cMDHase and cAspATase genes (Fig. 5).

Figure 5. Nucleotide sequence comparison of the 5’ flanking regions of the cMDHase and cAspATase genes. The nucleotide position +1 corresponds to the first nucleotide of each of the initiation codons for the cMDHase or cAspATase genes and the nucleotides upstream from the initiation codon are negatively numbered. Sequence data for the mouse cAspATase gene are from Obaru et al. (1988). Only 2 or more contiguous matching nucleotides are indicated by lines. The arrowheads above or below the sequence indicate the transcription-initiation sites for the cMDHase or cAspATase gene, respectively.

(c) Sequence around the polyadenylation site

The 3’ end of the mouse cMDHase gene was determined by comparing the nucleotide sequence of the 3’ non-coding region of the gene with the sequence of mouse cMDHase cDNA that contains the complete 3’ non-coding region (Fig. 3) (Joh et al., 1987a). All of the 3’ non-coding region and part of the coding sequence for carboxyl-terminal amino acids (from residues 294 to 334) are coded by the last and largest exon (328 bp). As shown in Figure 3, two potential polyadenylation signal sequences, AATAAA (Proudfoot & Brownlee, 1976), are located 25 nucleotides upstream and 96 nucleotides downstream from the polyadenylation site, respectively. We assume that the former is functioning as a polyadenylation signal, because all the cMDHase cDNA clones isolated to date carry only the signal located 25 nucleotides upstream from the polyadenylation site (Joh et al., 1987a). The sequence CAYTG is usually present close to the polyadenylation site (Benoist et al., 1980). However, we found no equivalent sequence around this polyadenylation site.
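The sequence features discussed in sections (b) and (c), namely G+C content, the Sp1 hexanucleotide CCGCCC and the AATAAA signal, can all be located with simple string scans. The sketch below is a minimal illustration of that idea; it is not the GENIAS software used in this work, the helper names are hypothetical, and the two short sequences are invented toy strings rather than the cMDHase sequence.

```python
import re


def gc_content(seq: str) -> float:
    """Percentage of G and C residues in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_motif(seq: str, motif: str) -> list[int]:
    """0-based start indices of every (possibly overlapping) motif occurrence."""
    pattern = f"(?={re.escape(motif.upper())})"
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def paper_coordinate(index: int, atg_index: int) -> int:
    """Convert a 0-based index to numbering in which the A of the ATG is +1
    and the nucleotide just before it is -1 (there is no position 0)."""
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_5p = "GGCCGCCCTTAGCGCATGTCT"  # ATG begins at index 15
    toy_3p = "CTGAATAAAGTT"

    atg = toy_5p.find("ATG")
    print("G+C % upstream of ATG:", gc_content(toy_5p[:atg]))
    for hit in find_motif(toy_5p, "CCGCCC"):
        print("CCGCCC at position", paper_coordinate(hit, atg))
    print("AATAAA indices:", find_motif(toy_3p, "AATAAA"))
```

The lookahead pattern is used so that overlapping occurrences are reported, which matters for short repeats in G+C-rich regions.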

(d) Organization of exon and intron structure: conservation of intron/exon patterns in MDHase isoenzyme genes

The mouse cMDHase gene consists of nine exons and eight introns (Figs 3 and 6). All the exon-intron boundaries in the cMDHase gene comply with the well-documented GT-AG splice junction rule (Breathnach & Chambon, 1981). The length of the introns is highly variable and ranges from 0.3 kb to 4.8 kb. As observed with most intron sequences, the regions 3’ to the donor sites were purine-rich and the regions 5’ to the acceptor sites were pyrimidine-rich. The length of the exons ranges from 84 to 328 bp. Except for the last and largest exon, the length of exons is relatively uniform and their average length is 120 bp. All the sequences of exons are identical with those of the corresponding regions of the mouse cMDHase cDNA (Joh et al., 1987a).

[Figure 6: schematic aligning intron positions with α-helices, β-sheets and protein domains. Domain labels: nucleotide-binding domain (adenine binding, loop, nicotinamide binding), catalytic domain, carboxyl-terminal tail. Residue numbers legible at intron positions: 12, 67, 125/126, 187/188, 225/226, 293/294 and -3/-2, 55, 83, 119/120, 221, 271/272.]

Figure 6. Correlation of mMDHase exons with the structural and functional domains of the protein. The positions of introns in the mouse cMDHase and mMDHase genes are aligned to the amino acid sequence of the enzyme. Locations and sizes of β-sheets (arrows A to M) and α-helices (open boxes B to H) are shown (Birktoft et al., 1987; Grant et al., 1987). Numbering of α-helices and β-sheets is according to Birktoft et al. (1987). The amino-terminal box marked by vertical lines indicates the leader sequence. The positions of introns are indicated by broken lines, and amino acid residues that correspond to or abut introns are indicated above for cMDHase and below for mMDHase.

The locations of all the introns, as they relate to the secondary structures and functional domains of the mouse cMDHase protein, are summarized in Figure 6. The polypeptide chain of cMDHase can be separated into three domains: an NAD-binding domain, a catalytic domain and a carboxyl-terminal tail. The NAD-binding domain can be further divided into adenine-binding and nicotinamide-binding subdomains (Fig. 6). A generalized structure for the NAD-binding domain is a series of alternating β-sheets and α-helical strands organized into an (α/β or βα) structure (Blake, 1983) (Fig. 6). Intron 1 falls between the initiation methionine and the amino-terminal serine residue of the mature cMDHase, and thus could not have participated in the evolutionary association of functional protein domains (Fig. 6). The locations of introns 3, 5 and 6 correspond with the boundaries of protein domains: intron 3 divides the two mononucleotide-binding subdomains, intron 5 separates the NAD-binding domain from the catalytic domain and intron 6 delineates the catalytic domain from the carboxyl-terminal tail. This concordance supports the idea that some introns participated in the construction of functional protein domains. However, a further correlation between the positions of introns and the well-defined structural and/or functional domains of the mature protein was not obvious.

Figure 6 also compares the positions of introns in the mouse cMDHase and mMDHase genes. Both of these genes have eight introns; two of the eight are located at the same positions, while four of the remaining six have shifted by about five to seven codons (Fig. 6). The general conservation of intron positions spreads across most of the two molecules (Fig. 6). This pattern cannot be readily explained by the separate insertion of introns into a continuous, pre-existing gene. Rather, the conservation of intron positions leads to the conclusion that the ancestral gene was broken up by introns before divergence of the mitochondrial and cytosolic MDHase genes.
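The GT-AG rule cited at the start of this section can be checked mechanically against the intron boundary sequences printed in Figure 3. The following Python sketch does so under stated assumptions: the boundary strings were transcribed from the figure as extracted here, the intron 3 acceptor is joined from two fragments, and the donor sequences are listed without assigning them to intron numbers. The function names are hypothetical.

```python
# Intron 3'-end (acceptor-side) sequences, keyed by intron number,
# transcribed from Figure 3 (transcription accuracy is an assumption).
ACCEPTOR_ENDS = {
    1: "TAAACTAGTGGTCTTTGTCATTACAG",
    2: "GCCTGCTGTCCTTGCTCTTTGGCAG",
    3: "TGTGTGTTGTTTGCCATGTCCATAG",  # joined from two extracted fragments
    4: "TCTGCTCTGTGCCTCCACCATCTAG",
    5: "TGATATGATGTTTTACATGAACTAG",
    6: "ACTGTCTCTCTGTTGTCCCACCCAG",
    7: "GCTATGATAATGTAAACTTTTTCAG",
    8: "AGCTCTCGCCCTTGTCCCCTGACAG",
}

# Intron 5'-end (donor-side) sequences from Figure 3; not matched to
# intron numbers here.
DONOR_STARTS = [
    "GTGAGG",
    "GTAGGGGCATGTTCTTATAAATAC",
    "GTGAGTTGGAAGTCAAAGAAAACAG",
    "GTGACTCACACAGATTTCATGGGGT",
    "GTGGGTACATGGAGAG",
    "GTGAGGGTTCTCATTTGTACTGGCC",
    "GTAAGAAGGATGTGAACCCTCTGA",
    "GTAAGAAAAA",
]


def starts_with_gt(donor: str) -> bool:
    """True if the intron begins with the canonical GT dinucleotide."""
    return donor.upper().startswith("GT")


def ends_with_ag(acceptor: str) -> bool:
    """True if the intron ends with the canonical AG dinucleotide."""
    return acceptor.upper().endswith("AG")


def base_fraction(seq: str, bases: str) -> float:
    """Fraction of residues in seq that belong to the given base set."""
    seq = seq.upper()
    return sum(b in bases for b in seq) / len(seq)


if __name__ == "__main__":
    print("all donors start with GT:", all(map(starts_with_gt, DONOR_STARTS)))
    print("all acceptors end with AG:", all(map(ends_with_ag, ACCEPTOR_ENDS.values())))
    for number, seq in ACCEPTOR_ENDS.items():
        print(f"intron {number}: pyrimidine fraction {base_fraction(seq, 'CT'):.2f}")
    for seq in DONOR_STARTS:
        print(f"donor {seq[:6]}...: purine fraction {base_fraction(seq, 'AG'):.2f}")
```

The base fractions give a rough view of the purine-rich donor and pyrimidine-rich acceptor regions mentioned above, but they are computed over the short printed windows only.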

4. Discussion

We have determined the structural organization of the mouse chromosomal gene for cMDHase by analyzing the overlapping genomic clones obtained from a lambda L47.1/mouse genomic DNA library. A complete set of well-characterized cDNAs and genomic DNAs for the mouse MDHase and AspATase isoenzymes, four major participants in the malate-aspartate shuttle, is now available (Obaru et al., 1986, 1988; Joh et al., 1987a,b; Tsuzuki et al., 1987; Takeshima et al., 1988). As we found that the relative levels of cMDHase, mMDHase, cAspATase and mAspATase mRNAs are approximately equal in different mouse tissues (Joh et al., 1987b; unpublished observation), it has been suggested that transcription of the MDHase and AspATase isoenzyme genes is co-ordinately regulated.

Because transcriptional control elements are located in the 5’ flanking region of protein-coding genes in eukaryotes (Breathnach & Chambon, 1981), and because several DNA elements involved in tissue-specific expression of cellular genes have also been located within the 5’ flanking region (Walker et al., 1983; Gillies et al., 1984; Ott et al., 1984), we compared the nucleotide sequences of the 5’ flanking regions of these isoenzyme genes and noted several characteristic features, including the absence of conventional 5’ transcriptional regulatory sequences such as TATA and CAAT boxes and the presence of G+C-rich sequences and at least one potential binding site for the cellular transcription factor Sp1 (Dynan & Tjian, 1983a,b; Tsuzuki et al., 1987; Obaru et al., 1988; Takeshima et al., 1988). We found that the sequences around the transcription-initiation sites of mouse MDHase and AspATase isoenzyme genes are all compatible with the formation of a number of potentially stable stem-loop structures (Tsuzuki et al., 1987; Obaru et al., 1988; Takeshima et al., 1988). Moreover, we found several highly conserved regions in the 5’ flanking sequences between mAspATase and cAspATase (Obaru et al., 1988), between mMDHase and mAspATase (Takeshima et al., 1988), and between cMDHase and cAspATase (Fig. 5). Some of the stem-loop structures and/or highly conserved regions described above probably provide signals for the co-ordinate expression of the MDHase and AspATase isoenzyme genes. Availability of these genomic clones will facilitate study of the regulatory mechanism of expression of these two pairs of isoenzyme genes. Experiments are underway to assign roles to each of these structures.

We reported the overall amino acid sequence identity among mouse MDHases and bacterial MDHases (Joh et al., 1987a), and suggested a possible mode of evolution of the MDHase genes: the first duplication of a common ancestral MDHase gene occurred before the emergence of eukaryotic cells; subsequently, the mammalian mMDHase and E. coli MDHase genes evolved from one of the duplicates, and the mammalian cMDHase and Thermus flavus MDHase genes evolved from the other (Joh et al., 1987a). To acquire more information on the mode of MDHase gene evolution, we compared the organization of the mouse cMDHase gene with that of the mouse mMDHase gene (Takeshima et al., 1988), and found that the intron/exon patterns are conserved between these two isoenzyme genes; two of the eight introns are located at the same positions, while four of the remaining six are located at closely related positions (Fig. 6). This observation leads to the conclusion that the ancestral MDHase gene was broken up by introns before the divergence. In a previous study, we compared the mouse mAspATase and cAspATase isoenzyme gene structures (Obaru et al., 1988), and concluded that introns antedate the divergence of cytosolic and mitochondrial AspATase isoenzyme genes. If such is indeed the case, the intron/exon patterns shown in Figure 6 suggest that several short-distance slidings of intron positions occurred after the divergence of mitochondrial and cytosolic MDHase genes.

This work was supported by a grant-in-aid for the special promotion of science from the Ministry of Education, Science and Culture of Japan. We thank Drs Y. Morino and S. Tanase of Kumamoto University for preparation of a synthetic oligodeoxyribonucleotide and valuable discussion, and M. Ohara of Kyushu University for reading the manuscript.

References

Agarwal, K. L., Brunstedt, J. & Noyes, B. E. (1981). J. Biol. Chem. 256, 1023-1028.
Benoist, C., O’Hare, K., Breathnach, R. & Chambon, P. (1980). Nucl. Acids Res. 8, 127-142.
Blake, C. C. F. (1983). Nature (London), 306, 535-537.
Birktoft, J. J. & Banaszak, L. J. (1983). J. Biol. Chem. 258, 472-482.
Birktoft, J. J., Bradshaw, R. A. & Banaszak, L. J. (1987). Biochemistry, 26, 2722-2734.
Blin, N. & Stafford, D. W. (1976). Nucl. Acids Res. 3, 2303-2308.
Breathnach, R. & Chambon, P. (1981). Annu. Rev. Biochem. 50, 349-383.
Dynan, W. S. & Tjian, R. (1983a). Cell, 32, 669-680.
Dynan, W. S. & Tjian, R. (1983b). Cell, 35, 79-87.
Gidoni, D., Dynan, W. S. & Tjian, R. (1984). Nature (London), 312, 409-413.
Gillies, S. D., Folsom, V. & Tonegawa, S. (1984). Nature (London), 310, 594-597.
Grant, P. M., Roderick, S. L., Grant, G. A., Banaszak, L. J. & Strauss, A. W. (1987). Biochemistry, 26, 128-134.
Joh, T., Takeshima, H., Tsuzuki, T., Setoyama, C., Shimada, K., Tanase, S., Kuramitsu, S., Kagamiyama, H. & Morino, Y. (1987a). J. Biol. Chem. 262, 15127-15131.
Joh, T., Takeshima, H., Tsuzuki, T., Shimada, K., Tanase, S. & Morino, Y. (1987b). Biochemistry, 26, 2515-2520.
Loenen, W. A. & Brammar, W. J. (1980). Gene, 10, 249-260.
Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Obaru, K., Nomiyama, H., Shimada, K., Nagashima, F. & Morino, Y. (1986). J. Biol. Chem. 261, 16976-16983.
Obaru, K., Tsuzuki, T., Setoyama, C. & Shimada, K. (1988). J. Mol. Biol. 200, 12-22.
Ott, M.-O., Sperling, L., Herbomel, P., Yaniv, M. & Weiss, M. C. (1984). EMBO J. 3, 2505-2510.
Proudfoot, N. J. & Brownlee, G. G. (1976). Nature (London), 263, 211-214.
Rigby, P. W. J., Dieckmann, M., Rhodes, C. & Berg, P. (1977). J. Mol. Biol. 113, 237-251.
Sanger, F. (1981). Science, 214, 1205-1210.
Southern, E. (1975). J. Mol. Biol. 98, 503-517.
Takeshima, H., Joh, T., Tsuzuki, T., Shimada, K. & Matsukado, Y. (1988). J. Mol. Biol. 200, 1-11.
Tsuzuki, T., Nomiyama, H., Setoyama, C., Maeda, S. & Shimada, K. (1983). Gene, 25, 223-229.
Tsuzuki, T., Mita, S., Maeda, S., Araki, S. & Shimada, K. (1985). J. Biol. Chem. 260, 12224-12227.
Tsuzuki, T., Obaru, K., Setoyama, C. & Shimada, K. (1987). J. Mol. Biol. 198, 21-31.
Walker, M. D., Edlund, T., Boulet, A. M. & Rutter, W. J. (1983). Nature (London), 306, 557-561.
Williamson, J. R., Safer, B., LaNoue, K. F., Smith, C. M. & Walajtys, E. I. (1973). In Rate Control of Biological Processes, Symposium XXVII of the Society for Experimental Biology (Davis, D. D., ed.), pp. 241-281, Cambridge University Press, Cambridge and London.

Edited by K. Matsubara