United States Patent 5,708,158
Hoey
January 13, 1998
Nuclear factors and binding assays
Abstract
The invention provides methods and compositions for identifying
pharmacological agents useful in the diagnosis or treatment of disease
associated with the expression of a gene modulated by a transcription
complex containing at least a human nuclear factor of activated T-cells
(hNFAT). The materials include a family of hNFAT proteins, active
fragments thereof, and nucleic acids encoding them. The methods are
particularly suited to high-throughput screening where one or more steps
are performed by a computer controlled electromechanical robot comprising
an axial rotatable arm.
Inventors: Hoey; Timothy (Woodside, CA)
Assignee: Tularik Inc. (South San Francisco, CA)
Appl. No.: 818823
Filed: March 14, 1997
Current U.S. Class: 536/23.5; 536/23.1
Intern'l Class: C12N 015/12; C12N 015/11
Field of Search: 536/23.1,23.5
References Cited
Other References
Li et al, PNAS (USA), Sep., 1991, vol. 88: pp. 7739-7743.
Primary Examiner: Elliott; George C.
Assistant Examiner: McKelvey; Terry A.
Attorney, Agent or Firm: Osman; Richard Aron
Parent Case Text
This is a division of application Ser. No. 08/396,479 filed Mar. 02, 1995,
now U.S. Pat. No. 5,612,455 which is a continuation-in-part application of
Ser. No. 08/270,653 filed on Jul. 5, 1994, now abandoned.
Claims
What is claimed is:
1. An isolated nucleic acid encoding a human nuclear factor of activated
T-cells, hNFAT, protein comprising hNFATp.sub.1 (SEQ ID NO:2),
hNFATp.sub.2 (SEQ ID NO:2, residues 220-921), hNFAT3 (SEQ ID NO:6),
hNFAT4b (SEQ ID NO:10) or hNFAT4c (SEQ ID NO:12).
2. The isolated nucleic acid of claim 1, wherein said protein comprises
hNFATp.sub.1 (SEQ ID NO:2).
3. The isolated nucleic acid of claim 1, wherein said protein comprises
hNFATp.sub.2 (SEQ ID NO:2, residues 220-921).
4. The isolated nucleic acid of claim 1, wherein said protein comprises
hNFAT3 (SEQ ID NO:6).
5. The isolated nucleic acid of claim 1, wherein said protein comprises
hNFAT4b (SEQ ID NO:10).
6. The isolated nucleic acid of claim 1, wherein said protein comprises
hNFAT4c (SEQ ID NO:12).
7. An isolated nucleic acid encoding an hNFAT protein, said nucleic acid
comprising SEQ ID NO:1, SEQ ID NO:5, SEQ ID NO:9, SEQ ID NO:11 or
nucleotides 1-356 and 868-3478 of SEQ ID NO:1.
8. The isolated nucleic acid of claim 7, wherein said nucleic acid
comprises SEQ ID NO:1.
9. The isolated nucleic acid of claim 7, wherein said nucleic acid
comprises nucleotides 1-356 and 868-3478 of SEQ ID NO:1.
10. The isolated nucleic acid of claim 7, wherein said nucleic acid
comprises SEQ ID NO:5.
11. The isolated nucleic acid of claim 7, wherein said nucleic acid
comprises SEQ ID NO:9.
12. The isolated nucleic acid of claim 7, wherein said nucleic acid
comprises SEQ ID NO:11.
Description
INTRODUCTION
1. Field of the Invention
The field of this invention is human transcription factors of activated
T-cells.
2. Background
Identifying and developing new pharmaceuticals is a multibillion dollar
industry in the U.S. alone. Gene specific transcription factors provide a
promising class of targets for novel therapeutics directed to these and
other human diseases. Urgently needed are efficient methods of identifying
pharmacological agents or drugs which are active at the level of gene
transcription. If amenable to automated, cost-effective, high throughput
drug screening, such methods would have immediate application in a broad
range of domestic and international pharmaceutical and biotechnology drug
development programs.
Immunosuppression is therapeutically desirable in a wide variety of
circumstances including transplantation, allergy and other forms of
hypersensitivity, autoimmunity, etc. Cyclosporin, a widely used drug for
effecting immunosuppression, is believed to act by inhibiting
calcineurin, a phosphatase which activates certain nuclear factors of
activated T-cells (NFATs). However, because of side effects and toxicity,
clinical indications of cyclosporin (and the more recently developed
FK506) are limited.
Accordingly, it is desired to identify agents which more specifically
interfere with the function of hNFATs. Unfortunately, the reagents
necessary for the development of high-throughput screening assays for such
therapeutics are unavailable.
3. Relevant Literature
Nolan (Jun. 17, 1994) Cell 77, 1-20 provides a recent review and commentary
on molecular interactions of hNFAT proteins. Northrop et al. (Jun. 9,
1994) Nature 369, 497-502 report the cloning of a cDNA encoding human
NFATc. McCaffrey et al. (Oct. 29, 1993) Science 262, 750-754 report the
cloning of a fragment of a gene encoding a murine NFATp.sub.1.
SUMMARY OF THE INVENTION
The invention provides methods and compositions for identifying lead
compounds and pharmacological agents useful in the diagnosis or treatment
of disease associated with the expression of one or more genes modulated
by a transcription complex containing a human nuclear factor of activated
T-cells (hNFAT). Several forms of hNFAT are provided including hNFATs
designated hNFATp.sub.1, hNFATp.sub.2, hNFATc, hNFAT3, hNFAT4a, hNFAT4b
and hNFAT4c. The invention also provides isolated nucleic acid encoding
the subject hNFATs, vectors and cells comprising such nucleic acids, and
methods of recombinantly producing polypeptides comprising hNFAT. The
invention also provides hNFAT-specific binding reagents such as
hNFAT-specific antibodies.
Methods using the disclosed hNFATs in drug development programs involve
combining a selected hNFAT with a natural intracellular hNFAT binding
target and a candidate pharmacological agent. Natural intracellular
binding targets include transcription factors, such as AP1 proteins and
nucleic acids encoding a hNFAT binding sequence. The resultant mixture is
incubated under conditions whereby, but for the presence of the candidate
pharmacological agent, the hNFAT selectively binds the target. Then the
presence or absence of selective binding between the hNFAT and target is
detected. A wide variety of alternative embodiments of the general methods
using hNFATs are disclosed. The methods are particularly suited to
high-throughput screening where one or more steps are performed by a
computer controlled electromechanical robot comprising an axial rotatable
arm and the solid substrate is a portion of a well of a microtiter plate.
______________________________________
hNFAT                   SEQ ID NOS:
______________________________________
hNFATp.sub.1 cDNA       SEQ ID NO: 1
hNFATp.sub.1 protein    SEQ ID NO: 2
hNFATp.sub.2 cDNA       SEQ ID NO: 1, bases 1-356 and 868-3478
hNFATp.sub.2 protein    SEQ ID NO: 2, residues 220-921
hNFATc cDNA             SEQ ID NO: 3
hNFATc protein          SEQ ID NO: 4
hNFAT3 cDNA             SEQ ID NO: 5
hNFAT3 protein          SEQ ID NO: 6
hNFAT4a cDNA            SEQ ID NO: 7
hNFAT4a protein         SEQ ID NO: 8
hNFAT4b cDNA            SEQ ID NO: 9
hNFAT4b protein         SEQ ID NO: 10
hNFAT4c cDNA            SEQ ID NO: 11
hNFAT4c protein         SEQ ID NO: 12
______________________________________
DETAILED DESCRIPTION OF THE INVENTION
The invention provides methods and compositions relating to human NFATs.
The subject hNFATs include regulators of cytokine gene expression that
modulate immune system function. As such, hNFATs and hNFAT-encoding
nucleic acids provide important targets for therapeutic intervention.
hNFATs derive from human cells, comprise invariant hNFAT rel domain
peptides (see Table 1) and share at least 50% pair-wise rel sequence
identity with each of the disclosed hNFAT sequences. Invariant hNFAT rel
domain peptides include, from the N-terminal end of the rel domain,
HHRAHYETEGSRGAVKA (SEQ ID NO:2, residues 419-435), PHAFYQVHRITGK (SEQ ID
NO:2, residues 470-482), IDCAGILKLRN (SEQ ID NO:2, residues 513-523),
DIELRKGETDIGRKNTRVRLVFRVHX.sub.1 P (SEQ ID NO: 13), and PX.sub.2 ECSQRSAX.sub.3
ELP (SEQ ID NO: 14), where each X.sub.1 and X.sub.2 is a hydrophobic residue
such as valine or isoleucine, and X.sub.3 is any residue, but preferably
glutamine or histidine.
TABLE 1
__________________________________________________________________________
hNFAT rel domains
NFATp (SEQ ID NO: 2, residues 388-678)
NFATc (SEQ ID NO: 4, residues 406-697)
NFAT3 (SEQ ID NO: 6, residues 397-686)
NFAT4b/c (SEQ ID NO: 10, residues 411-700)
__________________________________________________________________________
NFATp     IPVTASLPPLEWPLSSQSGSYELRIEVQPKPHHRAHYETEGSRGAVKAPT  50
NFATc     SYMSPTLPALDWQLPSHSGPYELRIEVQPKSHHRAHYETEGSRGAVKASA  50
NFAT3     IFRTSALPPLDWPLPSQYEQLELRIEVQPRAHHRAHYETEGSRGAVKAAP  50
NFAT4b/c  IFRTSSLPPLDWPLPAHFGQCELKIEVQPKTHHRAHYETEGSRGAVKAST  50
NFATp     GGHPVVQLHGYMENKPLGLQIFIGTADERILKPHAFYQVHRITGKTVTTT  100
NFATc     GGHPIVQLHGYLENEPLMLQLFIGTADDRLLRPHAFYQVHRITGKTVSTT  100
NFAT3     GGHPVVKLLGYS-EKPLTLQMFIGTADERNLRPHAFYQVHRITGKMVATA  99
NFAT4b/c  GGHPVVKLLGYN-EKPINLQMFIGTADDRYLRPHAFYQVHRITGKTVATA  99
NFATp     SYEKIVGNTKVLEIPLEPKNNMRATIDCAGILKLRNADIELRKGETDIGR  150
NFATc     SHEAILSNTKVLEIPLLPENSMRAVIDCAGILKLRNSDIELRKGETDIGR  150
NFAT3     SYEAVVSGTKVLEMTLLPENNMAANIDCAGILKLRNSDIELRKGETDIGR  149
NFAT4b/c  SQEIIIASTKVLEIPLLPENNMSASIDCAGILKLRNSDIELRKGETDIGR  149
NFATp     KNTRVRLVFRVHIPESSGRIVSLQTASNPIECSQRSAHELPMVERQDTDS  200
NFATc     KNTRVRLVFRVHVPQPSGRTLSLQVASNPIECSQRSAQELPLVEKQSTDS  200
NFAT3     KNTRVRLVFRVHVPQGGGKVVSVQAASVPIECSQRSAQELPQVEAYSPSA  199
NFAT4b/c  KNTRVRLVFRVHIPQPSGKVLSLQIASIPVECSQRSAQELPHIEKYSINS  199
NFATp     CLVYGGQQMILTGQNFTSESKVVFTEKTTDGQQIWEMEATVDKDKSQPNM  250
NFATc     YPVVGGKKMVLSGHNFLQDSKVIFVEKAPDGHHVWEMEAKTDRDLCKPNS  250
NFAT3     CSVRGGEELVLTGSNFLPDSKVVFIERGPDGKLQWEEEATVNRLQSNEVT  249
NFAT4b/c  CSVNGGHEMVVTGSNFLPESKIIFLEKGQDGRPQWEVEGKIIREKCQGAH  249
NFATp     LFVEIPEYRNKHIRTPVKVNFYVINGKRKRSQPQHFTYHPV  291
NFATc     LVVEIPPFRNQRITSPVHVSFYVCNGKRKRSQYQRFTYLPA  291
NFAT3     LTLTVPEYSNKRVSRPVQVYFYVSNGRRKRSPTQSFRFLPV  290
NFAT4b/c  IVLEVPPYHNPAVTAAVQVHFYLCNGKRKKSQSQRFTYTPV  290
__________________________________________________________________________
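The invariant rel domain peptides listed before Table 1 lend themselves to a mechanical check of a candidate protein sequence. The following Python fragment is a minimal sketch only, not part of the disclosed methods. It assumes that the hydrophobic positions X.sub.1 and X.sub.2 can be approximated by a small residue class (the text names only valine and isoleucine as examples), and the names `HYDROPHOBIC`, `INVARIANT_PATTERNS` and `has_invariant_peptides` are illustrative.

```python
import re

# Assumption: "hydrophobic" is approximated by this residue class; the text
# itself names only valine and isoleucine as examples.
HYDROPHOBIC = "[AVILMF]"

# Invariant hNFAT rel domain peptides, N-terminal to C-terminal.
INVARIANT_PATTERNS = [
    "HHRAHYETEGSRGAVKA",
    "PHAFYQVHRITGK",
    "IDCAGILKLRN",
    "DIELRKGETDIGRKNTRVRLVFRVH" + HYDROPHOBIC + "P",  # X1 position
    "P" + HYDROPHOBIC + "ECSQRSA.ELP",                 # X2, then X3 = any residue
]


def has_invariant_peptides(protein: str) -> bool:
    """Return True if every invariant peptide occurs in `protein`, in order."""
    pos = 0
    for pattern in INVARIANT_PATTERNS:
        match = re.compile(pattern).search(protein, pos)
        if match is None:
            return False
        pos = match.end()
    return True


# First 200 residues of the NFATp rel domain, copied from Table 1.
NFATP_REL_N200 = (
    "IPVTASLPPLEWPLSSQSGSYELRIEVQPKPHHRAHYETEGSRGAVKAPT"
    "GGHPVVQLHGYMENKPLGLQIFIGTADERILKPHAFYQVHRITGKTVTTT"
    "SYEKIVGNTKVLEIPLEPKNNMRATIDCAGILKLRNADIELRKGETDIGR"
    "KNTRVRLVFRVHIPESSGRIVSLQTASNPIECSQRSAHELPMVERQDTDS"
)

print(has_invariant_peptides(NFATP_REL_N200))
```

Such a check covers only the invariant peptides; the 50% pair-wise rel identity criterion requires a separate alignment-based comparison.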
In addition to the shared rel domains, some hNFATs have smaller regions of
sequence similarity on the N-terminal side of the rel domains. For example,
the amino terminal regions of hNFAT 4a, 4b and 4c and hNFATc have several
regions of similarity (Table 2). The two largest regions (designated
regions A and B in Table 2) contain 23 of 41 and 24 of 45 identical amino
acids between the two proteins. hNFATp and hNFAT3 also have similarity to
other hNFAT proteins in this region (Table 2). The homology between hNFAT3
and hNFAT 4a, 4b and 4c extends about 25 amino acids upstream of the rel
region (designated region C in Table 2).
TABLE 2
__________________________________________________________________________
hNFAT regions 5' to the rel domain
__________________________________________________________________________
NFATc   PSTATLSLPSLEAYRDPS-CLSPASSLSSRSCNSEASSYES  195
NFAT4a  PSRDHLYLPLEPSYRESSLSPSPASSISSRSWFSDASSCES  189
A       NFATc (SEQ ID NO: 4, residues 152-191)
        NFAT4a (SEQ ID NO: 8, residues 144-184)
NFATc   SPQHSPSTSPRASVTEESWLGAR-----SSRPASPCNKRKYSLNG  272
NFAT4a  SPRQSPCHSPRSSVTDENWLSPRPASGPSSRPTSPCGKRRSSAEV  281
        NFATc (SEQ ID NO: 4, residues 233-272)
        NFAT4a (SEQ ID NO: 8, residues 236-281)
NFATc   SSRPASPCNKRKYSLNG  272
NFAT3   SPRPASPCGKRRYSSSG  275
B       NFATc (SEQ ID NO: 4, residues 256-272)
        NFAT3 (SEQ ID NO: 6, residues 259-275)
NFATc   SPQHSPSTSPRASVTEESWLGARSSRP  272
NFATp   SPRTSPIMSPRTSLAEDSCLGRHSPVP  239
        NFATc (SEQ ID NO: 4, residues 233-259)
        NFATp (SEQ ID NO: 2, residues 213-239)
NFAT3   RKEVAGMDYLAVPSPLAWSKARIGGHSP  396
NFAT4a  KKDSCGDQFLSVPSPFTWSKPKPG-HTP  410
C       NFAT3 (SEQ ID NO: 6, residues 369-396)
        NFAT4a (SEQ ID NO: 8, residues 384-410)
__________________________________________________________________________
Nucleic acids encoding hNFATs may be isolated from human cells by screening
cDNA libraries for human immune cells with probes or PCR primers derived
from the disclosed hNFAT genes. In addition to the invariant hNFAT rel
sequences and the 50% pair-wise rel domain identity, cDNAs of hNFAT
transcripts typically share substantial overall sequence identity with
one or more of the disclosed hNFAT sequences.
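The 50% pair-wise rel domain identity criterion can be illustrated with a short calculation over already aligned sequences. The Python fragment below is a sketch under stated assumptions: identity is computed over aligned columns in which neither sequence carries a gap (other conventions for the denominator exist), and only the first 50 aligned columns of Table 1 are used, so the result illustrates the arithmetic and is not the identity of the full rel domains. The function name `percent_identity` is illustrative.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# First 50 aligned columns of the NFATp and NFATc rel domains (Table 1).
NFATP_ROW1 = "IPVTASLPPLEWPLSSQSGSYELRIEVQPKPHHRAHYETEGSRGAVKAPT"
NFATC_ROW1 = "SYMSPTLPALDWQLPSHSGPYELRIEVQPKSHHRAHYETEGSRGAVKASA"

identity = percent_identity(NFATP_ROW1, NFATC_ROW1)
meets_threshold = identity >= 50.0
print(f"{identity:.1f}% identical; meets 50% criterion: {meets_threshold}")
```

In practice the comparison would be run over the complete aligned rel domains of each pair of sequences.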
The subject hNFAT fragments have one or more hNFAT-specific binding
affinities, including the ability to specifically bind at least one
natural human intracellular hNFAT-specific binding target or a
hNFAT-specific binding agent such as a hNFAT-specific antibody or a
hNFAT-specific binding agent identified in assays such as described below.
Accordingly, the specificity of hNFAT fragment specific binding agents is
confirmed by ensuring non-cross-reactivity with other NFATs. Furthermore,
preferred hNFAT fragments are capable of eliciting an antibody capable of
specifically binding an hNFAT. Methods for making immunogenic peptides
through the use of conjugates, adjuvants, etc. and methods for eliciting
antibodies, e.g. immunizing rabbits, are well known.
Exemplary natural intracellular binding targets include nucleic acids which
comprise one or more hNFAT DNA binding sites. Functional hNFAT binding
sites have been found in the promoters or enhancers of several different
cytokine genes including IL-2, IL-4, IL-3, GM-CSF, and TNF-alpha and are often
located next to AP-1 binding sites, which are recognized by members of the
fos and jun families of transcription factors. Typically, the AP-1 binding
sites adjacent to hNFAT sites are low affinity sites, and AP-1 proteins
cannot bind them independently. However, many NFAT and AP-1 protein
combinations are capable of cooperatively binding to DNA. Furthermore,
cell-type specificity of cytokine gene transcription is often controlled,
at least in part, by the combinations of hNFAT and AP-1 proteins present
in those cells. For example, there are different classes of T cells that
secrete different sets of cytokines: e.g. TH1 cells produce IL-2 and
IFN-gamma, while TH2 cells produce IL-4, IL-5, and IL-6. hNFAT binding sites
are involved in the regulation of both TH1 and TH2 cytokines. Further,
differential expression of cytokine genes in T cell subsets is
controlled by the combinatorial interactions of hNFAT and AP-1 proteins.
In addition to DNA binding sites and other transcription factors such as
AP1, other natural intracellular binding targets include cytoplasmic
proteins such as ankyrin repeat containing hNFAT inhibitors, protein
serine/threonine kinases, etc., and fragments of such targets which are
capable of hNFAT-specific binding. Other natural hNFAT binding targets are
readily identified by screening cells, membranes and cellular extracts and
fractions with the disclosed materials and methods and by other methods
known in the art. For example, two-hybrid screening using hNFAT fragments
is used to identify intracellular targets which specifically bind such
fragments. Preferred hNFAT fragments retain the ability to specifically
bind at least one hNFAT DNA binding site and can preferably bind
cooperatively with AP1. Convenient ways to verify the ability of a
given hNFAT fragment to specifically bind such targets include in vitro
labelled binding assays such as described below, and EMSAs.
A wide variety of molecular and biochemical methods are available for
generating and expressing hNFAT fragments, see e.g. Molecular Cloning, A
Laboratory Manual (2nd Ed., Sambrook, Fritsch and Maniatis, Cold Spring
Harbor), Current Protocols in Molecular Biology (Eds. Ausubel, Brent,
Kingston, Moore, Seidman, Smith and Struhl, Greene Publ. Assoc.,
Wiley-Interscience, New York, N.Y., 1992) or that are otherwise known in
the art. For example, hNFAT or fragments thereof may be obtained by
chemical synthesis, expression in bacteria such as E. coli and eukaryotes
such as yeast or vaccinia or baculovirus-based expression systems, etc.,
depending on the size, nature and quantity of the hNFAT or fragment. The
subject hNFAT fragments are of length sufficient to provide a novel
peptide. As used herein, such peptides are at least 5, usually at least
about 6, more usually at least about 8, most usually at least about 10
amino acids. hNFAT fragments may be present in a free state or bound to
other components such as blocking groups to chemically insulate reactive
groups (e.g. amines, carboxyls, etc.) of the peptide, fusion peptides or
polypeptides (i.e. the peptide may be present as a portion of a larger
polypeptide), etc.
The subject hNFAT fragments maintain a binding affinity that is no more than
six, preferably no more than four, more preferably no more than two orders of
magnitude below the binding equilibrium constant of a full-length
native hNFAT for the binding target under similar conditions. Particular
hNFAT fragments or deletion mutants are shown to function in a
dominant-negative fashion. Such fragments provide therapeutic agents, e.g.
when delivered by intracellular immunization, i.e. transfection of susceptible
cells with nucleic acids encoding such mutants.
The claimed hNFAT and hNFAT fragments are isolated, partially pure or pure
and are typically recombinantly produced. As used herein, an "isolated"
peptide is unaccompanied by at least some of the material with which it is
associated in its natural state and constitutes at least about 0.5%,
preferably at least about 2%, and more preferably at least about 5% by
weight of the total protein (including peptide) in a given sample; a
partially pure peptide constitutes at least about 10%, preferably at
least about 30%, and more preferably at least about 60% by weight of the
total protein in a given sample; and a pure peptide constitutes at least
about 70%, preferably at least about 90%, and more preferably at least
about 95% by weight of the total protein in a given sample.
Preferred hNFAT fragments comprise at least a functional portion of the rel
domain. There are several different biochemical functions that are
mediated by the rel and hNFAT rel-similarity domains: DNA binding,
dimerization, interaction with B-zip proteins, interaction with inhibitor
proteins, and nuclear localization. Other rel family proteins have been
shown to physically interact with AP-1 (fos and jun) proteins (Stein et
al., EMBO J. 12, 1993). The rel homology domain is necessary for this
interaction and the B-zip region of the AP-1 proteins is involved in this
protein-protein interaction. The specificity in the ability of hNFAT and
AP-1 family members to interact is related to the tissue specific and cell
type specific regulation of gene expression governed by these proteins.
The rel and rel-similarity domains also interact with members of the I-kB
family of inhibitor proteins including I-kB-like ankyrin repeat proteins
(reviewed in Beg and Baldwin, Genes and Dev., 1993). The C-terminal half
of the rel domain is involved in the interaction with I-kB. There are 5
related I-kB-like proteins which are characterized by having multiple
copies of a 33 amino acid sequence motif called the ankyrin repeat.
The invention provides hNFAT-specific binding agents, methods of
identifying and making such agents, and their use in diagnosis, therapy
and pharmaceutical development. For example, hNFAT-specific agents are
useful in a variety of diagnostic applications, especially where disease
or disease prognosis is associated with immune dysfunction resulting from
improper expression of hNFAT. Novel hNFAT-specific binding agents include
hNFAT-specific antibodies and other natural intracellular binding agents
identified with assays such as one- and two-hybrid screens; non-natural
intracellular binding agents identified in screens of chemical libraries,
etc.
Generally, hNFAT-specificity of the binding target is shown by binding
equilibrium constants. Such targets are capable of selectively binding a
hNFAT, i.e. with an equilibrium constant at least about 10.sup.4 M.sup.-1,
preferably at least about 10.sup.6 M.sup.-1, more preferably at least
about 10.sup.8 M.sup.-1. A wide variety of cell-based and cell-free assays
may be used to demonstrate hNFAT-specific binding. Cell-based assays
include one- and two-hybrid screens, mediating or competitively inhibiting
hNFAT-mediated transcription, etc. Preferred are rapid in vitro, cell-free
assays such as mediating or inhibiting hNFAT-protein binding (e.g. hNFAT-AP1
binding), hNFAT-nucleic acid binding, immunoassays, etc. Other useful
screening assays for hNFAT/hNFAT fragment-target binding include
fluorescence resonance energy transfer (FRET), electrophoretic mobility
shift analysis (EMSA), etc.
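The equilibrium constant thresholds given above (10.sup.4, 10.sup.6 and 10.sup.8 M.sup.-1) can be related to more familiar quantities with a short calculation. The Python fragment below is a sketch that assumes simple 1:1 binding at equilibrium, so that the dissociation constant is the reciprocal of the association constant and the fraction of hNFAT bound depends only on the free target concentration. The function names and tier labels are illustrative.

```python
# Association-constant thresholds from the text, in M^-1, most stringent first.
SELECTIVITY_TIERS = [
    ("more preferred", 1e8),
    ("preferred", 1e6),
    ("minimum", 1e4),
]


def dissociation_constant(k_a: float) -> float:
    """Convert an association constant (M^-1) into a dissociation constant (M)."""
    return 1.0 / k_a


def fraction_bound(k_a: float, free_target_molar: float) -> float:
    """Fraction of hNFAT bound, assuming simple 1:1 equilibrium binding."""
    x = k_a * free_target_molar
    return x / (1.0 + x)


def selectivity_tier(k_a: float):
    """Return the most stringent tier met by k_a, or None if below 1e4 M^-1."""
    for label, threshold in SELECTIVITY_TIERS:
        if k_a >= threshold:
            return label
    return None


# Example: a target with K_a = 1e6 M^-1 at 1 micromolar free concentration.
print(dissociation_constant(1e6), fraction_bound(1e6, 1e-6), selectivity_tier(1e6))
```

Under this model a target is half-saturating when its free concentration equals the dissociation constant, which is why the example concentration was chosen as the reciprocal of the association constant.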
The invention also provides nucleic acids encoding the subject hNFAT and
hNFAT fragments, which nucleic acids may be part of hNFAT-expression
vectors and may be incorporated into recombinant cells for expression and
screening, transgenic animals for functional studies (e.g. the efficacy of
candidate drugs for disease associated with expression of a hNFAT), etc.
In addition, the invention provides nucleic acids sharing substantial
sequence similarity with that of one or more wild-type hNFAT nucleic
acids. Substantially identical or homologous nucleic acid sequences
hybridize to their respective complements under high stringency
conditions, for example, at 55.degree. C. and hybridization buffer
comprising 50% formamide in 0.9M saline/0.09M sodium citrate (SSC) buffer
and remain bound when subject to washing at 55.degree. C. with the
SSC/formamide buffer. Where the sequences diverge, the differences are
preferably silent, i.e. a nucleotide change providing a redundant
codon, or conservative, i.e. a nucleotide change providing a conservative
amino acid substitution.
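The distinction between silent and conservative differences can be made concrete with the standard genetic code. The Python fragment below is an illustrative sketch: the codon table is the standard code, but the grouping of amino acids used to decide what counts as "conservative" is an assumption introduced here, since the text does not define one. The names `CODON_TABLE`, `CONSERVATIVE_GROUPS` and `classify_change` are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: one common physicochemical grouping; not defined in the text.
CONSERVATIVE_GROUPS = ["DE", "KRH", "STNQ", "AVLIM", "FYW"]


def classify_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as silent, conservative or non-conservative."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"  # redundant codon, same amino acid
    if any(ref_aa in group and alt_aa in group for group in CONSERVATIVE_GROUPS):
        return "conservative"
    return "non-conservative"


print(classify_change("CTG", "CTA"))  # both codons encode leucine
print(classify_change("GAT", "GAA"))  # aspartate to glutamate
print(classify_change("GAT", "GGT"))  # aspartate to glycine
```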
The subject nucleic acids find a wide variety of applications including use
as hybridization probes, PCR primers, therapeutic nucleic acids, etc. for
use in detecting the presence of hNFAT genes and gene transcripts, for
detecting or amplifying nucleic acids with substantial sequence similarity
such as hNFAT homologs and structural analogs, and for gene therapy
applications. Given the subject probes, materials and methods for probing
cDNA and genetic libraries and recovering homologs are known in the art.
Preferred libraries are derived from human immune cells, especially cDNA
libraries from differentiated and activated human lymphoid cells. In one
application, the subject nucleic acids find use as hybridization probes
for identifying hNFAT cDNA homologs with substantial sequence similarity.
These homologs in turn provide additional hNFATs and hNFAT fragments for
use in binding assays and therapy as described herein. hNFAT encoding
nucleic acids also find applications in gene therapy. For example, nucleic
acids encoding dominant-negative hNFAT mutants are cloned into a virus and
the virus used to transfect and confer disease resistance to the
transfected cells.
Therapeutic hNFAT nucleic acids are used to modulate, usually reduce,
cellular expression or intracellular concentration or availability of
active hNFAT. These nucleic acids are typically antisense: single-stranded
sequences comprising complements of the disclosed hNFAT nucleic acids.
Antisense modulation of hNFAT expression may employ hNFAT antisense
nucleic acids operably linked to gene regulatory sequences. Cells are
transfected with a vector comprising an hNFAT sequence with a promoter
sequence oriented such that transcription of the gene yields an antisense
transcript capable of binding to endogenous hNFAT encoding mRNA.
Transcription of the antisense nucleic acid may be constitutive or
inducible and the vector may provide for stable extrachromosomal
maintenance or integration. Alternatively, single-stranded antisense
nucleic acids that bind to genomic DNA or mRNA encoding a hNFAT or hNFAT
fragment may be administered to the target cell, in or temporarily
isolated from a host, at a concentration that results in a substantial
reduction in hNFAT expression. For gene therapy involving the transfusion
of hNFAT transfected cells, administration will depend on a number of
variables that are ascertained empirically. For example, the number of
cells will vary depending on the stability of the transfused cells.
Transfusion medium is typically a buffered saline solution or other
pharmacologically acceptable solution. Similarly the amount of other
administered compositions, e.g. transfected nucleic acid, protein, etc.,
will depend on the manner of administration, purpose of the therapy, and
the like.
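The antisense sequences described above are complements of the coding strand, read in the opposite direction. As an illustration only, the Python fragment below derives an antisense DNA strand and the corresponding antisense RNA from a sense-strand sequence. The 12-nucleotide input is a hypothetical placeholder and is not taken from any of the disclosed SEQ ID NOS; the function names are illustrative.

```python
# Watson-Crick complement table for DNA (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA complementary to the mRNA that corresponds to `sense_dna`."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


# Hypothetical placeholder sequence, NOT an hNFAT sequence.
sense = "ATGGCCTTAGCA"
print(reverse_complement(sense))
print(antisense_rna(sense))
```

Taking the reverse complement twice returns the original sequence, which provides a simple sanity check on any such helper.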
The subject nucleic acids are often recombinant, meaning they comprise a
sequence joined to a nucleotide other than that which it is joined to on a
natural chromosome. An isolated nucleic acid constitutes at least about
0.5%, preferably at least about 2%, and more preferably at least about 5%
by weight of total nucleic acid present in a given fraction. A partially
pure nucleic acid constitutes at least about 10%, preferably at least
about 30%, and more preferably at least about 60% by weight of total
nucleic acid present in a given fraction. A pure nucleic acid constitutes
at least about 80%, preferably at least about 90%, and more preferably at
least about 95% by weight of total nucleic acid present in a given
fraction.
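The weight-percent definitions of "isolated", "partially pure" and "pure" given for peptides and for nucleic acids can be summarized as a small lookup. The Python fragment below is a sketch only: it uses the broadest ("at least about") threshold for each term, treats "about" as an exact cut-off, and checks weight fraction alone, whereas the definition of an isolated peptide also requires that it be unaccompanied by at least some naturally associated material. The names `PURITY_TIERS` and `purity_class` are illustrative.

```python
# Minimum weight percent for each term, highest tier first.
# Peptide values are percent of total protein; nucleic acid values are
# percent of total nucleic acid in the fraction.
PURITY_TIERS = {
    "peptide": [("pure", 70.0), ("partially pure", 10.0), ("isolated", 0.5)],
    "nucleic acid": [("pure", 80.0), ("partially pure", 10.0), ("isolated", 0.5)],
}


def purity_class(kind: str, weight_percent: float):
    """Return the highest purity term met, or None if below the 'isolated' floor."""
    for label, minimum in PURITY_TIERS[kind]:
        if weight_percent >= minimum:
            return label
    return None


print(purity_class("peptide", 75.0))
print(purity_class("nucleic acid", 75.0))
```

The same 75% preparation is classified differently in the two cases because the "pure" floor is about 70% for peptides and about 80% for nucleic acids.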
The invention provides efficient methods of identifying pharmacological
agents or drugs which are active at the level of hNFAT modulatable
cellular function, particularly hNFAT mediated interleukin signal
transduction. Generally, these screening methods involve assaying for
compounds which interfere with hNFAT activity such as hNFAT-AP1 binding,
hNFAT-DNA binding, etc. The methods are amenable to automated,
cost-effective high throughput drug screening and have immediate
application in a broad range of domestic and international pharmaceutical
and biotechnology drug development programs.
Target therapeutic indications are limited only in that the target cellular
function (e.g. gene expression) be subject to modulation, usually
inhibition, by disruption of the formation of a complex (e.g.
transcription complex) comprising a hNFAT or hNFAT fragment and one or
more natural hNFAT intracellular binding targets. Since a wide variety of
genes are subject to hNFAT regulated gene transcription, target
indications may include infection, metabolic disease, genetic disease,
cell growth and regulatory dysfunction, such as neoplasia, inflammation,
hypersensitivity, etc. Frequently, the target indication is related to
either immune dysfunction or selective immune suppression.
A wide variety of assays for binding agents are provided including labelled
in vitro protein-protein and protein-DNA binding assays, electrophoretic
mobility shift assays, immunoassays for protein binding or transcription
complex formation, cell based assays such as one, two and three hybrid
screens, expression assays such as transcription assays, etc. For example,
three-hybrid screens are used to rapidly examine the effect of transfected
nucleic acids, which may, for example, encode combinatorial peptide
libraries or antisense molecules, on the intracellular binding of hNFAT or
hNFAT fragments to intracellular hNFAT targets. Convenient reagents for
such assays (e.g. GAL4 fusion partners) are known in the art.
hNFAT or hNFAT fragments used in the methods are usually added in an
isolated, partially pure or pure form and are typically recombinantly
produced. The hNFAT or fragment may be part of a fusion product with
another peptide or polypeptide, e.g. a polypeptide that is capable of
providing or enhancing protein-protein binding, sequence-specific nucleic
acid binding or stability under assay conditions (e.g. a tag for detection
or anchoring).
The assay mixtures comprise at least a portion of a natural intracellular
hNFAT binding target such as AP1 or a nucleic acid comprising a sequence
which shares sufficient sequence similarity with a gene or gene regulatory
region to which the native hNFAT naturally binds to provide
sequence-specific binding of the hNFAT or hNFAT fragment. Such a nucleic
acid may further comprise one or more sequences which facilitate the
binding of a second transcription factor or fragment thereof which
cooperatively binds the nucleic acid with the hNFAT (i.e. at least one
increases the affinity or specificity of the DNA binding of the other).
While native binding targets may be used, it is frequently preferred to
use portions (e.g. peptides, nucleic acid fragments) or analogs (i.e.
agents which mimic the hNFAT binding properties of the natural binding
target for the purposes of the assay) thereof so long as the portion
provides binding affinity and avidity to the hNFAT conveniently measurable
in the assay. Binding sequences for other transcription factors may be
found in sources such as the Transcription Factor Database of the National
Center for Biotechnology Information at the National Library of Medicine,
in Faisst and Meyer (1991) Nucleic Acids Research 20, 3-26, and others
known to those skilled in this art.
Where used, the nucleic acid portion bound by the peptide(s) may be
continuous or segmented and is usually linear and double-stranded DNA,
though circular plasmids or other nucleic acids or structural analogs may
be substituted so long as hNFAT sequence-specific binding is retained. In
some applications, supercoiled DNA provides optimal sequence-specific
binding and is preferred. The nucleic acid may be of any length amenable
to the assay conditions and requirements. Typically the nucleic acid is
between 8 bp and 5 kb, preferably between about 12 bp and 1 kb, more
preferably between about 18 bp and 250 bp, most preferably between about
27 and 50 bp. Additional nucleotides may be used to provide structure
which enhances or decreases binding or stability, etc. For example,
combinatorial DNA binding can be effected by including two or more DNA
binding sites for different or the same transcription factor on the
oligonucleotide. This allows for the study of cooperative or synergistic
DNA binding of two or more factors. In addition, the nucleic acid can
comprise a cassette into which transcription factor binding sites are
conveniently spliced for use in the subject assays.
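The cassette arrangement and the length preferences described above can be sketched as follows. The Python fragment is illustrative only: the site, spacer and flank sequences are arbitrary placeholders rather than real hNFAT or AP-1 binding sites, "about" is treated as an exact bound, and the names `assemble_probe` and `length_tier` are hypothetical.

```python
# Length ranges from the text in base pairs, narrowest (most preferred) first.
LENGTH_TIERS = [
    ("most preferred", 27, 50),
    ("more preferred", 18, 250),
    ("preferred", 12, 1000),
    ("typical", 8, 5000),
]


def assemble_probe(sites, spacer: str = "", flank: str = "") -> str:
    """Join binding sites into one strand: flank + site + spacer + site ... + flank."""
    return flank + spacer.join(sites) + flank


def length_tier(n_bp: int):
    """Return the narrowest length range containing n_bp, or None."""
    for label, low, high in LENGTH_TIERS:
        if low <= n_bp <= high:
            return label
    return None


# Arbitrary placeholder sequences, NOT real transcription factor sites.
site_a = "ACGTACGTAC"
site_b = "TTGGCCAATT"
probe = assemble_probe([site_a, site_b], spacer="GC", flank="CAGT")
print(probe, len(probe), length_tier(len(probe)))
```

Keeping the sites in a list makes it straightforward to swap in different or repeated sites when studying cooperative binding of two or more factors.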
The assay mixture also comprises a candidate pharmacological agent.
Generally a plurality of assay mixtures are run in parallel with different
agent concentrations to obtain a differential response to the various
concentrations. Typically, one of these concentrations serves as a
negative control, i.e. at zero concentration or below the limits of assay
detection. Candidate agents encompass numerous chemical classes, though
typically they are organic compounds; preferably small organic compounds.
Small organic compounds have a molecular weight of more than 50 yet less
than about 2,500, preferably less than about 1000, more preferably, less
than about 500. Candidate agents comprise functional chemical groups
necessary for structural interactions with proteins and/or DNA, and
typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups, more preferably
at least three. The candidate agents often comprise cyclical carbon or
heterocyclic structures and/or aromatic or polyaromatic structures
substituted with one or more of the aforementioned functional groups.
Candidate agents are also found among biomolecules including peptides,
saccharides, fatty acids, steroids, purines, pyrimidines, derivatives,
structural analogs or combinations thereof, and the like. Where the agent
is or is encoded by a transfected nucleic acid, said nucleic acid is
typically DNA or RNA.
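The molecular weight and functional group preferences for small organic candidate agents can be expressed as a simple pre-filter over a compound list. The Python fragment below is a sketch under stated assumptions: each compound is described by a name, a molecular weight and a set of functional group labels supplied by the user, and the `Candidate` class, its field names and the example compounds are hypothetical.

```python
from dataclasses import dataclass

# Functional groups named in the text.
KEY_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float
    groups: frozenset = frozenset()


def is_small_organic(c: Candidate, max_weight: float = 2500.0) -> bool:
    """Molecular weight of more than 50 and less than max_weight."""
    return 50.0 < c.mol_weight < max_weight


def key_group_count(c: Candidate) -> int:
    """Number of the named functional groups present in the candidate."""
    return len(KEY_GROUPS & c.groups)


def passes_filter(c: Candidate, max_weight: float = 500.0, min_groups: int = 2) -> bool:
    """Defaults reflect the more preferred weight limit and two functional groups."""
    return is_small_organic(c, max_weight) and key_group_count(c) >= min_groups


# Hypothetical compounds for illustration.
library = [
    Candidate("cmpd-1", 320.4, frozenset({"amine", "hydroxyl"})),
    Candidate("cmpd-2", 1200.0, frozenset({"carboxyl"})),
]
print([c.name for c in library if passes_filter(c)])
```

Relaxing `max_weight` to 1000 or 2500 and `min_groups` to 1 reproduces the broader ranges given in the text.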
Candidate agents are obtained from a wide variety of sources including
libraries of synthetic or natural compounds. For example, numerous means
are available for random and directed synthesis of a wide variety of
organic compounds and biomolecules, including expression of randomized
oligonucleotides. Alternatively, libraries of natural compounds in the
form of bacterial, fungal, plant and animal extracts are available or
readily produced. Additionally, natural and synthetically produced
libraries and compounds are readily modified through conventional
chemical, physical, and biochemical means. In addition, known
pharmacological agents may be subject to directed or random chemical
modifications, such as acylation, alkylation, esterification,
amidification, etc., to produce structural analogs.
A variety of other reagents may also be included in the mixture. These
include reagents like salts, buffers, neutral proteins, e.g. albumin,
detergents, etc. which may be used to facilitate optimal protein-protein
and/or protein-nucleic acid binding and/or reduce non-specific or
background interactions, etc. Also, reagents that otherwise improve the
efficiency of the assay, such as protease inhibitors, nuclease inhibitors,
antimicrobial agents, etc. may be used.
The resultant mixture is incubated under conditions whereby, but for the
presence of the candidate pharmacological agent, the hNFAT specifically
binds the cellular binding target, portion or analog. The mixture
components can be added in any order that provides for the requisite
bindings. Incubations may be performed at any temperature which
facilitates optimal binding, typically between 4.degree. and 40.degree.
C., more commonly between 15.degree. and 40.degree. C. Incubation periods
are likewise selected for optimal binding but also minimized to facilitate
rapid, high-throughput screening, and are typically between 0.1 and 10
hours, preferably less than 5 hours, more preferably less than 2 hours.
After incubation, the presence or absence of specific binding between the
hNFAT and one or more binding targets is detected by any convenient way.
For cell-free binding type assays, a separation step is often used to
separate bound from unbound components. The separation step may be
accomplished in a variety of ways. Conveniently, at least one of the
components is immobilized on a solid substrate which may be any solid from
which the unbound components may be conveniently separated. The solid
substrate may be made of a wide variety of materials and in a wide variety
of shapes, e.g. microtiter plate, microbead, dipstick, resin particle,
etc. The substrate is chosen to maximize signal to noise ratios, primarily
to minimize background binding, for ease of washing and cost.
Separation may be effected, for example, by removing a bead or dipstick from
a reservoir, emptying or diluting a reservoir such as a microtiter plate
well, or rinsing a bead (e.g. beads with iron cores may be readily isolated
and washed using magnets), particle, chromatographic column or filter with
a wash solution or solvent. Typically, the separation step will include an
extended rinse or wash or a plurality of rinses or washes. For example,
where the solid substrate is a microtiter plate, the wells may be washed
several times with a washing solution, which typically includes those
components of the incubation mixture that do not participate in specific
binding such as salts, buffer, detergent, nonspecific protein, etc. The
separation step may also exploit a polypeptide specific binding reagent such
as an antibody or receptor specific to a ligand of the polypeptide.
Detection may be effected in any convenient way. For cell based assays such
as one, two, and three hybrid screens, the transcript resulting from
hNFAT-target binding usually encodes a directly or indirectly detectable
product (e.g. galactosidase activity, luciferase activity, etc.). For
cell-free binding assays, one of the components usually comprises or is
coupled to a label. A wide variety of labels may be employed--essentially
any label that provides for detection of bound protein. The label may
provide for direct detection as radioactivity, luminescence, optical or
electron density, etc. or indirect detection such as an epitope tag, an
enzyme, etc. The label may be appended to the protein, e.g. a phosphate
group comprising a radioactive isotope of phosphorus, or incorporated
into the protein structure, e.g. a methionine residue comprising a
radioactive isotope of sulfur.
A variety of methods may be used to detect the label depending on the
nature of the label and other assay components. For example, the label may
be detected bound to the solid substrate or a portion of the bound complex
containing the label may be separated from the solid substrate, and
thereafter the label detected. Labels may be directly detected through
optical or electron density, radiative emissions, nonradiative energy
transfers, etc. or indirectly detected with antibody conjugates, etc. For
example, in the case of radioactive labels, emissions may be detected
directly, e.g. with particle counters or indirectly, e.g. with
scintillation cocktails and counters. The methods are particularly suited
to automated high throughput drug screening. Candidate agents shown to
inhibit hNFAT-target binding or transcription complex formation provide
valuable reagents to the pharmaceutical industries for animal and human
trials.
As previously described, the methods are particularly suited to automated
high throughput drug screening. In a particular embodiment, the arm
retrieves and transfers a microtiter plate to a liquid dispensing station
where measured aliquots of each of an incubation buffer and a solution
comprising one or more candidate agents are deposited into each designated
well. The arm then retrieves and transfers to and deposits in designated
wells a measured aliquot of a solution comprising a labeled transcription
factor protein. After a first incubation period, the liquid dispensing
station deposits in each designated well a measured aliquot of a
biotinylated nucleic acid solution. The first and/or following second
incubation may optionally occur after the arm transfers the plate to a
shaker station. After a second incubation period, the arm transfers the
microtiter plate to a wash station where the unbound contents of each well
are aspirated and then the well is repeatedly filled with a wash buffer and
aspirated. Where the bound label is radioactive phosphorus, the arm
retrieves and transfers the plate to the liquid dispensing station where a
measured aliquot of a scintillation cocktail is deposited in each
designated well. Thereafter, the amount of label retained in each
designated well is quantified.
In more preferred embodiments, the liquid dispensing station and arm are
capable of depositing aliquots in at least eight wells simultaneously and
the wash station is capable of filling and aspirating ninety-six wells
simultaneously. Preferred robots are capable of processing at least 640
and preferably at least about 1,280 candidate agents every 24 hours, e.g.
in microtiter plates. Of course, useful agents are identified with a range
of other assays (e.g. gel shifts, etc.) employing the subject hNFAT and
hNFAT fragments.
The subject hNFAT and hNFAT fragments and nucleic acids provide a wide
variety of uses in addition to the in vitro binding assays described
above. For example, cell-based assays are provided which involve
transfecting a T-cell antigen receptor expressing cell with an hNFAT
inducible reporter such as luciferase. Agents which modulate hNFAT
mediated cell function are then detected through a change in the reporter.
The following examples are offered by way of illustration and not by way of
limitation.
EXPERIMENTAL
Investigation of the antigen inducible expression of the IL-2 gene led to
the discovery of the regulatory transcription factor NFAT (Nuclear Factor
of Activated T cells) (Durand et al. 1988; Shaw et al. 1988). Like several
other transcription factors involved in mediating signal transduction, the
activity of NFAT is regulated by subcellular localization. In resting T
cells NFAT activity is restricted to the cytoplasm; stimulation of the T cell
receptor leads to translocation of NFAT to the nucleus. Movement of NFAT
to the nucleus is dependent on the activation of the calcium-regulated
phosphatase calcineurin (Clipstone and Crabtree 1992). The
immunosuppressive drugs cyclosporin and FK506 inhibit the activity of
calcineurin, and thereby prevent the nuclear localization of NFAT and
subsequent activation of cytokine gene expression (reviewed in Schreiber
and Crabtree 1992).
Activation of the T cell antigen receptor induces two signalling pathways
required for IL-2 induction: one is the cyclosporin-sensitive,
calcium-dependent pathway and the other relies on the activation of
protein kinase C (PKC). Antigenic stimulation of these pathways can be
mimicked by treating cells with a calcium ionophore and a phorbol ester.
The PKC-inducible activity was found to be mediated by fos and jun
proteins (Jain et al. 1992; Northrop et al. 1993). The NFAT binding site
in the IL-2 promoter is adjacent to a weak binding site for AP-1 proteins,
and NFAT and AP-1 proteins bind cooperatively to this composite element
(Jain et al. 1993; Northrop et al. 1993). The transcriptional activation
mediated by AP-1 proteins through this site appears to be critical for
IL-2 expression in activated T cells. There are several different
combinations of fos and jun family members that can interact with NFAT to
bind DNA (Boise et al. 1993; Northrop et al. 1993; Jain et al. 1994;
Yaseen et al. 1994). Therefore, the composition of the AP-1 complex that
interacts with NFAT may vary in different cell types and different stages
of T cell activation. NFAT was originally reported to be a T cell specific
transcription factor critical for the restricted expression of IL-2 (Shaw
et al. 1988). More recently, NFAT activity was detected in B cells
(Brabletz et al. 1991; Yaseen et al. 1993; Choi et al. 1994; Venkataraman
et al. 1994). This is consistent with the finding that, in transgenic
mice, the major sites of expression of a reporter gene regulated by the
IL-2 NFAT/AP-1 site are activated T and B cells (Verweij et al. 1990).
In addition to IL-2, NFAT sites have been discovered in the promoters of
several other cytokine genes, including IL-4 (Chuvpilo et al. 1993; Szabo
et al. 1993; Rooney et al. 1994), IL-3 (Cockerill et al. 1993), GM-CSF
(Masuda et al. 1993), and TNF-.alpha. (Goldfeld et al. 1993). Thus, it appears
that NFAT proteins are involved in the coordinate regulation of many
different cytokines in activated lymphocytes. As with IL-2, most of the
NFAT sites in other cytokine promoters are composite elements that also
contain AP-1 binding sites (Rao, 1994).
Distinct genes encoding NFAT proteins have now been isolated (Jain et al.
1993; McCaffrey et al. 1993; Northrop et al. 1994; Hoey et al., in press).
Two of these genes, designated NFATp and NFATc, encode related proteins
that are highly similar to each other within a 290 amino acid domain. This
NFAT homology region shares weak sequence similarity with the DNA binding
and dimerization domain of the rel family of transcription factors
(reviewed in Nolan 1994). There is evidence that both NFATp and NFATc may
be involved in mediating transcriptional regulation in activated T cells.
For example, NFATp forms a specific complex on DNA with fos and jun that
activates transcription in vitro (McCaffrey et al. 1993). NFATc has been
shown to activate IL-2 expression by a cotransfection assay in T cells
(Northrop et al. 1994). Furthermore, both proteins appear to be modified
by calcineurin (Jain et al. 1993; Northrop et al. 1994). In addition to
NFATp and NFATc, we have isolated two new members of the human NFAT gene
family. We have used these clones to examine the tissue distribution of
the different NFAT genes. We have also expressed and purified the DNA
binding domains of the NFAT family proteins and investigated their
biochemical activities.
Results
1. Cloning of Human NFAT Genes
cDNA libraries were prepared from Jurkat T cells and human peripheral blood
lymphocytes, and screened using a probe derived from the rel similarity
region of the murine NFATp gene (McCaffrey et al. 1993). Cross-hybridizing
clones were isolated, sequenced, and determined to be derived from 4
distinct genes.
One of the genes isolated in this study is related to the murine NFATp gene
(McCaffrey et al. 1993), and another is identical to the NFATc gene
(Northrop et al. 1994). We have isolated two classes of NFATp cDNAs which
are the result of alternative splicing upstream of the rel domain. One
form is similar to the cDNA reported by McCaffrey et al., while the other
is alternatively spliced upstream of the rel similarity region; in
particular, this form is missing an exon encoding the region near the
N-terminus of the protein (SEQ ID NO: 1, base pairs 357-867) and has a
different initiating methionine (SEQ ID NO: 1, base pairs 880-882).
In addition to these previously identified genes, we cloned two novel
members of the NFAT gene family, hereby designated as NFAT3 and NFAT4. The
NFAT3 sequence was obtained from three overlapping cDNAs spanning 2880 bp,
and deduced to encode a protein of 902 amino acids. We obtained three
classes of NFAT4 cDNAs that resulted from alternative splicing downstream
of the rel homology domain. These three types of cDNAs encode proteins
that vary in sequence and length at their C-terminal ends. The three forms
are designated NFAT4a, NFAT4b, and NFAT4c. The positions of splice
junctions in the coding regions are after proline 699 in NFAT4a and after
valine 700 and proline 716 in NFAT4b and NFAT4c.
All of the NFAT genes are at least 65% identical to each other within a 290
amino acid domain. This domain is related to the DNA binding and
dimerization domain of the rel family of transcription factors (Nolan
1994; Northrop et al. 1994). Among the different NFAT genes, the
N-terminal and central portions of the rel similarity domain are more
highly conserved than the C-terminus.
Aside from the strikingly similar rel domains shared by all four NFAT
genes, the NFAT family members have smaller regions of sequence similarity
on the amino terminal side of the rel domains. The amino terminal regions
of NFAT4 and NFATc have several regions of significant similarity. The two
largest regions contain 23 of 41 and 24 of 45 identical amino acids
between the two proteins. Both of these regions are rich in serine and
proline residues. NFATp and NFAT3 also have some similarity to the other
NFAT proteins in this region, although it is less extensive than that
shared between NFAT4 and NFATc. The homology between NFAT3 and NFAT4
extends about 25 amino acids upstream of the rel similarity region.
2. Expression Patterns of the NFAT Genes
On the basis of previous reports, expression of NFAT genes was expected to
be restricted to lymphocytes (Shaw et al. 1988; Verweij et al. 1990;
McCaffrey et al. 1993; Northrop et al. 1994). The expression of each NFAT
gene was tested by Northern blot using RNA from sixteen different human
tissues. For NFATp, expression of an mRNA approximately 7.5 kb was
detected in almost all human tissues. The expression was slightly higher
in PBLs and placenta. NFATc expression was also detected at a low level in
several different tissues. The NFATc probe hybridized to two bands of
approximately 2.7 and 4.5 kb. Surprisingly, the 4.5 kb NFATc transcript
was strongly expressed in skeletal muscle. The 2.7 kb mRNA appears to
correspond to the previously described NFATc clone (Northrop et al. 1994).
NFAT3 exhibited a very complicated expression pattern with at least 3 major
RNA bands between 3 and 5 kb. The major sites of NFAT3 expression were
observed outside the immune system. NFAT3 was highly expressed in
placenta, lung, kidney, testis and ovary. In contrast, NFAT3 expression
was very weak in spleen and thymus and undetectable in PBLs.
NFAT4 was expressed predominately as a 6.5 kb message. Like NFATc it was
strongly expressed in skeletal muscle. NFAT4 also displayed relatively
high expression in thymus. The probe for the NFAT4 northerns contained the
3' half of the NFAT homology region as well as downstream regions from the
NFAT4c class of cDNA. This probe should hybridize to all three classes of
NFAT4 transcripts. Only one form is detected in the Northern blots,
suggesting that the 4c class is the most abundant transcript.
These results indicate that each of the NFAT genes is expressed in a
distinct tissue-specific pattern. Furthermore, none of the NFAT genes are
restricted to lymphocytes.
3. DNA Binding Activity of the NFAT Proteins
The rel similarity regions along with a small amount of flanking sequences
of each of the four classes of NFAT proteins were expressed in E. coli.
Each of the 4 proteins was well expressed and soluble. The proteins were
purified to near homogeneity by DNA affinity chromatography (Kadonaga and
Tjian 1986). The binding site used for purification was a high affinity
NFAT site derived from the IL-4 promoter with the core binding sequence
GGAAAATTTT (SEQ ID NO: 15) (Rooney et al. 1994).
The binding specificities of the NFAT proteins were tested on two known
functional binding sites, the IL-4 promoter NFAT site and the NFAT binding
site in the distal antigen response element from the IL-2 promoter (Durand
et al. 1988; Shaw et al. 1988). All the proteins were able to bind the
IL-4 promoter site. NFATp, NFATc, and NFAT3 recognized this sequence with
very similar affinity, while NFAT4 bound this sequence with lower affinity
(>10-fold) than the other three proteins in this assay. NFAT4 protein may
have a different optimum binding sequence than the other NFAT proteins.
The same amounts of the four NFAT proteins were tested on the NFAT binding
site from the IL-2 promoter. This NFAT site (GGAAAAACTG) (SEQ ID NO:16)
has three differences relative to the IL-4 site which make it a weaker
site for all four NFAT proteins. The NFAT proteins differ in their ability
to recognize this site independently. NFATp had the highest relative
affinity for the IL-2 binding site, while NFATc and NFAT3 bound weakly to
this site and NFAT4 binding was not detectable in this assay.
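The statement that the IL-2 site has three differences relative to the IL-4 site can be checked with a short position-by-position comparison. The following is a minimal sketch; the function name `site_differences` is hypothetical, and only the two core sequences quoted above are taken from the text.

```python
# Core NFAT binding sequences quoted in the text.
IL4_SITE = "GGAAAATTTT"  # SEQ ID NO: 15
IL2_SITE = "GGAAAAACTG"  # SEQ ID NO: 16


def site_differences(a, b):
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sites must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


mismatches = site_differences(IL4_SITE, IL2_SITE)
print(len(mismatches), mismatches)
```

The mismatches all fall in the last four positions of the core sequence, while the first six bases (GGAAAA) are shared by both sites.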
The IL-2 NFAT site is part of a composite element that is adjacent to a
weak AP-1 site (TGTTTCA) (Jain et al. 1992; Northrop et al. 1993). To
determine if there were any differences in the ability of NFAT proteins to
interact with AP-1, the four NFAT proteins were tested with AP-1 for
binding to the IL-2 site. When tested alone all the NFAT proteins, as well
as the AP-1 proteins, bound relatively weakly to the IL-2 composite
element. The combination of c-jun and fra1 with each of the four NFAT
proteins resulted in highly cooperative DNA binding. In the presence of
the AP-1 protein the four NFAT proteins bound to the IL-2 site with very
similar affinity. In all cases, jun homodimers were not as effective as
jun-fra1 heterodimers in promoting cooperative binding in the gel shift
assay. These results indicate that the DNA binding and protein interaction
specificity of the NFAT proteins are very similar. Indeed, the
interactions of the four NFAT proteins with these AP-1 proteins appear to
be identical. NFAT4 did not bind independently to this site, but
recognized this site with the same affinity as the other NFAT proteins in
the presence of AP-1.
4. Transcriptional Activation by the NFAT Proteins
Having established that the DNA binding properties of the four NFAT
proteins are quite similar, we investigated their transcriptional
activation potentials. We used a transient transfection assay into Jurkat
T cells to measure the ability of the NFAT proteins to activate the IL-2
promoter. The IL-2 promoter was chosen because it is a critical regulatory
target for NFAT and has at least two functional NFAT binding sites (Randak
et al. 1990). Activation of this promoter by antigenic stimulation can be
mimicked by treatment with phorbol esters, such as phorbol 12-myristate
13-acetate (PMA), together with ionomycin, a calcium ionophore.
Each of the four NFAT genes was transfected into Jurkat cells, and their
ability to activate the IL-2 promoter was tested with various combinations
of PMA and ionomycin. Treatment of the cells with PMA plus ionomycin
induced strong activation by the endogenous NFAT proteins in Jurkat cells.
Transfection of each of the four NFAT genes resulted in an
additional stimulation of the IL-2 promoter of between 4- and 8-fold. Activation
of the IL-2 promoter by each of the NFAT proteins was dependent on both
PMA and ionomycin.
We also tested the ability of NFAT to activate transcription in COS and
HepG2 cells using a synthetic reporter gene consisting of three copies of an
NFAT/AP-1 composite element. Transfection of each of the four NFAT genes into
HepG2 cells resulted in activation of the reporter gene of at least
20-fold in the presence of PMA and ionomycin. In contrast to Jurkat cells,
NFAT3 was more potent than the others in the HepG2 transfections,
resulting in 140-fold activation. Another difference between the results
of HepG2 and Jurkat cells is that the NFAT proteins appeared to activate
transcription in the absence of PMA or calcium ionophore.
In COS cells NFAT3 produced a striking 50-fold activation that was observed
independently of PMA and ionomycin treatment. NFAT3 was found to stimulate
transcription in COS cells much more strongly than the other proteins.
5. NFAT Proteins are Active as Monomers
There are many similar features of the NFAT and rel families of
transcription factors. Rel proteins form homo- and heterodimers in
solution, and dimerization is required for DNA binding (reviewed in
Baeuerle and Henkel 1994). The C-terminal half of the rel homology domain
is thought to be involved in mediating dimerization. Since the similarity
between NFAT and the rel families extends throughout the 300 amino acid
rel domain, and the rel domain of the NF-.kappa.B proteins is sufficient for
dimer formation, we expected that the NFAT proteins might also function
as dimers. To test this idea we determined the native masses of the NFAT
proteins by gel filtration chromatography and glycerol gradient
centrifugation. For these experiments we used the rel similarity regions
of NFATp and NFATc that were expressed in E. coli and purified by DNA
affinity chromatography. The molecular weights of these proteins are 40.4
and 35.6 kD, respectively. As a control we used purified NF-.kappa.B p50 protein
that is known to exist as a stable dimer in solution (Baeuerle and
Baltimore 1989). The p50 protein is 45.8 kD calculated from its amino acid
sequence.
On both the gel filtration column and the glycerol gradient, the NFATp and
NFATc rel domains migrated at a position close to their actual molecular
weight. Under the same conditions, p50 behaved as a species that was larger
than its monomer molecular weight. The data from the gel filtration column
was used to calculate the Stokes radius of each protein, and the S values
were determined by glycerol gradient sedimentation. These two properties
were used to calculate the apparent molecular size of the proteins (Siegel
and Monty 1966; Thompson et al. 1991). The apparent molecular sizes of the
NFATp and NFATc rel domains were determined to be 42 kD and 32 kD
respectively. These values are close to the monomer molecular weight for
both NFAT proteins. As expected, p50 exhibited an apparent molecular size
close to that of a dimer.
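The monomer conclusion follows from comparing each apparent native size with the mass calculated from the amino acid sequence. The sketch below uses only the NFATp and NFATc values quoted above; the helper name `subunit_ratio` is hypothetical. The text gives no numerical apparent size for p50, so p50 is left out.

```python
# Values quoted in the text (kD): calculated monomer mass and apparent native size.
PROTEINS = {
    "NFATp rel domain": {"monomer_kd": 40.4, "apparent_kd": 42.0},
    "NFATc rel domain": {"monomer_kd": 35.6, "apparent_kd": 32.0},
}


def subunit_ratio(apparent_kd, monomer_kd):
    """Apparent native size divided by the calculated monomer mass."""
    return apparent_kd / monomer_kd


ratios = {name: subunit_ratio(**values) for name, values in PROTEINS.items()}
for name, ratio in ratios.items():
    print(f"{name}: ratio {ratio:.2f}, nearest whole number of subunits {round(ratio)}")
```

A ratio near 1 is what a monomer would give, whereas a stable dimer would be expected to give a ratio near 2.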
After determining that NFAT rel domains were monomers in solution, we then
considered the possibility that NFAT proteins might form dimers when bound
to DNA. To address this question we carried out gel mobility shift assays
with two different sized versions of NFATc translated in vitro (Hope and
Struhl 1987). The shorter version contains the rel similarity region and a
small amount of flanking residues and is referred to as NFATc-309. This
construct is equivalent to the one that was expressed in E. coli. The
larger version, NFATc-589, contains additional N-terminal sequences. When
expressed individually in a rabbit reticulocyte lysate both versions of
NFATc were active and produced protein-DNA complexes with different
mobilities. When the two different NFATc proteins were mixed by
co-translation the same protein-DNA complexes were apparent and no
intermediate species was detectable, as would be expected if the proteins
were forming dimers on the DNA. These results suggest that NFAT proteins
are capable of sequence-specific DNA binding as monomers.
Methods
1. Isolation of Human NFAT Clones
Peripheral blood lymphocytes (PBLs) were isolated from 2 units of blood
(obtained from Irwin Memorial Blood Bank, San Francisco) by fractionation
on sodium metrizoate/polysaccharide (Lymphoprep, Nycomed) gradients.
Jurkat T cells were grown in RPMI+10% fetal bovine serum. Total RNA was
isolated from Jurkat cells or peripheral blood lymphocytes according to
the Guanidinium-HCl method (Chomczynski and Sacchi 1987). Poly-A+ RNA was
purified using oligo-dT magnetic beads (Promega). Random primed and
oligo-dT primed libraries were prepared from both Jurkat and PBL RNA
samples. The cDNA libraries were constructed in the vector Lambda ZAPII
(Stratagene) according to the protocol supplied by the manufacturer. The
cDNA was size selected for greater than 1 kb by electrophoresis on a 5%
polyacrylamide gel prior to ligation. Each library contained approximately
2.times.10.sup.6 recombinant clones. Each of the four libraries was
screened independently under the same conditions.
The probe for the initial library screen was a 372 bp fragment derived by
PCR from the C-terminal half of the rel homology domain of the mouse NFATp
gene. This region corresponds to amino acids 370 through 496 in the
published mNFATp sequence (McCaffrey et al. 1993). The fragment was
labeled by random priming and hybridized in 1M NaCl, 50 mM Tris pH 7.4, 2
mM EDTA, 10.times. Denhardt's, 0.05% SDS, and 50 .mu.g/ml salmon sperm DNA at
60.degree. C. The filters were washed first in 2.times. SSC, 0.1% SDS, and
then in 1.times. SSC, 0.1% SDS at 60.degree. C. Hybridizing clones were
purified and converted into Bluescript plasmid DNA clones. The DNA
sequence was determined using thermal cycle sequencing and the Applied
Biosystems 373A sequencer. Approximately 50 clones were isolated from the
first set of screens. Sequence analysis and cross-hybridization
experiments indicated that these clones were derived from 4 distinct
genes. For NFAT4, additional cDNA clones were obtained from a skeletal
muscle cDNA library (Stratagene). The 5' ends of the cDNA clones were
obtained from a Jurkat cDNA library prepared as described above with gene
specific primers for each of the NFAT genes.
2. Northerns
The northern blots with mRNA isolated from human tissues were purchased
from Clontech. DNA probes were labeled by random priming and hybridized in
5.times. SSPE, 10.times. Denhardt's, 50% formamide, 2% SDS, 100 .mu.g/ml
salmon sperm DNA at 42.degree. C. The filters were washed in 2.times.
SSC/0.05% SDS at room temperature, and subsequently in 0.1.times. SSC/0.1%
SDS at 60.degree. C. For NFATp the probe was a 1.2 kb cDNA fragment
containing the entire rel similarity region of NFATp. For NFATc, the probe
was a 291 nucleotide PCR fragment corresponding to the 3' end of the rel
similarity region (amino acids 597 to 693; Northrop et al. 1994). For
NFATc, a different set of blots was hybridized with a 0.8 kb cDNA fragment
located upstream of the rel domain. The two different NFATc probes
produced identical results. For NFAT3, the probe was a 0.6 kb fragment
located downstream of the rel similarity region corresponding to the
region encoding amino acid 720 through the 3' end of the clone. For NFAT4,
the probe was a 1.3 kb cDNA fragment corresponding to residue 549 to 963
from the 4c class of cDNAs.
3. Protein Expression and Purification
E. coli expression vectors for each NFAT protein were constructed in the T7
polymerase expression vector pT7-HMK, which has an eight amino acid heart
muscle kinase (hmk) site at the N-terminus. NdeI sites were introduced by
PCR using mutagenic oligonucleotides in the coding regions upstream of the
NFAT rel domains, and these restriction sites were subsequently used for
cloning into pT7-HMK. The sizes of the different proteins (without the hmk
sequences) are as follows: NFATp, 353 amino acids (the residues homologous
to 185 through 537 according to McCaffrey et al. 1993); NFATc, 309 amino
acids (amino acids 408 through 716 according to Northrop et al. 1994);
NFAT3, 345 amino acids (residues 400 through 744); NFAT4, 316 amino acids
(residues 393 through 708).
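The stated fragment sizes can be cross-checked against the residue ranges, since an inclusive range from residue i to residue j spans j - i + 1 amino acids. A minimal sketch follows; the dictionary layout and the name `span_length` are illustrative only, and the numbers are those given above.

```python
# (first residue, last residue, stated length in amino acids), as given in the text.
FRAGMENTS = {
    "NFATp": (185, 537, 353),
    "NFATc": (408, 716, 309),
    "NFAT3": (400, 744, 345),
    "NFAT4": (393, 708, 316),
}


def span_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


checks = {
    name: span_length(first, last) == stated
    for name, (first, last, stated) in FRAGMENTS.items()
}
print(checks)
```

All four stated sizes agree with their residue ranges.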
Proteins were expressed using the T7 polymerase expression system in the
strain BL21(DE3) (Studier and Moffat 1986). Expression was induced by
addition of 0.4 mM IPTG, and the cultures were shaken for 4 hours at room
temperature. The cells were harvested, washed in PBS, resuspended in 0.4M
KCl-HEG (25 mM HEPES pH 7.9; 0.1 mM EDTA; 10% glycerol; 0.2% NP-40; 2 mM
DTT, 0.2 mM PMSF, 0.2 mM sodium metabisulfite) and lysed by two cycles of
freeze-thawing followed by sonication. The lysate was spun for 10 min to
remove insoluble material. NFAT proteins were purified from the soluble
fractions of the extracts by DNA affinity chromatography (Kadonaga and
Tjian 1986). The binding site sequence for the affinity resin was from the
IL-4 promoter, TACATTGGAAAATTTTATTACAC (SEQ ID NO:17). The DNA was
biotinylated on one strand and coupled to avidin agarose beads (Sigma) at
a concentration of approximately 1 mg DNA/ml. Approximately 10 mg of E.
coli extracts containing the recombinant NFAT proteins were loaded on 1.5
ml DNA columns equilibrated with 0.1M KCl-HEG. The columns were washed
successively with 0.1, 0.2, and 0.4M KCl-HEG. The specifically bound NFAT
proteins were eluted with 1.0M KCl-HEG.
Fra-1 was expressed in E. coli from the vector pET11 (Novagen). The protein
was purified from the soluble fraction to approximately 80% homogeneity by
fractionation on heparin-sepharose. c-Jun protein was expressed in E. coli
and purified from the insoluble portion of the extract as previously
described (Bohmann and Tjian, 1989). The concentrations of the purified
proteins were determined by comparing the intensity of coomassie staining
with the staining intensity of BSA standards.
4. DNA Binding Experiments
Electrophoretic mobility shift assays were performed with the indicated
amounts of proteins in 50 mM KCl, 25 mM HEPES, 0.05 mM EDTA, 5% glycerol,
1 mM DTT with 1 .mu.g of poly(dI-dC) and 100 ng of BSA. The binding reactions
and electrophoresis were carried out at room temperature. The samples were
run on a 5% polyacrylamide, 0.5.times. TBE gel at 200 V.
5. Transfections
The full-length coding regions for each of the NFAT genes were subcloned
into the RSV expression vector pREP4 (Invitrogen). The reporter plasmid
was pXIL2-Luc (constructed by Jim Fraser). It contains the IL-2 promoter
(-326 to +47, as in Durand et al. 1988) upstream of the luciferase gene.
Approximately 1.times.10.sup.6 Jurkat cells were transiently transfected
by lipofection (Lipofectin, Gibco/BRL). Twenty hours after transfection
the cells were treated with 25 ng/ml PMA and 2 .mu.M ionomycin, and the cells
were harvested 8 hours after induction. Transfection efficiencies were
standardized by co-transfection of pRSV-.beta.gal and subsequent determination
of .beta.gal activity. Each transfection contained 2 .mu.g of expression vector, 5
.mu.g of luciferase reporter, 1 .mu.g of .beta.gal plasmid and 10 .mu.l of
lipofectin. COS-7 and HepG2 cells were transfected by a modification of
the calcium phosphate method (Chen and Okayama 1987). The reporter gene
contained three copies of the antigen response element (-286 to -257)
upstream of the herpes virus tk minimal promoter (-50 to +28) in the
luciferase vector pGL2 (Promega).
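The standardization of transfection efficiency described in this section amounts to dividing each luciferase reading by the beta-galactosidase activity from the same transfection, then expressing the result relative to a control. The sketch below illustrates that arithmetic only. The function names and all readings are hypothetical placeholders, and the text does not specify the exact calculation used.

```python
def normalized_activity(luciferase, beta_gal):
    """Luciferase signal corrected for transfection efficiency."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def fold_activation(sample, control):
    """Ratio of normalized activities; each argument is (luciferase, beta_gal)."""
    return normalized_activity(*sample) / normalized_activity(*control)


# Hypothetical readings in arbitrary units, not data from the experiments.
vector_only = (1000.0, 2.0)   # reporter plus empty expression vector
with_nfat = (6000.0, 1.5)     # reporter plus an NFAT expression vector
print(fold_activation(with_nfat, vector_only))
```

Normalizing before taking the ratio keeps a well-transfected sample from appearing more strongly activated than it is.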
6. Gel Filtration Columns and Glycerol Gradients
Protein samples were run on a 2.4 ml Superdex-200 column using the
Pharmacia Smart system. The column was equilibrated with 0.5M KCl-HEG at a
flow rate of 80 .mu.l/min. The elution volumes of purified NFATc, NFATp, and
p50 were determined relative to those of molecular weight standards.
Purified p50 was provided by Zhaodan Cao. The following molecular weight
standards (10 mg) were chromatographed on separate runs: thyroglobulin
(669 kD), b-amylase (200 kD), BSA (66 kD), carbonic anhydrase (29 kD), and
cytochrome c (12 kD). The elution volume (V.sub.e) was converted to
K.sub.av by the equation, K.sub.av =(V.sub.e -V.sub.o) /V.sub.i, where
V.sub.o is the void volume and V.sub.i is the included volume. The Stokes
radii were determined from a plot of (-log K.sub.av).sup.1/2 vs. the
Stokes radii of the standards (Ackers 1970).
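The calibration just described can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: it assumes that "log" means the base-10 logarithm (the choice of base rescales the ordinate but does not change the interpolated radius) and that the calibration line is fitted by ordinary least squares. All names and all numerical values in the demonstration are placeholders, since the Stokes radii of the standards and the measured elution volumes are not given in this section.

```python
import math


def k_av(v_e, v_o, v_i):
    """K_av = (V_e - V_o) / V_i, as defined in the text."""
    return (v_e - v_o) / v_i


def calibration_ordinate(v_e, v_o, v_i):
    """(-log K_av)^(1/2); only defined for 0 < K_av < 1."""
    k = k_av(v_e, v_o, v_i)
    if not 0.0 < k < 1.0:
        raise ValueError("K_av must lie strictly between 0 and 1")
    return math.sqrt(-math.log10(k))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stokes_radius(v_e, v_o, v_i, standards):
    """Interpolate a Stokes radius from (elution_volume, known_radius) standards."""
    radii = [radius for _, radius in standards]
    ordinates = [calibration_ordinate(ve, v_o, v_i) for ve, _ in standards]
    slope, intercept = fit_line(radii, ordinates)
    return (calibration_ordinate(v_e, v_o, v_i) - intercept) / slope


# Placeholder calibration data: (elution volume in ml, Stokes radius in nm).
placeholder_standards = [(1.00, 8.5), (1.20, 4.8), (1.40, 3.6), (1.60, 2.4), (1.75, 1.7)]
print(stokes_radius(1.30, 0.90, 1.30, placeholder_standards))
```

The radius comes back in whatever unit the standards were entered in, so the unit conversion is left to the caller.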
The S values were determined by glycerol gradient centrifugation. Five-ml
10-30% glycerol gradients were prepared using a Beckman density gradient
former. The samples were centrifuged in a SW50Ti rotor at 39,000 rpm for
40 hours. After centrifugation, 200-.mu.l fractions were collected and
analyzed by gel electrophoresis and Coomassie staining. The S values were
determined by their sedimentation positions relative to the standards.
Native molecular sizes were determined from the Stokes radii (a), S values
(s), and the partial specific volumes (V) by the method of Siegel and
Monty using the equation M=6.pi..eta.Nas/(1-V.rho.), where .eta. is the
viscosity of the medium, .rho. is the density of the medium, and N is
Avogadro's number (Siegel and Monty 1966, Thompson et al. 1991).
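The Siegel and Monty calculation combines the Stokes radius and the sedimentation coefficient into a native molecular mass. The Python sketch below writes the relation in its standard form, M = 6.pi..eta.Nas/(1-V.rho.), with the solvent viscosity .eta. and density .rho. made explicit. The default values (viscosity 0.01 poise, density 1.0 g/cm.sup.3, partial specific volume 0.73 cm.sup.3 /g) are typical textbook approximations assumed here for illustration; they are not values reported in this section, and the example radius and S value are hypothetical.

```python
import math

AVOGADRO = 6.022e23         # molecules per mole
ANGSTROM_TO_CM = 1e-8       # 1 angstrom in centimeters
SVEDBERG_TO_SECONDS = 1e-13  # 1 Svedberg unit in seconds


def native_mass(stokes_radius_angstrom, s_value_svedberg,
                partial_specific_volume=0.73,  # cm^3/g, assumed typical protein value
                viscosity=0.01,                # poise, assumed (about water at 20 C)
                density=1.0):                  # g/cm^3, assumed
    """Siegel-Monty mass in g/mol: M = 6*pi*eta*N*a*s / (1 - vbar*rho)."""
    a_cm = stokes_radius_angstrom * ANGSTROM_TO_CM
    s_sec = s_value_svedberg * SVEDBERG_TO_SECONDS
    buoyancy = 1.0 - partial_specific_volume * density
    if buoyancy <= 0:
        raise ValueError("1 - vbar*rho must be positive")
    return 6.0 * math.pi * viscosity * AVOGADRO * a_cm * s_sec / buoyancy


# Hypothetical inputs: a 35.5 angstrom Stokes radius and a 4.3 S particle.
print(round(native_mass(35.5, 4.3)))
```

With these assumed inputs the result falls in the mid-60 kD range, which is the order of magnitude expected for a monomeric globular protein of that radius and S value.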
7. References Cited in Experimental Section
Ackers (1970) Adv. Prot. Chem. 24:343-446; Baeuerle and Baltimore (1989)
Genes & Dev. 3:1689-1698; Baeuerle and Henkel (1994) Annu. Rev. Immunol.
12:141-179; Boise et al. (1993) Mol. Cell. Biol. 13:1911-1919; Brabletz
et al. (1991) Nucl. Acids Res. 19:61-67; Chen and Okayama (1987) Mol.
Cell. Biol. 7:2745-2752; Choi et al. (1994) Immunity 1:179-187;
Chomczynski and Sacchi (1987) Anal. Biochem. 162:156-159; Chuvpilo et al.
(1993) Nucl. Acids Res. 21:5694-5704; Clipstone and Crabtree (1992) Nature
357:695-697; Cockerill et al. (1993) Proc. Natl. Acad. Sci. USA
90:2466-2470; Durand et al. (1988) Mol. Cell. Biol. 8:1715-1724; Goldfeld
et al. (1993) J. Exp. Med. 178:1365-1379; Grabstein et al. (1994) Science
264:965-968; Hohlfeld and Engel (1994) Immunol. Today 15:269-274; Hoyos et
al. (1989) Science 244:457-460; Hope and Struhl (1987) EMBO J.
6:2781-2784; Jain et al. (1992) Nature 356:801-803; Jain et al. (1993)
Nature 365:352-355; Jain et al. (1993) J. Immunol. 151:837-848; Jain et
al. (1994) Mol. Cell. Biol. 14:1566-1574; Kadonaga and Tjian (1986) Proc.
Natl. Acad. Sci. USA 83:5889-5893; Masuda (1993) Mol. Cell. Biol.
13:7399-7407; McCaffrey et al. (1993) Science 262:750-754; McCaffrey et
al. (1993) J. Biol. Chem. 268:3747-3752; Mouzaki and Rungger (1994) Blood
84:2612-2621; Nolan (1994) Cell 77:795-798; Northrop (1994) Nature
369:497-502; Northrop (1993) J. Biol. Chem. 268:2917-2923; Randak (1990)
EMBO J. 9:2529-2536; Rooney (1994) EMBO J. 13:625-633; Schreiber and
Crabtree (1992) Immunol. Today 13:136-142; Shaw (1988) Science
241:202-205; Siegel and Monty (1966) Biochim. Biophys. Acta 112:346-362;
Studier and Moffatt (1986) J. Mol. Biol. 189:113-130; Szabo (1993) Mol.
Cell. Biol. 13:4793-4805; Thompson et al. (1991) Science 253:762-768;
Venkataraman et al. (1994) Immunity 1:189-196; Verweij et al. (1990) J.
Biol. Chem. 265:15788-15795; Yaseen et al. (1994) Mol. Cell. Biol.
14:6886-6895; and Yaseen et al. (1993) J. Biol. Chem. 268:14285-14293.
The following examples are offered by way of illustration and not by way of
limitation.
EXAMPLES
1. Protocol for hNFAT-hNFAT Dependent Transcription Factor Binding Assay.
A. Reagents:
hNFAT: 20 .mu.g/ml in PBS.
Blocking buffer: 5% BSA, 0.5% Tween 20 in PBS; 1 hr, RT.
Assay Buffer: 100 mM KCl, 20 mM HEPES pH 7.6, 0.25 mM EDTA, 1% glycerol,
0.5% NP-40, 50 mM BME, 1 mg/ml BSA, cocktail of protease inhibitors.
.sup.33 P hNFAT 10.times. stock: 10.sup.-8 -10.sup.-6 M "cold" hNFAT
homolog supplemented with 200,000-250,000 cpm of labeled hNFAT homolog
(Beckman counter). Place in the 4.degree. C. microfridge during screening.
Protease inhibitor cocktail (1000.times.): 10 mg Trypsin Inhibitor (BMB
#109894), 10 mg Aprotinin (BMB #236624), 25 mg Benzamidine (Sigma
#B-6506), 25 mg Leupeptin (BMB #1017128), 10 mg APMSF (BMB #917575), and 2
mM NaVO.sub.3 (Sigma #S-6508) in 10 ml of PBS.
B. Preparation of Assay Plates:
Coat with 120 .mu.l of stock hNFAT per well overnight at 4.degree. C.
Wash 2.times. with 200 .mu.l PBS.
Block with 150 .mu.l of blocking buffer.
Wash 2.times. with 200 .mu.l PBS.
C. Assay:
Add 80 .mu.l assay buffer/well.
Add 10 .mu.l compound or extract.
Add 10 .mu.l .sup.33 P-NFAT (20,000-25,000 cpm/0.3
pmoles/well=3.times.10.sup.-9 M final concentration).
Shake at 25.degree. C. for 15 min.
Incubate additional 45 min. at 25.degree. C.
Stop the reaction by washing 4.times. with 200 .mu.l PBS.
Add 150 .mu.l scintillation cocktail.
Count in Topcount.
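The concentrations stated in this protocol can be checked with simple arithmetic. The Python sketch below assumes that the final well volume is the sum of the three additions listed above (80 + 10 + 10 = 100 .mu.l) and that the 10.times. stock is the solution dispensed in the 10 .mu.l addition; the function name is illustrative only.

```python
def molar_concentration(picomoles, volume_ul):
    """Molarity (mol/l) of `picomoles` of solute dissolved in `volume_ul` microliters."""
    return (picomoles * 1e-12) / (volume_ul * 1e-6)


# Volumes added per well in the assay: buffer, compound, labeled hNFAT.
well_volume_ul = 80 + 10 + 10

# 0.3 pmol of hNFAT per well, diluted into the full well volume.
final_molar = molar_concentration(0.3, well_volume_ul)

# The same 0.3 pmol is delivered in 10 ul, so the stock is ten times more concentrated.
stock_molar = molar_concentration(0.3, 10)

print(final_molar, stock_molar)
```

Under these assumptions the final concentration is 3.times.10.sup.-9 M, matching the stated value, and the corresponding stock concentration of 3.times.10.sup.-8 M lies within the 10.sup.-8 -10.sup.-6 M range given for the 10.times. stock.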
D. Controls for all Assays (Located on each Plate):
a. Non-specific binding (no hNFAT added)
b. cold hNFAT at 80% inhibition.
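The plate controls can be used to express each test well as a percent inhibition. The Python sketch below shows one common way of doing so and is offered as an assumption, not as the scoring method actually used: it presumes that each plate also carries an uninhibited well (labeled hNFAT with no compound), which is not listed among the controls above, and it subtracts the non-specific binding counts from every reading. The counts in the example are hypothetical.

```python
def percent_inhibition(sample_cpm, uninhibited_cpm, nonspecific_cpm):
    """Percent loss of specific binding relative to an uninhibited well.

    nonspecific_cpm is the background from the non-specific binding control;
    uninhibited_cpm is an assumed no-compound well on the same plate.
    """
    window = uninhibited_cpm - nonspecific_cpm
    if window <= 0:
        raise ValueError("uninhibited signal must exceed non-specific binding")
    specific = sample_cpm - nonspecific_cpm
    return 100.0 * (1.0 - specific / window)


# Hypothetical counts for one plate.
print(percent_inhibition(sample_cpm=2400, uninhibited_cpm=10000, nonspecific_cpm=500))
```

A well containing cold hNFAT at the concentration chosen for the 80% inhibition control would be expected to score near 80 on this scale, which gives a quick check that a plate behaved normally.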
2. Protocol for hNFAT-AP1 Dependent Transcription Factor Binding Assay.
A. Reagents:
fos-jun heterodimers (junB and fra1): 20 .mu.g/ml in PBS.
Blocking buffer: 5% BSA, 0.5% Tween 20 in PBS; 1 hr, RT.
Assay Buffer: 100 mM KCl, 20 mM HEPES pH 7.6, 0.25 mM EDTA, 1% glycerol,
0.5% NP-40, 50 mM BME, 1 mg/ml BSA, cocktail of protease inhibitors.
.sup.33 P hNFAT 10.times. stock: 10.sup.-8 -10.sup.-6 M "cold" hNFAT homolog
supplemented with 200,000-250,000 cpm of labeled hNFAT homolog (Beckman
counter). Place in the 4.degree. C. microfridge during screening.
Protease inhibitor cocktail (1000.times.): 10 mg Trypsin Inhibitor (BMB
#109894), 10 mg Aprotinin (BMB #236624), 25 mg Benzamidine (Sigma
#B-6506), 25 mg Leupeptin (BMB #1017128), 10 mg APMSF (BMB #917575), and 2
mM NaVO.sub.3 (Sigma #S-6508) in 10 ml of PBS.
B. Preparation of Assay Plates:
Coat with 120 .mu.l of stock fos-jun heterodimers per well overnight at
4.degree. C.
Wash 2.times. with 200 .mu.l PBS.
Block with 150 .mu.l of blocking buffer.
Wash 2.times. with 200 .mu.l PBS.
C. Assay:
Add 80 .mu.l assay buffer/well.
Add 10 .mu.l compound or extract.
Add 10 .mu.l .sup.33 P-NFAT (20,000-25,000 cpm/0.3
pmoles/well=3.times.10.sup.-9 M final concentration).
Shake at 25.degree. C. for 15 min.
Incubate additional 45 min. at 25.degree. C.
Stop the reaction by washing 4.times. with 200 .mu.l PBS.
Add 150 .mu.l scintillation cocktail.
Count in Topcount.
D. Controls for all assays (located on each plate):
a. Non-specific binding (no hNFAT added)
b. cold hNFAT at 80% inhibition.
3. Protocol for hNFAT-fos-jun Dependent Transcription Factor-DNA Binding
Assay.
A. Reagents:
Neutralite Avidin: 20 .mu.g/ml in PBS.
Blocking buffer: 5% BSA, 0.5% Tween 20 in PBS; 1 hr, RT.
Assay Buffer: 100 mM KCl, 20 mM HEPES pH 7.6, 0.25 mM EDTA, 1% glycerol,
0.5% NP-40, 50 mM BME, 1 mg/ml BSA, cocktail of protease inhibitors.
.sup.33 P hNFAT 10.times. stock: 10.sup.-6 -10.sup.-8 M "cold" hNFAT
homolog supplemented with 200,000-250,000 cpm of labeled hNFAT homolog
(Beckman counter) and 10.sup.-6 -10.sup.-8 M fos-jun heterodimers. Place
in the 4.degree. C. microfridge during screening.
Protease inhibitor cocktail (1000.times.): 10 mg Trypsin Inhibitor (BMB
#109894), 10 mg Aprotinin (BMB #236624), 25 mg Benzamidine (Sigma
#B-6506), 25 mg Leupeptin (BMB #1017128), 10 mg APMSF (BMB #917575), and 2
mM NaVO.sub.3 (Sigma #S-6508) in 10 ml of PBS.
Oligonucleotide stock: (specific biotinylated). Biotinylated oligo at 17
pmole/.mu.l AP1-NFAT site: (BIOTIN)-GG AGG AAA AAC TGT TTC ATA CAG AAG GCG
T (SEQ ID NO:18)
B. Preparation of Assay Plates:
Coat with 120 .mu.l of stock N-Avidin per well overnight at 4.degree. C.
Wash 2.times. with 200 .mu.l PBS.
Block with 150 .mu.l of blocking buffer.
Wash 2.times. with 200 .mu.l PBS.
C. Assay:
Add 40 .mu.l assay buffer/well.
Add 10 .mu.l compound or extract.
Add 10 .mu.l .sup.33 P-NFAT (20,000-25,000 cpm/0.1-10 pmoles/well=10.sup.-9
-10.sup.-7 M final concentration).
Shake at 25.degree. C. for 15 min.
Incubate additional 45 min. at 25.degree. C.
Add 40 .mu.l oligo mixture (1.0 pmoles/40 .mu.l in assay buffer with 1 ng
of ss-DNA)
Incubate 1 hr at RT.
Stop the reaction by washing 4.times. with 200 .mu.l PBS.
Add 150 .mu.l scintillation cocktail.
Count in Topcount.
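Preparing the oligo mixture for a full plate is a dilution from the 17 pmole/.mu.l biotinylated stock to 1.0 pmole per 40 .mu.l. The Python sketch below works through that arithmetic. The function names, the 96-well plate size, and the optional pipetting overage are assumptions introduced for illustration; only the stock concentration, the per-well amount, and the per-well volumes are taken from the protocol.

```python
def oligo_mix(wells, pmol_per_well=1.0, volume_per_well_ul=40.0,
              stock_pmol_per_ul=17.0, overage=0.0):
    """Return (stock_ul, assay_buffer_ul) needed to prepare the oligo mixture.

    overage is a hypothetical fractional excess (e.g. 0.1 for 10%) to cover
    pipetting losses; it is not part of the written protocol.
    """
    scale = wells * (1.0 + overage)
    stock_ul = scale * pmol_per_well / stock_pmol_per_ul
    total_ul = scale * volume_per_well_ul
    return stock_ul, total_ul - stock_ul


def final_oligo_molar(pmol_per_well=1.0, well_volume_ul=100.0):
    """Oligo molarity in the well after all additions (40 + 10 + 10 + 40 ul)."""
    return (pmol_per_well * 1e-12) / (well_volume_ul * 1e-6)


stock_ul, buffer_ul = oligo_mix(wells=96)
print(round(stock_ul, 2), round(buffer_ul, 2), final_oligo_molar())
```

For an assumed 96-well plate with no overage, this calls for a little under 6 .mu.l of stock made up to 3840 .mu.l with assay buffer, and the oligo ends up at about 10.sup.-8 M in the 100 .mu.l final well volume.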
D. Controls for all Assays (Located on each Plate):
a. Non-specific binding (no oligo added)
b. Specific soluble oligo at 80% inhibition.
All publications and patent applications cited in this specification are
herein incorporated by reference as if each individual publication or
patent application were specifically and individually indicated to be
incorporated by reference. Although the foregoing invention has been
described in some detail by way of illustration and example for purposes
of clarity of understanding, it will be readily apparent to those of
ordinary skill in the art in light of the teachings of this invention that
certain changes and modifications may be made thereto without departing
from the spirit or scope of the appended claims.
__________________________________________________________________________
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(iii) NUMBER OF SEQUENCES: 18
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 223..2987
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGAGCAGGAAGCTCGCGCCGCCGTCGCCGCCGCCGCTCAGCTTCCCCGGGCGCGTCCAGG60
ACCCGCTGCGCCAGGCGCGCCGTCCCCGGACCCGGCGTGCGTCCCTACGAGGAAAGGGAC120
CCCGCCGCTCGAGCCGCCTCCGCCAGCCCCACTGCGAGGGGTCCCAGAGCCAGCCGCGCC180
CGCCCTCGCCCCCGGCCCCGCAGCCTTCCCGCCCTGCGCGCCATGAACGCCCCC234
MetAsnAlaPro
GAGCGGCAGCCCCAACCCGACGGCGGGGACGCCCCAGGCCACGAGCCT282
GluArgGlnProGlnProAspGlyGlyAspAlaProGlyHisGluPro
5101520
GGGGGCAGCCCCCAAGACGAGCTTGACTTCTCCATCCTCTTCGACTAT330
GlyGlySerProGlnAspGluLeuAspPheSerIleLeuPheAspTyr
253035
GAGTATTTGAATCCGAACGAAGAAGAGCCGAATGCACATAAGGTCGCC378
GluTyrLeuAsnProAsnGluGluGluProAsnAlaHisLysValAla
404550
AGCCCACCCTCCGGACCCGCATACCCCGATGATGTCCTGGACTATGGC426
SerProProSerGlyProAlaTyrProAspAspValLeuAspTyrGly
556065
CTCAAGCCATACAGCCCCCTTGCTAGTCTCTCTGGCGAGCCCCCCGGC474
LeuLysProTyrSerProLeuAlaSerLeuSerGlyGluProProGly
707580
CGATTCGGAGAGCCGGATAGGGTAGGGCCGCAGAAGTTTCTGAGCGCG522
ArgPheGlyGluProAspArgValGlyProGlnLysPheLeuSerAla
859095100
GCCAAGCCAGCAGGGGCCTCGGGCCTGAGCCCTCGGATCGAGATCACT570
AlaLysProAlaGlyAlaSerGlyLeuSerProArgIleGluIleThr
105110115
CCGTCCCACGAACTGATCCAGGCAGTGGGGCCCCTCCGCATGAGAGAC618
ProSerHisGluLeuIleGlnAlaValGlyProLeuArgMetArgAsp
120125130
GCGGGCCTCCTGGTGGAGCAGCCGCCCCTGGCCGGGGTGGCCGCCAGC666
AlaGlyLeuLeuValGluGlnProProLeuAlaGlyValAlaAlaSer
135140145
CCGAGGTTCACCCTGCCCGTGCCCGGCTTCGAGGGCTACCGCGAGCCG714
ProArgPheThrLeuProValProGlyPheGluGlyTyrArgGluPro
150155160
CTTTGCTTGAGCCCCGCTAGCAGCGGCTCCTCTGCCAGCTTCATTTCT762
LeuCysLeuSerProAlaSerSerGlySerSerAlaSerPheIleSer
165170175180
GACACCTTCTCCCCCTACACCTCGCCCTGCGTCTCGCCCAATAACGGC810
AspThrPheSerProTyrThrSerProCysValSerProAsnAsnGly
185190195
GGGCCCGACGACCTGTGTCCGCAGTTTCAAAACATCCCTGCTCATTAT858
GlyProAspAspLeuCysProGlnPheGlnAsnIleProAlaHisTyr
200205210
TCCCCCAGAACCTCGCCAATAATGTCACCTCGAACCAGCCTCGCCGAG906
SerProArgThrSerProIleMetSerProArgThrSerLeuAlaGlu
215220225
GACAGCTGCCTGGGCCGCCACTCGCCCGTGCCCCGTCCGGCCTCCCGC954
AspSerCysLeuGlyArgHisSerProValProArgProAlaSerArg
230235240
TCCTCATCGCCTGGTGCCAAGCGGAGGCATTCGTGCGCCGAGGCCTTG1002
SerSerSerProGlyAlaLysArgArgHisSerCysAlaGluAlaLeu
245250255260
GTTGCCCTGCCGCCCGGAGCCTCACCCCAGCGCTCCCGGAGCCCCTCG1050
ValAlaLeuProProGlyAlaSerProGlnArgSerArgSerProSer
265270275
CCGCAGCCCTCATCTCACGTGGCACCCCAGGACCACGGCTCCCCGGCT1098
ProGlnProSerSerHisValAlaProGlnAspHisGlySerProAla
280285290
GGGTACCCCCCTGTGGCTGGCTCTGCCGTGATCATGGATGCCCTGAAC1146
GlyTyrProProValAlaGlySerAlaValIleMetAspAlaLeuAsn
295300305
AGCCTCGCCACGGACTCGCCTTGTGGGATCCCCCCCAAGATGTGGAAG1194
SerLeuAlaThrAspSerProCysGlyIleProProLysMetTrpLys
310315320
ACCAGCCCTGACCCCTCGCCGGTGTCTGCCGCCCCATCCAAGGCCGGC1242
ThrSerProAspProSerProValSerAlaAlaProSerLysAlaGly
325330335340
CTGCCTCGCCACATCTACCCGGCCGTGGAGTTCCTGGGGCCCTGCGAG1290
LeuProArgHisIleTyrProAlaValGluPheLeuGlyProCysGlu
345350355
CAGGGCGAGAGGAGAAACTCGGCTCCAGAATCCATCCTGCTGGTTCCG1338
GlnGlyGluArgArgAsnSerAlaProGluSerIleLeuLeuValPro
360365370
CCCACTTGGCCCAAGCCGCTGGTGCCTGCCATTCCCATCTGCAGCATC1386
ProThrTrpProLysProLeuValProAlaIleProIleCysSerIle
375380385
CCAGTGACTGCATCCCTCCCTCCACTTGAGTGGCCGCTGTCCAGTCAG1434
ProValThrAlaSerLeuProProLeuGluTrpProLeuSerSerGln
390395400
TCAGGCTCTTACGAGCTGCGGATCGAGGTGCAGCCCAAGCCACATCAC1482
SerGlySerTyrGluLeuArgIleGluValGlnProLysProHisHis
405410415420
CGGGCCCACTATGAGACAGAAGGCAGCCGAGGGGCTGTCAAAGCTCCA1530
ArgAlaHisTyrGluThrGluGlySerArgGlyAlaValLysAlaPro
425430435
ACTGGAGGCCACCCTGTGGTTCAGCTCCATGGCTACATGGAAAACAAG1578
ThrGlyGlyHisProValValGlnLeuHisGlyTyrMetGluAsnLys
440445450
CCTCTGGGACTTCAGATCTTCATTGGGACAGCTGATGAGCGGATCCTT1626
ProLeuGlyLeuGlnIlePheIleGlyThrAlaAspGluArgIleLeu
455460465
AAGCCGCACGCCTTCTACCAGGTGCACCGAATCACGGGGAAAACTGTC1674
LysProHisAlaPheTyrGlnValHisArgIleThrGlyLysThrVal
470475480
ACCACCACCAGCTATGAGAAGATAGTGGGCAACACCAAAGTCCTGGAG1722
ThrThrThrSerTyrGluLysIleValGlyAsnThrLysValLeuGlu
485490495500
ATACCCTTGGAGCCCAAAAACAACATGAGGGCAACCATCGACTGTGCG1770
IleProLeuGluProLysAsnAsnMetArgAlaThrIleAspCysAla
505510515
GGGATCTTGAAGCTTAGAAACGCCGACATTGAGCTGCGGAAAGGCGAG1818
GlyIleLeuLysLeuArgAsnAlaAspIleGluLeuArgLysGlyGlu
520525530
ACGGACATTGGAAGAAAGAACACGCGGGTGAGACTGGTTTTCCGAGTT1866
ThrAspIleGlyArgLysAsnThrArgValArgLeuValPheArgVal
535540545
CACATCCCAGAGTCCAGTGGCAGAATCGTCTCTTTACAGACTGCATCT1914
HisIleProGluSerSerGlyArgIleValSerLeuGlnThrAlaSer
550555560
AACCCCATCGAGTGCTCCCAGCGATCTGCTCACGAGCTGCCCATGGTT1962
AsnProIleGluCysSerGlnArgSerAlaHisGluLeuProMetVal
565570575580
GAAAGACAAGACACAGACAGCTGCCTGGTCTATGGCGGCCAGCAAATG2010
GluArgGlnAspThrAspSerCysLeuValTyrGlyGlyGlnGlnMet
585590595
ATCCTCACGGGGCAGAACTTTACATCCGAGTCCAAAGTTGTGTTTACT2058
IleLeuThrGlyGlnAsnPheThrSerGluSerLysValValPheThr
600605610
GAGAAGACCACAGATGGACAGCAAATTTGGGAGATGGAAGCCACGGTG2106
GluLysThrThrAspGlyGlnGlnIleTrpGluMetGluAlaThrVal
615620625
GATAAGGACAAGAGCCAGCCCAACATGCTTTTTGTTGAGATCCCTGAA2154
AspLysAspLysSerGlnProAsnMetLeuPheValGluIleProGlu
630635640
TATCGGAACAAGCATATCCGCACACCTGTAAAAGTGAACTTCTACGTC2202
TyrArgAsnLysHisIleArgThrProValLysValAsnPheTyrVal
645650655660
ATCAATGGGAAGAGAAAACGAAGTCAGCCTCAGCACTTTACCTACCAC2250
IleAsnGlyLysArgLysArgSerGlnProGlnHisPheThrTyrHis
665670675
CCAGTCCCAGCCATCAAGACGGAGCCCACGGATGAATATGACCCCACT2298
ProValProAlaIleLysThrGluProThrAspGluTyrAspProThr
680685690
CTGATCTGCAGCCCCACCCATGGAGGCCTGGGGAGCCAGCCTTACTAC2346
LeuIleCysSerProThrHisGlyGlyLeuGlySerGlnProTyrTyr
695700705
CCCCAGCACCCGATGGTGGCCGAGTCCCCCTCCTGCCTCGTGGCCACC2394
ProGlnHisProMetValAlaGluSerProSerCysLeuValAlaThr
710715720
ATGGCTCCCTGCCAGCAGTTCCGCACGGGGCTCTCATCCCCTGACGCC2442
MetAlaProCysGlnGlnPheArgThrGlyLeuSerSerProAspAla
725730735740
CGCTACCAGCAACAGAACCCAGCGGCCGTACTCTACCAGCGGAGCAAG2490
ArgTyrGlnGlnGlnAsnProAlaAlaValLeuTyrGlnArgSerLys
745750755
AGCCTGAGCCCCAGCCTGCTGGGCTATCAGCAGCCGGCCCTCATGGCC2538
SerLeuSerProSerLeuLeuGlyTyrGlnGlnProAlaLeuMetAla
760765770
GCCCCGCTGTCCCTTGCGGACGCTCACCGCTCTGTGCTGGTGCACGCC2586
AlaProLeuSerLeuAlaAspAlaHisArgSerValLeuValHisAla
775780785
GGCTCCCAGGGCCAGAGCTCAGCCCTGCTCCACCCCTCTCCGACCAAC2634
GlySerGlnGlyGlnSerSerAlaLeuLeuHisProSerProThrAsn
790795800
CAGCAGGCCTCGCCTGTGATCCACTACTCACCCACCAACCAGCAGCTG2682
GlnGlnAlaSerProValIleHisTyrSerProThrAsnGlnGlnLeu
805810815820
CGCTGCGGAAGCCACCAGGAGTTCCAGCACATCATGTACTGCGAGAAT2730
ArgCysGlySerHisGlnGluPheGlnHisIleMetTyrCysGluAsn
825830835
TTCGCACCAGGCACCACCAGACCTGGCCCGCCCCCGGTCAGTCAAGGT2778
PheAlaProGlyThrThrArgProGlyProProProValSerGlnGly
840845850
CAGAGGCTGAGCCCGGGTTCCTACCCCACAGTCATTCAGCAGCAGAAT2826
GlnArgLeuSerProGlySerTyrProThrValIleGlnGlnGlnAsn
855860865
GCCACGAGCCAAAGAGCCGCCAAAAACGGACCCCCGGTCAGTGACCAA2874
AlaThrSerGlnArgAlaAlaLysAsnGlyProProValSerAspGln
870875880
AAGGAAGTATTACCTGCGGGGGTGACCATTAAACAGGAGCAGAACTTG2922
LysGluValLeuProAlaGlyValThrIleLysGlnGluGlnAsnLeu
885890895900
GACCAGACCTACTTGGATGATGAGCTGATAGACACACACCTTAGCTGG2970
AspGlnThrTyrLeuAspAspGluLeuIleAspThrHisLeuSerTrp
905910915
ATACAAAACATATTATGAAACAGAATGACTGTGATCTTTGATCCGAG3017
IleGlnAsnIleLeu
920
AAATCAAAGTTAAAGTTAATGAAATTATCAGGAAGGAGTTTTCAGGACCTCCTGCCAGAA3077
ATCAGACGTAAAAGAAGCCATTATAGCAAGACACCTTCTGTATCTGACCCCTCGGAGCCC3137
TCCACAGCCCCTCACCTTCTGTCTCCTTTCATGTTCATCTCCCAGCCCGGAGTCCACACG3197
CGGATCAATGTATGGGCACTAAGCGGACTCTCACTTAAGGAGCTCGCCACCTCCCTCTAA3257
ACACCAGAGAGAACTCTTCTTTTCGGTTTATGTTTTAAATCCCAGAGAGCATCCTGGTTG3317
ATCTTAATGGTGTTCCGTCCAAATAGTAAGCACCTGCTGACCAAAAGCACATTCTACATG3377
AGACAGGACACTGGAACTCTCCTGAGAACAGAGTGACTGGAGCTTGGGGGGATGGACGGG3437
GGACAGAAGATGTGGGCACTGTGATTAAACCCCAGCCCTTG3478
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 921 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
MetAsnAlaProGluArgGlnProGlnProAspGlyGlyAspAlaPro
151015
GlyHisGluProGlyGlySerProGlnAspGluLeuAspPheSerIle
202530
LeuPheAspTyrGluTyrLeuAsnProAsnGluGluGluProAsnAla
354045
HisLysValAlaSerProProSerGlyProAlaTyrProAspAspVal
505560
LeuAspTyrGlyLeuLysProTyrSerProLeuAlaSerLeuSerGly
65707580
GluProProGlyArgPheGlyGluProAspArgValGlyProGlnLys
859095
PheLeuSerAlaAlaLysProAlaGlyAlaSerGlyLeuSerProArg
100105110
IleGluIleThrProSerHisGluLeuIleGlnAlaValGlyProLeu
115120125
ArgMetArgAspAlaGlyLeuLeuValGluGlnProProLeuAlaGly
130135140
ValAlaAlaSerProArgPheThrLeuProValProGlyPheGluGly
145150155160
TyrArgGluProLeuCysLeuSerProAlaSerSerGlySerSerAla
165170175
SerPheIleSerAspThrPheSerProTyrThrSerProCysValSer
180185190
ProAsnAsnGlyGlyProAspAspLeuCysProGlnPheGlnAsnIle
195200205
ProAlaHisTyrSerProArgThrSerProIleMetSerProArgThr
210215220
SerLeuAlaGluAspSerCysLeuGlyArgHisSerProValProArg
225230235240
ProAlaSerArgSerSerSerProGlyAlaLysArgArgHisSerCys
245250255
AlaGluAlaLeuValAlaLeuProProGlyAlaSerProGlnArgSer
260265270
ArgSerProSerProGlnProSerSerHisValAlaProGlnAspHis
275280285
GlySerProAlaGlyTyrProProValAlaGlySerAlaValIleMet
290295300
AspAlaLeuAsnSerLeuAlaThrAspSerProCysGlyIleProPro
305310315320
LysMetTrpLysThrSerProAspProSerProValSerAlaAlaPro
325330335
SerLysAlaGlyLeuProArgHisIleTyrProAlaValGluPheLeu
340345350
GlyProCysGluGlnGlyGluArgArgAsnSerAlaProGluSerIle
355360365
LeuLeuValProProThrTrpProLysProLeuValProAlaIlePro
370375380
IleCysSerIleProValThrAlaSerLeuProProLeuGluTrpPro
385390395400
LeuSerSerGlnSerGlySerTyrGluLeuArgIleGluValGlnPro
405410415
LysProHisHisArgAlaHisTyrGluThrGluGlySerArgGlyAla
420425430
ValLysAlaProThrGlyGlyHisProValValGlnLeuHisGlyTyr
435440445
MetGluAsnLysProLeuGlyLeuGlnIlePheIleGlyThrAlaAsp
450455460
GluArgIleLeuLysProHisAlaPheTyrGlnValHisArgIleThr
465470475480
GlyLysThrValThrThrThrSerTyrGluLysIleValGlyAsnThr
485490495
LysValLeuGluIleProLeuGluProLysAsnAsnMetArgAlaThr
500505510
IleAspCysAlaGlyIleLeuLysLeuArgAsnAlaAspIleGluLeu
515520525
ArgLysGlyGluThrAspIleGlyArgLysAsnThrArgValArgLeu
530535540
ValPheArgValHisIleProGluSerSerGlyArgIleValSerLeu
545550555560
GlnThrAlaSerAsnProIleGluCysSerGlnArgSerAlaHisGlu
565570575
LeuProMetValGluArgGlnAspThrAspSerCysLeuValTyrGly
580585590
GlyGlnGlnMetIleLeuThrGlyGlnAsnPheThrSerGluSerLys
595600605
ValValPheThrGluLysThrThrAspGlyGlnGlnIleTrpGluMet
610615620
GluAlaThrValAspLysAspLysSerGlnProAsnMetLeuPheVal
625630635640
GluIleProGluTyrArgAsnLysHisIleArgThrProValLysVal
645650655
AsnPheTyrValIleAsnGlyLysArgLysArgSerGlnProGlnHis
660665670
PheThrTyrHisProValProAlaIleLysThrGluProThrAspGlu
675680685
TyrAspProThrLeuIleCysSerProThrHisGlyGlyLeuGlySer
690695700
GlnProTyrTyrProGlnHisProMetValAlaGluSerProSerCys
705710715720
LeuValAlaThrMetAlaProCysGlnGlnPheArgThrGlyLeuSer
725730735
SerProAspAlaArgTyrGlnGlnGlnAsnProAlaAlaValLeuTyr
740745750
GlnArgSerLysSerLeuSerProSerLeuLeuGlyTyrGlnGlnPro
755760765
AlaLeuMetAlaAlaProLeuSerLeuAlaAspAlaHisArgSerVal
770775780
LeuValHisAlaGlySerGlnGlyGlnSerSerAlaLeuLeuHisPro
785790795800
SerProThrAsnGlnGlnAlaSerProValIleHisTyrSerProThr
805810815
AsnGlnGlnLeuArgCysGlySerHisGlnGluPheGlnHisIleMet
820825830
TyrCysGluAsnPheAlaProGlyThrThrArgProGlyProProPro
835840845
ValSerGlnGlyGlnArgLeuSerProGlySerTyrProThrValIle
850855860
GlnGlnGlnAsnAlaThrSerGlnArgAlaAlaLysAsnGlyProPro
865870875880
ValSerAspGlnLysGluValLeuProAlaGlyValThrIleLysGln
885890895
GluGlnAsnLeuAspGlnThrTyrLeuAspAspGluLeuIleAspThr
900905910
HisLeuSerTrpIleGlnAsnIleLeu
915920
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2743 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 240..2390
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GAATTCCGCAGGGCGCGGGCACCGGGGCGCGGGCAGGGCTCGGAGCCACCGCGCAGGTCC60
TAGGGCCGCGGCCGGGCCCCGCCACGCGCGCACACGCCCCTCGATGACTTTCCTCCGGGG120
CGCGCGGCGCTGAGCCCGGGGCGAGGGCTGTCTTCCCGGAGACCCGACCCCGGCAGCGCG180
GGGCGGCCACTTCTCCTGTGCCTCCGCCCGCTGCTCCACTCCCCGCCGCCGCCGCGCGG239
ATGCCAAGCACCAGCTTTCCAGTCCCTTCCAAGTTTCCACTTGGCCCT287
MetProSerThrSerPheProValProSerLysPheProLeuGlyPro
925930935
GCGGCTGCGGTCTTCGGGAGAGGAGAAACTTTGGGGCCCGCGCCGCGC335
AlaAlaAlaValPheGlyArgGlyGluThrLeuGlyProAlaProArg
940945950
GCCGGCGGCACCATGAAGTCAGCGGAGGAAGAACACTATGGCTATGCA383
AlaGlyGlyThrMetLysSerAlaGluGluGluHisTyrGlyTyrAla
955960965
TCCTCCAACGTCAGCCCCGCCCTGCCGCTCCCCACGGCGCACTCCACC431
SerSerAsnValSerProAlaLeuProLeuProThrAlaHisSerThr
970975980985
CTGCCGGCCCCGTGCCACAACCTTCAGACCTCCACACCGGGCATCATC479
LeuProAlaProCysHisAsnLeuGlnThrSerThrProGlyIleIle
9909951000
CCGCCGGCGGATCACCCCTCGGGGTACGGAGCAGCTTTGGACGGTGGG527
ProProAlaAspHisProSerGlyTyrGlyAlaAlaLeuAspGlyGly
100510101015
CCCGCGGGCTACTTCCTCTCCTCCGGCCACACCAGGCCTGATGGGGCC575
ProAlaGlyTyrPheLeuSerSerGlyHisThrArgProAspGlyAla
102010251030
CCTGCCCTGGAGAGTCCTCGCATCGAGATAACCTCGTGCTTGGGCCTG623
ProAlaLeuGluSerProArgIleGluIleThrSerCysLeuGlyLeu
103510401045
TACCACAACAATAACCAGTTTTTCCACGATGTGGAGGTGGAAGACGTC671
TyrHisAsnAsnAsnGlnPhePheHisAspValGluValGluAspVal
1050105510601065
CTCCCTAGCTCCAAACGGTCCCCCTCCACGGCCACGCTGAGTCTGCCC719
LeuProSerSerLysArgSerProSerThrAlaThrLeuSerLeuPro
107010751080
AGCCTGGAGGCCTACAGAGACCCCTCGTGCCTGAGCCCGGCCAGCAGC767
SerLeuGluAlaTyrArgAspProSerCysLeuSerProAlaSerSer
108510901095
CTGTCCTCCCGGAGCTGCAACTCAGAGGCCTCCTCCTACGAGTCCAAC815
LeuSerSerArgSerCysAsnSerGluAlaSerSerTyrGluSerAsn
110011051110
TACTCGTACCCGTACGCGTCCCCCCAGACGTCGCCATGGCAGTCTCCC863
TyrSerTyrProTyrAlaSerProGlnThrSerProTrpGlnSerPro
111511201125
TGCGTGTCTCCCAAGACCACGGACCCCGAGGAGGGCTTTCCCCGCGGG911
CysValSerProLysThrThrAspProGluGluGlyPheProArgGly
1130113511401145
CTGGGGGCCTGCACACTGCTGGGTTCCCCGCAGCACTCCCCCTCCACC959
LeuGlyAlaCysThrLeuLeuGlySerProGlnHisSerProSerThr
115011551160
TCGCCCCGCGCCAGCGTCACTGAGGAGAGCTGGCTGGGTGCCCGCTCC1007
SerProArgAlaSerValThrGluGluSerTrpLeuGlyAlaArgSer
116511701175
TCCAGACCCGCGTCCCCTTGCAACAAGAGGAAGTACAGCCTCAACGGC1055
SerArgProAlaSerProCysAsnLysArgLysTyrSerLeuAsnGly
118011851190
CGGCAGCCGCCCTACTCACCCCACCACTCGCCCACGCCGTCCCCGCAC1103
ArgGlnProProTyrSerProHisHisSerProThrProSerProHis
119512001205
GGCTCCCCGCGGGTCAGCGTGACCGACGACTCGTGGTTGGGCAACACC1151
GlySerProArgValSerValThrAspAspSerTrpLeuGlyAsnThr
1210121512201225
ACCCAGTACACCAGCTCGGCCATCGTGGCCGCCATCAACGCGCTGACC1199
ThrGlnTyrThrSerSerAlaIleValAlaAlaIleAsnAlaLeuThr
123012351240
ACCGACAGCAGCCTGGACCTGGGAGATGGCGTCCCTGTCAAGTCCCGC1247
ThrAspSerSerLeuAspLeuGlyAspGlyValProValLysSerArg
124512501255
AAGACCACCCTGGAGCAGCCGCCCTCAGTGGCGCTCAAGGTGGAGCCC1295
LysThrThrLeuGluGlnProProSerValAlaLeuLysValGluPro
126012651270
GTCGGGGAGGACCTGGGCAGCCCCCCGCCCCCGGCCGACTTCGCGCCC1343
ValGlyGluAspLeuGlySerProProProProAlaAspPheAlaPro
127512801285
GAAGACTACTCCTCTTTCCAGCACATCAGGAAGGGCGGCTTCTGCGAC1391
GluAspTyrSerSerPheGlnHisIleArgLysGlyGlyPheCysAsp
1290129513001305
CAGTACCTGGCGGTGCCGCAGCACCCCTACCAGTGGGCGAAGCCCAAG1439
GlnTyrLeuAlaValProGlnHisProTyrGlnTrpAlaLysProLys
131013151320
CCCCTGTCCCCTACGTCCTACATGAGCCCGACCCTGCCCGCCCTGGAC1487
ProLeuSerProThrSerTyrMetSerProThrLeuProAlaLeuAsp
132513301335
TGGCAGCTGCCGTCCCACTCAGGCCCGTATGAGCTTCGGATTGAGGTG1535
TrpGlnLeuProSerHisSerGlyProTyrGluLeuArgIleGluVal
134013451350
CAGCCCAAGTCCCACCACCGAGCCCACTACGAGACGGAGGGCAGCCGG1583
GlnProLysSerHisHisArgAlaHisTyrGluThrGluGlySerArg
135513601365
GGGGCCGTGAAGGCGTCGGCCGGAGGACACCCCATCGTGCAGCTGCAT1631
GlyAlaValLysAlaSerAlaGlyGlyHisProIleValGlnLeuHis
1370137513801385
GGCTACTTGGAGAATGAGCCGCTGATGCTGCAGCTTTTCATTGGGACG1679
GlyTyrLeuGluAsnGluProLeuMetLeuGlnLeuPheIleGlyThr
139013951400
GCGGACGACCGCCTGCTGCGCCCGCACGCCTTCTACCAGGTGCACCGC1727
AlaAspAspArgLeuLeuArgProHisAlaPheTyrGlnValHisArg
140514101415
ATCACAGGGAAGACCGTGTCCACCACCAGCCACGAGGCTATCCTCTCC1775
IleThrGlyLysThrValSerThrThrSerHisGluAlaIleLeuSer
142014251430
AACACCAAAGTCCTGGAGATCCCACTCCTGCCGGAGAACAGCATGCGA1823
AsnThrLysValLeuGluIleProLeuLeuProGluAsnSerMetArg
143514401445
GCCGTCATTGACTGTGCCGGAATCCTGAAACTCAGAAACTCCGACATT1871
AlaValIleAspCysAlaGlyIleLeuLysLeuArgAsnSerAspIle
1450145514601465
GAACTTCGGAAAGGAGAGACGGACATCGGGAGGAAGAACACACGGGTA1919
GluLeuArgLysGlyGluThrAspIleGlyArgLysAsnThrArgVal
147014751480
CGGCTGGTGTTCCGCGTTCACGTCCCGCAACCCAGCGGCCGCACGCTG1967
ArgLeuValPheArgValHisValProGlnProSerGlyArgThrLeu
148514901495
TCCCTGCAGGTGGCCTCCAACCCCATCGAATGCTCCCAGCGCTCAGCT2015
SerLeuGlnValAlaSerAsnProIleGluCysSerGlnArgSerAla
150015051510
CAGGAGCTGCCTCTGGTGGAGAAGCAGAGCACGGACAGCTATCCGGTC2063
GlnGluLeuProLeuValGluLysGlnSerThrAspSerTyrProVal
151515201525
GTGGGCGGGAAGAAGATGGTCCTGTCTGGCCACAACTTCCTGCAGGAC2111
ValGlyGlyLysLysMetValLeuSerGlyHisAsnPheLeuGlnAsp
1530153515401545
TCCAAGGTCATTTTCGTGGAGAAAGCCCCAGATGGCCACCATGTCTGG2159
SerLysValIlePheValGluLysAlaProAspGlyHisHisValTrp
155015551560
GAGATGGAAGCGAAAACTGACCGGGACCTGTGCAAGCCGAATTCTCTG2207
GluMetGluAlaLysThrAspArgAspLeuCysLysProAsnSerLeu
156515701575
GTGGTTGAGATCCCGCCATTTCGGAATCAGAGGATAACCAGCCCCGTT2255
ValValGluIleProProPheArgAsnGlnArgIleThrSerProVal
158015851590
CACGTCAGTTTCTACGTCTGCAACGGGAAGAGAAAGCGAAGCCAGTAC2303
HisValSerPheTyrValCysAsnGlyLysArgLysArgSerGlnTyr
159516001605
CAGCGTTTCACCTACCTTCCCGCCAACGGTAACGCCATCTTTCTAACC2351
GlnArgPheThrTyrLeuProAlaAsnGlyAsnAlaIlePheLeuThr
1610161516201625
GTAAGCCGTGAACATGAGCGCGTGGGGTGCTTTTTCTAAAGACGCAGAA2400
ValSerArgGluHisGluArgValGlyCysPhePhe*
16301635
ACGACGTCGCCGTAAAGCAGCGTGGCGTGTTGCACATTTAACTGTGTGATGTCCCGTTAG2460
TGAGACCGAGCCATCGATGCCCTGAAAAGGAAAGGAAAAGGGAAGCTTCGGATGCATTTT2520
CCTTGATCCCTGTTGGGGGTGGGGGGCGGGGGTTGCATACTCAGATAGTCACGGTTATTT2580
TGCTTCTTGCGAATGTATAACAGCCAAGGGGAAAACATGGCTCTTCTGCTCCAAAAAACT2640
GAGGGGGTCCTGGTGTGCATTTGCACCCTAAAGCTGCTTACGGTGAAAAGGCAAATAGGT2700
ATAGCTATTTTGCAGGCACCTTTAGGAATAAACTTTGCTTTTA2743
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 716 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
MetProSerThrSerPheProValProSerLysPheProLeuGlyPro
151015
AlaAlaAlaValPheGlyArgGlyGluThrLeuGlyProAlaProArg
202530
AlaGlyGlyThrMetLysSerAlaGluGluGluHisTyrGlyTyrAla
354045
SerSerAsnValSerProAlaLeuProLeuProThrAlaHisSerThr
505560
LeuProAlaProCysHisAsnLeuGlnThrSerThrProGlyIleIle
65707580
ProProAlaAspHisProSerGlyTyrGlyAlaAlaLeuAspGlyGly
859095
ProAlaGlyTyrPheLeuSerSerGlyHisThrArgProAspGlyAla
100105110
ProAlaLeuGluSerProArgIleGluIleThrSerCysLeuGlyLeu
115120125
TyrHisAsnAsnAsnGlnPhePheHisAspValGluValGluAspVal
130135140
LeuProSerSerLysArgSerProSerThrAlaThrLeuSerLeuPro
145150155160
SerLeuGluAlaTyrArgAspProSerCysLeuSerProAlaSerSer
165170175
LeuSerSerArgSerCysAsnSerGluAlaSerSerTyrGluSerAsn
180185190
TyrSerTyrProTyrAlaSerProGlnThrSerProTrpGlnSerPro
195200205
CysValSerProLysThrThrAspProGluGluGlyPheProArgGly
210215220
LeuGlyAlaCysThrLeuLeuGlySerProGlnHisSerProSerThr
225230235240
SerProArgAlaSerValThrGluGluSerTrpLeuGlyAlaArgSer
245250255
SerArgProAlaSerProCysAsnLysArgLysTyrSerLeuAsnGly
260265270
ArgGlnProProTyrSerProHisHisSerProThrProSerProHis
275280285
GlySerProArgValSerValThrAspAspSerTrpLeuGlyAsnThr
290295300
ThrGlnTyrThrSerSerAlaIleValAlaAlaIleAsnAlaLeuThr
305310315320
ThrAspSerSerLeuAspLeuGlyAspGlyValProValLysSerArg
325330335
LysThrThrLeuGluGlnProProSerValAlaLeuLysValGluPro
340345350
ValGlyGluAspLeuGlySerProProProProAlaAspPheAlaPro
355360365
GluAspTyrSerSerPheGlnHisIleArgLysGlyGlyPheCysAsp
370375380
GlnTyrLeuAlaValProGlnHisProTyrGlnTrpAlaLysProLys
385390395400
ProLeuSerProThrSerTyrMetSerProThrLeuProAlaLeuAsp
405410415
TrpGlnLeuProSerHisSerGlyProTyrGluLeuArgIleGluVal
420425430
GlnProLysSerHisHisArgAlaHisTyrGluThrGluGlySerArg
435440445
GlyAlaValLysAlaSerAlaGlyGlyHisProIleValGlnLeuHis
450455460
GlyTyrLeuGluAsnGluProLeuMetLeuGlnLeuPheIleGlyThr
465470475480
AlaAspAspArgLeuLeuArgProHisAlaPheTyrGlnValHisArg
485490495
IleThrGlyLysThrValSerThrThrSerHisGluAlaIleLeuSer
500505510
AsnThrLysValLeuGluIleProLeuLeuProGluAsnSerMetArg
515520525
AlaValIleAspCysAlaGlyIleLeuLysLeuArgAsnSerAspIle
530535540
GluLeuArgLysGlyGluThrAspIleGlyArgLysAsnThrArgVal
545550555560
ArgLeuValPheArgValHisValProGlnProSerGlyArgThrLeu
565570575
SerLeuGlnValAlaSerAsnProIleGluCysSerGlnArgSerAla
580585590
GlnGluLeuProLeuValGluLysGlnSerThrAspSerTyrProVal
595600605
ValGlyGlyLysLysMetValLeuSerGlyHisAsnPheLeuGlnAsp
610615620
SerLysValIlePheValGluLysAlaProAspGlyHisHisValTrp
625630635640
GluMetGluAlaLysThrAspArgAspLeuCysLysProAsnSerLeu
645650655
ValValGluIleProProPheArgAsnGlnArgIleThrSerProVal
660665670
HisValSerPheTyrValCysAsnGlyLysArgLysArgSerGlnTyr
675680685
GlnArgPheThrTyrLeuProAlaAsnGlyAsnAlaIlePheLeuThr
690695700
ValSerArgGluHisGluArgValGlyCysPhePhe
705710715
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2881 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 142..2850
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
GCTTCTGGAGGGAGGCGGCAGCGACGGAGGAGGGGGCTTCTCAGAGAAAGGGAGGGAGGG60
AGCCACCCGGGTGAAGATACAGCAGCCTCCTGAACTCCCCCCTCCCACCCAGGCCGGGAC120
CTGGGGGCTCCTGCCGGATCCATGGGGGCGGCCAGCTGCGAGGATGAGGAG171
MetGlyAlaAlaSerCysGluAspGluGlu
720725
CTGGAATTTAAGCTGGTGTTCGGGGAGGAAAAGGAGGCCCCCCCGCTG219
LeuGluPheLysLeuValPheGlyGluGluLysGluAlaProProLeu
730735740
GGCGCGGGGGGATTGGGGGAAGAACTGGACTCAGAGGATGCCCCGCCA267
GlyAlaGlyGlyLeuGlyGluGluLeuAspSerGluAspAlaProPro
745750755
TGCTGCCGTCTGGCCTTGGGAGAGCCCCCTCCCTATGGCGCTGCACCT315
CysCysArgLeuAlaLeuGlyGluProProProTyrGlyAlaAlaPro
760765770775
ATCGGTATTCCCCGACCTCCACCCCCTCGGCCTGGCATGCATTCGCCA363
IleGlyIleProArgProProProProArgProGlyMetHisSerPro
780785790
CCGCCGCGACCAGCCCCCTCACCTGGCACCTGGGAGAGCCAGCCCGCC411
ProProArgProAlaProSerProGlyThrTrpGluSerGlnProAla
795800805
AGGTCGGTGAGGCTGGGAGGACCAGGAGGGGGTGCTGGGGGTGCTGGG459
ArgSerValArgLeuGlyGlyProGlyGlyGlyAlaGlyGlyAlaGly
810815820
GGTGGCCGTGTTCTCGAGTGTCCCAGCATCCGCATCACCTCCATCTCT507
GlyGlyArgValLeuGluCysProSerIleArgIleThrSerIleSer
825830835
CCCACGCCGGAGCCGCCAGCAGCGCTGGAGGACAACCCTGATGCCTGG555
ProThrProGluProProAlaAlaLeuGluAspAsnProAspAlaTrp
840845850855
GGGGACGGCTCTCCTAGAGATTACCCCCCACCAGAAGGCTTTGGGGGC603
GlyAspGlySerProArgAspTyrProProProGluGlyPheGlyGly
860865870
TACAGAGAAGCAGGGGCCCAGGGTGGGGGGGCCTTCTTCAGCCCAAGC651
TyrArgGluAlaGlyAlaGlnGlyGlyGlyAlaPhePheSerProSer
875880885
CCTGGCAGCAGCAGCCTGTCCTCGTGGAGCTTCTTCTCCGATGCCTCT699
ProGlySerSerSerLeuSerSerTrpSerPhePheSerAspAlaSer
890895900
GACGAGGCAGCCCTGTATGCAGCCTGCGACGAGGTGGAGTCTGAGCTA747
AspGluAlaAlaLeuTyrAlaAlaCysAspGluValGluSerGluLeu
905910915
AATGAGGCGGCCTCCCGCTTTGGCCTGGGCTCCCCGCTGCCCTCGCCC795
AsnGluAlaAlaSerArgPheGlyLeuGlySerProLeuProSerPro
920925930935
CGGGCCTCCCCTCGGCCATGGACCCCCGAAGATCCCTGGAGCCTGTAT843
ArgAlaSerProArgProTrpThrProGluAspProTrpSerLeuTyr
940945950
GGTCCAAGCCCCGGAGGCCGAGGGCCAGAGGATAGCTGGCTACTCCTC891
GlyProSerProGlyGlyArgGlyProGluAspSerTrpLeuLeuLeu
955960965
AGTGCTCCTGGGCCCACCCCAGCCTCCCCGCGGCCTGCCTCTCCATGT939
SerAlaProGlyProThrProAlaSerProArgProAlaSerProCys
970975980
GGCAAGCGGCGCTATTCCAGCTCGGGAACCCCATCTTCAGCCTCCCCA987
GlyLysArgArgTyrSerSerSerGlyThrProSerSerAlaSerPro
985990995
GCTCTGTCCCGCCGTGGCAGCCTGGGGGAAGAGGGGTCTGAGCCACCT1035
AlaLeuSerArgArgGlySerLeuGlyGluGluGlySerGluProPro
1000100510101015
CCACCACCCCCATTGCCTCTGGCCCGGGACCCGGGCTCCCCTGGTCCC1083
ProProProProLeuProLeuAlaArgAspProGlySerProGlyPro
102010251030
TTTGACTATGTGGGGGCCCCACCAGCTGAGAGCATCCCTCAGAAGACA1131
PheAspTyrValGlyAlaProProAlaGluSerIleProGlnLysThr
103510401045
CGGCGGACTTCCAGCGAGCAGGCAGTGGCTCTGCCTCGGTCTGAGGAG1179
ArgArgThrSerSerGluGlnAlaValAlaLeuProArgSerGluGlu
105010551060
CCTGCCTCATGCAATGGGAAGCTGCCCTTGGGAGCAGAGGAGTCTGTG1227
ProAlaSerCysAsnGlyLysLeuProLeuGlyAlaGluGluSerVal
106510701075
GCTCCTCCAGGAGGTTCCCGGAAGGAGGTGGCTGGCATGGACTACCTG1275
AlaProProGlyGlySerArgLysGluValAlaGlyMetAspTyrLeu
1080108510901095
GCAGTGCCCTCCCCACTCGCTTGGTCCAAGGCCCGGATTGGGGGACAC1323
AlaValProSerProLeuAlaTrpSerLysAlaArgIleGlyGlyHis
110011051110
AGCCCTATCTTCAGGACCTCTGCCCTACCCCCACTGGACTGGCCTCTG1371
SerProIlePheArgThrSerAlaLeuProProLeuAspTrpProLeu
111511201125
CCCAGCCAATATGAGCAGCTGGAGCTGAGGATCGAGGTACAGCCTAGA1419
ProSerGlnTyrGluGlnLeuGluLeuArgIleGluValGlnProArg
113011351140
GCCCACCACCGGGCCCACTATGAGACAGAAGGCAGCCGTGGAGCTGTC1467
AlaHisHisArgAlaHisTyrGluThrGluGlySerArgGlyAlaVal
114511501155
AAAGCTGCCCCTGGCGGTCACCCCGTAGTCAAGCTCCTAGGCTACAGT1515
LysAlaAlaProGlyGlyHisProValValLysLeuLeuGlyTyrSer
1160116511701175
GAGAAGCCACTGACCCTACAGATGTTCATCGGCACTGCAGATGAAAGG1563
GluLysProLeuThrLeuGlnMetPheIleGlyThrAlaAspGluArg
118011851190
AACCTGCGGCCTCATGCCTTCTATCAGGTGCACCGTATCACAGGCAAG1611
AsnLeuArgProHisAlaPheTyrGlnValHisArgIleThrGlyLys
119512001205
ATGGTGGCCACGGCCAGCTATGAAGCCGTAGTCAGTGGCACCAAGGTG1659
MetValAlaThrAlaSerTyrGluAlaValValSerGlyThrLysVal
121012151220
TTGGAGATGACTCTGCTGCCTGAGAACAACATGGCGGCCAACATTGAC1707
LeuGluMetThrLeuLeuProGluAsnAsnMetAlaAlaAsnIleAsp
122512301235
TGCGCGGGAATCCTGAAGCTTCGGAATTCAGACATTGAGCTTCGGAAG1755
CysAlaGlyIleLeuLysLeuArgAsnSerAspIleGluLeuArgLys
1240124512501255
GGTGAGACGGACATCGGGCGCAAAAACACACGTGTACGGCTGGTGTTC1803
GlyGluThrAspIleGlyArgLysAsnThrArgValArgLeuValPhe
126012651270
CGGGTACACGTGCCCCAGGGCGGCGGGAAGGTCGTCTCAGTACAGGCA1851
ArgValHisValProGlnGlyGlyGlyLysValValSerValGlnAla
127512801285
GCATCGGTGCCCATCGAGTGCTCCCAGCGCTCAGCCCAGGAGCTGCCC1899
AlaSerValProIleGluCysSerGlnArgSerAlaGlnGluLeuPro
129012951300
CAGGTGGAGGCCTACAGCCCCAGTGCCTGCTCTGTGAGAGGAGGCGAG1947
GlnValGluAlaTyrSerProSerAlaCysSerValArgGlyGlyGlu
130513101315
GAACTGGTACTGACCGGCTCCAACTTCCTGCCAGACTCCAAGGTGGTG1995
GluLeuValLeuThrGlySerAsnPheLeuProAspSerLysValVal
1320132513301335
TTCATTGAGAGGGGTCCTGATGGGAAGCTGCAATGGGAGGAGGAGGCC2043
PheIleGluArgGlyProAspGlyLysLeuGlnTrpGluGluGluAla
134013451350
ACAGTGAACCGACTGCAGAGCAACGAGGTGACGCTGACCCTGACTGTC2091
ThrValAsnArgLeuGlnSerAsnGluValThrLeuThrLeuThrVal
135513601365
CCCGAGTACAGCAACAAGAGGGTTTCCCGGCCAGTCCAGGTCTACTTT2139
ProGluTyrSerAsnLysArgValSerArgProValGlnValTyrPhe
137013751380
TATGTCTCCAATGGGCGGAGGAAACGCAGTCCTACCCAGAGTTTCAGG2187
TyrValSerAsnGlyArgArgLysArgSerProThrGlnSerPheArg
138513901395
TTTCTGCCTGTGATCTGCAAAGAGGAGCCCCTACCGGACTCATCTCTG2235
PheLeuProValIleCysLysGluGluProLeuProAspSerSerLeu
1400140514101415
CGGGGTTTCCCTTCAGCATCGGCAACCCCCTTTGGCACTGACATGGAC2283
ArgGlyPheProSerAlaSerAlaThrProPheGlyThrAspMetAsp
142014251430
TTCTCACCACCCAGGCCCCCCTACCCCTCCTATCCCCATGAAGACCCT2331
PheSerProProArgProProTyrProSerTyrProHisGluAspPro
143514401445
GCTTGCGAAACTCCTTACCTATCAGAAGGCTTCGGCTATGGCATGCCC2379
AlaCysGluThrProTyrLeuSerGluGlyPheGlyTyrGlyMetPro
145014551460
CCTCTGTACCCCCAGACGGGGCCCCCACCATCCTACAGACCGGGCCTG2427
ProLeuTyrProGlnThrGlyProProProSerTyrArgProGlyLeu
146514701475
CGGATGTTCCCTGAGACTAGGGGTACCACAGGTTGTGCCCAACCACCT2475
ArgMetPheProGluThrArgGlyThrThrGlyCysAlaGlnProPro
1480148514901495
GCAGTTTCCTTCCTTCCCCGCCCCTTCCCTAGTGACCCGTATGGAGGG2523
AlaValSerPheLeuProArgProPheProSerAspProTyrGlyGly
150015051510
CGGGGCTCCTCTTTCCCCCTGGGGCTGCCATTCTCTCCGCCAGCCCCC2571
ArgGlySerSerPheProLeuGlyLeuProPheSerProProAlaPro
151515201525
TTTCGGCCGCCTCCTCTTCCTGCATCCCCACCGCTTGAAGGCCCCTTC2619
PheArgProProProLeuProAlaSerProProLeuGluGlyProPhe
153015351540
CCTTCCCAGAGTGATGTGCATCCCCTACCTGCTGAGGGATACAATAAG2667
ProSerGlnSerAspValHisProLeuProAlaGluGlyTyrAsnLys
154515501555
GTAGGGCCAGGCTATGGCCCTGGGGAGGGGGCTCCGGAGCAGGAGAAA2715
ValGlyProGlyTyrGlyProGlyGluGlyAlaProGluGlnGluLys
1560156515701575
TCCAGGGGTGGCTACAGCAGCGGCTTTCGAGACAGTGTCCCTATCCAG2763
SerArgGlyGlyTyrSerSerGlyPheArgAspSerValProIleGln
158015851590
GGTATCACGCTGGAGGAAGTGAGTGAGATCATTGGCCGAGACCTGAGT2811
GlyIleThrLeuGluGluValSerGluIleIleGlyArgAspLeuSer
159516001605
GGCTTCCCTGCACCTCCTGGAGAAGAGCCTCCTGCCTGAACCACGTGAA2860
GlyPheProAlaProProGlyGluGluProProAla*
161016151620
CTGTCATCACCTGGCAACCCC2881
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 902 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
MetGlyAlaAlaSerCysGluAspGluGluLeuGluPheLysLeuVal
151015
PheGlyGluGluLysGluAlaProProLeuGlyAlaGlyGlyLeuGly
202530
GluGluLeuAspSerGluAspAlaProProCysCysArgLeuAlaLeu
354045
GlyGluProProProTyrGlyAlaAlaProIleGlyIleProArgPro
505560
ProProProArgProGlyMetHisSerProProProArgProAlaPro
65707580
SerProGlyThrTrpGluSerGlnProAlaArgSerValArgLeuGly
859095
GlyProGlyGlyGlyAlaGlyGlyAlaGlyGlyGlyArgValLeuGlu
100105110
CysProSerIleArgIleThrSerIleSerProThrProGluProPro
115120125
AlaAlaLeuGluAspAsnProAspAlaTrpGlyAspGlySerProArg
130135140
AspTyrProProProGluGlyPheGlyGlyTyrArgGluAlaGlyAla
145150155160
GlnGlyGlyGlyAlaPhePheSerProSerProGlySerSerSerLeu
165170175
SerSerTrpSerPhePheSerAspAlaSerAspGluAlaAlaLeuTyr
180185190
AlaAlaCysAspGluValGluSerGluLeuAsnGluAlaAlaSerArg
195200205
PheGlyLeuGlySerProLeuProSerProArgAlaSerProArgPro
210215220
TrpThrProGluAspProTrpSerLeuTyrGlyProSerProGlyGly
225230235240
ArgGlyProGluAspSerTrpLeuLeuLeuSerAlaProGlyProThr
245250255
ProAlaSerProArgProAlaSerProCysGlyLysArgArgTyrSer
260265270
SerSerGlyThrProSerSerAlaSerProAlaLeuSerArgArgGly
275280285
SerLeuGlyGluGluGlySerGluProProProProProProLeuPro
290295300
LeuAlaArgAspProGlySerProGlyProPheAspTyrValGlyAla
305310315320
ProProAlaGluSerIleProGlnLysThrArgArgThrSerSerGlu
325330335
GlnAlaValAlaLeuProArgSerGluGluProAlaSerCysAsnGly
340345350
LysLeuProLeuGlyAlaGluGluSerValAlaProProGlyGlySer
355360365
ArgLysGluValAlaGlyMetAspTyrLeuAlaValProSerProLeu
370375380
AlaTrpSerLysAlaArgIleGlyGlyHisSerProIlePheArgThr
385390395400
SerAlaLeuProProLeuAspTrpProLeuProSerGlnTyrGluGln
405410415
LeuGluLeuArgIleGluValGlnProArgAlaHisHisArgAlaHis
420425430
TyrGluThrGluGlySerArgGlyAlaValLysAlaAlaProGlyGly
435440445
HisProValValLysLeuLeuGlyTyrSerGluLysProLeuThrLeu
450455460
GlnMetPheIleGlyThrAlaAspGluArgAsnLeuArgProHisAla
465470475480
PheTyrGlnValHisArgIleThrGlyLysMetValAlaThrAlaSer
485490495
TyrGluAlaValValSerGlyThrLysValLeuGluMetThrLeuLeu
500505510
ProGluAsnAsnMetAlaAlaAsnIleAspCysAlaGlyIleLeuLys
515520525
LeuArgAsnSerAspIleGluLeuArgLysGlyGluThrAspIleGly
530535540
ArgLysAsnThrArgValArgLeuValPheArgValHisValProGln
545550555560
GlyGlyGlyLysValValSerValGlnAlaAlaSerValProIleGlu
565570575
CysSerGlnArgSerAlaGlnGluLeuProGlnValGluAlaTyrSer
580585590
ProSerAlaCysSerValArgGlyGlyGluGluLeuValLeuThrGly
595600605
SerAsnPheLeuProAspSerLysValValPheIleGluArgGlyPro
610615620
AspGlyLysLeuGlnTrpGluGluGluAlaThrValAsnArgLeuGln
625630635640
SerAsnGluValThrLeuThrLeuThrValProGluTyrSerAsnLys
645650655
ArgValSerArgProValGlnValTyrPheTyrValSerAsnGlyArg
660665670
ArgLysArgSerProThrGlnSerPheArgPheLeuProValIleCys
675680685
LysGluGluProLeuProAspSerSerLeuArgGlyPheProSerAla
690695700
SerAlaThrProPheGlyThrAspMetAspPheSerProProArgPro
705710715720
ProTyrProSerTyrProHisGluAspProAlaCysGluThrProTyr
725730735
LeuSerGluGlyPheGlyTyrGlyMetProProLeuTyrProGlnThr
740745750
GlyProProProSerTyrArgProGlyLeuArgMetPheProGluThr
755760765
ArgGlyThrThrGlyCysAlaGlnProProAlaValSerPheLeuPro
770775780
ArgProPheProSerAspProTyrGlyGlyArgGlySerSerPhePro
785790795800
LeuGlyLeuProPheSerProProAlaProPheArgProProProLeu
805810815
ProAlaSerProProLeuGluGlyProPheProSerGlnSerAspVal
820825830
HisProLeuProAlaGluGlyTyrAsnLysValGlyProGlyTyrGly
835840845
ProGlyGluGlyAlaProGluGlnGluLysSerArgGlyGlyTyrSer
850855860
SerGlyPheArgAspSerValProIleGlnGlyIleThrLeuGluGlu
865870875880
ValSerGluIleIleGlyArgAspLeuSerGlyPheProAlaProPro
885890895
GlyGluGluProProAla
900
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2406 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 211..2337
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CGGCTGCGGTTCCTGGTGCTGCTCGGCGCGCGGCCAGCTTTCGGAACGGAACGCTCGGCG60
TCGCGGGCCCCGCCCGGAAAGTTTGCCGTGGAGTCGCGACCTCTTGGCCCGCGCGGCCCG120
GCATGAAGCGGCGTTGAGGAGCTGCTGCCGCCGCTTGCCGCTGCCGCCGCCGCCGCCTGA180
GGAGGAGCTGCAGCACCCTGGGCCACGCCGATGACTACTGCAAACTGTGGCGCC234
MetThrThrAlaAsnCysGlyAla
905910
CACGACGAGCTCGACTTCAAACTCGTCTTTGGCGAGGACGGGGCGCCG282
HisAspGluLeuAspPheLysLeuValPheGlyGluAspGlyAlaPro
915920925
GCGCCGCCGCCCCCGGGCTCGCGGCCTGCAGATCTTGAGCCAGATGAT330
AlaProProProProGlySerArgProAlaAspLeuGluProAspAsp
930935940
TGTGCATCCATTTACATCTTTAATGTAGATCCACCTCCATCTACTTTA378
CysAlaSerIleTyrIlePheAsnValAspProProProSerThrLeu
945950955
ACCACACCACTTTGCTTACCACATCATGGATTACCGTCTCACTCTTCT426
ThrThrProLeuCysLeuProHisHisGlyLeuProSerHisSerSer
960965970975
GTTTTGTCACCATCGTTTCAGCTCCAAAGTCACAAAAACTATGAAGGA474
ValLeuSerProSerPheGlnLeuGlnSerHisLysAsnTyrGluGly
980985990
ACTTGTGAGATTCCTGAATCTAAATATAGCCCATTAGGTGGTCCCAAA522
ThrCysGluIleProGluSerLysTyrSerProLeuGlyGlyProLys
99510001005
CCCTTTGAGTGCCCAAGTATTCAAATTACATCTATCTCTCCTAACTGT570
ProPheGluCysProSerIleGlnIleThrSerIleSerProAsnCys
101010151020
CATCAAGAATTAGATGCACATGAAGATGACCTACAGATAAATGACCCA618
HisGlnGluLeuAspAlaHisGluAspAspLeuGlnIleAsnAspPro
102510301035
GAACGGGAATTTTTGGAAAGGCCTTCTAGAGATCATCTCTATCTTCCT666
GluArgGluPheLeuGluArgProSerArgAspHisLeuTyrLeuPro
1040104510501055
CTTGAGCCATCCTACCGGGAGTCTTCTCTTAGTCCTAGTCCTGCCAGC714
LeuGluProSerTyrArgGluSerSerLeuSerProSerProAlaSer
106010651070
AGCATCTCTTCTAGGAGTTGGTTCTCTGATGCATCTTCTTGTGAATCG762
SerIleSerSerArgSerTrpPheSerAspAlaSerSerCysGluSer
107510801085
CTTTCACATATTTATGATGATGTGGACTCAGAGTTGAATGAAGCTGCA810
LeuSerHisIleTyrAspAspValAspSerGluLeuAsnGluAlaAla
109010951100
GCCCGATTTACCCTTGGATCCCCTCTGACTTCTCCTGGTGGCTCTCCA858
AlaArgPheThrLeuGlySerProLeuThrSerProGlyGlySerPro
110511101115
GGGGGCTGCCCTGGAGAAGAAACTTGGCATCAACAGTATGGACTTGGA906
GlyGlyCysProGlyGluGluThrTrpHisGlnGlnTyrGlyLeuGly
1120112511301135
CACTCATTATCACCCAGGCAATCTCCTTGCCACTCTCCTAGATCCAGT954
HisSerLeuSerProArgGlnSerProCysHisSerProArgSerSer
114011451150
GTCACTGATGAGAATTGGCTGAGCCCCAGGCCAGCCTCAGGACCCTCA1002
ValThrAspGluAsnTrpLeuSerProArgProAlaSerGlyProSer
115511601165
TCAAGGCCCACATCCCCCTGTGGGAAACGGAGGCACTCCAGTGCTGAA1050
SerArgProThrSerProCysGlyLysArgArgHisSerSerAlaGlu
117011751180
GTTTGTTATGCTGGGTCCCTTTCACCCCATCACTCACCTGTTCCTTCA1098
ValCysTyrAlaGlySerLeuSerProHisHisSerProValProSer
118511901195
CCTGGTCACTCCCCCAGGGGAAGTGTGACAGAAGATACGTGGCTCAAT1146
ProGlyHisSerProArgGlySerValThrGluAspThrTrpLeuAsn
1200120512101215
GCTTCTGTCCATGGTGGGTCAGGCCTTGGCCCTGCAGTTTTTCCATTT1194
AlaSerValHisGlyGlySerGlyLeuGlyProAlaValPheProPhe
122012251230
CAGTACTGTGTAGAGACTGACATCCCTCTCAAAACAAGGAAAACTTCT1242
GlnTyrCysValGluThrAspIleProLeuLysThrArgLysThrSer
123512401245
GAAGATCAAGCTGCCATACTACCAGGAAAATTAGAGCTGTGTTCAGAT1290
GluAspGlnAlaAlaIleLeuProGlyLysLeuGluLeuCysSerAsp
125012551260
GACCAAGGGAGTTTATCACCAGCCCGGGAGACTTCAATAGATGATGGC1338
AspGlnGlySerLeuSerProAlaArgGluThrSerIleAspAspGly
126512701275
CTTGGATCTCAGTATCCTTTAAAGAAAGATTCATGTGGTGATCAGTTT1386
LeuGlySerGlnTyrProLeuLysLysAspSerCysGlyAspGlnPhe
1280128512901295
CTTTCAGTTCCTTCACCCTTTACCTGGAGCAAACCAAAGCCTGGCCAC1434
LeuSerValProSerProPheThrTrpSerLysProLysProGlyHis
130013051310
ACCCCTATATTTCGCACATCTTCATTACCTCCACTAGACTGGCCTTTA1482
ThrProIlePheArgThrSerSerLeuProProLeuAspTrpProLeu
131513201325
CCAGCTCATTTTGGACAATGTGAACTGAAAATAGAAGTGCAACCTAAA1530
ProAlaHisPheGlyGlnCysGluLeuLysIleGluValGlnProLys
133013351340
ACTCATCATCGAGCCCATTATGAAACTGAAGGTAGCCGAGGGGCAGTA1578
ThrHisHisArgAlaHisTyrGluThrGluGlySerArgGlyAlaVal
134513501355
AAAGCATCTACTGGGGGACATCCTGTTGTGAAGCTCCTGGGCTATAAC1626
LysAlaSerThrGlyGlyHisProValValLysLeuLeuGlyTyrAsn
1360136513701375
GAAAAGCCAATAAATCTACAAATGTTTATTGGGACAGCAGATGATCGA1674
GluLysProIleAsnLeuGlnMetPheIleGlyThrAlaAspAspArg
138013851390
TATTTACGACCTCATGCATTTTACCAGGTGCATCGAATCACTGGGAAG1722
TyrLeuArgProHisAlaPheTyrGlnValHisArgIleThrGlyLys
139514001405
ACAGTCGCTACTGCAAGCCAAGAGATAATAATTGCCAGTACAAAAGTT1770
ThrValAlaThrAlaSerGlnGluIleIleIleAlaSerThrLysVal
141014151420
CTGGAAATTCCACTTCTTCCTGAAAATAATATGTCAGCCAGTATTGAT1818
LeuGluIleProLeuLeuProGluAsnAsnMetSerAlaSerIleAsp
142514301435
TGTGCAGGTATTTTGAAACTCCGCAATTCAGATATAGAACTTCGAAAA1866
CysAlaGlyIleLeuLysLeuArgAsnSerAspIleGluLeuArgLys
1440144514501455
GGAGAAACTGATATTGGCAGAAAGAATACTAGAGTACGACTTGTGTTT1914
GlyGluThrAspIleGlyArgLysAsnThrArgValArgLeuValPhe
146014651470
CGTGTACACATCCCACAGCCCAGTGGAAAAGTCCTTTCTCTGCAGATA1962
ArgValHisIleProGlnProSerGlyLysValLeuSerLeuGlnIle
147514801485
GCCTCTATACCCGTTGAGTGCTCCCAGCGGTCTGCTCAAGAACTTCCT2010
AlaSerIleProValGluCysSerGlnArgSerAlaGlnGluLeuPro
149014951500
CATATTGAGAAGTACAGTATCAACAGTTGTTCTGTAAATGGAGGTCAT2058
HisIleGluLysTyrSerIleAsnSerCysSerValAsnGlyGlyHis
150515101515
GAAATGGTTGTGACTGGATCTAATTTTCTTCCAGAATCCAAAATCATT2106
GluMetValValThrGlySerAsnPheLeuProGluSerLysIleIle
1520152515301535
TTTCTTGAAAAAGGACAAGATGGACGACCTCAGTGGGAGGTAGAAGGG2154
PheLeuGluLysGlyGlnAspGlyArgProGlnTrpGluValGluGly
154015451550
AAGATAATCAGGGAAAAATGTCAAGGGGCTCACATTGTCCTTGAAGTT2202
LysIleIleArgGluLysCysGlnGlyAlaHisIleValLeuGluVal
155515601565
CCTCCATATCATAACCCAGCAGTTACAGCTGCAGTGCAGGTGCACTTT2250
ProProTyrHisAsnProAlaValThrAlaAlaValGlnValHisPhe
157015751580
TATCTTTGCAATGGCAAGAGGAAAAAAAGCCAGTCTCAACGTTTTACT2298
TyrLeuCysAsnGlyLysArgLysLysSerGlnSerGlnArgPheThr
158515901595
TATACACCAGGTACGAGGAGTCATGATGGTTTACTATAGAGCTTTCTTT2347
TyrThrProGlyThrArgSerHisAspGlyLeuLeu*
160016051610
CCTAATGAATAAAAAGTTATTTAACGAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2406
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 708 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
MetThrThrAlaAsnCysGlyAlaHisAspGluLeuAspPheLysLeu
151015
ValPheGlyGluAspGlyAlaProAlaProProProProGlySerArg
202530
ProAlaAspLeuGluProAspAspCysAlaSerIleTyrIlePheAsn
354045
ValAspProProProSerThrLeuThrThrProLeuCysLeuProHis
505560
HisGlyLeuProSerHisSerSerValLeuSerProSerPheGlnLeu
65707580
GlnSerHisLysAsnTyrGluGlyThrCysGluIleProGluSerLys
859095
TyrSerProLeuGlyGlyProLysProPheGluCysProSerIleGln
100105110
IleThrSerIleSerProAsnCysHisGlnGluLeuAspAlaHisGlu
115120125
AspAspLeuGlnIleAsnAspProGluArgGluPheLeuGluArgPro
130135140
SerArgAspHisLeuTyrLeuProLeuGluProSerTyrArgGluSer
145150155160
SerLeuSerProSerProAlaSerSerIleSerSerArgSerTrpPhe
165170175
SerAspAlaSerSerCysGluSerLeuSerHisIleTyrAspAspVal
180185190
AspSerGluLeuAsnGluAlaAlaAlaArgPheThrLeuGlySerPro
195200205
LeuThrSerProGlyGlySerProGlyGlyCysProGlyGluGluThr
210215220
TrpHisGlnGlnTyrGlyLeuGlyHisSerLeuSerProArgGlnSer
225230235240
ProCysHisSerProArgSerSerValThrAspGluAsnTrpLeuSer
245250255
ProArgProAlaSerGlyProSerSerArgProThrSerProCysGly
260265270
LysArgArgHisSerSerAlaGluValCysTyrAlaGlySerLeuSer
275280285
ProHisHisSerProValProSerProGlyHisSerProArgGlySer
290295300
ValThrGluAspThrTrpLeuAsnAlaSerValHisGlyGlySerGly
305310315320
LeuGlyProAlaValPheProPheGlnTyrCysValGluThrAspIle
325330335
ProLeuLysThrArgLysThrSerGluAspGlnAlaAlaIleLeuPro
340345350
GlyLysLeuGluLeuCysSerAspAspGlnGlySerLeuSerProAla
355360365
ArgGluThrSerIleAspAspGlyLeuGlySerGlnTyrProLeuLys
370375380
LysAspSerCysGlyAspGlnPheLeuSerValProSerProPheThr
385390395400
TrpSerLysProLysProGlyHisThrProIlePheArgThrSerSer
405410415
LeuProProLeuAspTrpProLeuProAlaHisPheGlyGlnCysGlu
420425430
LeuLysIleGluValGlnProLysThrHisHisArgAlaHisTyrGlu
435440445
ThrGluGlySerArgGlyAlaValLysAlaSerThrGlyGlyHisPro
450455460
ValValLysLeuLeuGlyTyrAsnGluLysProIleAsnLeuGlnMet
465470475480
PheIleGlyThrAlaAspAspArgTyrLeuArgProHisAlaPheTyr
485490495
GlnValHisArgIleThrGlyLysThrValAlaThrAlaSerGlnGlu
500505510
IleIleIleAlaSerThrLysValLeuGluIleProLeuLeuProGlu
515520525
AsnAsnMetSerAlaSerIleAspCysAlaGlyIleLeuLysLeuArg
530535540
AsnSerAspIleGluLeuArgLysGlyGluThrAspIleGlyArgLys
545550555560
AsnThrArgValArgLeuValPheArgValHisIleProGlnProSer
565570575
GlyLysValLeuSerLeuGlnIleAlaSerIleProValGluCysSer
580585590
GlnArgSerAlaGlnGluLeuProHisIleGluLysTyrSerIleAsn
595600605
SerCysSerValAsnGlyGlyHisGluMetValValThrGlySerAsn
610615620
PheLeuProGluSerLysIleIlePheLeuGluLysGlyGlnAspGly
625630635640
ArgProGlnTrpGluValGluGlyLysIleIleArgGluLysCysGln
645650655
GlyAlaHisIleValLeuGluValProProTyrHisAsnProAlaVal
660665670
ThrAlaAlaValGlnValHisPheTyrLeuCysAsnGlyLysArgLys
675680685
LysSerGlnSerGlnArgPheThrTyrThrProGlyThrArgSerHis
690695700
AspGlyLeuLeu
705
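As a consistency check between the CDS feature of SEQ ID NO:7 (LOCATION: 211..2337) and the stated length of SEQ ID NO:8 (708 amino acids), the arithmetic can be sketched as follows. The function name cds_summary and its includes_stop flag are illustrative assumptions and are not part of the listing. The sketch assumes the range is 1-based and inclusive and that, as the "*" printed under the final TAG codon suggests, the range given for SEQ ID NO:7 covers the terminating codon.

```python
def cds_summary(start, end, includes_stop):
    """Return (nucleotides, codons, residues) for a 1-based inclusive CDS range.

    `includes_stop` is a hypothetical flag: set it when the annotated range
    also spans the terminating stop codon, which encodes no residue.
    """
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    codons = nucleotides // 3
    residues = codons - 1 if includes_stop else codons
    return nucleotides, codons, residues


# SEQ ID NO:7, CDS 211..2337, assumed to include the TAG stop codon.
print(cds_summary(211, 2337, includes_stop=True))  # (2127, 709, 708)
```

Under these assumptions the 2127 nucleotides form 709 codons, that is, 708 residues plus the stop codon, which agrees with the length stated for SEQ ID NO:8.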
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2647 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 211..2427
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CGGCTGCGGTTCCTGGTGCTGCTCGGCGCGCGGCCAGCTTTCGGAACGGAACGCTCGGCG60
TCGCGGGCCCCGCCCGGAAAGTTTGCCGTGGAGTCGCGACCTCTTGGCCCGCGCGGCCCG120
GCATGAAGCGGCGTTGAGGAGCTGCTGCCGCCGCTTGCCGCTGCCGCCGCCGCCGCCTGA180
GGAGGAGCTGCAGCACCCTGGGCCACGCCGATGACTACTGCAAACTGTGGCGCC234
MetThrThrAlaAsnCysGlyAla
15
CACGACGAGCTCGACTTCAAACTCGTCTTTGGCGAGGACGGGGCGCCG282
HisAspGluLeuAspPheLysLeuValPheGlyGluAspGlyAlaPro
101520
GCGCCGCCGCCCCCGGGCTCGCGGCCTGCAGATCTTGAGCCAGATGAT330
AlaProProProProGlySerArgProAlaAspLeuGluProAspAsp
25303540
TGTGCATCCATTTACATCTTTAATGTAGATCCACCTCCATCTACTTTA378
CysAlaSerIleTyrIlePheAsnValAspProProProSerThrLeu
455055
ACCACACCACTTTGCTTACCACATCATGGATTACCGTCTCACTCTTCT426
ThrThrProLeuCysLeuProHisHisGlyLeuProSerHisSerSer
606570
GTTTTGTCACCATCGTTTCAGCTCCAAAGTCACAAAAACTATGAAGGA474
ValLeuSerProSerPheGlnLeuGlnSerHisLysAsnTyrGluGly
758085
ACTTGTGAGATTCCTGAATCTAAATATAGCCCATTAGGTGGTCCCAAA522
ThrCysGluIleProGluSerLysTyrSerProLeuGlyGlyProLys
9095100
CCCTTTGAGTGCCCAAGTATTCAAATTACATCTATCTCTCCTAACTGT570
ProPheGluCysProSerIleGlnIleThrSerIleSerProAsnCys
105110115120
CATCAAGAATTAGATGCACATGAAGATGACCTACAGATAAATGACCCA618
HisGlnGluLeuAspAlaHisGluAspAspLeuGlnIleAsnAspPro
125130135
GAACGGGAATTTTTGGAAAGGCCTTCTAGAGATCATCTCTATCTTCCT666
GluArgGluPheLeuGluArgProSerArgAspHisLeuTyrLeuPro
140145150
CTTGAGCCATCCTACCGGGAGTCTTCTCTTAGTCCTAGTCCTGCCAGC714
LeuGluProSerTyrArgGluSerSerLeuSerProSerProAlaSer
155160165
AGCATCTCTTCTAGGAGTTGGTTCTCTGATGCATCTTCTTGTGAATCG762
SerIleSerSerArgSerTrpPheSerAspAlaSerSerCysGluSer
170175180
CTTTCACATATTTATGATGATGTGGACTCAGAGTTGAATGAAGCTGCA810
LeuSerHisIleTyrAspAspValAspSerGluLeuAsnGluAlaAla
185190195200
GCCCGATTTACCCTTGGATCCCCTCTGACTTCTCCTGGTGGCTCTCCA858
AlaArgPheThrLeuGlySerProLeuThrSerProGlyGlySerPro
205210215
GGGGGCTGCCCTGGAGAAGAAACTTGGCATCAACAGTATGGACTTGGA906
GlyGlyCysProGlyGluGluThrTrpHisGlnGlnTyrGlyLeuGly
220225230
CACTCATTATCACCCAGGCAATCTCCTTGCCACTCTCCTAGATCCAGT954
HisSerLeuSerProArgGlnSerProCysHisSerProArgSerSer
235240245
GTCACTGATGAGAATTGGCTGAGCCCCAGGCCAGCCTCAGGACCCTCA1002
ValThrAspGluAsnTrpLeuSerProArgProAlaSerGlyProSer
250255260
TCAAGGCCCACATCCCCCTGTGGGAAACGGAGGCACTCCAGTGCTGAA1050
SerArgProThrSerProCysGlyLysArgArgHisSerSerAlaGlu
265270275280
GTTTGTTATGCTGGGTCCCTTTCACCCCATCACTCACCTGTTCCTTCA1098
ValCysTyrAlaGlySerLeuSerProHisHisSerProValProSer
285290295
CCTGGTCACTCCCCCAGGGGAAGTGTGACAGAAGATACGTGGCTCAAT1146
ProGlyHisSerProArgGlySerValThrGluAspThrTrpLeuAsn
300305310
GCTTCTGTCCATGGTGGGTCAGGCCTTGGCCCTGCAGTTTTTCCATTT1194
AlaSerValHisGlyGlySerGlyLeuGlyProAlaValPheProPhe
315320325
CAGTACTGTGTAGAGACTGACATCCCTCTCAAAACAAGGAAAACTTCT1242
GlnTyrCysValGluThrAspIleProLeuLysThrArgLysThrSer
330335340
GAAGATCAAGCTGCCATACTACCAGGAAAATTAGAGCTGTGTTCAGAT1290
GluAspGlnAlaAlaIleLeuProGlyLysLeuGluLeuCysSerAsp
345350355360
GACCAAGGGAGTTTATCACCAGCCCGGGAGACTTCAATAGATGATGGC1338
AspGlnGlySerLeuSerProAlaArgGluThrSerIleAspAspGly
365370375
CTTGGATCTCAGTATCCTTTAAAGAAAGATTCATGTGGTGATCAGTTT1386
LeuGlySerGlnTyrProLeuLysLysAspSerCysGlyAspGlnPhe
380385390
CTTTCAGTTCCTTCACCCTTTACCTGGAGCAAACCAAAGCCTGGCCAC1434
LeuSerValProSerProPheThrTrpSerLysProLysProGlyHis
395400405
ACCCCTATATTTCGCACATCTTCATTACCTCCACTAGACTGGCCTTTA1482
ThrProIlePheArgThrSerSerLeuProProLeuAspTrpProLeu
410415420
CCAGCTCATTTTGGACAATGTGAACTGAAAATAGAAGTGCAACCTAAA1530
ProAlaHisPheGlyGlnCysGluLeuLysIleGluValGlnProLys
425430435440
ACTCATCATCGAGCCCATTATGAAACTGAAGGTAGCCGAGGGGCAGTA1578
ThrHisHisArgAlaHisTyrGluThrGluGlySerArgGlyAlaVal
445450455
AAAGCATCTACTGGGGGACATCCTGTTGTGAAGCTCCTGGGCTATAAC1626
LysAlaSerThrGlyGlyHisProValValLysLeuLeuGlyTyrAsn
460465470
GAAAAGCCAATAAATCTACAAATGTTTATTGGGACAGCAGATGATCGA1674
GluLysProIleAsnLeuGlnMetPheIleGlyThrAlaAspAspArg
475480485
TATTTACGACCTCATGCATTTTACCAGGTGCATCGAATCACTGGGAAG1722
TyrLeuArgProHisAlaPheTyrGlnValHisArgIleThrGlyLys
490495500
ACAGTCGCTACTGCAAGCCAAGAGATAATAATTGCCAGTACAAAAGTT1770
ThrValAlaThrAlaSerGlnGluIleIleIleAlaSerThrLysVal
505510515520
CTGGAAATTCCACTTCTTCCTGAAAATAATATGTCAGCCAGTATTGAT1818
LeuGluIleProLeuLeuProGluAsnAsnMetSerAlaSerIleAsp
525530535
TGTGCAGGTATTTTGAAACTCCGCAATTCAGATATAGAACTTCGAAAA1866
CysAlaGlyIleLeuLysLeuArgAsnSerAspIleGluLeuArgLys
540545550
GGAGAAACTGATATTGGCAGAAAGAATACTAGAGTACGACTTGTGTTT1914
GlyGluThrAspIleGlyArgLysAsnThrArgValArgLeuValPhe
555560565
CGTGTACACATCCCACAGCCCAGTGGAAAAGTCCTTTCTCTGCAGATA1962
ArgValHisIleProGlnProSerGlyLysValLeuSerLeuGlnIle
570575580
GCCTCTATACCCGTTGAGTGCTCCCAGCGGTCTGCTCAAGAACTTCCT2010
AlaSerIleProValGluCysSerGlnArgSerAlaGlnGluLeuPro
585590595600
CATATTGAGAAGTACAGTATCAACAGTTGTTCTGTAAATGGAGGTCAT2058
HisIleGluLysTyrSerIleAsnSerCysSerValAsnGlyGlyHis
605610615
GAAATGGTTGTGACTGGATCTAATTTTCTTCCAGAATCCAAAATCATT2106
GluMetValValThrGlySerAsnPheLeuProGluSerLysIleIle
620625630
TTTCTTGAAAAAGGACAAGATGGACGACCTCAGTGGGAGGTAGAAGGG2154
PheLeuGluLysGlyGlnAspGlyArgProGlnTrpGluValGluGly
635640645
AAGATAATCAGGGAAAAATGTCAAGGGGCTCACATTGTCCTTGAAGTT2202
LysIleIleArgGluLysCysGlnGlyAlaHisIleValLeuGluVal
650655660
CCTCCATATCATAACCCAGCAGTTACAGCTGCAGTGCAGGTGCACTTT2250
ProProTyrHisAsnProAlaValThrAlaAlaValGlnValHisPhe
665670675680
TATCTTTGCAATGGCAAGAGGAAAAAAAGCCAGTCTCAACGTTTTACT2298
TyrLeuCysAsnGlyLysArgLysLysSerGlnSerGlnArgPheThr
685690695
TATACACCAGTTTTGATGAAGCAAGAACACAGAGAAGAGATTGATTTG2346
TyrThrProValLeuMetLysGlnGluHisArgGluGluIleAspLeu
700705710
TCTTCAGTTCCAACTTTGCCACAGACCTCTCGGCAAACTCTGCTCGGG2394
SerSerValProThrLeuProGlnThrSerArgGlnThrLeuLeuGly
715720725
TCTCAGCCTCCTTCAGCTTCTCCTCCAACAGTTTGATCTCCTCTTCATATTTA2447
SerGlnProProSerAlaSerProProThrVal
730735
TCTTCTTTGGTGGAATACTTGTCCGCCTGGGCCTCCAGGGATTTCAAGTTGTTGGTAACA2507
ATTTTCAGCTCCTCCTCTAGGTCCCCACATTTACTCTCGGCCACCTCAGCCCTCTCCTCC2567
GAGCGCTCCAGCTCTCCTTCCAGGATCACCAGCTTCCTGGCCACCTCTTCATATTTGCGG2627
TCTGAATCCTCAGCGATGTG2647
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 739 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
MetThrThrAlaAsnCysGlyAlaHisAspGluLeuAspPheLysLeu
151015
ValPheGlyGluAspGlyAlaProAlaProProProProGlySerArg
202530
ProAlaAspLeuGluProAspAspCysAlaSerIleTyrIlePheAsn
354045
ValAspProProProSerThrLeuThrThrProLeuCysLeuProHis
505560
HisGlyLeuProSerHisSerSerValLeuSerProSerPheGlnLeu
65707580
GlnSerHisLysAsnTyrGluGlyThrCysGluIleProGluSerLys
859095
TyrSerProLeuGlyGlyProLysProPheGluCysProSerIleGln
100105110
IleThrSerIleSerProAsnCysHisGlnGluLeuAspAlaHisGlu
115120125
AspAspLeuGlnIleAsnAspProGluArgGluPheLeuGluArgPro
130135140
SerArgAspHisLeuTyrLeuProLeuGluProSerTyrArgGluSer
145150155160
SerLeuSerProSerProAlaSerSerIleSerSerArgSerTrpPhe
165170175
SerAspAlaSerSerCysGluSerLeuSerHisIleTyrAspAspVal
180185190
AspSerGluLeuAsnGluAlaAlaAlaArgPheThrLeuGlySerPro
195200205
LeuThrSerProGlyGlySerProGlyGlyCysProGlyGluGluThr
210215220
TrpHisGlnGlnTyrGlyLeuGlyHisSerLeuSerProArgGlnSer
225230235240
ProCysHisSerProArgSerSerValThrAspGluAsnTrpLeuSer
245250255
ProArgProAlaSerGlyProSerSerArgProThrSerProCysGly
260265270
LysArgArgHisSerSerAlaGluValCysTyrAlaGlySerLeuSer
275280285
ProHisHisSerProValProSerProGlyHisSerProArgGlySer
290295300
ValThrGluAspThrTrpLeuAsnAlaSerValHisGlyGlySerGly
305310315320
LeuGlyProAlaValPheProPheGlnTyrCysValGluThrAspIle
325330335
ProLeuLysThrArgLysThrSerGluAspGlnAlaAlaIleLeuPro
340345350
GlyLysLeuGluLeuCysSerAspAspGlnGlySerLeuSerProAla
355360365
ArgGluThrSerIleAspAspGlyLeuGlySerGlnTyrProLeuLys
370375380
LysAspSerCysGlyAspGlnPheLeuSerValProSerProPheThr
385390395400
TrpSerLysProLysProGlyHisThrProIlePheArgThrSerSer
405410415
LeuProProLeuAspTrpProLeuProAlaHisPheGlyGlnCysGlu
420425430
LeuLysIleGluValGlnProLysThrHisHisArgAlaHisTyrGlu
435440445
ThrGluGlySerArgGlyAlaValLysAlaSerThrGlyGlyHisPro
450455460
ValValLysLeuLeuGlyTyrAsnGluLysProIleAsnLeuGlnMet
465470475480
PheIleGlyThrAlaAspAspArgTyrLeuArgProHisAlaPheTyr
485490495
GlnValHisArgIleThrGlyLysThrValAlaThrAlaSerGlnGlu
500505510
IleIleIleAlaSerThrLysValLeuGluIleProLeuLeuProGlu
515520525
AsnAsnMetSerAlaSerIleAspCysAlaGlyIleLeuLysLeuArg
530535540
AsnSerAspIleGluLeuArgLysGlyGluThrAspIleGlyArgLys
545550555560
AsnThrArgValArgLeuValPheArgValHisIleProGlnProSer
565570575
GlyLysValLeuSerLeuGlnIleAlaSerIleProValGluCysSer
580585590
GlnArgSerAlaGlnGluLeuProHisIleGluLysTyrSerIleAsn
595600605
SerCysSerValAsnGlyGlyHisGluMetValValThrGlySerAsn
610615620
PheLeuProGluSerLysIleIlePheLeuGluLysGlyGlnAspGly
625630635640
ArgProGlnTrpGluValGluGlyLysIleIleArgGluLysCysGln
645650655
GlyAlaHisIleValLeuGluValProProTyrHisAsnProAlaVal
660665670
ThrAlaAlaValGlnValHisPheTyrLeuCysAsnGlyLysArgLys
675680685
LysSerGlnSerGlnArgPheThrTyrThrProValLeuMetLysGln
690695700
GluHisArgGluGluIleAspLeuSerSerValProThrLeuProGln
705710715720
ThrSerArgGlnThrLeuLeuGlySerGlnProProSerAlaSerPro
725730735
ProThrVal
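The cDNA listings pair each codon with a three-letter residue code. A minimal sketch of that pairing, assuming the standard genetic code, is given below. The names CODON_TABLE and translate are illustrative and do not come from the listing. The first test fragment is positions 211 to 234 of SEQ ID NO:9, and the second is the last five codons of the SEQ ID NO:7 reading frame.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",  # stop codon, printed as an asterisk as in the listing
}
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate(dna):
    """Translate a DNA string codon by codon into three-letter residue codes.

    Any trailing one or two bases that do not form a full codon are ignored.
    """
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Positions 211-234 of SEQ ID NO:9 (start of the annotated CDS).
print(translate("ATGACTACTGCAAACTGTGGCGCC"))  # MetThrThrAlaAsnCysGlyAla
```

The output matches the residues printed beneath the first coding line of SEQ ID NO:9 and the opening of SEQ ID NO:10.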
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3969 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 211..3414
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CGGCTGCGGTTCCTGGTGCTGCTCGGCGCGCGGCCAGCTTTCGGAACGGAACGCTCGGCG60
TCGCGGGCCCCGCCCGGAAAGTTTGCCGTGGAGTCGCGACCTCTTGGCCCGCGCGGCCCG120
GCATGAAGCGGCGTTGAGGAGCTGCTGCCGCCGCTTGCCGCTGCCGCCGCCGCCGCCTGA180
GGAGGAGCTGCAGCACCCTGGGCCACGCCGATGACTACTGCAAACTGTGGCGCC234
MetThrThrAlaAsnCysGlyAla
740745
CACGACGAGCTCGACTTCAAACTCGTCTTTGGCGAGGACGGGGCGCCG282
HisAspGluLeuAspPheLysLeuValPheGlyGluAspGlyAlaPro
750755760
GCGCCGCCGCCCCCGGGCTCGCGGCCTGCAGATCTTGAGCCAGATGAT330
AlaProProProProGlySerArgProAlaAspLeuGluProAspAsp
765770775
TGTGCATCCATTTACATCTTTAATGTAGATCCACCTCCATCTACTTTA378
CysAlaSerIleTyrIlePheAsnValAspProProProSerThrLeu
780785790795
ACCACACCACTTTGCTTACCACATCATGGATTACCGTCTCACTCTTCT426
ThrThrProLeuCysLeuProHisHisGlyLeuProSerHisSerSer
800805810
GTTTTGTCACCATCGTTTCAGCTCCAAAGTCACAAAAACTATGAAGGA474
ValLeuSerProSerPheGlnLeuGlnSerHisLysAsnTyrGluGly
815820825
ACTTGTGAGATTCCTGAATCTAAATATAGCCCATTAGGTGGTCCCAAA522
ThrCysGluIleProGluSerLysTyrSerProLeuGlyGlyProLys
830835840
CCCTTTGAGTGCCCAAGTATTCAAATTACATCTATCTCTCCTAACTGT570
ProPheGluCysProSerIleGlnIleThrSerIleSerProAsnCys
845850855
CATCAAGAATTAGATGCACATGAAGATGACCTACAGATAAATGACCCA618
HisGlnGluLeuAspAlaHisGluAspAspLeuGlnIleAsnAspPro
860865870875
GAACGGGAATTTTTGGAAAGGCCTTCTAGAGATCATCTCTATCTTCCT666
GluArgGluPheLeuGluArgProSerArgAspHisLeuTyrLeuPro
880885890
CTTGAGCCATCCTACCGGGAGTCTTCTCTTAGTCCTAGTCCTGCCAGC714
LeuGluProSerTyrArgGluSerSerLeuSerProSerProAlaSer
895900905
AGCATCTCTTCTAGGAGTTGGTTCTCTGATGCATCTTCTTGTGAATCG762
SerIleSerSerArgSerTrpPheSerAspAlaSerSerCysGluSer
910915920
CTTTCACATATTTATGATGATGTGGACTCAGAGTTGAATGAAGCTGCA810
LeuSerHisIleTyrAspAspValAspSerGluLeuAsnGluAlaAla
925930935
GCCCGATTTACCCTTGGATCCCCTCTGACTTCTCCTGGTGGCTCTCCA858
AlaArgPheThrLeuGlySerProLeuThrSerProGlyGlySerPro
940945950955
GGGGGCTGCCCTGGAGAAGAAACTTGGCATCAACAGTATGGACTTGGA906
GlyGlyCysProGlyGluGluThrTrpHisGlnGlnTyrGlyLeuGly
960965970
CACTCATTATCACCCAGGCAATCTCCTTGCCACTCTCCTAGATCCAGT954
HisSerLeuSerProArgGlnSerProCysHisSerProArgSerSer
975980985
GTCACTGATGAGAATTGGCTGAGCCCCAGGCCAGCCTCAGGACCCTCA1002
ValThrAspGluAsnTrpLeuSerProArgProAlaSerGlyProSer
9909951000
TCAAGGCCCACATCCCCCTGTGGGAAACGGAGGCACTCCAGTGCTGAA1050
SerArgProThrSerProCysGlyLysArgArgHisSerSerAlaGlu
100510101015
GTTTGTTATGCTGGGTCCCTTTCACCCCATCACTCACCTGTTCCTTCA1098
ValCysTyrAlaGlySerLeuSerProHisHisSerProValProSer
1020102510301035
CCTGGTCACTCCCCCAGGGGAAGTGTGACAGAAGATACGTGGCTCAAT1146
ProGlyHisSerProArgGlySerValThrGluAspThrTrpLeuAsn
104010451050
GCTTCTGTCCATGGTGGGTCAGGCCTTGGCCCTGCAGTTTTTCCATTT1194
AlaSerValHisGlyGlySerGlyLeuGlyProAlaValPheProPhe
105510601065
CAGTACTGTGTAGAGACTGACATCCCTCTCAAAACAAGGAAAACTTCT1242
GlnTyrCysValGluThrAspIleProLeuLysThrArgLysThrSer
107010751080
GAAGATCAAGCTGCCATACTACCAGGAAAATTAGAGCTGTGTTCAGAT1290
GluAspGlnAlaAlaIleLeuProGlyLysLeuGluLeuCysSerAsp
108510901095
GACCAAGGGAGTTTATCACCAGCCCGGGAGACTTCAATAGATGATGGC1338
AspGlnGlySerLeuSerProAlaArgGluThrSerIleAspAspGly
1100110511101115
CTTGGATCTCAGTATCCTTTAAAGAAAGATTCATGTGGTGATCAGTTT1386
LeuGlySerGlnTyrProLeuLysLysAspSerCysGlyAspGlnPhe
112011251130
CTTTCAGTTCCTTCACCCTTTACCTGGAGCAAACCAAAGCCTGGCCAC1434
LeuSerValProSerProPheThrTrpSerLysProLysProGlyHis
113511401145
ACCCCTATATTTCGCACATCTTCATTACCTCCACTAGACTGGCCTTTA1482
ThrProIlePheArgThrSerSerLeuProProLeuAspTrpProLeu
115011551160
CCAGCTCATTTTGGACAATGTGAACTGAAAATAGAAGTGCAACCTAAA1530
ProAlaHisPheGlyGlnCysGluLeuLysIleGluValGlnProLys
116511701175
ACTCATCATCGAGCCCATTATGAAACTGAAGGTAGCCGAGGGGCAGTA1578
ThrHisHisArgAlaHisTyrGluThrGluGlySerArgGlyAlaVal
1180118511901195
AAAGCATCTACTGGGGGACATCCTGTTGTGAAGCTCCTGGGCTATAAC1626
LysAlaSerThrGlyGlyHisProValValLysLeuLeuGlyTyrAsn
120012051210
GAAAAGCCAATAAATCTACAAATGTTTATTGGGACAGCAGATGATCGA1674
GluLysProIleAsnLeuGlnMetPheIleGlyThrAlaAspAspArg
121512201225
TATTTACGACCTCATGCATTTTACCAGGTGCATCGAATCACTGGGAAG1722
TyrLeuArgProHisAlaPheTyrGlnValHisArgIleThrGlyLys
123012351240
ACAGTCGCTACTGCAAGCCAAGAGATAATAATTGCCAGTACAAAAGTT1770
ThrValAlaThrAlaSerGlnGluIleIleIleAlaSerThrLysVal
124512501255
CTGGAAATTCCACTTCTTCCTGAAAATAATATGTCAGCCAGTATTGAT1818
LeuGluIleProLeuLeuProGluAsnAsnMetSerAlaSerIleAsp
1260126512701275
TGTGCAGGTATTTTGAAACTCCGCAATTCAGATATAGAACTTCGAAAA1866
CysAlaGlyIleLeuLysLeuArgAsnSerAspIleGluLeuArgLys
128012851290
GGAGAAACTGATATTGGCAGAAAGAATACTAGAGTACGACTTGTGTTT1914
GlyGluThrAspIleGlyArgLysAsnThrArgValArgLeuValPhe
129513001305
CGTGTACACATCCCACAGCCCAGTGGAAAAGTCCTTTCTCTGCAGATA1962
ArgValHisIleProGlnProSerGlyLysValLeuSerLeuGlnIle
131013151320
GCCTCTATACCCGTTGAGTGCTCCCAGCGGTCTGCTCAAGAACTTCCT2010
AlaSerIleProValGluCysSerGlnArgSerAlaGlnGluLeuPro
132513301335
CATATTGAGAAGTACAGTATCAACAGTTGTTCTGTAAATGGAGGTCAT2058
HisIleGluLysTyrSerIleAsnSerCysSerValAsnGlyGlyHis
1340134513501355
GAAATGGTTGTGACTGGATCTAATTTTCTTCCAGAATCCAAAATCATT2106
GluMetValValThrGlySerAsnPheLeuProGluSerLysIleIle
136013651370
TTTCTTGAAAAAGGACAAGATGGACGACCTCAGTGGGAGGTAGAAGGG2154
PheLeuGluLysGlyGlnAspGlyArgProGlnTrpGluValGluGly
137513801385
AAGATAATCAGGGAAAAATGTCAAGGGGCTCACATTGTCCTTGAAGTT2202
LysIleIleArgGluLysCysGlnGlyAlaHisIleValLeuGluVal
139013951400
CCTCCATATCATAACCCAGCAGTTACAGCTGCAGTGCAGGTGCACTTT2250
ProProTyrHisAsnProAlaValThrAlaAlaValGlnValHisPhe
140514101415
TATCTTTGCAATGGCAAGAGGAAAAAAAGCCAGTCTCAACGTTTTACT2298
TyrLeuCysAsnGlyLysArgLysLysSerGlnSerGlnArgPheThr
1420142514301435
TATACACCAGTTTTGATGAAGCAAGAACACAGAGAAGAGATTGATTTG2346
TyrThrProValLeuMetLysGlnGluHisArgGluGluIleAspLeu
144014451450
TCTTCAGTTCCATCTTTGCCTGTGCCTCATCCTGCTCAGACCCAGAGG2394
SerSerValProSerLeuProValProHisProAlaGlnThrGlnArg
145514601465
CCTTCCTCTGATTCAGGGTGTTCACATGACAGTGTACTGTCAGGACAG2442
ProSerSerAspSerGlyCysSerHisAspSerValLeuSerGlyGln
147014751480
AGAAGTTTGATTTGCTCCATCCCACAAACATATGCATCCATGGTGACC2490
ArgSerLeuIleCysSerIleProGlnThrTyrAlaSerMetValThr
148514901495
TCATCCCATCTGCCACAGTTGCAGTGTAGAGATGAGAGTGTTAGTAAA2538
SerSerHisLeuProGlnLeuGlnCysArgAspGluSerValSerLys
1500150515101515
GAACAGCATATGATTCCTTCTCCAATTGTACACCAGCCTTTTCAAGTC2586
GluGlnHisMetIleProSerProIleValHisGlnProPheGlnVal
152015251530
ACACCAACACCTCCTGTGGGGTCTTCCTATCAGCCTATGCAAACTAAT2634
ThrProThrProProValGlySerSerTyrGlnProMetGlnThrAsn
153515401545
GTTGTGTACAATGGACCAACTTGTCTTCCTATTAATGCTGCCTCTAGT2682
ValValTyrAsnGlyProThrCysLeuProIleAsnAlaAlaSerSer
155015551560
CAAGAATTTGATTCAGTTTTGTTTCAGCAGGATGCAACTCTTTCTGGT2730
GlnGluPheAspSerValLeuPheGlnGlnAspAlaThrLeuSerGly
156515701575
TTAGTGAATCTTGGCTGTCAACCACTGTCATCCATACCATTTCATTCT2778
LeuValAsnLeuGlyCysGlnProLeuSerSerIleProPheHisSer
1580158515901595
TCAAATTCAGGCTCAACAGGACATCTCTTAGCCCATACACCTCATTCT2826
SerAsnSerGlySerThrGlyHisLeuLeuAlaHisThrProHisSer
160016051610
GTGCATACCCTGCCTCATCTGCAATCAATGGGATATCATTGTTCAAAT2874
ValHisThrLeuProHisLeuGlnSerMetGlyTyrHisCysSerAsn
161516201625
ACAGGACAAAGATCTCTTTCTTCTCCAGTGGCTGACCAGATTACAGGT2922
ThrGlyGlnArgSerLeuSerSerProValAlaAspGlnIleThrGly
163016351640
CAGCCTTCGTCTCAGTTACAACCTATTACATATGGTCCTTCACATTCA2970
GlnProSerSerGlnLeuGlnProIleThrTyrGlyProSerHisSer
164516501655
GGGTCTGCTACAACAGCTTCCCCAGCAGCTTCTCATCCCTTGGCTAGT3018
GlySerAlaThrThrAlaSerProAlaAlaSerHisProLeuAlaSer
1660166516701675
TCACCGCTTTCTGGGCCACCATCTCCTCAGCTTCAGCCTATGCCTTAC3066
SerProLeuSerGlyProProSerProGlnLeuGlnProMetProTyr
168016851690
CAATCTCCTAGCTCAGGAACTGCCTCATCACCGTCTCCAGCCACCAGA3114
GlnSerProSerSerGlyThrAlaSerSerProSerProAlaThrArg
169517001705
ATGCATTCTGGACAGCACTCAACTCAAGCACAAAGTACGGGCCAGGGG3162
MetHisSerGlyGlnHisSerThrGlnAlaGlnSerThrGlyGlnGly
171017151720
GGTCTTTCTGCACCTTCATCCTTAATATGTCACAGTTTGTGTGATCCA3210
GlyLeuSerAlaProSerSerLeuIleCysHisSerLeuCysAspPro
1725 1730 1735
GCGTCATTTCCACCTGATGGGGCAACTGTGAGCATTAAACCTGAACCA 3258
AlaSerPheProProAspGlyAlaThrValSerIleLysProGluPro
1740 1745 1750 1755
GAAGATCGAGAGCCTAACTTTGCAACCATTGGTCTGCAGGACATCACT 3306
GluAspArgGluProAsnPheAlaThrIleGlyLeuGlnAspIleThr
1760 1765 1770
TTAGATGATGACCAATTTATATCTGACTTGGAACACCAGCCATCAGGT 3354
LeuAspAspAspGlnPheIleSerAspLeuGluHisGlnProSerGly
1775 1780 1785
TCAGCAGAGAAATGGCCTAACCACAGTGTGCTCTCATGTCCAGCTCCT 3402
SerAlaGluLysTrpProAsnHisSerValLeuSerCysProAlaPro
1790 1795 1800
TTCTGGAGAATCTAGAGGTGAACGAGATAATTGGGAGAGACATGTCCCAGAT 3454
PheTrpArgIle
1805
TTCTGTTTCCCAAGGAGCAGGGGTGAGCAGGCAGGCTCCCCTCCCGAGTCCTGAGTCCCT 3514
GGATTTAGGAAGATCTGATGGGCTCTAACAGTGCTTACTGCAGCCTTGTGTCCACCACCA 3574
ACTTCTCAGCATGTTTCTCTCCTTGGACCTTGGGTTTCCAACTCTGCAGCCTTCAGGTCT 3634
GGGGCCAGGAGTGGGACCCACCATTTGTGGGGAAAGTAGCATTCCTCCACCTCAGGCCTT 3694
GGGTAGATTTGGCAAAAGAACAGGAGCAGCATAGGCTGTTTGAGCTTTGGGGAAATGAAC 3754
TTTGCTTTTTATATTTAACTAGGATACTTTTATATGATGGGTGCTTTGAGTGTGAATGCA 3814
GCAGGCTCTCTTGTTTCCGAGGTGCTGCTTTTGCAGGTGACCTGGTTACTTAGCTAGGAT 3874
TGGTGATTTGTACTGCTTTATGGTCATTTGAAGGGCCCTTTAGTTTTTATGATAATTTTT 3934
AAAATAGGAACTTTTGATAAGACCTTCTAGAAGCC 3969
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1068 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
MetThrThrAlaAsnCysGlyAlaHisAspGluLeuAspPheLysLeu
1 5 10 15
ValPheGlyGluAspGlyAlaProAlaProProProProGlySerArg
20 25 30
ProAlaAspLeuGluProAspAspCysAlaSerIleTyrIlePheAsn
35 40 45
ValAspProProProSerThrLeuThrThrProLeuCysLeuProHis
50 55 60
HisGlyLeuProSerHisSerSerValLeuSerProSerPheGlnLeu
65 70 75 80
GlnSerHisLysAsnTyrGluGlyThrCysGluIleProGluSerLys
85 90 95
TyrSerProLeuGlyGlyProLysProPheGluCysProSerIleGln
100 105 110
IleThrSerIleSerProAsnCysHisGlnGluLeuAspAlaHisGlu
115 120 125
AspAspLeuGlnIleAsnAspProGluArgGluPheLeuGluArgPro
130 135 140
SerArgAspHisLeuTyrLeuProLeuGluProSerTyrArgGluSer
145 150 155 160
SerLeuSerProSerProAlaSerSerIleSerSerArgSerTrpPhe
165 170 175
SerAspAlaSerSerCysGluSerLeuSerHisIleTyrAspAspVal
180 185 190
AspSerGluLeuAsnGluAlaAlaAlaArgPheThrLeuGlySerPro
195 200 205
LeuThrSerProGlyGlySerProGlyGlyCysProGlyGluGluThr
210 215 220
TrpHisGlnGlnTyrGlyLeuGlyHisSerLeuSerProArgGlnSer
225 230 235 240
ProCysHisSerProArgSerSerValThrAspGluAsnTrpLeuSer
245 250 255
ProArgProAlaSerGlyProSerSerArgProThrSerProCysGly
260 265 270
LysArgArgHisSerSerAlaGluValCysTyrAlaGlySerLeuSer
275 280 285
ProHisHisSerProValProSerProGlyHisSerProArgGlySer
290 295 300
ValThrGluAspThrTrpLeuAsnAlaSerValHisGlyGlySerGly
305 310 315 320
LeuGlyProAlaValPheProPheGlnTyrCysValGluThrAspIle
325 330 335
ProLeuLysThrArgLysThrSerGluAspGlnAlaAlaIleLeuPro
340 345 350
GlyLysLeuGluLeuCysSerAspAspGlnGlySerLeuSerProAla
355 360 365
ArgGluThrSerIleAspAspGlyLeuGlySerGlnTyrProLeuLys
370 375 380
LysAspSerCysGlyAspGlnPheLeuSerValProSerProPheThr
385 390 395 400
TrpSerLysProLysProGlyHisThrProIlePheArgThrSerSer
405 410 415
LeuProProLeuAspTrpProLeuProAlaHisPheGlyGlnCysGlu
420 425 430
LeuLysIleGluValGlnProLysThrHisHisArgAlaHisTyrGlu
435 440 445
ThrGluGlySerArgGlyAlaValLysAlaSerThrGlyGlyHisPro
450 455 460
ValValLysLeuLeuGlyTyrAsnGluLysProIleAsnLeuGlnMet
465 470 475 480
PheIleGlyThrAlaAspAspArgTyrLeuArgProHisAlaPheTyr
485 490 495
GlnValHisArgIleThrGlyLysThrValAlaThrAlaSerGlnGlu
500 505 510
IleIleIleAlaSerThrLysValLeuGluIleProLeuLeuProGlu
515 520 525
AsnAsnMetSerAlaSerIleAspCysAlaGlyIleLeuLysLeuArg
530 535 540
AsnSerAspIleGluLeuArgLysGlyGluThrAspIleGlyArgLys
545 550 555 560
AsnThrArgValArgLeuValPheArgValHisIleProGlnProSer
565 570 575
GlyLysValLeuSerLeuGlnIleAlaSerIleProValGluCysSer
580 585 590
GlnArgSerAlaGlnGluLeuProHisIleGluLysTyrSerIleAsn
595 600 605
SerCysSerValAsnGlyGlyHisGluMetValValThrGlySerAsn
610 615 620
PheLeuProGluSerLysIleIlePheLeuGluLysGlyGlnAspGly
625 630 635 640
ArgProGlnTrpGluValGluGlyLysIleIleArgGluLysCysGln
645 650 655
GlyAlaHisIleValLeuGluValProProTyrHisAsnProAlaVal
660 665 670
ThrAlaAlaValGlnValHisPheTyrLeuCysAsnGlyLysArgLys
675 680 685
LysSerGlnSerGlnArgPheThrTyrThrProValLeuMetLysGln
690 695 700
GluHisArgGluGluIleAspLeuSerSerValProSerLeuProVal
705 710 715 720
ProHisProAlaGlnThrGlnArgProSerSerAspSerGlyCysSer
725 730 735
HisAspSerValLeuSerGlyGlnArgSerLeuIleCysSerIlePro
740 745 750
GlnThrTyrAlaSerMetValThrSerSerHisLeuProGlnLeuGln
755 760 765
CysArgAspGluSerValSerLysGluGlnHisMetIleProSerPro
770 775 780
IleValHisGlnProPheGlnValThrProThrProProValGlySer
785 790 795 800
SerTyrGlnProMetGlnThrAsnValValTyrAsnGlyProThrCys
805 810 815
LeuProIleAsnAlaAlaSerSerGlnGluPheAspSerValLeuPhe
820 825 830
GlnGlnAspAlaThrLeuSerGlyLeuValAsnLeuGlyCysGlnPro
835 840 845
LeuSerSerIleProPheHisSerSerAsnSerGlySerThrGlyHis
850 855 860
LeuLeuAlaHisThrProHisSerValHisThrLeuProHisLeuGln
865 870 875 880
SerMetGlyTyrHisCysSerAsnThrGlyGlnArgSerLeuSerSer
885 890 895
ProValAlaAspGlnIleThrGlyGlnProSerSerGlnLeuGlnPro
900 905 910
IleThrTyrGlyProSerHisSerGlySerAlaThrThrAlaSerPro
915 920 925
AlaAlaSerHisProLeuAlaSerSerProLeuSerGlyProProSer
930 935 940
ProGlnLeuGlnProMetProTyrGlnSerProSerSerGlyThrAla
945 950 955 960
SerSerProSerProAlaThrArgMetHisSerGlyGlnHisSerThr
965 970 975
GlnAlaGlnSerThrGlyGlnGlyGlyLeuSerAlaProSerSerLeu
980 985 990
IleCysHisSerLeuCysAspProAlaSerPheProProAspGlyAla
995 1000 1005
ThrValSerIleLysProGluProGluAspArgGluProAsnPheAla
1010 1015 1020
ThrIleGlyLeuGlnAspIleThrLeuAspAspAspGlnPheIleSer
1025 1030 1035 1040
AspLeuGluHisGlnProSerGlySerAlaGluLysTrpProAsnHis
1045 1050 1055
SerValLeuSerCysProAlaProPheTrpArgIle
1060 1065
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AspIleGluLeuArgLysGlyGluThrAspIleGlyArgLysAsnThr
1 5 10 15
ArgValArgLeuValPheArgValHisXaaPro
20 25
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
ProXaaGluCysSerGlnArgSerAlaXaaGluLeuPro
1 5 10
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
GGAAAATTTT 10
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GGAAAAACTG 10
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TACATTGGAAAATTTTATTACAC 23
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
GGAGGAAAAACTGTTTCATACAGAAGGCGT 30
__________________________________________________________________________