National Eye Institute Workshop to Identify Gaps, Needs, and Opportunities in Ophthalmic Genetics

June 4-5, 2009

Diseases, Biological Systems, Approaches and Methodologies

White Papers


Janey Wiggs


Corneal dystrophies are a group of heterogeneous conditions characterized by the progressive loss of corneal transparency that results from the accumulation of deposits within the different corneal layers. Most corneal dystrophies develop during the 2nd to 4th decades of life, and are treated surgically by corneal transplantation. Currently there are no known medical treatments for these conditions. The majority of corneal dystrophies are rare conditions that follow Mendelian inheritance. Two significant exceptions are Fuchs endothelial dystrophy and keratoconus, both age-related conditions with complex inheritance patterns.

Genetics of Mendelian corneal dystrophies: The majority of the Mendelian corneal dystrophies are inherited as autosomal dominant traits. There is considerable genetic heterogeneity and also phenotypic variability (even within the same family). Corneal dystrophies may affect all layers of the corneal tissue: epithelium, stroma and Descemet's membrane. A number of genes responsible for these conditions have been identified (see Table), and one gene, TGFBI, is the cause of a group of corneal dystrophies (granular, Avellino, lattice and Reis-Bücklers). Macular dystrophy, caused by mutations in CHST6 (the carbohydrate sulfotransferase 6 gene), is inherited as an autosomal recessive trait.

Genetics of Fuchs endothelial dystrophy and keratoconus: Mutations in COL8A2 have been associated with some early-onset Fuchs endothelial dystrophy families, but most patients with adult-onset Fuchs dystrophy do not have mutations in this gene. As with other age-related conditions, it is likely that multiple factors, both genetic and environmental, contribute to this condition. A recent genome-wide linkage scan identified several chromosome regions that may contain susceptibility genes. UV light exposure may also be an environmental risk factor for Fuchs. Keratoconus is also likely to result from multiple risk factors. A recent genome scan identified several candidate regions. Recently a gene responsible for Leber's congenital amaurosis, CRB1, has also been implicated in keratoconus. Patients with Down's syndrome are more likely to develop keratoconus, a finding that has suggested a locus on chromosome 21, and eye rubbing may be an environmental risk factor.

Consensus on phenotyping: A recent publication, "IC3D Classification of the Corneal Dystrophies" (see reviews below), has developed a consensus nomenclature for the corneal dystrophies based on genetic defects.

Genetic resources for cornea: Population-based studies, clinic-based studies, some animal models, affected tissue from penetrating keratoplasty.

Major cornea genetics studies

Mendelian corneal dystrophies: Many studies of linkage and gene identification have been published (see reviews and table below).

Fuchs and keratoconus: Recent genome scans for both conditions have been published (references below). These studies have identified relatively large chromosome regions with multiple possible candidate genes. A large multicenter study collecting Fuchs cases and controls, led by Dr. S. Iyengar (Case Western, Ohio), is ongoing, as are other studies in Fuchs led by J. Gottsch (Hopkins) and G. Klintworth (Duke).

Aldave, A. J. et al. Arch Ophthalmol 2007;125:177-186.

Next Steps

Short-term goals to advance the field:


  1. Gene testing panel, sequencing array to establish screening/diagnosis/prognosis
  2. Animal models for testing potential therapeutics (gene replacement therapy could be considered for macular dystrophy)
  3. Sufficiently powered studies for Fuchs and keratoconus such that genes of modest effect could be identified. Gene-gene and gene-environment studies should also be considered.

Long-term goals to advance the field:

  1. Functional studies to inform potential therapeutics
  2. Translation of modifiable risk factors (for Fuchs and keratoconus) to clinical practice


Weiss JS, Møller HU, Lisch W, Kinoshita S, Aldave AJ, Belin MW, Kivelä T, Busin M, Munier FL, Seitz B, Sutphin J, Bredrup C, Mannis MJ, Rapuano CJ, Van Rij G, Kim EK, Klintworth GK. The IC3D classification of the corneal dystrophies. Cornea. 2008 Dec;27 Suppl 2:S1-83.

Poulaki V, Colby K. Genetics of anterior and stromal corneal dystrophies. Semin Ophthalmol. 2008 Jan-Feb;23(1):9-17.

Aldave AJ, Sonmez B. Elucidating the molecular genetic basis of the corneal dystrophies: are we there yet? Arch Ophthalmol. 2007 Feb;125(2):177-86.

Afshari NA, Li YJ, Pericak-Vance MA, Gregory S, Klintworth GK. Genome-wide linkage scan in Fuchs endothelial corneal dystrophy. Invest Ophthalmol Vis Sci. 2009 Mar;50(3):1093-7.

Sundin OH, Broman KW, Chang HH, Vito EC, Stark WJ, Gottsch JD. A common locus for late-onset Fuchs corneal dystrophy maps to 18q21.2-q21.32. Invest Ophthalmol Vis Sci. 2006 Sep;47(9):3919-26.

Gajecka M, Radhakrishna U, Winters D, Nath SK, Rydzanicz M, Ratnamala U, Ewing K, Molinari A, Pitarque JA, Lee K, Leal SM, Bejjani BA. Localization of a gene for keratoconus to a 5.6-Mb interval on 13q32. Invest Ophthalmol Vis Sci. 2009 Apr;50(4):1531-9.

McMahon TT, Kim LS, Fishman GA, Stone EM, Zhao XC, Yee R, Malicki J. CRB1 Gene Mutations Are Associated With Keratoconus In Patients With Leber Congenital Amaurosis. Invest Ophthalmol Vis Sci. 2009 Apr 30. [Epub ahead of print]

Lens and Cataract

Janey Wiggs


Cataract can be defined as any opacity of the crystalline lens. Cataracts typically develop with age, but may also affect children. Early-onset or congenital cataracts are particularly serious because they have the potential to inhibit visual development and cause blindness from amblyopia, or from other problems related to their surgical removal. Most inherited cataract is caused by mutations in genes encoding proteins involved in the maintenance of lens transparency and homeostasis. Information about these proteins and their functions has led to an increased understanding of lens biology and cataract formation.

Genetics of early-onset or congenital cataracts: About 70% of early-onset or congenital cataracts involve the lens only, while the remaining cases are associated with other ocular abnormalities or syndromes. Nonsyndromic cataracts are most frequently inherited as autosomal dominant traits, but can also be inherited in an autosomal recessive or X-linked fashion. Phenotypically identical cataracts can result from mutations at different genetic loci and may have different inheritance patterns, while phenotypically variable cataracts can be found in a single large family. There are at least 39 mapped loci for early-onset or congenital cataracts, and mutations in 26 genes have been associated with these conditions. Of these, approximately 50% have mutations in crystallins and approximately 25% have mutations in connexins, with the remainder divided among the genes for heat shock transcription factor-4 (HSF4), aquaporin-0 (AQP0, MIP), and beaded filament structural protein-2 (BFSP2) (see review by Hejtmancik for a detailed mutation list).

Genetics of age-related cataracts: Studies have suggested that multiple risk factors, both genetic and environmental, are associated with age-related cataracts. Family aggregations have been reported and segregation analyses suggest modest family correlations. A genome-wide scan, using model-free linkage analysis of affected sib pairs, identified several potential susceptibility loci for age-related cortical cataract in white individuals. These linkage regions are large and contain many potential disease-associated genes. Some genes causing early-onset cataracts or syndromic cataracts may also contribute to age-related cataracts. For example, a novel variant of galactokinase causing an early-onset cataract was also shown to be associated with an increase in bilateral cataracts in adults.

Consensus on phenotyping: Cataracts, which can be defined as lens opacities, have multiple causes, but are often associated with breakdown of the lens microarchitecture, possibly including vacuole formation and disarray of lens cells, which can cause large fluctuations in density resulting in light scattering. Early-onset forms include: polar opacities, zonular, pulverulent, sutural, and cerulean (blue dot). Age-related cataracts are classified as nuclear sclerosis, anterior or posterior capsular, or cortical.

Genetic resources for cataract: Population-based studies, clinic-based studies, some animal models, some 'special populations' with high incidence (India).

Major cataract genetics studies

Early-onset cataract: Many studies of linkage and gene identification have been published (see Hejtmancik review). Of the 39 known loci, 26 causative genes have been identified.

Age-related cataract: Several population based studies have examined a variety of environmental exposures for cataract associated risk. At least one genome-wide linkage study has been performed (Beaver Dam) and several candidate gene association studies have been published.

Next Steps

Short-term goals to advance the field:


  1. Gene testing panel, sequencing array to establish screening/diagnosis/prognosis
  2. Animal models for testing potential therapeutics
  3. Analysis of risk factors in population-based studies (environmental and modifiable)

Long-term goals to advance the field:

  1. Functional studies to inform potential therapeutics
  2. Translation of modifiable risk factors to clinical practice


Hejtmancik JF, Congenital Cataracts and their Molecular Genetics. Semin Cell Dev Biol. 2008 April; 19(2): 134–149.

Shiels A, Hejtmancik JF, Genetic Origins of Cataract. Arch Ophthalmol. 2007;125(2):165-173.

Aquaporins in Ocular Disease

Alan S. Verkman


The aquaporins (AQPs) are small integral membrane proteins (~30 kDa/monomer) expressed widely in the animal and plant kingdoms, with 13 members in mammals. AQPs are expressed in epithelia and endothelia involved in fluid transport. Structural analyses by x-ray and electron crystallography indicate that the AQPs assemble in tetramers in which each monomer consists of six tilted alpha-helical domains that enclose an aqueous pore. Molecular dynamics simulations suggest tortuous, single-file passage of water through a narrow pore of less than 0.3-nm diameter, in which steric and electrostatic factors prevent the transport of protons and other small molecules. AQP1, AQP2, AQP4, AQP5 and AQP8 are primarily water-selective, whereas AQP3, AQP7 and AQP9 (called 'aquaglyceroporins') also transport glycerol and possibly other small solutes.

Several AQPs are expressed in the eye. At the ocular surface, AQP1 is expressed in corneal endothelium, AQP3 and AQP5 in corneal epithelium, and AQP3 in conjunctival epithelium. AQPs are also expressed in lens fiber cells (AQP0), lens epithelium (AQP1), ciliary epithelium (AQP1, AQP4) and retinal Müller cells (AQP4). Mutations in AQP0 produce congenital cataracts in humans. Analysis of knockout mice lacking individual AQPs suggests their involvement in maintenance of corneal and lens transparency, corneal epithelial repair, intraocular pressure regulation, retinal signal transduction, and retinal swelling following injury. The neuroinflammatory demyelinating disease neuromyelitis optica, which causes optic neuritis and transverse myelitis, is associated with circulating AQP4 autoantibodies.

Major Studies

AQP0 mutations produce hereditary cataracts in mice and humans. Cataract-producing AQP0 mutations are thought to produce endoplasmic reticulum-retained and non-functional AQP0; however, the mechanism linking AQP0 loss-of-function and cataracts remains unclear. Proposed mechanisms include loss of AQP0-facilitated fiber-fiber adherence, and impaired fiber cell dehydration.

Though disease-producing loss-of-function mutations in other AQPs have not been identified, studies from knockout mice have revealed various AQP functions in ocular tissues. The roles of AQPs in the eye can in large part be attributed to their water and/or glycerol transporting functions. Examples include: AQP1/AQP4-dependent active fluid secretion by ciliary epithelium; AQP5-dependent osmosis across cornea, and AQP3-dependent corneal epithelial proliferation. The molecular roles of AQPs in maintaining corneal and lens transparency are less clear, as is the role of AQP4 in retina. Whether and how AQP4 autoantibodies cause optic neuritis in neuromyelitis optica is not known.

Known Genes

AQP0 (MIP; major intrinsic protein in lens fiber): Cataracts
Other AQPs expressed in eye: AQP1, AQP3, AQP4, AQP5

Next Steps

Much basic research remains in defining cell-level mechanisms for the ocular AQP functions, in establishing the relevance to human eye disease of conclusions from knockout mice, and in developing AQP-modulating drugs. Specific areas of basic research in the biology of ocular AQPs include elucidation of the precise cellular role of endothelial AQP1 in corneal fluid balance, of lens epithelial AQP1 in cataractogenesis, of corneal epithelial AQP3 in cell regeneration, and of retinal AQP4 in light-neural signal transduction and retinal fluid balance. The role of AQP4 autoantibodies in optic neuritis in NMO warrants investigation.

An intriguing possibility, which remains speculative at this time, is the clinical development of modulators of AQP function or expression. At the ocular surface, AQP3 and AQP5 up-regulation are predicted to accelerate epithelial resurfacing and reduce corneal edema, respectively. Inducers of AQP1 expression in corneal endothelium might reduce corneal edema and associated opacity following injury. Induction of lens AQPs might slow cataractogenesis. AQP1/AQP4 inhibition represents a possible strategy for reducing intraocular pressure associated with glaucoma. In the retina, AQP4 inhibitors might offer neuroprotection in ischemic and other retinopathies.

Suggested areas for investment of NEI resources: Repository for distribution of AQP knockout / transgenic mice; Small-molecule discovery of AQP modulators; Basic research on cellular mechanisms of AQP functions in ocular tissues, AQP4 autoantibodies in neuromyelitis optica, and AQPs in human eye diseases.


Verkman, A.S., J. Ruiz-Ederra and M. Levin (2008). Functions of aquaporins in the eye. Prog. Ret. Eye Res. 27:420-433.

Connexins in Ocular Disease

Daniel A. Goodenough


Gap junctions are clusters of intercellular channels that directly connect the cytoplasms of adjacent cells. A long evolutionary history has permitted the adaptation of gap junctional intercellular channels for a wide variety of uses. This diversity of function is reflected in a 21-member family of structural proteins, the connexins (Cx), each with multiple channel conductance states, phosphorylation states, and with the ability to assemble in heterotypic and heteromeric configurations, diversifying their functional complexity. Cellular activities facilitated by gap junctions fall into two general classes: synchrony/coordination and stimulus/suppression. Among excitable cells, gap junctions, also known as electrical synapses, are common in circuits requiring speed or synchronous firing. In other tissues, gap junctions allow intercellular transfer of small molecules and ions. For example, in the ocular lens, exchange of nutrients and signals required for the prevention of cataract and proper regulation of postnatal lens growth require gap junctions. Indeed, mutations in lens connexin genes are a common cause of hereditary cataract.

Knockout of Cx50 results in both a pulverulent cataract and a smaller lens with concomitant microphthalmia due to a slower mitotic rate in the lens epithelium. Replacement of the Cx50 coding sequences with the Cx46 sequences (Cx50KI46) results in a rescue of the cataract but not of the mitotic rate defect, indicating that there is a quality inherent in Cx50 that is required. Back-crossing these animals with Cx46–/– and Cx50–/– animals has revealed that different forms of dominant cataract result from incongruous mixing of connexins, and that both connexin identity and the locus of gene expression can dramatically affect junctional coupling in the lens.

The ciliary epithelium consists of a double layer of epithelial cells. The pigmented epithelium (PE) rests on the connective tissue stroma and the non-pigmented epithelium (NPE) is polarized with its basal lamina facing the posterior chamber of the eye. The PE and NPE interact via their apical membranes with gap junctions containing Cx43. The NPE forms tight junctions delineating a boundary between the blood and the aqueous humor as part of the blood-aqueous barrier. The combined ion transporters and pumps located in the epithelia of the ciliary epithelium provide the source of the aqueous humor. The gap junctions between the PE and NPE are critical to coordinating the ionic pumping of the two epithelial cell types. The water following this NaCl movement can pass from PE to NPE cells via gap junctions. Targeted disruption of Cx43 in the ciliary epithelium results in a decrease or loss of intraocular pressure.

In the retina, gap junctions are involved in interneuronal electrical signaling. Targeted disruption of Cx36 results in a loss of scotopic vision due to the ablation of gap junctions between AII amacrine and cone ON bipolar cells and possibly gap junctions between rods and cones. Deletion of Cx57 significantly reduces horizontal cell receptive field size and this connexin's distribution is modulated by light. It is speculated that the transjunctional voltage dependence of Cx45 channels could support the transmission of direction selectivity.

Connexins are found in other ocular locations. Multiple connexin proteins are found between corneal epithelial cells; many of these may have redundant functions since knockout of connexin genes does not result in a corneal phenotype. In humans, oculodentodigital dysplasia, a pleomorphic, syndromic condition affecting a large number of cell types, results from mutations in Cx43. Some patients show abnormalities in eye development and in the development of glaucoma. In families carrying the Cx43 L113P mutation, the ophthalmic features include epicanthus, microcornea, and the presence of glaucoma. The retinal pigment epithelium (RPE) expresses Cx43. It has been reported that RPE regulates proliferation in the underlying neural retina by ATP efflux through Cx43 hemichannels, although targeted deletion of Cx43 from the RPE does not result in retinal abnormalities.

Major Studies

Positional cloning has been the principal method used for identifying mutations in humans with hereditary cataract and with ODDD.

Known Genes

A partial list of coding mutations causing cataract in human lens connexins:

Cx50: S50P, S276F, W45S, P88Q, D47N, V79L.
Cx46: L11S, R76G, V28M.

There have been no reported regulatory mutations.

Next Steps

Given that the eye is a privileged compartment, it is a location where there is a potential future for targeted gene therapy. Current advances in viral gene transfer offer the possibility of introducing wild-type genes to replace mutants and to selectively express siRNA to downregulate overexpressed or improperly regulated gene activity. Given the high costs of primate research, a central facility permitting exploration of these methods would facilitate the transfer of technologies to humans.


Gong, X., C. Cheng, and C.H. Xia. 2007. Connexins in lens development and cataractogenesis. J. Membr. Biol. 218:9-12.

Mese, G., G. Richard, and T.W. White. 2007. Gap junctions: basic structure and function. J. Invest. Dermatol. 127:2516-2524.

Mouse Models in Ocular Disease

Simon John


Review mouse models currently available for study of ocular disease: Many models are available for a wide variety of ocular diseases. There are more mouse models than for other mammalian species. Despite this, important models exist in other mammalian species (e.g., spontaneous and induced glaucomas in rat and the RCS rat), and other systems are used to study heritable ocular disease or test interventions (dog, zebrafish, flies). Inherited and/or induced models exist for various diseases and conditions, but not all of these models are well characterized in a uniform fashion. Although experimentally induced models do not allow us to study the initial causes of inherited disease, they can be useful for assessing genetic susceptibility factors and testing treatments. There are genetic models for cataracts, retinal degenerations, glaucoma, abnormal ocular development, night blindness, mucopolysaccharidoses, choroideremia and others. Since they are easier to study and detect, developmental conditions are over-represented. One review (Budzynski, 2006) indicated that 36% of ocular disease models in mice involve ocular development. There are also many models of retinal degeneration.

There is a deficiency of later onset models and models of common complex human conditions such as age-related macular degeneration, open-angle glaucoma and diabetic retinopathy. A larger focus on generating and studying such models would be useful.

Major Studies

Provide an overview of the approaches used to create and characterize mouse models that contribute to ocular disease: Genetic mouse models are produced in a variety of ways (spontaneous mutations, induced point mutations, transgenic and knockout technologies, etc., detailed in the Adams review). Methods of analysis are tailored to the specific condition and study but include clinical approaches (slit-lamp, ophthalmoscope, fluorescein angiography, fundus photography, SLO, gonioscopy, etc.) and the full range of histological, molecular, genetic, genomic, proteomic, biochemical and cell biological approaches. Physiological analyses are also important, including electroretinography, IOP assessment and aqueous humor outflow. Genetic experiments identify and characterize causative genes, identify and characterize genes that modify susceptibility to or severity of disease, and manipulate specific pathways or processes to test hypotheses. Viral vectors are also used to express genes in areas of interest or to knock down the function of specific genes using small inhibitory RNA approaches.

Known Genes

Describe genes that are responsible for ocular disease in the mouse models and note if mutations are regulatory or coding: Many genes contribute to ocular disease in mice. A search of the Mouse Genome Informatics database identified 334 genes associated with the search terms vision or ocular disease. Gene targeting that typically abrogates gene function was used to induce mutations in the great majority of these genes. A significant proportion of these genes were identified from mice with spontaneous mutations or chemically induced point mutations. The point mutations are often coding or splice junction mutations, but other types exist. Some of the mutations involve gene traps, which decrease or abrogate the production of normal transcripts.

Gene targeting and/or gene trapping approaches are being used to mutate many of the genes in the mouse genome (see Adams et al.). Large-scale efforts are producing impressive collections of "knockout first" alleles. These are null alleles that express a lacZ reporter gene. These alleles can be exposed to the site-specific recombinase Flp to generate a conditional allele, and then to the site-specific recombinase Cre to produce a null allele in a cell- or tissue-specific manner.
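
The allele progression described above can be read as a small state machine. The following Python sketch is a toy model only: the state names (`knockout_first`, `conditional`, `null`), the function names, and the example tissues are hypothetical labels introduced for illustration, not nomenclature taken from the text or from any allele database. Recombinase exposures that the text does not describe are simply treated as leaving the allele unchanged, which is a simplifying assumption.

```python
# Toy model of the "knockout first" allele progression described above.
# State names, function names and tissues are illustrative assumptions,
# not official allele nomenclature.

# (current allele state, recombinase) -> resulting allele state
TRANSITIONS = {
    # Flp converts the reporter-tagged allele into a conditional allele.
    ("knockout_first", "Flp"): "conditional",
    # Cre then produces a null allele in the cells where it is expressed.
    ("conditional", "Cre"): "null",
}


def apply_recombinase(state: str, recombinase: str) -> str:
    """Return the allele state after exposure to a recombinase.

    Combinations not described in the text leave the state unchanged
    (a simplification made for this sketch).
    """
    return TRANSITIONS.get((state, recombinase), state)


def allele_in_tissue(cre_expressed: bool) -> str:
    """Allele state in a tissue after Flp exposure, given local Cre expression."""
    state = apply_recombinase("knockout_first", "Flp")
    if cre_expressed:
        state = apply_recombinase(state, "Cre")
    return state


if __name__ == "__main__":
    # Hypothetical example: Cre driven in retina but not in liver.
    for tissue, has_cre in [("retina", True), ("liver", False)]:
        print(tissue, allele_in_tissue(has_cre))
```

In this sketch the tissue-specific outcome depends only on where Cre is expressed, which mirrors the cell- or tissue-specific control described in the paragraph above.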

Next Steps

Describe next steps you recommend to advance the field. What communal research resources would represent a good investment for NEI? Many human conditions are caused by point mutations.

An efficient approach for identifying new disease models and pathways would be to establish phenotyping centers to characterize the eyes of many of the new mutants that are being produced in specific genes.

Targeted mutations in some mouse genes do not cause phenotypes found in people with point mutations in the orthologous gene, while point mutations do. Point mutation resources are needed to complement the large gene knockout efforts. ENU mutagenesis is a valuable approach for producing these mutations, and new-generation sensitized screens will be more powerful and productive than previous versions for providing models and understanding of complex diseases. Sensitized mutagenesis screens using ENU, especially those sensitized by mutations in a gene that causes human disease, will be valuable. These genetic screens will allow identification of other genes that interact with the first genes to modify disease and will identify new disease pathways. Genome-wide collections of mouse mutations are likely to facilitate the identification of human disease genes, especially for diseases such as glaucoma, for which common large-effect alleles do not appear to contribute significantly to disease.

A new twist on the ENU approach may emerge to produce banked point mutation resources. The mutations present in these banks can be identified using the new massively parallel sequencing techniques. As one approach, frozen sperm banks with associated databases would allow rapid and low cost mutation selection and recovery to live mice.

Mutation collections are needed on different strain backgrounds, as the commonly used B6 and 129 backgrounds are not optimal for studying various diseases. The production of mouse ES cells with different strain backgrounds will be important. Emerging oligonucleotide-based mutagenesis may prove an efficient strategy for producing subtle mutations in selected target genes in ES cells.

Emerging and sophisticated strategies that combine gene traps and transposons may provide efficient methods to produce genome wide collections of null and conditional alleles on these backgrounds. Additionally, they can be used to produce mutation collections on different backgrounds by breeding (no need for ES cells, much simpler and cost effective). These approaches can also be used in other species where it is possible to make transgenic animals.

Where human mutations become characterized it will often be valuable to "humanize" mice by replacing the mouse gene with normal and mutant versions of the human gene, or by adding a human transgene. BAC transgenics will often be a good approach to maintain endogenous expression patterns as they can assume human regulation in a mouse. Humanized models that are phenotypically well characterized using generally accepted/standardized approaches will be very valuable to the field.

Copy number variations are increasingly being implicated in disease and this will be true for ocular diseases. Assessing copy number variations in mouse genetic experiments will be important, as will the ability to manipulate the copy number of single or multiple mouse genes and of small and larger genomic regions. Gene targeting and chromosome engineering techniques can produce desired duplications and deletions. BAC transgenics can increase copy number but this is not easily controllable. A powerful approach for producing genome-wide deletions and duplications by breeding was recently reported (see Wu et al. 2007, cited in Adams review below). This method combines gene trapping and transposon technology, and a centralized effort would produce genome-wide collections of deletions and duplications, in addition to insertional loss-of-function and conditional rescue alleles.

Well-characterized collections of tissue- and cell-specific Cre and/or Flp recombinase (especially Cre) expressing mice will be valuable for determining how mutations cause specific diseases. Ideally these would include CreERT variants that require tamoxifen for activity, providing for spatial and temporal control of conditional mutations and fluorescent protein markers. Different diseases will require different sets of Cre mice that are tailored to manipulate the relevant cells and tissues for each disease (e.g., retinal ganglion cells, astrocytes, microglia, endothelial cells and other cell types in glaucoma).

Similarly, well-characterized collections of fluorescent protein alleles that mark different cell types or organelles in different colors will be very valuable. Ideally different colors would be available for each cell type. These resources can also facilitate genetic screens.

Mice will be increasingly important in understanding complex genetic interactions and the genetic complexity of disease. Valuable new resources are being produced such as the "community cross". Robert Williams and Gary Churchill are experts in this area. Emphasis on training in genetics and the value of these resources is important.

Review and Books

Adams DJ, van der Weyden L. Contemporary approaches for modifying the mouse genome. Physiol Genomics 34(3):225-38, 2008

Eye, Retina, and Visual System of the Mouse, Chalupa LM, Williams RW (Eds.), MIT Press, Massachusetts, 2008

Systematic Evaluation of the Mouse Eye: Anatomy, Pathology and Biomethods. Smith RS, John SWM, Nishina PM, Sundberg JP, (Eds.), CRC Press, Boca Raton, Florida, 2002

Stem Cells in Ocular Disease

Robert M. Lavker, Ph.D.


Stem cells have a large replicative and tissue repair capacity; they are morphologically and biochemically primitive and they are assumed to divide relatively infrequently in adult tissues. Another feature of some, but not all, adult stem cells is a wide differentiation potential (pluripotency), which has vast implications for regenerative medicine. Finally, stem cells are the targets for neoplastic transformation. Upon division, on average one of the two stem cell progeny leaves the stem cell niche (a specialized microenvironment) and becomes a transit amplifying (TA) cell that has a limited proliferative capability.

Corneal epithelium. Of the various zones comprising the ocular anterior surface epithelium, the limbal/corneal epithelium is the most extensively studied and best characterized with respect to ocular stem cell biology. Two important findings form the foundation for the concept that corneal epithelial stem cells are preferentially located in the limbal epithelial basal layer. The first is the seminal finding that the limbal epithelial basal cells lack a K3 keratin marker for an advanced stage of corneal epithelial differentiation, and hence are biochemically more primitive than corneal basal cells. The second is the demonstration that label-retaining cells (LRCs; a marker of putative stem cells) are restricted to the limbal basal layer. Independent support for the limbal stem cell concept comes from the observations that limbal basal cells: (i) have a high in vivo and in vitro proliferative potential; (ii) give rise to TA cells that undergo centripetal migration; (iii) are the predominant site of corneal epithelial neoplasms; (iv) are essential for the long-term maintenance of the central corneal epithelium; and (v) can be used to reconstitute the entire corneal epithelium. Taken together, these findings provide strong support to the limbal stem cell concept.

Conjunctival epithelium. The stem cell situation in the conjunctival epithelium is more complicated. Studies on the cell kinetic and in vitro growth characteristics of mouse bulbar, fornical and palpebral conjunctival epithelia demonstrate that the fornical epithelium is enriched in conjunctival epithelial stem cells, although a small number of cells with stem cell characteristics are also present in the bulbar and palpebral zones. With respect to humans, a fornical location of stem cells has been confirmed; however, a significant number of epithelial stem cells might also be present in the bulbar conjunctiva. The proximal portion of the mouse meibomian gland ductal epithelium has been shown to be a preferential site of stem cells, and the progeny of these cells give rise to the mucocutaneous junction (MCJ) epithelium of the eyelid. Furthermore, basal cells within the MCJ epithelium can emigrate into the palpebral conjunctival epithelium. Collectively, these studies suggest that the fornical zone may be a major, but not exclusive, site of conjunctival epithelial stem cells.

Lens epithelium. A hierarchy of cell proliferation exists within the lens epithelium with the slowest cycling cells, detected as heavily labeled LRCs, located exclusively in the central region. These cells have been proposed as the putative lens epithelial stem cells that divide very infrequently during homeostasis. However, upon perturbation, these cells can enter the proliferative pool. The lightly labeled LRCs, located in the central and germinative zones divide more frequently than the heavily labeled LRCs; however they too possess proliferative capacity. These cells are analogous to the "young" TA cells located at the peripheral corneal epithelium. Finally, a third population of more actively cycling cells exists primarily in the germinative zone and represents the TA cells. After their last division, these cells differentiate into the lens fiber cells.

Retina. In embryonic retina the majority of mitotically active cells are competent to give rise to multiple different cell types of retinal neurons as well as Muller glia. These cells are typically referred to as multipotent progenitors and could also be considered retinal stem cells. In frogs and fish, the cells within the ciliary marginal zone (CMZ), which is the region where the peripheral retina joins the ciliary epithelium, can give rise to all types of retinal neurons. Because these cells can generate most of the retina, the CMZ contains a population of true retinal stem cells. In chickens, cells in the CMZ have the potential to give rise to a large number of progeny, which can generate retinal neurons, glia and retinal pigmented epithelial cells. In the mammalian system, following retinogenesis, there is no evidence of growth in the retinal pigmented ciliary margin (PCM). However, the PCM of mice contain retinal stem cells that can proliferate in vitro with the potential to produce all the cell types of the neural retina, including rod photoreceptors, bipolar neurons and Muller glia. In the human, retinal stem cells are also believed to be present in the PCM. Muller glial cells, the predominant support cell in the retina, have been shown to have neural regenerative capacity being able to form bipolar neurons and rod photoreceptors following perturbation. Photoreceptor cells have the potential to form either rods or cones based on environmental cues.

Major Studies

A. Characterization of Stem Cells

In vivo. One of the most widely accepted means of recognizing stem cells in vivo is the detection of label-retaining cells (LRCs), which identifies cells that divide infrequently (a characteristic of stem cells). LRCs are exclusively located in the basal layer of the limbal epithelium, the anterior portion of the lens epithelium and in the CMZ of teleost retinas. Another means of identifying stem cells in vivo is the presence of putative stem cell markers. Thus far, no single stem cell marker exists and a combination of markers is often used to define a stem cell. For example, corneal epithelial stem cells are characterized by the presence of p63α, ABCG2 and keratin 15 and the absence of connexin 43. The situation in the retina is far more complex, with the expression of nestin, Pax6, Chx10, Sox2, Prox1, Six3, vimentin, Mash1, Ngn2 and Musashi, to name a few, regarded as markers of retinal stem/progenitor cells.

In vitro. The in vitro counterpart of the LRC is the holoclone colony. Holoclones can be isolated from limbal epithelial cultures and have the replicative potential to replace the corneal epithelium. Thus, holoclone-forming colonies have been thought of as bona fide epithelial stem cells in vitro on the basis of their extended lifespan, high colony forming efficiency and ability to generate more committed progeny known as meroclones and paraclones. Very few, if any, holoclones can be isolated from human corneal epithelial cultures.

With respect to the retina, mouse retinal stem/progenitor cells form monolayers or neurospheres. Neurospheres contain markers of retinal progenitors and their progeny express proteins present in subtypes of retinal neurons. Thus, neurospheres have been thought of as retinal stem cells. Addition of growth factors, serum, cytokines and other chemical agents to either neurospheres or monolayers can affect their proliferative and differentiation characteristics.

B. Tissue Regeneration

Corneal epithelium. The limbal stem cell theory forms the basis for identifying and reclassifying a host of corneal blinding diseases that display features of limbal stem cell deficiency (LSCD). Equally important, this theory formed the basis for the development of several surgical procedures using transplanted limbal stem cells to successfully restore vision in patients with LSCD. Thus, the limbal stem cell concept is one of the preeminent examples of "bench-to-bedside" success and could serve as a poster child for the use of stem cells in regenerative medicine.

The idea of ex vivo limbal stem cell expansion to treat LSCD is based on the original technique of using small pieces of limbal epithelial tissue from either the remaining healthy eye (autograft) or from cadavers (allograft) to reconstitute the entire corneal epithelium, thus allowing restoration of a healthy anterior surface. In ex vivo expansion, limbal epithelial stem and young TA cells are isolated from a small limbal biopsy and expanded in culture using either explant or suspension cultures. In both cases, limbal epithelial cells are stimulated to proliferate and eventually form a sheet of confluent cells. Amniotic membrane with or without growth-arrested 3T3 fibroblasts is often used as a substrate and carrier for the cultured limbal epithelial stem cells. Other carriers used in the clinic are fibrin and temperature-sensitive poly(N-isopropylacrylamide) polymer supports. An excellent demonstration of stem cell flexibility comes from recent reports that, in humans, cultivated oral mucosal epithelial cell transplantation ("COMET") results in successful reconstruction of the corneal epithelium following acute corneal burns.

Retina. Given the complexity of the retina, it is not surprising that transplantation of retinal stem cells into damaged or degenerating retinas is in its infancy compared with corneal epithelial transplantation. Newborn mouse retinal stem/progenitor cells transplanted into degenerating or damaged retinas failed to differentiate into a retinal phenotype, suggesting that commitment to specific cell types in vitro may be required prior to transplantation. In contrast, transplanted primary post-mitotic newborn retinal cells migrated and engrafted into the photoreceptor layer of littermate mice. These engrafted cells displayed cone and rod photoreceptor morphology as well as functionality. Transplantation of adult human retinal stem cells into developing mouse and chick retina revealed that these cells could generate retinal ganglion cells and photoreceptors as well as retinal pigmented epithelial cells. Thus, it appears that determining the correct "age" of the retinal stem/precursor cell as well as the best time-window for transplantation is crucial for success.

Next Steps

These remarks will be limited to the area of corneal epithelial stem cells.

There is a need to define further the "limbal epithelial stem cell signature", that is, the set of genes that are unique to this population of basal cells. Such information will: (i) aid in our attempts to isolate limbal epithelial stem cells; (ii) help evaluate the quality of the confluent epithelial sheets; and (iii) provide further understanding of the flexibility among stem cells. Having populations of relatively "pure" stem cells for ex vivo expansion should increase the efficiency and reduce the time needed to form confluent sheets of epithelium for transplantation.

There is a need to define further the limbal stem cell niche, which is the specialized microenvironment that is crucial for the maintenance of limbal stem cells. There are three parts to the niche: (i) cell-cell interactions; (ii) cell-basement membrane interactions; and (iii) cell-stromal interactions. Evidence is accumulating that the methods currently used to isolate limbal epithelial basal cells (e.g., dispase, trypsin) and the culturing conditions (e.g., 3T3 feeder cells, fibrin, amniotic membrane) may be selecting and/or altering the properties of the stem cells. This can result in a lengthy time (>2 weeks) to obtain confluent sheets of cells for transplantation. Alternatively, treatment of small human limbal biopsies in a manner that removes the collagen matrix, but not the stromal cells, results in the generation of significantly larger (20 mm diameter) sheets of limbal cells in much shorter times (~5 days). This suggests that maintaining the limbal "niche" may vastly improve the regenerative capacity of the limbal basal epithelial cells. Therefore, developing methods that preserve the "niche" during corneal epithelial transplantation will greatly advance the field.

A major impediment in the use of limbal epithelial stem cells or retinal stem cells for tissue regeneration is the general lack of GMP (good manufacturing practices) facilities so that stem cell products can be legally used in human patients. Clinical trials (other than Phase I) in the US that use cell-based therapies have to be approved by the FDA and go through an IND (investigational new drug) application. This can be a lengthy and detailed process, which may be beyond the capabilities of a single investigator. To move this process forward, the ocular community needs to create either university-based or communal (industry-based) facilities so that the new stem cell based approaches that will be forthcoming from the research laboratories can be legally translated to the bedside. One such facility now exists in the US; however it is privately owned. Use of NEI resources to facilitate creation of GMP facilities would be an excellent investment.

The success rate of ex vivo expansion varies greatly (33%-100%), with a mean of ~77%. Consequently, there is a need to standardize many aspects of this technique. For example, outcome measures of success vary greatly, mean follow-up times are different (6 months vs. 2+ years), protocols for limbal epithelial cell isolation and culture are widely disparate, and carriers (amniotic membrane, fibrin, and polymers) differ. A common set of objective outcome measures will greatly aid in assessing the value of ex vivo transplantation versus intact limbal transplantation for corneal regeneration.

Finally, much attention has been directed at identifying genes and proteins that are altered in various diseases of the ocular anterior surface. However, one area that has been relatively neglected is the regulation of protein synthesis. MicroRNAs (miRNAs) are an exciting class of noncoding RNAs that can regulate gene expression at many levels, giving rise to the idea that RNA is capable of performing regulatory roles similar to those of proteins. Evidence suggests that miRNAs can play roles in developmental timing, stem cell development, cell proliferation, differentiation, apoptosis, and in the initiation and progression of cancer. Despite the importance of these regulatory RNAs, relatively little attention has been directed towards understanding miRNAs associated with ocular tissues and their associated diseases. It is anticipated that in the next few years there will be a significant increase in research directed towards understanding how miRNAs regulate ocular tissues. An NEI-sponsored workshop and/or conference on "miRNAs and their Role in Ocular Disease", eventually leading to the generation of an RFA, will be a cost-effective means of stimulating research in this important area.


Lavker, R. M., Tseng, S. C. G., and Sun, T.-T. (2004). Corneal epithelial stem cells at the limbus: looking at some old problems from a new angle. Exp Eye Res 78, 433-446.

Pellegrini, G., De Luca, M., and Arsenijevic, Y. (2007). Towards therapeutic application of ocular stem cells, Seminars in Cell & Developmental Biology 18, 805-18.


Study Design and Statistical Genetic Analysis

Alexander F. Wilson


The goal of statistical genetic analysis is the identification of genetic effects (loci or alleles) that are responsible, at least in part, for the variation of a trait, or the susceptibility to the development of a disease. No single study design or method of analysis will identify all genetic effects; different study designs and methods are required depending on the size and frequency of the effect that you wish to detect. Unlike traditional approaches in biostatistics, where study designs often assume that the observations and the independent variables are indeed independent, in statistical genetics there are often dependencies between the observations (family relationships) and the "independent" variable (i.e., markers in linkage disequilibrium).


Thirty years ago, the primary focus of statistical genetic analysis was on the phenotype and the study design was usually family based. Considerable effort was spent in identifying the underlying mode of inheritance and frequency of a suspected single major locus underlying a disorder. Linkage analysis, the cosegregation of that putative locus and other known genetic variants or "markers" (primarily red cell antigens and electrophoretic protein polymorphisms), was then used to identify genetic loci that co-segregated with the putative trait locus. Since that time the nature and number of genetic markers have changed dramatically. Genetic markers have evolved from red cell antigens and protein polymorphisms (dozens) to Restriction Fragment Length Polymorphisms (RFLPs: hundreds), Short Tandem Repeat Polymorphisms (STRPs: many hundreds), Single Nucleotide Polymorphisms (SNPs: many thousands), high density SNPs (hundreds of thousands), copy number variants (CNVs) and sequence variants (millions), and, in the not too distant future, complete sequence (billions). Given the high quality of the current SNP data, the focus is now on whether specific genetic (or sequence) variant(s) are involved in the variation in the trait or susceptibility to disease.

Phenotypic traits can be either quantitative (continuous variables that are measurable) or qualitative (variables that represent disease status, i.e., affected or unaffected). Study designs can be either family based or population based or a combination of the two. Family based designs tend to be more expensive with respect to data collection, but tend to minimize the effects of population stratification and genetic heterogeneity better than population based designs. Family based designs have the advantage of being able to determine whether polymorphisms are segregating as Mendelian loci and allow for the estimation of the heritability for quantitative traits. Finally, studies can be targeted to include only specific candidate regions or genes (candidate gene approach) or to encompass the entire human genome (genome-wide approach), and/or to focus on single genes or entire genetic pathways.

There are two general types of statistical genetic analysis that use genetic polymorphisms. Linkage analysis can only be done in family study designs and the method identifies loci that are cosegregating with a putative trait locus/loci (Linkage identifies Loci). Resolution in linkage analysis is in the cM range, which can, at best, result in candidate regions that contain many genetic loci. Unfortunately, resolution at that level makes targeted sequencing impractical at the moment. Tests of association can be done in either family- or population-based designs and they test for a relationship between a trait and a specific allele (Association identifies Alleles). Resolution for tests of association is in the Kb range, and depends on the linkage disequilibrium structure of the population being studied. Resolution at this level is currently more tractable for targeted sequencing in a candidate gene approach. Linkage analysis in family-based study designs is most appropriate for identifying rare variants with moderate to large effects in families rather than in a population. Tests of association in population-based studies are most appropriate for identifying common variants that have measurable effects in a population. However, the size of the allelic effect is generally more important than its frequency. For many traits, common allelic variants have very small effects on the susceptibility to the disease or variation of the trait; in diabetes, for example, common alleles detected in genome-wide case-control association studies have odds ratios of about 1.1. In contrast, the allele for Familial Hypercholesterolemia accounts for a substantial proportion of the variation in LDL levels in families with the allele, but the allele is so rare that its effect on the population is insignificant.
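To make the scale of an odds ratio of about 1.1 concrete, the following minimal sketch computes an odds ratio and an approximate 95% confidence interval from a 2x2 table of allele counts, using the standard log-odds (Woolf) approximation. The function name and all counts are hypothetical and chosen only for illustration; they are not taken from any study cited here.

```python
import math

def odds_ratio_with_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Return (odds ratio, lower, upper) for a 2x2 table of allele counts.

    The interval uses the Woolf approximation: the standard error of
    log(OR) is sqrt(1/a + 1/b + 1/c + 1/d).
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )

# Hypothetical allele counts: 1,000 cases and 1,000 controls (2,000 alleles each).
or_est, lower, upper = odds_ratio_with_ci(1100, 900, 1052, 948)
print(f"OR = {or_est:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With these illustrative counts the estimate is close to 1.1 but the interval still includes 1, which is one way to see why small allelic effects require large samples to detect.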


There are numerous examples of linkage analysis and tests of association in both family and population-based studies in the literature, including many in the genetics of ocular disease (e.g., AMD [Klein RJ et al., 2005]). What may be more interesting are current association studies that replicate or corroborate findings originally identified in family studies with linkage analysis. Recently, McMahon et al. [2006] used a split sample candidate gene approach focusing on genetic pathways in a sample of nearly 2000 unrelated individuals with major depression to identify the 5HTR2A serotonin receptor locus, which affects response to Citalopram, an SSRI used to treat depression. This same receptor was identified as a possible candidate gene for depression in a family study done 25 years earlier in a meta-analysis of linkage of depression to ESD (physically next to the 5HTR2A receptor locus) [Wilson et al., 1991].

Next Steps

Several Institutes across NIH (NIA, NIDA, NICHD) face similar challenges to NEI with respect to integrating genetics and genomics into their research portfolios.

  1. Communal research resources. Use a single genotyping site and platform for studies of a given disease.
  2. Genotype existing family and population samples only where there is high quality phenotype data in a large number of samples. Use linkage analysis and intra-familial tests of association in the families to identify rare alleles with large effects, and population based studies to identify common variants.
  3. Foster/Encourage/Force large scale collaborations that use a single high density SNP genotyping panel to improve the power to detect genetic effects in both family and population based studies. Define common phenotypes, genotype on a single platform, and consider using split sample design with discovery and replication subsamples to ameliorate multiple-testing problems. Funding a few large studies is preferable to funding many small studies because of the differences inherent in phenotype definition, genotyping platform differences, and methods of analysis, and subsequent reduction in power.
  4. Prepare for the inevitable deluge of sequence data. The amount and error rate of sequence data are substantially greater than those for SNPs. Family data provide a means of identifying sequence variants that are segregating, reducing sequencing errors. Also consider a hybrid study design like the one proposed in the NHGRI and NHLBI ClinSeq study. In that study, a candidate gene approach with targeted sequencing is used to screen a large number of loci in a sample where participants are selected based on their phenotype and the willingness of their first degree relatives to participate in the study as well. Unique sequence variants can then be confirmed in the relatives.
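The split sample design mentioned in item 3 can be sketched as a random partition of subjects into disjoint discovery and replication subsamples. This is a minimal illustration only: the function name, the 50/50 split, the seed and the subject identifiers are all assumptions, and a real study would typically also balance the subsamples on phenotype and ancestry.

```python
import random

def split_sample(subject_ids, discovery_fraction=0.5, seed=0):
    """Randomly partition subjects into disjoint discovery and replication sets."""
    shuffled = list(subject_ids)
    # A seeded generator keeps the partition reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * discovery_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical cohort of 2,000 subjects.
subject_ids = [f"S{i:04d}" for i in range(2000)]
discovery, replication = split_sample(subject_ids, discovery_fraction=0.5, seed=42)
print(len(discovery), len(replication))
```

Only markers that show association in the discovery subsample would then be tested in the replication subsample, so far fewer tests are performed at the second step.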


Borecki IB, Province MA. Linkage and association: Basic concepts. Advances in Genet 2008; 60:51-74.


Klein RJ, Zeiss C, Chew EY, Tsai J-Y, Sackler RS, Haynes C, Henning AK, SanGiovanni JP, Mane SM, Mayne ST, Bracken MB, Ferris FL, Ott J, Barnstable J, Hoh J. Complement Factor H Polymorphism in Age-Related Macular Degeneration. Science 2005; 308:385-389.

McMahon FJ, Buervenich S, Charney D, Lipsky R, Rush AJ, Wilson AF, Sorant AJM, Papanicolaou GJ, Laje G, Fava M, Trivedi M, Wisniewski S, Manji H. Variation in the gene encoding the serotonin 2A receptor is associated with outcome of citalopram treatment: Results from the STAR*D trial. Am J Hum Genet 2006; 78:804-814.

Wilson AF, Elston RC, Mallott DB, Tran LD, Winokur G. The current status of genetic linkage studies of alcoholism and unipolar depression. Psychiatric Genet 1991; 2:107-124.


Eye Disease Phenotypes

Where are we and where do we need to go from here?
Rohit Varma


A disease is a state that places individuals at increased risk for adverse consequences. In contrast, deviations from normal that do not place individuals at increased risk for adverse consequences are not considered a disease. Labeling a patient as diseased has been debated by nominalists who label symptoms with a disease name without offering an explanation for the underlying etiology and by essentialists who argue that an underlying pathological etiology exists for every disease and that the disease state should be defined by an essential lesion.

To be considered a disease, the genotypic and phenotypic states of the patient must have the potential for adverse consequences. When determining states associated with disease, the challenge is to describe potential adverse outcomes comprehensively and explicitly. Although a few diseases are universally and prematurely fatal, most diseases place patients at an increased but variable risk for morbidity or mortality. For example, some patients with high blood pressure will be asymptomatic throughout life, about 30% will suffer adverse consequences such as heart disease, and 5 to 10% will die from a stroke. Here, the "cutoff" between the categories of diseased and non-diseased could be based on many factors, including an implicit understanding of risk and potential for treatment. Criteria defining which individuals are diseased are important because abnormalities, such as genetic variations or elevated blood pressure, may occur in otherwise asymptomatic patients. Criteria for certain diseases, such as diabetes, have improved, owing to more accurate definitions of risk and better treatments. The risk of adverse consequences for some genetic abnormalities may be so low that the state is better described as a risk factor rather than being viewed as synonymous with disease. Defining the level of risk is important because any trait, condition, or behavior associated with a genetic abnormality is in danger of being construed as disease-associated. Patients with a genetic variation who are at minimal or no increased risk for adverse consequences should not be labeled as diseased. If the definition of disease is based solely on a genetic abnormality/anatomic lesion rather than on a clear specification of the risk, the label may harm the patient.

The leading causes of eye disease in the US and the world are:

For each disease, a better understanding of the pathophysiology particularly through an assessment of genetic risk associations would provide insight into the development of the disease and allow for identification of therapeutic targets that would possibly delay development and progression of the disease.



Currently the phenotypic definition of AMD in various studies is not standardized, and therefore it may be difficult to compare the relationship between genetic variations and the phenotypic disease definitions. For example, in the three landmark studies published in Science that identified the increased risk of AMD associated with a variant of complement factor H, the definitions of the phenotype were very different. In a retrospective study on patients participating in the NEI Age-Related Eye Disease Study (AREDS), Klein et al. (Science 15 April 2005: Vol. 308, no. 5720, pp. 385-389) identified 96 case subjects who, at their most recent study visit, had either uniocular choroidal neovascularization (50 cases) or geographic atrophy central or non-central to the macula (46 cases). The case subjects' fellow eye was required to have at least one large drusen (>125 μm in diameter), and total drusen area equivalent to a circle of at least 1061 μm in diameter. Controls were 50 individuals from the AREDS sample who had few or no drusen (<63 μm in diameter in each eye) for the duration of their participation in AREDS. Haines et al. (Science 15 April 2005: Vol. 308, no. 5720, pp. 419-421) assigned AMD status based on the clinical evaluation of stereoscopic color fundus photographs of the macula (EAP, AA), according to a 5-grade system, i.e., grade 1 has no AMD features, grade 2 has only small non-extensive drusen, grade 3 has extensive intermediate and/or large drusen, grade 4 is geographic atrophy, and grade 5 is neovascular AMD. This system is a slight modification of the AREDS grading system and uses example slides from the Wisconsin Grading System and the International Classification System as guides. Finally, in a study by Edwards et al. (Science 15 April 2005: Vol. 308, no. 5720, pp. 421-424), AMD cases had one or more drusen ≥63 μm in diameter, documented at some point during their disease course. The minimum disease severity required for inclusion was high risk AMD, defined as sufficient drusen in the macula to fill a circle 700 μm in diameter or drusen with more advanced features such as retinal pigment epithelial hyperplasia. The presence of these features was predictive of significant risk for developing complications of AMD such as exudation and has been used previously in AMD genetics studies. Cases were diagnosed as (i) high risk, early AMD, (ii) AMD with pure geographic atrophy, or (iii) AMD with exudation using standard criteria.

While it is gratifying that the relationship between AMD and CFH was present despite the different definitions, it may be more informative from a mechanistic standpoint to refine the relationship between a more carefully defined AMD phenotype and CFH.

There are many large cohort studies for glaucoma, including large clinical trial cohorts (OHTS, CIGTS, EMGTS), that have used different definitions of glaucoma. In the OHTS, glaucoma was determined by standardized graders at a Reading Center, followed by consensus of an endpoint committee. In CIGTS and EMGTS it was defined by the individual physician. In various population-based studies (Barbados Eye Studies, Blue Mountains Eye Study, Rotterdam Eye Study, and Nurses Health Study) the definition of glaucoma was based on consensus visual field loss and optic nerve changes. Each of these definitions is different.

Thus, for glaucoma, concordance of phenotype definition is rudimentary; in particular, there is no standardization of angle width or of the inclusion of optic nerve damage. Genetic studies utilizing these broad disease definitions have largely not been successful (except for pseudoexfoliation syndrome). It may be more useful to identify specific lesions or to take a systems approach to phenotype definition.

Diabetic Retinopathy
The Early Treatment Diabetic Retinopathy Study severity scale is widely used for DR assignment and uses different anatomic lesions of varying sizes to determine severity. A more recent International Clinical Classification of Diabetic Retinopathy has been proposed. Despite these standardized classifications, no genetic associations have been identified.

There are many large cohort studies of cataract that have used different grading systems. The Barbados Eye Studies used slitlamp examination and LOCS II classification. The Blue Mountains Eye Study, AREDS and Beaver Dam Study used photographic grading with the Wisconsin System for Classification of Cataracts from Photographs. In addition to these systems, a Wilmer Grading system also exists and has been used in some cohort studies. Again, there is little concordance among the various grading systems. The current phenotype grading systems have not yielded much success in phenotype-genotype associations.


It is clear that many different phenotype definitions are used in various studies for each disease. Thus, there is a need for uniformity along with further elucidation of specific lesions. Furthermore, more advanced approaches, such as network systems models, need to be used to characterize disease phenotype objectively.

Next Steps

Directions for future work will need to include the synchronization of disease definitions (anatomical, physiological, and environmental) with the help of the NEI and the eyeGene Network, and the evaluation of the risk relationships between individual disease characteristics and combinations of them. In addition, more advanced technologies, such as spectral-domain optical coherence tomography, non-invasive fundus oximetry, and 24 hr intraocular pressure sensors, will be required to develop novel phenotypes, in coordination with a special task force. Finally, future research needs to concentrate on two approaches: (i) identifying and defining specific disease-related lesions; and (ii) developing phenotype assessment using a network systems approach, outlined in the figure below (Loscalzo et al., Molecular Systems Biology 2007;3:124:1-11).

Network Model of Phenotype Definition


Genome-Wide Association Studies

Lindsay A. Farrer


Genome-wide association studies (GWAS) use high-throughput genotyping technologies to profile large study samples (usually numbering 1,000 or more subjects) for hundreds of thousands of single nucleotide polymorphisms (SNPs) and relate them to clinical conditions and measurable traits. In the past 4 years, there has been an unprecedented rate of discovery, with nearly 150 loci for more than 50 diseases and other traits identified and replicated in GWAS. These successes can be attributed to two major international efforts. One of these efforts, the Human Genome Project, resulted in the (nearly) complete characterization of the consensus human sequence. This has greatly increased our ability to identify and describe the genomic structure of genes. A second effort, the HapMap project, characterized common genetic variability in humans including more than 2.5 million SNPs and, more recently, insertions and deletions known as copy number variations (CNVs). Multiple technologies now allow a GWAS design to be implemented with high fidelity and low cost (per genotype). The alleles, genotypes, or haplotypes of these SNPs are tested directly for association with disease. Estimates suggest that with 500,000 SNPs, 85-92% of the common Caucasian variation in the genome will be captured. The same genotyping platforms also capture CNV information. Thus, GWAS is by far the most detailed and complete method of whole genome interrogation currently available.

There is no single paradigm for the analysis of GWAS data. An increasing number of publications have attempted to address very specific analytical issues.1-3 For example, exhaustive allelic transmission disequilibrium tests1 can be used to examine all possible single locus and haplotype combinations in a computationally efficient manner, but are restricted to parent-child trio data. Classical statistical methods such as the chi-square test, logistic regression, and the Armitage trend test are commonly used for case-control association studies, and linear regression and analysis of variance are often used to analyze quantitative trait data. A problem with such an initial analytical approach is the large number of expected false positive results. Using a nominal α=0.05 for a GWAS including 500,000 SNPs results in an average of 25,000 false positive results. Much has been written about the problem of how to correct for the vast number of single locus tests being performed, but consensus has not yet emerged.4-7 A Bonferroni correction is clearly too conservative for several reasons, including the fact that it assumes the independence of each test even though many of the SNPs are in linkage disequilibrium and thus correlated with each other. Substantial effort has been devoted to developing alternatives to Bonferroni correction for multiple testing. Many of these methods are promising and much research is ongoing.
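The arithmetic behind the multiple-testing problem can be sketched in a few lines. The example below reproduces the expected 25,000 false positives, computes the corresponding Bonferroni threshold, and applies a simple Pearson chi-square test (1 degree of freedom) to a 2x2 table of allele counts. The function names and the allele counts are hypothetical, used only for illustration; the p-value uses the identity that the upper tail of a chi-square variable with 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math

def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests

def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 df) on a 2x2 allele-count table: (statistic, p-value)."""
    observed = ((case_a, case_b), (control_a, control_b))
    row_totals = (case_a + case_b, control_a + control_b)
    col_totals = (case_a + control_a, case_b + control_b)
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

n_snps = 500_000
print(expected_false_positives(n_snps, 0.05))   # about 25,000
threshold = bonferroni_threshold(n_snps)        # about 1e-7
print(threshold)

# Hypothetical allele counts for one SNP in 1,000 cases and 1,000 controls.
statistic, p_value = allelic_chi_square(1100, 900, 1000, 1000)
print(statistic, p_value, p_value < 0.05, p_value < threshold)
```

In this made-up example the SNP is significant at the nominal level but falls well short of the Bonferroni threshold, which illustrates why the choice of correction matters so much in practice.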

Major Studies

GWAS has already been validated with the identification of the highly significant effect of the Y402H polymorphism in the complement factor H (CFH) gene in age-related macular degeneration (AMD). This was found simultaneously through GWAS8 and targeted positional candidate approaches.9,10 It is important to point out, however, that the small sample sizes used for that GWAS were sufficient only because the effect of the Y402H polymorphism was unexpectedly large. Subsequently, the use of GWAS methods has rapidly led to susceptibility gene identification in types 1 and 2 diabetes, breast cancer, multiple sclerosis, Crohn's disease, colorectal cancer, prostate cancer, height, body mass index, eye color, and others (see recent reviews11-14). However, because some (and perhaps many) GWAS results meeting genome-wide significance will be false or imprecise, there are a number of steps that should be followed to validate, extend and refine detected genome-wide associations.15

Alternative GWAS Designs. To avoid high genotyping cost and multiple testing problems, GWAS often follow a staged design, in which a large number of markers are genotyped in a portion of the sample in the first stage, and a relatively small number of markers showing association in the discovery dataset are genotyped in the remainder of the sample in the second stage. Skol et al.3 examined the use of joint analysis as a more efficient approach to two-stage GWAS than replication-based analysis. They showed joint analysis of both stages of the data resulted in increased power to detect genetic association, even when effect sizes differ between the two stages. Even with this added power, ultimately, supplementary data need to be used to filter results down to a manageable number of the most likely genes to undergo comprehensive molecular analysis. As a variation on this theme, Yu et al.16 conducted a GWAS for several substance dependence traits using two datasets derived from a single study population. Although their study appears to conform to the staged design, the datasets — one Caucasian and the other African American — were treated as independent discovery samples. SNPs showing significant association in the same direction in both datasets were assigned higher priority for follow-up because such findings when observed in genetically diverse populations are more likely to be real.15 One of the most notable findings, association of cocaine-induced paranoia with a SNP in the α-endomannosidase (MANEA) gene, was confirmed in a follow-up study including two additional datasets.17
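The idea of combining both stages, rather than treating stage two purely as a replication test, can be illustrated with a small Python sketch. The weighting of each stage by the square root of its share of the total sample is an assumption of this sketch, not a formula quoted in the text. The function names and the direction check are hypothetical helpers.

```python
import math


def joint_z(z_stage1: float, z_stage2: float, frac_stage1: float) -> float:
    """Combine per-stage z statistics for one SNP into a single statistic.

    frac_stage1 is the fraction of all samples genotyped in stage 1.
    The weights are chosen so that the result has unit variance when the
    two stage statistics are independent standard normals.
    """
    if not 0.0 < frac_stage1 < 1.0:
        raise ValueError("frac_stage1 must lie strictly between 0 and 1")
    return math.sqrt(frac_stage1) * z_stage1 + math.sqrt(1.0 - frac_stage1) * z_stage2


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def same_direction(z_a: float, z_b: float) -> bool:
    """True when two datasets show association in the same direction."""
    return z_a * z_b > 0


if __name__ == "__main__":
    # Made-up z statistics for a single SNP; not taken from any study.
    z1, z2 = 3.1, 2.4
    z = joint_z(z1, z2, frac_stage1=0.3)
    print("joint z:", round(z, 3), "p:", two_sided_p(z), "consistent:", same_direction(z1, z2))
```

A SNP with moderate evidence in each stage can reach a stronger combined signal than either stage alone, which is the intuition behind the power gain reported for joint analysis.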

Endophenotypes. Current difficulties in the search for susceptibility genes of complex diseases have been attributed to the etiological heterogeneity of the clinically defined disease phenotype. Biological correlates of the disease that are genetically influenced and stable over time (endophenotypes) are considered to be more promising phenotypic targets. They are more directly influenced by gene effects and presumably have a less complex genetic basis than the disease phenotype. While few GWAS thus far have examined endophenotypes, notable genetic associations have been reported for cognitive performance18 (with SORL1, which also has been associated with Alzheimer disease in numerous studies19), insulin secretion and sensitivity,20 and body mass index.21-23

Next Steps

The GWAS approach has the potential to identify novel loci for many common ocular diseases and traits including AMD, glaucoma, diabetic retinopathy, cataract, oculomuscular disorders and refractive error conditions. A key to success is the availability of sufficiently powered discovery and replication samples of well-characterized subjects using standardized methods. NEI should support programs that bring together existing datasets for collaborative GWAS and consider investing in new resources for the study of phenotypes where current datasets are inadequate or where unique and powerful datasets (e.g., genetically homogeneous populations) can be assembled. Given the likely heterogeneity and complexity of many of these disorders, studies of reliable and credible endophenotypes should be encouraged. Finally, since disorders of the eye are often associated with other conditions, neurological and vascular in particular, studies that focus on such interactions may provide important clues about pathogenesis and novel treatments.


  1. Lin S, Chakravarti A, Cutler DJ. Exhaustive allelic transmission disequilibrium tests as a new approach to genome-wide association studies. Nat Genet 2004;36:1181-1188.
  2. Marchini J, Donnelly P, Cardon LR. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet 2005;37:413-417.
  3. Skol AD, Scott LJ, Abecasis GR, Boehnke M. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nat Genet 2006; 38: 209-213.
  4. NCI-NHGRI Working Group on Replication in Association Studies. Replicating genotype-phenotype associations. Nature 2007;447:655-660.
  5. Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005;6:95-108.
  6. Roeder K, Bacanu SA, Wasserman L, Devlin B. Using linkage genome scans to improve power of association in genome scans. Am J Hum Genet 2006;78:243-252.
  7. Hunter DJ, Kraft P. Drinking from the fire hose--statistical issues in genomewide association studies. N Engl J Med 2007;357:436-439.
  8. Klein RJ, Zeiss C, Chew EY, et al. Complement factor H polymorphism in age-related macular degeneration. Science 2005;308:385-389.
  9. Haines JL, Hauser MA, Schmidt S, et al. Complement factor H variant increases the risk of age-related macular degeneration. Science 2005;308:419-421.
  10. Edwards AO, Ritter R, Abel KJ, Manning A, Panhuysen C, Farrer LA. Complement factor H polymorphism and age-related macular degeneration. Science 2005;308:421-424.
  11. Mohlke KL, Boehnke M, Abecasis GR. Metabolic and cardiovascular traits: an abundance of recently identified common genetic variants. Hum Molec Genet 2008; 17:R102-R108.
  12. Easton DF, Eeles RA. Genome-wide association studies in cancer. Hum Molec Genet 2008; 17:R109-R115.
  13. Autoimmune diseases: insights from genome-wide association studies. Hum Molec Genet 2008; 17:R116-R121.
  14. Altshuler D, Daly MJ, Lander ES. Genetic mapping in human disease. Science 2008; 322: 881-888.
  15. Ioannidis JPA, Thomas G, Daly MJ. Validating, augmenting and refining genome-wide association signals. Nature 2009; 318-329.
  16. Yu Y, Kranzler HR, Panhuysen C, Weiss RD, Poling J, Farrer LA, Gelernter J. Substance dependence whole genome association study in two distinct American populations. Hum Genet 2008; 123:495-506.
  17. Farrer LA, Kranzler HR, Yu Y, Weiss RD, Brady KT, Cubells JF, Gelernter J. Association of variants in the α-endomannosidase (MANEA) gene with cocaine-related behaviors. Arch Gen Psychiatry 2009; 66:267-274.
  18. Seshadri S, DeStefano AL, Au R, et al. Genetic correlates of brain aging on MRI and cognitive test measures: a genome-wide association and linkage analysis in the Framingham study. BMC Genetics 2007; 8 (Suppl 1):S15.
  19. Rogaeva E, Meng Y, Lee JH, et al. The sortilin-related receptor SORL1 is functionally and genetically associated with Alzheimer's disease. Nat Genet 2007; 39:168-177.
  20. Ruchat SM, Elks CE, Loos RJ, Vohl MC, Weisnagel SJ, Rankinen T, Bouchard C, Pérusse L. Association between insulin secretion, insulin sensitivity and type 2 diabetes susceptibility variants identified in genome-wide association studies. Acta Diabetol 2008. PMID: 19082521.
  21. Frayling TM, Timpson NJ, Weedon MN, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 2007; 316: 889-894.
  22. Scuteri A, Sanna S, Chen WM, et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet 2007;3:e115.
  23. Willer CJ, Speliotes EK, Loos RJ, et al. Six new loci associated with body mass index highlight a neuronal influence on body weight composition. Nat Genet 2009; 41:25-34.

Next Generation Sequencing Technologies

Elaine Mardis


Prior to 2005, DNA sequencing required a series of steps that included subcloning into a vector, introduction into a host with plating on selectable media, growth and picking of selected subclones, DNA isolation, and sequencing on capillary instruments. With the advent of massively parallel approaches to DNA sequencing, this paradigm has changed dramatically, both in the numbers and types of steps required to generate data and in the scale of data generation. Current massively parallel approaches share several features, including a simplified approach to library construction: DNA is fragmented, platform-specific DNA adapters are ligated to the fragment ends, and the fragments are enzymatically amplified on an oligonucleotide-derivatized surface (glass slide or bead) to increase the number of copies of each unique fragment prior to sequencing. The actual sequencing reaction differs among platforms: 1) the Roche/454 platform sequences using native nucleotide incorporation and a downstream reporting system that emits light from luciferase/luciferin conversion in proportion to the amount of released pyrophosphate (PPi), 2) the Illumina system uses 3'-blocked nucleotides, each differentially labeled with one of four fluors, and polymerase incorporation to identify the nucleotide on the template strand being sequenced, and 3) the Applied Biosystems/Life Technologies system uses sequential ligations of differentially labeled short primer probes to a standard annealed universal primer.

Major Studies

Massively parallel sequencing has been used to sequence entire genomes of cancer samples and to compare them with the sequenced matched normal genome (and with common variants from dbSNP) in order to identify somatic single-nucleotide and indel variants that play a role in carcinogenesis. Other approaches to identifying disease genes have involved targeted capture of exons, often from a list of suspect genes drawn from previous studies or suspect pathways. Targeted capture can be achieved by a variety of approaches and is particularly well suited to specific gene lists (or whole exomes) combined with massively parallel sequencing. Finally, several groups have generated PCR products (both small and long-range) targeted toward genes found under GWAS peaks, successfully identifying variants in causative genes.
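The tumor versus matched-normal comparison described above amounts, at its simplest, to a set subtraction. The Python sketch below shows that logic only. The `Variant` type, the function name, and the example coordinates are all hypothetical, and a real pipeline would also need to handle sequencing error, coverage, and variant representation.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Minimal, hypothetical description of a single-nucleotide or indel variant."""
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_somatic(
    tumor: Iterable[Variant],
    matched_normal: Iterable[Variant],
    known_common: Iterable[Variant],
) -> Set[Variant]:
    """Variants seen in the tumor but absent from the matched normal genome
    and from a catalogue of common variants (for example, dbSNP entries)."""
    return set(tumor) - set(matched_normal) - set(known_common)


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    germline = Variant("chr1", 1000, "A", "G")
    common = Variant("chr2", 2000, "C", "T")
    novel = Variant("chr3", 3000, "G", "GA")  # a one-base insertion
    print(candidate_somatic([germline, common, novel], [germline], [common]))
```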

Next Steps

Next-generation, massively parallel sequencing can be applied to ocular genetics in a number of ways. If affected tissues can be obtained, a number of biomolecules can be assayed including mRNA, miRNA, differential methylation status compared with normal tissues, and transcription factor (or other regulatory protein) binding sites. Often, these data types can be cross-correlated to enhance the understanding of the biology of a given disease.


Mardis, E.R. (2008). The impact of next-generation sequencing technology on genetics. Trends in Genetics 24(3): 133-41.

Morozova, M and Marra, M.A. (2008). Applications of next-generation sequencing technologies in functional genomics. Genomics 92(5): 255-64.


Established and Evolving Proteomics Analytical Platforms

R. Reid Townsend


Proteomic analysis combines strategies for sample preparation, protein separation, identification, and quantification into workflows and platforms designed to analyze complex mixtures of proteins and to define the molecular ensembles that arise from post-translational modifications. Two major established platforms have evolved, one at the protein level using two-dimensional gel electrophoresis (Friedman, Methods in Mol. Biol. 367, 219) and the other using peptides as surrogates for the identification and quantification of proteins (Qian et al., Mol. Cell. Proteomics. 5, 1727). An evolving approach ('top-down proteomics') is being developed to characterize gene products that result from protein modifications (e.g., phosphorylation, methylation, ubiquitylation…). Mass spectrometry is the major technology for molecular characterization in proteomics. Limitations of the current platforms include low throughput and limited depth of proteomic coverage, that is, the ability to define all proteins and protein forms in a biological sample. However, within the last few years, there has been another turn in the proteomics technology spiral that addresses these limitations of speed and sensitivity. One platform combines robotically driven automated sample preparation and processing, robust nano-liquid chromatography (LC) directly coupled to rapid-acquisition, high-resolution mass spectrometry (MS), and label-free quantitative proteomic methods to achieve significant improvements in sensitivity, reproducibility, quantification, and throughput. This proteomics platform can produce quantitative data on tens of thousands of peptides per day in an automated format. This new quantitative platform is being widely applied to comparative proteomics of biological fluids, tissues and cells, protein interactions, and discovery and development of disease-associated biomarkers.

The availability of lower cost, high-resolution mass spectrometers is also fueling the development of 'top-down' proteomics. A major focus of this area is to define the population of different gene products that result from splice variants and post-translational modifications, including proteolytic trimming. It is well recognized that protein modifications can have profound biological outcomes. In a few cases it has become apparent that the biological outcome is the result of patterns of covalent modifications on proteins within a specific biological context (Turner, Nature Cell Biol. 2007). The modification of histones by combinations of methylation, acetylation, and phosphorylation has long term (epigenomic) and short term (cellular signaling) biological effects. This semiotic system, whereby a specific set of modifications on Ser/Thr and Lys residues on specific histones is recognized by adaptor molecules that regulate transcription at different times and places in the life of an organism, has been termed the histone code (Jenuwein et al., Science 293, 2001). Combinatorial phosphorylation, ubiquitylation and proteolysis have been shown to be critical in triggering the degradation of Cdc25A phosphatase in cell cycle progression (Busino et al., Oncogene, 23, 2050). Multi-site phosphorylation has been suggested to be a general mechanism to set thresholds for regulated protein-protein interactions (Nash et al., Nature 414, 2001). Deciphering the codes from protein modifications and understanding the impact of gene product processing would be accelerated by technology platforms that can define the population of molecules derived from a gene product in a complex biological mixture. Multi-site modification of a single protein could theoretically result in a large number of individual protein forms, each of which would have to be characterized or deduced in a population of molecules using analytical measurements.
However, studies to date indicate that there is a restricted number of observed modification sites, presumably due to the specificity of modifying enzymes. For example, for histones, modifications are confined to the amino terminal tails that project from the DNA of the nucleosome, and only specific residues are modified with phosphate, methyl or acetyl groups. One approach to the 'ensemble' problem in proteomics combines accurate mass measurements of the protein, which reveal multiple masses, with gas-phase sequencing data (protein and peptide) that explain those masses and allow the population of modified molecules to be deduced. The integration of 'top-down' and 'bottom-up' proteomics data holds significant promise for characterizing the biologically important molecules that result from processing of gene products.
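The combinatorial growth in protein forms, and the reason intact mass alone cannot always resolve them, can be sketched in Python. The three sites below are hypothetical, the function names are illustrative, and the mass shifts are approximate monoisotopic values for each modification.

```python
from itertools import product
from math import prod

# Approximate monoisotopic mass shifts in daltons.
MASS_SHIFT = {"none": 0.0, "phospho": 79.96633, "acetyl": 42.01057, "methyl": 14.01565}


def count_forms(site_options):
    """Number of distinct protein forms: the product of the state counts per site."""
    return prod(len(options) for options in site_options)


def forms_by_mass(site_options, ndigits=3):
    """Group every combination of site states by its total mass shift."""
    groups = {}
    for combo in product(*site_options):
        total = round(sum(MASS_SHIFT[state] for state in combo), ndigits)
        groups.setdefault(total, []).append(combo)
    return groups


if __name__ == "__main__":
    # Hypothetical protein with two phosphorylation sites and one lysine
    # that may be acetylated or methylated.
    sites = [
        ("none", "phospho"),
        ("none", "phospho"),
        ("none", "acetyl", "methyl"),
    ]
    print("possible forms:", count_forms(sites))
    for mass, combos in sorted(forms_by_mass(sites).items()):
        print(mass, combos)
```

In this toy case the singly phosphorylated forms share the same intact mass regardless of which site carries the phosphate, so fragmentation (gas-phase sequencing) data are needed to tell them apart.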


  1. Comparative proteomics (relative and absolute protein quantification)
    America, AH and Cordewener, JH (2008) 'Comparative LC-MS: A landscape of peaks and valleys'. Proteomics 8, 731.
  2. Protein-protein and protein-DNA interactions
    Figeys, D. (2008) 'Mapping the human protein interactome'. Cell Res. 18, 716.
  3. Molecular characterization of protein modifications and variants and definition of protein ensembles
    Siuti, N and Kelleher, NL (2007) 'Deciphering protein modifications using top-down proteomics'. Nat. Methods 4, 817.
