Journal of Proteomics and Genomics Research

Current Issue Volume No: 1 Issue No: 4

Research Article Open Access Available online freely Peer Reviewed Citation

Computational EPAS1 rSNP Analysis, Transcriptional Factor Binding Sites and High Altitude Sickness or Adaptation

1Department of Pediatrics, University of Washington, Seattle, WA 98195, USA.

Abstract

Purpose

The endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) gene encodes hypoxia-inducible-factor-2 alpha (HIF2a), a transcription factor that is involved in the response to hypoxia. Four EPAS1 single nucleotide polymorphisms (SNPs) (rs56721780, rs6756667, rs7589621 and rs1868092) have been found to be associated with human disease. These SNPs were computationally examined with respect to changes in potential transcriptional factor binding sites (TFBS), and these changes are discussed in relation to disease and alterations in high altitude adaptation in humans.

Methods

The JASPAR CORE and ConSite databases were used to identify the TFBS. The Vector NTI Advance 11.5 computer program was employed to locate all the TFBS in the EPAS1 gene from 1.6 kb upstream of the transcriptional start site to 539 bp past the 3’UTR. The JASPAR CORE database was also used to compute each nucleotide occurrence (%) within the TFBS.

Results

The EPAS1 SNPs in the promoter, intron two and the 3’UTR regions have previously been found to be significantly associated with disease and different levels of high-altitude hypoxia among native Tibetans. The SNP alleles were found to alter the DNA landscape to which potential transcriptional factors (TFs) attach, resulting in changes in TFBS and thereby altering which TFs potentially regulate the EPAS1 gene. One example is the glucocorticoid and mineralocorticoid nuclear receptor binding sites created by the rs7589621 rSNP EPAS1-G allele; these receptors regulate carbohydrate, protein and fat metabolism. In addition, the minor rs7589621 rSNP EPAS1-A allele creates a putative TFBS for the FOXC TF, which is an important regulator of cell viability and resistance to oxidative stress. These EPAS1 SNPs should be considered regulatory (r) SNPs.

Conclusion

The alleles of each rSNP were found to generate unique TFBS, resulting in potential changes in TF regulation of EPAS1. The putative changes in TFBS created by the four rSNPs could very well influence the significant cline in allele frequencies seen in Tibetans with increasing altitude or the haplotype association with high altitude polycythemia in male Han Chinese. These regulatory changes are discussed with respect to changes in human health that result in disease and sickness.

Author Contributions
Received 15 Dec 2015; Accepted 07 Feb 2016; Published 20 Feb 2016;

Academic Editor: Leonid Tarassishin, Associate Department of Pathology Albert Einstein College of Medicine United States.

Checked for plagiarism: Yes

Review by: Single-blind

Copyright ©  2016 Norman E Buroker, et al.

License
Creative Commons License     This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Competing interests

The authors have declared that no competing interests exist.

Citation:

Norman E. Buroker (2016) Computational EPAS1 rSNP Analysis, Transcriptional Factor Binding Sites and High Altitude Sickness or Adaptation. Journal of Proteomics and Genomics Research - 1(4):31-59. https://doi.org/10.14302/issn.2326-0793.jpgr-15-889

DOI 10.14302/issn.2326-0793.jpgr-15-889

Introduction

The endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) gene encodes hypoxia-inducible-factor-2 alpha (HIF2a), a transcription factor that is involved in the response to hypoxia. Hypoxia is a major geographical condition associated with high-altitude environments 1. Hypoxia-inducible-factors (HIFs) are heterodimers consisting of an oxygen-labile HIFa subunit and a stable HIFb subunit 2. Under hypoxic conditions, one of three isoforms (HIF1a, HIF2a or HIF3a) and HIF1b are activated and function as transcriptional regulators of genes involved in the hypoxia response 3, 4, 5. Genome wide association studies (GWAS) on high-altitude adaptation have implicated several single nucleotide polymorphisms (SNPs) in the regulatory region of the EPAS1 gene in the genetic adaptation to high-altitude hypoxia in Tibetans 6, 7, 8. Genetic variation in the regulatory region of the EPAS1 gene may influence gene expression and contribute to changes in biological functions 9. EPAS1 is expressed in organs that are involved in oxygen transport and metabolism, such as the lung, placenta and vascular endothelium 10, and is associated with many biological processes and diseases related to metabolism 11, angiogenesis 12, 13, inflammation 14, 15 and cancer 16, 17, 18.

The EPAS1 gene maps to human chromosome 2p21 and is about 120 kb in size with a coding region consisting of 15 exons 19. Four HIF2a SNPs (rs56721780, rs6756667, rs7589621 and rs1868092) have been significantly associated with different levels of high-altitude hypoxia among native Tibetans 20. The rs56721780 SNP in the HIF2a promoter region has also been significantly associated with high-altitude adaptation of Tibetans 9 while the rs6756667 SNP from intron two has been significantly associated with susceptibility to acute mountain sickness in individuals unaccustomed to high altitude environments 21. The rs1868092 SNP near the HIF2a 3’UTR has been associated with high altitude polycythemia in male Han Chinese at the Qinghai-Tibetan plateau 22.

Single nucleotide changes that affect gene expression by impacting gene regulatory sequences such as promoters, enhancers, and silencers are known as regulatory SNPs (rSNPs) 23, 24, 25, 26. An rSNP within a transcriptional factor binding site (TFBS) can change a transcriptional factor’s (TF) ability to bind its TFBS 27, 28, 29, 30, in which case the TF would be unable to effectively regulate its target gene 31, 32, 33, 34, 35. This concept is examined for the above four HIF2a rSNPs and their allelic association with TFBS, where computational analyses 36, 37, 38, 39 are used to identify TFBS alterations created by the HIF2a rSNPs. In this report, the rSNP associations with changes in potential TFBS are discussed with their possible relationship to disease or sickness in humans.

Methods

The JASPAR CORE database 40, 41 and ConSite 42 were used to identify the potential TFBS in this study. JASPAR is a database of transcription factor DNA-binding preferences used for scanning genomic sequences, whereas ConSite is a web-based tool for finding cis-regulatory elements in genomic sequences. The TFBS and rSNP location within the binding sites have previously been discussed 43. The Vector NTI Advance 11.5 computer program (Invitrogen, Life Technologies) was used to locate the TFBS in the EPAS1 gene (NCBI Ref Seq NM_001430) from 1.6 kb upstream of the transcriptional start site to 539 bp past the 3’UTR, which represents a total of 91 kb. The JASPAR CORE database was also used to calculate each nucleotide occurrence (%) within the TFBS, where upper case lettering indicates that the nucleotide occurs 90% or greater and lower case less than 90%. The occurrence of each SNP allele in the TFBS is also computed from the database (Table 3).
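
The upper and lower case convention described above can be sketched in Python. This is a minimal illustration and not the procedure run in Vector NTI or JASPAR: the count matrix PFM, the function names and the example site are all hypothetical, and only the 90% threshold comes from the text.

```python
# Hypothetical position frequency matrix: counts of each base at each
# motif position. The numbers are invented for illustration and are
# NOT taken from the JASPAR CORE database.
PFM = {
    "A": [0, 19, 1, 10],
    "C": [1, 0, 0, 4],
    "G": [19, 1, 0, 3],
    "T": [0, 0, 19, 3],
}


def occurrence_percent(pfm, base, position):
    """Percentage of sites in the matrix that carry `base` at `position`."""
    total = sum(pfm[b][position] for b in "ACGT")
    return 100.0 * pfm[base][position] / total


def annotate_site(pfm, site, threshold=90.0):
    """Write a site in upper case where the base occurs >= threshold %,
    and in lower case where it occurs less often."""
    letters = []
    for i, base in enumerate(site.upper()):
        pct = occurrence_percent(pfm, base, i)
        letters.append(base if pct >= threshold else base.lower())
    return "".join(letters)


# Example site (hypothetical): the last position is only 50% A in PFM,
# so it is written in lower case.
print(annotate_site(PFM, "GATA"))
```

The same occurrence_percent helper can be applied to the nucleotide at the SNP position to obtain the occurrence of each allele within a given TFBS.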

Results

EPAS1 rSNPs and TFBS

The allele frequencies of four EPAS1 SNPs (rs56721780, rs6756667, rs7589621 and rs1868092) significantly associated with different levels of high-altitude hypoxia among native Tibetans 20 are presented in Table 1 along with low altitude Han Chinese and Japanese populations. The common rs56721780 SNP EPAS1-G allele creates nine unique putative TFBS for the REL, RELA, RUNX1, TFAP2A, TFAP2A(var.2), TFAP2B, TFAP2B(var.2), TFAP2C and TFAP2C(var.2) TFs, which are involved with inflammation, immunity, differentiation, cell growth, tumorigenesis, apoptosis, hematopoiesis, transcriptional activation and repression, respectively (Table 2 & Table 3). The minor EPAS1-C allele creates two unique putative TFBS for the FOXP3 and HOXA5 TFs, which are involved with the homeostasis of the immune system and specific identities on the anterior-posterior axis during development, respectively (Table 2 & Table 3). There are also four conserved TFBS for the HLTF, HNF4G, NFAT5 and SOX10 TFs, which are involved in altering chromatin structure, transcription, regulation of osmoprotective and inflammatory genes and embryonic development, respectively (Table 2 & Table 3).
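
How a single allele change can create a unique potential TFBS, or leave a conserved one in place, can be pictured with the following Python sketch. It is a toy under stated assumptions: the three motifs (TF_G, TF_C, TF_N), the flanking sequence, the relative-score formula and the 0.8 cutoff are invented for illustration and are not the JASPAR or ConSite scoring used in this study, and only the forward strand is scanned.

```python
def relative_score(pfm, window):
    """Score a window as a fraction of the best achievable count sum."""
    length = len(pfm["A"])
    got = sum(pfm[base][i] for i, base in enumerate(window))
    best = sum(max(pfm[b][i] for b in "ACGT") for i in range(length))
    return got / best


def binds(pfm, seq, snp_index, cutoff=0.8):
    """True if any window that covers the SNP position scores >= cutoff."""
    length = len(pfm["A"])
    first = max(0, snp_index - length + 1)
    last = min(snp_index, len(seq) - length)
    return any(
        relative_score(pfm, seq[s:s + length]) >= cutoff
        for s in range(first, last + 1)
    )


def classify(motifs, left, right, allele_1, allele_2):
    """Sort motifs into allele-specific and conserved predicted sites."""
    snp = len(left)  # index of the SNP in the assembled sequence
    result = {"allele_1_only": [], "allele_2_only": [], "conserved": []}
    for name, pfm in motifs.items():
        hit_1 = binds(pfm, left + allele_1 + right, snp)
        hit_2 = binds(pfm, left + allele_2 + right, snp)
        if hit_1 and hit_2:
            result["conserved"].append(name)
        elif hit_1:
            result["allele_1_only"].append(name)
        elif hit_2:
            result["allele_2_only"].append(name)
    return result


# Hypothetical three-base motifs (counts invented for illustration).
MOTIFS = {
    "TF_G": {"A": [0, 0, 0], "C": [0, 0, 0], "G": [0, 10, 0], "T": [10, 0, 10]},
    "TF_C": {"A": [0, 0, 0], "C": [0, 10, 0], "G": [0, 0, 0], "T": [10, 0, 10]},
    "TF_N": {"A": [0, 0, 0], "C": [0, 5, 0], "G": [0, 5, 0], "T": [10, 0, 10]},
}

# Hypothetical flanking sequence around a G/C SNP.
RESULT = classify(MOTIFS, left="ACGT", right="TGCA", allele_1="G", allele_2="C")
print(RESULT)
```

In this toy case TF_G is predicted only with the G allele, TF_C only with the C allele, and TF_N with both, which mirrors the distinction drawn above between unique and conserved TFBS.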

Table 1. EPAS1 (HIF2a) SNPs and high altitude hypoxia among native Tibetans. These SNPs have been found to be significantly associated with hypoxia in Tibetan populations. The SNPs are located in the EPAS1 gene. MAF is the minor allele frequency. Allele frequency data from reference 20.
Gene: EPAS1 (HIF2a). Population columns give the MAF. Bomi, Qamdo, Lhasa and Amdo are Tibetan populations (population/altitude in m); CHB/Beijing is Han Chinese and JPT/Tokyo is Japanese.
Gene Position | SNP | Chr Pos | Alleles | Bomi/2700 | Qamdo/3200 | Lhasa/3700 | Amdo/4700 | CHB/Beijing | JPT/Tokyo
promoter | rs56721780 | 2:46523655 | G/C | C=0.184 | C=0.159 | C=0.328 | C=0.311 | C=0.01 | C=0.022
intron 2 | rs6756667 | 2:46579409 | A/G | G=0.313 | G=0.266 | G=0.201 | G=0.104 | G=0.944 | G=0.889
intron 2 | rs7589621 | 2:46582382 | G/A | A=0.254 | A=0.221 | A=0.163 | A=0.079 | A=0.789 | A=0.767
past 3'UTR | rs1868092 | 2:46614202 | A/G | G=0.359 | G=0.327 | G=0.249 | G=0.156 | G=0.919 | G=0.924
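
The direction of the allele-frequency trend with altitude across the four Tibetan populations in Table 1 can be checked with a simple least-squares slope. The Python sketch below uses only the MAF values listed in Table 1; the function and variable names are hypothetical, and with four populations the slope is descriptive only and is not a test of significance.

```python
# Altitudes (m) of the Tibetan populations in Table 1:
# Bomi, Qamdo, Lhasa, Amdo.
ALTITUDE_M = [2700, 3200, 3700, 4700]

# Minor allele frequencies from Table 1, in the same population order.
MAF = {
    "rs56721780 (C)": [0.184, 0.159, 0.328, 0.311],
    "rs6756667 (G)": [0.313, 0.266, 0.201, 0.104],
    "rs7589621 (A)": [0.254, 0.221, 0.163, 0.079],
    "rs1868092 (G)": [0.359, 0.327, 0.249, 0.156],
}


def slope_per_km(altitudes_m, freqs):
    """Least-squares slope of allele frequency per 1000 m of altitude."""
    n = len(altitudes_m)
    mean_x = sum(altitudes_m) / n
    mean_y = sum(freqs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(altitudes_m, freqs))
    sxx = sum((x - mean_x) ** 2 for x in altitudes_m)
    return 1000.0 * sxy / sxx


SLOPES = {snp: slope_per_km(ALTITUDE_M, freqs) for snp, freqs in MAF.items()}

for snp, slope in SLOPES.items():
    print(f"{snp}: {slope:+.3f} per 1000 m")
```

With these values the minor allele of the promoter SNP rs56721780 trends upward with altitude, although not monotonically, while the minor alleles of the other three SNPs trend downward.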

Table 2. Transcriptional factor (TF) abbreviations, protein name and descriptions.
TFs Protein name TF description
ATF4 Activating Transcription Factor 4 The protein encoded by this gene belongs to a family of DNA-binding proteins that includes the AP-1 family of transcription factors, cAMP-response element binding proteins (CREBs) and CREB-like proteins.
ATF7 Activating Transcription Factor 7 Plays important functions in early cell signaling. Has no intrinsic transcriptional activity, but activates transcription on formation of JUN or FOS heterodimers.
BARX1 BARX Homeobox 1 Transcription factor, which is involved in craniofacial development, in odontogenesis and in stomach organogenesis.
BSX Brain-Specific Homeobox DNA binding protein that functions as a transcriptional activator. Is essential for normal postnatal growth and nursing. Is an essential factor for neuronal neuropeptide Y and agouti-related peptide function and locomotory behavior in the control of energy balance.
CEBPa CCAAT/enhancer binding protein (C/EBP), alpha C/EBP is a DNA-binding protein that recognizes two different motifs: the CCAAT homology common to many promoters and the enhanced core homology common to many enhancers
CEBPb CCAAT/enhancer binding protein (C/EBP), beta Important transcriptional activator regulating the expression of genes involved in immune and inflammatory responses. Binds to regulatory regions of several acute-phase and cytokine genes and probably plays a role in the regulation of acute-phase reaction, inflammation and hemopoiesis.
CEBPd CCAAT/enhancer binding protein (C/EBP), delta The encoded protein is important in the regulation of genes involved in immune and inflammatory responses, and may be involved in the regulation of genes associated with activation and/or differentiation of macrophages.
CEBPe CCAAT/enhancer binding protein (C/EBP), epsilon The encoded protein may be essential for terminal differentiation and functional maturation of committed granulocyte progenitor cells. Mutations in this gene have been associated with Specific Granule Deficiency, a rare congenital disorder.
CREB1 CAMP Responsive Element Binding Protein 1 This gene encodes a transcription factor that is a member of the leucine zipper family of DNA binding proteins
DBP D Site Of Albumin Promoter (Albumin D-Box) Binding Protein The encoded protein can bind DNA as a homo- or heterodimer and is involved in the regulation of some circadian rhythm genes.
DLX6 Distal-Less Homeobox 6 This gene encodes a member of a homeobox transcription factor gene family similar to the Drosophila distal-less gene. This family is comprised of at least 6 different members that encode proteins with roles in forebrain and craniofacial development.
E2F6 E2F transcription factor 6 The protein encoded by this gene is a member of the E2F family of transcription factors. The E2F family plays a crucial role in the control of cell cycle and action of tumor suppressor proteins and is also a target of the transforming proteins of small DNA tumor viruses.
EN1 Engrailed homeobox 1 Homeobox-containing genes are thought to have a role in controlling development.
EN2 Engrailed homeobox 2 The human engrailed homologs 1 and 2 encode homeodomain-containing proteins and have been implicated in the control of pattern formation during development of the central nervous system.
ESX1 ESX Homeobox 1 This gene likely plays a role in placental development and spermatogenesis.
EVX1 Even-Skipped Homeobox 1 May play a role in the specification of neuronal cell types
EVX2 Even-Skipped Homeobox 2 The encoded protein is a homeobox transcription factor that is related to the protein encoded by the Drosophila even-skipped (eve) gene, a member of the pair-rule class of segmentation genes.
FIGLA Folliculogenesis Specific Basic Helix-Loop-Helix The protein is a basic helix-loop-helix transcription factor that regulates multiple oocyte-specific genes, including genes involved in folliculogenesis and those that encode the zona pellucida.
FOS FBJ Murine Osteosarcoma Viral Oncogene Homolog The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. The FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Controls osteoclast survival and size. As a dimer with JUN, activates LIF transcription.
FOS::JUN Jun Proto-Oncogene FBJ Murine Osteosarcoma Viral Oncogene Homolog Promotes activity of NR5A1 when phosphorylated by HIPK3 leading to increased steroidogenic gene expression upon cAMP signaling pathway stimulation. Has a critical function in regulating the development of cells destined to form and maintain the skeleton. It is thought to have an important role in signal transduction, cell proliferation and differentiation.
FOXA1 Forkhead Box A1 Transcription factor that is involved in embryonic development, establishment of tissue-specific gene expression and regulation of gene expression in differentiated tissues.
FOXC1 Forkhead box C1 This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. An important regulator of cell viability and resistance to oxidative stress.
FOXH1 Forkhead Box H1 Transcriptional activator. Recognizes and binds to the DNA sequence 5'-TGTGTGTATT-3'. Required for induction of the goosecoid (GSC) promoter by TGF-beta or activin signaling.
FOXP3 Forkhead Box P3 Transcriptional regulator which is crucial for the development and inhibitory function of regulatory T-cells (Treg). Plays an essential role in maintaining homeostasis of the immune system by allowing the acquisition of full conventional T-cells. Suppressive function and stability of the Treg lineage, and by directly modulating the expansion and function of conventional T-cells.
GBX1 Gastrulation Brain Homeobox 1 Sequence-specific DNA binding transcription factor activity and sequence-specific DNA binding. An important paralog of this gene is DLX5.
GBX2 Gastrulation Brain Homeobox 2 May act as a transcription factor for cell pluripotency and differentiation in the embryo
GMEB2 Glucocorticoid Modulatory Element Binding Protein 2 This gene is a member of KDWK gene family. The product of this gene associates with GMEB1 protein, and the complex is essential for parvovirus DNA replication.
GSX1 GS Homeobox 1 Activates the transcription of the GHRH gene. Plays an important role in pituitary development.
HIC1 Hypermethylated In Cancer 1 This gene functions as a growth regulatory and tumor repressor gene.
HIC2 Hypermethylated In Cancer 2 Transcriptional repressor
HLF Hepatic Leukemia Factor The encoded protein forms homodimers or heterodimers with other PAR family members and binds sequence-specific promoter elements to activate transcription.
HLTF Helicase-like transcription factor This gene encodes a member of the SWI/SNF family. Members of this family have helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes.
HMBOX1 Homeobox Containing 1 Transcription factor. Isoform 1 acts as a transcriptional repressor.
HNF4g Hepatocyte Nuclear Factor 4, Gamma Transcription factor. Has a lower transcription activation potential than HNF4-alpha
HOXA2 Homeobox A2 Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis.
HOXA5 Homeobox A5 Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis.
HOXB2 Homeobox B2 Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis.
HOXB3 Homeobox B3 The encoded protein functions as a sequence-specific transcription factor that is involved in development.
INSM1 Insulinoma-Associated 1 This gene is a sensitive marker for neuroendocrine differentiation of human lung tumors.
ISL2 ISL LIM Homeobox 2 Transcriptional factor that defines subclasses of motoneurons that segregate into columns in the spinal cord and select distinct axon pathways.
ISX Intestine-Specific Homeobox Transcription factor that regulates gene expression in intestine. May participate in vitamin A metabolism most likely by regulating BCO1 expression in the intestine.
JDP2(var.2) Jun Dimerization Protein 2 Component of the AP-1 transcription factor that represses transactivation mediated by the Jun family of proteins. Involved in a variety of transcriptional responses associated with AP-1 such as UV-induced apoptosis, cell differentiation, tumorigenesis and antitumorigenesis.
JUN Jun Proto-Oncogene Transcription factor that recognizes and binds to the enhancer heptamer motif 5'-TGACGTCA-3'. Promotes activity of NR5A1 when phosphorylated by HIPK3 leading to increased steroidogenic gene expression upon cAMP signaling pathway stimulation.
JUND(var.2) Jun D Proto-Oncogene Transcription factor that recognizes and binds to the enhancer heptamer motif 5'-TGACGTCA-3'.
KLF5 Kruppel-like factor 5 (intestinal) Transcription factor that binds to GC box promoter elements. Activates transcription of genes.
LBX2 Ladybird Homeobox 2 Putative transcription factor.
LIN54 Lin-54 DREAM MuvB Core Complex Component Is a component of the LIN, or DREAM, complex, an essential regulator of cell cycle genes
MAX MYC Associated Factor X The protein encoded by this gene is a member of the basic helix-loop-helix leucine zipper (bHLHZ) family of transcription factors.
MEIS1 Meis Homeobox 1 Homeobox genes, of which the most well-characterized category is represented by the HOX genes, play a crucial role in normal development.
MEIS3 Meis Homeobox 3 Sequence-specific DNA binding and RNA polymerase II core promoter proximal region sequence-specific DNA binding transcription factor activity involved in positive regulation of transcription.
MEOX1 Mesenchyme Homeobox 1 Mesodermal transcription factor that plays a key role in somitogenesis and is specifically required for sclerotome development.
MEOX2 Mesenchyme Homeobox 2 The encoded protein may play a role in the regulation of vertebrate limb myogenesis. Mutations in the related mouse protein may be associated with craniofacial and/or skeletal abnormalities, in addition to neurovascular dysfunction observed in Alzheimer's disease.
MGA MGA, MAX Dimerization Protein Functions as a dual-specificity transcription factor, regulating the expression of both MAX-network and T-box family target genes. Functions as a repressor or an activator.
MIXL1 Mix Paired-Like Homeobox Regulates cell fate during development.
MSX1 Msh Homeobox 1 Acts as a transcriptional repressor. May play a role in limb-pattern formation. Acts in craniofacial development and specifically in odontogenesis.
MZF1 Myeloid Zinc Finger 1 Binds to target promoter DNA and functions as transcription regulator. May be one regulator of transcriptional events during hemopoietic development. Isoforms of this protein have been shown to exist at protein level.
NEUROD2 Neuronal Differentiation 2 Transcriptional regulator implicated in neuronal determination. Mediates calcium-dependent transcription activation by binding to E box-containing promoter. Critical factor essential for the repression of the genetic program for neuronal differentiation; prevents the formation of synaptic vesicle clustering at active zone to the presynaptic membrane in postmitotic neurons.
NFAT5 Nuclear Factor Of Activated T-Cells 5, Tonicity-Responsive Transcription factor involved in the transcriptional regulation of osmoprotective and inflammatory genes. Regulates hypertonicity-induced cellular accumulation of osmolytes.
NFATC3 Nuclear Factor Of Activated T-Cells, Cytoplasmic, Calcineurin-Dependent 3. Acts as a regulator of transcriptional activation. Plays a role in the inducible expression of cytokine genes in T-cells, especially in the induction of the IL-2.
NFE2L1::MAFG Nuclear Factor, Erythroid 2-Like 1 V-Maf Avian Musculoaponeurotic Fibrosarcoma Oncogene Homolog G Nuclear factor erythroid 2-related factor (Nrf2) coordinates the up-regulation of cytoprotective genes via the antioxidant response element (ARE). MafG is a ubiquitously expressed small maf protein that is involved in cell differentiation of erythrocytes. It dimerizes with P45 NF-E2 protein and activates expression of alpha- and beta-globin.
NFIA Nuclear Factor I/A Recognizes and binds the palindromic sequence 5'-TTGGCNNNNNGCCAA-3' present in viral and cellular promoters and in the origin of replication of adenovirus type 2. These proteins are individually capable of activating transcription and replication.
NFIC Nuclear Factor I/C (CCAAT-Binding Transcription Factor) Recognizes and binds the palindromic sequence 5'-TTGGCNNNNNGCCAA-3' present in viral and cellular promoters and in the origin of replication of adenovirus type 2. These proteins are individually capable of activating transcription and replication.
NFIL3 Nuclear factor, interleukin 3 regulated Expression of interleukin-3 (IL3; MIM 147740) is restricted to activated T cells, natural killer (NK) cells, and mast cell lines.
NFIX Nuclear Factor I/X (CCAAT-Binding Transcription Factor) Sequence-specific DNA binding transcription factor activity and RNA polymerase II distal enhancer sequence-specific DNA binding transcription factor activity.
NKX2-3 NK2 Homeobox 3 This gene encodes a homeodomain-containing transcription factor. The encoded protein is a member of the NKX family of homeodomain transcription factors.
NKX2-8 NK2 Homeobox 8 Transcriptional factor. Diseases associated with NKX2-8 include esophageal cancer.
NKX3-1 NK3 Homeobox 1 This gene encodes a homeobox-containing transcription factor. This transcription factor functions as a negative regulator of epithelial cell growth in prostate tissue.
NKX3-2 NK3 Homeobox 2 This gene encodes a member of the NK family of homeobox-containing proteins. Transcriptional repressor that acts as a negative regulator of chondrocyte maturation.
NKX6-1 NK6 Homeobox 1 Transcription factor which binds to specific A/T-rich DNA sequences in the promoter regions of a number of genes. Involved in transcriptional regulation in islet beta cells. Binds to the insulin promoter and is involved in regulation of the insulin gene.
NR2C2 Nuclear Receptor Subfamily 2, Group C, Member 2 Orphan nuclear receptor that can act as a repressor or activator of transcription. An important repressor of nuclear receptor signaling pathways such as retinoic acid receptor, retinoid X, vitamin D3 receptor, thyroid hormone receptor and estrogen receptor pathways.
NR3C1 Nuclear Receptor Subfamily 3, Group C, Member 1 (Glucocorticoid Receptor) Glucocorticoids regulate carbohydrate, protein and fat metabolism, modulate immune responses through suppression of chemokine and cytokine production and have critical roles in constitutive activity of the CNS, digestive, hematopoietic, renal and reproductive systems. The protein encoded by this gene plays a role in protecting cells from oxidative stress and damage induced by ionizing radiation.
NR3C2 Nuclear Receptor Subfamily 3, Group C, Member 2 This gene encodes the mineralocorticoid receptor, which mediates aldosterone actions on salt and water balance within restricted target cells.
NR4A2 Nuclear Receptor Subfamily 4, Group A, Member 2 Transcriptional regulator which is important for the differentiation and maintenance of meso-diencephalic dopaminergic (mdDA) neurons during development.
NRL Neural Retina Leucine Zipper This gene encodes a basic motif-leucine zipper transcription factor of the Maf subfamily. The encoded protein is conserved among vertebrates and is a critical intrinsic regulator of photoreceptor development and function.
PDX1 Pancreatic and duodenal homeobox 1 Activates insulin, somatostatin, glucokinase, islet amyloid polypeptide and glucose transporter type 2 gene transcription. Particularly involved in glucose-dependent regulation of insulin gene transcription.
PHOX2A Paired-Like Homeobox 2a May be involved in regulating the specificity of expression of the catecholamine biosynthetic genes. Acts as a transcription activator/factor.
POU2F1 POU Class 2 Homeobox 1 Transcription factor that binds to the octamer motif (5'-ATTTGCAT-3') and activates the promoters of the genes for some small nuclear RNAs (snRNA) and of genes such as those for histone H2B and immunoglobulins. Modulates transcription transactivation by NR3C1, AR and PGR.
POU3F1 POU Class 3 Homeobox 1 Transcription factor that binds to the octamer motif (5'-ATTTGCAT-3'). Thought to be involved in early embryogenesis and neurogenesis.
POU3F2 POU Class 3 Homeobox 2 This gene encodes a member of the POU-III class of neural transcription factors. The encoded protein is involved in neuronal differentiation and enhances the activation of corticotropin-releasing hormone regulated genes.
POU3F3 POU Class 3 Homeobox 3 This gene encodes a POU-domain containing protein that functions as a transcription factor. The encoded protein recognizes an octamer sequence in the DNA of target genes. This protein may play a role in development of the nervous system.
POU3F4 POU Class 3 Homeobox 4 This gene encodes a member of the POU-III class of neural transcription factors. This family member plays a role in inner ear development. The protein is thought to be involved in the mediation of epigenetic signals which induce striatal neuron-precursor differentiation.
POU5F1 POU Class 5 Homeobox 1 This gene encodes a transcription factor containing a POU homeodomain that plays a key role in embryonic development and stem cell pluripotency. Aberrant expression of this gene in adult tissues is associated with tumorigenesis. Forms a trimeric complex with SOX2 on DNA and controls the expression of a number of genes involved in embryonic development such as YES1, FGF4, UTF1 and ZFP206.
POU5F1B POU Class 5 Homeobox 1B This intronless gene was thought to be a transcribed pseudogene of POU class 5 homeobox 1; however, it has been reported that this gene can encode a functional protein. The protein has been shown to be a weak transcriptional activator and may play a role in carcinogenesis and eye development.
REL V-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog Proto-oncogene that may play a role in differentiation and lymphopoiesis. NF-kappa-B is a pleiotropic transcription factor which is present in almost all cell types and is involved in many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis.
RELA V-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog A RELA is a Protein Coding gene. NF-kappa-B is composed of NFKB1 or NFKB2 bound to either REL, RELA, or RELB. The most abundant form of NF-kappa-B is NFKB1 complexed with the product of this gene, RELA. Among its related pathways are PI3K-Akt signaling pathway and PI-3K cascade.
RHOXF1 Rhox Homeobox Family, Member 1 This gene is a member of the PEPP subfamily of paired-like homeobox genes. The gene may be regulated by androgens and epigenetic mechanisms. The encoded nuclear protein is likely a transcription factor that may play a role in human reproduction.
RUNX1 Runt-Related Transcription Factor 1 Core binding factor (CBF) is a heterodimeric transcription factor that binds to the core element of many enhancers and promoters. The protein encoded by this gene represents the alpha subunit of CBF and is thought to be involved in the development of normal hematopoiesis.
RUNX3 Runt-Related Transcription Factor 3 This gene encodes a member of the runt domain-containing family of transcription factors. Found in a number of enhancers and promoters, and can either activate or suppress transcription. It also interacts with other transcription factors. It functions as a tumor suppressor, and the gene is frequently deleted or transcriptionally silenced in cancer.
SOX10 SRY (sex determining region Y)-box 10 This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate.
SREBF1 Sterol regulatory element binding transcription factor 1 Transcriptional activator required for lipid homeostasis. Regulates transcription of the LDL receptor gene as well as the fatty acid and to a lesser degree the cholesterol synthesis pathway.
SREBF2 Sterol regulatory element binding transcription factor 2 This gene encodes a ubiquitously expressed transcription factor that controls cholesterol homeostasis by regulating transcription of sterol-regulated genes. The encoded protein contains a basic helix-loop-helix-leucine zipper (bHLH-Zip) domain and binds the sterol regulatory element 1 motif.
SRY Sex determining region Y Transcriptional regulator that controls a genetic switch in male development. It is necessary and sufficient for initiating male sex determination by directing the development of supporting cell precursors.
TBP TATA Box Binding Protein General transcription factor that functions at the core of the DNA-binding multiprotein factor TFIID. Binding of TFIID to the TATA box is the initial transcriptional step of the pre-initiation complex (PIC), playing a role in the activation of eukaryotic genes transcribed by RNA polymerase II.
TBX4 T-Box 4 Involved in the transcriptional regulation of genes required for mesoderm differentiation.
TBX5 T-Box 5 This gene is a member of a phylogenetically conserved family of genes that share a common DNA-binding domain, the T-box. T-box genes encode transcription factors involved in the regulation of developmental processes.
TEAD1 TEA Domain Family Member 1 This gene encodes a ubiquitous transcriptional enhancer factor that is a member of the TEA/ATTS domain family. This protein directs the transactivation of a wide variety of genes and, in placental cells, also acts as a transcriptional repressor.
TEAD3 TEA Domain Family Member 3 This gene product is a member of the transcriptional enhancer factor (TEF) family of transcription factors, which contain the TEA/ATTS DNA-binding domain. It is predominantly expressed in the placenta and is involved in the transactivation of the chorionic somatomammotropin-B gene enhancer.
TEAD4 TEA Domain Family Member 4 It is preferentially expressed in the skeletal muscle, and binds to the M-CAT regulatory element found in promoters of muscle-specific genes to direct their gene expression.
TFAP2A Transcription Factor AP-2 Alpha (Activating Enhancer Binding Protein 2 Alpha) The protein encoded by this gene is a transcription factor that binds the consensus sequence 5'-GCCNNNGGC-3' and activates the transcription of some genes while inhibiting the transcription of others.
TFAP2B Transcription Factor AP-2 Beta (Activating Enhancer Binding Protein 2 Beta) This gene encodes a member of the AP-2 family of transcription factors. AP-2 proteins form homo- or hetero-dimers with other AP-2 family members and bind specific DNA sequences. This protein functions as both a transcriptional activator and repressor.
TFAP2C Transcription Factor AP-2 Gamma (Activating Enhancer Binding Protein 2 Gamma) Sequence-specific DNA-binding protein that interacts with inducible viral and cellular enhancer elements to regulate transcription of selected genes. AP-2 factors bind to the consensus sequence 5'-GCCNNNGGC-3' and activate genes involved in a large spectrum of important biological functions including proper eye, face, body wall, limb and neural tube development.
TFEC Transcription Factor EC This gene encodes a member of the microphthalmia (MiT) family of basic helix-loop-helix leucine zipper transcription factors. MiT transcription factors regulate the expression of target genes by binding to E-box recognition sequences as homo- or heterodimers, and play roles in multiple cellular processes including survival, growth and differentiation.
THAP1 THAP domain containing, apoptosis associated protein 1 DNA-binding transcription regulator that regulates endothelial cell proliferation and G1/S cell-cycle progression.
USF1 Upstream Transcription Factor 1 This gene encodes a member of the basic helix-loop-helix leucine zipper family, and can function as a cellular transcription factor.
USF2 Upstream Transcription Factor 2, C-Fos Interacting Transcription factor that binds to a symmetrical DNA sequence (E-boxes) (5'-CACGTG-3') that is found in a variety of viral and cellular promoters.
YY1 YY1 Transcription Factor Multifunctional transcription factor that exhibits positive and negative control on a large number of cellular and viral genes by binding to sites overlapping the transcription start site.
YY2 YY2 Transcription Factor The protein encoded by this gene is a transcription factor that includes several Kruppel-like zinc fingers in its C-terminal region. It possesses both activation and repression domains, and it can therefore have both positive and negative effects on the transcription of target genes.
ZNF354C Zinc finger protein 354C May function as a transcription repressor.

Table 3. The EPAS1 (HIF2a) SNPs examined in this study, with the minor allele shown in red. Also listed are the transcriptional factors (TFs), their potential binding sites (TFBS) containing these SNPs, and the DNA strand orientation. TFs in red differ between the SNP alleles. Upper-case nucleotides designate the 90% conserved binding site (BS) region, and red marks the SNP location of the alleles in the TFBS. Below each TFBS is the nucleotide occurrence (%) obtained from the JASPAR CORE database. Also listed is the number (#) of binding sites in the gene for the given TF. Note: a TF can bind to more than one nucleotide sequence.
EPAS1 (HIF2a)        
SNP Allele TFs # of Sites TFBS Strand
rs56721780 G HLTF 1 agcCtTtggg plus
g=14%
    HNF4G 1 gaaaccCAaAGgcta minus
c=30%
    NFAT5 1 ggTTtCccag plus
g=21%
    REL 1 tgggttTccC plus
g=6%
    REL 1 ttgggtTtcC plus
g=53%
    RELA 1 ttGggtTtCC plus
g=39%
    RELA 1 tgGgttTcCC plus
G=100%
    RUNX1 2 gccTttGGgtt plus
G=92%
    SOX10 76 cttTgg plus
g=5%
    TFAP2A 1 acCCaaagGct minus
C=99%
    TFAP2A 1 agCCtttgGgt plus
G=99%
    TFAP2A(var.2) 1 aaCCcaaaGgct minus
C=99%
    TFAP2A(var.2) 1 agCCtttgGgtt plus
G=94%
    TFAP2B 1 aaCCcaaaGgct minus
C=99%
    TFAP2B 1 agCCtttgGgtt plus
G=97%
    TFAP2B(var.2) 1 acCCaaaGGCt minus
C=100%
    TFAP2C 1 aacCcaaaGgct minus
C=97%
    TFAP2C 1 agcCtttgGgtt plus
G=97%
    TFAP2C(var.2) 1 acCCaaagGCt minus
C=98%
    TFAP2C(var.2) 1 agCCtttgGgt plus
G=99%
  C FOXP3 20 gcaaAgg minus
g=65%
    HLTF 1 agcCtTtgcg plus
c=24%
    HNF4G 1 gaaacgCAaAGgcta minus
g=24%
    HOXA5 2 cgcaaagg minus
g=31%
    NFAT5 1 cgTTtCccag plus
c=21%
    SOX10 75 cttTgc plus
c=0%
rs6756667 A ATF4 1 aggTGAtGccAca minus
t=48%
    CEBP 1 gTggCatcAcc plus
a=78%
    CEBP 3 gTgatgccAc minus
t=10%
    CEBP 2 aTgccacAAt minus
T=100%
    CEBP 3 gTgatgccAc minus
t=9%
    CEBP 2 aTgccacAAt minus
T=100%
    CEBP 3 gTgatgccAc minus
t=18%
    CEBP 2 aTgccacAAt minus
T=98%
    CREB1 6 tGAtGcca minus
t=0%
    DBP 1 ggTgAtgccAca minus
t=38%
    DBP 1 tgTggcatcAcc plus
a=45%
    FIGLA 1 atCAcctTac plus
a=50%
    FOS::JUN 34 TgccacA minus
T=94%
    FOXH1 3 gtgAtgccACa minus
t=5%
    HIC1 2 aTgCCacaa minus
T=95%
    HIC2 2 aTgCCacaa minus
T=98%
    HLF 1 ggTgatgccaca minus
t=11%
    HLF 1 tgTggcatcacc plus
a=33%
    HOXA2 2 tggcATcAcc plus
A=90%
    JUN 1 aaggTGAtGccAc minus
t=62%
    JUND (var.2) 1 taaggTGAtgccAca minus
t=44%
    JUND (var.2) 1 ggtgaTGccacaAtc minus
T=100%
    MEIS1 15 atGcCac minus
t=77%
    MEIS1 15 gtGgCat plus
a=83%
    MEIS3 6 gtGgCAtc plus
A=91%
    NFE2L1::MafG 40 caTcAc plus
a=85%
    NFIA 1 gaTGCCAcaa minus
T=100%
    NFIC 84 gTGGca plus
a=48%
    NFIX 4 gatGCCAca minus
t=18%
    NFIX 4 tgtGgCAtc plus
A=92%
    NRL 1 aaggtGatgcc minus
t=86%
    RHOXF1 3 gtgAtgcc minus
t=68%
    RUNX1 1 gatTgtGGcat plus
a=7%
    RUNX3 2 atgCCaCAat minus
t=6%
    SREBF1 1 aTCAccttac plus
a=48%
    SREBF2 2 gTaaggTGAt minus
t=57%
    SREBF2(var.2) 1 gtaAgGTGAt minus
t=47%
    SREBF2(var.2) 1 atCACcTtAc plus
a=73%
    TBX4 1 agGTGatg minus
t=45%
    TBX5 1 agGtGatg minus
t=51%
    TEAD3 6 tgATgCCa minus
T=100%
    TFEC 1 gtaAggtGat minus
t=49%
    THAP1 2 atgCCacaa minus
t=63%
  G ATF4 1 aggTGAcGccAca minus
c=38%
    ATF7 1 aggTGACGccAcaa minus
C=99%
    ATF7 1 ttgTGgCGTcAcct plus
G=99%
    CEBP 1 gTgacgccAc minus
c=0%
    CEBP 1 gtggcgtcAc plus
g=80%
    CEBP 1 gTgacgccAc minus
c=88%
    CEBP 1 gTggcgtcAc plus
g=86%
    CEBP 1 gTgacgccAc minus
c=81%
    CEBP 1 gTggcgtcAc plus
g=73%
    CREB1 1 tGAcGcca minus
c=18%
    CREB1 1 tGgcGtca plus
G=91%
    DBP 1 ggTgAcgccAca minus
c=62%
    DBP 1 tgTggcgtcAcc plus
g=55%
    GMEB2 1 tgACGcca minus
C=100%
    HLF 1 ggTgacgccaca minus
c=83%
    HLF 1 tgTggcgtcacc plus
g=67%
    INSM1 1 tgtaaGGtGacg minus
c=8%
    JDP2(var.2) 1 ggTGACGcCAca minus
C=99%
    JDP2(var.2) 1 tgTGgCGTCAcc plus
G=98%
    JUN 1 aaggTGAcGccAc minus
c=16%
    JUN 1 attgTGgcGtcAc plus
G=97%
    JUND (var.2) 1 taaggTGAcgccAca minus
c=31%
    MEIS1 2 gtGACgc minus
C=99%
    MEIS3 1 gtGACgcc minus
C=97%
    MGA 1 tgGcGtcA plus
G=100%
    NFE2L1::MafG 56 ggTGAc minus
c=76%
    NFIX 2 gacGCCAca minus
c=30%
    NR4A2 7 aAGgtgAc minus
c=57%
    RUNX1 1 gatTgtGGcgt plus
g=4%
    RUNX3 1 acgCCaCAat minus
c=9%
    SREBF1 1 gTgAcgccac minus
c=88%
    SREBF1 1 gTCAccttac plus
g=28%
    SREBF2 1 gTaaggTGAc minus
c=34%
    SREBF2 1 gTGgcgTcAc plus
g=77%
    SREBF2(var.2) 1 gtaAgGTGAc minus
c=53%
    SREBF2(var.2) 1 gtCACcTtAc plus
g=27%
    TBX4 4 agGTGacg minus
c=29%
    TBX4 1 tgGcGtca plus
G=100%
    TBX5 4 agGtGacg minus
c=34%
    TFEC 1 gtaAggtGac minus
c=50%
    USF1 1 gtaAggTGacg minus
c=78%
    USF2 1 gtaAgGTGacg minus
c=5%
    ZNF354C 21 cgCCAC minus
c=38%
rs7589621 G BARX1 1 gtacTTAt plus
g=33%
    BSX 1 gtacTTAt plus
g=26%
    DBP 1 aagTAcgTAAag minus
c=62%
    DBP 1 ctTTAcgTActt plus
g=55%
    DLX6 1 gtacTTAt plus
g=26%
    EN1 1 gtAcTtAt plus
g=20%
    EN2 1 cgtacTtAtc plus
g=25%
    ESX1 1 cgtacTtAtc plus
g=7%
    EVX1 1 gataAgTAcg minus
c=27%
    EVX1 1 cgtacTTAtc plus
g=33%
    EVX2 1 gataAgTAcg minus
c=24%
    EVX2 1 cgtacTTAtc plus
g=34%
    FOXA1 1 acttTacgtaCttat plus
g=6%
    GBX1 1 cgtAcTTAtc plus
g=25%
    GBX2 1 cgtAcTTAtc plus
g=31%
    GMEB2 1 gtACGTaa minus
C=98%
    GMEB2 1 ttACGTac plus
G=98%
    GSX1 1 cgtacttAtc plus
g=22%
    HLF 1 aagtacgtaaag minus
c=83%
    HLF 1 ctTtacgtactt plus
g=67%
    HLTF 1 gtaCtTatcc plus
g=20%
    HMBOX1 1 cgtacTtAtc plus
g=15%
    HOXA2 1 cgtacTtAtc plus
g=30%
    HOXB2 1 cgtacTtAtc plus
g=33%
    HOXB3 1 cgtacTTAtc plus
g=31%
    ISL2 1 gtAcTtat plus
g=6%
    ISL2 1 gtAcgtaa minus
c=34%
    ISX 1 gtAcTTAt plus
g=18%
    LBX2 1 cgtacTtAtc plus
g=16%
    MEOX1 1 cgtacTtAtc plus
g=45%
    MEOX2 1 cgtacTtatc plus
g=63%
    MIXL1 1 cgtacTtatc plus
g=20%
    MSX1 1 gtacTtAt plus
g=26%
    NFATC3 1 actTTaCgta plus
g=34%
    NFIL3 1 TTAcGTActta plus
G=91%
    NKX2-3 1 cgtACTTatc plus
g=48%
    NKX2-8 1 gtaCTtatc plus
g=33%
    NKX3-2 1 cgtACTTat plus
g=50%
    NKX6-1 1 gtacTTAt plus
g=33%
    NR3C1 1 aaGtACgtaaaGTgCct minus
C=100%
    NR3C1 1 agGcACtttacGTaCtt plus
G=100%
    NR3C2 1 aaGtACgtaaaGTgCct minus
C=100%
    NR3C2 1 agGcACtttacGTaCtt plus
G=100%
    PDX1 1 gtacTTAt plus
g=34%
    PHOX2A 1 ttAcgtacTta plus
g=20%
    POU2F1 1 agtAcgtaaAgt minus
c=5%
    POU5F1B 1 tAcgtaaAg minus
c=4%
    RORA(var.2) 1 gatAagTacGTaAa minus
c=0%
  A BARX1 4 atacTTAt plus
a=28%
    BSX 4 atacTTAt plus
a=16%
    DBP 1 aagTAtgTAAag minus
t=38%
    DLX6 4 atacTTAt plus
a=23%
    EN2 1 catacTtAtc plus
a=13%
    ESX1 1 catacTtAtc plus
a=18%
    EVX1 1 gataAgTAtg minus
t=22%
    EVX2 1 gataAgTAtg minus
t=18%
    EVX2 1 catacTTAtc plus
a=25%
    FOXC1 2 aggataAgtAt minus
t=64%
    GMEB2 2 gtAtGTaa minus
t=0%
    GMEB2 2 ttACaTac plus
a=0%
    GSX1 1 catacttAtc plus
a=22%
    HLF 1 ctTtacatactt plus
a=33%
    HLTF 2 ataCtTatcc plus
a=27%
    HNF4G 1 agtatgtAaAGtgCc minus
t=37%
    HMBOX1 1 catacTtAtc plus
a=10%
    HOXA2 1 catacTtAtc plus
a=18%
    HOXB2 1 catacTtAtc plus
a=21%
    HOXB3 1 catacTTAtc plus
a=20%
    ISL2 4 atAcTtat plus
a=28%
    ISX 4 atAcTTAt plus
a=15%
    LBX2 1 catacTtAtc plus
a=12%
    LIN54 2 cTTtacAta plus
A=100%
    NFATC3 1 actTTaCata plus
a=55%
    NFIL3 1 gTAtGTAAagt minus
t=65%
    NEUROD2 3 taCaTActta plus
a=77%
    NKX2-3 1 cacACTTatc plus
a=48%
    NKX2-8 6 ataCTttc plus
a=23%
    NKX3-1 1 catACTTat plus
a=18%
    NKX3-2 1 catACTTat plus
a=30%
    NKX6-1 4 atacTTAt plus
a=18%
    PDX1 4 atacTTAt plus
a=21%
    POU2F1 1 agtATgtaaAgt minus
T=92%
    POU3F1 1 gtATgtaaAgtg minus
T=95%
    POU3F2 1 gtATGtaaAgtg minus
T=95%
    POU3F3 1 agtATGtaaAgtg minus
T=90%
    POU3F4 6 tATGaaAT minus
T=99%
    POU5F1B 2 tATgtaaAg minus
T=92%
    RORA(var.2) 1 gatAagTatGTaAa minus
t=0%
    TBP 1 gtATgtAaagtgcct minus
T=97%
    TEAD1 3 tacATaCtta plus
A=92%
    TEAD3 9 acATaCtt plus
A=100%
    TEAD4 3 tacATaCtta plus
A=94%
rs1868092 A E2F6 2 gaGatGGAggt plus
a=16%
    HLTF 2 ctcCaTctca minus
t=63%
    HLTF 3 gcaCtTtgag plus
a=25%
    HNF4G 1 ccatctCAaAGtgca minus
t=30%
    HOXA5 10 ctcaaagt minus
t=25%
    NKX2-3 1 tgCACTTtga plus
a=34%
    NR2C2 1 ccatctcaaaGtgca minus
t=81%
    NKX2-8 3 gcaCTttga plus
a=37%
    SOX10 91 cttTga plus
a=0%
    THAP1 5 cctCCatct minus
t=16%
    YY1 1 tgAgATGGaggt plus
A=94%
    YY2 1 gacCtCCATct minus
t=40%
  G E2F6 1 ggGatGGAggt plus
g=69%
    HIC2 1 aTcCCacaa minus
C=99%
    HLTF 5 gcaCtTtggg plus
g=25%
    HNF4G 1 ccatccCAaAGtgca minus
c=30%
    KLF5 1 ctccatCCCa minus
C=100%
    KLF5 2 cctcCatCCc minus
C=97%
    MZF1 85 ttGGGA plus
G=95%
    NFIA 1 caTcCCAAag minus
C=100%
    NFIC 85 tTGGga plus
G=96%
    NFIX 2 catcCCAaa minus
C=90%
    NKX2-3 1 tgCACTTtgg plus
g=30%
    NKX2-8 7 gcaCTttgg plus
g=30%
    SOX10 76 cttTgg plus
g=5%
    TEAD1 1 tccATcCcaa minus
C=95%
    THAP1 2 catCCcaaa minus
C=98%
    THAP1 3 cctCCatcc minus
c=17%
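
The occurrence percentages and the upper/lower-case notation in Table 3 can be illustrated with a small sketch. It assumes a position frequency matrix (PFM) of the kind distributed by JASPAR CORE; the matrix `EXAMPLE_PFM`, its counts and both function names are hypothetical and are not taken from this study.

```python
from typing import Dict, List

# Hypothetical position frequency matrix (PFM): counts of each base at each
# position of a 4-bp motif. The numbers are invented for illustration and are
# not taken from the JASPAR CORE database.
EXAMPLE_PFM: Dict[str, List[int]] = {
    "A": [2, 0, 18, 5],
    "C": [1, 20, 0, 5],
    "G": [16, 0, 1, 5],
    "T": [1, 0, 1, 5],
}


def occurrence_percent(pfm: Dict[str, List[int]], position: int, base: str) -> float:
    """Percent of aligned sequences carrying `base` at 0-based `position`."""
    total = sum(counts[position] for counts in pfm.values())
    return 100.0 * pfm[base.upper()][position] / total


def format_site(pfm: Dict[str, List[int]], site: str, threshold: float = 90.0) -> str:
    """Write bases at or above `threshold` percent in upper case, others in lower case."""
    return "".join(
        base.upper() if occurrence_percent(pfm, i, base) >= threshold else base.lower()
        for i, base in enumerate(site)
    )


if __name__ == "__main__":
    print(occurrence_percent(EXAMPLE_PFM, 0, "G"))  # occurrence of G at the first position
    print(format_site(EXAMPLE_PFM, "GCAT"))         # site written in the Table 3 notation
```

The 90% default mirrors the "90% conserved" convention stated in the Table 3 legend; any other cut-off would be an assumption of the reader.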

The common rs6756667 SNP EPAS1-A allele creates thirteen unique putative TFBS for the CEBPa, FIGLA, FOS::JUN, FOXH1, HIC1 & 2, HOXA2, NFIA, NFIC, NRL, RHOXF1, TEAD3 and THAP1 TFs, which are involved with enhancers, folliculogenesis, signal transduction, tumor suppression, cell-specific positional identities, transcription and replication, photoreceptor development and function, transcriptional enhancement and transcription regulation, respectively (Table 2 & Table 3). The minor EPAS1-G allele creates nine unique putative TFBS for the ATF7, GMEB2, INSM1, JDP2 (var.2), MGA, NR4A2, USF1 & 2, and ZNF354C TFs, which are involved with early cell signaling, DNA replication, neuroendocrine differentiation of human lung tumors, tumorigenesis and anti-tumorigenesis, transcriptional activation or repression, transcription regulation, cellular transcription and transcription repression, respectively (Table 2 & Table 3). There are also twenty-one conserved TFBS for the ATF4, CEBPb, CEBPd, CEBPe, CREB1, DBP, HLF, JUN, JUND (var.2), MEIS1 & 3, NFE2L1::MafG, NFIX, RUNX1 & 3, SREBF1 & 2, SREBF2 (var.2), TBX4 & 5 and TFEC TFs, which are involved with cAMP-response element binding, DNA binding, immune and inflammatory responses, circadian rhythm, transcription activation, signaling pathway stimulation and enhancement, normal development, up-regulation of cytoprotective genes via the antioxidant response element, enhancer sequence-specific DNA binding, development of normal hematopoiesis, tumor suppression, lipid homeostasis, cholesterol homeostasis, mesoderm differentiation and cellular processes, respectively (Table 2 & Table 3).

The common rs7589621 SNP EPAS1-G allele creates ten unique putative TFBS for the EN1, FOXA1, GBX2, MEOX1 & 2, MIXL1, MSX1, NR3C1, NR3C2 and PHOX2A TFs, which are involved with controlling development, embryonic development, cell pluripotency, sclerotome development, transcriptional repression, regulation of carbohydrate, protein and fat metabolism, mediation of aldosterone actions on salt and water balance, and regulation of catecholamine biosynthetic genes, respectively (Table 2 & Table 3). The minor rs7589621 SNP EPAS1-A allele creates eleven unique putative TFBS for the FOXC1, HNF4G, LIN54, NEUROD2, POU3F1-4, TEAD1, and TEAD3 & 4 TFs, which are involved with cell viability and resistance to oxidative stress, transcriptional repression, regulation of cell cycle genes, neuronal determination, early embryogenesis and neurogenesis, and enhancement of transcription, respectively (Table 2 & Table 3). There are also thirty-one conserved TFBS for the BARX1, BSX, DBP, DLX6, EN2, ESX1, EVX1 & 2, GMEB2, GSX1, HLF, HLTF, HMBOX1, HOXA2, HOXB2, HOXB3, ISL2, ISX, LBX2, NFATC3, NKX2-3, RORA and TBP TFs, which are involved with craniofacial development, transcriptional activation, circadian rhythm, roles in the forebrain, the central nervous system, placental development, specification of neuronal cell types, DNA replication, pituitary development, transcription activation, altering chromatin structure, transcriptional repression, regulation of development, axon pathways, regulation of gene expression in the intestine, induction of cytokine gene expression in T-cells, homeodomain function, nuclear hormone receptors, and the pre-initiation complex, respectively (Table 2 & Table 3).

The common rs1868092 SNP EPAS1-A allele creates four unique putative TFBS for the HOXA5, NR2C2, and YY1 & 2 TFs, which are involved with the developmental regulatory system, repression or activation of transcription, and positive and negative control of transcription at the transcription start site, respectively (Table 2 & Table 3). The minor EPAS1-G allele creates seven unique putative TFBS for the HIC2, KLF5, MZF1, NFIA, NFIC, NFIX and TEAD1 TFs, which are involved with transcriptional repression, transcription, hemopoietic development, transcription and replication, and enhancement of transcription, respectively (Table 2 & Table 3). There are also seven conserved putative TFBS for the E2F6, HLTF, HNF4G, NKX2-3, NKX2-8, SOX10 and THAP1 TFs, which are involved in tumor suppression, chromatin structure, transcriptional activation, homeodomain function, regulation, and regulation of endothelial cell proliferation, respectively (Table 2 & Table 3).
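
The split into allele-unique and conserved TFBS described above amounts to simple set operations on the TF names listed per allele. The following minimal sketch uses the rs1868092 TF names as they appear in Table 3; the function name `partition_tfs` and the dictionary keys are hypothetical.

```python
from typing import Dict, Iterable, List

# TF names listed in Table 3 for the two rs1868092 alleles.
RS1868092_A = ["E2F6", "HLTF", "HNF4G", "HOXA5", "NKX2-3", "NR2C2",
               "NKX2-8", "SOX10", "THAP1", "YY1", "YY2"]
RS1868092_G = ["E2F6", "HIC2", "HLTF", "HNF4G", "KLF5", "MZF1", "NFIA", "NFIC",
               "NFIX", "NKX2-3", "NKX2-8", "SOX10", "TEAD1", "THAP1"]


def partition_tfs(allele_a: Iterable[str], allele_b: Iterable[str]) -> Dict[str, List[str]]:
    """Split TF names into those unique to each allele and those shared by both."""
    a, b = set(allele_a), set(allele_b)
    return {
        "unique_a": sorted(a - b),   # TFBS present only with the first allele
        "unique_b": sorted(b - a),   # TFBS present only with the second allele
        "conserved": sorted(a & b),  # TFBS present with either allele
    }


if __name__ == "__main__":
    for group, names in partition_tfs(RS1868092_A, RS1868092_G).items():
        print(group, len(names), names)
```

Run on these lists, the sketch reproduces the counts given in the text for rs1868092: four TFs unique to the A allele, seven unique to the G allele and seven conserved.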

Discussion

Genome-wide association studies (GWAS) over the last decade have identified nearly 6,500 disease- or trait-predisposing SNPs, of which only 7% are located in protein-coding regions of the genome 44, 45; the remaining 93% are located within non-coding areas 46, 47 such as regulatory or intergenic regions. SNPs that occur in the putative regulatory region of a gene, where a single base change in the DNA sequence of a potential TFBS may affect gene expression, are drawing more attention 23, 25, 48. A SNP in a TFBS can have multiple consequences. Often the SNP changes neither the TFBS interaction nor gene expression, since a transcriptional factor (TF) will usually recognize a number of different binding sites in the gene. In some cases the SNP may increase or decrease TF binding, which results in allele-specific gene expression. In rare cases, a SNP may eliminate the natural binding site or generate a new binding site, in which case the gene is no longer regulated by the original TF. Therefore, functional rSNPs in TFBS may result in differences in gene expression, phenotypes and susceptibility to environmental exposure 48. Examples of rSNPs associated with disease susceptibility are numerous and several reviews have been published 48, 49, 50, 51.
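
The three outcomes described above (no change, altered binding, loss or gain of a site) can be made concrete by scoring both alleles of a site against a TF's matrix. The sketch below uses a common log-odds scoring scheme, which is not the procedure used in this study; the counts in `MOTIF_COUNTS`, the pseudocount, the uniform background and the function names are all assumptions made for illustration.

```python
import math
from typing import Dict, List

# Hypothetical base counts for a 4-bp motif (20 aligned sites); illustration only.
MOTIF_COUNTS: Dict[str, List[int]] = {
    "A": [1, 0, 19, 5],
    "C": [1, 20, 0, 5],
    "G": [17, 0, 1, 5],
    "T": [1, 0, 0, 5],
}


def log_odds_score(counts: Dict[str, List[int]], site: str,
                   pseudocount: float = 0.5, background: float = 0.25) -> float:
    """Sum of log2(observed frequency / background) over the positions of `site`."""
    score = 0.0
    for i, base in enumerate(site.upper()):
        column_total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        frequency = (counts[base][i] + pseudocount) / column_total
        score += math.log2(frequency / background)
    return score


def allele_score_change(counts: Dict[str, List[int]], site: str,
                        snp_index: int, alt_base: str) -> float:
    """Score of the site carrying `alt_base` at `snp_index` minus the score of `site`."""
    alt_site = site[:snp_index] + alt_base + site[snp_index + 1:]
    return log_odds_score(counts, alt_site) - log_odds_score(counts, site)


if __name__ == "__main__":
    # SNP at a highly conserved position: the alternate allele scores far lower.
    print(allele_score_change(MOTIF_COUNTS, "GCAT", 1, "T"))
    # SNP at a position where all bases are equally frequent: no change in score.
    print(allele_score_change(MOTIF_COUNTS, "GCAT", 3, "A"))
```

A large negative change at a conserved position corresponds to the rare case in which a binding site is effectively lost, while a change near zero corresponds to a SNP that the TF tolerates.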

The rs56721780 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique RELA, RUNX1, and TFAP2A, B & C TFBS has a 100%, 92% and 94-100% occurrence, respectively, in humans (Table 3). Since these binding sites (BS) occur only once in the gene, except for the RUNX1 TFBS, which occurs twice, the rSNP G allele should have a considerable impact on gene regulation by these TFs (Table 3). The minor rs56721780 rSNP EPAS1-C allele [C (+ strand) or G (- strand)] located in the unique FOXP3 and HOXA5 TFBS has a 65% and 31% occurrence, respectively, in humans. Since these TFBS have a low occurrence in humans and occur more than once in the gene, the respective TFs would not be expected to have much impact on the regulation of the EPAS1 gene (Figure 1, Table 3).

Figure 1. Double-stranded DNA from the EPAS1 gene showing the potential TFBS for twenty different TFs which can bind their respective DNA sequence either above (+) or below (-) the duplex (cf. Table 3). The rs56721780 rSNP common EPAS1-G allele is found in each of these TFBS. As shown, this rSNP is located in the promoter region of the EPAS1 gene. Also included with the potential TFBS is their % sequence homology to the duplex.

The rs6756667 rSNP EPAS1-A allele [A (+ strand) or T (- strand)] located in the unique CEBPa, FOS::JUN, HIC1 & 2, HOXA2, NFIA, NRL and TEAD3 TFBS has a 78, 94, 95, 98, 90, 100, 86 and 100% occurrence, respectively, in humans (Figure 2, Table 3); however, only the CEBPa, NFIA and NRL TFBS occur once in the gene. Consequently, the corresponding three TFs should impact the regulation of the EPAS1 gene. The minor rs6756667 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique ATF7, GMEB2, JDP2, and MGA TFBS has a 99, 100, 99 and 100% occurrence, respectively, in humans, and each of these TFBS occurs only once in the EPAS1 gene. Consequently, the TFs for these TFBS could have some impact on the regulation of the EPAS1 gene (Table 2 & Table 3).

Figure 2. Double-stranded DNA from the EPAS1 gene showing the potential TFBS for forty-eight different TFs which can bind their respective DNA sequence either above (+) or below (-) the duplex (cf. Table 3). The rs7589621 rSNP common EPAS1-G allele is found in each of these TFBS. As shown, this rSNP is located in intron two of the EPAS1 gene. Also included with the potential TFBS is their % sequence homology to the duplex.

The rs7589621 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique NR3C1 & 2 TFBS has a 100% occurrence in humans, and these TFBS are found only once in the EPAS1 gene (Table 3). Consequently, the corresponding glucocorticoid and mineralocorticoid nuclear receptors, which bind their respective BS, should have a major impact on the regulation of the EPAS1 gene (Table 2 & Table 3). The minor rs7589621 rSNP EPAS1-A allele [A (+ strand) or T (- strand)] located in the unique LIN54, POU3F1-4, and TEAD1, 3 & 4 TFBS has a 100, 95, 95, 90, 99, 92, 100 and 94% occurrence, respectively, in humans (Table 3). However, only the POU3F1-3 TFBS occur once in the EPAS1 gene; consequently, their corresponding TFs should have an impact on EPAS1 gene regulation. The remaining TFBS occur multiple times in the gene and would not be expected to have much impact on gene regulation (Figure 2, Table 2 & Table 3).

The rs1868092 rSNP EPAS1-A allele [A (+ strand) or T (- strand)] located in the unique NR2C2 and YY1 TFBS has an 81% and 94% occurrence, respectively, in humans, and these TFBS occur only once in the EPAS1 gene (Tables 2 & 3). The NR2C2 orphan nuclear receptor, which can act as a repressor or activator of transcription, and the YY1 TF, which exhibits both positive and negative control of transcription, should have an impact on the regulation of the EPAS1 gene (Table 2). The minor rs1868092 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique HIC2, KLF5, NFIA, and TEAD1 TFBS has a 99, 100, 100 and 95% occurrence, respectively, in humans, and these TFBS occur only once in the EPAS1 gene (Table 3). Since the corresponding TFs function as activators, enhancers and repressors, the occurrence of these TFBS should impact the regulation of the EPAS1 gene (Table 2 & Table 3).
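
The reasoning used in the preceding paragraphs (a TFBS is expected to matter when the SNP nucleotide has a high occurrence and the site occurs only once in the gene) can be written as a simple filter. In this sketch the 80% cut-off is an assumption chosen for illustration, since no explicit threshold is stated in the text; the `Site` class and `likely_impactful` function are hypothetical names, and the four records are the TFBS unique to the rs1868092 A allele in Table 3.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Site:
    tf: str            # transcription factor name
    n_sites: int       # number of binding sites for this TF in the gene
    occurrence: float  # occurrence (%) of the SNP nucleotide in the TFBS


# TFBS unique to the rs1868092 A allele, with values as listed in Table 3.
RS1868092_A_UNIQUE = [
    Site("HOXA5", 10, 25.0),
    Site("NR2C2", 1, 81.0),
    Site("YY1", 1, 94.0),
    Site("YY2", 1, 40.0),
]


def likely_impactful(sites: List[Site], min_occurrence: float = 80.0,
                     max_sites: int = 1) -> List[str]:
    """Names of TFs whose site is both well conserved at the SNP and rare in the gene."""
    return [s.tf for s in sites
            if s.occurrence >= min_occurrence and s.n_sites <= max_sites]


if __name__ == "__main__":
    print(likely_impactful(RS1868092_A_UNIQUE))
```

With this assumed cut-off the filter returns NR2C2 and YY1, the two TFs singled out in the text for the rs1868092 A allele.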

Human diseases or conditions can be associated with rSNPs of the EPAS1 gene, as illustrated above. A change in the rSNP allele alters the DNA landscape around the SNP where potential TFs attach and regulate a gene. Examples are the putative TFBS associated with the rs56721780 common rSNP EPAS1-G allele from Table 3, as illustrated in Figure 1, and those associated with the rs7589621 common rSNP EPAS1-G allele, as illustrated in Figure 2. As can be seen in Table 3, these potential TFBS change when an individual carries the alternate allele. The importance of this is illustrated in Figure 1 with the putative TFAP2A, B & C TFBS, where the common G allele has binding sites for these TFs and the minor C allele does not. The TFAP2A, B & C TFs act as activators and repressors and are involved in a large spectrum of biological functions such as proper eye, face, body wall, limb and neural tube development (cf. Table 2). Another example is the putative NR3C1 & 2 TFBS, where the common rs7589621 rSNP G allele creates these binding sites for the glucocorticoid and mineralocorticoid nuclear receptors, which regulate carbohydrate, protein and fat metabolism, while the minor A allele eliminates these TFBS (Figure 2, Table 2 & Table 3). Other examples can be found in Table 3.

Conclusions

SNPs that alter the TFBS are found not only in the promoter regions but also in the introns, exons and UTRs of a gene. The nucleus of the cell is where epigenetic alterations occur and where TFs operate to convert chromosomes into single-stranded DNA for mRNA transcription, while it is the cytoplasm where mRNA is processed by separating exons and introns for protein translation. Consequently, it does not matter where TFs bind the DNA in the nucleus because it is only there that TFs function. The SNPs outlined in this report should be considered rSNPs since they change the DNA landscape for TF binding and have been associated with disease. In this report, examples have been described to illustrate that a change in rSNP alleles in the EPAS1 gene can provide different TFBS, which in turn are also associated with human disease or alterations in human health such as adaptations to high altitude. The putative changes in TFBS created by the four rSNPs could very well influence the significant cline in allele frequencies seen in Tibetans with increasing altitude 20 or the haplotype association with high altitude polycythemia in male Han Chinese 22. As an example, the minor rs7589621 SNP EPAS1-A allele creates a potential TFBS for the FOXC1 TF, which is an important regulator of cell viability and resistance to oxidative stress; oxidative stress is in turn linked to oxygen, hypoxia, heart failure and the hypoxia-inducible factor transcriptional factors 52. The potential alterations in TFBS obtained by computational analyses need to be verified by future protein/DNA electrophoretic mobility gel shift assays and gene expression studies.

References

  1. 1.Ward M P, Everest. (1953) first ascent: a clinical record. High Alt Med Biol.2003;4:. 27-37.
  1. 2.Loboda A, Jozkowicz A, Dulak J. (2012) HIF-1 versus HIF-2--is one more important than the other? Vascul Pharmacol;. 56, 245-51.
  1. 3.Giaccia A J, Simon M C, Johnson R.The biology of hypoxia: the role of oxygen sensing in development, normal function, and disease. , Genes Dev 2004, 2183-94.
  1. 4.Semenza G L. (2001) HIF-1 and mechanisms of hypoxia sensing. , Curr Opin Cell Biol 13, 167-71.
  1. 5.Schofield C J, Ratcliffe P J. (2004) Oxygen sensing by HIF hydroxylases. , Nat Rev Mol Cell Biol 5, 343-54.
  1. 6.Peng Y, Yang Z, Zhang H, Cui C, Qi X et al.Genetic variations in Tibetan populations and high-altitude adaptation at the Himalayas. Mol Biol Evol;28:. 1075-81.
  1. 7.Xu S, Li S, Yang Y, Tan J, Lou H et al.A genome-wide search for signals of high-altitude adaptation in Tibetans. Mol Biol Evol;28:. 1003-11.
  1. 8.Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo Z X et al.Sequencing of 50 human exomes reveals adaptation to high altitude. Science;329: 75-8.
  1. 9.Xu X H, Huang X W, Qun L, Li Y N, Wang Y et al.Two functional loci in the promoter of EPAS1 gene involved in high-altitude adaptation of Tibetans. Sci Rep;4:. 7465.
  1. 10.Semenza G L. (2007) Life with oxygen. , Science 318, 62-4.
  1. 11.Majmundar A J, Wong W J, Simon M C.Hypoxia-inducible factors and the response to hypoxic stress. Mol Cell;40:. 294-309.
  1. 12.Skuli N, Simon M C. (2009) HIF-1alpha versus HIF-2alpha in endothelial cells and vascular functions: is there a master in angiogenesis regulation?. , Cell Cycle 8, 3252-3.
  1. 13.Skuli N, Liu L, Runge A, Wang T, Yuan L et al. (2009) Endothelial deletion of hypoxia-inducible factor-2alpha (HIF-2alpha) alters vascular function and tumor angiogenesis. , Blood 114, 469-77.
  1. 14.Semenza G L.Oxygen sensing, homeostasis, and disease. , N Engl J Med;365: 537-47.
  1. 15.Eltzschig H K, Carmeliet P.Hypoxia and inflammation. , N Engl J Med; 364, 656-65.
  1. 16.Kaelin W G. (2008) The von Hippel-Lindau tumour suppressor protein: O2 sensing and cancer. , Nat Rev Cancer 8, 865-73.
  1. 17.Han S S, Yeager M, Moore L E, Wei M H, Pfeiffer R et al.The chromosome 2p21 region harbors a complex genetic architecture for association with risk for renal cell carcinoma. Hum Mol Genet;21:. 1190-200.
  1. 18.Xue X, Taylor M, Anderson E, Hao C, Qu A et al.Hypoxia-inducible factor-2alpha activation promotes colorectal cancer progression by dysregulating iron homeostasis. , Cancer Res; 72, 2285-93.
  1. 19.Tian H, McKnight S L, Russell D W. (1997) Endothelial PAS domain protein 1 (EPAS1), a transcription factor selectively expressed in endothelial cells. , Genes Dev 11, 72-82.
  1. 20.Basang Z, Wang B, Li L, Yang L, Liu L et al.. HIF2A Variants Were Associated with Different Levels of High-Altitude Hypoxia among Native Tibetans.PLoS One; 10, 0137956.
  1. 21.Guo L I, Zhang J, Jin J, Gao X, Yu J et al.Genetic variants of endothelial PAS domain protein 1 are associated with susceptibility to acute mountain sickness in individuals unaccustomed to high altitude: A nested case-control study. , Exp Ther Med; 10, 907-14.
  1. 22.Chen Y, Jiang C, Luo Y, Liu F, Gao Y.An EPAS1 haplotype is associated with high altitude polycythemia in male Han Chinese at the Qinghai-Tibetan plateau. , Wilderness Environ Med; 25, 392-400.
  1. 23.Knight J C. (2003) Functional implications of genetic variation in non-coding DNA for disease susceptibility and gene regulation. , Clin Sci (Lond) 104, 493-501.
  1. 24.Knight J C. (2005) Regulatory polymorphisms underlying complex disease traits. , Journal of molecular medicine 83, 97-109.
  1. 25.Wang X, Tomso D J, Liu X, Bell D A. (2005) Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes. , Toxicol Appl Pharmacol 207, 84-90.
  1. 26.Wang X, Tomso D J, Chorley B N, Cho H Y, Cheung V G et al. (2007) Identification of polymorphic antioxidant response elements in the human genome. , Hum Mol Genet 16, 1188-200.
  1. 27.Claessens F, Verrijdt G, Schoenmakers E, Haelens A, Peeters B et al. (2001) Selective DNA binding by the androgen receptor as a mechanism for hormone-specific gene regulation. The Journal of steroid biochemistry and molecular biology 76:. 23-30.
  1. 28.Hsu M H, Savas U, Griffin K J, Johnson E F. (2007) Regulation of human cytochrome P450 4F2 expression by sterol regulatory element-binding protein and lovastatin. , J Biol Chem 282, 5225-36.
  1. 29.Takai H, Araki S, Mezawa M, Kim D S, Li X et al. (2008) AP1 binding site is another target of FGF2 regulation of bone sialoprotein gene transcription. Gene;410: 97-104.
  1. 30.Buroker N E, Huang J Y, Barboza J, Ledee D R, Eastman R J et al. (2012) The adaptor-related protein complex 2, alpha 2 subunit (AP2alpha2) gene is a peroxisome proliferator-activated receptor cardiac target gene. The protein journal 31:. 75-83.
  1. 31.Huang C N, Huang S P, Pao J B, Hour T C, Chang T Y et al. (2012) Genetic polymorphisms in oestrogen receptor-binding sites affect clinical outcomes in patients with prostate cancer receiving androgen-deprivation therapy. , Journal of internal medicine 271, 499-509.
  1. 32.Huang C N, Huang S P, Pao J B, Chang T Y, Lan Y H et al. (2012) Genetic polymorphisms in androgen receptor-binding sites predict survival in prostate cancer patients receiving androgen-deprivation therapy. Annals of oncology : official journal of the European Society for Medical Oncology /. , ESMO 23, 707-13.
  1. 33.Yu B, Lin H, Yang L, Chen K, Luo H et al. (2012) Genetic variation in the Nrf2 promoter associates with defective spermatogenesis in humans. , Journal of molecular medicine
  1. 34.Wu J, Richards M H, Huang J, Al-Harthi L, Xu X et al. (2011) Human FasL gene is a target of beta-catenin/T-cell factor pathway and complex FasL haplotypes alter promoter functions. , PLoS One 6, 26143.
  1. 35.Alam M, Pravica V, Fryer A A, Hawkins C P, Hutchinson. (2005) Novel polymorphism in the promoter region of the human nerve growth-factor gene. , International journal of immunogenetics 32, 379-82.
  1. 36.Kumar A, Purohit R. (2012) Computational investigation of pathogenic nsSNPs in CEP63 protein. , Gene 503, 75-82.
  1. 37.Kamaraj B, Purohit R. (2014) Computational screening of disease-associated mutations in OCA2 gene. , Cell Biochem Biophys 68, 97-109.
  1. 38.Kumar A, Rajendran V, Sethumadhavan R, Shukla P, Tiwari S et al. (2014) Computational SNP analysis: current approaches and future prospects. , Cell Biochem Biophys 68, 233-9.
  1. 39. Kumar A, Purohit R. (2014) Use of long term molecular dynamics simulation in predicting cancer associated SNPs. PLoS Comput Biol 10, e1003318.
  1. 40. Bryne J C, Valen E, Tang M H, Marstrand T, Winther O et al. (2008) JASPAR: the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res 36, D102-6.
  1. 41. Sandelin A, Alkema W, Engstrom P, Wasserman W W, Lenhard B. (2004) JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 32, D91-4.
  1. 42. Sandelin A, Wasserman W W, Lenhard B. (2004) ConSite: web-based prediction of regulatory elements using cross-species comparison. Nucleic Acids Res 32, W249-52.
  1. 43. Buroker N E, Ning X H, Zhou Z N, Li K, Cen W J et al. (2012) AKT3, ANGPTL4, eNOS3, and VEGFA associations with high altitude sickness in Han and Tibetan Chinese at the Qinghai-Tibetan Plateau. International journal of hematology 96, 200-13.
  1. 44. Pennisi E. (2011) The Biology of Genomes. Disease risk links to gene regulation. Science 332, 1031.
  1. 45. Kumar V, Wijmenga C, Withoff S. (2012) From genome-wide association studies to disease mechanisms: celiac disease as a model for autoimmune diseases. Semin Immunopathol 34, 567-80.
  1. 46. Hindorff L A, Sethupathy P, Junkins H A, Ramos E M, Mehta J P et al. (2009) Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 106, 9362-7.
  1. 47. Kumar V, Westra H J, Karjalainen J, Zhernakova D V, Esko T et al. (2013) Human disease-associated genetic variation impacts large intergenic non-coding RNA expression. PLoS Genet 9, e1003201.
  1. 48. Chorley B N, Wang X, Campbell M R, Pittman G S, Noureddine M A et al. (2008) Discovery and verification of functional single nucleotide polymorphisms in regulatory genomic regions: current and developing technologies. Mutat Res 659, 147-57.
  1. 49. Prokunina L, Alarcon-Riquelme M E. (2004) Regulatory SNPs in complex diseases: their identification and functional validation. Expert Rev Mol Med 6, 1-15.
  1. 50. Buckland P R. (2006) The importance and identification of regulatory polymorphisms and their mechanisms of action. Biochim Biophys Acta 1762, 17-28.
  1. 51. Sadee W, Wang D, Papp A C, Pinsonneault J K, Smith R M et al. (2011) Pharmacogenomics of the RNA world: structural RNA polymorphisms in drug therapy. Clin Pharmacol Ther 89, 355-65.
  1. 52. Giordano F J. (2005) Oxygen, oxidative stress, hypoxia, and heart failure. J Clin Invest 115, 500-8.

Cited by (2)

  1. 1. Grijalva-Avila Julio, Villanueva-Fierro Ignacio, Lares-Asseff Ismael, Chairez-Hernández Isaías, Rivera-Sanchez Gildardo, et al, 2020, Milk intake and IGF-1 rs6214 polymorphism as protective factors to obesity, International Journal of Food Sciences and Nutrition, 71(3), 388, 10.1080/09637486.2019.1666805
  1. 2. Buroker Norman E., 2017, Identifying Changes in Punitive Transcriptional Factor Binding Sites Created by PPARα/δ/γ SNPs Associated with Disease, Journal of Biosciences and Medicines, 05(04), 81, 10.4236/jbm.2017.54008