Interpreting Non-coding Genetic Variation in Multiple Sclerosis Genome-Wide Associated Regions

Research output: Contribution to journal › Article

Abstract

Multiple sclerosis (MS) is the most common neurological disorder in young adults. Despite extensive studies, only a fraction of MS heritability has been explained, and association studies have focused primarily on protein-coding genes, largely because non-coding features are difficult to interpret. However, non-coding RNAs (ncRNAs) and functional elements such as super-enhancers (SE) are crucial regulators of many pathways and cellular mechanisms, and they have been implicated in a growing number of diseases. In this work, we searched for possible enrichments in non-coding elements at MS genome-wide associated loci, with the aim of highlighting their possible involvement in susceptibility to the disease. We first reconstructed the linkage disequilibrium (LD) structure of the Italian population using data on 727,478 single-nucleotide polymorphisms (SNPs) from 1,668 healthy individuals. The genomic coordinates of the resulting LD blocks were intersected with those of the top hits identified in previously published MS genome-wide association studies (GWAS). Using a bootstrapping approach, we demonstrated a striking enrichment of non-coding elements, especially circular RNAs (circRNAs), mapping in the 73 LD blocks harboring MS-associated SNPs. In particular, we found a total of 482 circRNAs (annotated in publicly available databases) vs. a mean of 194 ± 65 in the random sets of LD blocks, using 1,000 iterations. As a proof of concept of the possible functional relevance of this observation, we experimentally verified that the expression levels of a circRNA derived from an MS-associated locus, i.e., hsa_circ_0043813 from the STAT3 gene, can be modulated by the three genotypes at the disease-associated SNP. Finally, by evaluating RNA-seq data from two cell lines, SH-SY5Y and Jurkat cells, representing tissues relevant for MS, we identified 18 (two novel) circRNAs derived from MS-associated genes.
In conclusion, this work showed for the first time that MS-GWAS top hits map in LD blocks enriched in circRNAs, suggesting circRNAs as possible novel contributors to disease pathogenesis.
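
The enrichment test described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the (chrom, start, end) tuple layout with half-open intervals, and the choice to draw random block sets without replacement are all assumptions made for illustration. The abstract states only that random sets of LD blocks were compared over 1,000 iterations. The final helper converts the reported figures into a rough z-score, assuming that the "± 65" is the standard deviation of the random-set counts, which the abstract does not specify.

```python
import random
import statistics


def count_features(blocks, features):
    """Count features overlapping at least one block.

    Blocks and features are (chrom, start, end) tuples with half-open
    coordinates; this layout is an assumption of the sketch.
    """
    hits = 0
    for f_chrom, f_start, f_end in features:
        for b_chrom, b_start, b_end in blocks:
            if f_chrom == b_chrom and f_start < b_end and b_start < f_end:
                hits += 1
                break  # count each feature once, even if it spans blocks
    return hits


def resampling_enrichment(all_blocks, target_blocks, features,
                          n_iter=1000, seed=0):
    """Compare the feature count in target blocks with random block sets.

    Each iteration draws as many blocks as there are target blocks,
    without replacement (an assumption; the source does not say).
    Returns (observed, null_mean, null_sd, empirical_p).
    """
    rng = random.Random(seed)
    observed = count_features(target_blocks, features)
    null = [
        count_features(rng.sample(all_blocks, len(target_blocks)), features)
        for _ in range(n_iter)
    ]
    # One-sided empirical p-value with the usual +1 correction.
    p_value = (1 + sum(c >= observed for c in null)) / (n_iter + 1)
    return observed, statistics.mean(null), statistics.stdev(null), p_value


def z_score(observed, null_mean, null_sd):
    """Standardized distance of the observed count from the null mean."""
    return (observed - null_mean) / null_sd


# Figures reported in the abstract: 482 observed vs. 194 +/- 65.
print(round(z_score(482, 194, 65), 2))
```

Under that reading of the reported figures, the observed count lies a little more than four standard deviations above the random-set mean, which matches the abstract's description of the enrichment as striking.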

Original language: English
Pages (from-to): 647
Journal: Frontiers in Genetics
Volume: 9
DOI: 10.3389/fgene.2018.00647
Publication status: Published - 2018

Fingerprint

Multiple Sclerosis
Genome
Linkage Disequilibrium
Single Nucleotide Polymorphism
Genome-Wide Association Study
Untranslated RNA
Jurkat Cells
Disease Susceptibility
Nervous System Diseases
Genes
Young Adult
Genotype
circular RNA
Databases
RNA
Cell Line
Population

Cite this

@article{8ef8a21434ef47e588f536eee3870f94,
title = "Interpreting Non-coding Genetic Variation in Multiple Sclerosis Genome-Wide Associated Regions",
abstract = "Multiple sclerosis (MS) is the most common neurological disorder in young adults. Despite extensive studies, only a fraction of MS heritability has been explained, with association studies focusing primarily on protein-coding genes, essentially for the difficulty of interpreting non-coding features. However, non-coding RNAs (ncRNAs) and functional elements, such as super-enhancers (SE), are crucial regulators of many pathways and cellular mechanisms, and they have been implicated in a growing number of diseases. In this work, we searched for possible enrichments in non-coding elements at MS genome-wide associated loci, with the aim to highlight their possible involvement in the susceptibility to the disease. We first reconstructed the linkage disequilibrium (LD) structure of the Italian population using data of 727,478 single-nucleotide polymorphisms (SNPs) from 1,668 healthy individuals. The genomic coordinates of the obtained LD blocks were intersected with those of the top hits identified in previously published MS genome-wide association studies (GWAS). By a bootstrapping approach, we hence demonstrated a striking enrichment of non-coding elements, especially of circular RNAs (circRNAs) mapping in the 73 LD blocks harboring MS-associated SNPs. In particular, we found a total of 482 circRNAs (annotated in publicly available databases) vs. a mean of 194 ± 65 in the random sets of LD blocks, using 1,000 iterations. As a proof of concept of a possible functional relevance of this observation, we experimentally verified that the expression levels of a circRNA derived from an MS-associated locus, i.e., hsa_circ_0043813 from the STAT3 gene, can be modulated by the three genotypes at the disease-associated SNP. Finally, by evaluating RNA-seq data of two cell lines, SH-SY5Y and Jurkat cells, representing tissues relevant for MS, we identified 18 (two novel) circRNAs derived from MS-associated genes. 
In conclusion, this work showed for the first time that MS-GWAS top hits map in LD blocks enriched in circRNAs, suggesting circRNAs as possible novel contributors to the disease pathogenesis.",
author = "Paraboschi, {Elvezia Maria} and Giulia Cardamone and Giulia Sold{\`a} and Stefano Duga and Rosanna Asselta",
year = "2018",
doi = "10.3389/fgene.2018.00647",
language = "English",
volume = "9",
pages = "647",
journal = "Frontiers in Genetics",
issn = "1664-8021",
publisher = "Frontiers Media S. A.",
}

TY - JOUR

T1 - Interpreting Non-coding Genetic Variation in Multiple Sclerosis Genome-Wide Associated Regions

AU - Paraboschi, Elvezia Maria

AU - Cardamone, Giulia

AU - Soldà, Giulia

AU - Duga, Stefano

AU - Asselta, Rosanna

PY - 2018

Y1 - 2018

N2 - Multiple sclerosis (MS) is the most common neurological disorder in young adults. Despite extensive studies, only a fraction of MS heritability has been explained, with association studies focusing primarily on protein-coding genes, essentially for the difficulty of interpreting non-coding features. However, non-coding RNAs (ncRNAs) and functional elements, such as super-enhancers (SE), are crucial regulators of many pathways and cellular mechanisms, and they have been implicated in a growing number of diseases. In this work, we searched for possible enrichments in non-coding elements at MS genome-wide associated loci, with the aim to highlight their possible involvement in the susceptibility to the disease. We first reconstructed the linkage disequilibrium (LD) structure of the Italian population using data of 727,478 single-nucleotide polymorphisms (SNPs) from 1,668 healthy individuals. The genomic coordinates of the obtained LD blocks were intersected with those of the top hits identified in previously published MS genome-wide association studies (GWAS). By a bootstrapping approach, we hence demonstrated a striking enrichment of non-coding elements, especially of circular RNAs (circRNAs) mapping in the 73 LD blocks harboring MS-associated SNPs. In particular, we found a total of 482 circRNAs (annotated in publicly available databases) vs. a mean of 194 ± 65 in the random sets of LD blocks, using 1,000 iterations. As a proof of concept of a possible functional relevance of this observation, we experimentally verified that the expression levels of a circRNA derived from an MS-associated locus, i.e., hsa_circ_0043813 from the STAT3 gene, can be modulated by the three genotypes at the disease-associated SNP. Finally, by evaluating RNA-seq data of two cell lines, SH-SY5Y and Jurkat cells, representing tissues relevant for MS, we identified 18 (two novel) circRNAs derived from MS-associated genes. 
In conclusion, this work showed for the first time that MS-GWAS top hits map in LD blocks enriched in circRNAs, suggesting circRNAs as possible novel contributors to the disease pathogenesis.

U2 - 10.3389/fgene.2018.00647

DO - 10.3389/fgene.2018.00647

M3 - Article

VL - 9

SP - 647

JO - Frontiers in Genetics

JF - Frontiers in Genetics

SN - 1664-8021

ER -