An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

J. Liu, T. Lichtenberg, K. A. Hoadley, L. M. Poisson, A. J. Lazar, A. D. Cherniack, A. J. Kovatich, C. C. Benz, D. A. Levine, A. V. Lee, L. Omberg, D. M. Wolf, C. D. Shriver, V. Thorsson, Cancer Genome Atlas Research Network, H. Hu, M. Marino (as contributor)

Research output: Contribution to journal › Article

Abstract

For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale.
Original language: English
Pages (from-to): 400-416.e11
Journal: Cell
Volume: 173
Issue number: 2
DOI: https://doi.org/10.1016/j.cell.2018.02.052
Publication status: Published - Apr 5 2018
Externally published: Yes

Fingerprint

Atlases
Genes
Genome
Survival
Neoplasms
Tumors
Genomics

Keywords

  • Cox proportional hazards regression model
  • TCGA
  • The Cancer Genome Atlas
  • clinical data resource
  • disease-free interval
  • disease-specific survival
  • follow-up time
  • overall survival
  • progression-free interval
  • translational research

Cite this

Liu, J., Lichtenberg, T., Hoadley, K. A., Poisson, L. M., Lazar, A. J., Cherniack, A. D., ... Marino, M. (2018). An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, 173(2), 400-416.e11. https://doi.org/10.1016/j.cell.2018.02.052

An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. / Liu, J.; Lichtenberg, T.; Hoadley, K. A.; Poisson, L. M.; Lazar, A. J.; Cherniack, A. D.; Kovatich, A. J.; Benz, C. C.; Levine, D. A.; Lee, A. V.; Omberg, L.; Wolf, D. M.; Shriver, C. D.; Thorsson, V.; Cancer Genome Atlas Research Network; Hu, H.; Marino, M. (as contributor).

In: Cell, Vol. 173, No. 2, 05.04.2018, p. 400-416.e11.

Liu, J, Lichtenberg, T, Hoadley, KA, Poisson, LM, Lazar, AJ, Cherniack, AD, Kovatich, AJ, Benz, CC, Levine, DA, Lee, AV, Omberg, L, Wolf, DM, Shriver, CD, Thorsson, V, Cancer Genome Atlas Research Network, Hu, H & Marino, M 2018, 'An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics', Cell, vol. 173, no. 2, pp. 400-416.e11. https://doi.org/10.1016/j.cell.2018.02.052
Liu J, Lichtenberg T, Hoadley KA, Poisson LM, Lazar AJ, Cherniack AD et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell. 2018 Apr 5;173(2):400-416.e11. https://doi.org/10.1016/j.cell.2018.02.052
Liu, J. ; Lichtenberg, T. ; Hoadley, K. A. ; Poisson, L. M. ; Lazar, A. J. ; Cherniack, A. D. ; Kovatich, A. J. ; Benz, C. C. ; Levine, D. A. ; Lee, A. V. ; Omberg, L. ; Wolf, D. M. ; Shriver, C. D. ; Thorsson, V. ; Cancer Genome Atlas Research Network ; Hu, H. ; Marino, M. (as contributor). / An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. In: Cell. 2018 ; Vol. 173, No. 2. pp. 400-416.e11.
@article{0bbe7b3561c2408eb8784d0a08819ac6,
title = "An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics",
abstract = "For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale.",
keywords = "Cox proportional hazards regression model, TCGA, The Cancer Genome Atlas, clinical data resource, disease-free interval, disease-specific survival, follow-up time, overall survival, progression-free interval, translational research",
author = "J. Liu and T. Lichtenberg and Hoadley, {K. A.} and Poisson, {L. M.} and Lazar, {A. J.} and Cherniack, {A. D.} and Kovatich, {A. J.} and Benz, {C. C.} and Levine, {D. A.} and Lee, {A. V.} and L. Omberg and Wolf, {D. M.} and Shriver, {C. D.} and V. Thorsson and {Cancer Genome Atlas Research Network} and H. Hu and M. Marino",
note = "LR: 20180801; CI: Copyright (c) 2018; GR: P30 CA016672/CA/NCI NIH HHS/United States; GR: U24 CA143882/CA/NCI NIH HHS/United States; GR: U54 HG003067/HG/NHGRI NIH HHS/United States; GR: U24 CA143835/CA/NCI NIH HHS/United States; GR: U24 CA143866/CA/NCI NIH HHS/United States; GR: U24 CA210950/CA/NCI NIH HHS/United States; GR: U24 CA143845/CA/NCI NIH HHS/United States; GR: U24 CA143799/CA/NCI NIH HHS/United States; GR: U54 HG003273/HG/NHGRI NIH HHS/United States; GR: U24 CA144025/CA/NCI NIH HHS/United States; GR: U24 CA143840/CA/NCI NIH HHS/United States; GR: U24 CA143843/CA/NCI NIH HHS/United States; GR: U24 CA143858/CA/NCI NIH HHS/United States; GR: U24 CA143848/CA/NCI NIH HHS/United States; GR: U24 CA210957/CA/NCI NIH HHS/United States; GR: U54 HG003079/HG/NHGRI NIH HHS/United States; GR: U24 CA210949/CA/NCI NIH HHS/United States; GR: U24 CA210988/CA/NCI NIH HHS/United States; GR: U24 CA143883/CA/NCI NIH HHS/United States; GR: U24 CA143867/CA/NCI NIH HHS/United States; GR: U24 CA210990/CA/NCI NIH HHS/United States; JID: 0413066; NIHMS978596; OTO: NOTNLM; PMCR: 2019/04/05 00:00; 2017/07/21 00:00 [received]; 2017/11/11 00:00 [revised]; 2018/02/20 00:00 [accepted]; 2019/04/05 00:00 [pmc-release]; 2018/04/07 06:00 [entrez]; 2018/04/07 06:00 [pubmed]; 2018/04/07 06:00 [medline]; ppublish",
year = "2018",
month = "4",
day = "5",
doi = "10.1016/j.cell.2018.02.052",
language = "English",
volume = "173",
pages = "400--416.e11",
journal = "Cell",
issn = "0092-8674",
publisher = "Cell Press",
number = "2",

}

TY - JOUR
T1 - An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics
AU - Liu, J.
AU - Lichtenberg, T.
AU - Hoadley, K. A.
AU - Poisson, L. M.
AU - Lazar, A. J.
AU - Cherniack, A. D.
AU - Kovatich, A. J.
AU - Benz, C. C.
AU - Levine, D. A.
AU - Lee, A. V.
AU - Omberg, L.
AU - Wolf, D. M.
AU - Shriver, C. D.
AU - Thorsson, V.
AU - Cancer Genome Atlas Research Network
AU - Hu, H.
AU - Marino, M.

N1 - LR: 20180801; CI: Copyright (c) 2018; GR: P30 CA016672/CA/NCI NIH HHS/United States; GR: U24 CA143882/CA/NCI NIH HHS/United States; GR: U54 HG003067/HG/NHGRI NIH HHS/United States; GR: U24 CA143835/CA/NCI NIH HHS/United States; GR: U24 CA143866/CA/NCI NIH HHS/United States; GR: U24 CA210950/CA/NCI NIH HHS/United States; GR: U24 CA143845/CA/NCI NIH HHS/United States; GR: U24 CA143799/CA/NCI NIH HHS/United States; GR: U54 HG003273/HG/NHGRI NIH HHS/United States; GR: U24 CA144025/CA/NCI NIH HHS/United States; GR: U24 CA143840/CA/NCI NIH HHS/United States; GR: U24 CA143843/CA/NCI NIH HHS/United States; GR: U24 CA143858/CA/NCI NIH HHS/United States; GR: U24 CA143848/CA/NCI NIH HHS/United States; GR: U24 CA210957/CA/NCI NIH HHS/United States; GR: U54 HG003079/HG/NHGRI NIH HHS/United States; GR: U24 CA210949/CA/NCI NIH HHS/United States; GR: U24 CA210988/CA/NCI NIH HHS/United States; GR: U24 CA143883/CA/NCI NIH HHS/United States; GR: U24 CA143867/CA/NCI NIH HHS/United States; GR: U24 CA210990/CA/NCI NIH HHS/United States; JID: 0413066; NIHMS978596; OTO: NOTNLM; PMCR: 2019/04/05 00:00; 2017/07/21 00:00 [received]; 2017/11/11 00:00 [revised]; 2018/02/20 00:00 [accepted]; 2019/04/05 00:00 [pmc-release]; 2018/04/07 06:00 [entrez]; 2018/04/07 06:00 [pubmed]; 2018/04/07 06:00 [medline]; ppublish

PY - 2018/4/5
Y1 - 2018/4/5

AB - For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale.

KW - Cox proportional hazards regression model
KW - TCGA
KW - The Cancer Genome Atlas
KW - clinical data resource
KW - disease-free interval
KW - disease-specific survival
KW - follow-up time
KW - overall survival
KW - progression-free interval
KW - translational research
U2 - 10.1016/j.cell.2018.02.052
DO - 10.1016/j.cell.2018.02.052
M3 - Article
VL - 173
SP - 400
EP - 416.e11
JO - Cell
JF - Cell
SN - 0092-8674
IS - 2
ER -