Web conversations about complementary and alternative medicines and cancer: Content and sentiment analysis

Mauro Mazzocut, Ivana Truccolo, Marialuisa Antonini, Fabio Rinaldi, Paolo Omero, Emanuela Ferrarin, Paolo De Paoli, Carlo Tasso

Research output: Contribution to journal › Article › peer-review


Background: The use of complementary and alternative medicine (CAM) among cancer patients is widespread and mostly self-administered. Today, one of the most relevant topics is the nondisclosure of CAM use to doctors. This general lack of communication exposes patients to dangerous behaviors and to less reliable information channels, such as the Web. The Italian context scarcely differs from this trend. Today, we are able to systematically mine and analyze the unstructured information available on the Web to gain insight into people's opinions, beliefs, and rumors concerning health topics.

Objective: Our aim was to analyze Italian Web conversations about CAM, identifying the most relevant Web sources, therapies, and diseases, and to measure the related sentiment.

Methods: Data were collected using the Web Intelligence tool ifMONITOR. The workflow consisted of 6 phases: (1) definition of eligibility criteria for the ifMONITOR search profile; (2) creation of a CAM terminology database; (3) generic Web search and automatic filtering, with the results manually revised to refine the search profile and stored in the ifMONITOR database; (4) automatic classification using the CAM database terms; (5) selection of the final sample and manual sentiment analysis using a 1-5 score range; (6) manual indexing of the Web sources and of the types of CAM therapies retrieved. Descriptive univariate statistics were computed for each item: absolute frequency, percentage, central tendency (mean sentiment score [MSS]), and variability (standard deviation σ).

Results: Overall, 212 Web sources, 423 Web documents, and 868 opinions were retrieved. The overall sentiment measured tends toward a good score (3.6 of 5). The standard deviation analysis revealed quite a high polarization in the opinions of the conversation participants (σ≥1). In total, 126 of 212 (59.4%) Web sources retrieved were nonhealth-related. Facebook (89; 21%) and Yahoo Answers (41; 9.7%) were the most relevant. In total, 94 CAM therapies were retrieved. Most belong to the "biologically based therapies or nutrition" category: 339 of 868 opinions (39.1%), showing an MSS of 3.9 (σ=0.83). Within nutrition, "diets" collected 154 opinions (18.4%) with an MSS of 3.8 (σ=0.87); "food as CAM" overall collected 112 opinions (12.8%) with an MSS of 4 (σ=0.68). Excluding diets and food, the most discussed CAM therapy is the controversial Italian "Di Bella multitherapy", with 102 opinions (11.8%) and an MSS of 3.4 (σ=1.21). Breast cancer was the most mentioned disease: 81 opinions of 868.

Conclusions: Conversations about CAM and cancer are ubiquitous. Considerable attention is paid to the biologically based therapies, which are perceived as harmless and useful, while the risks related to dangerous interactions or malnutrition are underrated. Our results can help doctors become aware of the implications of these beliefs for clinical practice. Exploiting Web conversations could be a strategy for gaining insight into people's perspectives on other controversial topics.
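The descriptive statistics named in the Methods (absolute frequency, percentage, mean sentiment score, and standard deviation) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `describe_opinions`, the `(category, score)` input layout, and the sample data are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def describe_opinions(opinions):
    """Summarize (category, score) pairs per category.

    Hypothetical helper: scores are manual sentiment ratings on the
    1-5 range described in the abstract.
    """
    scores_by_category = defaultdict(list)
    for category, score in opinions:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of 1-5 range: {score}")
        scores_by_category[category].append(score)

    total = len(opinions)
    summary = {}
    for category, scores in scores_by_category.items():
        summary[category] = {
            "count": len(scores),                            # absolute frequency
            "percent": round(100 * len(scores) / total, 1),  # share of all opinions
            "mss": round(mean(scores), 2),                   # mean sentiment score
            # Assumption: population standard deviation (pstdev).
            "sd": round(pstdev(scores), 2),
        }
    return summary


# Made-up sample data, for illustration only.
sample = [("diets", 4), ("diets", 5), ("diets", 3), ("herbal", 1), ("herbal", 5)]
for category, stats in describe_opinions(sample).items():
    print(category, stats)
```

In this sketch, a category whose standard deviation reaches 1 or more would count as polarized in the sense used in the Results.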

Original language: English
Article number: e120
Journal: Journal of Medical Internet Research
Issue number: 6
Publication status: Published - Jun 1 2016


  • Barriers to patient-doctor communication
  • Complementary and alternative medicine
  • Data mining
  • Health information online
  • Internet
  • Misinformation
  • Neoplasms
  • Sentiment analysis
  • Website content analysis

ASJC Scopus subject areas

  • Health Informatics

