Malignant pleural mesothelioma (MPM) is a rare neoplasm, mainly caused by asbestos exposure, with a high mortality rate. The management of patients with MPM is controversial because of the long latency period between exposure and diagnosis and because non-specific symptoms generally appear only at an advanced stage of the disease. Breath analysis, aimed at identifying a diagnostic pattern of volatile organic compounds (VOCs) in exhaled breath, is believed to improve early detection of MPM. In this study, breath samples from 14 MPM patients and 20 healthy controls (HC) were therefore collected and analyzed by thermal desorption-gas chromatography-mass spectrometry (TD-GC/MS). A nonparametric test was used to identify the variables that weighed most in discriminating between MPM and HC breath samples, and multivariate statistics were then applied. Because MPM is an aggressive neoplasm that is usually diagnosed late, patient recruitment is very difficult; a data mining approach was therefore developed and validated to discriminate between MPM patients and healthy controls even when no large population data are available. Three different machine learning algorithms were applied to the classification task with a leave-one-out cross-validation approach, yielding an area under the curve (AUC) of 93%. Ten VOCs, including ketones, alkanes and methylated derivatives, as well as other hydrocarbons, discriminated between MPM patients and healthy controls; for each compound found to be diagnostic for MPM, the metabolic pathway was studied to identify the link between the VOC and the neoplasm. Moreover, five breath samples from asymptomatic asbestos-exposed persons (AEx) were analyzed on an exploratory basis, processed, and tested as blinded samples with the validated statistical method to evaluate its performance in the early recognition of MPM among asbestos-exposed persons.
Good agreement was found between the model output and the information obtained by gold-standard diagnostic methods such as computed tomography (CT).
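
The leave-one-out cross-validation scheme mentioned above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the data are synthetic random numbers (only the group sizes, 20 HC and 14 MPM, and the count of ten features are borrowed from the text), a nearest-centroid rule stands in for the three unnamed algorithms, and the AUC is assumed to be the area under the ROC curve. All function names are hypothetical.

```python
import random
from math import dist


def nearest_centroid_score(train_X, train_y, x):
    """Stand-in classifier (assumption, not the study's algorithm).

    Returns a score that is larger when x lies closer to the class-1
    centroid than to the class-0 centroid.
    """
    def centroid(label):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    return dist(x, centroid(0)) - dist(x, centroid(1))


def loocv_scores(X, y, score_fn):
    """Leave-one-out: each sample is scored by a model fit on all the others."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        scores.append(score_fn(train_X, train_y, X[i]))
    return scores


def roc_auc(y, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, lab in zip(scores, y) if lab == 1]
    neg = [s for s, lab in zip(scores, y) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data: 20 "HC" rows (label 0) and 14 "MPM" rows (label 1),
# each with 10 made-up feature values. These are NOT measured VOC abundances.
rng = random.Random(0)
N_FEATURES = 10
X = (
    [[rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)] for _ in range(20)]
    + [[rng.gauss(1.0, 1.0) for _ in range(N_FEATURES)] for _ in range(14)]
)
y = [0] * 20 + [1] * 14

scores = loocv_scores(X, y, nearest_centroid_score)
auc = roc_auc(y, scores)
print(f"LOOCV AUC on synthetic data: {auc:.2f}")
```

With only 34 samples, leave-one-out uses every sample for testing exactly once while training on the remaining 33, which is why it suits small cohorts. Any variable selection step would also need to be repeated inside each fold to avoid optimistic bias.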