ProteomicsML: An Online Platform for Community-Curated Datasets and Tutorials for Machine Learning in Proteomics

Published

Author(s)

Tobias Rehfeldt, Ralf Gabriels, Robbin Bouwmeester, Siegfried Gessulat, Magnus Palmblad, Ben Neely, Yasset Perez-Riverol, Tobias Schmidt, Juan Antonio Vizcaíno, Eric Deutsch

Abstract

Dataset acquisition and curation are often the hardest and most time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography, ion mobility, mass spectrometry (LC-IM-MS) datasets, because the data are high-throughput and noisy, and considerable complexity separates the raw formats from machine learning-ready ones. Predictive proteomics is a field on the rise, but when predicting peptide behavior in LC-IM-MS setups, each lab often uses its own complex data processing pipeline to maximize performance, at the cost of accessibility and reproducibility. For this reason we introduce ProteomicsML, an online resource for proteomics-based datasets and tutorials across most of the currently explored physico-chemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats, and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides datasets that are useful for comparing state-of-the-art (SOTA) machine learning algorithms, as well as introductory material for teachers and newcomers to the field alike. The platform is freely available at https://www.proteomicsml.org/, and we welcome the entire proteomics community to contribute to the project at https://github.com/proteomicsml/.
Citation
ACS Journal of Proteome Research

Keywords

machine learning, deep learning, proteomics, educational platform, community platform, bioinformatics

Citation

Rehfeldt, T., Gabriels, R., Bouwmeester, R., Gessulat, S., Palmblad, M., Neely, B., Perez-Riverol, Y., Schmidt, T., Vizcaíno, J. and Deutsch, E. (2023), ProteomicsML: An Online Platform for Community-Curated Datasets and Tutorials for Machine Learning in Proteomics, ACS Journal of Proteome Research, [online], https://doi.org/10.1021/acs.jproteome.2c00629, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=935701 (Accessed January 28, 2025)

Created January 24, 2023, Updated January 27, 2023