Welcome to Imputomics!

Improve your metabolomics analysis by addressing missing values with ease.


About

This application incorporates the R package imputomics. If you prefer to call the R functions directly, the package is available on GitHub.
We appreciate your choice to use our application. If you have any questions, feedback, or suggestions, please submit them as an issue on GitHub. Your input is valuable to us!

Our app offers:

1. Seamless Integration: Import datasets in various formats, such as Excel and CSV.
2. Visualize Missingness: Gain insights into missing data patterns.
3. Customizable Imputation: Choose from the largest set of imputation methods.
4. Performance Evaluation: Compare imputation strategies for optimal results.
5. Export and Integration: Export completed datasets and integrate with popular analysis platforms.
6. Secure and Confidential: Your data privacy is our top priority.

How to cite?

Jarosław Chilimoniuk, Krystyna Grzesiak, Jakub Kała, Dominik Nowakowski, Adam Krętowski, Rafał Kolenda, Michał Ciborowski, Michał Burdukiewicz (2023). imputomics: web server and R package for missing values imputation in metabolomics data, Bioinformatics, 10.1093/bioinformatics/btae098.

Contact

If you have any questions, suggestions, or comments, contact Michał Burdukiewicz.

Funding and acknowledgements

We want to thank the Clinical Research Centre (Medical University of Białystok) members for fruitful discussions. K.G. wants to acknowledge grant no. 2021/43/O/ST6/02805 (National Science Centre). M.C. acknowledges grant no. B.SUB.23.533 (Medical University of Białystok). The study was supported by the Ministry of Education and Science funds within the project 'Excellence Initiative - Research University'. We also acknowledge the Center for Artificial Intelligence at the Medical University of Białystok (funded by the Ministry of Health of the Republic of Poland).

Here you can upload your data!


1. Select your metabolomics dataset in CSV, Excel or RDS:
2. Select how missing values are denoted:
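
To illustrate what the missing value setting means, here is a minimal sketch in Python. It is not the application's own code: the function name `read_with_missing`, the `missing_marker` parameter, and the example data are hypothetical, and the sketch assumes that empty cells also count as missing.

```python
import csv
import io


def read_with_missing(text, missing_marker="NA"):
    """Parse CSV text; cells equal to missing_marker (or empty) become None."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        row = {}
        for name, cell in record.items():
            cell = (cell or "").strip()
            if cell == missing_marker or cell == "":
                row[name] = None  # treated as a missing value
            else:
                try:
                    row[name] = float(cell)
                except ValueError:
                    row[name] = cell  # keep text columns as text
        rows.append(row)
    return rows


# Hypothetical dataset in which missing values are written as "NA".
example = "sample,glucose,lactate\ns1,5.1,NA\ns2,NA,1.3\ns3,4.8,1.1\n"
table = read_with_missing(example, missing_marker="NA")
```

If a dataset marks missing values differently (for example with `0` or `-`), passing that marker instead would flag those cells as missing in the same way.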

Click below to upload example data!



Exclude variables from imputation data.

Choose from the numeric columns which ones to exclude. Categorical/text columns will be automatically ignored.
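
The selection rule described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the application's implementation: `columns_to_impute` and the example columns are hypothetical, and a column is assumed to be numeric when all of its non-missing values are numbers.

```python
def columns_to_impute(rows, excluded=()):
    """Return numeric column names that are not listed in `excluded`.

    A column counts as numeric if it has at least one non-missing value
    and every non-missing value is an int or a float.
    """
    excluded = set(excluded)
    numeric = []
    for name in rows[0].keys():
        values = [row[name] for row in rows if row[name] is not None]
        is_numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numeric:
            numeric.append(name)
    # Text columns never reach this point; user-excluded columns are dropped here.
    return [name for name in numeric if name not in excluded]


rows = [
    {"sample": "s1", "age": 34, "glucose": 5.1, "lactate": None},
    {"sample": "s2", "age": 51, "glucose": None, "lactate": 1.3},
]
selected = columns_to_impute(rows, excluded=["age"])
```

Here the text column `sample` is skipped automatically, while the numeric column `age` is skipped only because it was excluded explicitly.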



Columns to be ignored during imputation:




Plot settings:





Plot settings:




Variables removal

1. Set threshold for missing values ratio


2. Set groups (optional)


3. Click Remove!
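
The steps above can be sketched in Python as follows. This is only an illustration: the function names and data are hypothetical, and the rule that a variable is removed when its ratio is strictly greater than the threshold is an assumption, since the exact comparison used by the application is not stated here. How group ratios feed into the removal decision is also not specified, so the sketch only computes them.

```python
def missing_ratio(values):
    """Percentage of missing (None) entries in a list of values."""
    return 100.0 * sum(v is None for v in values) / len(values)


def variables_to_remove(data, threshold):
    """Names of variables whose overall missing ratio [%] exceeds threshold.

    Assumption: strictly greater than the threshold means removal.
    """
    return [name for name, values in data.items()
            if missing_ratio(values) > threshold]


def ratio_per_group(values, groups):
    """Missing ratio [%] of one variable, computed separately per group."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    return {group: missing_ratio(vals) for group, vals in by_group.items()}


# Hypothetical data: two variables measured in four samples.
data = {
    "glucose": [5.1, None, 4.8, 5.0],
    "lactate": [None, None, None, 1.1],
}
groups = ["control", "control", "case", "case"]

removed = variables_to_remove(data, threshold=50)
per_group = ratio_per_group(data["lactate"], groups)
```

With a threshold of 50%, `glucose` (25% missing) is kept and `lactate` (75% missing) is flagged for removal.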



The following variables will be removed:

Ratio of missing data per group [%]



Venn diagram:



Let's impute your missing values!

Select one or more imputation methods from the list below and click impute!
Check the references panel for the references of all imputation methods.
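
As a rough illustration of running several selected methods and collecting a success/error summary, consider the Python sketch below. It does not reproduce the imputomics functions: the three fill rules (mean, median, half of the minimum) are generic single-value imputations chosen for the example, and `impute_with`, `run_methods`, and the method names are hypothetical.

```python
from statistics import mean, median


def impute_with(fill_rule, column):
    """Replace None entries with one value computed from the observed ones."""
    observed = [v for v in column if v is not None]
    fill = fill_rule(observed)
    return [fill if v is None else v for v in column]


# Hypothetical registry of simple fill rules, keyed by method name.
methods = {
    "mean": mean,
    "median": median,
    "half_min": lambda observed: min(observed) / 2,
}


def run_methods(data, selected):
    """Apply each selected method to every variable and record the outcome."""
    results = {}
    summary = {"success": [], "error": []}
    for name in selected:
        try:
            results[name] = {var: impute_with(methods[name], col)
                             for var, col in data.items()}
            summary["success"].append(name)
        except Exception:
            # A failing method is reported but does not stop the others.
            summary["error"].append(name)
    return results, summary


data = {"glucose": [4.0, None, 6.0], "lactate": [None, 2.0, 1.0]}
results, summary = run_methods(data, ["mean", "half_min", "unknown"])
```

Each method runs independently, so one failure (here the unregistered name `unknown`) ends up in the error list while the remaining results stay available.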

Specify time limit per method below.



Usage note:

Choosing an imputation method because its results fit a predefined hypothesis raises the problem of multiple comparisons and may lead to significant overfitting. When choosing an imputation method, prioritize the structure of your data over preconceived notions. The effectiveness of imputation methods varies with dataset characteristics, so consider the distribution of your data and the pattern of missing values to obtain a more accurate and reliable outcome.



*The most resource-demanding tools (MAI and Gibbs Sampler-based methods) are available only in the R package (https://github.com/BioGenies/imputomics).
 

Here you can check the results!

Imputation summary:


Success:


Error:


Data preview:


Plot settings:






Select methods from the list below

... and click to download the results!

References of missing value imputation algorithms

The references below are the sources for the missing value imputation algorithms included in the web server.

  1. Armitage EG, Godzien J, Alonso-Herranz V, López-Gonzálvez Á, Barbas C (2015). “Missing Value Imputation Strategies for Metabolomics Data: General.” ELECTROPHORESIS, 36(24), 3050-3060. ISSN 01730835, doi:10.1002/elps.201500352 https://doi.org/10.1002/elps.201500352.

  2. Borowski J, Fic P (2022). “NADIA: NA Data Imputation Algorithms.”

  3. van Buuren S, Groothuis-Oudshoorn K (2011). “Mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software, 45, 1-67. ISSN 1548-7660, doi:10.18637/jss.v045.i03 https://doi.org/10.18637/jss.v045.i03.

  4. Chen LS, Prentice RL, Wang P (2014). “A Penalized EM Algorithm Incorporating Missing Data Mechanism for Gaussian Parameter Estimation.” Biometrics, 70(2), 312-322. ISSN 0006-341X, doi:10.1111/biom.12149 https://doi.org/10.1111/biom.12149.

  5. Davis TJ, Firzli TR, Higgins Keppler EA, Richardson M, Bean HD (2022). “Addressing Missing Data in GC × GC Metabolomics: Identifying Missingness Type and Evaluating the Impact of Imputation Methods on Experimental Replication.” Analytical Chemistry, 94(31), 10912-10920. ISSN 0003-2700, doi:10.1021/acs.analchem.1c04093 https://doi.org/10.1021/acs.analchem.1c04093.

  6. Dekermanjian J, Shaddox E, Nandy D, Ghosh D, Kechris K (2023). “MAI: Mechanism-Aware Imputation.” Bioconductor version: Release (3.16). doi:10.18129/B9.bioc.MAI https://doi.org/10.18129/B9.bioc.MAI.

  7. Dekermanjian JP, Shaddox E, Nandy D, Ghosh D, Kechris K (2022). “Mechanism-Aware Imputation: A Two-Step Approach in Handling Missing Values in Metabolomics.” BMC Bioinformatics, 23(1), 179. ISSN 1471-2105, doi:10.1186/s12859-022-04659-1 https://doi.org/10.1186/s12859-022-04659-1.

  8. Di Guida R, Engel J, Allwood JW, Weber RJM, Jones MR, Sommer U, Viant MR, Dunn WB (2016). “Non-Targeted UHPLC-MS Metabolomic Data Processing Methods: A Comparative Investigation of Normalisation, Missing Value Imputation, Transformation and Scaling.” Metabolomics, 12(5), 93. ISSN 1573-3882, 1573-3890, doi:10.1007/s11306-016-1030-9 https://doi.org/10.1007/s11306-016-1030-9.

  9. Faquih T (2022). “Tofaquih/Imputation_of_untargeted_metabolites: Official Release v1.4.” Zenodo. doi:10.5281/zenodo.6347808 https://doi.org/10.5281/zenodo.6347808.

  10. Faquih T, van Smeden M, Luo J, le Cessie S, Kastenmüller G, Krumsiek J, Noordam R, van Heemst D, Rosendaal FR, van Hylckama Vlieg A, Willems van Dijk K, Mook-Kanamori DO (2020). “A Workflow for Missing Values Imputation of Untargeted Metabolomics Data.” Metabolites, 10(12), 486. ISSN 2218-1989, doi:10.3390/metabo10120486 https://doi.org/10.3390/metabo10120486.

  11. Fouad KM, Ismail MM, Azar AT, Arafa MM (2021). “Advanced Methods for Missing Values Imputation Based on Similarity Learning.” PeerJ Computer Science, 7, e619. ISSN 2376-5992, doi:10.7717/peerj-cs.619 https://doi.org/10.7717/peerj-cs.619.

  12. Hastie T, Tibshirani R, Narasimhan B, Chu G (2023). “impute: Imputation for Microarray Data.” Bioconductor version: Release (3.16). doi:10.18129/B9.bioc.impute https://doi.org/10.18129/B9.bioc.impute.

  13. Honaker J, King G, Blackwell M (2011). “Amelia II: A Program for Missing Data.” Journal of Statistical Software, 45, 1-47. ISSN 1548-7660, doi:10.18637/jss.v045.i07 https://doi.org/10.18637/jss.v045.i07.

  14. Jäger S, Allhorn A, Bießmann F (2021). “A Benchmark for Data Imputation Methods.” Frontiers in Big Data, 4, 693674. ISSN 2624-909X, doi:10.3389/fdata.2021.693674 https://doi.org/10.3389/fdata.2021.693674.

  15. Jin Z, Kang J, Yu T (2018). “Missing Value Imputation for LC-MS Metabolomics Data by Incorporating Metabolic Network and Adduct Ion Relations.” Bioinformatics, 34(9), 1555-1561. ISSN 1367-4803, 1367-4811, doi:10.1093/bioinformatics/btx816 https://doi.org/10.1093/bioinformatics/btx816.

  16. Josse J, Husson F (2016). “missMDA: A Package for Handling Missing Values in Multivariate Data Analysis.” Journal of Statistical Software, 70, 1-31. ISSN 1548-7660, doi:10.18637/jss.v070.i01 https://doi.org/10.18637/jss.v070.i01.

  17. Harrell FE Jr, Dupont C (2023). “Hmisc: Harrell Miscellaneous.”

  18. Kokla M, Virtanen J, Kolehmainen M, Paananen J, Hanhineva K (2019). “Random Forest-Based Imputation Outperforms Other Methods for Imputing LC-MS Metabolomics Data: A Comparative Study.” BMC Bioinformatics, 20(1), 492. ISSN 1471-2105, doi:10.1186/s12859-019-3110-0 https://doi.org/10.1186/s12859-019-3110-0.

  19. Kowarik A, Templ M (2016). “Imputation with the R Package VIM.” Journal of Statistical Software, 74, 1-16. ISSN 1548-7660, doi:10.18637/jss.v074.i07 https://doi.org/10.18637/jss.v074.i07.

  20. Kumar N, Hoque MA, Sugimoto M (2021). “Kernel Weighted Least Square Approach for Imputing Missing Values of Metabolomics Data.” Scientific Reports, 11(1), 11108. ISSN 2045-2322, doi:10.1038/s41598-021-90654-0 https://doi.org/10.1038/s41598-021-90654-0.

  21. Kumar N, Hoque MA, Shahjaman M, Islam SS, Mollah MNH (2018). “A New Approach of Outlier-robust Missing Value Imputation for Metabolomics Data Analysis.” Current Bioinformatics, 14(1), 43-52. ISSN 15748936, doi:10.2174/1574893612666171121154655 https://doi.org/10.2174/1574893612666171121154655.

  22. Lazar C, Burger T, Wieczorek S (2022). “imputeLCMD: A Collection of Methods for Left-Censored Missing Data Imputation.”

  23. Lee JY, Styczynski MP (2018). “NS-kNN: A Modified k-Nearest Neighbors Approach for Imputing Metabolomics Data.” Metabolomics, 14(12), 153. ISSN 1573-3882, 1573-3890, doi:10.1007/s11306-018-1451-8 https://doi.org/10.1007/s11306-018-1451-8.

  24. Li Q, Fisher K, Meng W, Fang B, Welsh E, Haura EB, Koomen JM, Eschrich SA, Fridley BL, Chen YA (2020). “GMSimpute: A Generalized Two-Step Lasso Approach to Impute Missing Values in Label-Free Mass Spectrum Analysis.” Bioinformatics, 36(1), 257-263. ISSN 1367-4803, 1367-4811, doi:10.1093/bioinformatics/btz488 https://doi.org/10.1093/bioinformatics/btz488.

  25. Liaw A, Wiener M (2002). “Classification and Regression by randomForest.” R News, 2(3), 18-22.

  26. Ma W, Kim S, Chowdhury S, Li Z, Yang M, Yoo S, Petralia F, Jacobsen J, Li JJ, Ge X, Li K, Yu T, Calinawan AP, Edwards N, Payne SH, Boutros PC, Rodriguez H, Stolovitzky G, Zhu J, Kang J, Fenyo D, Saez-Rodriguez J, Wang P (2021). “DreamAI: Algorithm for the Imputation of Proteomics Data.” doi:10.1101/2020.07.21.214205 https://doi.org/10.1101/2020.07.21.214205.

  27. Hastie T, Mazumder R (2021). “softImpute: Matrix Completion via Iterative Soft-Thresholded SVD.”

  28. Miller HA, Emam R, Lynch CM, Bockhorst S, Frieboes HB (2021). “Discrepancies in Metabolomic Biomarker Identification from Patient-Derived Lung Cancer Revealed by Combined Variation in Data Pre-Treatment and Imputation Methods.” Metabolomics, 17(4), 37. ISSN 1573-3882, 1573-3890, doi:10.1007/s11306-021-01787-2 https://doi.org/10.1007/s11306-021-01787-2.

  29. NishithPaul (2021). “NishithPaul/missingImputation.”

  30. Razzaghi T, Roderick O, Safro I, Marko N (2016). “Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values.” PLOS ONE, 11(5), e0155119. ISSN 1932-6203, doi:10.1371/journal.pone.0155119 https://doi.org/10.1371/journal.pone.0155119.

  31. Samad MD, Abrar S, Diawara N (2022). “Missing Value Estimation Using Clustering and Deep Learning within Multiple Imputation Framework.” Knowledge-Based Systems, 249, 108968. ISSN 09507051, doi:10.1016/j.knosys.2022.108968 https://doi.org/10.1016/j.knosys.2022.108968.

  32. Shah J, Brock GN, Gaskins J (2019). “BayesMetab: Treatment of Missing Values in Metabolomic Studies Using a Bayesian Modeling Approach.” BMC Bioinformatics, 20(S24), 673. ISSN 1471-2105, doi:10.1186/s12859-019-3250-2 https://doi.org/10.1186/s12859-019-3250-2.

  33. Shah JS, Rai SN, DeFilippis AP, Hill BG, Bhatnagar A, Brock GN (2017). “Distribution Based Nearest Neighbor Imputation for Truncated High Dimensional Data with Applications to Pre-Clinical and Clinical Metabolomics Studies.” BMC Bioinformatics, 18(1), 114. ISSN 1471-2105, doi:10.1186/s12859-017-1547-6 https://doi.org/10.1186/s12859-017-1547-6.

  34. Shahjaman M, Rahman MR, Islam T, Auwul MR, Moni MA, Mollah MNH (2021). “rMisbeta: A Robust Missing Value Imputation Approach in Transcriptomics and Metabolomics Data.” Computers in Biology and Medicine, 138, 104911. ISSN 00104825, doi:10.1016/j.compbiomed.2021.104911 https://doi.org/10.1016/j.compbiomed.2021.104911.

  35. Stacklies W, Redestig H, Scholz M, Walther D, Selbig J (2007). “pcaMethods—a Bioconductor Package Providing PCA Methods for Incomplete Data.” Bioinformatics, 23(9), 1164-1167. ISSN 1367-4803, doi:10.1093/bioinformatics/btm069 https://doi.org/10.1093/bioinformatics/btm069.

  36. Stekhoven DJ, Bühlmann P (2012). “MissForest—Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics, 28(1), 112-118. ISSN 1367-4803, doi:10.1093/bioinformatics/btr597 https://doi.org/10.1093/bioinformatics/btr597.

  37. Su Y, Gelman A, Hill J, Yajima M (2011). “Multiple Imputation with Diagnostics (Mi) in R: Opening Windows into the Black Box.” Journal of Statistical Software, 45, 1-31. ISSN 1548-7660, doi:10.18637/jss.v045.i02 https://doi.org/10.18637/jss.v045.i02.

  38. Taylor S, Ponzini M, Wilson M, Kim K (2022). “Comparison of Imputation and Imputation-Free Methods for Statistical Analysis of Mass Spectrometry Data with Missing Data.” Briefings in Bioinformatics, 23(1), bbab353. ISSN 1467-5463, 1477-4054, doi:10.1093/bib/bbab353 https://doi.org/10.1093/bib/bbab353.

  39. Taylor SL, Ruhaak LR, Kelly K, Weiss RH, Kim K (2016). “Effects of Imputation on Correlation: Implications for Analysis of Mass Spectrometry Data from Multiple Biological Matrices.” Briefings in Bioinformatics, bbw010. ISSN 1467-5463, 1477-4054, doi:10.1093/bib/bbw010 https://doi.org/10.1093/bib/bbw010.

  40. Varga TV, Westergaard D (2020). “missCompare: Intuitive Missing Data Imputation Framework.”

  41. Wei R, Wang J, Jia E, Chen T, Ni Y, Jia W (2018). “GSimp: A Gibbs Sampler Based Left-Censored Missing Value Imputation Approach for Metabolomics Studies.” PLOS Computational Biology, 14(1), e1005973. ISSN 1553-7358, doi:10.1371/journal.pcbi.1005973 https://doi.org/10.1371/journal.pcbi.1005973.

  42. Wei R, Wang J, Su M, Jia E, Chen S, Chen T, Ni Y (2018). “Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.” Scientific Reports, 8(1), 663. ISSN 2045-2322, doi:10.1038/s41598-017-19120-0 https://doi.org/10.1038/s41598-017-19120-0.

  43. Wilson MD, Ponzini MD, Taylor SL, Kim K (2022). “Imputation of Missing Values for Multi-Biospecimen Metabolomics Studies: Bias and Effects on Statistical Validity.” Metabolites, 12(7), 671. ISSN 2218-1989, doi:10.3390/metabo12070671 https://doi.org/10.3390/metabo12070671.

  44. Xu D, Hu PJ, Huang T, Fang X, Hsu C (2020). “A Deep Learning–Based, Unsupervised Method to Impute Missing Values in Electronic Health Records for Improved Patient Management.” Journal of Biomedical Informatics, 111, 103576. ISSN 15320464, doi:10.1016/j.jbi.2020.103576 https://doi.org/10.1016/j.jbi.2020.103576.

  45. Xu J, Wang Y, Xu X, Cheng K, Raftery D, Dong J (2021). “NMF-Based Approach for Missing Values Imputation of Mass Spectrometry Metabolomics Data.” Molecules, 26(19), 5787. ISSN 1420-3049, doi:10.3390/molecules26195787 https://doi.org/10.3390/molecules26195787.