AN ALGORITHM FOR DERIVING COMBINATORIAL BIOMARKERS BASED ON RIDGE REGRESSION


Maxim Terpilowski
Ekaterina Korf
Richard Jenkins
Nikolay Goncharov

Abstract

Motivation: Combinatorial biomarkers are considered more specific and sensitive than single markers in medical diagnostics and prediction, yet even detecting such combinatorial biomarkers requires deep computational analysis. The principles of analytic combinatorics, linear and kernel ridge regression, and machine learning were applied to derive new combinatorial biomarkers of muscle damage.


Results: Lactate, phosphate, and medium-chain fatty acids were most often included in biochemical combinatorial markers, while the following physiological parameters were found to be prevalent: muscle isometric strength, H-reflex length, and contraction tone. Several strongly correlated combinatorial biomarkers of muscle damage with high prediction accuracy scores were identified. The approach, based on computational methods, regression algorithms and machine learning, provides a flexible, platform-independent and highly extendable means of discovering and evaluating combinatorial biomarkers alongside current diagnostic tools.


Availability: The developed algorithm was implemented in the Python programming language and applied to a quantitative dataset comprising 23 biochemical parameters, 37 physiological parameters and 3,903 observations. The algorithm and our dataset are available free of charge on GitHub.
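
The general idea described in the abstract, enumerating combinations of parameters and scoring each with ridge regression, can be illustrated with a minimal sketch. This is not the authors' implementation (which is the one published on GitHub); it is an assumption-laden toy version that uses synthetic data, a closed-form linear ridge fit written with NumPy, and cross-validated R² as the accuracy score. The function names `ridge_fit`, `cv_r2` and `rank_combinations`, the marker names `m0` to `m4`, and the regularization strength `alpha=1.0` are all hypothetical. Kernel ridge regression is not covered here.

```python
from itertools import combinations

import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge fit on centered data: w = (X'X + alpha*I)^-1 X'y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ w
    return w, intercept


def cv_r2(X, y, alpha=1.0, folds=5):
    """Mean R^2 over contiguous k-fold splits (rows assumed already shuffled)."""
    idx = np.arange(len(y))
    scores = []
    for test in np.array_split(idx, folds):
        train = np.setdiff1d(idx, test)
        w, b = ridge_fit(X[train], y[train], alpha)
        resid = y[test] - (X[test] @ w + b)
        total = y[test] - y[test].mean()
        scores.append(1.0 - (resid @ resid) / (total @ total))
    return float(np.mean(scores))


def rank_combinations(X, y, names, size=2, alpha=1.0):
    """Score every subset of `size` parameters and return them best first."""
    results = []
    for cols in combinations(range(X.shape[1]), size):
        score = cv_r2(X[:, list(cols)], y, alpha)
        results.append((tuple(names[c] for c in cols), score))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Synthetic stand-in data (assumption): 200 observations, 5 candidate markers,
# with the target depending only on markers m1 and m3 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + 0.1 * rng.normal(size=200)
names = [f"m{i}" for i in range(5)]

ranked = rank_combinations(X, y, names, size=2)
for combo, score in ranked[:3]:
    print(combo, round(score, 3))
```

Exhaustive enumeration grows combinatorially with the number of parameters and the subset size, so a sketch like this is only practical for small subsets; a real analysis on the 60 parameters mentioned above would likely need the search and validation strategy of the published code.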


Supplementary information: Supplementary data are available at Journal of Bioinformatics and Genomics online.


Article Details

How to Cite
Terpilowski, M., Korf, E., Jenkins, R., & Goncharov, N. (2018). AN ALGORITHM FOR DERIVING COMBINATORIAL BIOMARKERS BASED ON RIDGE REGRESSION. JOURNAL OF BIOINFORMATICS AND GENOMICS, 1(6). https://doi.org/10.18454/jbg.2018.1.6.2
Section
Research in Biology using computation
