AN ALGORITHM FOR DERIVING COMBINATORIAL BIOMARKERS BASED ON RIDGE REGRESSION

Maxim Alexandrovich Terpilowski, Dr.
Ekaterina A. Korf
Richard Owen Jenkins, Dr.
Nikolay V. Goncharov, Dr.

Abstract

Motivation: Combinatorial biomarkers are considered more specific and sensitive than single markers in medical diagnostics and prediction, yet detecting them requires deep computational analysis. The principles of analytic combinatorics, linear and kernel ridge regression, and machine learning were applied to derive new combinatorial biomarkers of muscle damage.


Results: Lactate, phosphate, and middle-chain fatty acids were the biochemical parameters most often included in combinatorial markers, while the most prevalent physiological parameters were muscle isometric strength, H-reflex length, and contraction tone. Several strongly correlated combinatorial biomarkers of muscle damage with high prediction accuracy scores were identified. The approach, based on computational methods, regression algorithms, and machine learning, provides a flexible, platform-independent, and highly extensible means of discovering and evaluating combinatorial biomarkers alongside current diagnostic tools.


Availability: The algorithm was implemented in the Python programming language and applied to a quantitative dataset comprising 23 biochemical parameters, 37 physiological parameters, and 3,903 observations. The algorithm and the dataset are available free of charge on GitHub.
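
Illustrative sketch: the general idea of scoring parameter combinations with ridge regression might look roughly like the following. This is not the authors' implementation, which is in their GitHub repository and may differ substantially. The function names (ridge_fit, rank_combinations), the choice of fixed-size parameter subsets as "combinations", the single train/test split, the regularization strength, and the synthetic data are all assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form linear ridge regression on centered data.

    Returns (weights, intercept). Hypothetical helper, not from the paper.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rank_combinations(X, y, names, size=2, alpha=1.0, test_fraction=0.25, seed=0):
    """Score every subset of `size` parameters by held-out R^2, best first."""
    order = np.random.default_rng(seed).permutation(len(y))
    n_test = int(len(y) * test_fraction)
    test, train = order[:n_test], order[n_test:]
    results = []
    for cols in combinations(range(X.shape[1]), size):
        cols = list(cols)
        w, b = ridge_fit(X[np.ix_(train, cols)], y[train], alpha)
        pred = X[np.ix_(test, cols)] @ w + b
        results.append((r2_score(y[test], pred), tuple(names[c] for c in cols)))
    return sorted(results, reverse=True)


# Synthetic stand-in data (assumption): the target depends on p0 and p3 only.
rng = np.random.default_rng(42)
names = ["p0", "p1", "p2", "p3", "p4"]
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=200)

ranking = rank_combinations(X, y, names)
best_score, best_combo = ranking[0]
print(best_combo, round(float(best_score), 3))
```

On this synthetic data the pair that actually generates the target ranks first. With a real dataset, cross-validation and a search over the regularization strength would likely be preferable to the single split used here.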


Supplementary information: Supplementary data are available at Journal of Bioinformatics and Genomics online.

Article Details

How to Cite
TERPILOWSKI, Maxim Alexandrovich et al. AN ALGORITHM FOR DERIVING COMBINATORIAL BIOMARKERS BASED ON RIDGE REGRESSION. Journal of Bioinformatics and Genomics, [S.l.], n. 1 (6), feb. 2018. ISSN 2530-1381. Available at: <http://journal-biogen.org/article/view/82>. Date accessed: 24 may 2018. doi: http://dx.doi.org/10.18454/jbg.2018.1.6.2.
Section
Research in Biology using computation