Application of Wavelets to Improve Cancer Diagnosis Model in High Dimensional Linguistic DNA Microarray Datasets

Document Type : Research Paper


Department of Computer Science, Faculty of Basic Sciences, Kosar University of Bojnord, Bojnord, Iran.



DNA microarray datasets suffer from scaling and uncertainty problems. This paper develops a model that handles these challenges more precisely by combining wavelet decomposition with fuzzy numbers. The proposed method classifies linguistic DNA microarray datasets, in which the data are given as linguistic genes. Linguistic genes are represented by triangular fuzzy numbers expressed as LR (left-right) fuzzy numbers, and the WABL method is then applied for defuzzification. In addition, a set of orthogonal wavelet detail coefficients, obtained by wavelet decomposition at different levels, is extracted to identify the localized genes of the DNA microarray datasets. The method is evaluated on three DNA microarray datasets. The experiments show that the proposed model achieves better diagnostic accuracy than other methods.
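
The two steps named in the abstract, defuzzification of triangular fuzzy numbers and extraction of wavelet detail coefficients, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual WABL form (a weighted average of the left and right alpha-cut bounds over all levels) with a level-weight density p(alpha) = (k + 1) * alpha^k and side weights c_left + c_right = 1, and it uses the Haar wavelet as one example of an orthogonal wavelet, since the abstract does not name the wavelet family. The function names, parameters, and sample values are hypothetical.

```python
import math


def wabl_triangular(l, m, r, c_left=0.5, k=1.0):
    """WABL-style defuzzification of a triangular fuzzy number (l, m, r).

    Assumption: level-weight density p(alpha) = (k + 1) * alpha**k and
    side weights c_left + c_right = 1. Both are illustrative choices.
    """
    if not l <= m <= r:
        raise ValueError("expected l <= m <= r")
    c_right = 1.0 - c_left
    w = (k + 1.0) / (k + 2.0)      # integral of alpha * p(alpha) over [0, 1]
    left_mean = l + (m - l) * w    # weighted mean of the left alpha-cut bound
    right_mean = r - (r - m) * w   # weighted mean of the right alpha-cut bound
    return c_left * left_mean + c_right * right_mean


def haar_details(signal, levels):
    """Return Haar detail coefficients for each level, finest level first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("length must be even at every level")
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail = scaled difference, approximation = scaled sum of each pair.
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


# Hypothetical sample: four linguistic gene values given as (l, m, r) triples.
linguistic_sample = [
    (0.0, 1.0, 2.0),
    (1.0, 2.0, 3.0),
    (4.0, 5.0, 6.0),
    (2.0, 3.0, 4.0),
]

# Step 1: defuzzify every linguistic gene into a crisp expression value.
crisp = [wabl_triangular(*triple) for triple in linguistic_sample]

# Step 2: extract detail coefficients at two decomposition levels.
features = haar_details(crisp, levels=2)
```

Here `features` holds one list of detail coefficients per decomposition level, which could then be passed to a classifier. For real microarray data, a wavelet library would normally replace the hand-written Haar transform.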

