Active Effects Selection which Considers Heredity Principle in Multi-Factor Experiment Data Analysis

  • Bagus Sartono Bogor Agricultural University (IPB)
  • Achmad Syaiful Bogor Agricultural University (IPB)
  • Dian Ayuningtyas Bogor Agricultural University (IPB)
  • Farit Mochamad Afendi Bogor Agricultural University (IPB)
  • Rahma Anisa Bogor Agricultural University (IPB)
  • Agus Salim La Trobe University
Keywords: Factorial Experiments, Genetic Algorithm, Heredity Principle, Variable Selection

Abstract

The sparsity principle suggests that only a small number of effects contribute significantly to the response variable of an experiment. Researchers therefore need an efficient selection procedure to identify those active effects. Most procedures found in the literature treat each effect as an individual entity, so the selection process operates on individual effects. Another principle to consider in experimental data analysis is the heredity principle, which allows an interaction effect to be included in the model only if its corresponding main effects are also in the model. This paper addresses the selection problem while taking the heredity principle into account, as Yuan et al. (2007) did using least angle regression (LARS). Instead of selecting effects individually, the proposed approach performs the selection in groups. The advantage of the proposed approach, which uses a genetic algorithm, is that it lets the analyst determine the desired number of effects, which the LARS approach cannot.
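The heredity constraint and the idea of selecting effects in groups can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the strong form of heredity (both parent main effects must be present), restricts attention to two-factor interactions, and uses a hypothetical encoding in which each candidate group is either a single main effect or an interaction bundled with its two parents. The names `build_groups`, `decode`, and `satisfies_strong_heredity` are illustrative only.

```python
from itertools import combinations


def build_groups(factors):
    """Build candidate groups (an assumed encoding, not taken from the paper).

    Each main effect forms its own group; each two-factor interaction is
    grouped together with both of its parent main effects.
    """
    groups = [frozenset([(f,)]) for f in factors]
    for a, b in combinations(factors, 2):
        groups.append(frozenset([(a,), (b,), (a, b)]))
    return groups


def decode(chromosome, groups):
    """Turn a 0/1 chromosome over groups into the set of selected effects."""
    model = set()
    for bit, group in zip(chromosome, groups):
        if bit:
            model |= group
    return model


def satisfies_strong_heredity(model):
    """Check that every interaction has all of its parent main effects."""
    return all(
        all((f,) in model for f in term)
        for term in model
        if len(term) > 1
    )


factors = ["A", "B", "C"]
groups = build_groups(factors)

# Select the main-effect group for C and the interaction group for A:B.
chromosome = [0, 0, 1, 1, 0, 0]
model = decode(chromosome, groups)
```

Because every interaction enters the model together with its parents, any chromosome decoded this way satisfies the heredity check by construction, whereas a model such as `{("A",), ("A", "B")}` built effect by effect would fail it. The size of the decoded model, `len(model)`, is the kind of quantity a genetic algorithm's fitness function could use to steer the search toward a desired number of effects.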

References

Aalaei, S., Shahraki, H., Rowhanimanesh, A., and Eslami, S. (2016). Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets. Iranian Journal of Basic Medical Sciences, 19:476-482.

Algamal, Z. Y. (2019). Variable selection in count data regression model based on firefly algorithm. Statistics, Optimization & Information Computing, 7:520-529.

Asadzadeh, L. and Zamanifar, K. (2010). An agent-based parallel approach for the job shop scheduling problem with genetic algorithms. Mathematical and Computer Modelling, 52:1957-1965.

Broadhurst, D., Goodacre, R., Jones, A., Rowland, J. J., and Kell, D. B. (1997). Genetic algorithms as a method for variable selection in multiple linear regression and partial least squares regression, with applications to pyrolysis mass spectrometry. Analytica Chimica Acta, 348:71-86.

Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32:407-499.

Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96:1348-1360.

Georgiou, S. (2014). Supersaturated designs: A review of their construction and analysis. Journal of Statistical Planning and Inference, 144:92-109.

Hamada, M. and Wu, C. F. J. (1992). Analysis of designed experiments with complex aliasing. Journal of Quality Technology, 24:130-137.

Lesiak, P. and Bojarczyk, P. (2015). Application of genetic algorithms in design of public transport network. Logistics and Transport, 52:75-81.

Meier, L., van de Geer, S., and Bühlmann, P. (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society. Series B (methodological), 70:53-71.

Raghavarao, D. and Altan, S. (2003). A heuristic analysis of highly fractionated 2^n factorial experiments. Metrika, 156:185-191.

Rais, F., Kamoun, A., Chaabouni, M., Claeys-Bruno, M., Phan-Tan-Luu, R., and Sergent, M. (2009). Supersaturated design for screening factors influencing the preparation of sulfated amides of olive pomace oil fatty acids. Chemometrics and Intelligent Laboratory Systems, 99:71-78.

Rawlings, J., Pantula, S., and Dickey, D. A. (1998). Applied Regression Analysis: A Research Tool, Second Edition. Springer.

Schoen, E. D., Eendebak, P. T., and Nguyen, M. V. M. (2010). Complete enumeration of pure-level and mixed-level orthogonal arrays. Journal of Combinatorial Designs, 18:123-140.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (methodological), 58:267-288.

Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., and Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society. Series B (methodological), 67:91-108.

Umbarkar, A. and Sheth, P. (2015). Crossover operators in genetic algorithms: a review. ICTACT Journal on Soft Computing, 6:1083-1092.

Vafaie, H. and De Jong, K. (1992). Genetic algorithms as a tool for feature selection in machine learning. In Proceedings of the 4th International Conference on Tools with Artificial Intelligence.

Vandewater, L., Brusic, V., Wilson, W., Macaulay, L., and Zhang, P. (2015). An adaptive genetic algorithm for selection of blood-based biomarkers for prediction of Alzheimer's disease progression. BMC Bioinformatics, 16:1-10.

Wu, C. F. J. and Hamada, M. (2000). Experiments: Planning, Analysis and Parameter Design Optimization. Wiley, New York.

Yang, J. and Honavar, V. (1997). Feature subset selection using a genetic algorithm. Computer Science Technical Reports, 156.

Yuan, M., Joseph, V. R., and Lin, Y. (2007). An efficient variable selection approach for analyzing designed experiments. Technometrics, 49:430-438.

Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society. Series B (methodological), 68:49-67.

Zelenkov, Y., Fedorova, E., and Chekrizov, D. (2017). Two-step classification method based on genetic algorithm for bankruptcy forecasting. Expert Systems with Applications, 88:393-401.

Published
2020-05-27
How to Cite
Sartono, B., Syaiful, A., Ayuningtyas, D., Afendi, F. M., Anisa, R., & Salim, A. (2020). Active Effects Selection which Considers Heredity Principle in Multi-Factor Experiment Data Analysis. Statistics, Optimization & Information Computing, 8(2), 414-424. https://doi.org/10.19139/soic-2310-5070-628
Section
Research Articles