Statistics, Optimization & Information Computing
http://iapress.org/index.php/soic
<p><em><strong>Statistics, Optimization and Information Computing</strong></em> (SOIC) is an international refereed journal dedicated to the latest advances in statistics, optimization and applications in information sciences. Topics of interest include (but are not limited to): </p> <p>Statistical theory and applications</p> <ul> <li class="show">Statistical computing, Simulation and Monte Carlo methods, Bootstrap, Resampling methods, Spatial Statistics, Survival Analysis, Nonparametric and semiparametric methods, Asymptotics, Bayesian inference and Bayesian optimization</li> <li class="show">Stochastic processes, Probability, Statistics and applications</li> <li class="show">Statistical methods and modeling in life sciences including biomedical sciences, environmental sciences and agriculture</li> <li class="show">Decision Theory, Time series analysis, High-dimensional multivariate integrals, statistical analysis in market, business, finance, insurance, economic and social science, etc.</li> </ul> <p> Optimization methods and applications</p> <ul> <li class="show">Linear and nonlinear optimization</li> <li class="show">Stochastic optimization, Statistical optimization and Markov chains, etc.</li> <li class="show">Game theory, Network optimization and combinatorial optimization</li> <li class="show">Variational analysis, Convex optimization and nonsmooth optimization</li> <li class="show">Global optimization and semidefinite programming</li> <li class="show">Complementarity problems and variational inequalities</li> <li class="show"><span lang="EN-US">Optimal control: theory and applications</span></li> <li class="show">Operations research, Optimization and applications in management science and engineering</li> </ul> <p>Information computing and machine intelligence</p> <ul> <li class="show">Machine learning, Statistical learning, Deep learning</li> <li class="show">Artificial intelligence, Intelligent computation, Intelligent control and optimization</li> <li 
class="show">Data mining, Data analysis, Cluster computing, Classification</li> <li class="show">Pattern recognition, Computer vision</li> <li class="show">Compressive sensing and sparse reconstruction</li> <li class="show">Signal and image processing, Medical imaging and analysis, Inverse problems and imaging sciences</li> <li class="show">Genetic algorithms, Natural language processing, Expert systems, Robotics, Information retrieval and computing</li> <li class="show">Numerical analysis and algorithms with applications in computer science and engineering</li> </ul>
International Academic Press | en-US | Statistics, Optimization & Information Computing | ISSN 2311-004X
<span>Authors who publish with this journal agree to the following terms:</span><br /><br /><ol type="a"><li>Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a <a href="http://creativecommons.org/licenses/by/3.0/" target="_new">Creative Commons Attribution License</a> that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.</li><li>Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., posting it to an institutional repository or publishing it in a book), with an acknowledgement of its initial publication in this journal.</li><li>Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their websites) prior to and during the submission process, as this can lead to productive exchanges, as well as earlier and greater citation of published work (see <a href="http://opcit.eprints.org/oacitation-biblio.html" target="_new">The Effect of Open Access</a>).</li></ol>
Rao-Robson-Nikulin Goodness-of-fit Test Statistic for Censored and Uncensored Real Data with Classical and Bayesian Estimation
http://iapress.org/index.php/soic/article/view/1710
<p>In this work, we provide a new Pareto type-II extension for censored and uncensored real-life data. With an emphasis on the applied aspects of the model, some mathematical properties of the new distribution are derived without excess. A variety of classical methods, together with the Bayes method, are used to estimate the parameters of the new distribution. Maximum likelihood estimation under censoring is also derived. Using Pitman's proximity criterion, likelihood estimation and Bayesian estimation are contrasted. Three loss functions, namely the generalized quadratic, the Linex, and the entropy loss functions, are used to derive the Bayesian estimators. All the estimation techniques provided are evaluated through simulation studies. The BB algorithm is used to compare the censored maximum likelihood method with the Bayesian approach. With the aid of two applications and a simulation study, the construction of the Rao-Robson-Nikulin (RRN) statistic for the new model in the uncensored case is explained in detail. Additionally, the development of the Rao-Robson-Nikulin statistic for the novel model under censoring is illustrated using data from two censored applications and a simulation study.</p>Salwa L. AlKhayyat, Haitham M. Yousof, Hafida Goual, Talhi Hamida, Mohamed S. Hamed, Aiachi Hiba, Mohamed Ibrahim
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-02-24 | 13(6), 2205-2225 | 10.19139/soic-2310-5070-1710
The New Topp-Leone-Type II Exponentiated Half Logistic-Marshall-Olkin-G Family of Distributions with Applications
http://iapress.org/index.php/soic/article/view/1872
<p>In this paper, we propose a new family of generalized distributions called the Topp-Leone Type II Exponentiated Half Logistic-Marshall-Olkin-G (TL-TIIEHL-MO-G) distribution. The new distribution can be expressed as an infinite linear combination of the exponentiated-G family of distributions. Some special models of the new family of distributions are explored. Statistical properties including the quantile function, ordinary and incomplete moments, stochastic orders, probability weighted moments, the distribution of the order statistics, and Rényi entropy are presented. The maximum likelihood method is used for estimating the model parameters, and a Monte Carlo simulation is conducted to examine the performance of the model. The flexibility and importance of the new family of distributions are demonstrated by means of applications to censored and complete real data sets.</p>Broderick Oluyede, Gomolemo Jacqueline Lekono, Lesego Gabaitiri
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-19 | 13(6), 2226-2263 | 10.19139/soic-2310-5070-1872
A New Left Truncated Distribution for Modeling Failure Time Data: Estimation, Robustness Study and Application
http://iapress.org/index.php/soic/article/view/2056
<p>Truncation arises in many practical situations, such as epidemiology, material science, psychology, the social sciences, and statistics, where one wants to study data that lie above or below a certain threshold or within a specified range. Left truncation occurs when observations below a given threshold are not present in the sample. It commonly arises in employment, engineering, hydrology, insurance, reliability studies, survival analysis, etc. In this article, we develop and analyze a new left truncated distribution by truncating an asymmetric and heavy-tailed distribution, namely the Esscher transformed Laplace distribution, from the left, so that the resulting distribution lies within (b,$\infty$). Various distributional and reliability properties of the proposed distribution are investigated. A real data analysis is carried out using failure time data.</p>Krishnakumari K, Dais George
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-01 | 13(6), 2264-2277 | 10.19139/soic-2310-5070-2056
Discrimination between quantile regression models for bounded data
http://iapress.org/index.php/soic/article/view/2133
<p>Most often, when we use the term `bounded', we mean a response variable with inherent upper and lower boundaries; for instance, a proportion, or a strictly positive quantity such as income. This constraint has implications for the type of model to be used, since most traditional linear models may not respect these boundaries. Parametric quantile regression for bounded data thus provides a framework for analyzing and interpreting how a predictor of interest influences the response variable over different quantiles while respecting the bounds of the assumed distribution. In this paper, several parametric quantile regression models are explored and their performance is investigated under several conditions. Our Monte Carlo simulation results suggest that some of these parametric quantile regression models can bring significant improvement relative to other existing models under certain conditions.</p>Alla Abdul AlSattar Hammodat, Zainab Tawfiq Hamid, Zakariya Yahya Algamal
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-18 | 13(6), 2278-2293 | 10.19139/soic-2310-5070-2133
Optimizing Automobile Insurance Pricing: A Generalized Linear Model Approach to Claim Frequency and Severity
http://iapress.org/index.php/soic/article/view/2157
<p>Morocco's insurance sector, particularly auto insurance, is experiencing significant growth despite economic challenges. To remain competitive, companies must innovate and adjust their pricing to meet customer expectations and strengthen their market position. Traditionally, actuaries have used the linear model to assess the impact of explanatory variables on the frequency and severity of claims. However, this model has limitations and does not always accurately reflect the reality of claims or costs, especially in auto insurance. Our study adopts the generalized linear model (GLM) to address these shortcomings, enabling a more precise statistical analysis that better aligns with market realities. This paper examines the application of GLMs to model the total claim burden of an automobile portfolio and establish an optimal rate. The steps include data processing and analysis, segmentation of rating variables, and the selection of appropriate distributions using statistical tests such as the Wald test and the deviance test, all performed using SAS software.</p>Mekdad Slime, Abdellah Ould Khal, Abdelhak Zoglat, Mohammed El Kamli, Brahim Batti
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-04-03 | 13(6), 2294-2315 | 10.19139/soic-2310-5070-2157
Advanced Parameter Estimation for the Gompertz-Makeham Process: A Comparative Study of MMLE, PSO, CS, and Bayesian Methods
http://iapress.org/index.php/soic/article/view/2167
<p>This study investigates how to estimate the parameters of the Gompertz-Makeham process (GMP) within non-homogeneous Poisson processes (NHPP). Modified Maximum Likelihood Estimation (MMLE) is developed as an improvement over standard Maximum Likelihood Estimation (MLE) to resolve parameter estimation accuracy issues. The study combines artificial intelligence optimization, through particle swarm optimization (PSO) and cuckoo search (CS), with Bayesian estimation to assess the different methods. MMLE, PSO, CS, and the Bayesian method are evaluated using the Root Mean Square Error (RMSE), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC) as statistical accuracy measures in a simulation analysis. MMLE delivers better estimation precision than PSO, CS, and the Bayesian method in this assessment. The methodology is validated by modeling operational failures at the Badoush Cement Factory and COVID-19 case occurrences in Italy, showing its capability to model failure rates alongside event occurrences. The research advances NHPP statistical estimation methods, providing a stronger analytical platform for reliability monitoring, survival model prediction, and epidemiological projection. Future research on the GMP should focus on including time-dependent elements and structural dependency mechanisms to enhance the model's capability and predictive power.</p>Adel S. Hussain, Muthanna Subhi Sulaiman, Sura Mohamed Hussein, Emad A. Az-Zo’bi, Mohammad Tashtoush
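To make the process being estimated concrete: an NHPP with a Gompertz-Makeham-type intensity can be simulated by Lewis-Shedler thinning, which is the standard way to generate the event data such estimators are fitted to. The intensity form and parameter values below are illustrative assumptions, not those fitted in the paper.

```python
import numpy as np

def simulate_nhpp_thinning(intensity, T, lam_max, seed=0):
    """Simulate an NHPP on [0, T] by Lewis-Shedler thinning,
    given an upper bound lam_max on the intensity over [0, T]."""
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)   # candidate from a homogeneous PP
        if t > T:
            break
        if rng.uniform() < intensity(t) / lam_max:   # accept w.p. lambda(t)/lam_max
            events.append(t)
    return np.array(events)

# Gompertz-Makeham-type intensity lambda(t) = lam + a*exp(b*t)
# (illustrative parameters; increasing, so intensity(T) bounds it on [0, T])
lam, a, b = 0.5, 0.2, 0.3
intensity = lambda t: lam + a * np.exp(b * t)
T = 10.0
events = simulate_nhpp_thinning(intensity, T, lam_max=intensity(T))
```

The likelihood that MMLE or the Bayesian methods would maximize is then built from these event times and the integrated intensity over [0, T].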
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-06 | 13(6), 2316-2338 | 10.19139/soic-2310-5070-2167
Bayesian accelerated life testing models for the log-normal and gamma distributions under dual-stresses
http://iapress.org/index.php/soic/article/view/2293
<p>In this paper, a Bayesian approach to accelerated life testing models with two stressors is presented. Lifetimes are assumed to follow either a log-normal distribution or a gamma distribution, which have been mostly overlooked in the Bayesian literature when considering multiple stressors. The generalized Eyring relationship is used as the time transformation function, which allows for the use of one thermal stressor and one non-thermal stressor. Due to the mathematically intractable posteriors of these models, Markov chain Monte Carlo methods are utilized to obtain posterior samples on which to base inference. The models are applied to a real dataset, where model comparison metrics are calculated and estimates are provided of the model parameters, predictive reliability, and mean time to failure. The robustness of the models is also investigated in terms of the prior specification.</p>Neill Smit
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-21 | 13(6), 2339-2352 | 10.19139/soic-2310-5070-2293
A Novel Fréchet-Poisson Model: Properties, Applications under Extreme Reliability Data, Different Estimation Methods and Case Study on Strength-Stress Reliability Analysis
http://iapress.org/index.php/soic/article/view/2463
<p>A new compound extension of the Fréchet distribution is introduced and studied. Some of its properties, including moments, incomplete moments, probability weighted moments, the moment generating function, the stress-strength reliability model, and the residual life and reversed residual life functions, are derived. The mean squared errors (MSEs) of several estimation methods, including maximum likelihood estimation (MLE), Cramér-von Mises (CVM) estimation, bootstrap (Boot.) estimation, and the Kolmogorov estimation (KE) method, are compared for the unknown parameter via a simulation study. Two real applications are presented for comparing the estimation methods. Another two real applications are presented for comparing the competitive models. The nonparametric Hill estimator of the tail index (TIx) of the new model is computed for the breaking stress of carbon fibers data. Finally, a case study on reliability analysis of composite materials for aerospace applications is presented.</p>Mohamed Ibrahim, S. I. Ansari, Abdullah H. Al-Nefaie, Ahmad M. AboAlkhair, Mohamed S. Hamed, Haitham M. Yousof
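Two of the ingredients above, maximum likelihood fitting of a Fréchet law and the Hill tail-index estimator, can be sketched on simulated data. SciPy's `invweibull` is the two-parameter Fréchet distribution; the sample size and shape value below are arbitrary choices, and the paper's compound Fréchet-Poisson model itself is not implemented here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
true_shape = 2.5                     # Frechet shape = tail index
sample = stats.invweibull.rvs(true_shape, size=2000, random_state=rng)

# MLE with the location pinned at 0 (scipy's invweibull is the Frechet law)
shape_hat, loc_hat, scale_hat = stats.invweibull.fit(sample, floc=0)

def hill_estimator(x, k):
    """Hill estimator of the tail index from the k largest observations."""
    xs = np.sort(x)[::-1]
    return 1.0 / np.mean(np.log(xs[:k] / xs[k]))

alpha_hat = hill_estimator(sample, k=200)   # should be near true_shape
```

For a Fréchet sample the survival function decays like x to the power minus the shape, so the Hill estimate and the MLE of the shape target the same quantity, which is the comparison the abstract alludes to.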
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-04-18 | 13(6), 2353-2381 | 10.19139/soic-2310-5070-2463
New parameter of conjugate gradient method for unconstrained nonlinear optimization
http://iapress.org/index.php/soic/article/view/2069
<p>We are interested in the performance of nonlinear conjugate gradient methods for unconstrained optimization. In particular, we address the conjugate gradient algorithm with a strong Wolfe inexact line search. First, we study the descent property of the search direction of the considered conjugate gradient algorithm, based on a new direction obtained from a new parameter. The main objective of this parameter is to improve the speed of convergence of the resulting algorithm. Then, we present a complete study that shows the global convergence of this algorithm. Finally, we carry out comparative numerical experiments on well-known test examples to show the efficiency and robustness of our algorithm compared to other recent algorithms.</p>Mohamed Lamine Ouaoua, Samia Khelladi, Djamel Benterki
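The overall structure of such a method can be sketched as follows, using SciPy's strong-Wolfe line search. The Fletcher-Reeves formula below is only a well-known placeholder for the conjugacy parameter; the paper's new parameter is not reproduced and would replace `beta`.

```python
import numpy as np
from scipy.optimize import line_search

def nonlinear_cg(f, grad, x0, tol=1e-6, max_iter=200):
    """Nonlinear CG with a strong-Wolfe inexact line search.
    The Fletcher-Reeves beta is a placeholder for the paper's
    new conjugacy parameter."""
    x = x0.copy()
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:       # line search failed: restart with steepest descent
            d = -g
            alpha = line_search(f, grad, x, d, gfk=g)[0] or 1e-4
        x = x + alpha * d
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves parameter
        d = -g_new + beta * d
        g = g_new
    return x

# simple convex test problem: f(x) = sum_i i * x_i^2
w = np.arange(1, 6, dtype=float)
f = lambda x: np.sum(w * x**2)
grad = lambda x: 2 * w * x
x_star = nonlinear_cg(f, grad, np.ones(5))
```

The restart to the steepest-descent direction when the line search fails is one common safeguard; the paper's descent and global-convergence analysis concerns exactly when such safeguards are unnecessary.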
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-02-24 | 13(6), 2382-2390 | 10.19139/soic-2310-5070-2069
Blockchain technology for Green Manufacturing: A Systematic Literature Review on applications, drivers, enablers and challenges
http://iapress.org/index.php/soic/article/view/2182
<p>Blockchain technology (BCT) is a promising technology for Industry 4.0, enhancing sustainability, traceability, and resilience for green manufacturing (GM) in the value chain. This literature study evaluates the existing literature on applying BCT in GM industries, with insight into the drivers, enablers, and challenges of BCT. The review is not limited to highlighting the contributions and applications of blockchain to eco-friendly manufacturing; it also takes into account the role of emerging technologies applicable to GM in Industry 4.0. In conducting this review, 113 articles were selected and analyzed in depth using bibliometric and content analysis, based on their contents, year of publication, keywords, methodology used, and the authors' recommendations. The results highlight the connection between BCT and associated technologies, including Artificial Intelligence (AI) and the Internet of Things (IoT), for enhancing GM, accounting for the drivers, enablers, and challenges of implementing BCT in GM.</p> <p>Our literature review reveals that BCT is a promising technology in this context, since it offers two main capabilities, transaction transparency and robustness, which are mandatory for GM implementation. In addition, we conclude that the majority of existing research works focus on only one or two aspects of GM and are restricted to specific industries or use cases, which limits their applicability. Finally, gaps related to standardization, Industry 4.0 implications, and the adoption of BCT were identified during this review.</p>Clement Regis Tuyishime, Asmae Abadi, Chaimae Abadi, Mohammed Abadi
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-29 | 13(6), 2391-2405 | 10.19139/soic-2310-5070-2182
On derivability criteria of h-Convex Functions
http://iapress.org/index.php/soic/article/view/2096
<p>This study pursues two main objectives. First, we generalize the derivability criterion for convex functions, which states that a differentiable function defined on an interval is convex if and only if its first derivative is monotonically increasing on that interval. We extend this concept to 'h-convexity', which generalizes convexity for nonnegative functions by allowing a function h to act on the right-hand side of the convexity inequality.</p> <p>Additionally, we consider the second criterion of convexity, which asserts that a twice-differentiable function on an interval is convex if and only if its second derivative remains non-negative on the entire interval. Our goal is to reinterpret this criterion within the framework of 'h-convexity'. Furthermore, we prove that if a non-zero function defined on the interval [0,1] is non-negative, concave, and bounded above by the identity function, then it fixes the right endpoint of the interval if and only if it is the identity function. Finally, we show, with two counterexamples, that the conjecture given by Mohammad W. Alomari (see [6]) is incorrect.</p>Mousaab Bouafia, Adnan Yassine, Thabet Abdeljawad
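For reference, the inequality the abstract alludes to, with h acting on the right-hand side, is the standard definition of h-convexity (in the sense of Varošanec):

```latex
% A function f : I -> [0, \infty) is h-convex if, for all x, y in I
% and all t in (0, 1),
\[
  f\bigl(t x + (1 - t) y\bigr) \;\le\; h(t)\, f(x) + h(1 - t)\, f(y).
\]
% Taking h(t) = t recovers ordinary convexity of a nonnegative function;
% h(t) = 1 gives P-functions, and h(t) = t^s gives s-convexity.
```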
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-21 | 13(6), 2406-2411 | 10.19139/soic-2310-5070-2096
An Application of Ensemble Stacking in Machine Learning to Predict Short-term Electricity Demand in South Africa
http://iapress.org/index.php/soic/article/view/2170
<p>The massive increase in collected data and the need for data mining and analysis have prompted the need to improve the accuracy and stability of traditional data mining and learning algorithms. This study proposes a robust stacking-ensemble algorithm for predicting the hourly electricity demand in South Africa. The proposed model has two layers: the base models and the meta-model. Four machine learning models, namely the gradient boosting machine (GBM), the deep neural network (DNN), the generalised linear model (GLM), and the random forest (RF), make up the base models. Output from the base models is integrated using ensemble stacking to form the meta-model. The stacking-ensemble (SE) model predicts South Africa's hourly electricity demand. The performance of the models is tested over different forecasting horizons. The prediction performance of the stacking-ensemble model is compared with that of each of the base models using the root mean square error (RMSE), the mean absolute error (MAE), and the mean square error (MSE). In addition, the Giacomini-White test is used to identify the dominant model. Results showed that the RF model produced the most accurate predictions in all the forecasting horizons. The order of dominance is as follows: RF > SE > GBM > GLM. Thus, RF demonstrates the highest predictive capability, dominating the other models. The stacking-ensemble model produced the second most accurate results, with its results in the shortest forecasting horizon almost equal to those of the RF model. Thus, in this context, the stacking ensemble performs better than three of the four base models. The proposed model produces a reasonable and accurate prediction of hourly electricity demand, which is strategically significant in planning and formulating electricity load-shedding strategies in South Africa or any other country.</p>Claris Shoko, Caston Sigauke, Katleho Makatjane
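The two-layer structure described above can be sketched with scikit-learn's `StackingRegressor`, which fits the base models on cross-validated folds and trains the meta-model on their out-of-fold predictions. The synthetic data, hyperparameters, and the MLP standing in for the DNN are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# synthetic stand-in for the hourly demand data (the paper's data set
# and exact model configurations are not reproduced here)
X, y = make_regression(n_samples=600, n_features=8, noise=10.0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

base_models = [
    ("gbm", GradientBoostingRegressor(random_state=1)),
    ("dnn", MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                         random_state=1)),   # small MLP standing in for the DNN
    ("glm", LinearRegression()),
    ("rf", RandomForestRegressor(n_estimators=200, random_state=1)),
]
# meta-model learns to combine the cross-validated base predictions
stack = StackingRegressor(estimators=base_models,
                          final_estimator=LinearRegression(), cv=5)
stack.fit(X_tr, y_tr)
rmse = mean_squared_error(y_te, stack.predict(X_te)) ** 0.5
```

Comparing `rmse` against each base model fitted alone, over several forecast horizons, mirrors the evaluation protocol the abstract describes.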
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-28 | 13(6), 2412-2433 | 10.19139/soic-2310-5070-2170
Reflexive Edge Strength in Certain Graphs with Dominant Vertex
http://iapress.org/index.php/soic/article/view/2210
<p>Consider a simple, connected graph $G$ with edge set $E(G)$ and vertex set $V(G)$. A total $k$-labeling, with $k=\max\{k_e, 2k_v\}$, consists of two functions: $f_e$, from the edge set to $\{1, 2, \ldots, k_e\}$, and $f_v$, from the vertex set to the non-negative even numbers up to $2k_v$. The total $k$-labeling is an \textit{edge irregular reflexive $k$-labeling} of the graph $G$ if, for every two different edges $x_1x_2$ and $x_1'x_2'$ of $G$, $wt(x_1x_2) \neq wt(x_1'x_2')$, where $wt(x_1x_2)=f_v(x_1)+f_e(x_1x_2)+f_v(x_2)$. The reflexive edge strength of a graph $G$, denoted $res(G)$, is defined as the minimal $k$ for which $G$ admits an edge irregular reflexive $k$-labeling. In this work, $res(G)$ is determined for the book, triangular book, Jahangir, and helm graphs.</p>Marsidi, Dafik, Susanto, Arika Indah Kristiana, Ika Hesti Agustin, M. Venkatachalam
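The defining condition is easy to check mechanically, and for tiny graphs the reflexive edge strength can even be found by brute force. The path graph below is only a toy illustration, not one of the families treated in the paper.

```python
from itertools import product

def weights(f_v, f_e, edges):
    """Edge weights wt(uv) = f_v(u) + f_e(uv) + f_v(v)."""
    return [f_v[u] + f_e[(u, v)] + f_v[v] for (u, v) in edges]

def is_irregular(f_v, f_e, edges):
    """A labeling is edge irregular reflexive iff all weights differ."""
    w = weights(f_v, f_e, edges)
    return len(w) == len(set(w))

def res_bruteforce(vertices, edges):
    """Smallest k = max(k_e, 2*k_v) admitting an edge irregular
    reflexive k-labeling (exponential search; tiny graphs only)."""
    k = 1
    while True:
        vertex_labels = range(0, k + 1, 2)   # even labels 0, 2, ..., 2*k_v <= k
        edge_labels = range(1, k + 1)        # edge labels 1, ..., k_e <= k
        for fv in product(vertex_labels, repeat=len(vertices)):
            f_v = dict(zip(vertices, fv))
            for fe in product(edge_labels, repeat=len(edges)):
                f_e = dict(zip(edges, fe))
                if is_irregular(f_v, f_e, edges):
                    return k, f_v, f_e
        k += 1

# path P3: a - b - c, two edges
k, f_v, f_e = res_bruteforce(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

For P3 the search fails at k=1 (both weights equal 1) and succeeds at k=2, which is the kind of lower-bound/construction argument the paper carries out analytically for its graph families.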
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-28 | 13(6), 2434-2447 | 10.19139/soic-2310-5070-2210
Intelligent operation of photovoltaic generators in isolated AC microgrids to reduce costs and improve operating conditions
http://iapress.org/index.php/soic/article/view/2247
<p>This paper addresses the challenges associated with optimizing the operation of photovoltaic distributed generators in isolated electrical microgrids. With the aim of reducing energy production and system maintenance costs and improving the microgrid operating conditions, a master-slave methodology is proposed. In the master stage, the problem of intelligently injecting active power from photovoltaic generators is solved using the continuous versions of four optimization techniques: the Monte Carlo method, the Chu & Beasley genetic algorithm, the population genetic algorithm, and the particle swarm optimizer. Meanwhile, the slave stage evaluates the solutions proposed by the master stage by solving an hourly power flow problem based on the successive approximations method. The proposed solution methodologies are validated in two test scenarios of 10 and 27 buses to select the one with the best performance. Then, the most efficient methodology is implemented in a real isolated grid located in Huatacondo, Chile. This validation aims to assess its ability to optimize the operation of photovoltaic generators in isolated microgrids, considering variations in power generation and demand across the different seasons of the year. The study underscores the importance of financial considerations in achieving an efficient and economically viable operation of photovoltaic generation systems. Furthermore, it provides valuable input for successfully integrating non-conventional renewable energy sources into isolated electrical microgrids.</p>Catalina Díaz Cáceres, Luis Fernando Grisales Noreña, Brandon Cortés-Caicedo, Jhony Andrés Guzmán-Henao, Rubén Iván Bolaños, Oscar Danilo Montoya Giraldo
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-04-13 | 13(6), 2448-2476 | 10.19139/soic-2310-5070-2247
Function Representation in Hilbert Spaces Using Haar Wavelet Series
http://iapress.org/index.php/soic/article/view/2288
<p>This work explores the application of integral transforms using scale and Haar wavelet functions to numerically represent a function \( f(t) \). It is based on defining a vector space in which any function can be represented as a linear combination of orthogonal basis functions. In this case, the Haar wavelet transform is used, employing Haar functions generated from scale functions. First, the fundamental mathematical concepts, such as Hilbert spaces and orthogonality, necessary for understanding the Haar wavelet transform are presented. Then, the construction of the scale and Haar wavelet functions and the process for determining the coefficients of the function representation are detailed. The methodology is applied to the function \( f(t) = t^2 \) over the interval \( t \in [-3, 3] \), showing how to calculate the series coefficients for different resolution levels. As the resolution level increases, the approximation of \( f(t) \) improves significantly. Furthermore, the representation of the function \( f(t) = \sin(t) \) over the interval \( t \in [-6, 6] \) using the Haar wavelet series is presented.</p>Andres Felipe Camelo, Carlos Alberto Ramírez, José Rodrigo González
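A useful fact for checking such computations is that the level-J partial sum of the Haar series equals the average of f over each of the 2^J dyadic cells, so the projection can be computed without expanding the wavelet coefficients one by one. The sketch below applies this to the paper's example f(t) = t² on [-3, 3]; the grid size and levels are arbitrary illustrative choices.

```python
import numpy as np

def haar_projection(f, a, b, level, n_grid=2**12):
    """Project f on [a, b] onto the Haar approximation space V_level
    (piecewise constants on 2**level dyadic cells). The level-J partial
    Haar sum equals the cell averages, which we compute directly on a
    midpoint quadrature grid."""
    t = a + (b - a) * (np.arange(n_grid) + 0.5) / n_grid   # midpoints in (a, b)
    u = (t - a) / (b - a)                                  # mapped to (0, 1)
    fv = f(t)
    cells = np.floor(u * 2**level).astype(int)
    approx = np.empty_like(fv)
    for c in range(2**level):
        mask = cells == c
        approx[mask] = fv[mask].mean()   # cell average = partial Haar sum
    return t, fv, approx

f = lambda t: t**2
_, fv4, a4 = haar_projection(f, -3.0, 3.0, level=4)
_, fv7, a7 = haar_projection(f, -3.0, 3.0, level=7)
err4 = np.max(np.abs(fv4 - a4))   # error at resolution level 4
err7 = np.max(np.abs(fv7 - a7))   # error shrinks as the level grows
```

Each extra level halves the cell width, so for a smooth function the maximum error roughly halves as well, which is the improvement with resolution the abstract reports.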
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-06 | 13(6), 2477-2486 | 10.19139/soic-2310-5070-2288
Fuzzy Volterra Integral Equation Approximate Solution Via Optimal Homotopy Asymptotic Methods
http://iapress.org/index.php/soic/article/view/2302
<p>The field of fuzzy integral equations (FIEs) is significant for modeling complex, time-delayed, and uncertain physical phenomena. Nevertheless, most current solution methods for FIEs encounter considerable challenges, such as the inability to manage intricate fuzzy functions, stringent assumptions regarding the forms of the fuzzy operations utilized, and numerical instability in highly nonlinear problems. Moreover, traditional methods are limited in their capability to produce precise or reliable outcomes for practical applications, and when they can, they incur substantial computing expenses. These challenges underscore the demand for more effective and efficient methodologies. This study addresses that demand by developing two approximate analytical techniques to solve FIEs, namely the optimal homotopy asymptotic method (OHAM) and the multistage optimal homotopy asymptotic method (MOHAM). A novel formulation of fuzzy OHAM and MOHAM is introduced by integrating the fundamental concepts of these methodologies with fuzzy set theory and optimization techniques. OHAM and MOHAM are then further formulated to solve second-kind linear Volterra fuzzy integral equations (VFIEs). These methods are named the fuzzy Volterra optimal homotopy asymptotic method (FV-OHAM) and the fuzzy Volterra multistage optimal homotopy asymptotic method (FV-MOHAM), respectively. In two linear examples, FV-MOHAM and FV-OHAM generated significantly more accurate results than other existing methods. A thorough assessment is performed to evaluate their effectiveness and practical use, potentially aiding the solution of complex problems across several scientific and engineering fields.</p>Alzubi Muath Talal Mahmoud, Farah Aini Abdullah, Ali Fareed Jameel, Adila Aida Azahar
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-10 | 13(6), 2487-2510 | 10.19139/soic-2310-5070-2302
Numerical Solution of the Lotka-Volterra Stochastic Differential Equation
http://iapress.org/index.php/soic/article/view/2307
<p>This paper presents the modeling of the Lotka-Volterra stochastic differential equation and introduces the application of two numerical methods to approximate the solution of this stochastic model. The methods used to solve the stochastic differential equation are the Euler-Maruyama method and the Milstein method. Additionally, a methodology is presented for obtaining the parameters of the predator-prey model equation from empirical data collected over a fixed period of time.</p>Erisbey Marín Cardona, Carlos Alberto Ramírez-Vanegas, José Rodrigo González Granada
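A minimal Euler-Maruyama sketch for one common stochastic Lotka-Volterra formulation is shown below. The multiplicative-noise form, parameter values, and initial conditions are illustrative assumptions; the paper's exact drift and diffusion specification may differ, and the Milstein method would add the usual correction term involving the derivative of the diffusion coefficient.

```python
import numpy as np

def euler_maruyama_lv(x0, y0, alpha, beta, delta, gamma, sigma1, sigma2,
                      T=10.0, n_steps=10_000, seed=0):
    """Euler-Maruyama for a stochastic Lotka-Volterra system with
    multiplicative noise:
      dX = X (alpha - beta Y) dt + sigma1 X dW1   (prey)
      dY = Y (delta X - gamma) dt + sigma2 Y dW2  (predator)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1)
    x[0], y[0] = x0, y0
    for k in range(n_steps):
        dW1, dW2 = rng.normal(0.0, np.sqrt(dt), size=2)  # Brownian increments
        # drift terms are the classical predator-prey dynamics
        x[k + 1] = x[k] + x[k] * (alpha - beta * y[k]) * dt + sigma1 * x[k] * dW1
        y[k + 1] = y[k] + y[k] * (delta * x[k] - gamma) * dt + sigma2 * y[k] * dW2
    return x, y

x, y = euler_maruyama_lv(10.0, 5.0, alpha=1.1, beta=0.4, delta=0.1, gamma=0.4,
                         sigma1=0.05, sigma2=0.05)
```

Averaging many such paths, or matching simulated moments to observed counts, is one route to the parameter-estimation step the abstract mentions.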
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-04 | 13(6), 2511-2520 | 10.19139/soic-2310-5070-2307
Jackson’s theorem for the Kontorovich-Lebedev-Clifford transform
http://iapress.org/index.php/soic/article/view/2397
<p>In this paper, using the Kontorovich-Lebedev-Clifford translation operators recently studied by A. Prasad and U. K. Mandal (The Kontorovich-Lebedev-Clifford transform, Filomat 35:14 (2021), 4811–4824), we prove Jackson's theorem associated with the Kontorovich-Lebedev-Clifford transform.</p>Yassine Fantasse, Abdellatif Akhlidj
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-29 | 13(6), 2521-2528 | 10.19139/soic-2310-5070-2397
From Extraction to Reasoning: A Systematic Review of Algorithms in Multi-Document Summarization and QA
http://iapress.org/index.php/soic/article/view/2398
<p>Multi-document summarization and question-answering (QA) have become pivotal tasks in Natural Language Processing (NLP), facilitating information extraction and decision-making across various domains. This systematic review explores the evolution of algorithms used in these tasks, providing a comprehensive taxonomy of traditional, modern, and emerging approaches. We examine the progression from early extractive methods, such as TF-IDF and TextRank, to the advent of neural models like BERT, GPT, and T5, and the integration of retrieval-augmented generation (RAG) for QA. Hybrid models combining traditional techniques with neural approaches and graph-based methods are also discussed. Through a detailed analysis of algorithmic frameworks, we identify key strengths, weaknesses, and challenges in current methodologies. Additionally, the review highlights recent trends such as unified models, multimodal algorithms, and the application of reinforcement learning in summarization and QA tasks. We also explore the real-world relevance of these algorithms in sectors such as news, legal, medical, and education. The paper concludes by outlining open research directions, proposing new evaluation frameworks, and emphasizing the need for cross-task annotations and ethical considerations in future algorithmic development.</p>Emmanuel Efosa-Zuwa, Olufunke Oladipupo, Jelili Oyelade
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-03-15 | 13(6), 2529-2559 | 10.19139/soic-2310-5070-2398
Randomized Algorithms for Low-Rank Tensor Completion in TT-Format
http://iapress.org/index.php/soic/article/view/2483
<p>Tensor completion is a crucial technique for filling in missing values in multi-dimensional data. It relies on the assumption that such datasets have intrinsic low-rank properties, leveraging this to reconstitute the dataset using low-rank decomposition or other strategies. Traditional approaches often lack computational efficiency, particularly with singular value decomposition (SVD) for large-scale tensors. Furthermore, fixed-rank SVD methods struggle with determining a suitable initial rank when data are incomplete. This paper introduces two novel randomized algorithms designed for low-rank tensor completion in tensor train (TT) format, named TTrandPI and FPTT. The TTrandPI algorithm integrates randomized TT decomposition with power iteration techniques, thereby enhancing computational efficiency and accuracy by improving spectral decay and minimizing tail energy build-up. Meanwhile, the FPTT algorithm utilizes a fixed-precision low-rank approximation approach that adaptively selects tensor ranks based on error tolerance levels, thus reducing the dependence on a predetermined rank. By conducting numerical experiments on synthetic data, color images, and video sequences, both algorithms exhibit superior performance compared to some existing methods.</p>Yihao Pan, Congyi Yu, Chaoping Chen, Gaohang Yu
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-04-27 | 2025-04-27 | Vol. 13 No. 6, pp. 2560–2574 | DOI: 10.19139/soic-2310-5070-2483
Hybrid Butterfly-Grey Wolf Optimization (HB-GWO): A Novel Metaheuristic Approach for Feature Selection in High-Dimensional Data
http://iapress.org/index.php/soic/article/view/2617
<p>Feature selection is a critical preprocessing step in high-dimensional data analysis, aiming to enhance model performance by eliminating irrelevant and redundant features. This paper introduces a novel hybrid metaheuristic algorithm, the Hybrid Butterfly-Grey Wolf Optimization (HB-GWO), which synergizes the global exploration capabilities of the Butterfly Optimization Algorithm (BOA) with the local exploitation strengths of the Grey Wolf Optimizer (GWO) to achieve an effective balance between exploration and exploitation in feature selection tasks. The algorithm incorporates an adaptive switching mechanism that dynamically adjusts the contribution of BOA and GWO throughout the optimization process. HB-GWO was evaluated on multiple benchmark datasets, including Breast Cancer, Madelon, Colon Cancer, and Arrhythmia, using a Random Forest classifier as the evaluation model. Experimental results demonstrate that HB-GWO consistently outperforms state-of-the-art metaheuristic algorithms (GA, PSO, BOA, GWO) in classification accuracy, feature reduction rate, and computational efficiency. An ablation study further confirms the contribution of each component of the hybrid algorithm. These findings position HB-GWO as a robust and efficient method for feature selection in high-dimensional data analysis.</p>Mohammed Aly, Abdullah Shawan Alotaibi
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-05-28 | 2025-05-28 | Vol. 13 No. 6, pp. 2575–2600 | DOI: 10.19139/soic-2310-5070-2617
Forecasting Scientific Impact: A Model for Predicting Citation Counts
http://iapress.org/index.php/soic/article/view/2524
<p>Forecasting the citation counts of scientific papers is a challenging task, particularly when utilizing textual data such as author names, paper titles, abstracts, and affiliations. This task diverges from conventional regression problems involving numerical or categorical inputs, as it demands the processing of complex, high-dimensional text features. Traditional regression techniques, including Linear Regression, Polynomial Regression, and Decision Tree Regression, often fail to encapsulate the semantic intricacies of textual data and are susceptible to overfitting due to the expansive feature space. In the context of Vietnam, where research output is rapidly growing yet underexplored in predictive modeling, these limitations are especially pronounced. To tackle these issues, we leverage advanced Natural Language Processing (NLP) techniques, employing Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. These deep learning models are adept at handling sequential data, capturing long-range dependencies, and preserving contextual nuances, rendering them well-suited for text-based citation prediction. We conducted experiments using a dataset of academic papers authored by Vietnamese researchers across diverse disciplines, sourced from publications featuring Vietnamese author contributions. The dataset includes features such as author names, titles, abstracts, and affiliations, reflecting the unique characteristics of Vietnam’s research landscape. We compared the performance of LSTM and GRU models against traditional machine learning approaches, evaluating prediction accuracy with metrics like Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The results reveal that LSTM and GRU models substantially outperform their traditional counterparts. The LSTM model achieved an RMSE of 8.54 and an MAE of 8.1, while the GRU model yielded an RMSE of 8.32 and an MAE of 7.83, demonstrating robust predictive capabilities. In contrast, traditional models such as Decision Tree Regression and Linear Regression exhibited higher error rates, with RMSEs exceeding 12.0. These findings underscore the efficacy of deep learning in forecasting citation counts from textual data, particularly for Vietnamese research outputs, and highlight the potential of LSTM and GRU models to uncover intricate patterns driving scientific impact in emerging research ecosystems.</p>Bao T. Nguyen, Thinh T. Nguyen
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-05-28 | 2025-05-28 | Vol. 13 No. 6, pp. 2601–2615 | DOI: 10.19139/soic-2310-5070-2524
A New Approach of Multiple Merger and Acquisition (M&A) in AR Time Series Model under Bayesian Framework
http://iapress.org/index.php/soic/article/view/2029
<p>Mergers and acquisitions (M&amp;As) play a pivotal role in fostering economic development and are extensively examined worldwide across various empirical contexts, notably in the banking sector. The primary objective of this study is to introduce a novel approach termed the multiple-merger autoregressive (MM-AR) model, aimed at providing insights into the effects of mergers on model parameters and behaviour. Initially, we propose a comprehensive estimation framework utilizing posterior parameters within the Bayesian paradigm, incorporating diverse loss functions to enhance robustness. A distinctive feature of this model is that it also handles the situation in which multiple series are merged into the same observed series at various time points. The Bayesian estimation approach is used to assess the MM-AR model parameters in terms of MSE, AB, and AE, with good results. Under Bayesian estimation, the squared error loss function (SELF) performs better than the other estimators for most of the parameters. Subsequently, we compute the Bayes factor to quantify the impact of merged series on the overall model dynamics. To further elucidate the efficacy of the proposed model, we conduct both simulation-based analyses and real-world applications focusing on the Indian banking sector. Through this research, we aim to offer valuable insights into the implications of M&amp;A activities. For the purpose of data analysis, we used PCR banking data of ICICI Bank Ltd. for simulation and empirical analysis to verify the model's applicability and purpose.</p>Jitendra Kumar, Mohd Mudassir
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-05-26 | 2025-05-26 | Vol. 13 No. 6, pp. 2616–2633 | DOI: 10.19139/soic-2310-5070-2029
On the Use of Yeo-Johnson Transformation in the Functional Multivariate Time Series
http://iapress.org/index.php/soic/article/view/1569
<p>In this paper, the Box-Cox and Yeo-Johnson transformation models are employed, together with density-function-based nonparametric methods, to improve multivariate time series forecasting. Our model uses the K-Nearest Neighbor function, with automatic bandwidth selection via a cross-validation approach and semi-metrics to measure the proximity of functional data. Principal component analysis is then applied to decorrelate the multivariate response variables. The methodology was applied to two time series examples with multiple responses. The first example comprises three time series datasets of the monthly averages of Humidity (H), Rainfall (R), and Temperature (T); the second example consists of simulation studies. Mean square errors of the predicted values were calculated to assess forecast efficiency. The results show that applying the multivariate nonparametric time series analysis to stationary datasets transformed with the Yeo-Johnson model is more efficient than applying univariate nonparametric analysis to each response independently.</p>Sameera Abdulsalam Othman, Haithem Taha Mohammed Ali
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-04-09 | 2025-04-09 | Vol. 13 No. 6, pp. 2634–2646 | DOI: 10.19139/soic-2310-5070-1569
Bayesian Premium Estimators for NXLindley Model Under Different Loss Functions
http://iapress.org/index.php/soic/article/view/2442
<p>The conditional distribution of (X|θ) is regarded as the NXLindley distribution. This study centers on the estimation of the Bayesian premium under the symmetric squared error loss function and the asymmetric Linex loss function, employing the extension of Jeffreys' prior as a non-informative prior and the Gamma prior as an informative prior. Owing to its complexity and lack of linearity, we rely on a numerical approximation to establish the Bayesian premium. A simulation and comparison study with several sample sizes is presented.</p>Ahmed Sadoun, Imen Ouchen, Farouk Metiri
Copyright (c) 2025 Statistics, Optimization & Information Computing
2025-05-28 | 2025-05-28 | Vol. 13 No. 6, pp. 2647–2668 | DOI: 10.19139/soic-2310-5070-2442