Heavy-Tailed Log-Logistic Distribution: Properties, Risk Measures and Applications

Heavy-tailed distributions play an important role in the study of risk data sets, and statisticians often search for new or relatively new statistical models to fit data sets in different fields. This article introduces a relatively new heavy-tailed statistical model, obtained by applying the alpha power transformation to the exponentiated log-logistic distribution, which is called the alpha power exponentiated log-logistic distribution. Its statistical properties, such as moments, moment generating function, quantile function, entropy, inequality curves and order statistics, are derived. Six estimation methods are presented, and the behaviour of the estimated model parameters under these methods is examined using randomly generated data sets. Some actuarial measures, namely value at risk, tail value at risk, tail variance and tail variance premium, are also derived. Numerical values of these measures are computed and indicate that the proposed distribution has a heavier tail than the other compared models. Finally, three real data sets from different fields are used to show that the proposed model fits these data sets better than many other well-known and related models.


Introduction
Modelling data sets in different areas of research, such as risk management, economics and actuarial science, requires heavy-tailed distributions. These data sets may be unimodal (Cooray and Ananda [20]), right-skewed (Lane [37]), positive (Klugman et al. [36]) and heavy-tailed (Ibragimov and Prokhorov [33]). Skewed data sets are preferably modelled by skewed distributions (Bernardi et al. [17]). Heavy-tailed distributions are of great interest to actuaries for modelling insurance data sets, since actuaries are often interested in the chance of an adverse outcome, which can be expressed via the value at risk (VaR). Heavy-tailed statistical models play an important role in actuarial science, particularly in providing adequate descriptions of claim size distributions, and considerable interest has therefore been shown in these subjects over the previous decade or so; see, for example, Hogg and Klugman [32], Qi [45], Hao and Tang [31], Yang et al. [49] and Afify et al. [4], among many others. In recent years, the extension of distributions, especially through generalized distributions, has received great attention from researchers. Many methods for generalizing distributions have been introduced; they increase the flexibility of the initial distribution by adding one or more parameters to it. Providing more flexibility to distributions is very important for modelling data in many fields such as medicine, engineering and economics.
In 2017, Mahdavi and Kundu [39] introduced a method for generating relatively new distributions by adding a parameter to the initial distribution; it is called the alpha power transformation (APT) method.

If G(x) is the cumulative distribution function (CDF) of a continuous random variable X, then the CDF of the APT is given by
$$F_{APT}(x) = \begin{cases} \dfrac{\alpha^{G(x)} - 1}{\alpha - 1}, & \alpha > 0,\ \alpha \neq 1, \\ G(x), & \alpha = 1, \end{cases}$$
and its corresponding probability density function (PDF) is given by
$$f_{APT}(x) = \begin{cases} \dfrac{\log \alpha}{\alpha - 1}\, g(x)\, \alpha^{G(x)}, & \alpha > 0,\ \alpha \neq 1, \\ g(x), & \alpha = 1, \end{cases}$$
where g(x) is the PDF corresponding to G(x).
Many articles have been published on generating distributions by the APT. Mahdavi and Kundu [39] presented the alpha power exponential distribution and derived some of its statistical properties; the unknown parameters were estimated by solving non-linear equations. Unal et al. [18] introduced the alpha power inverted exponential distribution and estimated the model parameters by the maximum likelihood method; two real data sets were used to show that the generalized distribution fits these data sets better than the other compared distributions. Dey et al. [25] applied the APT to the inverse Lindley distribution to obtain the alpha power transformed inverse Lindley distribution. Various properties of the proposed distribution were obtained, and it can have an upside-down bathtub failure rate function. A simulation was performed to examine the estimated parameters. Mead et al. [41] provided additional mathematical properties of the alpha power exponential distribution and presented the alpha power exponentiated Weibull distribution, which can model monotone and non-monotone failure rate functions. They illustrated the importance of the new distribution through applications to two real data sets. Nassar et al. [43] considered nine methods of estimation for the parameters of the alpha power exponential distribution, with different sample sizes and different parameter values. They also provided applications to engineering and medical data to show the potential of the distribution. Afify et al. [44] presented the Marshall-Olkin alpha power family of distributions. They derived the Marshall-Olkin alpha power exponential distribution as a member of this family, along with some of its statistical properties, and illustrated the superiority of the proposed model through three real data sets.
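As a minimal sketch of how the APT acts on a baseline, the following Python functions evaluate the transformed CDF and PDF from baseline values G(x) and g(x). The form F = (α^G − 1)/(α − 1) is the standard definition attributed to Mahdavi and Kundu [39] and is assumed here; the function names `apt_cdf` and `apt_pdf` and the unit exponential baseline are illustrative choices, not taken from the paper.

```python
import math

def apt_cdf(G, alpha):
    """Alpha power transform of a baseline CDF value G in [0, 1]."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha == 1.0:
        return G  # alpha = 1 recovers the baseline distribution
    return (alpha ** G - 1.0) / (alpha - 1.0)

def apt_pdf(G, g, alpha):
    """Transformed density, given the baseline CDF value G and PDF value g."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha == 1.0:
        return g
    return math.log(alpha) / (alpha - 1.0) * g * alpha ** G

# Illustration with a unit exponential baseline (the alpha power exponential case).
x, alpha = 0.7, 2.5
G = 1.0 - math.exp(-x)
g = math.exp(-x)
print(apt_cdf(G, alpha), apt_pdf(G, g, alpha))
```

Any baseline can be plugged in the same way, which is what makes the transformation a one-parameter extension of an existing model.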
In survival research, the log-logistic distribution (LLD) is used as a parametric model for events whose rate initially rises and later declines, for example, Coronavirus following diagnosis or treatment. It is also often used in hydrology to predict the movement and runoff of water. The LLD is sometimes defined as the distribution of a random variable whose logarithm has a logistic distribution. Collet [19] explained that the properties of the LLD make it an attractive alternative to the log-normal and Weibull distributions and suggested it for modelling the time following heart transplantation. The LLD is also known as the Fisk distribution in the income distribution literature [26]. Some authors, such as [23] and [47], refer to the Fisk distribution as the LLD, whereas Arnold [15] refers to it as the Pareto type III distribution and includes an additional location parameter. Further details about the LLD can be found in [35]. Many authors have studied generalized forms of the LLD to improve its flexibility, such as the Kumaraswamy-LL [24], beta-LL [38], Marshall-Olkin LL [29], McDonald LL [48], Zografos-Balakrishnan LL [30], generalized LL [34], extended LL [12] and odd Lomax LL [21] distributions.
The PDF and CDF of the exponentiated (Ex) LLD are given by
$$g(x) = \frac{a c}{b} \left(\frac{x}{b}\right)^{-a-1} \left[1 + \left(\frac{x}{b}\right)^{-a}\right]^{-c-1}, \quad x > 0,$$
and
$$G(x) = \left[1 + \left(\frac{x}{b}\right)^{-a}\right]^{-c}, \quad x > 0,$$
respectively, where a > 0 and c > 0 are shape parameters, and b > 0 is a scale parameter. Setting c = 1 gives the LLD.
In this paper, we present a more flexible version of the LLD called the alpha power exponentiated log-logistic distribution (APExLLD), which can provide more flexibility in modelling different data sets than other competing models. It extends the alpha power log-logistic distribution [10] with just one extra parameter. It has a heavier tail than other related and well-known models such as the alpha power log-logistic, exponentiated log-logistic and log-logistic distributions. Its CDF, PDF and other related functions have simple closed forms; hence it is very convenient for analysing censored data. It exhibits increasing, upside-down, bathtub, decreasing, J-shaped and reversed-J hazard rate shapes, and symmetrical, asymmetrical (right-skewed or left-skewed), J-shaped, unimodal and reversed-J shaped densities. Three real data sets from the engineering, geology and insurance fields are analysed using the proposed model, showing its flexibility over other competing distributions.
This paper is organized as follows. In Section 2, we introduce a relatively new distribution obtained by the alpha power transformation, called the alpha power exponentiated log-logistic distribution, along with its PDF, CDF and other related functions. Section 3 derives its statistical properties: moments, moment generating function, moments of the residual life function, quantile function and mode, together with the Rényi, Tsallis and Shannon entropies and inequality curves such as the Lorenz, Bonferroni and Zenga curves. The PDF and CDF of the i-th order statistic and the limiting distributions of the maximum and minimum are also derived. Six estimation methods are introduced in Section 4 and are used in Section 5 to study the behaviour of the parameter estimates on randomly generated data sets. Actuarial measures and their numerical values are presented in Section 6. In Section 7, the proposed distribution and other related distributions are fitted to real data sets, showing the superiority of the proposed distribution in fitting these data sets.

Alpha power exponentiated log-logistic distribution
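In this section, we derive an extension based on the alpha power transformation, called the alpha power exponentiated log-logistic distribution (APExLLD). If X is a continuous random variable having the ExLLD CDF (3), then the CDF of the APExLLD is given by
$$F(x) = \frac{\alpha^{[1 + (x/b)^{-a}]^{-c}} - 1}{\alpha - 1}, \quad x > 0,$$
and its corresponding PDF is given by
$$f(x) = \frac{\log \alpha}{\alpha - 1}\, \frac{a c}{b} \left(\frac{x}{b}\right)^{-a-1} \left[1 + \left(\frac{x}{b}\right)^{-a}\right]^{-c-1} \alpha^{[1 + (x/b)^{-a}]^{-c}}, \quad x > 0,$$
where α > 0, α ≠ 1, and a, b, c > 0. Setting c = 1 gives the alpha power log-logistic distribution [10]. The survival, hazard and reverse hazard functions of the APExLLD are, respectively, given by
$$S(x) = \frac{\alpha - \alpha^{G(x)}}{\alpha - 1}, \quad h(x) = \frac{f(x)}{S(x)}, \quad r(x) = \frac{f(x)}{F(x)},$$
where $G(x) = [1 + (x/b)^{-a}]^{-c}$. PDF and HRF plots of the APExLLD for some parameter values are shown in Figures 1 and 2, respectively. These plots show that the HRF of the APExLLD can be upside-down, bathtub, bathtub upside-down, increasing, decreasing and J-shaped. Also, its densities can be left-skewed, symmetrical, right-skewed, J-shaped and reversed-J shaped.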
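The functions of this section can be sketched in Python as follows. The sketch assumes the ExLLD CDF G(x) = [1 + (x/b)^(-a)]^(-c) and the APT form F = (α^G − 1)/(α − 1) with α ≠ 1; the function names and the parameter values in the loop are illustrative assumptions rather than values from the paper's figures.

```python
import math

def exll_cdf(x, a, b, c):
    """Exponentiated log-logistic CDF, assumed G(x) = [1 + (x/b)^(-a)]^(-c)."""
    return (1.0 + (x / b) ** (-a)) ** (-c)

def exll_pdf(x, a, b, c):
    """Derivative of exll_cdf with respect to x."""
    z = (x / b) ** (-a)
    return (a * c / b) * z / (x / b) * (1.0 + z) ** (-c - 1.0)

def apexll_cdf(x, alpha, a, b, c):
    """APExLLD CDF for x > 0, alpha > 0, alpha != 1."""
    return (alpha ** exll_cdf(x, a, b, c) - 1.0) / (alpha - 1.0)

def apexll_pdf(x, alpha, a, b, c):
    """APExLLD PDF, the derivative of apexll_cdf."""
    G = exll_cdf(x, a, b, c)
    return math.log(alpha) / (alpha - 1.0) * exll_pdf(x, a, b, c) * alpha ** G

def apexll_hazard(x, alpha, a, b, c):
    """Hazard rate f(x) / (1 - F(x))."""
    return apexll_pdf(x, alpha, a, b, c) / (1.0 - apexll_cdf(x, alpha, a, b, c))

# Evaluate the CDF and hazard rate on a few points (illustrative parameters).
for x in (0.5, 1.0, 2.0, 5.0):
    print(x, apexll_cdf(x, 0.5, 4.5, 1.0, 1.5), apexll_hazard(x, 0.5, 4.5, 1.0, 1.5))
```

Evaluating `apexll_hazard` on a fine grid for different parameter choices is one way to explore the hazard shapes described above.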

Statistical properties
In this section, we show that the proposed model is heavy-tailed and derive important statistical properties of the APExLLD, such as moments, moment generating function (mgf), moments of the residual life function, quantile function, mode, entropies, inequality curves and order statistics.

Heavy-tailed behavior
Heavy-tailed distributions have right tail probabilities that are heavier than those of the exponential distribution; that is, for a distribution with CDF G(x), the survival function 1 − G(x) decays more slowly than any exponential function as x → ∞. Applying this condition to the proposed model shows that it is a heavy-tailed distribution. The heavy-tailed behaviour of the proposed model is illustrated in Figure 3. In Figure 3 (left panel), as the value of the parameter b decreases, the curve of the proposed model becomes more right-tailed. In the right panel of the same figure, as the value of the parameter c decreases, the curve of the proposed model becomes heavier-tailed.
Heavy-tailed distributions that have the regular variation property are very competitive models for heavy-tailed data sets (for more information, see [6]). This property is an important characteristic of heavy-tailed distributions. A statistical model with CDF F(x) is said to have this property if $\lim_{x \to \infty} [1 - F(tx)] / [1 - F(x)] = t^{-a}$ for every t > 0, where a > 0 is called the index of regular variation. Applying this condition to the proposed heavy-tailed model shows that it is regularly varying with index of variation a.
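The tail argument can be written out as a short derivation. This is a sketch under assumed forms, namely the ExLLD CDF G(x) = [1 + (x/b)^{-a}]^{-c} and the APExLLD CDF F(x) = (α^{G(x)} − 1)/(α − 1) with α ≠ 1; the symbols t and λ are notation chosen here and may differ from the paper's own.

```latex
% Assumed forms (not copied from the source):
%   G(x) = [1 + (x/b)^{-a}]^{-c},   F(x) = (\alpha^{G(x)} - 1) / (\alpha - 1).
\begin{align*}
1 - G(x) &= 1 - \bigl[1 + (x/b)^{-a}\bigr]^{-c} \sim c\,(x/b)^{-a},
            \qquad x \to \infty, \\
1 - F(x) &= \frac{\alpha - \alpha^{G(x)}}{\alpha - 1}
          = \frac{\alpha\,\bigl(1 - \alpha^{-(1 - G(x))}\bigr)}{\alpha - 1}
          \sim \frac{\alpha \log \alpha}{\alpha - 1}\; c\, b^{a}\, x^{-a}, \\
\lim_{x \to \infty} \frac{1 - F(tx)}{1 - F(x)} &= t^{-a}, \qquad t > 0, \\
\lim_{x \to \infty} e^{\lambda x}\,\bigl[1 - F(x)\bigr] &= \infty, \qquad \lambda > 0.
\end{align*}
```

The second line uses 1 − α^{−ε} ≈ ε log α for small ε, and the constant α log α/(α − 1) is positive for every α > 0 with α ≠ 1. The survival function therefore decays like the power x^{−a}, which gives both regular variation with index a and a tail heavier than any exponential.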

Moments and moment generating function
The r-th moment of the APExLLD about the origin is derived under the condition a > r. Plots of the mean, variance, skewness and kurtosis of the APExLLD for different values of the parameters b and c, with a = 4.5 and α = 0.5 held fixed, are displayed in Figure 4. The mgf of the APExLLD is obtained under the same condition, a > r; replacing t by jt in the mgf gives the characteristic function of the APExLLD.

Moments of residual life function
The m-th moment of the residual life function of the APExLLD is also derived. Setting m = 1 gives the mean residual life function of the APExLLD, and additionally setting t = 0 gives the mean of the APExLLD.

Quantile function and mode
Inverting the CDF (5) gives the quantile function of the APExLLD,
$$Q(p) = b \left\{ \left[ \frac{\log(1 + p(\alpha - 1))}{\log \alpha} \right]^{-1/c} - 1 \right\}^{-1/a}, \quad 0 < p < 1.$$
Setting p = 0.25, 0.5 and 0.75 gives the first, second (median) and third quartiles of the APExLLD, respectively. The modes of the APExLLD are the roots of the equation obtained by differentiating the logarithm of the PDF with respect to x and setting the result equal to zero, that is, $\partial \log f(x) / \partial x = 0$. This equation may have more than one root. If x = x_0 is a root of this equation, then it may be a local maximum, a local minimum or a point of inflection, depending on whether $\partial^2 \log f(x) / \partial x^2$ is < 0, > 0 or = 0 at x = x_0, respectively.
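A minimal Python sketch of the quantile function follows, obtained by inverting the assumed CDF F(x) = (α^{G(x)} − 1)/(α − 1) with G(x) = [1 + (x/b)^(-a)]^(-c). The function names and the parameter values are illustrative assumptions.

```python
import math

def apexll_quantile(p, alpha, a, b, c):
    """Quantile of the APExLLD for 0 < p < 1, alpha > 0, alpha != 1."""
    # Level of the baseline (ExLLD) CDF that corresponds to probability p.
    u = math.log(1.0 + p * (alpha - 1.0)) / math.log(alpha)
    return b * (u ** (-1.0 / c) - 1.0) ** (-1.0 / a)

def apexll_cdf(x, alpha, a, b, c):
    """APExLLD CDF, used here only to check the inversion."""
    G = (1.0 + (x / b) ** (-a)) ** (-c)
    return (alpha ** G - 1.0) / (alpha - 1.0)

# First, second (median) and third quartiles for illustrative parameters.
quartiles = [apexll_quantile(p, 0.5, 2.0, 1.0, 1.5) for p in (0.25, 0.5, 0.75)]
print(quartiles)
```

Because the quantile function is available in closed form, random samples can be drawn by inverse-transform sampling without numerical root finding.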

Entropy
The entropy of a random variable X is a measure of the randomness, or amount of information, in its distribution. The larger the entropy, the greater the randomness. Entropy has many applications in fields such as engineering, physics and probability theory, and its definition differs from one field to another.
The continuous Rényi, Tsallis and Shannon entropies of the APExLLD are derived; their expressions involve the digamma function Ψ(z) = d/dz log Γ(z) and the Euler-Mascheroni constant γ.

Lorenz, Bonferroni and Zenga curves
Inequality is an important characteristic of non-negative distributions; it is analysed using inequality curves, which are drawn in the unit square (for more information, see Arcagni and Porro [14]). The Lorenz, Bonferroni and Zenga curves of the APExLLD are expressed in terms of the quantile function of the APExLLD and the hypergeometric function. Table 1 reports numerical values of the Lorenz, Bonferroni and Zenga curves of the APExLLD for different values of p and fixed parameter values (a = 2, b = 1, c = 1.5, α = 0.5). Figure 5 plots the numerical values in Table 1.

Order statistics
Let X_1, X_2, ..., X_n be a random sample from the APExLLD and let X_{1:n}, X_{2:n}, ..., X_{n:n} be the corresponding order statistics. The PDF and CDF of the i-th order statistic are obtained from the PDF and CDF of the APExLLD. Setting i = 1 and i = n gives the PDF and CDF of the minimum and maximum order statistics, respectively.
If the sample size is odd, setting i = (n + 1)/2 gives the PDF and CDF of the median order statistic.
Let Z_n = X_{n:n} and W_n = X_{1:n} from the APExLLD; then the limiting distributions of Z_n and W_n can be obtained by using Theorems 2.1.1 and 2.1.5 in Galambos [27], respectively.
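The order-statistic functions can be sketched with the textbook formulas for an independent sample, written in terms of the parent CDF value F and PDF value f. These are general identities, not the paper's expanded APExLLD expressions, and the function names are assumptions made for illustration.

```python
import math

def order_stat_pdf(F, f, i, n):
    """PDF of the i-th of n order statistics, from parent CDF value F and PDF value f."""
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return coef * f * F ** (i - 1) * (1.0 - F) ** (n - i)

def order_stat_cdf(F, i, n):
    """CDF of the i-th order statistic: at least i of n observations fall below x."""
    return sum(math.comb(n, k) * F ** k * (1.0 - F) ** (n - k) for k in range(i, n + 1))

# Minimum (i = 1), median (i = 3) and maximum (i = 5) of a sample of size 5,
# evaluated at a point where the parent CDF equals 0.3 and the parent PDF equals 2.0.
for i in (1, 3, 5):
    print(i, order_stat_pdf(0.3, 2.0, i, 5), order_stat_cdf(0.3, i, 5))
```

Substituting the APExLLD CDF and PDF for F and f gives the corresponding functions for the proposed model.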

Maximum likelihood estimation
Let x_1, x_2, ..., x_n be a random sample of size n from the APExLLD. The MLEs of the APExLLD parameters are obtained by differentiating the log-likelihood function with respect to α, a, b and c, setting each of the four resulting non-linear equations equal to zero, and solving them.
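As a hedged sketch, the log-likelihood can be coded directly from the assumed PDF f(x) = [log α/(α − 1)] (ac/b) (x/b)^(-a-1) [1 + (x/b)^(-a)]^(-c-1) α^{G(x)}. The function name `apexll_loglik` and the sample values are hypothetical; in practice the function would be passed to a numerical optimizer, which is not shown here.

```python
import math

def apexll_loglik(data, alpha, a, b, c):
    """Log-likelihood of an APExLLD sample (alpha > 0, alpha != 1; a, b, c > 0)."""
    n = len(data)
    # Terms that do not depend on the observations.
    total = n * (math.log(math.log(alpha) / (alpha - 1.0)) + math.log(a * c / b))
    for x in data:
        z = (x / b) ** (-a)
        G = (1.0 + z) ** (-c)  # baseline ExLLD CDF at x
        total += -(a + 1.0) * math.log(x / b) - (c + 1.0) * math.log1p(z) + G * math.log(alpha)
    return total

sample = [0.4, 0.9, 1.7, 3.2]  # hypothetical observations
print(apexll_loglik(sample, 0.5, 2.0, 1.0, 1.5))
```

The ratio log α/(α − 1) is positive for both α < 1 and α > 1, so its logarithm is well defined across the parameter space.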

Ordinary least-squares and weighted least-squares estimation
Let x_{1:n}, x_{2:n}, ..., x_{n:n} be the order statistics of a random sample of size n from the APExLLD. The OLSEs of the APExLLD parameters are obtained by minimizing the ordinary least-squares objective function; equivalently, they can be obtained by solving the corresponding non-linear equations. The WLSEs of the APExLLD parameters are obtained by minimizing the weighted least-squares objective function or, equivalently, by solving the corresponding non-linear equations. These equations involve the functions Φ_s(x_i), s = 1, 2, 3, 4, defined in (11), (12), (13) and (14), respectively.
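The two objective functions can be sketched as follows, assuming the usual least-squares criteria in which the fitted CDF at the i-th order statistic is compared with i/(n + 1), and the WLS weights are (n + 1)^2 (n + 2) / [i (n − i + 1)]. The function names, the sample and the candidate parameters are hypothetical.

```python
def ols_objective(cdf, sorted_x):
    """Sum of squared gaps between the fitted CDF and i / (n + 1)."""
    n = len(sorted_x)
    return sum((cdf(x) - i / (n + 1.0)) ** 2 for i, x in enumerate(sorted_x, start=1))

def wls_objective(cdf, sorted_x):
    """Same gaps, weighted by the inverse variance of the i-th uniform order statistic."""
    n = len(sorted_x)
    return sum(
        (n + 1.0) ** 2 * (n + 2.0) / (i * (n - i + 1.0)) * (cdf(x) - i / (n + 1.0)) ** 2
        for i, x in enumerate(sorted_x, start=1)
    )

def apexll_cdf(x, alpha, a, b, c):
    """Assumed APExLLD CDF (alpha != 1)."""
    G = (1.0 + (x / b) ** (-a)) ** (-c)
    return (alpha ** G - 1.0) / (alpha - 1.0)

data = sorted([0.4, 0.9, 1.7, 3.2])  # hypothetical observations
fitted = lambda x: apexll_cdf(x, 0.5, 2.0, 1.0, 1.5)  # candidate parameter values
print(ols_objective(fitted, data), wls_objective(fitted, data))
```

Minimizing either function over (α, a, b, c) with a numerical optimizer would give the corresponding estimates.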

Anderson-Darling estimation
Let x_{1:n}, x_{2:n}, ..., x_{n:n} be the order statistics of a random sample of size n from the APExLLD. The ADEs of the APExLLD parameters are obtained by minimizing the Anderson-Darling objective function; equivalently, they can be obtained by solving the corresponding non-linear equations, which involve the functions Φ_s(x_i), s = 1, 2, 3, 4, defined in (11), (12), (13) and (14), respectively.
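A sketch of the Anderson-Darling criterion in its standard form, A = −n − (1/n) Σ (2i − 1)[log F(x_{i:n}) + log(1 − F(x_{n+1−i:n}))], is given below. The standard form is assumed to be the one used in the paper; the function names and data are hypothetical.

```python
import math

def ad_objective(cdf, sorted_x):
    """Anderson-Darling criterion for a fitted CDF on sorted data."""
    n = len(sorted_x)
    F = [cdf(x) for x in sorted_x]
    s = sum(
        (2 * i - 1) * (math.log(F[i - 1]) + math.log(1.0 - F[n - i]))
        for i in range(1, n + 1)
    )
    return -n - s / n

def apexll_cdf(x, alpha, a, b, c):
    """Assumed APExLLD CDF (alpha != 1)."""
    G = (1.0 + (x / b) ** (-a)) ** (-c)
    return (alpha ** G - 1.0) / (alpha - 1.0)

data = sorted([0.4, 0.9, 1.7, 3.2])  # hypothetical observations
print(ad_objective(lambda x: apexll_cdf(x, 0.5, 2.0, 1.0, 1.5), data))
```

The logarithms give extra weight to discrepancies in both tails, which is why this criterion is often preferred for heavy-tailed data.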

Cramér-von Mises estimation
Let x_{1:n}, x_{2:n}, ..., x_{n:n} be the order statistics of a random sample of size n from the APExLLD. The CVMEs of the APExLLD parameters are obtained by minimizing the Cramér-von Mises objective function; equivalently, they can be obtained by solving the corresponding non-linear equations, which involve the functions Φ_s(x_i), s = 1, 2, 3, 4, defined in (11), (12), (13) and (14), respectively.
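The Cramér-von Mises criterion can be sketched in its standard form, C = 1/(12n) + Σ [F(x_{i:n}) − (2i − 1)/(2n)]^2, which is assumed here to match the paper's objective. Names and data are hypothetical.

```python
def cvm_objective(cdf, sorted_x):
    """Cramér-von Mises criterion for a fitted CDF on sorted data."""
    n = len(sorted_x)
    return 1.0 / (12.0 * n) + sum(
        (cdf(x) - (2 * i - 1) / (2.0 * n)) ** 2 for i, x in enumerate(sorted_x, start=1)
    )

def apexll_cdf(x, alpha, a, b, c):
    """Assumed APExLLD CDF (alpha != 1)."""
    G = (1.0 + (x / b) ** (-a)) ** (-c)
    return (alpha ** G - 1.0) / (alpha - 1.0)

data = sorted([0.4, 0.9, 1.7, 3.2])  # hypothetical observations
print(cvm_objective(lambda x: apexll_cdf(x, 0.5, 2.0, 1.0, 1.5), data))
```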

Maximum product of spacings estimation
The MPS method is used to estimate the parameters of continuous univariate models as an alternative to the ML method. It is based on the uniform spacings of a random sample of size n from the APExLLD. The MPS estimators of the APExLLD parameters are obtained by maximizing the geometric mean of the spacings with respect to α, a, b and c. Equivalently, the MPSEs of the APExLLD parameters can be obtained by solving the corresponding non-linear equations.
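The MPS criterion can be sketched as the mean log-spacing, using the standard definition D_i = F(x_{i:n}) − F(x_{i−1:n}) with F(x_{0:n}) = 0 and F(x_{n+1:n}) = 1. Maximizing this mean is equivalent to maximizing the geometric mean of the spacings. Names and data are hypothetical.

```python
import math

def mps_objective(cdf, sorted_x):
    """Mean of the log uniform spacings; larger is better."""
    F = [0.0] + [cdf(x) for x in sorted_x] + [1.0]
    spacings = [F[i] - F[i - 1] for i in range(1, len(F))]
    # Tied observations give a zero spacing, for which math.log raises an error.
    return sum(math.log(d) for d in spacings) / len(spacings)

def apexll_cdf(x, alpha, a, b, c):
    """Assumed APExLLD CDF (alpha != 1)."""
    G = (1.0 + (x / b) ** (-a)) ** (-c)
    return (alpha ** G - 1.0) / (alpha - 1.0)

data = sorted([0.4, 0.9, 1.7, 3.2])  # hypothetical observations
print(mps_objective(lambda x: apexll_cdf(x, 0.5, 2.0, 1.0, 1.5), data))
```

The criterion is largest when the fitted CDF spreads the observations evenly over (0, 1), which is the sense in which MPS measures fit.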

Simulation
In this section, we study the performance of the estimators of the APExLLD parameters under the different estimation methods. The simulation was carried out in the following steps:
1. Initialize the distribution parameters and generate a random sample of size n from the uniform distribution.
2. Generate a random sample of size n from the proposed distribution using its quantile function.
Tables 11-14 report the BIAS, MSE and MRE for the different estimation methods. The row labelled Ranks gives the partial sum of the ranks. A superscript indicates the rank of each estimator among all the estimators for that metric. In terms of the performance of the estimation methods, we found that the MPSEs are the best estimators, as they produce the smallest biases, MSEs and MREs for most of the configurations considered in our study, as shown in Table 2.
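Steps 1 and 2 and the summary metrics can be sketched as below. The quantile function is the inverse of the assumed APExLLD CDF, and BIAS, MSE and MRE are taken to be the average signed error, the average squared error and the average relative absolute error over the replications. These definitions are assumptions, since the paper's own formulas are not reproduced in this text, and all names are hypothetical.

```python
import math
import random

def apexll_quantile(p, alpha, a, b, c):
    """Assumed APExLLD quantile function (alpha != 1, 0 < p < 1)."""
    u = math.log(1.0 + p * (alpha - 1.0)) / math.log(alpha)
    return b * (u ** (-1.0 / c) - 1.0) ** (-1.0 / a)

def simulate_sample(n, alpha, a, b, c, rng):
    """Steps 1-2: uniform draws pushed through the quantile function."""
    # Draws are kept strictly inside (0, 1) so that the quantile stays finite.
    return [apexll_quantile(rng.uniform(1e-12, 1.0 - 1e-12), alpha, a, b, c) for _ in range(n)]

def summarize(estimates, true_value):
    """Bias, mean squared error and mean relative error of replicated estimates."""
    m = len(estimates)
    bias = sum(e - true_value for e in estimates) / m
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    mre = sum(abs(e - true_value) / true_value for e in estimates) / m
    return bias, mse, mre

rng = random.Random(2024)  # fixed seed for reproducibility
sample = simulate_sample(2000, 0.5, 2.0, 1.0, 1.5, rng)
print(min(sample), max(sample))
print(summarize([1.0, 3.0], 2.0))
```

In a full study, each replication would fit the model to a fresh sample with one of the estimation methods, and `summarize` would then be applied to the resulting estimates of each parameter.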

Value at risk
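The VaR is also known as the quantile risk measure. The VaR of a random variable X is the q-th quantile of its CDF; see Artzner [16]. If X is an APExLLD random variable, then its VaR at level q is
$$VaR_q = x_q = b \left\{ \left[ \frac{\log(1 + q(\alpha - 1))}{\log \alpha} \right]^{-1/c} - 1 \right\}^{-1/a}, \quad 0 < q < 1.$$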

Tail value at risk
The TVaR is the expected value of the loss given that an event outside a given probability level has occurred. Let X follow the APExLLD; then, using equation (15), the TVaR of X is defined as $TVaR_q = \frac{1}{1 - q} \int_{VaR_q}^{\infty} x f(x)\, dx$.

Tail variance
The tail variance (TV) is one of the most important actuarial measures; it measures the variability of the loss in the tail beyond the VaR. The TV of an APExLLD random variable is derived in terms of the hypergeometric function and T = [1 + (x_q/b)^{-a}]^{-c}. Using equations (16), (17) and (18), we get the TV of the APExLLD.

Tail variance premium
The TVP is another important measure that plays an essential role in insurance sciences. The TVP of an APExLLD random variable is $TVP_q = TVaR_q + \lambda\, TV_q$, where 0 < λ < 1.
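The four risk measures can be approximated numerically from the quantile function alone, as in the sketch below. It assumes the standard definitions TVaR_q = E[X | X > VaR_q], TV_q = E[X^2 | X > VaR_q] − TVaR_q^2 and TVP_q = TVaR_q + λ TV_q, and it uses a midpoint rule in place of the paper's closed-form expressions. The function names and parameter values are illustrative assumptions.

```python
import math

def tail_integral(quantile, q, power, steps=100000):
    """Midpoint-rule estimate of the integral of Q(p)**power over (q, 1)."""
    h = (1.0 - q) / steps
    return h * sum(quantile(q + (k + 0.5) * h) ** power for k in range(steps))

def risk_measures(quantile, q, lam):
    """Return (VaR, TVaR, TV, TVP) at level q with loading 0 < lam < 1."""
    var = quantile(q)
    tvar = tail_integral(quantile, q, 1) / (1.0 - q)
    second = tail_integral(quantile, q, 2) / (1.0 - q)  # E[X^2 | X > VaR]
    tv = second - tvar ** 2
    return var, tvar, tv, tvar + lam * tv

def apexll_quantile(p, alpha, a, b, c):
    """Assumed APExLLD quantile function (alpha != 1)."""
    u = math.log(1.0 + p * (alpha - 1.0)) / math.log(alpha)
    return b * (u ** (-1.0 / c) - 1.0) ** (-1.0 / a)

# The TV is finite only when a > 2; convergence slows as the tail gets heavier.
var_q, tvar_q, tv_q, tvp_q = risk_measures(
    lambda p: apexll_quantile(p, 0.5, 4.5, 1.0, 1.5), 0.9, 0.5
)
print(var_q, tvar_q, tv_q, tvp_q)
```

Working on the probability scale keeps the integration range finite, which is convenient for heavy-tailed models even though the integrand is unbounded near p = 1.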

Numerical study of actuarial measures
In this subsection, we present some results for the VaR, TVaR, TV and TVP of the APExLLD, APLLD, ExLLD and exponential distribution (ED) for different parameter values. These results are obtained as follows:
1. Random samples of size n = 100 are generated from each of the distributions used, and the parameters are estimated via the maximum likelihood method.
2. One thousand repetitions are performed, and the averages of the VaR, TVaR, TV and TVP of the four distributions are then calculated.
Tables 3 and 4 report the simulated results of the VaR, TVaR, TV and TVP for the APExLLD along with the other compared distributions. These results are also depicted graphically in Figures 6 and 7. A model with higher values of the VaR, TVaR, TV and TVP is said to have a heavier tail. The simulated results in Tables 3 and 4 and the plots in Figures 6 and 7 show that the APExLLD has higher values of these risk measures than the other compared distributions. Hence, the APExLLD has a heavier tail than the other distributions and can be used effectively to model heavy-tailed insurance data.
Figure 6. Plots of the VaR, TVaR, TV and TVP using the results in Table 3.
Figure 7. Plots of the VaR, TVaR, TV and TVP using the results in Table 4.

Applications
In this section, we illustrate the importance and superiority of the APExLLD using three real data sets from three different fields. The first data set consists of 40 observations and gives the time to failure (10^3 h) of a turbocharger of one type of engine; it was studied by Nassar et al. [42]. The second data set consists of 72 observations and represents the exceedances of flood peaks of the Wheaton River near Carcross in Yukon Territory, Canada; it was studied by Mansour et al. [40]. The third data set, from the insurance field, represents monthly metrics on unemployment insurance from July 2008 to April 2013 from the Department of Labor, Licensing and Regulation. It consists of 58 observations and 21 variables, of which we studied variable number 6. It is available at: https://catalog.data.gov/dataset/unemployment-insurance-data-july-2008-to-april-2013. The values of the three real data sets are given in Table 5. To compare the APExLLD with competing models, we use well-known statistics such as the Anderson-Darling (A), Cramér-von Mises (W) and Kolmogorov-Smirnov (KS) statistics, the last with its P-value. The smaller these statistics are, the better the model fits the data set. The distributions compared with the APExLLD are listed in Table 16 along with their authors and abbreviations. The CDFs of these distributions and Table 16 are provided in Appendix A. The parameter estimates and the statistics A, W and KS (with P-value) were computed using the R language. Table 6 presents some descriptive statistics for the three real data sets. Numerical values of A, W, KS, the KS P-value (KSP) and the maximum likelihood estimates (MLEs) with their standard errors (SEs) in parentheses are given for the real data sets in Tables 7, 8 and 9, respectively. The APExLLD has the lowest values of A, W and KS among all compared distributions, which gives it the superiority for fitting two real data sets.
Different methods of estimation were used to determine the negative log-likelihood, the estimated parameters, A, W, KS and KSP for the APExLLD; the results are presented in Table 10 for the three real data sets. Figure 8 presents the fitted PDF, CDF, survival function (SF) and P-P plot of the APExLLD for the three real data sets. Figure 12 shows that the values of the estimated parameters maximize the log-likelihood function for the three real data sets. The P-P plots and histograms of the three data sets with the fitted APExLLD density for the various estimation methods are shown in Figures 9, 10 and 11, respectively, and support the results in Table 10. Figure 13 provides the total time on test (TTT) curves and plots of the estimated HRF of the APExLLD for the three data sets; for more details about the TTT, see Aarset [1].
Figure 8. The fitted APExLLD PDF, CDF, SF, and P-P plots for the three real data sets.
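As an illustration of one of the comparison statistics, the Kolmogorov-Smirnov distance between a fitted CDF and the empirical CDF can be computed as in the sketch below. This is the textbook definition and not the R routine used in the paper; the function names, the sample and the candidate parameters are hypothetical.

```python
def ks_statistic(cdf, data):
    """Largest gap between the fitted CDF and the empirical CDF."""
    xs = sorted(data)
    n = len(xs)
    # The empirical CDF jumps at each observation, so both sides of the jump are checked.
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n) for i, x in enumerate(xs, start=1)
    )

def apexll_cdf(x, alpha, a, b, c):
    """Assumed APExLLD CDF (alpha != 1)."""
    G = (1.0 + (x / b) ** (-a)) ** (-c)
    return (alpha ** G - 1.0) / (alpha - 1.0)

observations = [0.4, 0.9, 1.7, 3.2]  # hypothetical observations
print(ks_statistic(lambda x: apexll_cdf(x, 0.5, 2.0, 1.0, 1.5), observations))
```

A smaller distance indicates a closer agreement between the fitted model and the data, which is the sense in which the statistics in the tables are compared.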

Conclusion
We propose a heavy-tailed model called the alpha power exponentiated log-logistic distribution, which is a relatively new distribution obtained as an extension based on the alpha power transformation. Many of its statistical properties are derived. Its hazard rate function has different shapes such as upside-down, bathtub, down-upside-down, increasing, J-shaped and reversed-J shaped. Different estimation methods are derived and used to study the behaviour of the parameter estimates of the proposed model on randomly generated data sets, and the performance of these methods is compared. Risk measures for the proposed model are determined; these measures indicate that the proposed model has a heavier tail than related and well-known models. Finally, the superiority and importance of the proposed model are illustrated using three real data sets from three different fields, which show that the proposed model fits these data sets better than the other twelve models. We hope that this model will be used for data analysis in many different fields such as economics, engineering and geology.
Figure 10. The P-P plots and histogram of the second data set with the fitted APExLLD density for estimation methods.
Figure 11. The P-P plots and histogram of the third data set with the fitted APExLLD density for estimation methods.