Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Journal Issue Information

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 1-24
Measures:
  • Citations: 0
  • Views: 547
  • Downloads: 0
Abstract: 

Introduction
With the advent of big data over the last two decades, the need to integrate databases in order to build a stronger evidence base for policy and service development is felt more than ever. Familiarity with data linkage methodology, as one of the data integration methods, and with machine learning methods that facilitate the record linkage process, is therefore essential.

Material and Methods
The record linkage process has five major steps: data pre-processing, indexing, record pair comparison, classification and evaluation. There are two key methods for linking records, exact and probabilistic record linkage. Exact linkage uses a unique identifier that is present in both files to link records; when such a unique identity number exists in the different data sources, record linkage is easy to implement. Where a unique identifier is not available, or is not of sufficient quality, probabilistic record linkage is used instead: it checks the similarity of the features common to both files to find records that are likely to belong to the same person. Classifying the compared record pairs based on their comparison vectors is a two-class (match or non-match) or three-class (match, non-match or potential match) classification task. In traditional data integration approaches, record pairs are classified into one of three classes rather than only matches and non-matches, and a manual clerical review is required to decide the final match status. Most research on record linkage in the past decade has concentrated on improving the classification accuracy of record pairs, and various machine learning techniques, both unsupervised and supervised, have been investigated. In this paper, in addition to introducing the record linkage process and some related methods, machine learning algorithms are used to increase the speed of database integration, reduce costs and improve record linkage performance. Most classification techniques, such as the support vector machine, the decision tree and the bagging method, classify each compared record pair individually and independently of all other record pairs. From the classification point of view, each compared record pair is represented by its comparison vector, which contains the individual similarity values calculated in the comparison step. These comparison vectors correspond to the feature vectors used to train a classification model and to classify record pairs with unknown match status.

Results and Discussion
In this paper, two databases, from the Statistical Center of Iran and the Social Security Organization, are linked. Three classification techniques, the support vector machine, the decision tree and the bagging method, were used for data integration. In addition, ROC curves were plotted to find the best classification method. The results showed that the support vector machine and the decision tree performed better than the bagging method.

Conclusion
Statistical organizations are challenged by the need to integrate diverse sets of inconsistent data and produce stable outputs. Instead of making the best possible statistics from a single data source, finding the best combination of sources is necessary to deliver the indicators or statistics that most efficiently satisfy users' needs.
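To make the classification step concrete, the sketch below trains the three classifier families named in the abstract (support vector machine, decision tree, bagging) on synthetic comparison vectors and compares them by area under the ROC curve. All data, feature meanings and parameter settings are invented for illustration; this is not the paper's pipeline nor its Iranian administrative data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic comparison vectors: each row holds similarity scores in [0, 1]
# (e.g. name, birth date, address); the label says whether the pair is a true match.
n_pairs, n_features = 5000, 4
y = rng.integers(0, 2, size=n_pairs)
X = np.clip(rng.normal(loc=0.35 + 0.4 * y[:, None], scale=0.2,
                       size=(n_pairs, n_features)), 0.0, 1.0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = {
    "support vector machine": SVC(probability=True, random_state=0),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    match_prob = clf.predict_proba(X_te)[:, 1]   # score used for the ROC analysis
    print(f"{name}: AUC = {roc_auc_score(y_te, match_prob):.3f}")
```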

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 25-40
Measures:
  • Citations: 0
  • Views: 99
  • Downloads: 0
Abstract: 

Introduction
Convolutions of independent random variables arise in many applied areas, including applied probability, reliability theory, actuarial science, nonparametric goodness-of-fit testing and operations research. Since the distribution theory is quite complicated when the convolution involves independent and non-identical random variables, it is of great interest to investigate the stochastic properties of convolutions and derive bounds and approximations for some characteristics of interest in this setup. The results in this work show that, under a new condition, the convolution of two random variables is ordered according to stochastic orders such as the likelihood ratio order, the hazard rate order and the reversed hazard rate order. In general, let X1, X2 and X1*, X2* be independent random variables with X1 ≤lr X1* and X2 ≤lr X2*. Then it is not necessarily true that X1 + X2 ≤lr X1* + X2*; however, it does hold if these random variables have log-concave densities. This paper compares random variables from scale models according to the likelihood ratio, hazard rate and reversed hazard rate orders. A random variable X is said to belong to the scale family of distributions if it has distribution function F(λx) and density function λf(λx), where F is an absolutely continuous distribution function with density function f and λ > 0 is the scale parameter.

Material and Methods
The comparison of essential characteristics associated with the lifetimes of technical systems is an exciting topic in reliability theory, since it usually enables us to approximate complex systems with simpler designs and subsequently obtain various bounds for important ageing characteristics of the complex system. A convenient tool for this purpose is the theory of stochastic orderings.

Results and Discussion
This paper deals with stochastic comparisons of convolutions of random variables from scale models. Sufficient conditions are established for the likelihood ratio ordering and hazard rate ordering of these convolutions. The results established in this paper generalize some known results in the literature, and several examples are presented for further illustration.

Conclusion
Convolutions of independent random variables occur quite frequently in probability and statistics, stochastic activity networks, optics, acoustics, electrical engineering, physics, digital signal processing and insurance mathematics. Their stochastic properties are therefore important and have been discussed extensively in the literature. We obtained sufficient conditions to compare convolutions of random variables from the scale model with respect to the likelihood ratio and hazard rate orders. Recently, Amini-Seresht and Barmalzan (2020) studied ordering properties of parallel and series systems consisting of outlier scale components. They provided sufficient conditions on the parameter vectors for the likelihood ratio, hazard rate, reversed hazard rate and mean residual lifetime orders between the lifetimes of the series and parallel systems, respectively. A generalization of the present work to random variables within an outlier scale model framework will therefore be of interest; we are working on this problem and hope to report the findings in a forthcoming paper.
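For reference, the stochastic orders and the scale family used above can be written out explicitly. These are standard definitions; the symbol λ for the scale parameter is a reconstruction, since the original notation was lost in extraction.

```latex
% X has density f, distribution F and survival function \bar F; Y has g, G, \bar G.
X \le_{\mathrm{lr}} Y \iff \frac{g(x)}{f(x)} \ \text{is increasing in } x, \qquad
X \le_{\mathrm{hr}} Y \iff \frac{\bar G(x)}{\bar F(x)} \ \text{is increasing in } x, \qquad
X \le_{\mathrm{rh}} Y \iff \frac{G(x)}{F(x)} \ \text{is increasing in } x .

% Scale family generated by a baseline distribution F with density f:
F(x;\lambda) = F(\lambda x), \qquad f(x;\lambda) = \lambda f(\lambda x), \qquad \lambda > 0 .
```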

Author(s): Bazyari A. | ALIZADEH M.

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 41-62
Measures:
  • Citations: 0
  • Views: 84
  • Downloads: 0
Abstract: 

Introduction
This paper considers the collective risk model of an insurance company with constant initial surplus and premium, when the claims follow an Exponential distribution and the claim-number process follows a Poisson distribution. It is supposed that the reinsurance is of the excess of loss type, so that in the insurance portfolio a part of the total premium is the share of the reinsurer. A general formula for computing the infinite time ruin probability in the excess of loss reinsurance risk model is presented based on the classical ruin probability. The random variable of the total amount of the reinsurer's payment in the excess of loss reinsurance risk model is investigated, and explicit formulas for calculating the infinite time ruin probability in the excess of loss reinsurance model are proposed. Finally, the results are examined using numerical data for the Lindley and Exponential distributions.

Material and Methods
The infinite time ruin probability is computed in the collective risk model with constant initial surplus and premium when the claims follow an Exponential distribution and the claim-number process follows a Poisson distribution. Mathematical and statistical tools, for example the Laplace transform and the moment generating function of the claim amounts, are used, and some theorems are presented to compute the ruin probability. The primary method is to separate the ruin probability based on a certain threshold with a constant initial reserve. It is supposed that the reinsurance is of the excess of loss type.

Results and Discussion
Information about the company's status at any time is essential for its managers, and the primary tool is computing ruin probabilities. In the present paper, we compute the infinite time ruin probability in the classical risk model with constant initial reserve when the claim amounts follow an Exponential distribution and the reinsurance is of the excess of loss type, so that in the insurance portfolio a part of the total premium is the share of the reinsurer. The infinite time ruin probability is computed for various threshold and initial reserve values. The numerical results show that the infinite time ruin probability decreases as the initial reserve increases.

Conclusion
In studying an insurance risk model, knowing how the company's financial reserve may be expected to evolve over a certain period is essential. A common criterion to assess risks for an insurer is the ruin probability. Ruin is a principal technical term that does not necessarily mean that the company is bankrupt, but rather that bankruptcy is at hand and that the company should be prompted to take action to improve its solvency status. Ruin theory is the branch of applied probability that quantifies a firm's vulnerability to insolvency and ruin. Computing the ruin probability is a central topic in the insurance risk theory literature. The study of level crossing events is a standard topic of risk theory and has turned out to be a fruitful area of applied mathematics and statistics, as (depending on the model assumptions) often subtle applications of tools from real and complex analysis, functional analysis, asymptotic analysis and algebra are needed. The classical insurance risk model includes information about the premium income rate and the initial capital necessary to meet the expected claims costs. The ruin probability function depends on several quantities, and the results show that the infinite time ruin probability decreases as the initial reserve increases.
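As a rough companion to the analytical results described above, the sketch below approximates a finite-horizon ruin probability by Monte Carlo for a compound-Poisson surplus process with excess-of-loss reinsurance. The parameter names, the equal premium loading for insurer and reinsurer, and all numerical values are assumptions made for illustration; the paper itself derives the infinite-time probability in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

def ruin_probability_xl(u, lam, mu, theta, retention, horizon, n_paths=5_000):
    """Monte Carlo sketch of the finite-horizon ruin probability for a classical
    compound-Poisson risk model with excess-of-loss reinsurance.
    Claims are Exponential with mean `mu`, claim counts are Poisson with rate `lam`,
    the insurer retains min(X, retention), and (a simplifying assumption) insurer and
    reinsurer use the same loading `theta`, so the net premium rate is
    (1 + theta) * lam * E[min(X, retention)]."""
    retained_mean = mu * (1.0 - np.exp(-retention / mu))   # E[min(X, M)] for Exp(mean mu)
    premium_rate = (1.0 + theta) * lam * retained_mean
    ruined = 0
    for _ in range(n_paths):
        surplus, t = float(u), 0.0
        while True:
            wait = rng.exponential(1.0 / lam)              # time to next claim
            t += wait
            if t > horizon:
                break
            surplus += premium_rate * wait
            surplus -= min(rng.exponential(mu), retention) # insurer's retained share
            if surplus < 0.0:
                ruined += 1
                break
    return ruined / n_paths

# Example call (all numbers are illustrative):
# print(ruin_probability_xl(u=10.0, lam=1.0, mu=1.0, theta=0.2, retention=3.0, horizon=200.0))
```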

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 63-89
Measures:
  • Citations: 0
  • Views: 80
  • Downloads: 0
Abstract: 

Introduction
Graphical mixture models provide a powerful tool to visually depict the conditional independence relationships between high-dimensional heterogeneous data. In the study of graphical mixture models, the distribution of the mixture components is mostly taken to be multivariate normal with different covariance matrices; the resulting model is the Gaussian graphical mixture model (GGMM). The nonparanormal graphical mixture model (NGMM) was introduced by replacing the normality assumption with a semiparametric Gaussian copula, which extends both the nonparanormal graphical model and mixture models. This study proposes clustering based on the NGMM under two forms of ℓ1 penalty functions and compares its performance with clustering based on the GGMM in terms of cluster reconstruction and parameter estimation.

Material and Methods
Clustering based on the NGMM is performed via a penalized EM algorithm under conventional and unconventional forms of ℓ1 penalty functions (denoted NGMM0 and NGMM1, respectively), and its performance on Gaussian and non-Gaussian simulated data sets is compared with the Gaussian counterparts (denoted GGMM0 and GGMM1, respectively). Along with the conventional ℓ1 penalty, an alternative, unconventional penalty term is considered, which depends on the mixture proportions. Thus, the choice of mixture model distribution (Gaussian or nonparanormal) together with the choice of penalty function is the primary basis of comparison. To better compare the studied methods in terms of robustness against outliers, we considered deterministic and random contamination mechanisms. The proposed methodology is applied to the Wisconsin diagnostic breast cancer data set to classify cancer patients as malignant or benign.

Results and Discussion
The results of the simulation study on normal and nonparanormal datasets, in ideal and noisy settings, as well as the application to the breast cancer data set, show that clustering approaches based on the NGMM (NGMM0 and NGMM1) are more efficient and robust in recovering the true cluster assignments than clustering based on the GGMM (GGMM0 and GGMM1), whereas the unconventional PMLEs (GGMM1 and NGMM1) are more efficient in estimating the elements of the precision matrices than the conventional PMLEs (GGMM0 and NGMM0).

Conclusion
The performance of the clustering methods depends on the choice of penalty function and model, such that the combination of the nonparanormal graphical mixture model and the penalty term depending on the mixing proportions (NGMM1) is more accurate than the Gaussian counterparts in terms of cluster reconstruction and parameter estimation.
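The penalized EM algorithm of the paper is not reproduced here, but the following simplified two-step stand-in conveys the main ingredients: a rank-based nonparanormal transform, Gaussian-mixture clustering, and a per-cluster graphical lasso for sparse precision matrices. Function names, the truncation level and the penalty value `alpha` are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, rankdata
from sklearn.mixture import GaussianMixture
from sklearn.covariance import GraphicalLasso

def nonparanormal_transform(X):
    """Rank-based (Winsorized) Gaussian copula transform of each column."""
    n, p = X.shape
    # Winsorization level in the spirit of the nonparanormal literature (an assumed choice).
    delta = 1.0 / (4.0 * n ** 0.25 * np.sqrt(np.pi * np.log(n)))
    U = np.column_stack([rankdata(X[:, j]) / (n + 1.0) for j in range(p)])
    return norm.ppf(np.clip(U, delta, 1.0 - delta))

def cluster_with_sparse_precisions(X, n_clusters=2, alpha=0.1):
    """Simplified two-step stand-in for penalized-EM clustering:
    transform -> Gaussian mixture -> per-cluster graphical lasso."""
    Z = nonparanormal_transform(X)
    gmm = GaussianMixture(n_components=n_clusters, covariance_type="full",
                          random_state=0).fit(Z)
    labels = gmm.predict(Z)
    precisions = []
    for k in range(n_clusters):
        gl = GraphicalLasso(alpha=alpha).fit(Z[labels == k])
        precisions.append(gl.precision_)   # sparse conditional-independence estimate
    return labels, precisions

# Usage sketch: labels, precisions = cluster_with_sparse_precisions(X, n_clusters=2)
```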

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 91-108
Measures:
  • Citations: 0
  • Views: 77
  • Downloads: 0
Abstract: 

Introduction
Fail-safe systems ((n−1)-out-of-n systems) are commonly used in many day-to-day applied structures. A fail-safe is a special design feature that responds when a failure occurs so that no harm happens to the system itself. The brake system of a train is an excellent example: the brakes are held in the off position by air pressure, and if a brake line splits or a carriage becomes separated, the air pressure is lost; in that case, the brakes are applied by a local air reservoir. Another classic example is an elevator, in which the brakes are held off the brake pads by tension; if the tension is lost, the brakes latch onto the rails in the shaft, preventing the elevator from falling. There are many other such fail-safe systems in common use. Balakrishnan et al. (2015) established necessary and sufficient conditions for comparing two fail-safe systems with independent homogeneous exponential components in terms of the mean residual life, dispersive, hazard rate and likelihood ratio orders. Their results specifically showed how an (n−1)-out-of-n system consisting of heterogeneous components with exponential lifetimes can be compared with any (m−1)-out-of-m system consisting of homogeneous components with exponential lifetimes. Similarly, Zhang et al. (2019) presented sufficient (and necessary) conditions on the lifetimes of components and their survival probabilities from random shocks for comparing the lifetimes of two fail-safe systems in terms of the standard stochastic, hazard rate and likelihood ratio orders. Cai et al. (2017) compared, in the hazard rate order, second-order statistics arising from two sets of independent multiple-outlier proportional hazard rates (PHR) samples.

Material and Methods
The comparison of essential characteristics associated with the lifetimes of technical systems is an exciting topic in reliability theory, since it usually enables us to approximate complex systems with simpler systems and subsequently obtain various bounds for important ageing characteristics of the complex system. A convenient tool for this purpose is the theory of stochastic orderings.

Results and Discussion
This paper discusses the hazard rate order of (n−1)-out-of-n systems arising from two sets of independent multiple-outlier modified proportional hazard rates components. Under certain conditions on the parameters and the submajorization order between the sample size vectors, the hazard rate order between the (n−1)-out-of-n systems from multiple-outlier modified proportional hazard rates components is established.

Conclusion
In this paper, we have presented sufficient conditions for the hazard rate order between fail-safe systems. It will be of great interest to generalize the current work from lifetimes of fail-safe systems to those of k-out-of-n systems. Another problem of interest is to consider the setting of general systems with several subsystems having dependent components and extend the results established here to this general case. We are currently working on these problems and hope to report those findings in a future paper.
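Two standard facts behind the comparison above, stated explicitly: the hazard rate order, and the survival function of a fail-safe ((n−1)-out-of-n) system with independent component lifetimes, which survives time t exactly when at most one component has failed by t.

```latex
h_X(t) = \frac{f(t)}{\bar F(t)}, \qquad
X \le_{\mathrm{hr}} Y \iff h_X(t) \ge h_Y(t) \ \text{for all } t .

\Pr\!\left(X_{2:n} > t\right)
  = \prod_{j=1}^{n} \bar F_j(t)
  + \sum_{i=1}^{n} F_i(t) \prod_{j \ne i} \bar F_j(t) .
```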

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 109-126
Measures:
  • Citations: 0
  • Views: 55
  • Downloads: 0
Abstract: 

Introduction
The exponential distribution, because of its memoryless property, has a central role in reliability theory and survival analysis. However, due to its constant hazard rate function, it is not a suitable model for many data sets arising in practical situations. For this reason, several generalizations of the exponential distribution exist in the literature. The generalized exponential distribution is obtained by adding a shape parameter to the exponential distribution via the exponentiation method; it admits both increasing and decreasing hazard rate functions. For more information on the generalized exponential distribution and its applications, one can refer to Gupta and Kundu (2007) and Nadarajah (2011). Comparisons of parallel systems with two independent heterogeneous exponential components have been studied extensively in the literature. Boland et al. (1994) proved that the hazard rate order between two parallel systems holds under the majorization order between the vectors of the hazard rate parameters. This result was extended to the likelihood ratio order by Dykstra et al. (1997). In this direction, Zhao and Balakrishnan (2012) obtained some characterization results concerning the hazard rate and likelihood ratio orders using the p-larger and weak majorization orders between the vectors of hazard rate parameters. Yan et al. (2012) established sufficient conditions to compare two parallel systems in the hazard rate and likelihood ratio orders. The present work provides a sufficient condition for comparing, in the likelihood ratio order, parallel systems comprising two independent heterogeneous generalized exponential components.

Material and Methods
The comparison of essential characteristics associated with the lifetimes of technical systems is an exciting topic in reliability theory, since it usually enables us to approximate complex systems with simpler systems and subsequently obtain various bounds for important ageing characteristics of the complex system. A convenient tool for this purpose is the theory of stochastic orderings.

Results and Discussion
Consider two parallel systems whose component lifetimes follow a generalized exponential distribution. In this paper, based on the shape and scale parameters included in the distribution of one of the systems, we introduce a region such that if the vector of scale parameters of the other parallel system lies in that region, then the likelihood ratio ordering between the two systems holds. An extension of this result is also presented for the case when the component lifetimes follow an exponentiated Weibull distribution.

Conclusion
In this paper, based on the shape and scale parameter vectors involved in the lifetime distribution of a parallel system consisting of two independent heterogeneous generalized exponential components, a region is obtained such that if the scale parameter vector of another parallel system lies in this region, then the likelihood ratio order between the systems holds.
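For concreteness, the generalized exponential distribution and the two comparisons used above can be written as follows; these are standard forms, with α the shape parameter and λ the rate parameter.

```latex
% Generalized exponential distribution function:
F(x;\alpha,\lambda) = \left(1 - e^{-\lambda x}\right)^{\alpha}, \qquad x > 0,\ \alpha,\lambda > 0 .

% Lifetime of a parallel system of two independent GE components (maximum of the two):
F_{\max}(x) = \left(1 - e^{-\lambda_1 x}\right)^{\alpha_1}\left(1 - e^{-\lambda_2 x}\right)^{\alpha_2} .

% Likelihood ratio order between lifetimes T_1 (density f_1) and T_2 (density f_2):
T_1 \le_{\mathrm{lr}} T_2 \iff \frac{f_2(x)}{f_1(x)} \ \text{is increasing in } x .
```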

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 127-148
Measures:
  • Citations: 0
  • Views: 60
  • Downloads: 0
Abstract: 

Introduction
Due to its simplicity, the autoregressive process is widely used in practice. In some cases, we face restrictions in estimating the parameter for predicting the expected value of time series models and cannot use an optimal fixed sample size procedure. One appropriate approach to tackle this problem is to use sequential methods. Sequential procedures are characterized by their stopping rule: sampling is stopped as soon as a pre-defined stopping criterion is satisfied. Many authors have studied sequential sampling procedures, namely purely sequential and two-stage procedures. We are also interested in investigating the performance of the modified two-stage procedure, since it is one of the most important and widely used sequential methods and offers operational savings. The modified two-stage approach provides a strategy for determining the initial sample size in the two-stage procedure that, in many cases, prevents overestimation of the final sample size. Its advantages include simplicity of implementation and a reduction of the two-stage procedure's weakness in estimation. Point estimation is studied based on the least-squares estimator as the reciprocal of the cost per observation tends to infinity, and interval estimation as the width of the confidence interval goes to zero. We present the performance of the procedure through theorems that establish its asymptotic properties, including asymptotic risk efficiency, asymptotic efficiency and asymptotic consistency.

Material and Methods
We conduct Monte Carlo simulation studies to investigate the performance of the procedures based on least-squares estimators. The performance of the estimators and confidence intervals is evaluated through a simulation study. We report the results in terms of the stopping variables, the ratio of the average stopping variable to the optimal fixed sample size, the root mean square error (RMSE) of the estimators, and the risk efficiency functions. Furthermore, real time series data are considered to illustrate the applicability of the modified two-stage procedure.

Results and Discussion
The simulation results confirm the theoretical results and show the procedure's effectiveness compared to the optimal fixed sample size. The real data results also show the excellent performance of this procedure in practice.

Conclusion
The modified two-stage procedure, with its operational savings, is easier to implement than the most commonly used purely sequential procedure, and it is more accurate than the two-stage procedure. The procedure performs well and is an excellent candidate for analyzing time series models.
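As an illustration of the two-stage idea (not the paper's modified rule, whose first-stage size is chosen from the design constants), the sketch below builds a fixed-width confidence interval for an AR(1) coefficient from a pilot stage followed by a second stage, using the asymptotic variance (1 − ρ²)/n of the least-squares estimator. All names and default values are assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def simulate_ar1(n, rho, x0=0.0, sigma=1.0):
    """Generate n new observations of x_t = rho * x_{t-1} + e_t starting from x0."""
    x = np.empty(n + 1)
    x[0] = x0
    for t in range(1, n + 1):
        x[t] = rho * x[t - 1] + rng.normal(scale=sigma)
    return x[1:]

def ls_estimate(x):
    """Conditional least-squares estimator of the AR(1) coefficient."""
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2))

def two_stage_fixed_width_ci(rho_true, half_width=0.05, alpha=0.05, pilot=50):
    """Illustrative two-stage fixed-width interval for the AR(1) coefficient:
    stage 1 (pilot) estimates the variance of the LS estimator, stage 2 collects
    enough further observations to reach the requested half-width."""
    z = norm.ppf(1.0 - alpha / 2.0)
    x = simulate_ar1(pilot, rho_true)                       # stage 1 (pilot sample)
    rho_hat = ls_estimate(x)
    n_final = max(pilot, int(np.ceil((z / half_width) ** 2 * (1.0 - rho_hat ** 2))) + 1)
    extra = simulate_ar1(n_final - pilot, rho_true, x0=x[-1])  # stage 2 observations
    x = np.concatenate([x, extra])
    rho_hat = ls_estimate(x)
    return rho_hat, (rho_hat - half_width, rho_hat + half_width), n_final

# Example: rho_hat, ci, n = two_stage_fixed_width_ci(0.6)
```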

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 149-163
Measures:
  • Citations: 0
  • Views: 51
  • Downloads: 0
Abstract: 

Introduction
Using copula functions is a particular way of modelling variables and their dependency, and the use of copulas for estimating dependence parameters has become popular in recent decades. As a semiparametric technique, Berahimi and Necir (2012) introduced the copula moment (CM) method and compared it with the PMLE and the inversion methods, while Kojadinovic and Yan (2010) used three semiparametric methods based on copula models. Taheri et al. (2018) studied the dependence of bivariate copulas in the presence of outliers. This article uses three moment-based estimation methods in the presence of outliers: the moment method (MM), the copula moment (CM) method and their mixture.

Material and Methods
Let (X, Y) be a random vector with copula function C and dependence parameter θ, and let (X1, Y1), ..., (Xn, Yn) be a random sample from (X, Y). We assume that the random vector (X, Y) is observed in the presence of outliers. In other words, n − k elements of the random sample (where k is an unknown random integer) have the true copula function C1 with dependence parameter θ, while the remaining elements have another copula function C2 whose dependence parameter is contaminated by an unknown real value ε, called the noise parameter. The copula functions C1 and C2 can have completely different structures. For estimating θ in the presence of outliers, we may obtain the joint density function of the random sample. A simulation study is used to select the best estimation method, and the estimators are compared based on their MSEs. To illustrate the results of the simulation study, we also consider a real example concerning the tourists who visit the "Tomb of Ayub Prophet" (TAP) and the "Imamzadeh Asgari Tomb" (IAT) in North Khorasan, Iran. Here, we test the dependence between the numbers of visitors to TAP and IAT at weekends and holidays.

Results and Discussion
To estimate the parameters and choose a suitable value of ε for the FGM copula in the presence of outliers, we test various values of ε in the likelihood function for copulas in the presence of outliers. Using 1000 independent repetitions of the likelihood function for the FGM copula in a simulation study suggests that ε = 0.1 is the best value. The dependence parameter is then estimated by substituting ε = 0.1 into the likelihood function of copulas in the presence of outliers. The simulation results show that the CM and mixture methods yield reasonable empirical MSEs in the presence of outliers, and CM is the best estimator in terms of MSE.

Conclusion
Taheri et al. (2018) showed that the best methods for estimating the dependence parameter in the presence of outliers are the MLE, the PMLE, the inversion estimators and CM, respectively. In this paper, among the three moment-based methods MM, Mixture and CM, the CM method is the best according to their MSEs. From a practical point of view, it was also concluded that the estimates of the dependence parameter in the presence of outliers do not differ greatly between the MLE and CM methods.
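A minimal moment-type illustration for the FGM copula: sampling from C(u,v) = uv[1 + θ(1−u)(1−v)] by conditional inversion, contaminating the sample with independent "outlier" pairs, and estimating θ from the identity ρ_S = θ/3. This is a simple stand-in rather than the paper's CM, MM or mixture estimators, and the contamination scheme is an assumption.

```python
import numpy as np
from scipy.stats import spearmanr

def sample_fgm(n, theta, rng):
    """Draw n pairs from the FGM copula C(u,v) = u v [1 + theta (1-u)(1-v)]
    by inverting the conditional distribution of V given U = u."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    a = theta * (1.0 - 2.0 * u)
    small = np.abs(a) < 1e-12
    a_safe = np.where(small, 1.0, a)           # avoid division by ~0; those cases use v = w
    v = (1.0 + a - np.sqrt((1.0 + a) ** 2 - 4.0 * a * w)) / (2.0 * a_safe)
    return u, np.where(small, w, v)

def fgm_moment_estimate(u, v):
    """Moment-type estimate of the FGM parameter via Spearman's rho = theta / 3
    (a simple stand-in for the CM estimator, not the paper's exact procedure)."""
    rho_s = spearmanr(u, v)[0]
    return float(np.clip(3.0 * rho_s, -1.0, 1.0))

# Contaminated sample: most pairs from FGM(theta = 0.6), a few outlier pairs from
# an independence copula (both choices are illustrative assumptions).
rng = np.random.default_rng(2)
u1, v1 = sample_fgm(950, theta=0.6, rng=rng)
u2, v2 = rng.uniform(size=50), rng.uniform(size=50)
u, v = np.concatenate([u1, u2]), np.concatenate([v1, v2])
print(fgm_moment_estimate(u, v))
```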

Author(s): OBEIDI R. | NASIRI P.

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 165-188
Measures:
  • Citations: 0
  • Views: 89
  • Downloads: 0
Abstract: 

Introduction
In lifetime studies, the failure of the unit/item under study may be caused by different components of the same type, and the cause is not fully observed. In this case, the failure time of the unit/item is recorded and evaluated, based on the information obtained from the observation, as the minimum value among the components affecting failure; the experimenter cannot identify the component that led to the unit's failure. In the study of series systems, the minimum component lifetime among the effective components leads to failure and is the value observed. In the recent literature, Adamidis and Loukas (1998) used the geometric distribution for the number of failure components and introduced a two-parameter exponential-geometric distribution with a decreasing failure rate. In applying compounded lifetime distributions, the experimenter may also face censoring, because there are cases in which units/items, although still alive, are lost or removed. In this study, Type-II censoring is investigated. Recently, the inverse Weibull distribution with censored data has been studied by Ateya (2017) and Singh and Tripathi (2018). This paper presents the inverse Weibull-Poisson distribution for series systems under Type-II censored sampling.

Material and Methods
This paper considers classical and Bayesian estimation of the parameters of the inverse Weibull-Poisson distribution under Type-II censoring. Since the normal equations cannot be solved analytically, the EM algorithm is used as a numerical method to obtain the maximum likelihood estimates; Little and Rubin (1983) showed that the EM algorithm is more reliable than the Newton-Raphson method in the case of incomplete data. The Fisher information matrix for the censored data is then obtained following the principle of Louis (1982), from which approximate confidence intervals can be calculated. Parameters are also estimated under the squared error and LINEX loss functions with Gamma prior distributions. In the Bayesian setting, since the posterior distribution is not obtained in closed form, the parameters are estimated with Markov chain Monte Carlo techniques and samples are generated by Gibbs sampling via the Metropolis-Hastings algorithm. Finally, Bayesian confidence intervals are obtained using the method of Kundu (2008), and HPD intervals are constructed with the method of Chen and Shao (1999).

Results and Discussion
To evaluate the performance of the estimators in terms of their MSEs and the corresponding confidence intervals, 10000 samples are generated for different sample sizes and three censoring schemes.

Conclusion
The simulation results show that, for a fixed sample size, as the number of censored observations decreases the parameter estimates move closer to the true values and the MSEs are reduced. Moreover, for a fixed censoring scheme, the MSEs of the parameters decrease as the sample size increases. For the 30% censoring scheme, the Bayesian estimators under the squared error loss function have small MSEs, while the maximum likelihood estimators have small MSEs for the 10% censoring scheme. The results for the confidence intervals show that, for a fixed sample size, the lengths of the confidence intervals decrease as the number of censored observations decreases. Moreover, the classical confidence intervals have the shortest length for all censoring schemes, and the HPD intervals are shorter than the Bayesian confidence intervals.
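The inverse Weibull-Poisson likelihood and the EM/MCMC machinery of the paper are not reproduced here; as a simpler stand-in, the sketch below maximizes the Type-II censored likelihood of a plain inverse Weibull model numerically, which illustrates how the r observed failures and the n − r censored units enter the likelihood. The parameterization F(x) = exp(−a x^(−b)) and all numerical values are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def inv_weibull_logpdf(x, a, b):
    """log f(x) for an inverse Weibull with F(x) = exp(-a * x**(-b))."""
    return np.log(a) + np.log(b) - (b + 1.0) * np.log(x) - a * x ** (-b)

def inv_weibull_logsf(x, a, b):
    """log S(x) = log(1 - exp(-a * x**(-b)))."""
    return np.log1p(-np.exp(-a * x ** (-b)))

def type2_censored_mle(x_obs, n):
    """MLE from a Type-II censored sample: the r smallest of n lifetimes are observed,
    the remaining n - r are censored at the largest observed failure time."""
    r = len(x_obs)
    x_sorted = np.sort(x_obs)
    def neg_loglik(params):
        a, b = np.exp(params)                         # optimize on the log scale: a, b > 0
        ll = inv_weibull_logpdf(x_sorted, a, b).sum() # observed failures
        ll += (n - r) * inv_weibull_logsf(x_sorted[-1], a, b)  # censored units
        return -ll
    res = minimize(neg_loglik, x0=np.zeros(2), method="Nelder-Mead")
    return np.exp(res.x)

# Example: n = 60 units, the smallest r = 40 lifetimes observed (Type-II censoring).
rng = np.random.default_rng(4)
a_true, b_true, n, r = 1.5, 2.0, 60, 40
x_full = (-np.log(rng.uniform(size=n)) / a_true) ** (-1.0 / b_true)
print(type2_censored_mle(np.sort(x_full)[:r], n))
```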

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 189-207
Measures:
  • Citations: 0
  • Views: 47
  • Downloads: 0
Abstract: 

Introduction
In analyzing labor market statistics, policies and plans are usually based on changes in stock variables. In countries where labor force surveys are based on rotational samples, with the same people observed in different periods, the number of people changing status can be estimated. Flow statistics present the inflow and outflow of the labor force and the number of people changing labor force status between periods. One of the essential non-sampling errors in labor force statistics is response error, which may lead to the incorrect classification of individuals. To use flow statistics, their classification error must be quantified so that analyses of flow statistics can take this error into account. Usually, the classification error of flow statistics is estimated using a re-interview method, which is costly and complicated. In this paper, while presenting the process of estimating flow statistics and appropriate models for calculating the classification error, the feasibility of each model is examined according to the sample rotation pattern in Iran.

Material and Methods
Markov latent class (MLC) models for panel data analysis exploit the repeating nature of panel surveys to extract classification error information. MLC analysis may be the only way to evaluate flow error in panel surveys where a re-interview program is not possible. Let At be the observed classification at wave t and Xt denote the unobserved true value at wave t; the cross-classification of the variable A over three waves is denoted by A1A2A3. The MLC model contains two components: (1) the structural component, the joint probabilities of the true states X1, X2, X3 (together with the model covariates, or grouping variables), which represents the time-to-time transitions among the true classifications; and (2) the error component, the conditional probabilities of the observed classifications A1, A2, A3 given the true states (and other model covariates), which represents the deviations between the true and observed classifications at each wave t = 1, 2, 3. To reduce the number of parameters, the transition probabilities can be assumed to be stationary. The Iranian Labour Force Survey (LFS) is a seasonal multi-stage survey in which weighting is carried out in three stages. To estimate changes between periods without losing efficiency in current level estimation, a 2-2-2 rotation sampling design is used. In this paper, we apply the MLC model to estimate the flow error in the Labour Force Survey of Iran for 2019 and 2020.

Results and Discussion
We considered four different scenarios to evaluate the LFS flow error, due to the limitations of the 2-2-2 rotation pattern. The observations of four sampling periods were considered for each sample unit, assuming no change in status between the second and third periods. The flow statistics of four different time sequences were also studied to assess seasonal effects on the flow error. The results show that the highest error is associated with the unemployed status and the lowest error with the inactive status.

Conclusion
Given the sample rotation pattern of the Iranian LFS, it is possible to estimate flow statistics from the labor force data. Non-sampling error is one of the most influential errors in flow statistics; one of the essential non-sampling errors is response error, which may lead to misclassification of individuals into labor force statuses. MLC models can be used to estimate the classification error of flow statistics. According to the proposed model, it is possible to provide labor force flow statistics and their classification error without spending money on re-interviewing.
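The MLC structure described above corresponds to a standard hidden-Markov-type factorization: a first-order Markov chain for the true labor force states and classification errors that are independent across waves given the truth (local independence). In the notation of the abstract, for three waves:

```latex
\Pr(A_1=a_1,\,A_2=a_2,\,A_3=a_3)
  = \sum_{x_1,x_2,x_3}
    \pi_{x_1}\,\tau_{x_2\mid x_1}\,\tau_{x_3\mid x_2}
    \prod_{t=1}^{3} \Pr(A_t=a_t \mid X_t=x_t),
```

where π is the initial distribution of the true status, τ are the (possibly stationary) transition probabilities, and Pr(A_t | X_t) are the misclassification probabilities estimated as the flow error.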

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 209-238
Measures:
  • Citations: 0
  • Views: 58
  • Downloads: 0
Abstract: 

Introduction
A general censoring scheme called progressive Type-II right censoring is considered. The removal plan can be fixed or random, chosen according to a discrete probability distribution. In many practical problems, not only does the experimental process inevitably require random removals, but a fixed-removal assumption may also make some statistical inference results cumbersome to analyze. The scenario of random removals was introduced by Yuen and Tse (1996) under a Weibull lifetime distribution and a discrete Uniform distribution for the random removals. Tse et al. (2000) discussed Binomial removals, even though the parameter p strongly affects the experiment time and the Uniform and Binomial distributions are independent of the lifetime distribution. These limitations motivate us to propose a new method for determining removals based on the failure times.

Material and Methods
Let the lifetimes of the n units placed on the life test follow a two-parameter Weibull distribution. The proposed random removals use the relationship between the Weibull and Exponential distributions and are based on two approaches: normalized spacings with random coefficients and with fixed coefficients, derived from progressively Type-II censored order statistics of the Exponential distribution. Here, the time distance between consecutive failure times depends on the type of lifetime distribution, and the number of units removed after each failure is proportional to a root function of the difference between the last two failure times divided by the time of the first failure. The joint probability mass functions of the random removals are also derived. The parameters are estimated using different estimation procedures, namely the maximum likelihood, maximum product spacing and least-squares methods. The proposed random removal schemes are compared with the discrete Uniform and Binomial removal schemes via a Monte Carlo simulation study in terms of the biases and root mean squared errors of the estimators, the expected total test times and the Ratio of the Expected Experiment Time (REET) values. Finally, an innovative technique is introduced for deriving progressive Type-II censored samples from a real data set.

Results and Discussion
Comparing the REET values shows that only a slight reduction in expected experiment time occurs when a large number of units are tested under the Uniform and Binomial removal distributions with a considerable probability p, especially for cases with a decreasing failure rate. Although the Binomial distribution with p < 0.5 has relatively acceptable performance, the two proposed approaches have smaller REET values, which decrease significantly as the sample size n increases. Binomial removals nevertheless perform better than Uniform removals in terms of E(X_{m:m:n}), although the expected test time depends strongly on the value of the removal probability p.

Conclusion
It is shown that the expected total time under the random-coefficient approach has the smallest value among the considered approaches and reduces the expected total time on test.
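For readers unfamiliar with the censoring scheme itself, the sketch below simulates a progressive Type-II censored sample under a fixed removal plan: after the i-th observed failure, R_i of the surviving units are withdrawn at random. The Weibull lifetimes and the particular plan are illustrative; the paper's proposal, which ties the removals to normalized spacings of the failure times, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

def progressive_type2_sample(lifetimes, removals):
    """Simulate a progressive Type-II censored sample.
    `lifetimes`: the n latent failure times of the units on test;
    `removals`:  plan (R_1, ..., R_m) with n = m + sum(R_i).
    After the i-th observed failure, R_i of the surviving units are withdrawn at random."""
    assert len(lifetimes) == len(removals) + sum(removals)
    alive = list(lifetimes)
    observed = []
    for r in removals:
        i = int(np.argmin(alive))
        observed.append(alive.pop(i))           # i-th observed failure time X_{i:m:n}
        drop = set(rng.choice(len(alive), size=r, replace=False)) if r > 0 else set()
        alive = [x for j, x in enumerate(alive) if j not in drop]
    return np.array(observed)

# Example: n = 20 units, m = 8 observed failures, Weibull(shape = 1.5) lifetimes.
n, removals = 20, [2, 1, 0, 3, 0, 2, 1, 3]      # len(removals) + sum(removals) = n
x = progressive_type2_sample(rng.weibull(1.5, size=n), removals)
print(x)
```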

Issue Info:
  • Year: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 239-252
Measures:
  • Citations: 0
  • Views: 62
  • Downloads: 0
Abstract: 

Introduction
The clustering of high-dimensional data usually encounters problems such as the curse of dimensionality. To overcome such obstacles, dimensionality reduction methods are often used; they typically follow two approaches, variable selection and variable extraction. Recently, researchers have proposed a method that is claimed to lose less information when clustering high-dimensional data than other techniques. Among these, the Random Projections (RP) method presented by Anderlucci et al. (2021) is very popular. The RP method is based on creating random projections, selecting a small subset of them, and then performing the clustering task. In this article, this method is compared with conventional dimensionality reduction approaches on three gene expression datasets, using four standard clustering criteria: the adjusted Rand index, the Jaccard index, the Fowlkes-Mallows index and the accuracy index.

Material and Methods
One variable selection method is the variable selection approach for clustering based on the Gaussian model. On the other hand, principal components analysis is one of the most popular variable extraction methods. Another practical, new and interesting approach to dimensionality reduction is the Random Projections method. Using a group of random projections, Anderlucci et al. (2021) proposed a clustering algorithm for high-dimensional data. This algorithm obtains its final output through Gaussian mixture model clustering applied to an optimal subset of the random projections: the original high-dimensional data are mapped onto the reduced spaces, model selection criteria are calculated for them, and observations are clustered using the optimal projections.

Results and Discussion
In this paper, the methods proposed by Anderlucci et al. (2021) are described and compared on three gene expression datasets, concerning leukaemia, lymphoma and prostate cancer. Based on the results obtained with the introduced criteria, both competing methods have lower values than the random projections method and therefore show weaker performance. The final result is that the random projections method performs better for the three datasets considered. It should be noted that the purpose of the current study was only to compare clustering performance based on the three mentioned approaches and several clustering criteria; other analytical aspects of random projection were not considered and will be followed up in our future research.

Conclusion
Clustering of high-dimensional data faces various statistical challenges, and different methods exist to overcome the related problems. One practical tool is reducing the data dimension. This article examined random projection from both theoretical and practical aspects, evaluated its performance on three real data sets, compared it with other standard methods, and showed its superiority based on several conventional clustering indices. Future research could address the probabilistic aspects of the random projections approach by considering proper statistical inference methods.
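A simplified sketch of the random-projection idea (not the full ensemble procedure of Anderlucci et al. (2021), which combines an optimal subset of projections): draw several Gaussian random projections, fit a Gaussian mixture in each reduced space, keep the projection with the best BIC, and score the result with the adjusted Rand index. Dimensions, cluster counts and data here are synthetic assumptions.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

def rp_gmm_clustering(X, n_clusters, n_projections=50, dim=10, seed=0):
    """Cluster with the single best (by BIC) of several Gaussian random projections."""
    best_bic, best_labels = np.inf, None
    for i in range(n_projections):
        proj = GaussianRandomProjection(n_components=dim, random_state=seed + i)
        Z = proj.fit_transform(X)
        gmm = GaussianMixture(n_components=n_clusters, covariance_type="full",
                              random_state=seed).fit(Z)
        bic = gmm.bic(Z)                      # model selection criterion per projection
        if bic < best_bic:
            best_bic, best_labels = bic, gmm.predict(Z)
    return best_labels

# Synthetic illustration: 3 Gaussian clusters in p = 200 dimensions.
rng = np.random.default_rng(5)
centers = rng.normal(scale=2.0, size=(3, 200))
y_true = np.repeat([0, 1, 2], 100)
X = centers[y_true] + rng.normal(size=(300, 200))
labels = rp_gmm_clustering(X, n_clusters=3)
print("ARI:", adjusted_rand_score(y_true, labels))
```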
