Archive: Volume 4, Issue 1 (2007)

Issue Info: Year 2007, Volume 4, Issue 1, Pages 1-14
Measures: Citations 0, Views 763, Downloads 0, References 0
Abstract: 

Frame imperfection, non-response, and unequal selection probabilities always affect survey results. To compensate for the effects of these problems, Deville and Särndal (1992) introduced a family of estimators called calibration estimators. For these estimators we look for weights that are as close as possible to the design weights, with respect to a chosen distance function, while satisfying the calibration equations. In this paper, after introducing the generalized regression estimator, we explain the general form of calibration estimators. Then the special cases of calibration estimators that arise from different distance functions, practical aspects, and the results of comparing the methods are considered.
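As a rough illustration of the calibration idea described above (not code from the paper), the sketch below uses the chi-square distance, for which the calibrated weights have a closed form; the design weights, auxiliary matrix, and population totals are all hypothetical.

```python
import numpy as np

def calibrate_chi2(d, X, totals):
    """Chi-square-distance calibration: w = d * (1 + x'lambda), where lambda
    solves the calibration equations sum_k w_k x_k = totals."""
    d = np.asarray(d, dtype=float)            # design weights
    X = np.asarray(X, dtype=float)            # n x p matrix of auxiliary variables
    totals = np.asarray(totals, dtype=float)  # known population totals of X
    T = (X * d[:, None]).T @ X                # sum_k d_k x_k x_k'
    lam = np.linalg.solve(T, totals - d @ X)
    return d * (1.0 + X @ lam)

# hypothetical sample: 5 units, an intercept plus one auxiliary variable
d = np.array([10.0, 10.0, 10.0, 10.0, 10.0])
X = np.column_stack([np.ones(5), np.array([2.0, 3.0, 5.0, 7.0, 11.0])])
totals = np.array([60.0, 300.0])              # assumed known: N and total of x
w = calibrate_chi2(d, X, totals)
print(w @ X)                                  # reproduces the known totals exactly
```

With other distance functions the weights generally have no closed form and are found iteratively.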

Issue Info: Year 2007, Volume 4, Issue 1, Pages 15-28
Measures: Citations 1, Views 989, Downloads 0, References 2
Abstract: 

In statistics it is often assumed that sample observations are independent, but in practice observations are sometimes dependent on one another. Spatiotemporal data are dependent data whose correlation arises from their locations in space and time. Spatiotemporal models arise whenever data are collected across both time and space, so such models have to be analyzed in terms of their spatial and temporal structure. Usually a spatiotemporal random field {Z(s, t) : (s, t) ∈ D × T} is used for modeling spatiotemporal data, where D ⊂ R^d, d ≥ 1, is a spatial region and T ⊆ R is a time region. One of the fundamental problems in analyzing such data is prediction. In spatial statistics, assuming that the spatiotemporal random field Z(s, t) is stationary with finite variance at all coordinates (s, t) ∈ D × T and that the spatiotemporal covariance function C(h, u) = cov(Z(s, t), Z(s + h, t + u)) exists, the unknown value of the random field at a given location (s0, t0) is usually predicted by kriging, the best linear unbiased predictor. In practice the spatiotemporal covariance function is unknown, and a positive definite function must be fitted to its estimates. To ensure that a valid spatiotemporal covariance model is fitted to the data, one usually considers a parametric family whose members are known to be positive definite; a separable spatiotemporal covariance function can be decomposed into the sum or product of a purely spatial and a purely temporal covariance function. In this paper the product-sum model introduced by De Iaco et al. (2001) is used to determine the spatiotemporal correlation of the data.

In some applied problems, in addition to the values of the attribute of interest Z(s, t), additional information is available at each sample location, and the precision of prediction can be improved by using it. In this paper, to exploit this additional information in kriging, two techniques for spatiotemporal kriging of temperature are compared. The first technique, spatiotemporal ordinary kriging, is the simpler of the two and uses only information about temperature. The second technique, spatiotemporal kriging with external drift, also uses the relationship between temperature and height to aid the interpolation. It is shown that the behavior of the temperature predictions is physically more realistic when spatiotemporal kriging with external drift is used. The implementation of spatiotemporal kriging with external drift is then illustrated on a real problem involving the maximum and minimum temperatures of six provinces in Iran.
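As a small, hypothetical sketch of the product-sum covariance model mentioned above (not the authors' implementation), a valid spatiotemporal covariance can be assembled from purely spatial and purely temporal components; the exponential marginals and all parameter values are illustrative assumptions.

```python
import numpy as np

def exp_cov(lag, sill, practical_range):
    # exponential covariance model, a common valid marginal choice
    return sill * np.exp(-3.0 * np.abs(lag) / practical_range)

def product_sum_cov(h, u, k1=1.0, k2=0.5, k3=0.5,
                    spatial=(1.0, 200.0), temporal=(1.0, 30.0)):
    """Product-sum model: C(h, u) = k1*Cs(h)*Ct(u) + k2*Cs(h) + k3*Ct(u).
    With k1 > 0 and k2, k3 >= 0 the result is again a valid covariance,
    since products and sums of valid covariance functions are valid."""
    cs = exp_cov(h, *spatial)    # purely spatial covariance at spatial lag h (e.g. km)
    ct = exp_cov(u, *temporal)   # purely temporal covariance at time lag u (e.g. days)
    return k1 * cs * ct + k2 * cs + k3 * ct

# covariance between two observations 50 km and 7 days apart (hypothetical lags)
print(product_sum_cov(50.0, 7.0))
```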

Issue Info: Year 2007, Volume 4, Issue 1, Pages 29-46
Measures: Citations 0, Views 1179, Downloads 0, References 0
Abstract: 

In most situations the best estimator of a function of the parameter exists, but sometimes it has a complex form and we cannot compute its variance explicitly. A lower bound for the variance of an estimator is therefore one of the fundamentals of estimation theory, because it gives us an idea about the accuracy of an estimator. It is well known in statistical inference that the Cramer-Rao inequality establishes a lower bound for the variance of an unbiased estimator: under regularity conditions, the variance of any unbiased estimator cannot be smaller than a certain quantity. But one has no idea how sharp the inequality is, i.e., how close the variance is to the lower bound. An important inequality that followed the Cramer-Rao inequality is that of Bhattacharyya (1946, 1947). We introduce Bhattacharyya lower bounds for the variance of an estimator and show that the Bhattacharyya inequality gives a greater lower bound for the variance of an unbiased estimator of a parametric function, which becomes sharper and sharper as the order of the Bhattacharyya matrix increases.

We also study the structure and behavior of the Bhattacharyya bound for the natural exponential family (NEF), especially for the negative binomial and exponential distributions as members of the natural exponential family with quadratic variance function (NEF-QVF). Shanbhag (1972, 1979) showed that diagonality of the Bhattacharyya matrix characterizes the set of normal, Poisson, binomial, negative binomial, gamma, and Meixner hypergeometric distributions, which are members of the NEF-QVF. Following Blight and Rao (1974), we approximate, by a simulation study, the variance of an unbiased estimator of the parameter p in the negative binomial distribution and of the parameter (ε = -α/θ) in the exponential distribution. Furthermore, the structure and behavior of the Bhattacharyya bound for the natural exponential family with cubic variance function (NEF-CVF) is considered in two parts: (1) distributions in which the variance is a cubic function of θ and E(X) is a linear function of θ, e.g., the inverse Gaussian distribution, and (2) distributions in which the variance is a cubic function of E(X) but E(X) is not a linear function of θ, e.g., the Abel and Takács distributions. In the first part we find that the general form of the 5 x 5 Bhattacharyya matrix is

J = [ J11   0     0     0     0
      0     J22   J23   0     0
      0     J23   J33   J34   J35
      0     0     J34   J44   J45
      0     0     J35   J45   J55 ],

where Jrs = cov(f^(r)(X|θ)/f(X|θ), f^(s)(X|θ)/f(X|θ)) and f^(r)(X|θ) is the r-th order derivative of f(X|θ) with respect to θ. We calculate the 5 x 5 Bhattacharyya matrix for the inverse Gaussian distribution and evaluate different Bhattacharyya bounds for the variance of estimators of the failure rate and of the coefficient of variation of the inverse Gaussian distribution. In the second part, we calculate the 4 x 4 Bhattacharyya matrix for the Abel distribution and evaluate different Bhattacharyya bounds for the variance of an estimator of ρ(θ) = e^(-θ) for the Abel distribution. We also calculate the 2 x 2 Bhattacharyya matrix for the Takács distribution and evaluate different Bhattacharyya bounds for the variance of an estimator of τ(θ) = 1/θ for the Takács distribution. The variance of an estimator can be approximated by Bhattacharyya bounds when the order of the Bhattacharyya matrix is more than one; hence, via a simulation study, it is shown that this approximation is better than the approximation by the Cramer-Rao bound.
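The construction of Bhattacharyya bounds can be sketched symbolically. The example below is not from the paper: it uses the exponential distribution with a single observation and an illustrative target function τ(θ) = exp(-θ), and computes the order-1 (Cramér-Rao) and order-2 bounds c' J^(-1) c from the matrix Jrs defined above.

```python
import sympy as sp

x, theta = sp.symbols('x theta', positive=True)
f = theta * sp.exp(-theta * x)             # exponential density (rate form), one observation

k = 2                                       # order of the Bhattacharyya matrix
derivs = [sp.diff(f, theta, r) for r in range(1, k + 1)]

# J_rs = E[(f^(r)/f)(f^(s)/f)] = integral of f^(r)*f^(s)/f over the support
J = sp.Matrix(k, k, lambda r, s:
              sp.simplify(sp.integrate(derivs[r] * derivs[s] / f, (x, 0, sp.oo))))

tau = sp.exp(-theta)                        # illustrative parametric function to estimate
c = sp.Matrix([sp.diff(tau, theta, r) for r in range(1, k + 1)])

cramer_rao = sp.simplify(c[0] ** 2 / J[0, 0])               # order-1 bound
bhattacharyya_2 = sp.simplify((c.T * J.inv() * c)[0, 0])    # order-2 bound, never smaller
print(cramer_rao, bhattacharyya_2)
```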

Issue Info: Year 2007, Volume 4, Issue 1, Pages 47-70
Measures: Citations 0, Views 1641, Downloads 0, References 0
Abstract: 

Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that, by studying the sample, we may fairly generalize the results back to the population from which the units were chosen. To draw a sample from the underlying population, a variety of sampling methods can be employed, individually or in combination. Cut-off sampling is a procedure commonly used by national statistical institutes to select samples, and several types of cut-off sampling are employed in practice. In its simplest form, part of the target population is deliberately excluded from selection. For example, in business statistics it is not unusual to cut off (very) small enterprises from the sampling frame; indeed, it may be tempting not to spend resources on enterprises that contribute little to the overall results of the survey. In this case the frame and the sample are typically restricted to enterprises of at least a given size, e.g., a certain number of employees, and it is assumed that the contribution of the excluded part of the population is, if not negligible, at least small in comparison with the remaining population. In particular, cut-off sampling is used when the distribution of the values Y1, ..., YN is highly skewed and no reliable frame exists for the small elements. Such populations are often found in business surveys: a considerable portion of the population may consist of small business enterprises whose contribution to the total of a variable of interest (for example, sales) is modest or negligible, while at the other extreme the population often contains some giant enterprises whose inclusion in the sample is virtually mandatory in order not to risk a large error in an estimated total. One may decide in such a case to cut off (exclude from the frame, and thus from sample selection) the enterprises with few employees, say five or fewer. The procedure is not recommended if a good frame for the whole population can be constructed without excessive cost. This method may reduce the response burden for the small enterprises. On the other hand, this elementary form of cut-off sampling, which we refer to as type I cut-off sampling, may be considered a dirty method, simply because (i) the sampling probability is set equal to zero for some sampling units, so it can be considered a type of non-probability sampling design, and (ii) it leads to biased estimates. However, the use of cut-off sampling and its modified versions can be justified by many arguments: (a) it would cost too much, in relation to a small gain in accuracy, to construct and maintain a reliable frame for the entire population; (b) excluding the units of the population that contribute little to the aggregates to be estimated usually implies a large decrease in the number of units that have to be surveyed in order to reach a predefined accuracy level of the estimates; (c) restricting the frame population, and consequently the sample, reduces the problem of empty strata; and (d) the bias caused by the cut-off is deemed negligible.
In this paper we discuss different types of cut-off sampling, with more emphasis on analyzing type III cut-off sampling, which consists of take-all, take-some, and take-none strata. Roughly speaking, in the methods we discuss, the population is partitioned into two or three strata such that the units in each stratum are treated differently; in particular, a part of the target population is usually excluded a priori from sample selection. We discuss when cut-off sampling should be considered a permissible method and how to estimate the population mean or total under it using model-based, model-assisted, and design-based strategies. Theoretical results are given to show how the cut-off thresholds and the sample size should be chosen. Different error sources and their effects on the overall accuracy of the presented estimates are also addressed. The outline of the paper is as follows. In Section 2 we briefly discuss different types of cut-off sampling designs and some of their properties. In Section 3 we first introduce our notation and motivate the use of type III cut-off sampling. We then discuss estimation of the population mean (or total) based on ignoring the population units in the take-none stratum or modeling them using auxiliary information. We study the problem of ratio estimation of the population mean and of type III sample size determination (for a given precision of estimation) using design-based, model-based, and model-assisted strategies. In this section we also study the problem of threshold calculation and its approximation using different methods and under different conditions. Finally, in Section 4 we present a simulation study and compare the results obtained with those under the commonly used type I cut-off sampling and its modification.
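A minimal simulation sketch of the type III idea described above, under assumptions of our own (a hypothetical skewed population, thresholds set at arbitrary quantiles of the size variable, and a simple ratio model for the take-none stratum); it is not the estimator developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# a skewed business-like population: x = size (known for all units), y = x * noise
N = 10_000
x = rng.lognormal(mean=2.0, sigma=1.2, size=N)
y = x * rng.lognormal(mean=0.0, sigma=0.3, size=N)

t_low, t_high = np.quantile(x, [0.30, 0.99])   # hypothetical cut-off thresholds
take_none = x < t_low                          # excluded from the frame
take_all = x >= t_high                         # certainty (take-all) stratum
take_some = ~take_none & ~take_all             # sampled stratum

# SRS of n units from the take-some stratum, expanded by the stratum weight N_h / n_h
idx = np.flatnonzero(take_some)
n = 300
s = rng.choice(idx, size=n, replace=False)
ht_some = y[s].sum() * idx.size / n

# take-none total approximated by a ratio model using the known auxiliary x
r_hat = y[s].sum() / x[s].sum()
model_none = r_hat * x[take_none].sum()

est_total = y[take_all].sum() + ht_some + model_none
print(est_total, y.sum())                      # estimate vs. true population total
```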

Issue Info: Year 2007, Volume 4, Issue 1, Pages 71-90
Measures: Citations 0, Views 1378, Downloads 0, References 0
Abstract: 

Small area estimation has received much attention in recent years due to the growing demand for reliable small area statistics. Direct estimators may not provide adequate precision, because the sample size in small areas is seldom large enough. Hence, by employing models that use auxiliary information and area effects, one can increase the precision of direct estimators. Because area-level auxiliary information is more readily available, and because of the simplicity of area-level models and the possibility of evaluating their assumptions with survey data, the area-level model has become particularly important. Therefore, basic area-level models are studied extensively in this paper to derive empirical best linear unbiased prediction (EBLUP), empirical Bayes (EB), and hierarchical Bayes (HB) estimators under several different assumptions on the parameters. These models are used to obtain small area estimates of the mean household income in several provinces of Iran, namely Khorasan-e-Razavi, Hamedan, Lorestan, and Tehran. To assess the small area estimators, we used data on 1700 urban households living in those provinces from the 2006-2007 Household Income and Expenditure Survey. A sampling scheme was applied; the optimal total sample size was more than 400 units, but only 212 units were available. This shortage of sample leads to large MSEs, i.e., a small area problem. Three measures are used for comparing small area methods: average square error (ASE), average absolute relative bias (AARB), and average absolute bias (AAB). Because of non-normality and heterogeneity of variances, we used two transformations: the logarithm and the Box-Cox transformation. Our data analysis shows that the Box-Cox transformation works better than the logarithm transformation, i.e., the test statistic is more significant under this transformation; but the Box-Cox transformation causes large sampling variances, which in some cases results in non-convergence of the Gibbs sampler. Likewise, the HB approach gives better results than EBLUP and EB, and all of these approaches outperform the direct estimator, i.e., they have smaller values of ASE, AARB, and AAB.
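A minimal sketch of an EBLUP under a basic area-level (Fay-Herriot type) model, one common instance of the models discussed above; the moment-type estimator of the area-effect variance, the covariate, and all numbers are hypothetical, and this is not the authors' implementation.

```python
import numpy as np

def fh_eblup(y, X, D):
    """EBLUP under a basic area-level model y_i = x_i'beta + v_i + e_i,
    with v_i ~ (0, sig2v) and known sampling variances D_i."""
    y, X, D = (np.asarray(a, dtype=float) for a in (y, X, D))
    m, p = X.shape

    # moment-type estimate of the area-effect variance sig2v from OLS residuals
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ (X.T @ y))
    leverage = np.einsum('ij,jk,ik->i', X, XtX_inv, X)
    sig2v = max((resid @ resid - np.sum(D * (1.0 - leverage))) / (m - p), 0.0)

    # weighted (GLS) regression coefficient and shrinkage factors
    w = 1.0 / (sig2v + D)
    beta = np.linalg.solve((X * w[:, None]).T @ X, X.T @ (w * y))
    gamma = sig2v / (sig2v + D)
    return gamma * y + (1.0 - gamma) * (X @ beta)

# hypothetical example: 5 small areas, direct estimates y with known variances D
X = np.column_stack([np.ones(5), [1.0, 2.0, 3.0, 4.0, 5.0]])   # intercept + one covariate
y = np.array([12.0, 15.0, 13.0, 20.0, 22.0])
D = np.array([4.0, 1.0, 9.0, 2.0, 3.0])
print(fh_eblup(y, X, D))   # shrinks noisy direct estimates toward the regression fit
```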

Issue Info: Year 2007, Volume 4, Issue 1, Pages 91-107
Measures: Citations 0, Views 669, Downloads 0, References 0
Abstract: 

When comprehensive information about a topic is scattered among two or more data sets, using only one of them loses the information available in the others. Hence, it is necessary to integrate the scattered information into a single comprehensive data set. On the other hand, we are sometimes interested in recognizing duplications within a data set. The identification of duplications in a data set, or of the same entities in different data sets, is called record linkage. Linking data sets whose information is recorded in Persian involves special difficulties due to particular characteristics of Persian script, such as the connectedness of letters within words, the existence of several written forms for some letters, and the dependence of a letter's written shape on its position in a word. In this paper, the usual difficulties in linking data sets recorded in Persian are studied and some solutions are presented. We introduce compatible methods for preparing and preprocessing the files through standardization, blocking, and selection of identifier variables. A new method is proposed for dealing with missing data, which is a major problem in real-world applications of record linkage; the proposed method takes into account the probability of occurrence of missing data. We also propose an algorithm for increasing the number of comparable fields based on partitioning composite fields such as addresses. Finally, the proposed methods are used to link records of establishment censuses in a geographical region of Iran. The results show that taking the probability of missing data into account increases the efficiency of the record linkage process. In addition, using different codes and notations for data registration at different times leads to information loss. In particular, it is necessary to design a general pattern for writing addresses in Iran, considering geographical and environmental conditions.
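The general record-pair comparison step can be sketched as below. This is a generic illustration, not the method proposed in the paper: the fields, the agreement/disagreement weights, the normalization rules, and the neutral treatment of missing values are all hypothetical simplifications.

```python
from itertools import product

# toy records; field names, values, and weights are hypothetical
file_a = [{"id": 1, "postcode": "11369", "name": "اکبری", "phone": None}]
file_b = [{"id": 7, "postcode": "11369", "name": "اکبري", "phone": "2188776655"},
          {"id": 8, "postcode": "11369", "name": "رضایی", "phone": None}]

def normalize(s):
    # collapse common alternative letter forms in Persian text (e.g. Arabic ي vs Persian ی)
    return None if s is None else s.replace("ي", "ی").replace("ك", "ک").strip()

FIELDS = ["postcode", "name", "phone"]
AGREE = {"postcode": 2.0, "name": 4.0, "phone": 3.0}
DISAGREE = {"postcode": -2.0, "name": -3.0, "phone": -1.0}

def score(a, b):
    total = 0.0
    for f in FIELDS:
        va, vb = normalize(a.get(f)), normalize(b.get(f))
        if va is None or vb is None:
            continue                       # missing value: neutral contribution of 0
        total += AGREE[f] if va == vb else DISAGREE[f]
    return total

# blocking on postcode: only compare pairs that share the blocking key
pairs = [(a, b) for a, b in product(file_a, file_b)
         if a["postcode"] == b["postcode"]]
for a, b in pairs:
    print(a["id"], b["id"], score(a, b))   # rank candidate pairs by comparison score
```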

Issue Info: Year 2007, Volume 4, Issue 1, Pages 109-128
Measures: Citations 2, Views 1722, Downloads 0, References 0
Abstract: 

When data are in the form of continuous functions, they may challenge classical methods of data analysis based on arguments in finite-dimensional spaces, and therefore need theoretical justification. The infinite dimensionality of the spaces the data belong to leads to new statistical methodologies and insights for analyzing them, known as functional data analysis (FDA). Dimension reduction in FDA is mandatory and is partly done using principal component analysis (PCA). Similar to classical PCA, functional principal component analysis (FPCA) produces a small number of constructed variables from the original data that are uncorrelated and account for most of the variation in the original data set; it therefore helps us to understand the underlying structure of the data. Temperature and amount of precipitation are functions of time, so they can be analyzed by FDA. In this paper we analyze Iranian temperature and precipitation data for 2005, extract patterns of variation, and explore the structure of the data and of the correlation between the two phenomena. The data, collected from weather stations across the country, were discrete, consisting of the monthly means of temperature and precipitation recorded at each station, so we first fitted appropriate curves to them using smoothing methods. We then analyzed the data using FPCA and interpreted the results. When estimating the eigenvalues, we found that the first estimated eigenvalue shows a strong domination of its associated mode of variation over all others. Furthermore, the first two eigenvalues explain more than 98% of the total variation, with individual contributions of 93.7 and 4.3 percent, respectively, while the contributions of the others total less than 2 percent. Thus, we considered only the first two components. The first estimated principal component (PC) shows that the majority of the variability in the data can be attributed to differences between summer and winter temperatures. The second PC reflects the regularity of temperature when moving from winter to summer; in other words, it reflects the variation around the average difference between winter and summer temperatures. Furthermore, bootstrap confidence bands for the eigenvalues and eigenfunctions of the real data were obtained, including both individual and simultaneous confidence intervals for the eigenvalues. We also obtained single and double bootstrap bands for the first two eigenfunctions and found them extremely close to each other, reflecting the high degree of accuracy of the bands obtained by the single bootstrap method.
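A minimal sketch of FPCA on discretized curves (not the authors' code): hypothetical station-by-month temperature curves are centered, the sample covariance is eigendecomposed, and the proportion of variation explained by the leading components is reported.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical stand-in data: 40 stations x 12 monthly mean temperatures,
# built from two smooth modes of variation plus noise
months = np.arange(12)
seasonal = np.cos(2 * np.pi * months / 12)          # winter-summer contrast
shift = np.sin(2 * np.pi * months / 12)             # timing of the transition
curves = (15 + 10 * seasonal[None, :]
          + rng.normal(0, 3, (40, 1)) * seasonal[None, :]
          + rng.normal(0, 1, (40, 1)) * shift[None, :]
          + rng.normal(0, 0.5, (40, 12)))

# FPCA on the discretized curves: eigendecomposition of the sample covariance
centered = curves - curves.mean(axis=0)
cov = centered.T @ centered / (curves.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
print(np.round(explained[:2], 3))      # share of variation of the first two components
scores = centered @ eigvecs[:, :2]     # PC scores of each station on the first two modes
```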
