Research 2006

Faculty of Natural and Agricultural Sciences
School of Mathematical Sciences
Department of Statistics

Selected Highlights from Research Findings

The main objective of this research is to provide a theoretical foundation for analysing grouped data while taking the underlying continuous nature of the variable(s) into account. Statistical techniques have been developed and applied extensively for continuous data, but the analysis of grouped data has been somewhat neglected. This creates numerous problems, especially in the social and economic disciplines, where variables are grouped for various reasons. Owing to a lack of appropriate statistical techniques for evaluating grouped data, researchers are often tempted to ignore the underlying continuous nature of the data and to use, for example, the class midpoints instead. This oversimplifies the problem and ignores valuable information in the data.

The analysis of grouped data is performed using the maximum likelihood (ML) estimation procedure of Matthews and Crowther (1995: A maximum likelihood estimation procedure when modeling in terms of constraints. South African Statistical Journal, 29, 29-51). Three integrated areas are addressed in this research and are discussed briefly here.

Firstly, continuous distributions such as the exponential, normal, Weibull, log-logistic and Pareto distributions are fitted to a single frequency distribution. The constraints are formulated such that the cumulative relative frequencies equal the fitted cumulative distribution function at the upper class boundaries. A general method is proposed by formulating the vector of constraints in terms of a linear model.

The second area concentrates on the analysis of generalised linear models where the response variable is presented in grouped format. A cross-classification of the independent variables leads to various so-called cells in a single-factor, two-factor or even multifactor design. Each cell contains a frequency distribution of the response variable.
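Any such frequency distribution, whether a single one or one per cell, can be fitted without resorting to class midpoints. The sketch below illustrates the idea for the exponential distribution only. It maximises the ordinary multinomial log-likelihood of the class frequencies directly, which is a simpler stand-in and not the constrained procedure of Matthews and Crowther. The function names, the search interval for the rate and the example frequencies are all assumptions made for illustration.

```python
import math


def grouped_exponential_loglik(rate, upper_bounds, freqs):
    """Multinomial log-likelihood (up to a constant) of grouped data under Exp(rate).

    upper_bounds: increasing upper class boundaries; the last may be math.inf.
    freqs: observed class frequencies, one per class (first class starts at 0).
    """
    loglik = 0.0
    lower = 0.0
    for upper, n in zip(upper_bounds, freqs):
        # Class probability F(upper) - F(lower), with F(x) = 1 - exp(-rate * x).
        survival_upper = 0.0 if math.isinf(upper) else math.exp(-rate * upper)
        p = math.exp(-rate * lower) - survival_upper
        loglik += n * math.log(p)
        lower = upper
    return loglik


def fit_grouped_exponential(upper_bounds, freqs, lo=1e-6, hi=10.0, tol=1e-9):
    """Golden-section search for the rate that maximises the grouped log-likelihood.

    The search interval [lo, hi] is an assumption; widen it if the data call for it.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if (grouped_exponential_loglik(c, upper_bounds, freqs)
                > grouped_exponential_loglik(d, upper_bounds, freqs)):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2.0


# Hypothetical frequency distribution: classes [0,1), [1,2), [2,3), [3,4), [4,inf).
example_bounds = [1.0, 2.0, 3.0, 4.0, math.inf]
example_freqs = [410, 230, 150, 90, 120]
example_rate = fit_grouped_exponential(example_bounds, example_freqs)
print("estimated rate:", example_rate)
print("estimated mean:", 1.0 / example_rate)
print("estimated median:", math.log(2.0) / example_rate)
```

Because the grouped exponential log-likelihood is concave in the rate, a one-dimensional search is sufficient here. Distributions with more than one parameter, such as the normal or Weibull, would need a multivariate optimiser.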
A completely new approach is introduced, in which a specified underlying continuous distribution for the grouped variable is fitted to each cell. Certain measures of the fitted distributions, such as the mean, the median or any other percentile, are modelled to explain the influence of the independent variables on the response variable. The main objective is ultimately to provide a satisfactory fit that describes the data as effectively as possible and reveals the various trends in the data.

A third contribution is the fitting of a bivariate normal distribution to a two-way contingency table. The estimated bivariate normal distribution reveals the complete underlying continuous structure between the two variables, and the ML estimate of the correlation coefficient is used to describe the relationship between them.
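For reference, the cell probabilities that a bivariate normal distribution implies for a two-way table can be written out as below. The notation is introduced here for illustration and is not taken from the original work: a_i and b_j denote the upper class boundaries of the row and column variables, n_ij the observed cell frequencies, and F the joint distribution function. The log-likelihood shown is the standard multinomial form for grouped data; the constrained ML procedure cited above may be formulated differently.

```latex
\begin{align*}
F(x, y) &= \Pr(X \le x,\; Y \le y), \qquad
  (X, Y) \sim N_2(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho), \\
p_{ij} &= F(a_i, b_j) - F(a_{i-1}, b_j) - F(a_i, b_{j-1}) + F(a_{i-1}, b_{j-1}), \\
\ell(\mu_1, \mu_2, \sigma_1, \sigma_2, \rho) &= \sum_{i} \sum_{j} n_{ij} \log p_{ij} + \text{const}.
\end{align*}
```

Maximising this log-likelihood over all five parameters yields, among others, an estimate of the correlation coefficient based on the grouped counts alone.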
Contact person: Mrs G Crafford.

 
