
Smoothing in Ordinal Regression: An Application to Sensory Data

The so-called proportional odds assumption is popular in cumulative ordinal regression. In practice, however, this assumption is sometimes too restrictive. For instance, when modeling the perception of boar taint at the individual level, it turns out that, at least for some subjects, the effects of the predictors (androstenone and skatole) vary between response categories. For more flexible modeling, we consider the use of a ‘smooth-effects-on-response penalty’ (SERP) as a connecting link between proportional and fully non-proportional odds models, assuming that the parameters of the latter vary smoothly over response categories. The usefulness of SERP is further demonstrated through a simulation study. Besides flexible and accurate modeling, SERP also enables parameter estimation in cases where the pure, unpenalized non-proportional odds model fails to converge.
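To make the link between the two model classes concrete, the following is a minimal sketch in LaTeX. The notation (k response categories, p predictors, tuning parameter \lambda) and the squared-difference form of the penalty are assumptions chosen to match the description above; sign conventions for the linear predictor differ between authors.

```latex
% Cumulative logit model with category-specific effects (non-proportional odds).
\[
  \operatorname{logit} P(Y \le r \mid x) = \beta_{0r} + x^\top \beta_r ,
  \qquad r = 1, \dots, k-1 .
\]
% Proportional odds is the special case \beta_1 = \dots = \beta_{k-1} = \beta.

% Penalized log-likelihood: differences between effects of adjacent
% categories are shrunk toward zero, so effects vary smoothly over r.
\[
  \ell_\lambda(\beta) = \ell(\beta)
  - \lambda \sum_{j=1}^{p} \sum_{r=1}^{k-2}
    \left( \beta_{j,r+1} - \beta_{jr} \right)^2 .
\]
```

Under this sketch, \lambda = 0 gives the unpenalized non-proportional odds model, while \lambda \to \infty forces all adjacent differences to zero and so recovers the proportional odds model.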


 

serp: An R package for smoothing in ordinal regression

The use of specialized methods for the analysis of categorical data has been on the rise in recent years (Agresti, 2010). For instance, scientists frequently use the different forms of the ordinal model to analyze relationships between an ordinal response variable and covariates of interest. In particular, the cumulative link model (CLM) popularized by McCullagh (1980) finds a wide range of empirical applications in clinical trials, social surveys, market research, etc. However, in high-dimensional settings with a large number of unknown parameters, e.g., if several potential predictors and response categories are modeled, the so-called identification problem can be hard to avoid (Bartels, 1985; Fisher, 1966). Thankfully, regularization techniques (see, e.g., Bühlmann & Van de Geer, 2011; Hastie et al., 2009), among other approaches, offer a remedy in such situations.


 

A Modification of McFadden’s R2 for Binary and Ordinal Response Models

Many studies on summary measures of the predictive strength of categorical response models consider the Likelihood Ratio Index (LRI), also known as McFadden’s R2, a better option than many other measures. We propose a simple modification of the LRI that adjusts for the effect of the number of response categories on the measure and also rescales its values, mimicking an underlying latent measure. The modified measure is applicable to both binary and ordinal response models fitted by maximum likelihood. Results from simulation studies and a real data example on the olfactory perception of boar taint show that the proposed measure outperforms most of the widely used goodness-of-fit measures for binary and ordinal models. Notably, the proposed R2 proves largely invariant to an increasing number of response categories in an ordinal model.
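For orientation, the unmodified LRI is one minus the ratio of the fitted model's log-likelihood to that of the intercept-only (null) model. The Python sketch below illustrates this baseline for a binary response only; it does not implement the proposed modification, and the function names and toy data are hypothetical.

```python
import math

def bernoulli_loglik(y, p):
    """Log-likelihood of binary outcomes y under predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def mcfadden_r2(y, p_model):
    """Likelihood Ratio Index: 1 - loglik(model) / loglik(null)."""
    # The intercept-only model predicts the sample proportion for everyone.
    p_null = sum(y) / len(y)
    ll_model = bernoulli_loglik(y, p_model)
    ll_null = bernoulli_loglik(y, [p_null] * len(y))
    return 1.0 - ll_model / ll_null

# Toy data (illustrative only): observed outcomes and fitted probabilities.
y = [0, 0, 1, 1]
p = [0.2, 0.3, 0.7, 0.8]
lri = mcfadden_r2(y, p)
print(round(lri, 4))
```

A model that predicts no better than the null model yields an LRI of 0, and values approach 1 as the fitted probabilities concentrate on the observed outcomes.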


 

gofcat: Goodness-of-Fit Measures for Categorical Response Models

Statistical models are considered simplifications or approximations of reality (Burnham & Anderson, 2002). How close a given model is to the target, or how it compares to competing models, is always of interest in real-world applications. Answers to such questions are mostly obtained via adequate goodness-of-fit (GOF) procedures. However, while such procedures and their software implementations are readily available for various continuous outcome models, there are just a handful of open-source implementations available for categorical response models (CRMs). The gofcat R software package provides a quick means of evaluating some CRMs widely used in empirical studies. Depending on the model of interest, functions are available for the different forms of hypothesis tests associated with CRMs and for computing summary measures of the predictive strength of fits. For instance, the proportional odds assumption in the ordinal regression model can be tested using the Brant or the Likelihood-Ratio tests available in gofcat. Other crucial tests, like the Hosmer-Lemeshow, the Lipsitz and the Pulkstenis-Robinson tests, are also available for some widely used binary, multinomial and ordinal response models. Moreover, the assessment of prediction errors through error/loss functions and several summary measures of the predictive strength of fitted models (pseudo-R2s) are also available in gofcat.
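As an illustration of what error/loss functions for a categorical response model compute, the Python sketch below evaluates three common choices (Brier score, log loss and misclassification rate) from predicted category probabilities. This is not gofcat's implementation or API; the choice of losses, the function names and the toy data are assumptions made for the example.

```python
import math

def brier_score(y, probs):
    """Mean squared distance between predicted probabilities and one-hot outcomes."""
    total = 0.0
    for yi, pi in zip(y, probs):
        for c, pc in enumerate(pi):
            total += (pc - (1.0 if c == yi else 0.0)) ** 2
    return total / len(y)

def log_loss(y, probs):
    """Average negative log of the probability assigned to the observed category."""
    return -sum(math.log(pi[yi]) for yi, pi in zip(y, probs)) / len(y)

def misclassification_rate(y, probs):
    """Share of observations whose most probable category is not the observed one."""
    wrong = 0
    for yi, pi in zip(y, probs):
        predicted = max(range(len(pi)), key=lambda c: pi[c])
        if predicted != yi:
            wrong += 1
    return wrong / len(y)

# Toy data (illustrative only): observed categories coded 0, 1, 2 and
# one row of predicted category probabilities per observation.
y = [0, 1, 2]
probs = [[0.7, 0.2, 0.1],
         [0.2, 0.5, 0.3],
         [0.3, 0.4, 0.3]]

brier = brier_score(y, probs)
ll = log_loss(y, probs)
mis = misclassification_rate(y, probs)
print(brier, ll, mis)
```

Lower values indicate better predictions for all three measures. Note that none of them uses the ordering of the categories, so for ordinal responses they treat a near miss the same as a distant one.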