Hosmer-Lemeshow Test for Categorical Response Models

This is a post estimation goodness-of-fit test for categorical response models. It currently supports the binary, multinomial and ordinal logistic regression models.

hosmerlem(model, group = 10, tables = FALSE, customFreq = NULL)

Arguments

model: a model object or data.frame of observed and estimated values. The following class of objects can be directly passed to the function: glm(), vglm(), serp(), polr(), clm(), and mlogit(). Other class of objects require providing a data.frame of observed and predicted values.
group: Default to 10. The number of groups to be formed from the original observations.
tables: Default to FALSE. When TRUE, both the observed and the expected frequency tables are printed alongside test results.
customFreq: a vector of custom group frequencies to be used instead of the default equal group frequencies. The vector should, however, sum up to the total number of original observations.

Value

chi.sq: realized value of the chi-square statistic.
df: the degrees of freedom.
p.value: the p-value of the test.
observed: table of observed frequencies.
expected: table of estimated frequencies.
rho: percentage of estimated frequencies greater than one.
type: a character vector indicating the type of HL-test conducted.
tables: TRUE or FALSE logical vector.

Details

The implemented tests are discussed in Hosmer and Lemeshow (1980) (for the binary case), Fagerland, Hosmer, and Bofin (2008); Fagerland and Hosmer (2012) (for the multinomial case) and Fagerland and Hosmer (2013, 2016, 2017) (for the ordinal case). In each of the three settings, one makes a split of the original observations into k groups (10 by default), after ranking the observations by ordinal scores (OS). Ties are separated following an additional ranking based on the observed values. A Pearson chi-squared statistic is thereafter obtained from a (k X c) expected and observed frequency tables. The Hosmerlem() function automatically dictates which of the three HL tests to run based on the type of model supplied or by inspecting the class/level of response category in the supplied data.frame of predicted and observed values.

On the choice of k-groups, Fagerland and Hosmer (2013, 2016) recommend using ten groups. It is believed that number of groups below six adversely affects the heterogeneity within groups thereby resulting in a test with low power. Also having too many groups could lead to a sparsely populated contingency tables, affecting the distributional assumptions of the test statistic. The test statistic follows the chi-squared distribution with (k - 2)(r - 1) + (r - 2) degrees of freedom. Where k and r are respectively the number of groups and response category. For chi-square estimation to hold, it is recommended to have the percentage of estimated frequencies greater than 80 percent of all estimated frequencies (Fagerland and Hosmer, 2017).

Finally, it is particularly recommended to compare the results of the ordinal Hosmer-Lemeshow test with the lipsitz and the Pulkstenis-Robinson tests (Fagerland and Hosmer, 2016; 2017).

References

Hosmer, D. W. and Lemeshow, S. (1980). Goodness of fit tests for the multiple logistic regression model. Communications in Statistics-Theory and Methods, 9, 1043-1069.

Fagerland, M. W., Hosmer, D. W. and Bofin, A. M. (2008). Multinomial goodness-of-fit tests for logistic regression models. Statistics in Medicine, 27, 4238-4253.

Fagerland, M. W. and Hosmer, D. W. (2012). A generalized Hosmer-Lemeshow goodness-of-fit test for multinomial logistic regression models. Stata Journal, 12, 447-453.

Fagerland, M. W. and Hosmer, D. W. (2013). A goodness-of-fit test for the proportional odds regression model. Statistics in Medicine, 32, 2235-2249.

Fagerland, M. W. and Hosmer, D. W. (2016). Tests for goodness of fit in ordinal logistic regression models. Journal of Statistical Computation and Simulation, 86, 3398-3418.

Fagerland, M. W. and Hosmer, D. W. (2017). How to test for goodness of fit in ordinal logistic regression models. Stata Journal, 17, 668-686.

Examples


require(VGAM)
#> Loading required package: VGAM
#> Loading required package: stats4
#> Loading required package: splines
#> 
#> Attaching package: ‘VGAM’
#> The following object is masked from ‘package:serp’:
#> 
#>     wine

# Binary L-H test
set.seed(1)
gm <- glm(rbinom(100,1,0.5) ~ rnorm(100), family=binomial)
hosmerlem(gm, group = 10, customFreq = NULL, tables = TRUE)
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                    Chi-sq  df  pr(>chi)
#> binary(Hosmerlem)  12.311   8    0.1379
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 100% 
#> 
#> 
#> Observed Freqs:
#> 
#>    group    1    0 total
#> 1      1    4    6    10
#> 2      2    7    3    10
#> 3      3    2    8    10
#> 4      4    5    5    10
#> 5      5    5    5    10
#> 6      6    2    8    10
#> 7      7    5    5    10
#> 8      8    7    3    10
#> 9      9    7    3    10
#> 10    10    4    6    10
#> 
#> Expected Freqs:
#> 
#>    group       1       0
#> 1      1    4.31    5.69
#> 2      2    4.51    5.49
#> 3      3    4.61    5.39
#> 4      4    4.66    5.34
#> 5      5    4.73    5.27
#> 6      6    4.80    5.20
#> 7      7    4.89    5.11
#> 8      8    4.98    5.02
#> 9      9    5.14    4.86
#> 10    10    5.39    4.61
#> 

# multinomial L-H test
vg <- vglm(RET ~ DIAB + GH + BP, model = TRUE,
           family = multinomial(parallel = TRUE), data = retinopathy)
hosmerlem(vg)
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                         Chi-sq  df  pr(>chi)    
#> multinomial(Hosmerlem)  198.02  16 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 96.67% 
#> 

# ordinal L-H test
## proportional odds model
cu <- update(vg, family = cumulative(link = 'logitlink', parallel = TRUE))
hosmerlem(cu, tables=TRUE)
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                     Chi-sq  df  pr(>chi)  
#> ordinal(Hosmerlem)  29.259  17   0.03221 *
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 96.67% 
#> 
#> 
#> Observed Freqs:
#> 
#>  group total  1  2  3
#>      1    61 56  3  2
#>      2    61 58  2  1
#>      3    61 51 10  0
#>      4    61 52  8  1
#>      5    61 41 18  2
#>      6    61 39 18  4
#>      7    61 28 16 17
#>      8    61 29 16 16
#>      9    61 21 16 24
#>     10    64 13 11 40
#> 
#> Expected Freqs:
#> 
#>  group     OS       1       2       3
#>      1 1.0693 57.6617  2.4501  0.8883
#>      2 1.1252 55.0088  4.3439  1.6472
#>      3 1.1863 52.1616  6.3141  2.5243
#>      4 1.2658 48.5340  8.7171  3.7490
#>      5 1.3529 44.6729 11.1274  5.1998
#>      6 1.4879 38.9495 14.3422  7.7084
#>      7 1.6310 33.2305 17.0455 10.7240
#>      8 1.8163 26.4677 19.2724 15.2599
#>      9 2.0866 18.0019 19.7125 23.2856
#>     10 2.4746  9.0074 15.6136 39.3790
#> 
hosmerlem(cu, group = 5, tables=TRUE, customFreq = c(rep(100,5), 113))
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                     Chi-sq  df  pr(>chi)  
#> ordinal(Hosmerlem)  16.842   9   0.05125 .
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 100% 
#> 
#> 
#> Observed Freqs:
#> 
#>  group total  1  2  3
#>      1   100 93  4  3
#>      2   100 87 13  0
#>      3   100 76 21  3
#>      4   100 56 31 13
#>      5   100 48 24 28
#>      6   113 28 25 60
#> 
#> Expected Freqs:
#> 
#>  group     OS       1       2       3
#>      1 1.0869 93.1565  4.9988  1.8447
#>      2 1.1849 85.6258 10.2634  4.1109
#>      3 1.3176 75.7915 16.6619  7.5466
#>      4 1.5217 61.6211 24.5908 13.7881
#>      5 1.7946 44.7497 31.0434 24.2069
#>      6 2.3196 22.7514 31.3806 58.8681
#> 

## adjacent category model
ac <- update(vg, family = acat(parallel = TRUE))
hosmerlem(ac)
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                     Chi-sq  df  pr(>chi)  
#> ordinal(Hosmerlem)  28.318  17   0.04136 *
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 93.33% 
#> 

## continuation ratio model
cr <- update(vg, family = cratio(parallel = TRUE))
hosmerlem(cr)
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                     Chi-sq  df  pr(>chi)  
#> ordinal(Hosmerlem)  29.584  17    0.0295 *
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 93.33% 
#> 

## using data.frame
y <- ordered(retinopathy$RET)
df <- data.frame(y, fitted.values(cr))
hosmerlem(df, tables = TRUE)
#> 
#> Hosmer-Lemeshow Test:
#> 
#>                     Chi-sq  df  pr(>chi)  
#> ordinal(Hosmerlem)  29.584  17    0.0295 *
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> H0: No lack of fit dictated
#> 
#> rho: 93.33% 
#> 
#> 
#> Observed Freqs:
#> 
#>  group total  0  1  2
#>      1    61 56  3  2
#>      2    61 57  3  1
#>      3    61 53  8  0
#>      4    61 51  9  1
#>      5    61 41 18  2
#>      6    61 38 19  4
#>      7    61 30 15 16
#>      8    61 28 16 17
#>      9    61 21 16 24
#>     10    64 13 11 40
#> 
#> Expected Freqs:
#> 
#>  group     OS       0       1       2
#>      1 1.0682 57.0503  3.7392  0.2105
#>      2 1.1190 54.3224  6.0961  0.5815
#>      3 1.1743 51.5404  8.2876  1.1720
#>      4 1.2470 48.1336 10.6687  2.1977
#>      5 1.3294 44.5468 12.8154  3.6378
#>      6 1.4591 39.4040 15.1850  6.4110
#>      7 1.6071 34.1034 16.7604 10.1362
#>      8 1.7988 27.9752 17.3248 15.7000
#>      9 2.0857 19.9676 15.8387 25.1937
#>     10 2.4837 10.9564 11.1282 41.9154
#>