Debber provides probability distributions of DEB parameters. Each of its estimates (e.g., a median) is accompanied by uncertainty metrics such as percentiles. The validity of Debber results is therefore determined not by the accuracy of any single point estimate, but by whether the true parameter values, if measured, are distributed as Debber predicts. Do the true DEB parameters indeed have a 50% probability of falling within Debber's predicted 50% confidence interval, a 95% probability of falling within its predicted 95% confidence interval, and so on? To verify this, we perform leave-one-out cross-validation. For each of the 2006 species in the Add-my-Pet (AmP) database (version 20200514), we do the following:
- We temporarily drop the species from the AmP database.
- We re-calibrate the evolutionary model based on all remaining AmP entries. Effectively, we rerun the entire PhyloPars analysis to estimate phylogenetic and phenotypic covariances, ensuring that the inference model is not influenced by the values that are to be estimated.
- With the new evolutionary model, we estimate the parameters of the omitted species using the usual Debber method.
- For each estimated parameter, we quantify the cross-validation error by its z-score: the difference between the estimated mean and the true AmP value, divided by the estimated standard deviation. This is done in transformed parameter space, that is, after parameters have been log- or logit-transformed to make their distributions closer to normal. The z-score is therefore a meaningful measure of the estimation error for all parameters, and it takes into account the full (normal) probability distribution predicted by Debber rather than the point estimate alone.
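The leave-one-out loop and the z-score can be illustrated with a minimal Python sketch. It is not the Debber or PhyloPars implementation: the "model" here is a deliberately crude stand-in (the mean and standard deviation of the remaining species' transformed values), and the function names `logit` and `loo_z_scores` are hypothetical. Only the structure of the loop and the definition of the z-score follow the steps above.

```python
import math
from statistics import mean, stdev


def logit(p):
    """Logit transform for parameters bounded between 0 and 1."""
    return math.log(p / (1.0 - p))


def loo_z_scores(values, transform=math.log):
    """Leave-one-out z-scores for one parameter, in transformed space.

    ASSUMPTION: the estimator is a placeholder (mean and sd of the
    remaining values), standing in for the recalibrated evolutionary model.
    """
    transformed = [transform(v) for v in values]
    z_scores = []
    for i, true_value in enumerate(transformed):
        # Step 1: temporarily drop species i.
        rest = transformed[:i] + transformed[i + 1:]
        # Steps 2-3: "recalibrate" and estimate the omitted species (placeholder).
        est_mean = mean(rest)
        est_sd = stdev(rest)
        # Step 4: z-score = (estimated mean - true value) / estimated sd.
        z_scores.append((est_mean - true_value) / est_sd)
    return z_scores


# Made-up values for a positive parameter, log-transformed by default.
z = loo_z_scores([1.0, 2.0, 4.0])
```

A parameter confined to the interval (0, 1) would be passed with `transform=logit` instead of the default log transform.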
After collecting the errors per parameter across all species, we plot the histogram of z-scores (normalized to have a total area of 1) to check whether it resembles the expected standard normal distribution (μ=0, σ=1). In addition, we make a Q-Q plot that compares the quantiles of the observed z-scores with those of the standard normal distribution. Click a parameter on the left to see its cross-validation results.
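The same check can be done numerically. The sketch below, which uses only the Python standard library, computes the fraction of z-scores that fall inside the central 50% or 95% interval of the standard normal distribution, and the point pairs that a normal Q-Q plot would show. The function names `empirical_coverage` and `qq_points` and the plotting positions (i + 0.5)/n are illustrative assumptions, not part of Debber.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def empirical_coverage(z_scores, level):
    """Fraction of z-scores inside the central interval of the given level."""
    # Half-width of the central interval, e.g. about 1.96 for level=0.95.
    half_width = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    inside = sum(1 for z in z_scores if abs(z) <= half_width)
    return inside / len(z_scores)


def qq_points(z_scores):
    """Pairs (theoretical normal quantile, observed z-score) for a Q-Q plot."""
    n = len(z_scores)
    ordered = sorted(z_scores)
    # ASSUMPTION: plotting positions (i + 0.5) / n; other conventions exist.
    return [(STD_NORMAL.inv_cdf((i + 0.5) / n), z) for i, z in enumerate(ordered)]


# Synthetic, perfectly calibrated z-scores: exact standard normal quantiles.
ideal = [STD_NORMAL.inv_cdf((i + 0.5) / 1000) for i in range(1000)]
cov50 = empirical_coverage(ideal, 0.50)
cov95 = empirical_coverage(ideal, 0.95)
points = qq_points(ideal)
```

For well-calibrated estimates, the coverage at each level is close to the level itself and the Q-Q points lie near the 1:1 line. Coverage below the nominal level suggests that the predicted uncertainty is too narrow, and coverage above it suggests that it is too wide.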