Systematic under-sampling of mutation datasets and comparative assessment of protein stability predictors