Modeling lexical diversity across language sampling and estimation techniques