It's not perfect (e.g. Indic combining chars have class 0)
How should this be identified then? Just by ranges or something?
but at least it should not result in invalid strings
If that's all we're after, I can keep the normalization step and remove the manual (incomplete?) combining character support. That should still do the right thing™ in the vast majority of cases -- not counting the fact that it's currently terribly broken, yet nobody has complained so far.
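
For what it's worth, here is a rough sketch of both points, written in Python with the standard `unicodedata` module only because it is quick to try out; it is not the code from the patch. The helper name `is_combining_mark` is made up for illustration. It assumes that testing the general category (Mn, Mc, Me) is an acceptable stand-in for hard-coded ranges, since the canonical combining class alone misses spacing marks such as the Devanagari vowel signs.

```python
import unicodedata


def is_combining_mark(ch: str) -> bool:
    """Hypothetical helper: True if ch is a mark (general category Mn, Mc or Me).

    This looks at the general category rather than the canonical combining
    class, because many Indic vowel signs are marks whose class is 0.
    """
    return unicodedata.category(ch).startswith("M")


vowel_sign_aa = "\u093e"   # DEVANAGARI VOWEL SIGN AA, category Mc
combining_acute = "\u0301"  # COMBINING ACUTE ACCENT, category Mn

# The combining class is 0 for the Indic vowel sign, non-zero for the accent.
print(unicodedata.combining(vowel_sign_aa))    # 0
print(unicodedata.combining(combining_acute))  # 230

# The category check flags both as marks, and leaves plain letters alone.
print(is_combining_mark(vowel_sign_aa))    # True
print(is_combining_mark(combining_acute))  # True
print(is_combining_mark("e"))              # False

# Normalization alone already folds the common base + mark pairs (NFC here).
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))  # 2 1
```

The last part is the "keep only the normalization step" option: sequences that have a precomposed form collapse to a single code point, and anything else is left as a base followed by its marks.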