
So, UTF-8 without the self-synchronization part. It's a reasonable encoding, sure. I'm not sure what the cost of making the whole thing big-endian is, though (guess I need to look at simdutf8!). At first blush, it feels like adapting this wholly to little-endian, with the continuation bits (the byte count, in unary) located in the low bits, would make decoding very cheap on little-endian machines.
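To make that concrete, here is a rough sketch of what I mean. The layout is my own assumption, not the encoding under discussion: the number of trailing 1 bits in the first byte gives the count of extra bytes, a 0 bit ends the marker, and the payload sits above it in little-endian order. The names `encode_cp`, `decode_cp`, `encode_text` and `decode_text` are made up for illustration.

```python
# Hypothetical little-endian layout (an assumption for illustration):
# n trailing 1 bits in the first byte = n extra bytes, then a 0 bit,
# then the payload in the remaining bits, least significant first.
MAX_EXTRA = 2  # 3 bytes carry 21 payload bits, enough for U+10FFFF


def encode_cp(cp: int) -> bytes:
    """Encode one code point using the assumed layout."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    n = 0
    while cp >= 1 << (7 * (n + 1)):  # each extra byte adds 7 payload bits
        n += 1
    word = (cp << (n + 1)) | ((1 << n) - 1)  # payload above the unary marker
    return word.to_bytes(n + 1, "little")


def decode_cp(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one code point at pos; return (code point, next position)."""
    first = buf[pos]
    # Isolate the lowest 0 bit of the first byte; its index is the
    # number of trailing 1 bits, i.e. the number of extra bytes.
    n = (~first & (first + 1)).bit_length() - 1
    if n > MAX_EXTRA or pos + n + 1 > len(buf):
        raise ValueError("invalid or truncated sequence")
    # One little-endian load plus one shift recovers the payload.
    word = int.from_bytes(buf[pos:pos + n + 1], "little")
    return word >> (n + 1), pos + n + 1


def encode_text(s: str) -> bytes:
    return b"".join(encode_cp(ord(ch)) for ch in s)


def decode_text(buf: bytes) -> str:
    out, pos = [], 0
    while pos < len(buf):
        cp, pos = decode_cp(buf, pos)
        out.append(chr(cp))
    return "".join(out)
```

On real hardware the decode would be an unaligned load, a count-trailing-ones, a shift and a mask, with no per-byte bit stitching. Whether that actually beats a tuned big-endian-style decoder is something I haven't measured.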

