code
6 June 2008 | 10 Comments
So, Colin Percival has posted a UTF-8 strlen which improves on my previous post. While his code runs slightly slower than mine on my PC, I assume that’s because his code is aimed at a 64-bit architecture. With 32 bits (reading 4 bytes at a time instead of 8) it doesn’t quite get the same [...]
Tagged in C, fast, horrid, optimization, overengineered, silly, simd, sse, strlen, stupid
code
4 June 2008 | 6 Comments
‘Counting Characters in UTF-8 Strings Is Fast’ by Kragen Sitaker shows several ways to count characters in UTF-8 strings, using both assembly and C. But, with a few assumptions, we can go faster. Assumption One: We are dealing with a valid UTF-8 string. Making this assumption means that once we hit the start of a multi-byte character [...]
Tagged in C, code, fast, speed, strings, strlen, utf8
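
For reference, here is a minimal sketch of the plain byte-at-a-time baseline that faster versions try to beat. It is not the code from either post; the function name `utf8_count_simple` is made up for illustration, and it assumes a valid, NUL-terminated UTF-8 string. In valid UTF-8 every character has exactly one byte that is not a continuation byte (`10xxxxxx`), so counting those bytes counts the characters.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the posts above.
 * Assumes s points to a valid, NUL-terminated UTF-8 string.
 * Counts every byte that is NOT a continuation byte (10xxxxxx);
 * each character contributes exactly one such byte. */
size_t utf8_count_simple(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        /* Top two bits 10 mark a continuation byte; skip those. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

On invalid input (for example, a stray continuation byte) the result will not match the number of characters, which is why the valid-string assumption matters.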