Unicode as she is broke
Do you think your string-handling code is robust? Are there any problems with the following snippets?
I was looking for some standard symbols to represent the Control key and the Alt key, and couldn’t find any until I came across ISO/IEC 9995-7. Because I had so much trouble finding a free copy of the document on the ’Net, I have made a table of the symbols and their functions below. I have [...]
This is in reference to the current ‘Twitter image encoding challenge’ running on StackOverflow. If we want to restrict ourselves to assigned, non-control, non-private Unicode characters, then by my reckoning that gives us 129,775 available characters.

    wget http://unicode.org/Public/UNIDATA/UnicodeData.txt
    awk -F ';' -f countUnichars.awk UnicodeData.txt | bc

countUnichars.awk source:

    BEGIN { print "ibase=16" } # set [...]
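Since the awk source is cut off above, here is a rough Python sketch of the same kind of count. It is not the original script: the function name `count_usable_chars` and the default excluded categories (`Cc` control, `Co` private use, `Cs` surrogate) are my assumptions, so it may not reproduce the 129,775 figure, which also depends on the Unicode version of the data file. It relies on two documented features of UnicodeData.txt: the general category is the third semicolon-separated field, and large blocks are listed as a pair of lines whose names end in `, First>` and `, Last>`.

```python
def count_usable_chars(lines, excluded=frozenset({"Cc", "Co", "Cs"})):
    """Count code points described by UnicodeData.txt-style lines.

    `excluded` holds general categories to skip. The default set is an
    assumption for illustration, not the original post's exact rule.
    """
    total = 0
    range_start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code = int(fields[0], 16)          # field 0: code point in hex
        name, category = fields[1], fields[2]
        if name.endswith(", First>"):
            # Start of a range; the matching "Last" line closes it.
            range_start = code
            continue
        if name.endswith(", Last>"):
            size = code - range_start + 1  # inclusive range
            range_start = None
        else:
            size = 1
        if category not in excluded:
            total += size
    return total
```

With the downloaded file in the working directory, something like `count_usable_chars(open("UnicodeData.txt", encoding="utf-8"))` should give the total; pass a different `excluded` set to experiment with stricter or looser definitions of "available".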
A search for the phrase "It’s like a light of a new day," breaks in more than one way. Not only does Google search fail to recognize that “it’s” is a word, it also ignores the quote marks, searching for the phrase as individual words.