What can we fit in 140 characters?

code 27 May 2009 | 0 Comments

This is in reference to the current ‘Twitter image encoding challenge’ running on StackOverflow. If we want to restrict ourselves to assigned, non-control, non-private Unicode characters, then by my reckoning that gives us 129,775 available characters.

    wget http://unicode.org/Public/UNIDATA/UnicodeData.txt
    awk -F ';' -f countUnichars.awk UnicodeData.txt | bc

countUnichars.awk source:

    BEGIN { print "ibase=16" } # set [...]
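For readers who only see this excerpt, here is a rough sketch of how such a count could be done in a single awk pass, without bc. It is not the countUnichars.awk script from the post (which is truncated above); the function name `count_unichars` and the `hex` helper are made up for illustration. It assumes the usual UnicodeData.txt layout: semicolon-separated fields, code point in field 1, name in field 2, general category in field 3, with large blocks listed as a `<..., First>` / `<..., Last>` pair. "Non-control, non-private" is taken here to mean skipping categories Cc and Co, which is an assumption about the post's criteria.

```shell
# Hypothetical helper: count code points listed in a UnicodeData.txt-style
# file, skipping control (Cc) and private-use (Co) entries.
count_unichars() {
  awk -F ';' '
    # Convert a hex string such as "4E00" to a number (portable awk, no strtonum).
    function hex(s,    i, n) {
      n = 0
      for (i = 1; i <= length(s); i++)
        n = n * 16 + index("0123456789ABCDEF", toupper(substr(s, i, 1))) - 1
      return n
    }
    $3 == "Cc" || $3 == "Co" { next }                  # control / private use
    $2 ~ /, First>$/ { first = hex($1); next }         # start of a range
    $2 ~ /, Last>$/  { total += hex($1) - first + 1; next }  # whole range
    { total++ }                                        # ordinary single entry
    END { print total + 0 }
  ' "$1"
}
```

Running `count_unichars UnicodeData.txt` prints one number. The result depends on the Unicode version of the file and on whether you also want to drop surrogates (Cs), so it should not be expected to match the figure quoted above exactly.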


Cleaning up a set of tags with Awk

code,commentary,replies,utility 28 January 2009 | 3 Comments

Introduction

David R. MacIver has recently written this blog post about cleaning up a set of tags. This blog post, on the other hand, is about a nice old Unix tool called ‘awk’. Awk is one of those programs that is often overlooked. It is really a small domain-specific language for processing text. In some [...]
