15 Nov 2007 ilgiz   » (Journeyer)

Java codecs

What a shame! I spent a few hours trying to find a decoder for Java string literals.

It turned out that the Apache Commons Lang project, not Sun, provides both the decoder and the encoder (StringEscapeUtils.unescapeJava and StringEscapeUtils.escapeJava). I remembered the library from the time I was looking for an XML encoder to use in my JSP page.

http://commons.apache.org/lang/api/org/apache/commons/lang/StringEscapeUtils.html

http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/java/org/apache/commons/lang/StringEscapeUtils.java?view=markup

I started my search when I realized that a brute-force approach such as s.replace("\\\\", "\\").replace("\\n", "\n") fails to decode the legitimate sequence of 3 characters '\\', '\\' and 'n' into the string of 2 characters '\\' and 'n'. The first .replace() yields a backslash followed by 'n', and the second .replace() then turns that pair into a newline. Chained .replace() invocations rescan text that has already been decoded, so a single escape can be decoded more than once, which is wrong.
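
To make the failure concrete, here is a minimal sketch that compares the chained approach with a single left-to-right pass. The class and method names (EscapeDecodeSketch, decodeChained, decodeSinglePass) are made up for illustration, and the single-pass decoder handles only the two escapes discussed here ('\\' and 'n'); it is not a substitute for the full decoder in Commons Lang.

```java
class EscapeDecodeSketch {

    // Chained replacement: the second pass rescans the output of the first.
    static String decodeChained(String s) {
        return s.replace("\\\\", "\\").replace("\\n", "\n");
    }

    // Single pass: every escape sequence is consumed exactly once.
    // Assumption: only \\ and \n are recognized; other escapes are kept verbatim.
    static String decodeSinglePass(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i); // consume the escaped character too
                switch (next) {
                    case 'n':
                        out.append('\n');
                        break;
                    case '\\':
                        out.append('\\');
                        break;
                    default:
                        out.append('\\').append(next);
                }
            } else {
                out.append(c); // ordinary character, or a trailing lone backslash
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "\\\\n"; // three characters: backslash, backslash, n
        System.out.println(decodeChained(input).length());    // prints 1 (a newline)
        System.out.println(decodeSinglePass(input).length()); // prints 2 (backslash, n)
    }
}
```

The single pass works because the index moves past both characters of an escape before the next comparison, so the backslash produced by decoding '\\', '\\' is never examined again.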

Syndicated 2007-11-08 21:52:46 (Updated 2007-11-09 21:38:44) from Ilguiz (eel ghEEz) Latypov
