Wednesday, March 16, 2016

What is the point of Tomcat's URIEncoding setting?

In Apache Tomcat, the URIEncoding parameter tells Tomcat how to interpret incoming URIs:

URIEncoding

This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.

Apache Tomcat 7 - The HTTP Connector
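To make the two steps in that description concrete, here is a minimal sketch in plain Java. It is not Tomcat's actual code: the class name UriDecodingSketch and the helper decode are hypothetical, and the sketch assumes well-formed %xx escapes and ASCII-only unescaped characters. It first turns the %xx escapes into raw bytes, then interprets those bytes with the chosen charset, which is the part URIEncoding controls.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UriDecodingSketch {

    // Hypothetical helper, not a Tomcat API.
    // Step 1: replace each %xx escape with the byte it denotes.
    // Step 2: interpret the resulting byte sequence using the given charset.
    static String decode(String raw, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%' && i + 2 < raw.length()) {
                // Two hex digits follow the percent sign (no validation in this sketch).
                bytes.write(Integer.parseInt(raw.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                // Assumption: unescaped characters are plain ASCII.
                bytes.write((byte) c);
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // C3 A9 are the UTF-8 bytes for "e with acute accent".
        String raw = "/caf%C3%A9";
        System.out.println(decode(raw, StandardCharsets.UTF_8));
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
    }
}
```

With UTF-8 the two bytes C3 A9 become the single character é, so the path reads /café. With ISO-8859-1 each byte becomes its own character, giving /cafÃ©, which is the kind of mangling people see when the client sends UTF-8 and the server decodes with the ISO-8859-1 default.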

However, as explained for example in "What is the proper way to URL encode Unicode characters?", non-ASCII characters in URIs are always encoded in UTF-8, following current standards (RFC 3986 and RFC 3987).

So:

  • Why is there even a setting for something that is mandated by a standard?
  • Why is the default different from what the standard mandates? (ISO-8859-1 instead of UTF-8)

Is this simply because the Tomcat setting predates the standard, and was retained for backwards compatibility? Or is there some situation where a value different from UTF-8 makes sense?

1 Answer

I see that, at least for Tomcat 6 and earlier, URIEncoding was not only important but necessary: many people ran into problems unless they explicitly set it to 'UTF-8'. As for your question, I can only assume that the setting is kept for backward compatibility. Developers hate to remove code once they have written it, even if the chance of ever needing it again is zero :)
