URL decode problem: a solution
Author:
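Today I was told about something odd: a third-party site got an error when using the signature we provide. The cause was that PHP's urldecode replaced the plus sign (+) with a space.
I tried Python's urllib library and JavaScript's encodeURIComponent, and neither does this replacement; both also encode a space as '%20'. Python provides urllib.quote_plus and urllib.unquote_plus to handle the space -> plus conversion separately, which seems fairly reasonable.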
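The difference is easy to check directly. Here is a minimal sketch, assuming Python 3, where these functions live in urllib.parse rather than at the top level of urllib; the sample value "a+b c" is made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Hypothetical value containing both a plus sign and a space.
value = "a+b c"

# quote follows the generic URL rules: space -> %20, '+' -> %2B.
encoded = quote(value, safe="")

# quote_plus follows the form rules: space -> '+', '+' -> %2B.
encoded_plus = quote_plus(value)

# unquote leaves a literal '+' alone; unquote_plus turns it into a space.
plain = unquote("a+b")
form_style = unquote_plus("a+b")

print(encoded, encoded_plus, plain, form_style)
```

A literal '+' survives unquote but not unquote_plus, which is the same kind of mismatch as the one seen with PHP's urldecode.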
I looked up RFC 3986, which has this passage:
Scheme names consist of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-").
RFC 2396 has this passage:
The plus "+", dollar "$", and comma "," characters have been added to those in the "reserved" set, since they are treated as reserved within the query component.
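In other words, the plus sign is a reserved character in URLs. Nothing in these RFCs says it stands for a space, so a plain URL decoder should leave it alone.
It is only the HTML 4 documentation that says anything about converting the plus sign: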
application/x-www-form-urlencoded
Forms submitted with this content type must be encoded as follows:
Control names and values are escaped. Space characters are replaced by `+', and then reserved characters.....
So the space-to-plus substitution applies only when the content type is application/x-www-form-urlencoded.
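The form rules can be seen in Python's form helpers as well. This is a small sketch, assuming Python 3's urllib.parse; the field name "sig" and its value are hypothetical.

```python
from urllib.parse import parse_qs, urlencode

# urlencode builds an application/x-www-form-urlencoded string:
# the space becomes '+', and a literal '+' becomes '%2B'.
body = urlencode({"sig": "a+b c"})

# parse_qs applies the same form rules when decoding,
# so a bare '+' in the input comes back as a space.
parsed = parse_qs("sig=a+b")

print(body)
print(parsed)
```

A '+' that was meant literally has to arrive as '%2B' if the receiver decodes with form rules.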
Then I went through the PHP documentation and found this:
rawurlencode() - URL-encode according to RFC 3986
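So PHP added a separate pair, rawurlencode and rawurldecode, to implement the standard...
Couldn't it have been the other way around? After all, most people will probably reach for urlencode. PHP really is a pain...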
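One way to sidestep the mismatch, sketched here in Python 3 as an assumption rather than as what was actually deployed, is to percent-encode the signature fully before handing it out, so that no bare '+' ever reaches the other side. The helper name encode_signature and the sample signature are hypothetical. In the sketch, unquote plays the role of a decoder that follows RFC 3986 (like PHP's rawurldecode), and unquote_plus the role of one that follows the form rules (like PHP's urldecode).

```python
from urllib.parse import quote, unquote, unquote_plus


def encode_signature(signature: str) -> str:
    """Percent-encode every reserved character, including '+', '/' and '='."""
    return quote(signature, safe="")


# Hypothetical Base64-style signature, made up for illustration.
sig = "ab+cd/ef=="
wire = encode_signature(sig)

# With no bare '+' left in the encoded form, both decoding styles agree.
rfc_decoded = unquote(wire)
form_decoded = unquote_plus(wire)

print(wire, rfc_decoded, form_decoded)
```

Since the '+' travels as '%2B', it no longer matters which of the two decoders the third party calls.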