读书人

Displaying Character Set Encodings

Published: 2012-07-05 07:59:18 Author: rapoo




Output
Charset: US-ASCII
  Input: ¿Mañana?
Encoded:
   0: 3f (?)
   1: 4d (M)
   2: 61 (a)
   3: 3f (?)
   4: 61 (a)
   5: 6e (n)
   6: 61 (a)
   7: 3f (?)

Charset: ISO-8859-1
  Input: ¿Mañana?
Encoded:
   0: bf (¿)
   1: 4d (M)
   2: 61 (a)
   3: f1 (ñ)
   4: 61 (a)
   5: 6e (n)
   6: 61 (a)
   7: 3f (?)

Charset: UTF-8
  Input: ¿Mañana?
Encoded:
   0: c2 (Â)
   1: bf (¿)
   2: 4d (M)
   3: 61 (a)
   4: c3 (Ã)
   5: b1 (±)
   6: 61 (a)
   7: 6e (n)
   8: 61 (a)
   9: 3f (?)

Charset: UTF-16BE
  Input: ¿Mañana?
Encoded:
   0: 00
   1: bf (¿)
   2: 00
   3: 4d (M)
   4: 00
   5: 61 (a)
   6: 00
   7: f1 (ñ)
   8: 00
   9: 61 (a)
  10: 00
  11: 6e (n)
  12: 00
  13: 61 (a)
  14: 00
  15: 3f (?)

Charset: UTF-16LE
  Input: ¿Mañana?
Encoded:
   0: bf (¿)
   1: 00
   2: 4d (M)
   3: 00
   4: 61 (a)
   5: 00
   6: f1 (ñ)
   7: 00
   8: 61 (a)
   9: 00
  10: 6e (n)
  11: 00
  12: 61 (a)
  13: 00
  14: 3f (?)
  15: 00

Charset: UTF-16
  Input: ¿Mañana?
Encoded:
   0: fe (þ)
   1: ff (ÿ)
   2: 00
   3: bf (¿)
   4: 00
   5: 4d (M)
   6: 00
   7: 61 (a)
   8: 00
   9: f1 (ñ)
  10: 00
  11: 61 (a)
  12: 00
  13: 6e (n)
  14: 00
  15: 61 (a)
  16: 00
  17: 3f (?)
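The listing that produced this output is not included on this page. The following is a minimal sketch, not the book's original code, of how such a dump could be generated in Java. The class name EncodeDump and its helper methods are hypothetical. The input string is inferred from the encoded bytes shown above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical class name; an illustrative sketch, not the original listing.
final class EncodeDump {
    // Encode `input` with the named charset and return the raw bytes.
    // Charset.encode substitutes the charset's replacement byte (0x3f for
    // US-ASCII) for characters it cannot map.
    static byte[] encode(String charsetName, String input) {
        ByteBuffer buf = Charset.forName(charsetName).encode(input);
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Render bytes as space-separated two-digit hex, e.g. "c2 bf".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Inverted question mark, "Ma", n with tilde, "ana?"
        String input = "\u00bfMa\u00f1ana?";
        String[] names = {"US-ASCII", "ISO-8859-1", "UTF-8",
                          "UTF-16BE", "UTF-16LE", "UTF-16"};
        for (String name : names) {
            System.out.println("Charset: " + name);
            System.out.println("Encoded: " + toHex(encode(name, input)));
        }
    }
}
```

The sketch prints each encoding on one line rather than one byte per line. The two characters that US-ASCII cannot represent come out as 3f, which is why that dump shows a question mark in their place.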


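UTF-16BE and UTF-16LE encode each character as a 2-byte value (characters
outside the Basic Multilingual Plane take two such values, a surrogate pair).
A decoder for these encodings must therefore either know in advance how the
data was encoded or have a way to determine the byte order from the encoded
stream itself. The UTF-16 encoding recognizes a byte-order mark for this
purpose: the Unicode character \uFEFF. The byte-order mark has its special
meaning only at the beginning of an encoded stream. If the value appears
later, it is mapped according to its defined Unicode meaning (zero-width
non-breaking space). A foreign, little-endian system might prepend \uFEFF and
encode the stream as UTF-16LE, so that the stream begins with the bytes FF FE;
a reader assuming big-endian order sees these as the noncharacter \uFFFE and
can tell that the bytes are swapped. Using the UTF-16 encoding to prepend and
recognize the byte-order mark lets systems with different internal byte orders
exchange Unicode data.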
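Byte-order detection from the stream can be sketched as follows. This is an illustrative example under stated assumptions: the class BomSniffer and its methods are hypothetical names, and the decoding step relies on the documented behavior of Java's standard UTF-16 charset, which consumes a leading byte-order mark and defaults to big-endian when none is present.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
final class BomSniffer {
    // Return the charset name implied by a leading byte-order mark,
    // or null when the data does not start with one.
    static String detectByteOrder(byte[] data) {
        if (data.length >= 2) {
            int b0 = data[0] & 0xff;
            int b1 = data[1] & 0xff;
            if (b0 == 0xfe && b1 == 0xff) return "UTF-16BE";
            if (b0 == 0xff && b1 == 0xfe) return "UTF-16LE";
        }
        return null;
    }

    // Decode with the generic UTF-16 charset, which honors a leading mark
    // and falls back to big-endian order when there is none.
    static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        byte[] littleEndian = {(byte) 0xff, (byte) 0xfe, 0x4d, 0x00, 0x61, 0x00};
        byte[] bigEndian = {(byte) 0xfe, (byte) 0xff, 0x00, 0x4d, 0x00, 0x61};
        System.out.println(detectByteOrder(littleEndian) + " -> " + decode(littleEndian));
        System.out.println(detectByteOrder(bigEndian) + " -> " + decode(bigEndian));
    }
}
```

Both sample arrays decode to the same two characters, "Ma", even though their bytes are in opposite order, because the mark at the start tells the decoder which order to use.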

UTF-16BE: no byte-order mark; encodes in big-endian byte order.
UTF-16LE: no byte-order mark; encodes in little-endian byte order.
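The difference between the explicit-order charsets and plain UTF-16 can be checked with a short sketch. The class name ExplicitByteOrder and the helper hex are hypothetical. The behavior shown follows the java.nio.charset.Charset documentation: UTF-16BE and UTF-16LE write no mark when encoding and treat an initial mark as an ordinary zero-width non-breaking space when decoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
final class ExplicitByteOrder {
    // Render the encoded form of `s` as space-separated hex bytes.
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("UTF-16BE: " + hex("A", StandardCharsets.UTF_16BE)); // 00 41
        System.out.println("UTF-16LE: " + hex("A", StandardCharsets.UTF_16LE)); // 41 00
        System.out.println("UTF-16:   " + hex("A", StandardCharsets.UTF_16));   // fe ff 00 41

        // Decoding marked input with an explicit-order charset keeps the
        // mark as a character instead of consuming it.
        byte[] marked = {(byte) 0xfe, (byte) 0xff, 0x00, 0x41};
        String decoded = new String(marked, StandardCharsets.UTF_16BE);
        System.out.println(decoded.length());              // 2
        System.out.println(decoded.charAt(0) == '\uFEFF'); // true
    }
}
```

When the byte order is agreed on out of band, the explicit-order charsets save two bytes per stream. When it is not, plain UTF-16 is the safer choice for exchange.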

For more information, see Chapter 6 of Java NIO, published by O'Reilly.
