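Notes on garbled characters and files (mojibake) in Java

This is a well-worn topic, but I am writing it down so that others can look it up.

Conclusions:
1. When a String is converted to or from a byte array without naming a charset (String.getBytes(), new String(byte[])), Java uses the platform default charset, for example GBK (code page 936) on Simplified Chinese Windows. JDK 18 and later default to UTF-8 unless configured otherwise.
2. How the string literals in a Java source file are interpreted depends on the encoding javac assumes at compile time: the -encoding option if it is given, otherwise the default taken from the compile-time environment (locale).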
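
One way to avoid depending on the platform default is to name the charset at every conversion. The following is only a minimal sketch: the class name ExplicitCharsetDemo and its helper methods are made up for this note, and the sample word from the test program is written with Unicode escapes so that the compile-time encoding does not matter either.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original test program.
class ExplicitCharsetDemo {

    // The sample word from the test program, written as Unicode escapes
    // so that javac's source encoding is irrelevant.
    static final String SAMPLE = "\u8bfa\u57fa\u4e9a";

    // Encode with an explicit charset instead of the platform default.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The default differs between machines, locales and JDK versions.
        System.out.println("Default charset: " + Charset.defaultCharset());

        byte[] encoded = toUtf8(SAMPLE);
        System.out.println("UTF-8 length: " + encoded.length);
        System.out.println("Round trip ok: " + SAMPLE.equals(fromUtf8(encoded)));
    }
}
```

With both directions pinned to UTF-8, the result no longer changes when the same class is run under a different locale.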
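
An aside:

A reminder about garbled Chinese text over HTTP: when an Ajax request containing Chinese text is sent with prototype.js, for example, the receiving Servlet may see garbled characters. The following API calls can be used to fix this:
ServletRequest.setCharacterEncoding
ServletResponse.setCharacterEncoding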
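
To see why those calls help, here is a small stand-alone sketch of what happens when a UTF-8 request body is decoded as ISO-8859-1, which is the default the Servlet specification prescribes when no request encoding is known. It does not use the Servlet API itself; the class RequestDecodingDemo and its methods are hypothetical, and it assumes the browser sent the Ajax body as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the decoding step a servlet container performs.
class RequestDecodingDemo {

    // Decode raw request bytes with the charset the container believes is in use.
    static String decode(byte[] body, Charset assumed) {
        return new String(body, assumed);
    }

    // Workaround for text that was already decoded as ISO-8859-1:
    // ISO-8859-1 maps every byte to exactly one char, so the original
    // bytes can be recovered and decoded again with the right charset.
    static String repair(String wronglyDecoded) {
        byte[] raw = wronglyDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "\u8bfa\u57fa\u4e9a"; // sample text, as Unicode escapes
        byte[] body = sent.getBytes(StandardCharsets.UTF_8);

        String garbled = decode(body, StandardCharsets.ISO_8859_1);
        String correct = decode(body, StandardCharsets.UTF_8);

        System.out.println("garbled length = " + garbled.length());
        System.out.println("decoded as UTF-8 matches: " + sent.equals(correct));
        System.out.println("repaired matches: " + sent.equals(repair(garbled)));
    }
}
```

In a real servlet, calling request.setCharacterEncoding("UTF-8") before the first getParameter or getReader call makes the container decode with UTF-8 directly, so no repair step is needed.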

Program code:

/**
 * Prints the bytes of a Chinese string under several charsets.
 *
 * @author csbison
 */
public class BTest {

    private final static char[] hexDigits = { '0', '1', '2', '3', '4', '5',
            '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };

    /**
     * @param args unused
     */
    public static void main(String[] args) {
        // This test shows that String <-> byte[] conversion uses the platform default charset
        try {
            String src = "诺基亚";
            String value = null;
            value = bytesToHexString(src.getBytes());
            System.out.println("Default=" + value);

            value = bytesToHexString(src.getBytes("ISO-8859-1"));
            System.out.println("ISO-8859-1=" + value);

            value = bytesToHexString(src.getBytes("GBK"));
            System.out.println("GBK=" + value);

            value = bytesToHexString(src.getBytes("UTF-8"));
            System.out.println("UTF-8=" + value);

            System.out.println("/////////////////////////");

            // Encode with the default charset, then decode with each candidate
            byte[] aa = "诺基亚".getBytes();
            System.out.println(new String(aa));
            System.out.println(new String(aa, "GBK"));
            System.out.println(new String(aa, "ISO-8859-1"));
            System.out.println(new String(aa, "UTF-8"));
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

    public static final String bytesToHexString(byte[] buf) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < buf.length; i++) {
            int n = buf[i];
            if (n < 0) {
                n = 256 + n; // bytes are signed in Java; map to 0..255
            }
            int d1 = n / 16; // high nibble
            int d2 = n % 16; // low nibble

            sb.append(hexDigits[d1]);
            sb.append(hexDigits[d2]);
        }
        return sb.toString();
    }

}

Test results on Windows:
Default=c5b5bbf9d1c7
ISO-8859-1=3f3f3f
GBK=c5b5bbf9d1c7
UTF-8=e8afbae59fbae4ba9a
/////////////////////////
诺基亚
诺基亚
?|ì?¨′??
???
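
The 3f3f3f in the ISO-8859-1 line is simply three question marks: getBytes silently replaces any character the target charset cannot represent with "?" (0x3f). A minimal sketch of how this could be detected up front with CharsetEncoder.canEncode follows; the class UnmappableDemo and its method names are invented for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original test program.
class UnmappableDemo {

    // True if every character of s can be represented in the given charset.
    static boolean isEncodable(String s, Charset cs) {
        return cs.newEncoder().canEncode(s);
    }

    // Lower-case hex, same output format as bytesToHexString in the test program.
    static String hex(byte[] buf) {
        StringBuilder sb = new StringBuilder();
        for (byte b : buf) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String src = "\u8bfa\u57fa\u4e9a"; // the sample word, as Unicode escapes

        System.out.println(isEncodable(src, StandardCharsets.ISO_8859_1)); // false
        System.out.println(isEncodable(src, StandardCharsets.UTF_8));      // true

        // Unmappable characters become '?' (0x3f), one byte each.
        System.out.println(hex(src.getBytes(StandardCharsets.ISO_8859_1))); // 3f3f3f
        System.out.println(hex(src.getBytes(StandardCharsets.UTF_8)));      // e8afbae59fbae4ba9a
    }
}
```

Once the text has been reduced to question marks, the original characters cannot be recovered, so checking before encoding is the only safe point.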

Test results on Linux:
1. With Locale=zh: compile with javac, then run with java. The result:
-bash-3.00$ java BTest
Default=c5b5bbf9d1c7
ISO-8859-1=3f3f3f
GBK=c5b5bbf9d1c7
UTF-8=e8afbae59fbae4ba9a
/////////////////////////
诺基亚
诺基亚
???ù??
???
2. With Locale=C: run with java directly (that is, the class was still compiled under Locale=zh). The result:
-bash-3.00$ java BTest
Default=3f3f3f
ISO-8859-1=3f3f3f
GBK=c5b5bbf9d1c7
UTF-8=e8afbae59fbae4ba9a
/////////////////////////
???
???
???
???

3. With Locale=C: compile with javac, then run with java. The result:
-bash-3.00$ java BTest
Default=c5b5bbf9d1c7
ISO-8859-1=c5b5bbf9d1c7
GBK=3f3f3fa8b43f3f
UTF-8=c385c2b5c2bbc3b9c391c387
/////////////////////////
诺基亚
???
诺基亚
???
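
Case 3 can be reproduced without changing the locale. The sketch below assumes, as the hex output suggests, that the source file was saved as GBK and that javac under Locale=C read it as ISO-8859-1, so each of the six bytes became a character of its own. CompileEncodingDemo is a hypothetical class; the byte values are taken from the GBK line of the earlier results.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: simulates javac reading a GBK source file as ISO-8859-1.
class CompileEncodingDemo {

    // GBK bytes of the literal as stored in the source file (c5b5 bbf9 d1c7).
    static final byte[] SOURCE_BYTES = {
        (byte) 0xc5, (byte) 0xb5, (byte) 0xbb, (byte) 0xf9, (byte) 0xd1, (byte) 0xc7
    };

    // What the compiler ends up with if it reads those bytes as ISO-8859-1.
    static String misreadLiteral() {
        return new String(SOURCE_BYTES, StandardCharsets.ISO_8859_1);
    }

    // Lower-case hex, same output format as bytesToHexString in the test program.
    static String hex(byte[] buf) {
        StringBuilder sb = new StringBuilder();
        for (byte b : buf) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = misreadLiteral();
        System.out.println("chars=" + literal.length());                                        // chars=6
        System.out.println("ISO-8859-1=" + hex(literal.getBytes(StandardCharsets.ISO_8859_1))); // ISO-8859-1=c5b5bbf9d1c7
        System.out.println("UTF-8=" + hex(literal.getBytes(StandardCharsets.UTF_8)));           // UTF-8=c385c2b5c2bbc3b9c391c387
    }
}
```

The two hex lines match the ISO-8859-1 and UTF-8 lines of case 3. The text presumably still displays correctly in that run only because the six ISO-8859-1 characters are written back out as the same six bytes, which the terminal then interprets as GBK.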

For more on character encoding, see:
http://www.utf.com.cn/article/s320 (UTF-8 character set basics, part 1)
http://www.phpweblog.net/XBOX/archive/2008/09/06/5726.html
http://javajiao.iteye.com/blog/151995