nutch网页快照乱码解决方法
?
?
ParseData ParseData = bean.getParseData(details); String content = null; String contentType = ParseData.getMeta(Metadata.CONTENT_TYPE); if (contentType.startsWith("text/html")) { // FIXME : it's better to emit the original 'byte' sequence // with 'charset' set to the value of 'CharEncoding', // but I don't know how to emit 'byte sequence' in JSP. // out.getOutputStream().write(bean.getContent(details)) may work, // but I'm not sure. String encoding = ParseData.getMeta("CharEncodingForConversion"); if (encoding != null) { try { content = new String(bean.getContent(details), encoding); } catch (UnsupportedEncodingException e) { // fallback to windows-1252 content = new String(bean.getContent(details), "windows-1252"); } } else content = new String(bean.getContent(details),"GBK"); }?