Android SAX parsing error: element content is retrieved incompletely

Published: 2013-09-05 16:02:07 Author: rapoo

Reposted from: http://blog.csdn.net/feng88724/article/details/7013675

Before getting to the error itself, take a look at the following code. [◆ The parsing approach below is WRONG ×]

[java]

    import java.util.ArrayList;
    import java.util.List;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    import android.util.Log;

    public class XmlHandler extends DefaultHandler {

        private final String TAG = this.getClass().getSimpleName();

        /** Tag names defined in the XML file */
        private final String TAG_Article = "Article";
        private final String TAG_ArticleID = "ArticleID";
        private final String TAG_Title = "Title";
        private final String TAG_Date = "Date";
        private final String TAG_SmallPictures = "SmallPictures";
        private final String TAG_LargePictures = "LargePictures";
        private final String TAG_Category = "Category";
        private static final String TAG_HeadNote = "HeadNote";
        private static final String TAG_SubTitle = "SubTitle";
        private static final String TAG_Source = "Source";

        // Tag currently being parsed
        private String currentName;

        // A single article
        private News news = null;

        // List of articles
        private List<News> newsList = null;

        // Time at which parsing started
        private long start_time;

        private boolean flag = false;

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            super.characters(ch, start, length);

            if (!flag) {
                return;
            }
            // Read the value.
            // BUG: each call overwrites whatever the previous call stored.
            String value = new String(ch, start, length);
            Log.d(TAG, "Element: " + currentName + " Element Value: " + value);
            if (value != null) {
                if (TAG_ArticleID.equals(currentName)) {
                    news.setArticleId(value);
                } else if (TAG_Title.equals(currentName)) {
                    news.setTitle(value);
                } else if (TAG_Date.equals(currentName)) {
                    news.setDate(value);
                } else if (TAG_Category.equals(currentName)) {
                    news.setCategory(value);
                } else if (TAG_SmallPictures.equals(currentName)) {
                    news.setSmallPicture(value);
                } else if (TAG_LargePictures.equals(currentName)) {
                    news.setLargePicture(value);
                } else if (TAG_HeadNote.equals(currentName)) {
                    news.setHeadNote(value);
                } else if (TAG_SubTitle.equals(currentName)) {
                    news.setSubTitle(value);
                } else if (TAG_Source.equals(currentName)) {
                    news.setSource(value);
                }
            }
        }

        @Override
        public void startDocument() throws SAXException {
            super.startDocument();

            start_time = System.currentTimeMillis();
            newsList = new ArrayList<News>();
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            super.startElement(uri, localName, qName, attributes);
            this.currentName = localName;
            flag = true;
            if (TAG_Article.equals(localName)) {
                news = new News();
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(uri, localName, qName);
            flag = false;

            if (TAG_Article.equals(localName)) {
                newsList.add(news);
            }
        }

        @Override
        public void endDocument() throws SAXException {
            super.endDocument();

            long end = System.currentTimeMillis();
            Log.d(TAG, "Parse List's Xml Cost: " + (end - start_time) + " !!");
        }
    }


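Search Baidu or Google for "Android Sax parsing" and the samples you find are, without exception, written this way. Infuriating... Even some books do it, for example 《Android开发入门与实践》. (I checked this book myself; I can't speak for others.)

True, this usually works, and in most cases the parsed result is correct. But every so often it goes wrong, and you are left scratching your head: the data is fine, and the parsing code looks fine too... how strange. In fact, the problem lies entirely in the code above!

Everyone assumes the SAX parsing flow is roughly: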

startDocument -> startElement -> characters -> endElement -> endDocument
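
The callback order can be observed with a small tracing handler. This is only a sketch: the class name `CallbackTrace` and its `trace` helper are made up for illustration, and it runs on a desktop JVM through `javax.xml.parsers` rather than on Android. It records which callback fires, in order, for a one-element document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tracing handler for illustration; not part of the original article.
class CallbackTrace extends DefaultHandler {

    // Names of the callbacks, in the order the parser invoked them.
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("startElement");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be recorded more than once for a single element.
        events.add("characters");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses the given XML string and returns the recorded callback names.
    static List<String> trace(String xml) {
        try {
            CallbackTrace handler = new CallbackTrace();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trace("<ArticleID>1000553</ArticleID>"));
    }
}
```

The list always starts with startDocument and startElement and ends with endElement and endDocument; how many characters entries sit in between is up to the parser.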

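That is right as far as it goes: startElement reads the opening tag, endElement reads the closing tag, and characters reads the value in between. But everyone naively assumes that characters runs only once per element and delivers the whole content in that single call. That is exactly where it goes wrong!

In fact characters may well be called several times. When the content contains line breaks, \t, and the like, it is quite likely to run more than once, and more generally the SAX contract allows a parser to deliver contiguous character data either in one chunk or split across several calls. Some will ask: if my content has none of those characters, will it run just once? Look at my actual test results: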

[Image: screenshot of the log output from the test run]

The XML used for the test is as follows:

[xml]

    <News>
        <Article>
            <ArticleID>1000555</ArticleID>
            <Title><![CDATA[ 郑州“亚洲第一桥”通车6年成危桥 ]]></Title>
            <Date>2011-11-25 14:23:52</Date>
            <SmallPictures>livenews/images/s20.png</SmallPictures>
            <LargePictures>livenews/images/l20.png</LargePictures>
            <Category>闻天下</Category>
            <HeadNote></HeadNote>
            <SubTitle></SubTitle>
            <Author></Author>
            <Source>人民日报</Source>
            <Abstract></Abstract>
        </Article>
        <Article>
            <ArticleID>1000554</ArticleID>
            <Title><![CDATA[ 内地事业单位拟设统一工资制度 ]]></Title>
            <Date>2011-11-25 14:22:33</Date>
            <Category><![CDATA[ 闻天下 ]]></Category>
            <HeadNote></HeadNote>
            <SubTitle></SubTitle>
            <Author></Author>
            <Source></Source>
            <Abstract></Abstract>
        </Article>
        <Article>
            <ArticleID>1000553</ArticleID>
            <Title></Title>
            <Date>2011-11-25 14:21:23</Date>
            <SmallPictures>livenews/images/s21.png</SmallPictures>
            <LargePictures>livenews/images/l21.png</LargePictures>
            <Category><![CDATA[ 星娱乐 ]]></Category>
            <HeadNote></HeadNote>
            <SubTitle></SubTitle>
            <Author></Author>
            <Source><![CDATA[ 凤凰网综合 ]]></Source>
            <Abstract></Abstract>
        </Article>
    </News>

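The log shows clearly that while parsing <ArticleID>1000553</ArticleID>, characters ran twice and the content "1000553" was read in two pieces. With the approach above, the final result is ArticleID = 00553, because the second call overwrites the value stored by the first. If your app needs this id to fetch further content (here, the full news article by id), you are out of luck.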
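
The effect can be reproduced without a parser by calling characters by hand. The sketch below is illustrative only: `ChunkedCharactersDemo` and its helpers are hypothetical names, and the split point ("10" followed by "00553") is an assumption chosen to match the reported result, since the article does not state where the parser actually split the text.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original article.
class ChunkedCharactersDemo {

    // Mimics the flawed handler: every characters() call overwrites the value.
    static class OverwritingHandler extends DefaultHandler {
        String value;

        @Override
        public void characters(char[] ch, int start, int length) {
            value = new String(ch, start, length);
        }
    }

    // Mimics the fixed handler: every characters() call appends to a buffer.
    static class AccumulatingHandler extends DefaultHandler {
        final StringBuilder sb = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            sb.append(ch, start, length);
        }
    }

    // Delivers `text` to the handler in two chunks split at index `splitAt`,
    // which is something a SAX parser is allowed to do.
    static void feedInTwoChunks(DefaultHandler handler, String text, int splitAt) {
        char[] buf = text.toCharArray();
        try {
            handler.characters(buf, 0, splitAt);
            handler.characters(buf, splitAt, buf.length - splitAt);
        } catch (SAXException e) {
            // The handlers above never throw; wrap to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    static String overwriteResult(String text, int splitAt) {
        OverwritingHandler handler = new OverwritingHandler();
        feedInTwoChunks(handler, text, splitAt);
        return handler.value;
    }

    static String accumulateResult(String text, int splitAt) {
        AccumulatingHandler handler = new AccumulatingHandler();
        feedInTwoChunks(handler, text, splitAt);
        return handler.sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(overwriteResult("1000553", 2));   // prints 00553
        System.out.println(accumulateResult("1000553", 2));  // prints 1000553
    }
}
```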

Enough talk; here is the correct way to write it. [★ The parsing approach below is the CORRECT one √]

[java]

    import java.util.ArrayList;
    import java.util.List;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    import android.util.Log;

    public class XmlHandler extends DefaultHandler {

        private final String TAG = this.getClass().getSimpleName();

        /** Tag names defined in the XML file */
        private final String TAG_Article = "Article";
        private final String TAG_ArticleID = "ArticleID";
        private final String TAG_Title = "Title";
        private final String TAG_Date = "Date";
        private final String TAG_SmallPictures = "SmallPictures";
        private final String TAG_LargePictures = "LargePictures";
        private final String TAG_Category = "Category";
        private static final String TAG_HeadNote = "HeadNote";
        private static final String TAG_SubTitle = "SubTitle";
        private static final String TAG_Source = "Source";

        // A single article
        private News news = null;

        // List of articles
        private List<News> newsList = null;

        // Time at which parsing started
        private long start_time;

        // (1) Buffer that collects the text of the current element
        private StringBuilder sb = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            super.characters(ch, start, length);

            // (2) No matter how many times characters runs between startElement
            // and endElement, every chunk is appended to the StringBuilder,
            // so no content is lost.
            sb.append(ch, start, length);
        }

        @Override
        public void startDocument() throws SAXException {
            super.startDocument();

            start_time = System.currentTimeMillis();
            newsList = new ArrayList<News>();
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            super.startElement(uri, localName, qName, attributes);
            // (3) Before collecting data for a new tag, clear the old data
            sb.setLength(0);
            if (TAG_Article.equals(localName)) {
                news = new News();
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(uri, localName, qName);

            // (4) The value used to be read in characters; it is now read here
            String value = sb.toString();

            if (TAG_ArticleID.equals(localName)) {
                news.setArticleId(value);
            } else if (TAG_Title.equals(localName)) {
                news.setTitle(value);
            } else if (TAG_Date.equals(localName)) {
                news.setDate(value);
            } else if (TAG_Category.equals(localName)) {
                news.setCategory(value);
            } else if (TAG_SmallPictures.equals(localName)) {
                news.setSmallPicture(value);
            } else if (TAG_LargePictures.equals(localName)) {
                news.setLargePicture(value);
            } else if (TAG_HeadNote.equals(localName)) {
                news.setHeadNote(value);
            } else if (TAG_SubTitle.equals(localName)) {
                news.setSubTitle(value);
            } else if (TAG_Source.equals(localName)) {
                news.setSource(value);
            }
            if (TAG_Article.equals(localName)) {
                newsList.add(news);
            }
        }

        @Override
        public void endDocument() throws SAXException {
            super.endDocument();

            long end = System.currentTimeMillis();
            Log.d(TAG, "Parse List's Xml Cost: " + (end - start_time) + " !!");
        }
    }


To sum up, there are three points:

1. In startElement: new StringBuilder(); or sb.setLength(0); (I recommend the latter)
2. In characters: sb.append(ch, start, length);
3. In endElement: sb.toString(); only at this point does the StringBuilder hold the complete parsed value
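
The three steps above can be sketched as a compact handler that runs on a desktop JVM. Treat it as an illustration under stated assumptions, not as the article's own code: `TextCollectingHandler`, its `values` map, and the `parse` helper are hypothetical names, the parser comes from `javax.xml.parsers` instead of Android, and the factory is set namespace-aware so that `localName` is populated the way the article's handler expects. The extra buffer reset in endElement is a precaution added here so that a parent element does not pick up leftover text from its last child.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; names are illustrative, not from the article.
class TextCollectingHandler extends DefaultHandler {

    // Collects the text of the element currently being parsed.
    private final StringBuilder sb = new StringBuilder();

    // Element name -> complete text content, in document order.
    private final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Step 1: clear the buffer when a new element starts.
        sb.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Step 2: append every chunk, however many calls arrive.
        sb.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Step 3: only now is the element's text complete.
        values.put(localName, sb.toString());
        // Extra precaution (not in the article): reset so that an enclosing
        // element does not inherit this element's text.
        sb.setLength(0);
    }

    // Parses the given XML string and returns the collected element texts.
    static Map<String, String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is filled in
            SAXParser parser = factory.newSAXParser();
            TextCollectingHandler handler = new TextCollectingHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Article><ArticleID>1000553</ArticleID>"
                + "<Title><![CDATA[Hello ]]>world</Title></Article>";
        System.out.println(parse(xml));
    }
}
```

The Title element mixes a CDATA section with plain text, a case where a parser may well deliver the content in more than one characters call; because the handler accumulates, the stored value is the full "Hello world" regardless of how the chunks arrive.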

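With this approach, data no longer goes missing mysteriously. It also removes the need for a currentTag-style field as in the wrong approach, which complicated the logic and still produced errors!

I hope everyone sees this article early and stops getting burned by this!!!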
