
Help with a regular expression: how to approach it

Posted: 2014-01-05 18:22:55 Author: rapoo

Help with a regular expression
This post was last edited by AfterSeptember on 2013-12-05 15:41:26
The source string is:

<tr class="">
<td class="title">
<a href="http://www.douban.com/group/topic/46630489/" title="po照会遇见熟人吗?" class="">po照会遇见熟人吗?</a>
</td>
<td nowrap="nowrap"><a href="http://www.douban.com/group/people/toffybobo/" class="">ToTo</a></td>
<td nowrap="nowrap" class="">233</td>
<td nowrap="nowrap" class="time">12-05 11:11</td>
</tr>
<tr class="">
<td class="title">
<a href="http://www.douban.com/group/topic/46681600/" title="今天出门你们有没有戴口罩" class="">今天出门你们有没有戴口罩</a>
</td>
<td nowrap="nowrap"><a href="http://www.douban.com/group/people/1354411/" class="">丛生</a></td>
<td nowrap="nowrap" class="">8</td>
<td nowrap="nowrap" class="time">12-05 11:06</td>
</tr>



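I want to extract the pairs 46630489 233 and 46681600 8 (the topic ID and the number in the cell with the empty class attribute). How should I write the regular expression?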

[Solution]
>>> s = '''<tr class="">
<td class="title">
<a href="http://www.douban.com/group/topic/46630489/" title="po照会遇见熟人吗?" class="">po照会遇见熟人吗?</a>
</td>
<td nowrap="nowrap"><a href="http://www.douban.com/group/people/toffybobo/" class="">ToTo</a></td>
<td nowrap="nowrap" class="">233</td>
<td nowrap="nowrap" class="time">12-05 11:11</td>
</tr>
<tr class="">
<td class="title">
<a href="http://www.douban.com/group/topic/46681600/" title="今天出门你们有没有戴口罩" class="">今天出门你们有没有戴口罩</a>
</td>
<td nowrap="nowrap"><a href="http://www.douban.com/group/people/1354411/" class="">丛生</a></td>
<td nowrap="nowrap" class="">8</td>
<td nowrap="nowrap" class="time">12-05 11:06</td>
</tr>'''


>>> import re
>>> res = r'topic/(\d+?)/.*?class="">(\d+?)</td>'
>>> m = re.findall(res, s, re.S | re.M)
>>> m
[('46630489', '233'), ('46681600', '8')]
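
For anyone who wants this as a standalone script rather than an interactive session, here is a minimal sketch of the same approach. The names `ROW_PATTERN`, `extract_pairs`, and `html` are made up for illustration and are not from the original reply. It assumes the rows always look like the sample above: a `topic/<digits>/` link followed later by a `class="">` cell holding only digits. It uses greedy `\d+` instead of `\d+?`, which gives the same matches here because the digits are bounded by the literal that follows them, and it passes only `re.S`, since the pattern has no `^` or `$` for `re.M` to affect.

```python
import re

# Sample rows copied from the question.
html = '''<tr class="">
<td class="title">
<a href="http://www.douban.com/group/topic/46630489/" title="po照会遇见熟人吗?" class="">po照会遇见熟人吗?</a>
</td>
<td nowrap="nowrap"><a href="http://www.douban.com/group/people/toffybobo/" class="">ToTo</a></td>
<td nowrap="nowrap" class="">233</td>
<td nowrap="nowrap" class="time">12-05 11:11</td>
</tr>
<tr class="">
<td class="title">
<a href="http://www.douban.com/group/topic/46681600/" title="今天出门你们有没有戴口罩" class="">今天出门你们有没有戴口罩</a>
</td>
<td nowrap="nowrap"><a href="http://www.douban.com/group/people/1354411/" class="">丛生</a></td>
<td nowrap="nowrap" class="">8</td>
<td nowrap="nowrap" class="time">12-05 11:06</td>
</tr>'''

# Group 1: the topic ID in the link. Group 2: the digits-only cell.
# re.S lets "." cross line breaks so the lazy ".*?" can span the row.
ROW_PATTERN = re.compile(r'topic/(\d+)/.*?class="">(\d+)</td>', re.S)


def extract_pairs(text):
    """Return a list of (topic_id, count) tuples as integers."""
    return [(int(topic), int(count)) for topic, count in ROW_PATTERN.findall(text)]


pairs = extract_pairs(html)
print(pairs)  # [(46630489, 233), (46681600, 8)]
```

The lazy `.*?` skips the title link and the author link because neither `class="">` there is followed by digits and then `</td>`. If the page layout changes, the pattern would need to be adjusted.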
