Looking for an HTML regular expression: how to approach it

Posted: 2012-01-09 21:05:42 Author: rapoo

Looking for a regular expression for this HTML
The HTML is as follows:
<tr><td width='20' class='hei14'></td><td width='360'><a href=http://news.xinhuanet.com/travel/2007-05/17/content_6108964.htm target='_blank' class='hei14'>武夷山风景名胜区门票价格上调</a><span class='sj'>(05-17)</span></td></tr>

I need to extract:
1. http://news.xinhuanet.com/travel/2007-05/17/content_6108964.htm
2. 武夷山风景名胜区门票价格上调
3. 05-17

[Solution]
Is the format fixed? I assume you want to extract several of these rows at once, so try this:

// Requires: using System.Text.RegularExpressions;
string yourStr = ...........;
// Named groups: url = the href value (quoted or unquoted), text = the link text,
// time = the date inside the parentheses. The \s* between tags tolerates stray whitespace.
MatchCollection mc = Regex.Matches(yourStr, @"<tr[^>]*?>[\s\S]*?<a\s+href=([""']?)(?<url>[^""'\s]*)\1?[^>]*?>\s*(?<text>[^<]*?)\s*</a>\s*<span[^>]*?>\s*\((?<time>[^<\)]*?)\)\s*</span>\s*</td>\s*</tr>", RegexOptions.IgnoreCase);
foreach (Match m in mc)
{
    richTextBox2.Text += m.Groups["url"].Value + "\n";
    richTextBox2.Text += m.Groups["text"].Value + "\n";
    richTextBox2.Text += m.Groups["time"].Value + "\n";
}
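
If you want to try this outside a WinForms app, here is a minimal console sketch. It uses a variant of the pattern above that tolerates whitespace between tags. The class name LinkRowParser and the method Extract are made up for illustration and are not from the original post. It also assumes every row has the same <a>...</a><span>(...)</span> layout as the sample in the question; for markup that varies more than that, a real HTML parser would probably be a safer choice than a regex.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (name chosen for illustration only).
static class LinkRowParser
{
    // Variant of the pattern from the answer; \s* between tags is an added assumption
    // so that stray whitespace in the markup does not break the match.
    private static readonly Regex RowPattern = new Regex(
        @"<tr[^>]*?>[\s\S]*?<a\s+href=([""']?)(?<url>[^""'\s]*)\1?[^>]*?>\s*(?<text>[^<]*?)\s*</a>" +
        @"\s*<span[^>]*?>\s*\((?<time>[^<\)]*?)\)\s*</span>\s*</td>\s*</tr>",
        RegexOptions.IgnoreCase);

    // Returns one { url, text, time } array per matching table row.
    public static List<string[]> Extract(string html)
    {
        var rows = new List<string[]>();
        foreach (Match m in RowPattern.Matches(html))
        {
            rows.Add(new[]
            {
                m.Groups["url"].Value,
                m.Groups["text"].Value,
                m.Groups["time"].Value
            });
        }
        return rows;
    }
}

static class Demo
{
    static void Main()
    {
        // Sample row from the question.
        string html =
            "<tr><td width='20' class='hei14'></td><td width='360'>" +
            "<a href=http://news.xinhuanet.com/travel/2007-05/17/content_6108964.htm " +
            "target='_blank' class='hei14'>武夷山风景名胜区门票价格上调</a>" +
            "<span class='sj'>(05-17)</span></td></tr>";

        foreach (string[] row in LinkRowParser.Extract(html))
        {
            Console.WriteLine(row[0]); // url
            Console.WriteLine(row[1]); // text
            Console.WriteLine(row[2]); // time
        }
    }
}
```

Returning the values from a method instead of appending to richTextBox2 keeps the matching logic separate from the UI, so the same code can feed a text box, a list, or a file.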
