Question about using a regex to extract the links from a page

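Looking for a regex that extracts the links from a page!
I want to get the link addresses out of a page's source code.
For example, extracting the link addresses from this fragment of www.baidu.com: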

<a onClick="this.style.behavior='url(#default#homepage)';this.setHomePage('http://www.baidu.com')" href=http://utility.baidu.com/traf/click.php?id=215&url=http://www.baidu.com>把百度设为首页</a><a href=http://jingjia.baidu.com>企业推广</a> | <a href=http://top.baidu.com>搜索风云榜</a> | <a href=/home.html>关于百度</a> | <a href=http://ir.baidu.com>About Baidu</a>©2008 Baidu <a href=http://www.baidu.com/duty>使用百度前必读</a> <a href=http://www.miibeian.gov.cn target=_blank>京ICP证030173号</a> <a href=http://www.hd315.gov.cn/beian/view.asp?bianhao=010202001092500412><img src=http://gimg.baidu.com/img/gs.gif></a>

I want to get:
http://utility.baidu.com/traf/click.php?id=215&url=http://www.baidu.com
http://jingjia.baidu.com
http://top.baidu.com
/home.html
and so on, i.e. the value that follows each href=.

I am using:

C# code

private string getUrlCode(string StrContent)
{
    string urlCode = null;
    Regex re = new Regex(@"<a\s+href\s*=\s*('(?<href>[^']*)'|""(?<href>[^""]*)""|(?<href>[\S>]*))[^>]*>.*?<u>(?<link>[^<]+)</u>.*?</a>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    foreach (Match m in re.Matches(StrContent))
    {
        urlCode = urlCode + m.Groups["href"].Value + "\r\n";
    }
    return urlCode;
}

It has no effect!

[Solution]

C# code

<a.+?href=(?<href>[^>]+)>
[Solution]

C# code

(?<=href=).*?(?=>|\s)
[Solution]

Try:

C# code

MatchCollection mc = Regex.Matches(input, "href=(?<href>[^>]*)", RegexOptions.IgnoreCase); // input: the string to match
foreach (Match match in mc)
    Response.Write(match.Groups["href"].Value + "<br />");
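
For reference, the pattern in the question only matches when `href` comes immediately after `<a` and the link contains a `<u>…</u>` element. In the sample, the first link has `onClick` before `href`, and none of the links contain `<u>`, so nothing is returned. The following is a minimal sketch that combines the ideas from the replies above; the names `LinkExtractor`, `HrefPattern` and `GetLinks` are made up for illustration and do not come from the thread. It assumes well-behaved markup like the sample, and it is not a substitute for a real HTML parser (for example, an earlier attribute value that itself contains the text `href=` could confuse it).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
public static class LinkExtractor
{
    // Finds an <a ...> tag, skips any attributes that precede href, then
    // captures the href value in one of three forms:
    //   href='...'   href="..."   href=unquoted
    // The unquoted branch stops at whitespace or '>', so a following
    // attribute such as target=_blank is not swallowed into the URL.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*(?:'(?<href>[^']*)'|""(?<href>[^""]*)""|(?<href>[^\s>]+))",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns every captured href value in document order.
    public static List<string> GetLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups["href"].Value);
        }
        return links;
    }
}
```

Called on the Baidu fragment from the question, `LinkExtractor.GetLinks(html)` should return one entry per `<a>` tag, beginning with the four addresses listed in the question, and `http://www.miibeian.gov.cn` comes back without the trailing ` target=_blank`. The `<img src=...>` inside the last link is ignored because the pattern only looks at `<a` tags. To get the single newline-separated string that the original `getUrlCode` built, the result can be joined with `string.Join("\r\n", links)`.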
