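Looking for a regex that matches the body tag
The input is scraped HTML text.
I want to use a regex to match everything inside the body tag. How should I write it?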
[Solution]
/<body>.*<\/body>/
?? I'm not quite sure what the OP is asking.
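A quick check of where this pattern tends to fail on real pages: in JavaScript, `.` does not match line terminators by default, and the literal `<body>` does not match a tag that carries attributes. The sample strings below are made up for illustration and do not come from the thread.

```javascript
// Pattern from the reply above.
var naive = /<body>.*<\/body>/;

// Hypothetical inputs, not taken from the thread.
var oneLine = '<body><p>hi</p></body>';
var multiLine = '<body>\n<p>hi</p>\n</body>';
var withAttrs = '<body class="home"><p>hi</p></body>';

var results = {
  oneLine: naive.test(oneLine),     // everything sits on one line
  multiLine: naive.test(multiLine), // "." stops at "\n"
  withAttrs: naive.test(withAttrs)  // "<body>" has to match literally
};

console.log(results);
```

Only the single-line input matches, which is why the later replies switch to `[\s\S]` and allow attributes in the opening tag.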
[Solution]
If it's a string, you can use:
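var bodyHTML;
try
{
    // eval() cannot parse HTML, so DOMParser is used here instead (browser only).
    // str is assumed to hold the HTML text.
    var doc = new DOMParser().parseFromString(str, 'text/html');
    bodyHTML = doc.body.innerHTML;
}
catch (e)
{
    // Leave bodyHTML undefined if parsing fails.
}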
[Solution]
<body>([\s\S]+?)</body>
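[Solution]
You're probably saving the page, opening it in some editor, and searching there, right?
In EmEditor, I went into the Customize settings for search, ticked the option that lets the regex match newline characters, and raised the number of lines such a match may span. After that, the regex matches.
Otherwise, the moderator's pattern works on a string but not inside the editor.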
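The editor setting described above has a rough counterpart in JavaScript: the `s` (dotAll) flag, available since ES2018, lets `.` match line terminators. A small sketch, assuming an engine that supports the flag; the sample string is invented.

```javascript
// Hypothetical multi-line input.
var page = '<html>\n<body>\n<p>hi</p>\n</body>\n</html>';

// Without the "s" flag, "." stops at each line break, so nothing matches.
var plain = /<body>(.*?)<\/body>/.exec(page);

// With the "s" flag, "." also matches "\n" and "\r".
var dotAll = /<body>(.*?)<\/body>/s.exec(page);

var inner = dotAll ? dotAll[1] : null;
console.log(plain, JSON.stringify(inner));
```

On engines without the flag, `[\s\S]` is the usual substitute for a dot that matches everything.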
[Solution]
/<body[\s\S]*?>([\s\S]*?)<\/body>/
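To see what the capture group returns, the pattern can be run against a body tag that carries attributes. This is only a sketch with an invented input string; `[\s\S]` is used because it matches any character, line breaks included.

```javascript
// Hypothetical input: body tag with an attribute, content on several lines.
var html = '<html><body onunload="setheight()">\r\n<div>content</div>\r\n</body></html>';

// Pattern from the reply above: group 1 is everything between the tags.
var re = /<body[\s\S]*?>([\s\S]*?)<\/body>/;
var match = re.exec(html);

// match[0] is the whole element, match[1] only the inner content.
var innerHtml = match ? match[1] : null;
console.log(JSON.stringify(innerHtml));

// The version with a literal "<body>" does not match this input.
var strict = /<body>([\s\S]+?)<\/body>/.exec(html);
console.log(strict);
```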
[Solution]
var s="\r\n\r\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\r\n\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\">\r\n<head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" /><title>\r\n\r\n</title>\r\n <script type=\"text/javascript\" src=\"js/jquery-1.6.2.js\"><\/script>\r\n\t<script type=\"text/javascript\" src=\"js/custom.js\"><\/script>\r\n\r\n\t<link href=\"css/ui/ui.base.css\" rel=\"stylesheet\" media=\"all\" /><link href=\"css/themes/gray_standard/ui.css\" rel=\"stylesheet\" title=\"style\" media=\"all\" />\r\n\r\n\t<!--[if IE 6]>\r\n\t<link href=\"css/ie6.css\" rel=\"stylesheet\" media=\"all\" />\r\n\t<script src=\"js/pngfix.js\"><\/script>\r\n\t<script>\r\n\t /* Fix IE6 Transparent PNG */\r\n\t DD_belatedPNG.fix(\'.logo, ul#dashboard-buttons li a, .response-msg, #search-bar input\');\r\n\t<\/script>\r\n\t<![endif]-->\r\n \r\n</head>\r\n<body onunload=\"setheight()\">\r\n <form method=\"post\" action=\"welcome.aspx?action=get&name=\" id=\"form1\">\r\n<div class=\"aspNetHidden\">\r\n<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"/wEPDwUKMTY1NDU2MTA1MmRkj69cULYw1yCwNSQyt8QDH6tFWzo6ZRbIVECBzNiRyAY=\" />\r\n</div>\r\n\r\n <div>\r\n \r\n <div class=\"inner-page-title\">\r\n <h2>Welcome to Admintasia 2.3 Live Demonstration</h2>\r\n\t\t\t\t\t <span>You can start building your next user interface with this powerful UI framework !</span>\r\n\t\t\t\t </div>\r\n <div class=\"clear\"></div>\r\n\t\t\t\t <div class=\"content-box\">\r\n\t\t\t\t\t <div class=\"two-column\">\r\n\t\t\t\t\t\t <div class=\"column\">\r\n\t\t\t\t\t\t\t <div class=\"portlet ui-widget ui-widget-content ui-helper-clearfix ui-corner-all\">\r\n <div class=\"portlet-header ui-widget-header\">New Release: Admintasia 2.3<span class=\"ui-icon ui-icon-circle-arrow-s\"></span></div>\r\n\t\t\t\t\t\t\t\t <div class=\"portlet-content\">\r\n\t\t\t\t\t\t\t\t\t <p>\r\n\t\t\t\t\t\t\t\t\t\t "
+ "<a href=\"http://www.admintasia.com\">\r\n <b>Visit the presentation website</b>\r\n </a>"
+ "<b><a href=\"http://www.admintasia.com/live-demo\"><b>View Live Demonstration</b></a></b>\r\n\t\t\t\t\t\t\t\t\t\t <br /><br />\r\n Prices start from <b>$50</b> for the <b>Regular License</b> \r\n and <b>$299</b> for the <b>Extended License</b>. Both licenses \r\n come with <b>support access</b> and <b>lifetime updates</b>. \r\n </p>\r\n\t\t\t\t\t\t\t\t </div>\r\n\t\t\t\t\t\t\t </div>\r\n\t\t\t\t\t\t </div>\r\n\t\t\t\t\t\t <div class=\"column column-right\">\r\n\t\t\t\t\t\t\t <div class=\"portlet ui-widget ui-widget-content ui-helper-clearfix ui-corner-all\">\r\n\t\t\t\t\t\t\t\t <div class=\"portlet-header ui-widget-header\">Affiliate Program<span class=\"ui-icon ui-icon-circle-arrow-s\"></span></div>\r\n\t\t\t\t\t\t\t\t <div class=\"portlet-content\">\t\t\t\t\t\t\t\t\t\r\n <p> Join our affiliates program and earn 51% of every sale. \r\n <br />\r\n <br /><b><a href=\"http://www.admintasia.com/affiliates/\">Click here for more details about our affiliates program</a></b>.\r\n\t\t\t\t\t\t\t\t\t </p>\r\n\t\t\t\t\t\t\t\t </div>\r\n\t\t\t\t\t\t\t </div>\r\n\t\t\t\t\t\t </div>\r\n\t\t\t\t\t </div>\r\n\t\t\t\t\t "
+ "<p> Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus.</p>\r\n\t\t\t\t\t</div>\r\n\t\t\t\t</div>\r\n \r\n\r\n </div>\r\n </form>\r\n <script type=\"text/javascript\">\r\n $(document).ready(function () {\r\n parent.setheight(document.body.clientHeight);\r\n });\r\n function setheight() {\r\n parent.setheight(814);\r\n }\r\n <\/script>\r\n</body>\r\n</html>\r\n";
// [^>]* skips the attributes on <body>; group 1 captures the inner content.
var rx = /<body[^>]*>([\s\S]+?)<\/body>/i;
var m = rx.exec(s);
if (m) m = m[1];
alert(m);
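The same idea can be wrapped in a function so it is reusable and testable outside the browser. `extractBody` is a name made up here, and the input is a shortened stand-in for the page above. A regex like this is only an approximation: it would stop early if `</body>` appeared inside a script or comment.

```javascript
// Hypothetical helper: returns the inner HTML of <body>, or null if not found.
function extractBody(html) {
  // [^>]* skips any attributes; the "i" flag accepts <BODY> as well.
  var rx = /<body[^>]*>([\s\S]+?)<\/body>/i;
  var m = rx.exec(html);
  return m ? m[1] : null;
}

// Shortened stand-in for the page in the thread.
var sample = '<html>\r\n<head><title></title></head>\r\n' +
  '<BODY onunload="setheight()">\r\n<form id="form1"></form>\r\n</BODY>\r\n</html>';

var body = extractBody(sample);
var missing = extractBody('<p>no body tag here</p>');
console.log(JSON.stringify(body), missing);
```

Returning null instead of throwing keeps the caller's check as simple as the `if (m)` test in the snippet above.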