读书人

I need to scrape information from a web page. Any pointers?

Posted: 2013-01-23 10:44:50 Author: rapoo

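I need to scrape information from a page, but its URL has been rewritten:
http://flights.ctrip.com/booking/bjs-sha----adu-1/?dayoffset=7&ddate1=2013-01-12&dcityname1=%u5317%u4eac&acityname1=%u4e0a%u6d77
So far I have used a clumsy approach: save the page locally, then read the file and match it with regular expressions.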
WebClient wc = new WebClient();
string html = wc.DownloadString(url);
////////////
When I read the live page directly with WebClient, as above, a great deal of the content is missing from the result.

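Saving the page locally works, but it is far from enough. I want to read the page directly, download its source, and then run the matching. Do I have to read a stream to do that? Any pointers would be appreciated. Thanks.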
[Solution]
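// Call site, inside an ASP.NET page method (the enclosing method was cut off in the post).
// The query-string parameters are sent as the POST body instead.
string whereUrl = "dayoffset=7&ddate1=2013-01-12&dcityname1=%u5317%u4eac&acityname1=%u4e0a%u6d77";
string html = HttpPost1("http://flights.ctrip.com/booking/bjs-sha----adu-1/", whereUrl);
Response.Write(html);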
// Requires: using System.IO; using System.Net; using System.Text;
public string HttpPost1(string url, string param)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded";
    request.Accept = "*/*";
    request.Timeout = 50000; // milliseconds
    request.AllowAutoRedirect = false;

    // Write the form-encoded parameters into the request body.
    using (StreamWriter writer = new StreamWriter(request.GetRequestStream()))
    {
        writer.Write(param);
    }

    // Read the whole response body, decoding it as GB2312.
    // The using blocks close the streams even if an exception is thrown.
    using (WebResponse response = request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("GB2312")))
    {
        return reader.ReadToEnd();
    }
}
Solved.
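The call above passes the part of the URL before the `?` as the address and the part after it as the POST body. A minimal sketch of doing that split in code rather than by hand might look like the following; `UrlParts.Split` is a hypothetical helper name, not something from the thread, and it does a plain string split without decoding or validating anything.

```csharp
using System;

public static class UrlParts
{
    // Hypothetical helper (not from the thread): splits "base?query" at the first '?'.
    // Element 0 is the text before '?', element 1 is the text after it.
    // When the URL has no '?', element 1 is the empty string.
    public static string[] Split(string fullUrl)
    {
        int mark = fullUrl.IndexOf('?');
        if (mark < 0)
        {
            return new[] { fullUrl, string.Empty };
        }
        return new[] { fullUrl.Substring(0, mark), fullUrl.Substring(mark + 1) };
    }
}
```

With this in place, the call could be written as `string[] parts = UrlParts.Split(fullUrl); string html = HttpPost1(parts[0], parts[1]);`, where `fullUrl` is assumed to hold the address from the question.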
[Solution]
I thought something looked missing: I accidentally dropped the for loop just now...


// WinForms: walk the loaded document and copy the <html> element's markup into a text box.
HtmlDocument objDoc = webBrowser1.Document;
for (int i = 0; i < objDoc.All.Count; i++)
{
    if (objDoc.All[i].TagName.ToLower().Equals("html"))
    {
        richTextBox1.Text = objDoc.All[i].InnerHtml;
        break;
    }
}
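Once the markup is in a string, whether it came from HttpPost1 or from the WebBrowser control, the regular-expression matching mentioned in the question could be sketched roughly as below. The thread never shows the markup of the target page, so the pattern here is only an illustrative assumption: it collects the double-quoted `href` values of `<a>` tags. `HtmlMatcher` and `ExtractLinks` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HtmlMatcher
{
    // Illustrative pattern only: captures the value of a double-quoted href
    // attribute inside an <a ...> tag. It is deliberately loose and would also
    // accept attribute names that merely end in "href".
    private static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*\"([^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns every captured href value in document order.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }
}
```

For the real page, the pattern would have to be replaced with one written against the actual markup around the data of interest. Regular expressions are fragile against nested or irregular HTML, so this is best treated as a quick extraction aid.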
