View full version: crawling a website



C# Programming
01-26-2010, 05:11 AM
Hi,
I need to crawl a website whose URLs follow the format below and store the content locally in a CSV file.
For now I only need to crawl URLs of this format, looping over the query string value from 0 to 1000000 (or whatever the maximum value is).
http://www.mysite.com.au/products/products.asp?p=101

The only way I can think of is to loop from 0 to 1000000, but some ids don't exist and redirect to the main page, http://www.mysite.com.au/products. I need to exclude those from the crawl list.
Below is the code I have written so far. Please advise how I can achieve this.



public static void CrawlSite()
{
    Console.WriteLine("Beginning crawl.");
    CrawlPage("http://www.mysite.com.au/products/products.asp?p=");
    Console.WriteLine("Finished crawl.");
}

private static void CrawlPage(string url)
{
    for(int i=0; i
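
The code in the post is cut off at this point. As a rough sketch of the approach described above (loop over the ids, skip the ones that redirect, write the rest to a CSV file), something like the following might work. It assumes the site signals a missing id with a real HTTP 3xx redirect rather than a normal 200 page; that is not confirmed by the post. The class name `ProductCrawler`, its helper methods, and the CSV column layout are all made up for illustration.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ProductCrawler
{
    // Base URL taken from the post; the id is appended to it.
    public const string BaseUrl = "http://www.mysite.com.au/products/products.asp?p=";

    public static string BuildUrl(int id)
    {
        return BaseUrl + id;
    }

    // Assumption: a missing id is answered with an HTTP 3xx status.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code >= 300 && code < 400;
    }

    // Quotes a value for CSV: wrap it in double quotes and double any embedded quotes.
    public static string CsvField(string value)
    {
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static async Task CrawlAsync(int firstId, int lastId, string csvPath)
    {
        // Turn off automatic redirect following so the 3xx response stays visible.
        var handler = new HttpClientHandler { AllowAutoRedirect = false };
        using (var client = new HttpClient(handler))
        using (var writer = new StreamWriter(csvPath, false, Encoding.UTF8))
        {
            writer.WriteLine("id,url,content");
            for (int id = firstId; id <= lastId; id++)
            {
                string url = BuildUrl(id);
                using (HttpResponseMessage response = await client.GetAsync(url))
                {
                    // Skip ids that redirect (treated as missing) and any other non-2xx reply.
                    if (IsRedirect(response.StatusCode) || !response.IsSuccessStatusCode)
                    {
                        continue;
                    }
                    string html = await response.Content.ReadAsStringAsync();
                    writer.WriteLine(id + "," + CsvField(url) + "," + CsvField(html));
                }
            }
        }
    }
}
```

It could be called as `ProductCrawler.CrawlAsync(0, 1000000, "products.csv").GetAwaiter().GetResult();`. The sketch does not handle network exceptions or pause between requests, both of which a run of a million sequential requests would probably need. If the site redirects through a meta refresh or script inside a 200 page instead, the status check will not catch it and the page body would have to be inspected.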