All times are UTC



 Post subject: PHP Crawler Help
Posted: Sat May 01, 2010 6:22 am
Leecher

Joined: Thu Feb 25, 2010 9:25 am
Posts: 4
Given: 2 thanks
Received: 0 thanks
Known Programming Languages: c++, web languages
Hey,
I was wondering if someone could help me with a problem I am having. I have a crawler that is indexing a particular domain (*.rpi.edu). It works fine for the most part.

However, when it calls get_headers($row['url'], 1) on links such as:
  • http://prod3.web.server.rpi.edu/peopledirectory/search.do?query=employee1_dept_org:3040&datasetName=directory&qct=200
  • http://webct.rpi.edu
It times out because it gets no response at all within the maximum execution time. I am at a loss as to how to check for this.

Below is the full error. I can provide other details if needed. Thank you for your help.



Quote:
Warning: get_headers(http://prod3.web.server.rpi.edu/peopledirectory/search.do?query=employee1_dept_org:3040&datasetName=directory&qct=200) [function.get-headers]: failed to open stream: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. in C:\xampp\htdocs\Crawler\pagetitle.php on line 12


 Post subject: Re: PHP Crawler Help
Posted: Sat May 01, 2010 8:12 pm
Super Member

Joined: Sun Sep 28, 2008 9:58 pm
Posts: 106
Given: 1 thanks
Received: 15 thanks
Known Programming Languages: C++, PHP, SQL, C#, Java, Pascal, Assembly, Haskell
You just want to check to see if it failed?

  // get_headers() returns false on failure; @ suppresses the warning
  $results = @get_headers(...);
  if ($results === false)
  {
      echo 'Timeout';
  }
  else
  {
      ...
  }

The function returns an array of the headers on success, and false on failure.
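
As a rough sketch of that check (not necessarily how the crawler itself is structured), the call can be wrapped in a small helper. The names statusCodeFromHeaders() and probeUrl() are made up for illustration, and the code only looks at the first status line (element 0), so after a redirect it reports the first response rather than the last.

```php
<?php
// Hypothetical helper: pull the numeric status code out of the first
// status line returned by get_headers(), e.g. "HTTP/1.1 200 OK" -> 200.
function statusCodeFromHeaders(array $headers): ?int
{
    if (!isset($headers[0]) || !is_string($headers[0])) {
        return null;
    }
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $headers[0], $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical wrapper: returns the status code, or null when
// get_headers() fails (timeout, refused connection, invalid URL, ...).
function probeUrl(string $url): ?int
{
    $headers = @get_headers($url); // false on failure, array on success
    if ($headers === false) {
        return null;
    }
    return statusCodeFromHeaders($headers);
}
```

In the crawler loop this could be called as probeUrl($row['url']), skipping the row whenever the result is null.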



 Post subject: Re: PHP Crawler Help
Posted: Sun May 02, 2010 12:25 am
Leecher
Thanks for the suggestion. I tried that, but I believe the problem is that get_headers() takes longer to time out than the 60 seconds PHP allows for script execution.
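
One way to keep a dead host from using up the whole execution limit is to cap how long PHP's HTTP stream wrapper waits, so that get_headers() gives up and returns false well before the script limit is reached. A minimal sketch, assuming a 10-second cap is acceptable; the function name limitHttpWait() and the 10-second value are illustrative and not from this thread.

```php
<?php
// Hypothetical helper: cap how long PHP's HTTP stream wrapper waits,
// so get_headers() fails fast instead of hanging until the script limit.
function limitHttpWait(int $seconds): void
{
    // Default timeout for socket-based streams (used when no
    // per-context timeout is set).
    ini_set('default_socket_timeout', (string) $seconds);

    // Timeout for the default HTTP context, which get_headers() uses
    // when no explicit context is passed.
    stream_context_set_default([
        'http' => ['timeout' => $seconds],
    ]);
}

limitHttpWait(10); // illustrative value, not from the thread

// Example use (the URL is one of those from the first post):
// $headers = @get_headers('http://webct.rpi.edu');
// if ($headers === false) { /* unreachable or too slow: skip it */ }
```

As far as I know, this bounds each individual connect or read wait rather than the total run time of the script, so a long crawl may still need a higher execution limit.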


 Post subject: Re: PHP Crawler Help
Posted: Sun May 02, 2010 5:01 am
Super Member
If this is on a local or dedicated server, then you can change the max_execution_time value in php.ini to zero to disable the limit. If not, you can set it in a .htaccess file with:
  php_value max_execution_time 0

or at the beginning of your script:

That only works if safe mode isn't enabled.
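
The snippet for the script-level option is not shown above. A minimal sketch, assuming the intended call was PHP's set_time_limit() (which, per the PHP manual, has no effect in safe mode); the 300-second value is only an example.

```php
<?php
// Assumption: the script-level option meant here is set_time_limit().
// Passing 0 removes the limit; a positive value restarts the timer
// with that many seconds from the point of the call.
$limitSeconds = 300; // example value (5 minutes)

if (!set_time_limit($limitSeconds)) {
    // set_time_limit() returns false when the limit cannot be changed
    echo "Could not change the execution time limit\n";
}

// The new limit is visible through the max_execution_time setting.
echo 'max_execution_time = ', ini_get('max_execution_time'), "\n";
```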



 Post subject: Re: PHP Crawler Help
Posted: Sun May 02, 2010 7:16 am
Leecher
OK, I'll try setting the limit to 5 minutes.

