Curl and screen scraping advice please :) |
| | #1 |
| Member Join Date: Feb 2009 Location: Cornwall
Posts: 412
Thanks: 23
Thanked 4 Times in 4 Posts
Expertise: PHP Experience: Intermediate |
Hi all, as I'm sure some of you are aware, I'm setting up a price comparison feature that pulls from an API which stores millions of products. The main problem is that it's not very reliable. For example, if I search for Modern Warfare 2 on Xbox 360 it might return 10 stores that sell it, but half the products it finds are the communicator, and there is no record of the actual game from particular merchants. Play and Zavvi, for example, don't have the game listed that I can find. So I was wondering how easy cURL and screen scraping are, and whether I'm more likely to get a better result set :) I have found a website, Find DVD - Compare DVD prices from dozens of UK retailers, which seems to be spot on when it comes to price comparison: all the links go to the actual product I'm searching for, but I'm not sure how they do it :) Any advice would be appreciated :) Cheers, Luke |
| | |
| | #2 |
Join Date: Jan 2009 Location: Your Imagination
Posts: 739
Blog Entries: 4 Thanks: 1
Thanked 29 Times in 28 Posts
Expertise: PHP Experience: Professional |
Well, first you would need a spider:
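A minimal sketch of what such a spider could look like, assuming cURL for the download and DOMDocument for link extraction. The function names `fetch_html`, `extract_links` and `crawl` are hypothetical, the crawl stays on the starting host, and only absolute and root-relative links are handled, so treat it as a starting point rather than a finished crawler.

```php
<?php
// Download one page with cURL; returns '' on failure.
function fetch_html(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    $html = curl_exec($ch);
    return $html === false ? '' : $html;
}

// Collect absolute http(s) links from an HTML string.
// $baseUrl is assumed to be scheme + host, e.g. "http://shop.example".
function extract_links(string $html, string $baseUrl): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parse errors silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href === '' || strpos($href, '//') === 0) {
            continue; // skip empty and protocol-relative links
        }
        if (preg_match('#^https?://#i', $href)) {
            $links[$href] = true;                          // already absolute
        } elseif ($href[0] === '/') {
            $links[rtrim($baseUrl, '/') . $href] = true;   // root-relative
        }
    }
    return array_keys($links);
}

// Breadth-first crawl of one host. $startUrl must be an absolute URL.
// $fetch can be swapped for a stub; it defaults to fetch_html().
function crawl(string $startUrl, ?callable $fetch = null, int $maxPages = 10): array
{
    $fetch = $fetch ?? 'fetch_html';
    $parts = parse_url($startUrl);
    $base  = $parts['scheme'] . '://' . $parts['host'];

    $queue = [$startUrl];
    $pages = []; // url => html
    while ($queue && count($pages) < $maxPages) {
        $url = array_shift($queue);
        if (isset($pages[$url])) {
            continue; // already visited
        }
        $pages[$url] = (string) $fetch($url);
        foreach (extract_links($pages[$url], $base) as $link) {
            if (parse_url($link, PHP_URL_HOST) === $parts['host'] && !isset($pages[$link])) {
                $queue[] = $link;
            }
        }
    }
    return $pages;
}
```

Calling `crawl('http://shop.example/')` would return an array of URL => HTML pairs that can then be passed to whatever extracts the prices. Check each site's terms and robots.txt before pointing a spider at it.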
__________________ Panthr - Web Design and Development |
| | |
| | #3 |
| Member Join Date: Feb 2009 Location: Cornwall
Posts: 412
Thanks: 23
Thanked 4 Times in 4 Posts
Expertise: PHP Experience: Intermediate |
Interesting, thanks mate. |
| | |
| | #4 |
| Member Join Date: Feb 2009 Location: Cornwall
Posts: 412
Thanks: 23
Thanked 4 Times in 4 Posts
Expertise: PHP Experience: Intermediate |
Hi, I'm back! lol. After spending some time on the rest of my site I decided to take a closer look at cURL, and I have some code. PHP Code: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); That line throws an error, which is why I've commented it out, but the script still does not insert anything into the DB??? The error thrown is: Warning: curl_setopt() [function.curl-setopt]: CURLOPT_FOLLOWLOCATION cannot be activated when in safe_mode or an open_basedir is set on line 17. Any ideas? Thanks, Luke. P.S. I'm trying to get the trailer from the URL, thanks. |
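One common workaround for that warning is to leave `CURLOPT_FOLLOWLOCATION` off and follow the redirects by hand. The sketch below assumes a PHP version where `curl_getinfo()` supports `CURLINFO_REDIRECT_URL`; the function names `curl_fetch_once` and `fetch_following_redirects` are made up for illustration and are not from the code in this post.

```php
<?php
// One request WITHOUT CURLOPT_FOLLOWLOCATION, so it is allowed under
// safe_mode / open_basedir. Returns status, redirect target (or null), body.
function curl_fetch_once(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    $body     = curl_exec($ch);
    $status   = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // With FOLLOWLOCATION disabled, this is the URL cURL would have followed.
    $location = curl_getinfo($ch, CURLINFO_REDIRECT_URL);

    return [
        'status'   => $status,
        'location' => $location ? $location : null,
        'body'     => $body === false ? '' : $body,
    ];
}

// Follow up to $maxHops redirects manually.
// $fetchOnce can be replaced with a stub; it defaults to curl_fetch_once().
function fetch_following_redirects(string $url, ?callable $fetchOnce = null, int $maxHops = 5): array
{
    $fetchOnce = $fetchOnce ?? 'curl_fetch_once';
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $response   = $fetchOnce($url);
        $isRedirect = $response['status'] >= 300 && $response['status'] < 400;
        if (!$isRedirect || empty($response['location'])) {
            $response['url'] = $url; // final URL after any redirects
            return $response;
        }
        $url = $response['location'];
    }
    throw new RuntimeException("Too many redirects (more than $maxHops)");
}
```

With this in place, `$response = fetch_following_redirects($url);` gives the final page in `$response['body']`. Note that the warning by itself would not normally stop the script, so an empty DB insert probably has a second cause further down.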
| | |
| | #5 |
Join Date: Jan 2009 Location: Your Imagination
Posts: 739
Blog Entries: 4 Thanks: 1
Thanked 29 Times in 28 Posts
Expertise: PHP Experience: Professional |
There are several places where it could be going wrong. What HTML is cURL getting? What are the results from XPath?
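Both questions can be checked with a small diagnostic, sketched here under the assumption that the page is parsed with DOMDocument and queried with DOMXPath. `debug_xpath` is a hypothetical helper name. It reports how many bytes of HTML arrived and how many nodes the query matched, which is more telling than `print_r()` on the objects, since that can print an empty-looking object.

```php
<?php
// Report what was fetched and how many nodes an XPath query matches.
// 'matches' is -1 when the XPath expression itself is malformed.
function debug_xpath(string $html, string $query): array
{
    $report = [
        'html_bytes'  => strlen($html),
        'matches'     => 0,
        'first_match' => null,
    ];
    if (trim($html) === '') {
        return $report; // nothing came back from the download step
    }

    $doc = new DOMDocument();
    // Collect parse errors silently; scraped markup is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = @$xpath->query($query); // false on an invalid expression
    if ($nodes === false) {
        $report['matches'] = -1;
        return $report;
    }

    $report['matches'] = $nodes->length;
    if ($nodes->length > 0) {
        // Serialise the first matching node so it can be eyeballed.
        $report['first_match'] = $doc->saveHTML($nodes->item(0));
    }
    return $report;
}
```

For example, `print_r(debug_xpath($html, '//object[@id="playVideoPlayer"]'));` right after `curl_exec()` shows at a glance whether the problem is an empty download or a query that matches nothing. Writing `$html` to a file with `file_put_contents()` and opening it in an editor helps too.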
__________________ Panthr - Web Design and Development |
| | |
| | #6 |
| Member Join Date: Feb 2009 Location: Cornwall
Posts: 412
Thanks: 23
Thanked 4 Times in 4 Posts
Expertise: PHP Experience: Intermediate |
??? Dunno mate, how does one find out? The code surrounding the Flash object is: Code: <div class="special slice"><ul><li><br><center><object type="application/x-shockwave-flash" width="403" height="298" id="playVideoPlayer" wmode="transparent" data="http://media.play.com/trailers/videoPlayer.swf?file=http://media.play.com/ProductPage_Trailers/Films/10674623.flv&vol=0.5&packShot=http://images.play.com/covers/10674623m.jpg" allowScriptAccess="always"><param name="movie" value="http://media.play.com/trailers/videoPlayer.swf?file=http://media.play.com/ProductPage_Trailers/Films/10674623.flv&vol=0.5&packShot=http://images.play.com/covers/10674623m.jpg" /><param name="wmode" value="transparent" /><param name="allowScriptAccess" value="always" /><embed name="playVideoPlayer" src="http://media.play.com/trailers/videoPlayer.swf?file=http://media.play.com/ProductPage_Trailers/Films/10674623.flv&vol=0.5&packShot=http://images.play.com/covers/10674623m.jpg" loop="false" width="403" height="298" allowScriptAccess="always" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer" /></object></center><br><br></li><li> Behind the Scenes</li><li> Interviews with Cast and Crew</li></ul></div> Sorry if I'm being thick, just got in from the pub :D |
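Given markup like the above, the trailer address sits in the `file` parameter of the player URL in the `data` attribute. A rough sketch of pulling it out with DOMXPath and `parse_str()` follows; `extract_trailer_url` is a hypothetical name, and the XPath assumes the `<object>` keeps the id `playVideoPlayer` on every product page, which has not been verified.

```php
<?php
// Return the trailer (.flv) URL embedded in the Flash player's data
// attribute, or null when the player or its "file" parameter is missing.
function extract_trailer_url(string $html): ?string
{
    if (trim($html) === '') {
        return null;
    }
    $doc = new DOMDocument();
    // The page uses unescaped "&" in attributes; collect parse errors silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Select the data attribute of the player object directly.
    $nodes = $xpath->query('//object[@id="playVideoPlayer"]/@data');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $playerUrl  = $nodes->item(0)->nodeValue;
    $queryStart = strpos($playerUrl, '?');
    if ($queryStart === false) {
        return null;
    }

    // Split "file=...&vol=...&packShot=..." into an array of parameters.
    parse_str(substr($playerUrl, $queryStart + 1), $params);

    return isset($params['file']) && is_string($params['file']) ? $params['file'] : null;
}
```

The same approach works for the cover image by reading `$params['packShot']` instead of `$params['file']`.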
| | |
| | #7 |
| Member Join Date: Feb 2009 Location: Cornwall
Posts: 412
Thanks: 23
Thanked 4 Times in 4 Posts
Expertise: PHP Experience: Intermediate |
OK, just done a print_r like so: PHP Code: DOMXPath Object ( ) DOMNodeList Object ( ) So not a lot :( Does that mean that this part is wrong? PHP Code: PHP Code: Luke |
| | |
| | #8 |
| Member Join Date: Feb 2009 Location: Cornwall
Posts: 412
Thanks: 23
Thanked 4 Times in 4 Posts
Expertise: PHP Experience: Intermediate |
Hi mate, turns out I didn't need to use cURL after all; the solution was much simpler. PHP Code: Cheers, Luke |
| Tags |
| curl, scraping |