Ben Nash, posted April 2, 2014

I'm trying to grab job listings from job sites like reed.com.

    <?php
    include("config.php");

    echo "
    <form action='index.php' method='POST' />
    What: <input type='text' name='what' />
    Where: <input type='text' name='where' />
    <input type='submit' name='search' value='Search'/>
    </form>";

    if (isset($_POST['search'])) {
        $what = mysql_real_escape_string($_POST['what']);
        $where = mysql_real_escape_string($_POST['where']);

        if (empty($what) && (empty($where))) {
            echo "You need to enter info";
        } else {
            $curl = curl_init("http://www.reed.co.uk/jobs/$what-in-$where");
            curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
            $page = curl_exec($curl);
            curl_close($curl);

            $regex = '/<p class="description">(.*?)<\/p>/s';
            if ( preg_match($regex, $page, $list) )
                echo $list[0];
        }
        echo "<a href='http://www.reed.co.uk/jobs/$what-in-$where'>Continue</a>";
    }
    ?>

This is my current code. If you enter info into the fields a link will appear for testing purposes, click on it and it will take you to reed. You will see the data entered at my site is all stuffed into the location field on reed. This is causing my site to display incorrect data... What am I doing wrong?

Zettieee, posted April 2, 2014

    http://www.reed.co.uk/jobs/$where?keywords=$what

Ben Nash (author), posted April 2, 2014 (edited April 2, 2014)

That fixed the link but it's not displaying the full listing on my site. I have a strong belief this is the issue:

    $regex = '/<p class="description">(.*?)<\/p>/s';

Djkanna, posted April 2, 2014

> That fixed the link but it's not displaying the full listing on my site. I have a strong belief this is the issue:
> $regex = '/<p class="description">(.*?)<\/p>/s';

    $url = 'http://www.reed.co.uk/jobs/'.$where.'?keywords='.$what;

    $curl = curl_init ( $url );
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
    $page = curl_exec($curl);
    curl_close($curl);

    $regex = '/<p class="description">(.*?)<\/p>/s';

    // preg_match() stops at the first match; preg_match_all() collects every one.
    if ( preg_match_all ( $regex, $page, $matches ) ) {
        // $matches[0] holds the full matches including the <p> tags.
        // Use $matches[1] instead if you only want the content inside them.
        $matches = $matches[0];

        foreach ( $matches as $key => $val ) {
            echo 'Match('.$key.'): '.htmlentities ( $val, ENT_QUOTES, 'UTF-8' ).' <br />';
        }
    }

Ben Nash (author), posted April 2, 2014

Thanks Dj but it's just grabbing all the text. How do I output the html as well?

Djkanna, posted April 2, 2014

> Thanks Dj but it's just grabbing all the text. How do I output the html as well?

It grabs whatever is in <p class="description">, plus the p tag itself, as per the regex. If you want the p tag rendered as HTML, remove the htmlentities call [be careful: the remote markup then goes into your page unescaped].

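One cautious way to act on that "be careful" is to throw away the remote tags and wrap the remaining text in markup you control. The sketch below is only an illustration: `render_description()` is a made-up helper name, and the sample fragment is invented rather than taken from reed.co.uk.

```php
<?php
// Hypothetical helper (not from the thread): turns a scraped fragment into
// a paragraph that is safe to echo into your own page.
function render_description(string $fragment): string
{
    // Remove every remote tag (and with it any attributes such as onclick),
    // then decode the entities the remote page used so they are not escaped twice.
    $text = html_entity_decode(strip_tags($fragment), ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace left behind by the removed markup.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    // Escape for our own page and wrap the text in our own <p>.
    return '<p class="description">' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</p>';
}

// Invented sample input standing in for one regex match.
$sample = '<p class="description" onclick="x()">Great <b>role</b> &amp; pay</p>';
echo render_description($sample), PHP_EOL;
```

The trade-off is that inline formatting from the source (bold text, links) is lost; keeping it safely would need a proper HTML sanitiser rather than `strip_tags()`.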
Ben Nash (author), posted April 2, 2014

What I mean is, how do I make it link back to reed? Also, how do I display the listings on my site in a tidy fashion? At the moment it's all jumbled up.

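A rough sketch of one way to tidy the output: parse the page with PHP's DOM extension instead of a regex, then print each description as a list item followed by a link back to the search page. Only the `<p class="description">` selector comes from the thread; the function names, the `jobs` CSS class, the link text, and the sample markup are all assumptions made for illustration, and the real page structure may differ.

```php
<?php
// Hypothetical helper: returns the text of every <p class="description">.
function extract_descriptions(string $page): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($page);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $descriptions = [];
    foreach ($xpath->query('//p[@class="description"]') as $node) {
        // textContent drops nested tags; collapse leftover whitespace.
        $descriptions[] = trim(preg_replace('/\s+/', ' ', $node->textContent));
    }
    return $descriptions;
}

// Hypothetical helper: renders the descriptions as a list plus a link back.
function render_listings(array $descriptions, string $sourceUrl): string
{
    $html = "<ul class=\"jobs\">\n";
    foreach ($descriptions as $text) {
        $html .= '  <li>' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    $html .= "</ul>\n";
    $html .= '<p><a href="' . htmlspecialchars($sourceUrl, ENT_QUOTES, 'UTF-8')
        . '">View these results on reed.co.uk</a></p>' . "\n";
    return $html;
}

// Invented markup standing in for the page fetched with cURL.
$page = '<html><body><p class="description">PHP developer <b>wanted</b></p>'
    . '<p class="other">skip</p><p class="description">Junior tester</p></body></html>';

echo render_listings(extract_descriptions($page), 'http://www.reed.co.uk/jobs/london?keywords=php');
```

Linking each listing to its own page would need the per-job URL from the markup, which the thread does not show, so this sketch only links back to the search results as a whole.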
BlackScorp, posted April 2, 2014

It is not legal to crawl the web page and parse the HTML code. If you want to use the search and its data, that is what they made an API for: http://www.reed.co.uk/api

Script47, posted April 2, 2014

I know this is a bit off topic, but you seem to be using mysql_real_escape_string (MRES). Don't you think you should start using MySQLi or PDO, seeing as the old mysql extension is insecure and deprecated?

BlackScorp, posted April 2, 2014

> I know this is a bit off topic, but you seem to be using mysql_real_escape_string (MRES). Don't you think you should start using MySQLi or PDO, seeing as the old mysql extension is insecure and deprecated?

Well, he uses it, but it is useless there, since he thinks it can escape values for the URL.

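For values that end up in a URL, PHP's URL-encoding functions are the matching tool rather than a database escaping function. The sketch below builds the URL in the format Zettieee suggested; `build_search_url()` is a hypothetical helper name, and whether reed.co.uk expects `%20` or hyphens for spaces in the location segment is an assumption you would need to check against the live site.

```php
<?php
// Hypothetical helper (not from the thread): encodes each user-supplied
// value for the part of the URL it is placed in.
function build_search_url(string $what, string $where): string
{
    // rawurlencode() suits path segments: a space becomes %20.
    $path = rawurlencode(trim($where));

    // http_build_query() encodes query-string values: a space becomes +.
    $query = http_build_query(['keywords' => trim($what)]);

    return 'http://www.reed.co.uk/jobs/' . $path . '?' . $query;
}

echo build_search_url('php developer', 'milton keynes'), PHP_EOL;
```

When the same URL is echoed into an `href` attribute, it should additionally go through `htmlspecialchars()`, since URL encoding and HTML escaping solve different problems.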
Ben Nash (author), posted April 2, 2014

I know mysql_real_escape_string is not needed there... It was for another part of the script, which I took out as it was unnecessary; I just forgot to remove it.

sniko, posted April 2, 2014

> It is not legal to crawl the web page and parse the HTML code. If you want to use the search and its data, that is what they made an API for: http://www.reed.co.uk/api

Reiterating.