load() Function for PHP - Fetch URL Content

I recently had to develop a small script that fetches an XML file from the web. All I had to do was download a given URL and read its contents. To my great surprise, I found that downloading the file using my jx Ajax library was much easier than doing it with PHP.

PHP makes this very easy by including functions like file_get_contents() that have URL support. This code will get you the contents of a URL.

$contents = file_get_contents('http://example.com/rss.xml');

Unfortunately, URL support in file functions (the allow_url_fopen setting) is considered a security risk, and many servers have disabled it. It is also not the most efficient way to fetch a URL, and a plain call like the one above cannot submit data using the POST method.
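Because this feature depends on the server configuration, it can help to check it before relying on file_get_contents() for URLs. The following is a minimal sketch, assuming the relevant setting is PHP's allow_url_fopen directive; the helper name url_fopen_enabled() is made up for this example and is not part of load().

```php
<?php
// Hypothetical helper, not part of load(): reports whether URL-aware
// file functions such as file_get_contents('http://...') are enabled.
function url_fopen_enabled() {
    // ini_get() returns the raw setting as a string (for example "1" or "Off").
    // FILTER_VALIDATE_BOOLEAN maps "1", "on", "yes" and "true" to true,
    // and anything else to false.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

if (url_fopen_enabled()) {
    echo "file_get_contents() can fetch URLs on this server\n";
} else {
    echo "URL support is disabled; curl or fsockopen is needed instead\n";
}
```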

Other Options - curl and fsockopen

PHP provides two other methods to fetch a URL - Curl and Fsockopen. But to use these I have to write a lot more code.

load()

So I decided to create my own function that makes it much easier.

Features

  • Easy to use.
  • Supports Get and Post methods.
  • Supports HTTP Basic Authentication - this will work - http://binny:password@example.com/
  • Supports both Curl and Fsockopen. Tries to use curl - if it is not available, uses fsockopen.
  • Secure URLs (https) are supported with Curl.
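To illustrate the HTTP Basic Authentication feature: the credentials embedded in the URL are picked apart with parse_url() and turned into an Authorization header. This is a small standalone sketch of that step; basic_auth_header() is a hypothetical name used only here.

```php
<?php
// Hypothetical helper, not part of load(): builds the Basic auth header
// from the user and password embedded in a URL.
function basic_auth_header($url) {
    $parts = parse_url($url);
    if (!isset($parts['user']) || !isset($parts['pass'])) {
        return null; // no credentials embedded in the URL
    }
    // Basic authentication sends "user:password" encoded as base64.
    return 'Authorization: Basic ' . base64_encode($parts['user'] . ':' . $parts['pass']);
}

var_dump(basic_auth_header('http://binny:password@example.com/'));
var_dump(basic_auth_header('http://example.com/')); // no credentials, so NULL
```

Note that base64 is an encoding, not encryption, so the password is only protected when the URL is fetched over https.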

Options

The first argument of this function is the URL to be fetched. The second argument is an associative array. This is an optional argument. The following values are supported in this array.

return_info
Possible values - true/false
If this is true, the function will return an associative array rather than just a string. The array will contain 3 elements...

  • headers - An associative array containing all the headers returned by the server.
  • body - A string - the contents of the URL.
  • info - Some information about the fetch. This is the result returned by the 'curl_getinfo()' function. Supported only with Curl.

method
Possible values - post/get
Specifies the method to be used.

modified_since
If this option is set, the 'If-Modified-Since' header will be used. This will make sure that the URL will be fetched only if it was modified.
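As an illustration of the modified_since option, here is a sketch of how a date string can be turned into an 'If-Modified-Since' header line. The helper name if_modified_since_header() is hypothetical. One detail worth noting: G, M and T are format characters in PHP date strings, so the literal "GMT" has to be escaped.

```php
<?php
// Hypothetical helper, not part of load(): formats a date string
// as an If-Modified-Since header line in HTTP date format.
function if_modified_since_header($date) {
    $timestamp = strtotime($date);
    if ($timestamp === false) {
        return null; // strtotime() could not parse the input
    }
    // The backslashes keep "GMT" literal instead of being read as
    // the format characters G (hour), M (month) and T (timezone).
    return 'If-Modified-Since: ' . gmdate('D, d M Y H:i:s \G\M\T', $timestamp);
}

echo if_modified_since_header('2007-06-18 13:56:22 UTC'), "\n";
```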

Examples

The code to fetch the contents of a URL will look like this...

$contents = load('http://example.com/rss.xml');

Simple, no? This will just return the contents of the URL. If you need to do more complex stuff, just use the second argument to pass more options...

$options = array(
	'return_info'	=> true,
	'method'		=> 'post'
);
$result = load('http://www.bin-co.com/rss.xml.php?section=2',$options);
print_r($result);

The output will be like this...

Array
(
    [headers] => Array
        (
            [Date] => Mon, 18 Jun 2007 13:56:22 GMT
            [Server] => Apache/2.0.54 (Unix) PHP/4.4.7 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
            [X-Powered-By] => PHP/5.2.2
            [Expires] => Thu, 19 Nov 1981 08:52:00 GMT
            [Cache-Control] => no-store, no-cache, must-revalidate, post-check=0, pre-check=0
            [Pragma] => no-cache
            [Set-Cookie] => PHPSESSID=85g9n1i320ao08kp5tmmneohm1; path=/
            [Last-Modified] => Tue, 30 Nov 1999 00:00:00 GMT
            [Vary] => Accept-Encoding
            [Transfer-Encoding] => chunked
            [Content-Type] => text/xml
        )
	[body] => ... Contents of the Page ...
	[info] => Array
        (
            [url] => http://www.bin-co.com/rss.xml.php?section=2
            [content_type] => text/xml
            [http_code] => 200
            [header_size] => 501
            [request_size] => 146
            [filetime] => -1
            [ssl_verify_result] => 0
            [redirect_count] => 0
            [total_time] => 1.113792
            [namelookup_time] => 0.180019
            [connect_time] => 0.467973
            [pretransfer_time] => 0.468035
            [size_upload] => 0
            [size_download] => 2274
            [speed_download] => 2041
            [speed_upload] => 0
            [download_content_length] => 0
            [upload_content_length] => 0
            [starttransfer_time] => 0.826031
            [redirect_time] => 0
        )
)
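In the 'post' example, the parameters are still written as a query string on the URL. Judging from the code below, the curl branch of load() strips them off the address and sends them as the request body. A rough standalone sketch of that split follows; split_post_url() is a hypothetical helper, not something load() exposes.

```php
<?php
// Hypothetical helper mirroring what the curl branch of load() does for 'post':
// the part before "?" becomes the target, the query string becomes the body.
function split_post_url($url) {
    $parts = parse_url($url);
    $path  = isset($parts['path']) ? $parts['path'] : '/';
    return array(
        'target' => $parts['scheme'] . '://' . $parts['host'] . $path,
        'body'   => isset($parts['query']) ? $parts['query'] : '',
    );
}

print_r(split_post_url('http://www.bin-co.com/rss.xml.php?section=2'));
```

So to post several fields, they would all go into the query string, for example ?section=2&page=1.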

 

Code

<?php
/**
 * See http://www.bin-co.com/php/scripts/load/
 * Version : 1.00.A
 */
function load($url,$options=array('method'=>'get','return_info'=>false)) {
    $url_parts = parse_url($url);
    $info = array(//Currently only supported by curl.
        'http_code'    => 200
    );
    $response = '';
    
    $send_header = array(
        'Accept' => 'text/*',
        'User-Agent' => 'BinGet/1.00.A (http://www.bin-co.com/php/scripts/load/)'
    );

    ///////////////////////////// Curl /////////////////////////////////////
    //If curl is available, use curl to get the data.
    if(function_exists("curl_init") 
                and (!(isset($options['use']) and $options['use'] == 'fsocketopen'))) { //Don't use curl if the options specifically say to use fsocketopen
        if(isset($options['method']) and $options['method'] == 'post') {
            $page = $url_parts['scheme'] . '://' . $url_parts['host'] . $url_parts['path'];
        } else {
            $page = $url;
        }

        $ch = curl_init($url_parts['host']);

        curl_setopt($ch, CURLOPT_URL, $page);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //Just return the data - not print the whole thing.
        curl_setopt($ch, CURLOPT_HEADER, true); //We need the headers
        curl_setopt($ch, CURLOPT_NOBODY, false); //The content - if true, will not download the contents
        if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
            curl_setopt($ch, CURLOPT_POST, true);
            curl_setopt($ch, CURLOPT_POSTFIELDS, $url_parts['query']);
        }
        //Set the headers our spider sends
        curl_setopt($ch, CURLOPT_USERAGENT, $send_header['User-Agent']); //The name of the User-Agent we will be using ;)
        $custom_headers = array("Accept: " . $send_header['Accept'] );
        if(isset($options['modified_since'])) //G, M and T are date format characters, so the literal 'GMT' must be escaped
            array_push($custom_headers,"If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])));
        //HTTP Basic Authorization support - added to the same list so the headers above are not overwritten
        if(isset($url_parts['user']) and isset($url_parts['pass']))
            array_push($custom_headers,"Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']));
        curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);

        curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt"); //If ever needed...
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); //Note: this turns off SSL certificate verification

        $response = curl_exec($ch);
        $info = curl_getinfo($ch); //Some information on the fetch
        curl_close($ch);

    //////////////////////////////////////////// FSockOpen //////////////////////////////
    } else { //If there is no curl, use fsocketopen
        $path = isset($url_parts['path']) ? $url_parts['path'] : '/'; //The request line needs at least "/"
        if(isset($url_parts['query']) and !(isset($options['method']) and $options['method'] == 'post')) {
            $page = $path . '?' . $url_parts['query'];
        } else {
            $page = $path;
        }

        $fp = fsockopen($url_parts['host'], 80, $errno, $errstr, 30);
        if ($fp) {
            $out = '';
            if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
                $out .= "POST $page HTTP/1.1\r\n";
            } else {
                $out .= "GET $page HTTP/1.0\r\n"; //HTTP/1.0 is much easier to handle than HTTP/1.1
            }
            $out .= "Host: $url_parts[host]\r\n";
            $out .= "Accept: $send_header[Accept]\r\n";
            $out .= "User-Agent: {$send_header['User-Agent']}\r\n";
            if(isset($options['modified_since'])) //G, M and T are date format characters, so the literal 'GMT' must be escaped
                $out .= "If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])) ."\r\n";

            $out .= "Connection: Close\r\n";
            
            //HTTP Basic Authorization support
            if(isset($url_parts['user']) and isset($url_parts['pass'])) {
                $out .= "Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']) . "\r\n";
            }

            //If the request is post - pass the data in the request body, after a blank line.
            if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
                $out .= "Content-Type: application/x-www-form-urlencoded\r\n";
                $out .= 'Content-Length: ' . strlen($url_parts['query']) . "\r\n";
                $out .= "\r\n" . $url_parts['query'];
            }
            $out .= "\r\n";

            fwrite($fp, $out);
            while (!feof($fp)) {
                $response .= fgets($fp, 128);
            }
            fclose($fp);
        }
    }

    //Get the headers in an associative array
    $headers = array();

    if($info['http_code'] == 404) {
        $body = "";
        $headers['Status'] = 404;
    } else {
        //Separate header and content - a blank line (two CRLF pairs) ends the header block
        $separator_position = strpos($response,"\r\n\r\n");
        $header_text = substr($response,0,$separator_position);
        $body = substr($response,$separator_position+4);
        
        foreach(explode("\r\n",$header_text) as $line) {
            $parts = explode(": ",$line,2); //Limit of 2 keeps any ": " inside the value intact
            if(count($parts) == 2) $headers[$parts[0]] = chop($parts[1]);
        }
    }

    if(isset($options['return_info']) and $options['return_info']) return array('headers' => $headers, 'body' => $body, 'info' => $info);
    return $body;
}
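The last step of the function, splitting the raw response into headers and body, can be tried on its own without any network access. An HTTP response separates the header block from the body with a blank line, that is, two consecutive CRLF pairs. The sketch below is a standalone approximation of that step; split_http_response() and the sample response string are made up for this example.

```php
<?php
// Hypothetical helper, a standalone version of the parsing step in load().
function split_http_response($response) {
    $pos = strpos($response, "\r\n\r\n"); // the blank line ends the header block
    if ($pos === false) {
        // No header block found: treat everything as body.
        return array('headers' => array(), 'body' => $response);
    }
    $headers = array();
    foreach (explode("\r\n", substr($response, 0, $pos)) as $line) {
        $parts = explode(': ', $line, 2); // the status line has no ": " and is skipped
        if (count($parts) == 2) {
            $headers[$parts[0]] = rtrim($parts[1]);
        }
    }
    return array('headers' => $headers, 'body' => substr($response, $pos + 4));
}

$raw = "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\nPragma: no-cache\r\n\r\n<rss></rss>";
print_r(split_http_response($raw));
```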