Access backlink and outbound hyperlink data for any site

When we first started crawling and analyzing sites, users immediately began asking what kinds of APIs we could offer based on the data we collect. One of the most frequently requested services was access to the backlinks for a website. Today I’m happy to announce that’s exactly what we’re releasing: an easy-to-use API that lets you scan through the backlinks for a site.

In addition to being able to query site backlinks, it’s also possible to see all of the outbound links from a site that we’ve analyzed. Outbound links are hyperlinks that originate on one site and point to an external location, such as a link on docs.webshrinker.com pointing to a page on www.webshrinker.com.

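This is an example request and response when querying the backlinks for “example.com” with the results limited to 2 entries. The host name in the URL path is Base64-encoded (“ZXhhbXBsZS5jb20=” decodes to “example.com”):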

GET /hosts/v3/ZXhhbXBsZS5jb20=/links/inbound?limit=2 HTTP/1.1
Host: api.webshrinker.com

HTTP/1.1 200 OK
Content-Type: application/json

{
   "data": [
       {
           "start_date": "2016-10-01",
           "end_date": "2016-10-31",
           "links": [
               {
                   "from": "https://metalab.at/wiki/index.php?title=metaday_30&diff=30727&oldid=30726",
                   "to": "http://example.com/",
                   "seen": "2016-10-23T23:17:16Z",
                   "attrs": {
                       "text": "foo"
                   }
               },
               {
                   "from": "http://blog.youaresecure.be/",
                   "to": "http://example.com/",
                   "seen": "2016-10-23T21:49:12Z",
                   "attrs": {
                       "text": "link to example"
                   }
               }
           ]
       }
   ],
   "paging": {
       "cursors": {
           "after": "B04CCUJYBEtXDAwIUVYDWgMGBFYEVgMBCVRUXQgKBlYGAgJSUQAHUgZQAAIOAw8LWAMFUAVXAFUEC1NSXAlRBABUBlIDV1gHAQEAAFULVwBb"
       },
       "next": "https://api.webshrinker.com/hosts/v3/ZXhhbXBsZS5jb20=/links/inbound?limit=2&page=B04CCUJYBEtXDAwIUVYDWgMGBFYEVgMBCVRUXQgKBlYGAgJSUQAHUgZQAAIOAw8LWAMFUAVXAFUEC1NSXAlRBABUBlIDV1gHAQEAAFULVwBb",
       "count": 2,
       "remaining": 1998
   }
}

As shown in this example, two inbound links are returned for the query. Each link shows the page it was discovered on (“from”), the page it links to (“to”), when the link was discovered (“seen”), and any attributes of the link (“attrs”). The link attributes include the text displayed in the browser for the link, the link target (such as “_blank”), and any “title”, “rel”, and “alt” HTML attributes.
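As a rough illustration of how these fields might be consumed, the sketch below walks a response shaped like the example above and yields one row per link. The helper name `iter_links` is hypothetical, and the sample payload is a trimmed copy of the example response. The code assumes “attrs” or its “text” entry may be missing for some links, which the post does not confirm.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_links(payload: Dict[str, Any]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (from, to, seen, text) for every link in a response body."""
    for period in payload.get("data", []):
        for link in period.get("links", []):
            # Assumption: "attrs" (or its "text" entry) may be absent.
            attrs = link.get("attrs") or {}
            yield link["from"], link["to"], link["seen"], attrs.get("text", "")


# Trimmed copy of the example response shown earlier in the post.
sample = {
    "data": [
        {
            "start_date": "2016-10-01",
            "end_date": "2016-10-31",
            "links": [
                {
                    "from": "https://metalab.at/wiki/index.php?title=metaday_30&diff=30727&oldid=30726",
                    "to": "http://example.com/",
                    "seen": "2016-10-23T23:17:16Z",
                    "attrs": {"text": "foo"},
                },
                {
                    "from": "http://blog.youaresecure.be/",
                    "to": "http://example.com/",
                    "seen": "2016-10-23T21:49:12Z",
                    "attrs": {"text": "link to example"},
                },
            ],
        }
    ]
}

for source, target, seen, text in iter_links(sample):
    print("{} -> {} (seen {}, text: {!r})".format(source, target, seen, text))
```
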

The “paging” section of the example output shows that more results can be retrieved: the API estimates that about 1998 more records are available for the query. To retrieve them, make the same API call again with an additional URL parameter called “page”, set to the value in paging/cursors/after. The “next” field contains this follow-up URL ready to use.
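The paging steps described here could be sketched as follows. This is a minimal outline under assumptions, not the official client: `page_url` and `iter_pages` are hypothetical helper names, `fetch` is any callable that returns the decoded JSON for a URL, and the loop assumes the final page simply omits paging/cursors/after, which the post does not state. A `max_pages` cap guards against that assumption being wrong.

```python
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def page_url(base_url: str, limit: int, cursor: Optional[str] = None) -> str:
    """Build the request URL, adding "page" only when a cursor is known."""
    params: Dict[str, Any] = {"limit": limit}
    if cursor:
        params["page"] = cursor
    return "{}?{}".format(base_url, urlencode(params))


def iter_pages(
    fetch: Callable[[str], Dict[str, Any]],
    base_url: str,
    limit: int,
    max_pages: int = 10,
) -> Iterator[Dict[str, Any]]:
    """Yield one decoded response per page, following paging/cursors/after."""
    cursor = None
    for _ in range(max_pages):
        payload = fetch(page_url(base_url, limit, cursor))
        yield payload
        # Assumption: the last page carries no paging/cursors/after value.
        cursor = payload.get("paging", {}).get("cursors", {}).get("after")
        if not cursor:
            break


def fake_fetch(url: str) -> Dict[str, Any]:
    """Offline stand-in for the real API (hypothetical data, two pages)."""
    if "page=" in url:
        return {"data": [], "paging": {}}
    return {"data": [], "paging": {"cursors": {"after": "CURSOR1"}}}


base = "https://api.webshrinker.com/hosts/v3/ZXhhbXBsZS5jb20=/links/inbound"
pages = list(iter_pages(fake_fetch, base, limit=2))
print("fetched {} pages".format(len(pages)))
```

Against the live API, `fetch` could be something like `lambda url: requests.get(url, auth=(key, secret_key)).json()`, using your own API key and secret.
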

Here is Python code that you can use to get results similar to those shown above. Remember to insert your own account API key and secret before running it:

import requests
import json
from base64 import urlsafe_b64encode

try:
   from urllib import urlencode
except ImportError:
   from urllib.parse import urlencode

target_website = b"example.com"

key = "<insert your API key>"
secret_key = "<insert your API secret key>"

params = {
   "limit": 2
}

api_url = "https://api.webshrinker.com/hosts/v3/{}/links/inbound?{}".format(urlsafe_b64encode(target_website).decode('utf-8'), urlencode(params, True))

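response = requests.get(api_url, auth=(key, secret_key))
status_code = response.status_code

try:
   data = response.json()
except ValueError:
   # Error responses may not contain JSON, so fall back to the raw body
   data = response.text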

if status_code == 200:
   # Do something with the JSON response
   print(json.dumps(data, indent=4))
elif status_code == 400:
   # Bad or malformed HTTP request
   print("Bad or malformed HTTP request")
   print(data)
elif status_code == 401:
   # Unauthorized
   print("Unauthorized - check your access and secret key permissions")
   print(data)
elif status_code == 402:
   # Request limit reached
   print("Account request limit reached")
   print(data)
else:
   # General error occurred
   print("A general error occurred, try the request again")

Pricing for backlink and outbound link queries is straightforward: just 1 API credit per 100 results returned.
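For a quick estimate of what a query might cost, the arithmetic can be sketched as below. The function name `estimate_credits` is hypothetical, and the rounding rule is an assumption: the post does not say how a partial batch of fewer than 100 results is billed, so this sketch rounds up.

```python
import math


def estimate_credits(results_returned: int, results_per_credit: int = 100) -> int:
    """Estimate credits at 1 credit per 100 results.

    Assumption: a partial batch is billed as a full credit (rounded up).
    """
    if results_returned <= 0:
        return 0
    return math.ceil(results_returned / results_per_credit)


# The example query reported 2 results plus about 1998 remaining.
print(estimate_credits(2 + 1998))
```
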

Additional documentation on using these new features is available here.
