How to extract urls from webpage for free?

Extract all URLs from a website with this short JavaScript snippet.
extract-urls-from-website

What do you do when you want to export all or specific links from a webpage? Copying them one after another is monotonous and useless especially when you can automate it with a line of JavaScript code. This article serves as a short demonstration of how you can use browser developer consoles to scrape data from the web page. If you are impressed with this, do learn some JavaScript as it comes very handy. Two other techniques to extract links from page are also shared here for people who don’t want to get their hands dirty with code 😄.

Extracting URLs using Dev Tools console

The browser console is an excellent tool to test and debug things. You can write JavaScript code and inject it into the current page to do all sorts of fancy things. I can’t stress enough how useful that is! To open the console on Chrome, press Cmd + Shift + i on Mac and Ctrl + Shift + i on Windows. The JavaScript snippets to extract links are given below. Copy the code, paste it into the console and hit enter.

Extract URLs + Corresponding Anchor Text

The following is a cross-browser supported code for extracting URLs along with their anchor text.

var urls = document.querySelectorAll('a');
for(url in urls){
 console.log("#"+url+" > "+urls[url].innerHTML +" >> "+urls[url].href)
}

Extract URLs + Corresponding Anchor Text – Styled Output (For Chrome & Firefox)

If you are using Chrome or Firefox use the following code for a styled version of the same.

extract-links-from-webpage-using-dev-console
Demo of extracting links from Wikipedia page using dev console
var urls = document.querySelectorAll('a');
for(url in urls){
 console.log("%c#"+url+" > %c"+urls[url].innerHTML +" >> %c"+urls[url].href,"color:red;","color:green;","color:blue;");
}

Extract URLs Only 

And if you want to extract just the links without the anchor text, then use the following code.

var urls = document.querySelectorAll('a');
for(url in urls)
 console.log(urls[url].href);

Extract External URLs Only

External Links are the ones that point outside the current domain. If you want to extract the external URLs only, then this is the code you need to use.

var links = document.querySelectorAll('a');
for (var i = links.length - 1; i > 0; i--) {
    if (links[i].host !== location.host) {
       console.log(links[i].href);
    }
}

Extract URLs with a specific extension

If you would like to extract links having a particular extension then paste the following code into the console. Pass the extension wrapped in quotes to the getLinksWithExtension() function. Please note that the following code extracts links from HTML link tag only (<a></a>)  and not from other tags such as a script or image tag.

function getLinksWithExtension(extension) {
    var links = document.querySelectorAll('a[href$="' + extension + '"]'),
        i;

    for (i=0; i<links.length; i++){
        console.log(links[i]);
    }
}
getLinksWithExtension('mp3') //change mp3 to any extension

Online URL Extractor Website

There are situations when you cannot follow the above method such as when you are using a mobile. In situations like that, you can follow this trick. iwebtool is a great site that offers URL extraction along with other features such as selective extraction of inbound or outbound links, anchor text extraction, etc. You can make 10 requests per hour with the free version of the tool. Visit iwebtool link extractor to get started. Enter the url in the text box and wait for the site to extract the links.

Extract URLs from a block of text

open-multi-urls-with-multiple-url-opener

Let’s say you’ve got a text file with a bunch of links in it. In such cases, you can use our free web tool to extract and open multiple URLs from texts.

  1. Go to Codegena URL Extractor and bulk URL opener
  2. Paste the text into the text area and hit Linkify to generate clickable links.
  3. Click on “Open all URL” button to automatically open all of the links in new tabs (allow popups to enable this feature)

You can also paste the source code of a website into this tool to extract the urls from it. However in such cases this tool will fail to extract relative links.

A similar tool is BuzzStream URL Extractor which also offers CSV export.

If you need any assistance, leave a comment below and I will get back to you.

Also Read

Total
0
Shares
5 comments
  1. all medthod cannot extract url type “onclick” in sourcecode i am super surprise !!!!!!!!!!!!!!

    you try go to any web which contain url of id line which it is this format

    onclick=”location.href=’http://line.me/ti/p/~petermarrylove789′”

    your method can not extract this url !!!

    pls help me too. thank you very much

Leave a Reply
Related Posts
Total
0
Share