Java Discover: Read web page

How to read webpage source code through Java?


Every web browser offers a "View page source" option: right-click a page, select "View page source", and the browser shows the complete client-side source code of that page. How can we do the same from Java code? We use the URLConnection class, which represents the connection to the server, and then the InputStream class to read the complete page content as bytes.

Next, by reading from the InputStream byte by byte until it reports the end of the stream, we get the complete page source as bytes, which we can store in a file for reference. Let's see a simple example that reads web page source code through Java.

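import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

public class URLReader {

    public static void main(String[] args) {
        try {
            // Open a connection to the page whose source we want to save.
            URL url = new URL("http://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html");
            URLConnection urlCon = url.openConnection();

            // FileOutputStream creates the file if it is missing and overwrites
            // it otherwise, so repeated runs do not append duplicate content.
            File myFile = new File("C:/URLConnection.html");

            // try-with-resources closes both streams, even if an exception is thrown.
            try (InputStream is = new BufferedInputStream(urlCon.getInputStream());
                 OutputStream os = new BufferedOutputStream(new FileOutputStream(myFile))) {
                int i;
                // read() returns the next byte (0-255), or -1 at the end of the stream.
                while ((i = is.read()) != -1) {
                    // Write the raw byte so the page's original encoding is preserved.
                    os.write(i);
                }
            }
            System.out.println("Web page reading completed...");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}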
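
If the page source is needed as text inside the program rather than as a file, the same URLConnection steps can be combined with an InputStreamReader. The following is only a minimal sketch and not part of the example above: the class name PageSourceReader and the helper readPageSource are hypothetical names chosen for illustration, and UTF-8 is an assumption about the page's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageSourceReader {

    // Hypothetical helper: reads everything the URL returns and decodes it as text.
    public static String readPageSource(URL url, Charset charset) throws IOException {
        URLConnection connection = url.openConnection();
        StringBuilder source = new StringBuilder();
        // try-with-resources closes the reader and the underlying stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), charset))) {
            char[] buffer = new char[4096];
            int count;
            // read(char[]) returns the number of chars read, or -1 at end of stream.
            while ((count = reader.read(buffer)) != -1) {
                source.append(buffer, 0, count);
            }
        }
        return source.toString();
    }

    public static void main(String[] args) throws IOException {
        // Same page as the example above; UTF-8 is an assumed encoding.
        URL url = new URL("http://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html");
        System.out.println(readPageSource(url, StandardCharsets.UTF_8));
    }
}
```

Decoding with an explicit charset avoids depending on the platform default. If the server declares a different encoding, that charset should be passed instead.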

OUTPUT:

On success, the program prints "Web page reading completed..." to the console, and the page source is saved in the output file.
