I want to read the HTML from any website, say "http://www.twitter.com", print the retrieved HTML, and save it to a text file on my local machine.
import java.net.*;
import java.io.*;

public class oddless {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://www.fetagracollege.org");
        BufferedReader in = new BufferedReader(new InputStreamReader(oracle.openStream()));
        PrintWriter out = new PrintWriter(new FileOutputStream("/Users/Rohan/new_sourcee.txt"));
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            System.out.println(inputLine); // print to console
            out.println(inputLine);        // write the same line to the file
        }
        out.close(); // flush buffered output to disk
        in.close();
    }
}
The code above retrieves the data, prints it on the console, and saves it to a text file, but it usually retrieves only part of the HTML (perhaps because of blank lines in the HTML source) and does not save the rest.
How can I save the full HTML code?
Are there any other alternatives?
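One alternative, sketched below assuming Java 11 or later, is the built-in java.net.http.HttpClient, which reads the entire response body in a single call, so there is no line-by-line loop that could stop early. The URL and output file name are taken from the question; adjust them as needed.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

public class FetchPage {
    // Fetch the whole page body as one string (Java 11+ HttpClient).
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        String html = fetch("http://www.fetagracollege.org");
        System.out.println(html);                             // print to console
        Files.writeString(Path.of("new_sourcee.txt"), html);  // save everything at once
    }
}
```

Because the body arrives as one string, saving it is a single Files.writeString call and there is no stream left open to forget to close.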
I have used a different approach but received the same output as you. Isn't there a problem on the server side of this URL?
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("http://www.fetagracollege.org");
CloseableHttpResponse response1 = httpclient.execute(httpGet);
try {
    System.out.println(response1.getStatusLine());
    HttpEntity entity1 = response1.getEntity();
    String content = EntityUtils.toString(entity1);
    System.out.println(content);
} finally {
    response1.close();
}
It finishes with:
</table>
<p><br>
UPDATE: This Faculty of Engineering and Technology site does not have a well-formed home page. That content is complete and your code works well. But the commenters are right: you should use a try/catch/finally block (or try-with-resources) so the streams are always closed.
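A try-with-resources version of the original download loop might look like the sketch below (the class name, URL, and output file name are illustrative). Both streams are closed automatically even if an exception is thrown mid-download, so nothing buffered is lost.

```java
import java.io.*;
import java.net.URL;

public class Downloader {
    // Copy every line from the reader to both the console and the writer.
    static void copyLines(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line); // echo to console
            out.println(line);        // write to file
        }
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.fetagracollege.org");
        // try-with-resources closes both streams automatically, even on error
        try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
             PrintWriter out = new PrintWriter(new FileWriter("new_sourcee.txt"))) {
            copyLines(in, out);
        }
    }
}
```

Compared with calling close() manually at the end of main, this also covers the failure path: if readLine() throws halfway through, the PrintWriter is still flushed and closed.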