Get Plain HTML from HTTP Requests

dulcyn · Dec 18, 2012 · Viewed 10.1k times

I'm working on a Grails app and have spent several hours trying to get the HTML from a request. What I want is the plain HTML (as in a web page's source, with all the tags), so that I can work on it.

I have already managed to get it for my GET requests with this code:

def html = "http://google.com".toURL().getText()

It works just fine, but I also need to be able to make POST requests.

I've tried HTTPBuilder, but the response I get looks like well-formatted text (whitespace is preserved) with no HTML tags in it, and I need the tags. The code I'm using looks like this:

def url = "http://urlToRemoteServer.com/"
def http = new HTTPBuilder(url);


http.post( path: 'pathToMyApp',
        requestContentType: "text/xml" ) { resp, reader ->

            println "Tweet response status: ${resp.statusLine}"
            assert resp.statusLine.statusCode == 200
            System.out << reader
        }

Can anyone tell me how to get that HTML? I'm working in Groovy, but a Java solution would be just as good.

Answer

Brian Henry · Dec 18, 2012

Add contentType to the arguments of the post call to force plain-text parsing (I believe this also changes the Accept header), as below:

// requires: import groovyx.net.http.ContentType
http.post( path: 'pathToMyApp',
           requestContentType: "text/xml",
           contentType: ContentType.TEXT) { resp, reader ->

Alternatively, you can change the parser for this and future requests by adding a ParserRegistry remap after the constructor, so that text/html responses are handled by the plain-text parser:

http.parser.'text/html' = http.parser.'text/plain'

You can also call setContentType() right after constructing the HTTPBuilder:

//...
def http = new HTTPBuilder(url);  //existing code
http.contentType = ContentType.TEXT //new addition
http.post( path: 'pathToMyApp', //existing code
//...
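
Since a Java solution was also welcome, here is a minimal sketch of the same idea without HTTPBuilder. It swaps in the JDK's own java.net.http.HttpClient (Java 11 or later), which hands back the response body as an unparsed String, so the tags survive. The class name RawHtmlPost, the method postForRawBody and the "<request/>" payload are made up for illustration; the text/xml request content type is taken from the question.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RawHtmlPost {

    // Hypothetical helper: POSTs an XML payload and returns the response
    // body exactly as the server sent it, with no HTML parsing applied.
    static String postForRawBody(String url, String xmlBody)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "text/xml") // same request type as in the question
                .POST(HttpRequest.BodyPublishers.ofString(xmlBody, StandardCharsets.UTF_8))
                .build();

        // BodyHandlers.ofString() reads the body as plain text.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java RawHtmlPost <url>");
            return;
        }
        // "<request/>" is a placeholder payload, not something the remote app requires.
        System.out.println(postForRawBody(args[0], "<request/>"));
    }
}
```

Called with the URL and path from the question joined together, this should print the page source including its tags. Redirect handling, timeouts and character-set detection are left at the client defaults here and may need tuning for a real server.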