How to get the contents of a JavaScript variable using cheerio (jQuery-like selectors, but no DOM)

Falieson · Feb 21, 2015 · Viewed 8.3k times

There is a large HTML file with many script tags in it. I'm trying to scoop out the contents of one JavaScript variable defined in one of them. The variable name stays the same, but the contents change on every request.

examplefile.html

<script type="text/javascript">//.... more js</script>
<script type="text/javascript">//.... more js</script>
<script type="text/javascript">var foo = {"b":"bar","c":"cat"}</script>
<script type="text/javascript">//.... more js</script>
<script type="text/javascript">//.... more js</script>
<script type="text/javascript">//.... more js</script>

desired console result

> var result = $('script').<some_selection_thingy>
result = {"b":"bar","c":"cat"}

Let me explain a little bit. By <some_selection_thingy> I mean that my question is: a) how do I select the element of the array that contains 'var foo', and b) how do I get the contents of the foo variable so that I can import that information into a local JSON variable for further processing?

When you run $('script') in the console, jQuery returns an array.

> $('script')
[<script type="text/javascript">//.... more js</script>,<script type="text/javascript">//.... more js</script>,<script type="text/javascript">var foo = {"b":"bar","c":"cat"}</script>,<script type="text/javascript">...</script>]

Because this is cheerio and not actually jQuery, the DOM isn't loaded, so I can't just do $(foo). One alternative is to use jsdom instead of cheerio, but I've read in other Stack Overflow answers (while researching this question) that it's less performant, so I'd prefer to learn the correct jQuery selectors I need to scoop out this variable.

server.js

// some cheerio node code
url = 'someurl';
request(url, function(error, response, html){
    var $ = cheerio.load(html);
    result = $('script').map(&:text).select{ |s| s['var foo'] }
    result = result[0]
//SyntaxError: Unexpected token &

That error is of course expected, because .map(&:text) is Ruby syntax that I'd use with an XPath-based parser, and it doesn't work in JavaScript with cheerio (jQuery).
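For reference, the Ruby-style chain in the attempt above maps fairly directly onto JavaScript's map and filter. The sketch below works on a plain array of script bodies so that it runs without any dependencies; scriptTexts and findScript are hypothetical names introduced here. With cheerio, such an array could presumably be built with $('script').map(function(i, el){ return $(el).text(); }).get().

```javascript
// Hypothetical stand-in for the text of each <script> tag in examplefile.html.
var scriptTexts = [
    '//.... more js',
    'var foo = {"b":"bar","c":"cat"}',
    '//.... more js'
];

// Rough equivalent of .select{ |s| s['var foo'] }:
// keep only the script bodies that contain the marker string.
function findScript(texts, marker) {
    return texts.filter(function (s) {
        return s.indexOf(marker) !== -1;
    });
}

var matches = findScript(scriptTexts, 'var foo');
console.log(matches[0]); // var foo = {"b":"bar","c":"cat"}
```

This only answers part a) of the question (picking the right script); the matched string still has to be stripped down to the JSON part before it can be parsed.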

Answer

Falieson · Feb 22, 2015

I got it!

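// Return the text between `variable` and the next ";" in `target`.
// Assumes the assignment ends with ";" and the JSON itself contains no ";".
function findTextAndReturnRemainder(target, variable){
    // indexOf matches literally; search() would treat `variable` as a regex
    var chopFront = target.substring(target.indexOf(variable)+variable.length);
    var result = chopFront.substring(0,chopFront.indexOf(";"));
    return result;
}
// .text() concatenates the contents of every <script> tag
var text = $('script').text();
var findAndClean = findTextAndReturnRemainder(text,"var foo =");
var result = JSON.parse(findAndClean);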
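One caveat: the script in examplefile.html has no ";" after the object, so cutting at the next semicolon depends on whatever happens to follow. As a possible alternative, here is a minimal sketch that finds the end of the object by counting braces. extractJsonVar is a hypothetical helper (not part of cheerio), and it assumes the value assigned to the variable is a valid JSON object literal.

```javascript
// Hypothetical helper: return the parsed object assigned to `name`, found by
// counting braces instead of searching for a trailing ";".
// Assumption: the assigned value is valid JSON. Braces inside strings are skipped.
// Limitation: 'var foo' also matches a longer name such as 'var foobar'.
function extractJsonVar(source, name) {
    var start = source.indexOf('var ' + name);
    if (start === -1) return null;            // variable not declared
    var open = source.indexOf('{', start);
    if (open === -1) return null;             // no object literal follows
    var depth = 0, inString = false;
    for (var i = open; i < source.length; i++) {
        var ch = source[i];
        if (inString) {
            if (ch === '\\') i++;             // skip the escaped character
            else if (ch === '"') inString = false;
        } else if (ch === '"') {
            inString = true;
        } else if (ch === '{') {
            depth++;
        } else if (ch === '}' && --depth === 0) {
            // matching close brace found: parse everything in between
            return JSON.parse(source.substring(open, i + 1));
        }
    }
    return null;                              // unbalanced braces
}

// Stand-in for the concatenated script text of examplefile.html.
var sample = '//.... more js\nvar foo = {"b":"bar","c":"cat"}\n//.... more js';
var result = extractJsonVar(sample, 'foo');
console.log(result.c); // cat
```

With cheerio, the same call would be extractJsonVar($('script').text(), 'foo').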