Perl multiline regex

Nicolas Rodríguez Seara · Sep 26, 2013

I have a file full of JSON-like objects to parse, similar to this one:

{
"_id" : ObjectId("523a58c1e4b09611f4c58a66"),
"_items" : [
    {
        "adGroupId" : NumberLong(1230610621),
        "keywordId" : NumberLong("5458816773")
    },
    {
        "adGroupId" : NumberLong(1230613681),
        "keywordId" : NumberLong("3204196588")
    },
    {
        "adGroupId" : NumberLong(1230613681),
        "keywordId" : NumberLong("4340421772")
    },
    {
        "adGroupId" : NumberLong(1230615571),
        "keywordId" : NumberLong("10525630645")
    },
    {
        "adGroupId" : NumberLong(1230617641),
        "keywordId" : NumberLong("4178290208")
    }
]}

I want to extract the numbers from inside the NumberLong() calls. At first I needed just the keywordId values, and managed to get them with:

cat listado.txt |& perl -ne 'print "$1," if /\"keywordId\" : NumberLong\(\"?(\d*)\"?\)/' > keywordIds.txt

This generated a comma-separated file with the numbers. I now also need the adGroupIds, so I'm trying the following regex, with no luck:

cat ./work/listado.txt |& perl -ne 'print "$1-$2," if /\"adGroupId\" : NumberLong\(\"?(\d*)\"?\),\s*\"keywordId\" : NumberLong\(\"?(\d*)\"?\)/m'

The regex matches, but I believe perl is not doing multiline, even though I'm using /m.

Any ideas?

Answer

ikegami · Sep 26, 2013

/m affects what ^ and $ match. You use neither, so /m has no effect.

You only read a single line at a time, so you only match against a single line at a time. /m cannot possibly cause the regex to match against data that is still waiting to be read from some file handle the regex engine knows nothing about.

You could load the entire file into memory by using -0777 and loop over all matches instead of just grabbing the first.