Database fetchrow_array failed long truncated DBI attribute

Marcus Lim picture Marcus Lim · Sep 7, 2012 · Viewed 12.6k times · Source

i am pulling urls from my database with a perl script where i employ fetchrow_array to pull URL from the database which worked fine until i encountered a very long URL georgelog24.blog.iskreni.net/?bid=6744d9dcf85991ed2e4b8a258153a1ab&lid=ff9963b9a798ea335b75b5f7c0c295d1
then it started to give me this error.

DBD::ODBC::st fetchrow_array failed: st_fetch/SQLFetch (long truncated DBI attribute LongTruncOk not set and/or LongReadLen too small) (SQL-HY000) [state was HY000 now 01004]
[Microsoft][ODBC SQL Server Driver]String data, right truncation (SQL-01004) at C:\test\multihashtest2.pl line 44.

I believe this is on the database side as the code i have been using to pull URL has worked before. The database that i am using is MSSQL server 2005.

the URL column in the database uses text type currently, but i have tried changing it to varchar(max) and nvarchar(max) but the error still stands.

After a bit of trial and error i found that the maximum length of the url then i could query successfully with fetchrow_array was 81 characters. And since URLs can span ridiculous lengths sometimes, i cannot put a restriction on URL length.

Can anybody help me understand and suggest a fix for this?

FYI: line 44 is the first line in my code below

while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do 
    my $thread = threads->create(\&webcrawl); #initiate thread
    my $tid = $thread->tid;
    print "  - Thread $tid started\n";   #obtain thread no. and print
    push (@Threads, $thread);   #push thread into array for "housekeeping" later on
}

Answer

Tudor Constantin picture Tudor Constantin · Sep 7, 2012

Try with:

#not anymore errors if content is truncated - you don't necessarily want this
$statement_handle->{'LongTruncOk'} = 1;

#nice, hard coded constant for the length of data to be read from Longs
$statement_handle->{'LongReadLen'} = 20000;
while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do 
    my $thread = threads->create(\&webcrawl); #initiate thread
    my $tid = $thread->tid;
    print "  - Thread $tid started\n";   #obtain thread no. and print
    push (@Threads, $thread);   #push thread into array for "housekeeping" later on
}

Also, I'd recommend you to try Parallel::ForkManager for parallelizing jobs - I find it much more intuitive and easy to use than threads