i am pulling urls from my database with a perl script where i employ fetchrow_array to pull URL from the database which worked fine until i encountered a very long URL georgelog24.blog.iskreni.net/?bid=6744d9dcf85991ed2e4b8a258153a1ab&lid=ff9963b9a798ea335b75b5f7c0c295d1
then it started to give me this error.
DBD::ODBC::st fetchrow_array failed: st_fetch/SQLFetch (long truncated DBI attribute LongTruncOk not set and/or LongReadLen too small) (SQL-HY000) [state was HY000 now 01004]
[Microsoft][ODBC SQL Server Driver]String data, right truncation (SQL-01004) at C:\test\multihashtest2.pl line 44.
I believe this is on the database side as the code i have been using to pull URL has worked before. The database that i am using is MSSQL server 2005.
the URL column in the database uses text type currently, but i have tried changing it to varchar(max) and nvarchar(max) but the error still stands.
After a bit of trial and error i found that the maximum length of the url then i could query successfully with fetchrow_array was 81 characters. And since URLs can span ridiculous lengths sometimes, i cannot put a restriction on URL length.
Can anybody help me understand and suggest a fix for this?
FYI: line 44 is the first line in my code below
while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do
my $thread = threads->create(\&webcrawl); #initiate thread
my $tid = $thread->tid;
print " - Thread $tid started\n"; #obtain thread no. and print
push (@Threads, $thread); #push thread into array for "housekeeping" later on
}
Try with:
#not anymore errors if content is truncated - you don't necessarily want this
$statement_handle->{'LongTruncOk'} = 1;
#nice, hard coded constant for the length of data to be read from Longs
$statement_handle->{'LongReadLen'} = 20000;
while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do
my $thread = threads->create(\&webcrawl); #initiate thread
my $tid = $thread->tid;
print " - Thread $tid started\n"; #obtain thread no. and print
push (@Threads, $thread); #push thread into array for "housekeeping" later on
}
Also, I'd recommend you to try Parallel::ForkManager
for parallelizing jobs - I find it much more intuitive and easy to use than threads