bash error in sort "sort: write failed: standard output: Broken pipe"

Dani · Sep 13, 2017 · Viewed 7.2k times

When I run this script I receive this error message: "sort: write failed: standard output: Broken pipe"

If someone can help me it would be awesome; I am going crazy with this error.

The input is a list of files that all contain DNA sequences in a FASTA format. Each file has several sequences, one sequence per line, with these fields: $1 is the identifier, $2 through $8 hold other values, and $9 is the DNA sequence.

I want to filter these files by the number of sequences they contain ($common_hits; this number is not a fixed value, but I set it to 6 for the example):
- All files with fewer than 6 sequences must be removed.
- Files with exactly 6 sequences are OK.
- Files with more than 6 sequences have to be reduced to 6 sequences (the sequences with the highest values in field $5 are kept).

Each output file must have exactly 6 sequences, and the sequence (field $9) has to be on the line after the identifier.

I am not removing the original files with more than 6 sequences for now, because I want to be sure it works.

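common_hits=6

# Iterate over the glob directly instead of parsing ls output
for i in *BR
do
   # Skip the literal pattern if no file matches
   [ -e "$i" ] || continue

   # Count lines once; each sequence occupies one line
   n=$(wc -l < "$i")

   if [ "$n" -lt "$common_hits" ]
   then
      rm -f -- "$i"
   elif [ "$n" -gt "$common_hits" ]
   then
      # Keep the lines with the highest numeric values in field 5,
      # then print fields 1-8 on one line and the sequence ($9) on the next
      sort -k5,5nr "$i" | head -n "$common_hits" | \
      awk '{print $1"    " $2"    " $3"    " $4"    " $5"    " $6"    " $7"           "$8 ; print $9}' > "$i.ph"
   fi
done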

Answer

Charles Duffy · Sep 13, 2017

sort | head reports an error whenever head exits (or otherwise closes its stdin) before sort has written all of its output, as will be the case if the stream written by sort is much longer than what head consumes. This is by design: if sort can't write all its output, it's expected to fail; if it ignored such failures, it would also ignore cases where it couldn't write its output for other reasons (disk full, broken network connection, etc.).
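The behavior described above can be reproduced with a small sketch. Here seq merely stands in for a large input file (an assumption for illustration, not part of the original script), and the brace group is just one way to surface sort's exit status. The exact status and whether a "Broken pipe" message is printed depend on the system and on how SIGPIPE is handled, so no particular value is claimed.

```shell
# seq stands in for a large input: 200000 lines is far more than a pipe
# buffer usually holds, so sort is still writing when head exits.
top3=$(
    seq 1 200000 |
    { sort -nr; echo "sort exit status: $?" >&2; } |
    head -n 3
)

# head received what it asked for, even though sort did not finish writing
printf '%s\n' "$top3"
```

The status line goes to stderr so that it does not mix with the data captured in top3. With an input this large the reported status is expected to be nonzero, while the three lines kept by head are still correct.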

There's nothing unusual or undesirable about this. If you don't care about the error, ignore it, and instead check the number of lines of output from the pipeline to determine whether you had an error condition.
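One way to follow this advice is sketched below. The function name top_by_field5 and the demo records are hypothetical, made up for illustration. Note that 2>/dev/null hides every diagnostic from sort, not just the broken-pipe message, which is exactly why the line count is checked afterwards.

```shell
# Hypothetical helper: print the n lines of a file with the highest field-5 values
top_by_field5() {
    file=$1
    n=$2
    # Discard sort's stderr; correctness is verified by counting lines instead
    sort -k5,5nr "$file" 2>/dev/null | head -n "$n"
}

# Demo input (made up): 10 records whose fifth field runs from 1 to 10
tmp=$(mktemp)
i=1
while [ "$i" -le 10 ]; do
    echo "id$i a b c $i f g h ACGT" >> "$tmp"
    i=$((i + 1))
done

common_hits=6
result=$(top_by_field5 "$tmp" "$common_hits")
# tr strips the padding some wc implementations add around the number
count=$(printf '%s\n' "$result" | wc -l | tr -d ' ')

if [ "$count" -eq "$common_hits" ]; then
    echo "ok: got $count lines"
else
    echo "error: expected $common_hits lines, got $count" >&2
fi
rm -f "$tmp"
```

Counting the output lines catches the failures that matter here (missing or truncated output) without depending on the exit status of sort, which is nonzero in the harmless broken-pipe case as well.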