Shell Scripting unwanted '?' character at the end of file name

premprakash · Nov 6, 2012

I get an unwanted '?' at the end of my file name while doing this:

emplid=$(grep -a "Student ID" "$i".txt  | sed 's/(Student ID:  //g' | sed 's/)Tj//g' ) 
 #gets emplid by doing a grep from some text file
echo "$emplid"   #prints employee id correctly 
cp "$i" "$emplid".pdf  #getting an extra '?' character after emplid and before .pdf

That is, instead of getting a file name like 123456.pdf, I get 123456?.pdf. Why is this happening if the echo prints correctly? How can I remove the trailing question mark?

Answer

Gordon Davisson · Nov 6, 2012

It sounds like your script file has DOS-style line endings (\r\n) instead of unix-style (just \n) -- when a script is in this format, the \r gets treated as part of the commands. In this instance, it's getting included in $emplid and therefore in the filename.
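
If you want to confirm that a stray \r is really what's hiding in the value, something like the following rough sketch may help. The value 123456 and the variable names are made up for illustration; in the real script $emplid would come from the grep/sed pipeline. The terminal draws a carriage return invisibly, which is why echo looks fine, while a byte dump shows it.

```shell
# Simulate a value read from a file with DOS line endings (CR = \r).
# "123456" is a made-up ID used only for this demonstration.
cr=$(printf '\r')
emplid="123456$cr"

# The terminal renders the CR invisibly, so this looks correct.
echo "$emplid"

# The length gives it away: 6 digits plus the hidden CR.
len=${#emplid}
echo "length: $len"

# od -c dumps each byte; the carriage return shows up as \r.
dump=$(printf '%s' "$emplid" | od -c)
echo "$dump"
```

Running the same od -c check on the real "$emplid" (or on a line of the input file) should show whether a \r is present before you change anything.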

Many platforms support the dos2unix command to convert the file to unix-style line endings. And once it's converted, stick to text editors that support unix-style text files.

EDIT: I had assumed the problem line endings were in the shell script, but it looks like they're in the input file ("$i".txt) instead. You can use dos2unix on the input file to clean it and/or add a cleaning step to the sed command in your script. BTW, you can have a single instance of sed apply several edits with the -e option:

emplid=$(grep -a "Student ID" "$i".txt  | sed -e 's/(Student ID:  //g' -e 's/)Tj//g' -e $'s/\r$//' )
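
If your shell doesn't support the $'...' quoting, one possible variant (not the command above, just a sketch of the same idea) is to strip carriage returns with tr before sed sees the line. The sample line below is hypothetical; it only imitates the shape of the text the question's grep seems to match. Note that tr -d '\r' removes every carriage return on the line, not only a trailing one.

```shell
# Hypothetical sample line, shaped like the one grep returns in the question,
# ending in a DOS line ending. Command substitution drops the final \n
# but keeps the \r.
sample=$(printf '(Student ID:  123456)Tj\r\n')

# Delete carriage returns first, then apply both sed edits in one sed call.
emplid=$(printf '%s\n' "$sample" | tr -d '\r' | sed -e 's/(Student ID:  //g' -e 's/)Tj//g')

# The name built from the cleaned value has no stray character.
echo "$emplid.pdf"
```

In the real script the printf would be replaced by the grep -a "Student ID" "$i".txt from the question.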

I'd recommend against using sed 's/.$//' -- if the file is in unix format, that'll cut off the last character of the filename.
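
To make the difference concrete, here is a small sketch comparing the two approaches on made-up input (the ID 123456 and the variable names are assumptions for the demo). It builds the carriage return with printf so that it runs in a plain POSIX sh as well as bash.

```shell
cr=$(printf '\r')

# Two made-up inputs: one in unix format, one with a trailing CR.
unix_line='123456'
dos_line="123456$cr"

# Blind removal of the last character: fine for DOS input,
# but it eats a real digit when the input is already unix-format.
blind_unix=$(printf '%s\n' "$unix_line" | sed 's/.$//')
blind_dos=$(printf '%s\n' "$dos_line" | sed 's/.$//')

# Targeted removal: only a CR at the end of the line is deleted.
safe_unix=$(printf '%s\n' "$unix_line" | sed "s/$cr\$//")
safe_dos=$(printf '%s\n' "$dos_line" | sed "s/$cr\$//")

echo "blind: unix=$blind_unix dos=$blind_dos"
echo "safe:  unix=$safe_unix dos=$safe_dos"
```

The targeted version leaves unix-format input untouched, so it is safe to keep in the script even after the input files have been converted.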