I have written a shell script in ksh to convert a CSV file into a Spreadsheet XML file. It takes an existing CSV file (the path to which is a variable in the script) and creates a new .xls output file. The script has no positional parameters; the file name of the CSV is currently hardcoded into the script.
I would like to amend the script so it can take the input CSV data from a pipe, and so that the .xls output data can also be piped or redirected (>) to a file on the command line.
How is this achieved?
I am struggling to find documentation on how to write a shell script that takes input from a pipe. It appears that 'read' is only used for standard input from the keyboard.
Edit: script below for info (now amended to take input from a pipe via cat, as per the answer to the question).
#Script to convert a .csv data to "Spreadsheet ML" XML format - the XML scheme for Excel 2003
# Take CSV data as standard input
# Output XLS data as standard output
DATE=`date +%Y%m%d`
#define tmp files
INPUT=tmp.csv
IN_FILE=in_file.csv
#take standard input and save as $INPUT (tmp.csv)
cat > $INPUT
#clean input data and save as $IN_FILE (in_file.csv)
grep '.' $INPUT | sed 's/ *,/,/g' | sed 's/, */,/g' > $IN_FILE
#delete original $INPUT file (tmp.csv)
rm $INPUT
#detect the number of columns and rows in the input file
ROWS=`wc -l < $IN_FILE | sed 's/ //g' `
COLS=`awk -F',' '{print NF; exit}' $IN_FILE`
#echo "Total columns is $COLS"
#echo "Total rows is $ROWS"
#create start of Excel File
echo "<?xml version=\"1.0\"?>
<?mso-application progid=\"Excel.Sheet\"?>
<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"
 xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">
<DocumentProperties xmlns=\"urn:schemas-microsoft-com:office:office\">
<Author>Ben Hamilton</Author>
<LastAuthor>Ben Hamilton</LastAuthor>
</DocumentProperties>
<ExcelWorkbook xmlns=\"urn:schemas-microsoft-com:office:excel\">
</ExcelWorkbook>
<Styles>
<Style ss:ID=\"Default\" ss:Name=\"Normal\">
<Alignment ss:Vertical=\"Bottom\" />
<Borders />
<Font />
<Interior />
<NumberFormat />
<Protection />
</Style>
<Style ss:ID=\"AcadDate\">
<NumberFormat ss:Format=\"Short Date\"/>
</Style>
</Styles>
<Worksheet ss:Name=\"Sheet 1\">
<Table>
<Column ss:AutoFitWidth=\"1\" />"
#for each row in turn, create the XML elements for row/column
r=1
while (( r <= $ROWS ))
do
    echo "<Row>\n"
    #restart the column counter for every row
    c=1
    while (( c <= $COLS ))
    do
        DATA=`sed -n "${r}p" $IN_FILE | cut -d "," -f $c `
        #values of the form DD.MM.YYYY are written as DateTime cells
        if [[ "${DATA}" == [0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9] ]]; then
            DD=`echo $DATA | cut -d "." -f 1`
            MM=`echo $DATA | cut -d "." -f 2`
            YYYY=`echo $DATA | cut -d "." -f 3`
            echo "<Cell ss:StyleID=\"AcadDate\"><Data ss:Type=\"DateTime\">${YYYY}-${MM}-${DD}T00:00:00.000</Data></Cell>"
        else
            echo "<Cell><Data ss:Type=\"String\">${DATA}</Data></Cell>"
        fi
        (( c+=1 ))
    done
    echo "</Row>"
    (( r+=1 ))
done
echo "</Table>\n</Worksheet>\n</Workbook>"
rm $IN_FILE > /dev/null
exit 0
Commands inherit their standard input from the process that starts them. In your case, your script passes its own standard input on to each command that it runs. A simple example script:
cat > foo.txt
Piping data into your shell script causes cat to read that data, since cat inherits its standard input from your script.
$ echo "Hello world" | myscript.sh
$ cat foo.txt
Hello world
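The same inheritance applies to standard output, which is what lets the result be piped onward or redirected with > on the command line. A minimal sketch, assuming the cleaning step from the question is wrapped in a function (the name clean_csv is made up for illustration):

```shell
# Hypothetical filter: the commands inside the function inherit
# its standard input and standard output, so no file names are needed.
clean_csv() {
    # drop empty lines, then strip spaces around the commas
    grep '.' | sed -e 's/ *,/,/g' -e 's/, */,/g'
}

# Input arrives on a pipe; the output could equally be redirected with >
printf 'a , b\n\nc,  d\n' | clean_csv
```

Nothing in the function mentions a file, so the caller decides where the data comes from and where it goes.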
The read command is provided by the shell for reading text from standard input into a shell variable if you don't have another command to read or process your script's standard input.
read foo
echo "You entered '$foo'"
$ echo bob | myscript.sh
You entered 'bob'
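read is not tied to the keyboard: it takes one line at a time from whatever standard input is connected to, so it can sit in a loop to process piped data line by line. A rough sketch, assuming comma-separated input (print_fields and the variable names are hypothetical):

```shell
# Hypothetical helper: split each input line at the first comma.
print_fields() {
    # IFS=, makes read split on commas; -r keeps backslashes literal.
    # The last variable receives whatever is left of the line.
    while IFS=, read -r first rest; do
        printf "first='%s' rest='%s'\n" "$first" "$rest"
    done
}

printf 'a,b,c\nd,e\n' | print_fields
```

The loop ends when read reaches end of input, which for a pipe is when the writing command exits.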