Comparing two files in batch script

user1495791 · Oct 7, 2013

I am trying to compare two files as follows: each line of file 1 should be compared with every line of file 2, and if no match is found, that line should be written to a separate file.

Below is the code I wrote, but it is not working as expected:

@echo on
cd path
for /f %%a in (file1.txt) do (
for /f %%b in (file2.txt) do (
if %%a==%%b
(
echo lines are same
) else (
echo %%a >> file3.txt
)
)
)

I am getting an error saying "The syntax of the command is incorrect." Please help me with this.

Answer

dbenham · Oct 7, 2013

The FINDSTR method that foxidrive shows is definitely the fastest pure batch way to approach the problem, especially if file2 is large. However, there are a number of scenarios that can cause it to fail: regex meta-characters in file1, quotes and/or backslashes in file1, etc. See "What are the undocumented features and limitations of the Windows FINDSTR command?" for all of the potential issues. A bit more work can make the solution more reliable.

  • The search should be explicitly made literal
  • The search should be exact match (entire line)
  • Any backslash in search line should be escaped as \\
  • Each search should be stored in a temp file and the /G:file option used

Also, you don't describe the format of each line. Your FOR /F statements will read only the first word of each line because of the default delims option of <tab> and <space>. I suspect you want to set delims to nothing. You also want to disable the eol option so that lines beginning with ; are not skipped. This requires some weird looking syntax. I added the usebackq option in case you ever deal with file names that must be quoted.

@echo off
setlocal disableDelayedExpansion
set "file1=file1.txt"
set "file2=file2.txt"
set "file3=file3.txt"
set "search=%temp%\search.txt"

>"%file3%" (
  for /f usebackq^ delims^=^ eol^= %%A in ("%file1%") do if "%%A" neq "" (
    set "ln=%%A"
    setlocal enableDelayedExpansion
    (echo(!ln:\=\\!) >"%search%"
    findstr /lxg:"%search%" "%file2%" >nul || (echo(!ln!)
    endlocal
  )
)
del "%search%" 2>nul
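
To see what the FOR /F options in the script above change, the following sketch may help. It writes a small scratch file (the name forf_demo.txt and its two lines are made up purely for illustration) and reads it twice: once with the default options and once with delims and eol disabled.

```batch
@echo off
setlocal disableDelayedExpansion
rem Hypothetical scratch file, used only for this demonstration
set "demo=%temp%\forf_demo.txt"

rem One line containing spaces, one line starting with a semicolon
>"%demo%" (
  echo first second third
  echo(;line starting with a semicolon
)

echo With default options:
for /f "usebackq" %%A in ("%demo%") do echo [%%A]

echo With delims and eol disabled:
for /f usebackq^ delims^=^ eol^= %%A in ("%demo%") do echo [%%A]

del "%demo%" 2>nul
```

With the default options the loop should see only the first word of the first line and skip the semicolon line entirely; with delims and eol disabled both lines should come through whole.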

There is an extremely fast one line solution if your file2 does not contain \" and you can afford to do a case insensitive search: simply reverse the FINDSTR search to look for any lines in file1 that don't exist in file2. The search must be case insensitive because of the issue described in "Why doesn't this FINDSTR example with multiple literal search strings find a match?".

findstr /livxg:"file2.txt" "file1.txt" >"file3.txt"

This will not work if file2 contains \" because of escape issues. You could preprocess file2 and escape all \, but then you might as well use the first solution if you are restricting yourself to a pure batch solution.
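
For illustration only, the preprocessing step mentioned above might look roughly like this in pure batch. It is a sketch that reuses the loop pattern and variable names from the first script, not a tested replacement for it, and it gives up the simplicity of the one-liner.

```batch
@echo off
setlocal disableDelayedExpansion
set "file1=file1.txt"
set "file2=file2.txt"
set "file3=file3.txt"
set "search=%temp%\search.txt"

rem Copy file2 to a temp file, doubling every backslash on the way.
rem FOR /F skips empty lines, so they are not copied.
>"%search%" (
  for /f usebackq^ delims^=^ eol^= %%A in ("%file2%") do (
    set "ln=%%A"
    setlocal enableDelayedExpansion
    echo(!ln:\=\\!
    endlocal
  )
)

rem Reversed search: print the lines of file1 that match nothing in the escaped copy
findstr /livxg:"%search%" "%file1%" >"%file3%"
del "%search%" 2>nul
```

Delayed expansion is toggled inside the loop so that any ! characters in a line survive the assignment to ln.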

If you are willing to use a hybrid JScript/batch utility called REPL.BAT, then I have an extremely simple and efficient solution. REPL.BAT performs a regex search and replace operation on each line of stdin, and writes the result to stdout.

Assuming REPL.BAT is in your current directory, or better yet, somewhere within your path:

@echo off
setlocal
set "file1=file1.txt"
set "file2=file2.txt"
set "file3=file3.txt"
set "search=%temp%\search.txt"

type "%file2%"|repl \\ \\ >"%search%"
findstr /livxg:"%search%" "%file1%" >"%file3%"
del "%search%" 2>nul

Note that this solution still must perform a case insensitive comparison.
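
For reference, the "syntax of the command is incorrect" error in the question's nested loop comes from the opening parenthesis sitting on its own line after IF: the parenthesis (or the command itself) has to start on the same line as the IF. The loop also writes a line to file3 once for every non-matching line of file2, rather than once when nothing matches at all. A minimal repair of that approach, assuming the file names from the question and using a flag variable named found that is introduced here only for illustration, could look like this:

```batch
@echo off
setlocal disableDelayedExpansion
rem Sketch only: nested-loop comparison with a per-line "found" flag.
rem The command after IF starts on the same line as the IF itself.
>"file3.txt" (
  for /f usebackq^ delims^=^ eol^= %%A in ("file1.txt") do (
    set "found="
    for /f usebackq^ delims^=^ eol^= %%B in ("file2.txt") do (
      if "%%A"=="%%B" set "found=1"
    )
    if not defined found echo(%%A
  )
)
```

This rereads file2 for every line of file1, so expect it to be far slower than any of the FINDSTR variants on large files. The comparison is case sensitive, and empty lines are skipped by FOR /F.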