It is often useful to compare versions of text files. For system administrators and software developers, this is particularly important. A system administrator may, for example, need to compare an existing configuration file to a previous version to diagnose a system problem. Likewise, a programmer frequently needs to see what changes have been made to programs over time.
The comm utility displays a line-by-line comparison of two sorted files. The first of the three columns it displays lists the lines found only in file1, the second column lists the lines found only in file2, and the third lists the lines common to both files. The basic syntax for the “comm” command is:
# comm [options] file1 file2
Arguments
The file1 and file2 arguments are pathnames of the files that comm compares. Using a hyphen (-) in place of file1 or file2 causes comm to read standard input instead of that file.
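As a minimal sketch of the hyphen form, the following feeds the first input to comm through a pipe. The scratch directory, file names, and variable name are placeholders chosen for illustration; the sort step is included because comm expects sorted input.

```shell
# Scratch directory and file names are illustrative only.
workdir=$(mktemp -d)
printf '%s\n' dd bb aa cc > "$workdir/unsorted1"   # deliberately unsorted
printf '%s\n' cc xx yy zz > "$workdir/file2"       # already sorted

# Sort the first input on the fly; "-" tells comm to read it from standard input.
# LC_ALL=C makes sort and comm agree on the same byte-wise ordering.
stdin_common=$(LC_ALL=C sort "$workdir/unsorted1" | LC_ALL=C comm -12 - "$workdir/file2")
echo "$stdin_common"   # cc

rm -rf "$workdir"
```

Only one of the two inputs can be a hyphen, since there is a single standard input to read from.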
Options
You can combine the options. With no options, comm produces three-column output.
Options | Function |
---|---|
-1 | Does not display column 1 (does not display lines found only in file1). |
-2 | Does not display column 2 (does not display lines found only in file2). |
-3 | Does not display column 3 (does not display lines found in both files). |
-i | Case-insensitive comparison of lines (provided by BSD and macOS comm; GNU comm on Linux does not offer this option). |
--check-order | Checks that the input is correctly sorted, even if all input lines are pairable. |
--nocheck-order | Does not check that the input is correctly sorted. |
--output-delimiter=STR | Separates columns with the string STR. |
--help | Displays a help message. |
--version | Displays version information. |
Examples of using the comm command in Linux
Example 1: Basic usage
Let's look at a basic example of using comm to compare two sorted files. The files are shown below:
# cat file1
aa
bb
cc
dd

# cat file2
cc
xx
yy
zz
The comm command compares the files line by line and reports which lines are unique to each file and which are common to both. For example:
# comm file1 file2
aa
bb
		cc
dd
	xx
	yy
	zz
The output is displayed in three columns: column 1 shows the lines found only in file1 (aa, bb, dd), column 2 shows the lines found only in file2 (xx, yy, zz), and column 3 shows the lines common to both files (cc). Because comm classifies every line of both files rather than reporting only the differences, as diff does, the output can be overwhelming when all you want is to check for one or two simple changes. However, it can be very useful when you are not familiar with either file and want to see how they compare.
Example 2: Suppressing the columns
comm supports options in the form -n where n is either 1, 2, or 3. When used, these options specify which column(s) to suppress. For example, if we wanted to output only the lines shared by both files, we would suppress the output of columns 1 and 2:
# comm -12 file1 file2
cc
Similarly, you can display only the lines unique to file1, or only the lines unique to file2, with the commands below.
# comm -23 file1 file2
aa
bb
dd

# comm -13 file1 file2
xx
yy
zz
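Taken together, these three combinations behave like set operations on the lines of two sorted files. The sketch below recreates the sample files in a scratch directory and captures each result; the directory and variable names are illustrative and not part of comm itself.

```shell
# Recreate the sample files from Example 1 in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$workdir/file1"
printf '%s\n' cc xx yy zz > "$workdir/file2"

# Intersection: lines present in both files.
in_both=$(comm -12 "$workdir/file1" "$workdir/file2")
# Difference: lines present only in file1.
only_file1=$(comm -23 "$workdir/file1" "$workdir/file2")
# Difference: lines present only in file2.
only_file2=$(comm -13 "$workdir/file1" "$workdir/file2")

# The unquoted expansions let the shell join the lines with spaces for display.
echo "in both:" $in_both            # in both: cc
echo "only in file1:" $only_file1   # only in file1: aa bb dd
echo "only in file2:" $only_file2   # only in file2: xx yy zz

rm -rf "$workdir"
```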
Example 3: Sorting check on input
The comm command provides two options that control whether it checks that its input is sorted:
1. --check-order
2. --nocheck-order
The --check-order option makes comm verify that each input file is sorted as it reads it. If an input is out of order, comm stops with an error. For example, when file2 contains the unsorted lines xx, cc, yy, zz (the version shown in the next listing):
# comm --check-order file1 file2
aa
bb
cc
dd
	xx
comm: file 2 is not in sorted order
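The --nocheck-order option, by contrast, lets the comparison run even if the input is not sorted. The result is not reliable, however: comm pairs lines only when both inputs are in sorted order, so here cc is reported as unique to each file instead of as a common line:
# cat file1
aa
bb
cc
dd

# cat file2
xx
cc
yy
zz

# comm --nocheck-order file1 file2
aa
bb
cc
dd
	xx
	cc
	yy
	zz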
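Since comm pairs lines only when both inputs are sorted, a common approach is to sort an unsorted file before comparing rather than to switch the check off. The following is a sketch under that assumption; the scratch directory, the file2.sorted name, and the variable names are hypothetical, and --nocheck-order is used as listed in the options table above.

```shell
# Scratch directory and file names are illustrative only.
workdir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$workdir/file1"
printf '%s\n' xx cc yy zz > "$workdir/file2"   # unsorted, as in this example

# Comparing the unsorted file directly misses the common line "cc".
unsorted_common=$(LC_ALL=C comm --nocheck-order -12 "$workdir/file1" "$workdir/file2")
echo "unsorted input: [$unsorted_common]"   # unsorted input: []

# Sorting first (with the same locale for sort and comm) gives the expected result.
LC_ALL=C sort "$workdir/file2" > "$workdir/file2.sorted"
sorted_common=$(LC_ALL=C comm -12 "$workdir/file1" "$workdir/file2.sorted")
echo "sorted input:   [$sorted_common]"     # sorted input:   [cc]

rm -rf "$workdir"
```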
Example 4: Delimited output
comm also provides an option to separate the output columns with a user-supplied delimiter. For example, instead of the default tab-delimited output, we can use a delimiter such as "|" (pipe), shown here with the sorted file1 and file2 from Example 1:
# comm --output-delimiter="|" file1 file2
aa
bb
||cc
dd
|xx
|yy
|zz
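A visible delimiter also makes the column of each line easy to detect in a script: no leading delimiter means column 1, one means column 2, and two mean column 3. The sketch below labels each line on that basis. The labels and variable names are made up for illustration, and the approach assumes that no input line itself begins with the delimiter character.

```shell
# Recreate the sorted sample files in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' aa bb cc dd > "$workdir/file1"
printf '%s\n' cc xx yy zz > "$workdir/file2"

# Count the leading "|" characters to decide which column a line belongs to.
labeled=$(
  comm --output-delimiter='|' "$workdir/file1" "$workdir/file2" |
  while IFS= read -r line; do
    case $line in
      '||'*) echo "both: ${line#??}" ;;        # two delimiters: column 3
      '|'*)  echo "file2 only: ${line#?}" ;;   # one delimiter: column 2
      *)     echo "file1 only: $line" ;;       # no delimiter: column 1
    esac
  done
)
echo "$labeled"

rm -rf "$workdir"
```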
comm vs. diff
comm is similar to diff in that both commands compare two files. But comm can also be used like uniq; comm selects duplicate or unique lines between two sorted files, whereas uniq selects duplicate or unique lines within the same sorted file.
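The contrast with uniq can be sketched as follows: uniq -d reports lines repeated within one sorted file, while comm -12 reports lines shared between two sorted files. The file named single and the variable names are hypothetical and exist only for this illustration.

```shell
# Scratch directory and file names are illustrative only.
workdir=$(mktemp -d)
printf '%s\n' aa bb bb cc > "$workdir/single"   # one sorted file with a repeated line
printf '%s\n' aa bb cc dd > "$workdir/file1"
printf '%s\n' cc xx yy zz > "$workdir/file2"

# uniq -d: duplicate lines within a single sorted file.
within_one=$(uniq -d "$workdir/single")
echo "$within_one"   # bb

# comm -12: lines common to two sorted files.
across_two=$(comm -12 "$workdir/file1" "$workdir/file2")
echo "$across_two"   # cc

rm -rf "$workdir"
```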