-------------------------------------------------------------------------------- join -------------------------------------------------------------------------------- joins two files (file "1" and file "2") based on a common key which by default is that specified in column one. #IFS (input field separator) by default is \s or \t #by default the first column of the output is the field which #which is used for match (join) #by default, only records which are paired are printed -t "IFS" change IFS. OFS follows IFS -i ignore case -a 1 print unmatched lines in file 1 -o over-rides the output column order I. Simplest Example $ cat a 1 Protein 2 Carbohydrate 3 Fat $ cat b #note that the first column is sorted and has more entries 1 Cheese 2 Potato 3 Butter 4 Air 5 Water $ join a b #only common lines are joined 1 Protein Cheese 2 Carbohydrate Potato 3 Fat Butter $ join -v 1 a b #print lines which do not have counterpart in file 1 $ join -v 2 a b 0 Air 5 Water II. Joining on specific columns #remember: the specific columns must be sorted $ cat c Air 0 Cheese 1 Potato 2 Butter 3 Water 5 $ join -1 2 -2 1 c a #first file use column 2, second file use column 1 1 Cheese Protein 2 Potato Carbohydrate 3 Butter Fat III. Sorting before joining $ cat wine.txt White Reisling Germany Red Riocha Spain Red Beaunes France $ cat reviews.txt Riocha Meh Beaunes Great! Reisling Terrible! Fortunately bash allows sub-shells be treated as files. For reasons I do not understand you have to use GNU join $ gjoin -1 2 -2 1 <(sort -k 2 wine.txt) <(sort reviews.txt) Beaunes Red France Great! Reisling White Germany Terrible! Riocha Red Spain Meh IV. Sepcifying a different delimiter or IFS #done using -t option $ join -1 2 -2 1 -t, file1 file2 V. Grand Combo: specify a specific output sequence $ cat names.csv 1,John Smith,London 2,Arthur Dent, Newcastle 3,Sophie Smith,London $ cat transactions.csv £1234,Deposit,John Smith £4534,Withdrawal,Arthur Dent £4675,Deposit,Sophie Smith $ join -1 2 -2 3 -t , -o 1.1,1.2,1.3,2.2,2.1 names.csv transactions.csv 1,John Smith,London,Deposit,£1234 2,Arthur Dent, Newcastle,Withdrawal,£4534 3,Sophie Smith,London,Deposit,£4675 Of course you could have used awk to re-sequence the output. Ref: https://shapeshed.com/unix-join/ ------------------------------------------------------------------------ join whilst preserving the order of one of the input files ------------------------------------------------------------------------ A limitation of join is that the input files have to be sorted. Sometimes you wish to have the output to carry the same order as one of the input files. $ cat rabbits1 Parvi Brown Sarsi Black Lakshmi White Rusty Tan Malli Gray Kuro Blue $ cat rabbits2 Kuro Parsley Malli Coriander Rusty Celery Lakshmi Lettuce Sarsi Carrots Parvi Hay The key word here is name of the rabbit (column 1) and the other attributes are color and food choice. Say we wish to preserve ther order of the name in "rabbits1" (because these correspond to temporal order of my rabbits). The solution is to add a numerical index to rabbits1 file. $ cat -n rabbits1 1 Parvi Brown 2 Sarsi Black 3 Lakshmi White 4 Rusty Tan 5 Malli Gray 6 Kuro Blue We sort on the rabbit names and rewrite the output so that the first column is the number column $ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) 6 Kuro Blue Parsley 3 Lakshmi White Lettuce 5 Malli Gray Coriander 1 Parvi Brown Hay 4 Rusty Tan Celery 2 Sarsi Black Carrots Next we re-sort on the first column (the numerical index) $ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) | sort -k1 -n 1 Parvi Brown Hay 2 Sarsi Black Carrots 3 Lakshmi White Lettuce 4 Rusty Tan Celery 5 Malli Gray Coriander 6 Kuro Blue Parsley Finally, lop off the number column $ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) | sort -k1n | awk '{$1="";print}' Parvi Brown Hay Sarsi Black Carrots Lakshmi White Lettuce Rusty Tan Celery Malli Gray Coriander Kuro Blue Parsley We now have sorted list but the ordering is determined by rabbits1 To make the output pretty use "column" $ join -1 2 -2 1 -o1.1 -o1.2 -o 1.3 -o 2.2 <(cat -n rabbits1 | sort -k2) <(sort rabbits2) | sort -k1n | awk '{$1="";print}' | column -t Parvi Brown Hay Sarsi Black Carrots Lakshmi White Lettuce Rusty Tan Celery Malli Gray Coriander Kuro Blue Parsley