Increment duplicates in a file

Question

I have a file like this:
20001 
17001
17001
53001
90001 
90001
90001

and I'm trying to modify $1 by incrementing it each time it repeats an earlier entry, so that the output looks like this:
20001 
17001
17002
53001
90001 
90002
90003

guest · Answer

awk '{$1+=seen[$1]++} 1' file

This adds the post-incremented value of seen[$1], that is, the number of times that value has already appeared, to $1 before printing.
The above will produce duplicate numbers when values are close together. For the sequence 2,2,3, for example, the output is 2,3,3. A loop can be used to make that 2,3,4:
awk '{while (c[$1]) {$1 += c[$1] += c[$1+c[$1]]} c[$1]++} 1'

Array c stores the offset by which $1 is to be increased, just as the array seen does in the first example. Instead of increasing $1 only by the offset recorded for its own value, the loop also adds the offset recorded for the next value, repeating until a previously unseen $1 is reached.
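
To compare the two commands side by side, here is a small sketch that feeds the 2,2,3 sequence to each of them. The wrapper names simple and chained are made up for this illustration, and paste is only used to join the output onto one line.

```shell
# Hypothetical wrapper names, used only for this illustration.
simple()  { awk '{$1+=seen[$1]++} 1'; }
chained() { awk '{while (c[$1]) {$1 += c[$1] += c[$1+c[$1]]} c[$1]++} 1'; }

printf '%s\n' 2 2 3 | simple  | paste -sd, -   # 2,3,3
printf '%s\n' 2 2 3 | chained | paste -sd, -   # 2,3,4
```

In the first run the second 2 becomes 3, which then collides with the real 3 on the next line; the loop version moves that 3 on to 4.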

Stéphane Chazelas · Answer

A variation on @guest's answer that guards against duplicates in the output by incrementing the number for as long as it has already been output:
awk '{while ($1 in c) $1 += c[$1]++; c[$1]++; print}' file
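
Spread over several lines with comments, the same program might read as follows. This is only a sketch for readability: the wrapper name bump_dups is hypothetical, and the comments are my reading of what the array c holds.

```shell
# bump_dups is a hypothetical wrapper name, not part of the answer.
bump_dups() {
  awk '{
    # An index n exists in c once n has been read or printed;
    # c[n] is how far to jump the next time n comes up.
    while ($1 in c)
      $1 += c[$1]++   # jump ahead, and make the next jump from here longer
    c[$1]++           # record the value that is about to be printed
    print
  }' "$@"
}

printf '%s\n' 2 2 3 | bump_dups | paste -sd, -   # 2,3,4
```

Testing with ($1 in c) rather than (c[$1]) checks for the index without creating an empty element as a side effect.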

Or the same in perl, processing numbers wherever they are in the input:
perl -pe 's{\d+}{
            $i = $&;
            while (defined($c{$i})) {$i += $c{$i}++}
            $c{$i}++;
            $i
          }ge' file

On an input like:
1
1
1
5
5
10
10
1
1
1

They give:
1
2
3
5
6
10
11
4
7
8
