csv - Bash - adding values in row based on column
The 2nd column in my CSV file has duplicates. I want to add up the associated values from column 1 for each duplicated value in column 2.
Example CSV:

    56, cc=dk
    49, cc=us
    34, cc=gb
    32, cc=de
    32, cc=nz
    31, cc=dk
    31, cc=gb
    31, cc=gb
Example result:

    96, cc=gb   # 96 = 34+31+31
    87, cc=dk   # 87 = 56+31
    32, cc=de
    32, cc=nz
You can use associative arrays in awk:

    awk '{s[$2]+=$1}END{for(k in s)print s[k]", "k}' infile
Expanding on that for readability, and using sum/key rather than s/k:

    {                                # For each line.
        sum[$2] += $1                # Add first field to accumulator,
                                     #   indexed by second field.
                                     #   Initial value is zero.
    }
    END {                            # Bit that runs when whole file is processed.
        for (key in sum)             # For each key like cc=us:
            print sum[key] ", " key  #   output sum and key.
    }
Here's a sample run on my box:

    pax$ echo;echo '56, cc=dk
    49, cc=us
    34, cc=gb
    32, cc=de
    32, cc=nz
    31, cc=dk
    31, cc=gb
    31, cc=gb' | awk '{s[$2]+=$1}END{for(k in s)print s[k]", "k}'

    32, cc=de
    96, cc=gb
    32, cc=nz
    49, cc=us
    87, cc=dk
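Note that awk does not guarantee any particular order for `for (k in s)`, which is why the sample output is not sorted. If the totals should come out largest first, as in the question's example result, one option is to pipe the output through `sort`. The following is only a sketch: the function name `sum_by_key` is made up for illustration, and the tie-break on the key (second sort field) is an assumption about the desired order.

```shell
# Hypothetical helper: sum column 1 per distinct value of column 2,
# then order by sum (descending) and by key (ascending) for ties.
# Reads the named files, or standard input if none are given.
sum_by_key() {
    awk '{ sum[$2] += $1 }
         END { for (key in sum) print sum[key] ", " key }' "$@" |
        sort -t, -k1,1nr -k2,2
}

# Usage with the sample data from the question:
printf '%s\n' '56, cc=dk' '49, cc=us' '34, cc=gb' '32, cc=de' \
              '32, cc=nz' '31, cc=dk' '31, cc=gb' '31, cc=gb' | sum_by_key
# 96, cc=gb
# 87, cc=dk
# 49, cc=us
# 32, cc=de
# 32, cc=nz
```

Sorting outside awk keeps the script portable, since in-awk sorting functions are not available in every awk implementation.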
This works despite the fact that the first column is of the form 999, (note the comma at the end), because awk, when evaluating strings in a numeric context, uses only the prefix that is valid in that context. Hence 45xyzzy would become 45 and, more importantly, 49, becomes 49.
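A quick way to check that conversion yourself is to force a field into a numeric context by adding zero to it. This is a minimal sketch; the function name `numeric_prefix` is invented for the example and is not part of awk.

```shell
# Hypothetical helper: print the number awk gets from a string
# when the string is used in a numeric context.
numeric_prefix() {
    printf '%s\n' "$1" | awk '{ print $1 + 0 }'
}

numeric_prefix '45xyzzy'   # prints 45
numeric_prefix '49,'       # prints 49
numeric_prefix 'xyzzy'     # prints 0 (no numeric prefix at all)
```

The last case shows the other side of this behaviour: a field with no leading number silently counts as zero, so malformed rows will not cause an error in the summing script.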