csv - Bash - adding values in row based on column
The 2nd column in the csv file has duplicates. I want to add up the values from column 1 that are associated with each duplicate.
Example csv:

    56, cc=dk
    49, cc=us
    34, cc=gb
    32, cc=de
    32, cc=nz
    31, cc=dk
    31, cc=gb
    31, cc=gb

Example result:

    96, cc=gb   # 96 = 34+31+31
    87, cc=dk   # 87 = 56+31
    49, cc=us
    32, cc=de
    32, cc=nz
You can use associative arrays in awk:

    awk '{s[$2] += $1} END {for (k in s) print s[k] ", " k}' infile

Expanding on that for readability, and using sum/key rather than s/k:
    {                                 # For each line:
        sum[$2] += $1                 #   add the first field to an accumulator
                                      #   indexed by the second field
                                      #   (the initial value is zero).
    }
    END {                             # This bit runs once the whole file has been processed.
        for (key in sum)              # For each key, such as cc=us,
            print sum[key] ", " key   #   output the sum and the key.
    }
Here's a sample run on my box:

    pax$ echo '56, cc=dk
    49, cc=us
    34, cc=gb
    32, cc=de
    32, cc=nz
    31, cc=dk
    31, cc=gb
    31, cc=gb' | awk '{s[$2] += $1} END {for (k in s) print s[k] ", " k}'
    32, cc=de
    96, cc=gb
    32, cc=nz
    49, cc=us
    87, cc=dk

This works despite the fact that the first column is of the form 999, (note the comma at the end), because awk, when evaluating a string in a numeric context, uses the longest prefix that is valid in that context. Hence 45xyzzy becomes 45 and, more importantly, 49, becomes 49.