csv - Bash - adding values in rows based on a column


The 2nd column in my CSV file contains duplicate values. I want to add up the associated column 1 values for each duplicate.

Example CSV:

56,  cc=dk
49,  cc=us
34,  cc=gb
32,  cc=de
32,  cc=nz
31,  cc=dk
31,  cc=gb
31,  cc=gb

Example result:

96,  cc=gb    # 96 = 34+31+31
87,  cc=dk    # 87 = 56+31
32,  cc=de
32,  cc=nz

You can use associative arrays in awk:

awk '{s[$2]+=$1}END{for(k in s)print s[k]", ",k}' infile
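One caveat worth noting: `for (k in s)` visits keys in an unspecified order, so the lines come out in whatever order awk's hash table yields. If you want the largest sums first, as in the example result, one option (a sketch, using the same placeholder file name `infile`) is to pipe the output through `sort`:

```shell
# Sum column 1 by the key in column 2, then sort numerically, largest first.
# awk's "for (k in s)" iteration order is unspecified, hence the sort -rn.
awk '{s[$2]+=$1} END {for (k in s) print s[k]", "k}' infile | sort -rn
```

`sort -rn` compares by the leading number on each line, in reverse, which is enough here because the sum is the first field.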

Expanded for readability, using sum/key rather than s/k:

{                               # This runs for each line:
    sum[$2] += $1               #   add the first field to an accumulator
                                #   indexed by the second field
                                #   (the initial value is zero).
}
END {                           # This runs once the whole file is processed:
    for (key in sum)            #   for each key like cc=us,
        print sum[key] ", " key #   output the sum and the key.
}

Here's a sample run on my box:

pax$ echo '56,  cc=dk
49,  cc=us
34,  cc=gb
32,  cc=de
32,  cc=nz
31,  cc=dk
31,  cc=gb
31,  cc=gb' | awk '{s[$2]+=$1}END{for(k in s)print s[k]", "k}'
32, cc=de
96, cc=gb
32, cc=nz
49, cc=us
87, cc=dk

This works despite the fact that the first column is of the form 999, (note the comma at the end), because awk, when evaluating a string in a numeric context, uses the longest prefix that is valid in that context. Hence 45xyzzy becomes 45 and, more importantly, 49, becomes 49.
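You can check this coercion directly in awk (the values below are just illustrative):

```shell
# In a numeric context awk takes the leading numeric prefix of a string:
# "49," becomes 49, "45xyzzy" becomes 45, and "xyzzy" (no prefix) becomes 0.
awk 'BEGIN { print "49," + 0, "45xyzzy" + 0, "xyzzy" + 0 }'
# → 49 45 0
```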

