csv - Bash - adding values in a row based on a column


The 2nd column in my csv file has duplicates. I want to add up the associated values in column 1 based on those duplicates.

example csv:

56,  cc=dk
49,  cc=us
34,  cc=gb
32,  cc=de
32,  cc=nz
31,  cc=dk
31,  cc=gb
31,  cc=gb

example result:

96,  cc=gb    # 96 = 34+31+31
87,  cc=dk    # 87 = 56+31
32,  cc=de
32,  cc=nz

You can use associative arrays in awk:

awk '{s[$2] += $1} END {for (k in s) print s[k]", ", k}' infile

Expanding that for readability, and using sum/key rather than s/k:

{                                 # For each line:
    sum[$2] += $1                 #   add the first field to an accumulator,
                                  #   indexed by the second field
                                  #   (the initial value is zero).
}
END {                             # This bit runs once the whole file is processed:
    for (key in sum)              #   for each key (like cc=us),
        print sum[key] ", " key   #     output the sum and the key.
}
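Since the question mentions Bash: the same accumulation can be done in pure bash 4+ with a `declare -A` associative array. A sketch, assuming the same whitespace-separated two-field input in a file named `infile` (as in the awk one-liner):

```shell
#!/usr/bin/env bash
# Sum column 1 grouped by column 2 using a bash associative array (bash 4+).
declare -A sum
while read -r value key; do
  value=${value%,}                # strip the trailing comma from the number
  (( sum[$key] += value ))        # accumulate per key, starting from zero
done < infile
for key in "${!sum[@]}"; do
  printf '%s, %s\n' "${sum[$key]}" "$key"
done
```

Like awk's `for (key in array)`, the order of `"${!sum[@]}"` is unspecified, so pipe through `sort` if you need a particular ordering.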

Here's a sample run on my box:

pax$ echo '56,  cc=dk
49,  cc=us
34,  cc=gb
32,  cc=de
32,  cc=nz
31,  cc=dk
31,  cc=gb
31,  cc=gb' | awk '{s[$2]+=$1} END {for (k in s) print s[k]", "k}'
32, cc=de
96, cc=gb
32, cc=nz
49, cc=us
87, cc=dk
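Note that `for (k in s)` visits keys in an unspecified order, which is why the sample output above isn't sorted. To get output in descending order like the example result, you can pipe through `sort -rn` (a sketch, reusing the same input and one-liner):

```shell
printf '56,  cc=dk\n49,  cc=us\n34,  cc=gb\n32,  cc=de\n32,  cc=nz\n31,  cc=dk\n31,  cc=gb\n31,  cc=gb\n' |
  awk '{s[$2]+=$1} END {for (k in s) print s[k]", "k}' |
  sort -rn    # numeric sort on the leading sum, descending
```

`sort -n` reads the leading numeric prefix of each line, so the trailing comma in `96,` doesn't get in the way.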

This works despite the fact that the first column is of the form 999, (note the comma at the end) because awk, when evaluating a string in a numeric context, uses the longest prefix that is valid in that context. Hence 45xyzzy becomes 45 and, more importantly, 49, becomes 49.
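That coercion is easy to check directly; adding 0 forces numeric context:

```shell
echo '49,'     | awk '{print $1 + 0}'   # 49 -- trailing comma ignored
echo '45xyzzy' | awk '{print $1 + 0}'   # 45 -- longest numeric prefix
echo 'xyzzy'   | awk '{print $1 + 0}'   # 0  -- no numeric prefix at all
```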

