SAS: Summing across rows conditional on date in each row -
i have information shown in first 4 columns in table below , add column calls_previous_3_days containing sum of calls each custid each area previous 3 dates.
i.e., if custumer made call support on 17jan2015 new variable show sum of number of calls customer made support during period 14jan2015-16jan2015.
how calculate sum in column calls_previous_3_days dependent on custid, area , date?
custid area date calls calls_previous_3_days 3137 support 05jan2015 1 0 3137 support 14jan2015 4 0 3137 support 16jan2015 1 4 3137 support 17jan2015 1 5 3137 support 20jan2015 2 1 3137 support 22jan2015 1 2 5225 support 26jan2015 1 0 5225 support 27jan2015 1 1 5225 support 28jan2015 1 2 5225 sales 14feb2015 1 0 5225 sales 15feb2015 1 1 5225 sales 22feb2015 1 0
you can achieve arrays, storing last 3 values , dates , summing dates satisfy criteria.
proc sort data = have; custid area;; run; proc print; data want; set have; custid area; /* create temporary array (which automatically retains) */ array hist{3,2} _temporary_; /* initialise array values if starting new group */ if first.area call missing(of hist[*]); /* sum values in array satisfy condition */ callsp3 = sum( 0, (sum(date, - hist[1,2]) <= 3) * hist[1,1], (sum(date, - hist[2,2]) <= 3) * hist[2,1], (sum(date, - hist[3,2]) <= 3) * hist[3,1] ); /* store current value/date in array , shift old values down */ hist[1,1] = hist[2,1]; hist[1,2] = hist[2,2]; hist[2,1] = hist[3,1]; hist[2,2] = hist[3,2]; hist[3,1] = calls; hist[3,2] = date; run;
there (as sas) several ways approach problem. investigate lag()
functions or use proc sql
self join data on
clause specify condition. prefer array approach lag()
has gotchas , self joining slower. however, if want different or longer windows array approach can become unwieldy due length of code. can mitigated if confident in macro language, may better investigate different approach.
if have sas/ets licensed should able achieve proc expand
. cannot confirm want though don't have licence.
proc expand data = have out = want = day = day; custid area; id date; convert calls = callsp3 / method = none transformout = (movsum 4); run; data want; set want (where = (calls ~= .)); callsp3 = callsp3 - calls; run;
Comments
Post a Comment