IF THEN on a data frame in R with LAG
I have a data frame with multiple columns, two of which are of particular interest to me. column1 contains values that are either 0 or a number greater than 0; column2 contains numbers as well.
I want to create 21 new columns containing information from column2, conditional on column1.
So when column1 is positive (not 0), I want the first new column, column01, to take the value of column2 from 10 rows back, column02 the value from 9 rows back, and so on; column11 should be exactly the same as the column2 value, and column21 should take the value from 10 rows forward.
For example:

column1  column2  columns01  columns02  ..  columns11  ..  columns20  columns21
      0        5          0          0  ..          0  ..          0          0
      0        2          0          0  ..          0  ..          0          0
      0        0          0          0  ..          0  ..          0          0
      1        3          0          0  ..          3  ..          5          4
      0       10          0          0  ..          0  ..          0          0
      0       83          0          0  ..          0  ..          0          0
      0        2          0          0  ..          0  ..          0          0
      0        5          0          0  ..          0  ..          0          0
      0        4          0          0  ..          0  ..          0          0
      1        8          0          5  ..          8  ..          5          3
      0        6          0          0  ..          0  ..          0          0
      0        5          0          0  ..          0  ..          0          0
      0       55          0          0  ..          0  ..          0          0
      0        4          0          0  ..          0  ..          0          0
      2        3         10         83  ..          3  ..          5          0
      0        2          0          0  ..          0  ..          0          0
      0        3          0          0  ..          0  ..          0          0
      0        4          0          0  ..          0  ..          0          0
      0        5          0          0  ..          0  ..          0          0
      0        3          0          0  ..          0  ..          0          0
      1       22          6          5  ..         22  ..          0          0
      0       12          0          0  ..          0  ..          0          0
      0        0          0          0  ..          0  ..          0          0
      0        5          0          0  ..          0  ..          0          0
I hope this makes sense and that someone can help.
Here's one way, using the newly implemented shift() function from data.table v1.9.5:
    require(data.table) ## v1.9.5+
    setDT(dat)                                                         ## (1)
    cols = paste0("cols", sprintf("%.2d", 1:21))                       ## (2)
    dat[, cols[1:10] := shift(column2, 10:1, fill=0)]                  ## (3)
    dat[, cols[11] := column2]                                         ## (4)
    dat[, cols[12:21] := shift(column2, 1:10, fill=0, type="lead")]    ## (5)
    dat[column1 == 0, (cols) := 0]                                     ## (6)
1. Assuming dat is a data.frame, setDT(dat) converts it to a data.table by reference (the data is not physically copied to a new location in memory, for efficiency).
2. Generate the column names.
3. Generate lagged vectors of column2 for periods 10:1 and assign them to the first 10 columns.
4. The 11th column is equal to column2.
5. Generate leading vectors of column2 for periods 1:10 and assign them to the last 10 columns.
6. Get the indices of the rows where column1 == 0, and reset the newly generated columns to 0 at those indices.
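To see what steps (3) and (5) produce, here is a minimal sketch of shift() on a short vector. The values in x are made up for illustration and are not taken from the question's data. When n is a vector, shift() returns a list with one shifted vector per period, which is what allows several columns to be assigned in a single `:=` call.

```r
library(data.table)

# Illustrative values only (an assumption, not the asker's data)
x <- c(5, 2, 3, 10, 83)

# Lag by 2 and by 1; positions with no earlier value are filled with 0
lagged <- shift(x, n = 2:1, fill = 0)
lagged[[1]]  # 0 0 5 2 3   (2 rows back)
lagged[[2]]  # 0 5 2 3 10  (1 row back)

# Lead by 1 and by 2; positions with no later value are filled with 0
led <- shift(x, n = 1:2, fill = 0, type = "lead")
led[[1]]     # 2 3 10 83 0  (1 row forward)
led[[2]]     # 3 10 83 0 0  (2 rows forward)
```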
Use setDF(dat) if you want a data.frame back.
You can wrap this in a function that takes the values -10:10 and chooses type="lag" or type="lead" accordingly, depending on whether each value is negative or positive. I'll leave that to you.
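One possible shape for such a wrapper is sketched below. The function name add_window_cols and its arguments offsets and prefix are hypothetical, chosen only for illustration; the sketch assumes the table has columns named column1 and column2 as in the question, and it modifies the data.table by reference.

```r
library(data.table)

# Hypothetical helper (name and arguments are assumptions):
# adds one column per offset, where a negative offset means rows back,
# 0 means the same row, and a positive offset means rows forward.
add_window_cols <- function(dt, offsets = -10:10, prefix = "cols") {
  new_cols <- paste0(prefix, sprintf("%.2d", seq_along(offsets)))
  for (i in seq_along(offsets)) {
    k <- offsets[i]
    vals <- if (k == 0) {
      dt[["column2"]]
    } else {
      shift(dt[["column2"]], n = abs(k), fill = 0,
            type = if (k < 0) "lag" else "lead")
    }
    set(dt, j = new_cols[i], value = vals)  # add the column by reference
  }
  # Reset the new columns wherever column1 is 0
  dt[column1 == 0, (new_cols) := 0]
  invisible(dt)
}

# Small made-up example with a window of 2 rows on either side
dat <- data.table(column1 = c(0, 0, 1, 0, 0),
                  column2 = c(5, 2, 3, 10, 83))
add_window_cols(dat, offsets = -2:2)
# Row 3 (column1 == 1) now holds 5, 2, 3, 10, 83 in cols01..cols05;
# all other rows hold 0 in the new columns.
```

Calling add_window_cols(dat) with the default offsets should give the same 21 columns as the step-by-step version, at the cost of one shift() call per column instead of one per direction.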