IF THEN on a Dataframe in R with LAG


I have a data frame with multiple columns, two of which particularly interest me. column1 contains values that are either 0 or a number (>0), and column2 contains numbers as well.

I want to create 21 new columns containing new information from column2, given column1.

So when column1 is positive (not 0), I want the first new column, column01, to take the value of column2 from 10 rows back, column02 from 9 rows back, and so on; column11 is the exact same column2 value, and so on up to column21, which is the value 10 rows forward.

For example:

    column1  column2  columns01  columns02  ..  columns11  ..  columns20  columns21
    0        5        0          0              0              0          0
    0        2        0          0              0              0          0
    0        0        0          0              0              0          0
    1        3        0          0              3              5          4
    0        10       0          0              0              0          0
    0        83       0          0              0              0          0
    0        2        0          0              0              0          0
    0        5        0          0              0              0          0
    0        4        0          0              0              0          0
    1        8        0          5              8              5          3
    0        6        0          0              0              0          0
    0        5        0          0              0              0          0
    0        55       0          0              0              0          0
    0        4        0          0              0              0          0
    2        3        10         83             3              5          0
    0        2        0          0              0              0          0
    0        3        0          0              0              0          0
    0        4        0          0              0              0          0
    0        5        0          0              0              0          0
    0        3        0          0              0              0          0
    1        22       6          5              22             0          0
    0        12       0          0              0              0          0
    0        0        0          0              0              0          0
    0        5        0          0              0              0          0

I hope this makes sense and that someone can help.

Here's one way, using the newly implemented shift() function from data.table v1.9.5:

    require(data.table) ## v1.9.5+
    setDT(dat)                                                      ## (1)
    cols = paste0("cols", sprintf("%.2d", 1:21))                    ## (2)
    dat[, cols[1:10] := shift(column2, 10:1, fill=0)]               ## (3)
    dat[, cols[11] := column2]                                      ## (4)
    dat[, cols[12:21] := shift(column2, 1:10, fill=0, type="lead")] ## (5)
    dat[column1 == 0, (cols) := 0]                                  ## (6)

  1. Assuming dat is your data.frame, setDT(dat) converts it to a data.table by reference (the data is not copied physically to a new location in memory, for efficiency).

  2. Generate the column names.

  3. Generate lagged vectors of column2 for periods 10:1, and assign them to the first 10 columns.

  4. The 11th column = column2.

  5. Generate leading vectors of column2 for periods 1:10, and assign them to the last 10 columns.

  6. Get the indices of the rows where column1 == 0, and replace/reset the newly generated columns at those indices with 0.
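
To check the steps above by hand, here is a small sketch that runs the same sequence with a window of 2 rows on each side instead of 10. The toy data and the names toy, k, lag_cols, mid_col and lead_cols are made up for illustration. The column names are wrapped in parentheses on the left of := so that they are read as variables.

```r
library(data.table)

# Toy data, made up for illustration. k = 2 rows on each side instead of 10.
toy <- data.table(column1 = c(0, 1, 0, 0, 2, 0),
                  column2 = c(5, 2, 7, 3, 9, 4))
k <- 2
cols <- paste0("cols", sprintf("%.2d", 1:(2 * k + 1)))
lag_cols  <- cols[1:k]
mid_col   <- cols[k + 1]
lead_cols <- cols[(k + 2):(2 * k + 1)]

toy[, (lag_cols)  := shift(column2, k:1, fill = 0)]                 # 2 back, then 1 back
toy[, (mid_col)   := column2]                                       # same row
toy[, (lead_cols) := shift(column2, 1:k, fill = 0, type = "lead")]  # 1 forward, then 2 forward
toy[column1 == 0, (cols) := 0]                                      # reset rows where column1 is 0

print(toy)
```

Row 2 (column1 = 1) ends up with 0, 5, 2, 7, 3 in the new columns: there is no row two steps back, so fill = 0 is used there. Row 5 (column1 = 2) ends up with 7, 3, 9, 4, 0 for the same reason at the other end.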

Use setDF(dat) if you want a data.frame back.
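
As a quick illustration of the round trip, the sketch below converts a made-up two-row data.frame (the name df is only for this example) to a data.table and back. Neither call needs an assignment, because both work by reference.

```r
library(data.table)

# Made-up data, for illustration only
df <- data.frame(column1 = c(0, 1), column2 = c(5, 3))

setDT(df)                  # df is now a data.table, converted in place
was_dt <- is.data.table(df)

setDF(df)                  # and back to a plain data.frame, again in place
print(class(df))
```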

You can wrap this in a function that takes the values -10:10 and chooses type="lag" or type="lead" accordingly, depending on whether the values are negative or positive. I'll leave that to you.
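
One possible shape for such a wrapper is sketched below. The function name add_window_cols and its arguments offsets and prefix are hypothetical, not part of data.table or of the answer above. It assumes the input has columns named column1 and column2, works on a copy via as.data.table() rather than modifying the input by reference, and handles an offset of 0 by copying column2 directly.

```r
library(data.table)

# Hypothetical helper; name and arguments are illustrative only.
# offsets: negative = rows back (lag), 0 = same row, positive = rows forward (lead).
add_window_cols <- function(dat, offsets = -10:10, prefix = "cols") {
  dat <- as.data.table(dat)  # work on a copy; the caller's object is left untouched
  cols <- paste0(prefix, sprintf("%.2d", seq_along(offsets)))
  for (i in seq_along(offsets)) {
    n <- offsets[i]
    value <- if (n == 0) {
      dat[["column2"]]
    } else {
      shift(dat[["column2"]], abs(n), fill = 0,
            type = if (n < 0) "lag" else "lead")
    }
    set(dat, j = cols[i], value = value)  # add the new column by reference
  }
  dat[column1 == 0, (cols) := 0]          # reset rows where column1 is 0
  dat[]
}

# Usage on made-up data with a window of 2 rows each side
res <- add_window_cols(
  data.frame(column1 = c(0, 1, 0, 0, 2, 0),
             column2 = c(5, 2, 7, 3, 9, 4)),
  offsets = -2:2
)
print(res)
```

Looping over the offsets one at a time is slower than the vectorised shift() calls above, but it keeps the lag/lead choice for each offset explicit.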

