Apache Pig - store files based on date column


Please help me. I have the scenario below. Input file:

id name time-stamp
1234 kiran 18-mar-2015 01:02:31
1234 kiran 18-mar-2015 01:02:31
1234 kiran 19-mar-2015 01:02:31
1234 kiran 18-mar-2015 11:02:31
1234 kiran 20-mar-2015 01:02:00
1234 kiran 11-mar-2015 21:12:31
1234 kiran 18-mar-2015 01:02:31
1234 kiran 30-mar-2015 01:02:31
1234 kiran 22-mar-2015 01:11:00
1234 kiran 30-mar-2015 01:02:31
1234 kiran 19-mar-2015 01:02:00

Now I need to write output files based on the dates in the time-stamp column. The output should be:

user/username/date/part-m-000000  

Here date is a variable. One folder name should be:

user/username/18-mar-2015/part-m-000000  

The above file contains only the values for a single date:

1234 kiran 18-mar-2015 01:02:31
1234 kiran 18-mar-2015 01:02:31
1234 kiran 18-mar-2015 11:02:31
1234 kiran 18-mar-2015 01:02:31

Another folder name should be:

user/username/19-mar-2015/part-m-000000  

The above file contains only the values for a single date:

1234 kiran 19-mar-2015 01:02:31
1234 kiran 19-mar-2015 01:02:00

Another folder name should be:

user/username/20-mar-2015/part-m-000000  

The above file contains only the values for a single date:

1234 kiran 20-mar-2015 01:02:00 

Another folder name should be:

user/username/22-mar-2015/part-m-000000  

The above file contains only the values for a single date:

1234 kiran 22-mar-2015 01:11:00

Another folder name should be:

user/username/30-mar-2015/part-m-000000  

The above file contains only the values for a single date:

1234 kiran 30-mar-2015 01:02:31
1234 kiran 30-mar-2015 01:02:31

Please help me.

Thank you, sree

The steps below should help:

  1. Use the date functions to convert the time-stamp to the required date format.
  2. Group by the date.
  3. Flatten the group.
  4. Save the result of #3 using org.apache.pig.piggybank.storage.MultiStorage.
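
The steps above could be sketched in Pig Latin roughly as follows. This is only a minimal sketch under several assumptions that the question does not confirm: the input is tab-delimited with three columns (id, name, time-stamp), the time-stamp always starts with an 11-character date such as 18-mar-2015, and the paths, the piggybank.jar location, and all relation and field names are hypothetical. The date is taken with SUBSTRING so the folder name keeps the original lowercase form; Pig's date functions (ToDate and ToString) could be used instead if the date needs to be reformatted.

```pig
-- Sketch only: jar location, paths, and names are assumptions.
REGISTER piggybank.jar;

-- Assumes tab-delimited input with columns: id, name, time-stamp.
raw = LOAD '/user/username/input'
      USING PigStorage('\t')
      AS (id:chararray, name:chararray, ts:chararray);

-- Drop the header row ("id name time-stamp"), if the file has one.
rows = FILTER raw BY id != 'id';

-- Step 1: derive the date part, e.g. '18-mar-2015', as a fourth field.
with_date = FOREACH rows GENERATE id, name, ts, SUBSTRING(ts, 0, 11) AS dt;

-- Step 2: group by the date.
by_date = GROUP with_date BY dt;

-- Step 3: flatten the group back into plain rows.
flat = FOREACH by_date GENERATE FLATTEN(with_date);

-- Step 4: MultiStorage writes one sub-directory per distinct value of the
-- split field. The arguments are: parent path, index of the split field
-- (0-based, so '3' is dt), compression, and field delimiter.
STORE flat INTO '/user/username/out'
      USING org.apache.pig.piggybank.storage.MultiStorage(
          '/user/username/out', '3', 'none', '\t');
```

With this sketch the derived date column is also written as the last field of every record, and the files that MultiStorage creates inside each date folder may not follow the part-m-000000 naming shown in the question.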
