csv - Using Unstack in Python
I am trying to unstack a column in Python, but it isn't quite doing what I am expecting. My table (called df) looks similar to this:

    station_id  year  day1  day2
    210018      1916     4     7
                1917     3     9
    256700      1916   NaN     8
                1917     6     9

I want to unstack year so that all the days for each year, per station, are in one row. The 2 days from 1916 would come first, followed by the 2 days from 1917, for each of stations 210018 and 256700.
An example would look like this:

    station_id  1916        1917
    210018         4     7     3     9
    256700       NaN     8     6     9

I am trying to use this code:

    df2 = df.unstack(level='year')
    df2.columns = df2.columns.swaplevel(0,1)
    df2 = df2.sort(axis=1)

I get an error that says AttributeError: 'Series' object has no attribute 'columns'.
Any help is appreciated.

You need to make year an index level before you call unstack:
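    try:
        # Python 2
        from cStringIO import StringIO
    except ImportError:
        # Python 3
        from io import StringIO
    import pandas as pd

    text = '''\
    station_id year day1 day2
    210018 1916 4 7
    210018 1917 3 9
    256700 1916 NaN 8
    256700 1917 6 9'''

    df = pd.read_table(StringIO(text), sep=r'\s+')
    # Put both station_id and year in the index so unstack can find 'year'
    df = df.set_index(['station_id', 'year'])
    df2 = df.unstack(level='year')
    # Move year to the top column level, then group the columns by year
    df2.columns = df2.columns.swaplevel(0,1)
    # sort_index(axis=1) replaces df2.sort(axis=1), which later pandas versions removed
    df2 = df2.sort_index(axis=1)
    print(df2)

yields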
    year        1916       1917
                day1 day2  day1 day2
    station_id
    210018         4    7     3    9
    256700       NaN    8     6    9

whereas, if year is a column, and not an index level, then
    df = pd.read_table(StringIO(text), sep=r'\s+')
    # Only station_id is in the index here; year is still an ordinary column
    df = df.set_index(['station_id'])
    df2 = df.unstack(level='year')
    df2.columns = df2.columns.swaplevel(0,1)
    df2 = df2.sort_index(axis=1)

leads to AttributeError: 'Series' object has no attribute 'columns'.
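If the DataFrame has already been indexed by station_id alone, one possible repair is to append year to the existing index rather than rebuild it. The following is a minimal sketch, assuming Python 3 and the same sample data as above; it checks df.index.names first and only calls set_index when year is missing from the index.

```python
import io

import pandas as pd

text = """\
station_id year day1 day2
210018 1916 4 7
210018 1917 3 9
256700 1916 NaN 8
256700 1917 6 9"""

# Start from the failing case: only station_id is in the index.
df = pd.read_csv(io.StringIO(text), sep=r"\s+").set_index("station_id")
print(df.index.names)  # 'year' is not listed, so it is still a column

# Append 'year' to the existing index instead of rebuilding the index.
if "year" not in df.index.names:
    df = df.set_index("year", append=True)

df2 = df.unstack(level="year")
df2.columns = df2.columns.swaplevel(0, 1)  # year becomes the top column level
df2 = df2.sort_index(axis=1)               # group the day columns under each year
print(df2)
```

With year in the index, unstack returns a DataFrame, so the assignment to df2.columns no longer fails.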
The level='year' is ignored in df.unstack(level='year') when df does not have an index level named year (or even, say, blah), at least in the pandas version used here:

    In [102]: df
    Out[102]:
                year  day1  day2
    station_id
    210018      1916     4     7
    210018      1917     3     9
    256700      1916   NaN     8
    256700      1917     6     9

    In [103]: df.unstack(level='blah')
    Out[103]:
          station_id
    year  210018        1916
          210018        1917
          256700        1916
          256700        1917
    day1  210018           4
          210018           3
          256700         NaN
          256700           6
    day2  210018           7
          210018           9
          256700           8
          256700           9
    dtype: float64

This is the source of the surprising error: unstack returns a Series here, and a Series has no columns attribute.
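Because behaviour with an unknown level name may differ between pandas versions, an explicit check can make the failure easier to diagnose. The sketch below is illustrative only: unstack_named_level is a hypothetical helper, not part of pandas, and it assumes Python 3 and the same sample data as above.

```python
import io

import pandas as pd


def unstack_named_level(frame, level):
    """Unstack `level`, failing early if it is not an index level.

    Hypothetical helper for illustration; not part of pandas.
    """
    if level not in frame.index.names:
        raise KeyError(
            "{!r} is not an index level (index levels: {}); "
            "call set_index first".format(level, list(frame.index.names))
        )
    return frame.unstack(level=level)


text = """\
station_id year day1 day2
210018 1916 4 7
210018 1917 3 9
256700 1916 NaN 8
256700 1917 6 9"""

df = pd.read_csv(io.StringIO(text), sep=r"\s+")

# year is still a column here, so the helper refuses to unstack.
by_station = df.set_index("station_id")
try:
    unstack_named_level(by_station, "year")
except KeyError as err:
    print("refused:", err)

# With year in the index, the helper simply delegates to DataFrame.unstack.
wide = unstack_named_level(df.set_index(["station_id", "year"]), "year")
print(type(wide).__name__)  # DataFrame
```

The check turns a confusing AttributeError several lines later into an immediate error that names the missing index level.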