making multi-index in pandas dataframes in Python? -

- July 15, 2011

i have data set there matrix of numeric values indexed time variable. each matrix numpy array (that can converted dataframe columns corresponding columns of matrix). if have these matrices how can make them single dataframe each matrix has time index? specifically:

# time t1 d1 = pandas.dataframe({"a": [1,2,3,4], "b":[10,20,30,40]}) # time t2 d2 = pandas.dataframe({"a": [10,20,30,40], "b": [1,2,5,6]}) # time t3  d3 = ...

i want make index called "time" index these dataframes, , aggregate values columns "a" , "b" across time index. how can in pandas?

my attempt:

d=pandas.dataframe([d1,d2],index=(0, 1),columns=["time"])

update: unutbu's solution adding 2 hierarchical columns is:

c = pd.concat([d1, d2], keys=[('t1', 'p1'), ('t2', 'p2')], names=['time', 'position'])

my final question how access resulting structure? example how do vectorized operations across time, or across position? eg take average of rows each value of time.

also, how compare encoding time , position each dataframe , using groupby? in other words when use levels versus flat columns grouped? here's alternative solution using flat dataframe groupby:

d1["time"] = 1 d1["position"] = "x" d2["time"] = 2 d2["position"] = "y" c = pandas.concat([d1, d2]) # take mean time values c.groupby("time").apply(lambda x: np.mean(x, axis=1))

given

import pandas pd  d1 = pd.dataframe({"a": [1,2,3,4], "b":[10,20,30,40]}) d2 = pd.dataframe({"a": [10,20,30,40], "b": [1,2,5,6]})

then pd.concat([d1, d2], keys=['t1', 't2']) returns:

in [177]: pd.concat([d1, d2], keys=['t1', 't2']) out[177]:           b t1 0   1  10    1   2  20    2   3  30    3   4  40 t2 0  10   1    1  20   2    2  30   5    3  40   6

if wish add more 1 level new multiindex, can instead pass list of tuples keys parameter:

in [237]: pd.concat([d1, d2], keys=[('t1', 'p1'), ('t2', 'p2')], names=['time', 'position']) out[237]:                      b time position           t1   p1       0   1  10               1   2  20               2   3  30               3   4  40 t2   p2       0  10   1               1  20   2               2  30   5               3  40   6

note, important here keys receives list of tuples, rather list of lists.

Search This Blog

Soju

making multi-index in pandas dataframes in Python? -

Comments

Post a Comment

Popular posts from this blog

python - TypeError: start must be a integer -

c# - DevExpress RepositoryItemComboBox BackColor property ignored -

django - Creating multiple model instances in DRF3 -