python - Pandas MultiIndex and PyTables... separate indexes or one concatenated index?
What is the structure of a pandas MultiIndex in HDF5 when a DataFrame is saved to HDF5 through PyTables? Is each of the parts a separate index, or is there one concatenated index?
It is stored like df.reset_index(), except that the index columns are automatically made data columns (meaning that you can select on them).
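To make the comparison with df.reset_index() concrete, here is a minimal in-memory sketch (no HDF5 file involved) of the flattened layout. It only illustrates which columns end up side by side; the variable names `flat` and `restored` are made up for this example, and the on-disk table additionally carries its own row "index" column.

```python
import numpy as np
import pandas as pd

# Same shape of frame as in the answer: a two-level MultiIndex, one value column.
index = pd.MultiIndex.from_product(
    [range(3), list("abc")], names=["first", "second"]
)
df = pd.DataFrame({"a": np.random.randn(9)}, index=index)

# reset_index() turns each index level into an ordinary column,
# which is roughly the layout of the stored table.
flat = df.reset_index()
print(flat.columns.tolist())
print(flat.shape)

# Going back the other way: set the level columns as the index again.
restored = flat.set_index(["first", "second"])
print(list(restored.index.names))
```

The three columns here line up with the ncols->3 that the store reports for the table, with the two level columns being the ones listed under dc.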
In [1]: df = pd.DataFrame({'a' : np.random.randn(9)},index=pd.MultiIndex.from_product([range(3),list('abc')],names=['first','second']))

In [2]: df
Out[2]:
                     a
first second
0     a      -1.249058
      b      -0.674645
      c      -0.000458
1     a       0.455390
      b      -1.693221
      c       1.245806
2     a       0.337478
      b       0.672525
      c       0.160914

In [3]: store = pd.HDFStore('test.h5',mode='w')

In [4]: store.append('df',df)

In [5]: store
Out[5]:
<class 'pandas.io.pytables.HDFStore'>
File path: test.h5
/df            frame_table  (typ->appendable_multi,nrows->9,ncols->3,indexers->[index],dc->[second,first])
Here's what the actual structure looks like.
In [7]: store.get_storer('df').table
Out[7]:
/df/table (Table(9,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "values_block_0": Float64Col(shape=(1,), dflt=0.0, pos=1),
  "second": StringCol(itemsize=1, shape=(), dflt='', pos=2),
  "first": Int64Col(shape=(), dflt=0, pos=3)}
  byteorder := 'little'
  chunkshape := (2621,)
  autoindex := True
  colindexes := {
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "second": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "first": Index(6, medium, shuffle, zlib(1)).is_csi=False}
You can select on the levels by name.
In [9]: store.select('df',where='second="b"')
Out[9]:
                     a
first second
0     b      -0.674645
1     b      -1.693221
2     b       0.672525

In [10]: store.select('df',where='second="b" & first=2')
Out[10]:
                     a
first second
2     b       0.672525