2016-06-28 5 views
2

Storing multiple objects in an HDFStore group

I want to store multiple objects in an HDFStore, but I want to organize them by grouping. Something along the lines of:

import pandas as pd 
my_store = pd.HDFStore('my_local_store.h5') 
my_store._handle.createGroup('/', 'data_source_1') # this works, but I'm not sure what it does 
my_store['/data_source_1']['part-1'] = pd.DataFrame({'b':[1,2,9,2,3,5,2,5]}) # this does not work 
my_store['/data_source_1']['part-2'] = pd.DataFrame({'b':[3,8,4,2,5,5,6,1]}) # this does not work either 

Answer

2

Try this:

my_store['/data_source_1/part-1'] = ... 
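
For completeness, here is a minimal runnable sketch of the same idea applied to the question's data. The temporary file path and the underscore names `part_1` / `part_2` are my own choices: hyphenated names like `part-1` also work as keys, but PyTables emits a NaturalNameWarning for them.

```python
import os
import tempfile

import pandas as pd

# Hypothetical location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "my_local_store.h5")

with pd.HDFStore(path) as my_store:
    # The key is a full path, so the intermediate group /data_source_1
    # is created automatically on first write.
    my_store["/data_source_1/part_1"] = pd.DataFrame({"b": [1, 2, 9, 2, 3, 5, 2, 5]})
    my_store["/data_source_1/part_2"] = pd.DataFrame({"b": [3, 8, 4, 2, 5, 5, 6, 1]})

    keys = sorted(my_store.keys())
    part_1 = my_store["/data_source_1/part_1"]

print(keys)
```

No explicit call on `my_store._handle` is needed to create the group.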

Demo:

In [13]: store = pd.HDFStore('c:/temp/stocks.h5') 

In [15]: store['/aaa/bbb'] = df 

In [17]: store.groups 
Out[17]: 
<bound method HDFStore.groups of <class 'pandas.io.pytables.HDFStore'> 
File path: c:/temp/stocks.h5 
/aaa/bbb   frame  (shape->[3,7]) 
/stocks    wide_table (typ->appendable,nrows->6,ncols->3,indexers->[major_axis,minor_axis],dc->[AAPL,ABC,GOOG])> 

In [18]: store['/aaa/bbb2'] = df 

In [20]: store.items 
Out[20]: 
<bound method HDFStore.items of <class 'pandas.io.pytables.HDFStore'> 
File path: c:/temp/stocks.h5 
/aaa/bbb    frame  (shape->[3,7]) 
/aaa/bbb2   frame  (shape->[3,7]) 
/stocks    wide_table (typ->appendable,nrows->6,ncols->3,indexers->[major_axis,minor_axis],dc->[AAPL,ABC,GOOG])> 

UPDATE:

In [29]: store.get_node('/aaa') 
Out[29]: 
/aaa (Group) '' 
    children := ['bbb' (Group), 'bbb2' (Group)] 

PS: AFAIK pandas treats the key (/aaa/bbb) as a full path.
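
Building on get_node, a rough sketch of how the children of a group could be listed and loaded. It relies on `_v_children`, an attribute of the underlying PyTables group rather than part of the pandas API, so treat it as an assumption that may vary between versions. The file path and the small DataFrame are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file and data, only for illustration.
path = os.path.join(tempfile.mkdtemp(), "groups_demo.h5")
df = pd.DataFrame({"b": [1, 2, 3]})

with pd.HDFStore(path) as store:
    store["/aaa/bbb"] = df
    store["/aaa/bbb2"] = df

    # _v_children maps child names to PyTables nodes.
    node = store.get_node("/aaa")
    children = sorted(node._v_children.keys())

    # Load every child of /aaa into a dict keyed by child name.
    frames = {name: store.get("/aaa/" + name) for name in children}

print(children)
```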

UPDATE2: listing the store:

We have the following store:

In [19]: store 
Out[19]: 
<class 'pandas.io.pytables.HDFStore'> 
File path: D:\temp\.data\hdf\test_groups.h5 
/data_source_1/subdir1/1   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index]) 
/data_source_1/subdir1/2   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index]) 
/data_source_1/subdir1/3   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index]) 
/data_source_1/subdir1/4   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index]) 
/data_source_1/subdir1/5   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index]) 
/data_source_1/subdir2/1   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/2   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/3   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/4   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/5   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/6   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/7   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/8   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 
/data_source_1/subdir2/9   frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c]) 

Let's find all entries in /data_source_1/subdir2:

In [20]: [s for s in store if s.startswith('/data_source_1/subdir2/')] 
Out[20]: 
['/data_source_1/subdir2/1', 
'/data_source_1/subdir2/2', 
'/data_source_1/subdir2/3', 
'/data_source_1/subdir2/4', 
'/data_source_1/subdir2/5', 
'/data_source_1/subdir2/6', 
'/data_source_1/subdir2/7', 
'/data_source_1/subdir2/8', 
'/data_source_1/subdir2/9'] 

And with those keys you can easily select data:

In [25]: dfs = [store.select(s, where='a > 5') for s in store if s.startswith('/data_source_1/subdir2/')] 

In [26]: [len(df) for df in dfs] 
Out[26]: [5, 5, 5, 5, 5, 5, 5, 5, 5] 

In [29]: dfs = [store.select(s, where='a > 7') for s in store if s.startswith('/data_source_1/subdir2/')] 

In [30]: [len(df) for df in dfs] 
Out[30]: [4, 4, 4, 4, 4, 4, 4, 4, 4] 
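
To reproduce something like the above from scratch, here is a small self-contained sketch. The data is synthetic (column `a` simply holds 0..9), and the file path and the `part_N` names are my own. The tables are written with `append(..., data_columns=True)` because `where=` queries only work on table-format nodes with queryable columns.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file and synthetic data, only for illustration.
path = os.path.join(tempfile.mkdtemp(), "test_groups.h5")
prefix = "/data_source_1/subdir2/"

with pd.HDFStore(path) as store:
    for i in range(1, 4):
        df = pd.DataFrame({"a": range(10), "b": range(10), "c": range(10)})
        # Table format plus data_columns makes a, b and c usable in where=.
        store.append(prefix + "part_" + str(i), df, data_columns=True)

    # Pick the keys that live under the group, then query each table.
    keys = sorted(k for k in store.keys() if k.startswith(prefix))
    dfs = [store.select(k, where="a > 5") for k in keys]
    lengths = [len(d) for d in dfs]

print(lengths)
```

With this data each table has four rows where `a > 5` (6, 7, 8, 9), so `lengths` comes out as `[4, 4, 4]`.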

How would I then list all the items in the group 'aaa'? – mgoldwasser

@mgoldwasser, see the update - is that what you are looking for? – MaxU

That's helpful - and if I wanted to iterate over the children, I think I can do something like '[my_store.get('/data_source_1/' + child) for child in my_store.get_node('/data_source_1')._v_children.keys()]', but maybe there's a better way... – mgoldwasser