database - Pandas: large file with a varying number of columns, in-memory append


I would like to maintain a large PyTable in an HDF5 file. Normally, as new data comes in, I append it to the existing table:

    store = pd.HDFStore(path_to_dataset, 'a')
    store.append("data", newdata)
    store.close()

However, if the columns of the old stored data and those of the incoming new data only partially overlap, the following error is returned:

  Exception:  

In these cases, I would like to get the same behavior as the normal DataFrame append function, which fills non-overlapping entries with NaN:

    import pandas as pd

    a = {"col1": range(10), "col2": range(10)}
    a = pd.DataFrame(a)
    b = {"b1": range(10), "b2": range(10)}
    b = pd.DataFrame(b)
    a.append(b)  # removed in pandas 2.0; pd.concat([a, b]) gives the same result

Is it possible to do a similar operation "in memory", or do I need to create a completely new file?

HDFStore stores data row-oriented, so this is currently not possible.

You have to read the table in, append to it in memory, and write it back out. Perhaps you can use this:
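Independently of the resource mentioned above, the read, append, and write-back cycle could be sketched roughly as follows. The helper names `combine_with_nan_fill` and `rewrite_table` are hypothetical, and the sketch assumes the stored table fits in memory and that PyTables is installed.

```python
import pandas as pd


def combine_with_nan_fill(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Stack two frames row-wise; columns missing on either side become NaN."""
    return pd.concat([old, new], ignore_index=True, sort=False)


def rewrite_table(path: str, key: str, new: pd.DataFrame) -> None:
    """Read the stored table, append `new` in memory, then overwrite the node.

    Assumption: the whole table fits in memory.
    """
    with pd.HDFStore(path, mode="a") as store:
        old = store.select(key)                    # load the full table
        combined = combine_with_nan_fill(old, new)
        store.put(key, combined, format="table")   # replaces the existing node
```

Calling `rewrite_table(path_to_dataset, "data", newdata)` would replace the plain `store.append` call whenever the columns differ. Integer columns that gain NaN entries are converted to float by `pd.concat`, so the stored dtypes may change.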

Alternatively, you could create the table up front with all the columns that will ever be needed (and just leave the unused ones as NaN).
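One way to apply the fixed-schema idea is to force every incoming chunk into the same column set before appending. This is only a sketch under stated assumptions: `ALL_COLUMNS`, `align_to_schema`, and `append_aligned` are hypothetical names, the full column list is assumed to be known in advance, and all data is assumed to be numeric so that every column can be stored as float64 (which keeps the dtypes identical across appends).

```python
import pandas as pd

# Hypothetical superset of every column the table will ever hold.
ALL_COLUMNS = ["col1", "col2", "b1", "b2"]


def align_to_schema(df: pd.DataFrame, columns=None) -> pd.DataFrame:
    """Return `df` with exactly the schema columns, in schema order.

    Columns absent from `df` are filled with NaN. Everything is cast to
    float64 (assumption: numeric data only) so each chunk has the same dtypes.
    """
    columns = ALL_COLUMNS if columns is None else list(columns)
    unknown = set(df.columns) - set(columns)
    if unknown:
        raise ValueError(f"columns not in schema: {sorted(unknown)}")
    return df.reindex(columns=columns).astype("float64")


def append_aligned(path: str, key: str, df: pd.DataFrame) -> None:
    """Append a chunk to the table after aligning it to the fixed schema."""
    with pd.HDFStore(path, mode="a") as store:
        store.append(key, align_to_schema(df))
```

With this approach the first append defines the table with all columns, and later chunks with only some of the columns can be appended without rewriting the file.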

