Wednesday, 29 June 2016

Creating a data frame from two data frames with period indices that overlap but are not identical


I've got two data frames, each one representing an irregular time series. Here is a sample from df1:

    index
    2014-10-30 16:00    118
    2014-10-30 19:00    160
    2014-10-30 22:00     88
    2014-10-31 00:00    128
    2014-10-31 03:00     89
    2014-10-31 11:00     66
    2014-10-31 17:00     84
    2014-10-31 20:00    104
    2014-10-31 21:00     82
    2014-10-31 23:00     95
    2014-11-01 02:00     44
    2014-11-01 03:00     54
    2014-11-01 14:00     83
    2014-11-02 03:00     78
    2014-11-02 04:00     87
    2014-11-02 13:00     90

And here is a sample from df2:

    index
    2016-02-04 02:00      0.00
    2016-02-06 00:00     50.00
    2016-02-07 05:00     30.00
    2016-02-07 21:00     26.00
    2016-02-10 18:00    100.00
    2016-02-11 00:00     20.00
    2016-02-12 03:00     15.00
    2016-02-12 18:00     90.00
    2016-02-13 17:00     25.00
    2016-02-13 19:00     40.00
    2016-02-15 00:00     35.00
    2016-02-18 04:00     14.00
    2016-02-28 00:00     33.98

The indices are pandas Period objects with hourly frequency, and the time ranges covered by the indices of the two data frames definitely overlap.

How can I merge them into a single data frame that is indexed by the union of their indices and leaves blanks (which I could later apply an ffill to) where one column lacks a value for a particular index?

Here's what I tried:

    df1.merge(df2, how = 'outer')

This gave me what seemed like a nonsensical result that loses the indices:

            0
    0  118.00
    1  160.00
    2   88.00
    3  128.00
    4   89.00
    5   66.00
    6   84.00
    7  104.00
    8   82.00
    9   95.00

I also tried:

    df1.merge(df2, how = 'outer', left_on = 'index', right_on = 'index')

This gave me a KeyError:

    pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)()
    pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)()
    pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12265)()
    pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12216)()
    KeyError: 'index'

How can I do this merge?
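For reference, here is a minimal sketch of one way an index-aligned outer merge could look. It is not taken from the original post. The two small frames are hypothetical stand-ins for df1 and df2, with made-up overlapping timestamps and assumed column names "a" and "b". The KeyError above is consistent with 'index' being the name of the index rather than a column, so the sketch joins on the index itself. If both frames used the same column name, `join` would likely need `lsuffix`/`rsuffix` to be set.

```python
import pandas as pd

# Hourly frequency passed as an offset object rather than a string alias.
hourly = pd.offsets.Hour()

# Hypothetical stand-ins for df1 and df2 (values and column names are illustrative).
df1 = pd.DataFrame(
    {"a": [118, 160, 88]},
    index=pd.PeriodIndex(
        ["2014-10-30 16:00", "2014-10-30 19:00", "2014-10-30 22:00"], freq=hourly
    ),
)
df2 = pd.DataFrame(
    {"b": [0.0, 50.0]},
    index=pd.PeriodIndex(["2014-10-30 19:00", "2014-10-31 00:00"], freq=hourly),
)

# Outer join on the index: the result is indexed by the union of both indices,
# with NaN wherever one frame has no row for a given period.
merged = df1.join(df2, how="outer")

# The same idea expressed with merge, telling it to use the indices as keys.
merged_alt = df1.merge(df2, how="outer", left_index=True, right_index=True)

# Forward-fill the gaps afterwards, as described in the question.
filled = merged.ffill()

print(merged)
print(filled)
```

With these sample frames, the merged result has four rows (three from df1, two from df2, one period shared), and forward filling leaves a gap only where a column has no earlier value to carry forward.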
