Friday, 10 June 2016

Amend column values according to timedelta and index


I would like to change values in one column of a pandas DataFrame based on conditions in other columns.

The data I collect needs to be assigned a step value. A step change is triggered sometimes by elapsed time and sometimes by a high pressure or temperature value. I cannot get past the first step: when a row is above a certain pressure (1100 psi) and below a certain temperature (40 °C), it belongs to the "dilution" phase.

When attempting to change the value with:

df.ix[(df['press'] > 1100) & (df['temp'] < 40),'proc'] = 'dilute';

I only seem to modify the top two rows.

items[0].head()
Out[37]: 
              time       mass       temp       press        proc
time                                                            
00:00:00  10:58:07  21.947102  23.306101    1.830506      dilute
00:00:01  10:58:08  22.076259  23.306101   57.274142      dilute
00:00:02  10:58:09  22.094710  23.306101  196.000203  pressurize
00:00:03  10:58:10  22.113161  23.306101  293.318991  pressurize
00:00:03  10:58:10  22.094710  23.306101  361.161415  pressurize

items[0].tail()
Out[38]: 
              time       mass       temp     press        proc
time                                                          
00:36:12  11:34:19  18.201538  39.798763 -1.678585  pressurize
00:36:13  11:34:20  18.183087  39.719165 -1.444645  pressurize
00:36:14  11:34:21  18.183087  39.671407 -1.444645  pressurize
00:36:15  11:34:22  18.219989  39.703246 -1.444645  pressurize
00:36:16  11:34:23  18.201538  39.758964 -1.444645  pressurize
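
For comparison, here is a minimal sketch of the intended assignment on made-up data. The values and the five-row frame are assumptions for illustration only; it uses `.loc` with a plain boolean array rather than `.ix`, and the index deliberately repeats one label, as the `head()` output above does at 00:00:03.

```python
import pandas as pd

# Made-up data: a timedelta index with one repeated label (3 seconds).
df = pd.DataFrame(
    {
        "press": [1.8, 57.3, 1150.0, 1200.0, 1300.0],
        "temp": [23.3, 23.3, 25.0, 39.0, 45.0],
        "proc": ["pressurize"] * 5,
    },
    index=pd.to_timedelta([0, 1, 2, 3, 3], unit="s"),
)

# Build the condition once so the same mask can be inspected and reused.
mask = (df["press"] > 1100) & (df["temp"] < 40)

# A plain boolean array selects rows by position, so repeated labels do not matter.
df.loc[mask.to_numpy(), "proc"] = "dilute"

print(df["proc"].tolist())
```

Only the rows that satisfy both conditions should change; the last row stays as it was because its temperature is above 40.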

On further inspection, the boolean selection itself does seem to work: it returns the rows where I would expect the dilution to occur.

print(df.ix[(df['press'] > 1100) & (df['temp'] < 40),'proc'].head(),
                df.ix[(df['press'] > 1100) & (df['temp'] < 40),'proc'].tail())
time
00:00:26    pressurize
00:00:27    pressurize
00:00:28    pressurize
00:00:29    pressurize
00:00:30    pressurize
Name: proc, dtype: object time
00:26:08    pressurize
00:26:09    pressurize
00:26:10    pressurize
00:26:11    pressurize
00:26:12    pressurize
Name: proc, dtype: object

However, when I apply the assignment to my data, only the first two values change, and I get this message:

C:\Users\darby.mcshain\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py:702: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
  values[indexer] = value

Running the cookbook examples does give the expected response.

It seems that I have a nested index, but I am not clear on why, or how to go about fixing it. There are a few layers to this problem, and my searches have not turned up a solution or a clear way to narrow it down.
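
Whether the index is really nested can be checked directly. pandas prints a named index on its own line beneath the column headers, so the extra `time` line in the printouts above does not by itself indicate a MultiIndex. The sketch below uses made-up data (an assumption, not the real runs) and shows the checks: number of levels, index type, and repeated labels.

```python
import pandas as pd

# Made-up frame with a named timedelta index that repeats one label.
idx = pd.to_timedelta([0, 1, 2, 3, 3], unit="s").rename("time")
df = pd.DataFrame({"press": [1.8, 57.3, 196.0, 293.3, 361.2]}, index=idx)

n_levels = df.index.nlevels                      # more than 1 only for a MultiIndex
is_multi = isinstance(df.index, pd.MultiIndex)   # explicit type check
is_unique = df.index.is_unique                   # False if any label repeats
repeated = df.index[df.index.duplicated(keep=False)]  # every repeated label

print(n_levels, is_multi, is_unique, list(repeated))
```

If the index turns out to be single-level but not unique, the repeated labels would be the thing to look at first.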

I thought about resetting the index and using plain integers, but I need to sort the steps by both values and timedeltas.
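
Resetting the index does not have to discard the timedeltas: `reset_index` moves them into an ordinary column, which can still be sorted on. A sketch with made-up data follows. Because the index and a column are both named `time` in the printouts above, the index is renamed first; the name `elapsed` is a hypothetical choice.

```python
import pandas as pd

# Made-up frame whose index and one column share the name "time".
df = pd.DataFrame(
    {"time": ["10:58:07", "10:58:08", "10:58:09"], "press": [196.0, 1.8, 57.3]},
    index=pd.to_timedelta([2, 0, 1], unit="s").rename("time"),
)

# Rename the index first; reset_index raises if the name collides with a column.
flat = df.rename_axis("elapsed").reset_index()

# The timedeltas are now a regular column and can be used as a sort key.
flat_sorted = flat.sort_values(["elapsed", "press"]).reset_index(drop=True)

# The timedelta index can be restored afterwards if it is still needed.
restored = flat_sorted.set_index("elapsed")
print(flat_sorted)
```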

The index is a timedelta, which I needed in order to normalize a number of runs, launched over a number of periods, so that every run starts at time 0 seconds. My searches only turned up date handling, not time handling, hence my normalizing to zero with a timedelta index.
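
The normalization described here could be sketched as follows, assuming the clock times arrive as "HH:MM:SS" strings like those in the `time` column and that a run does not cross midnight. The data and the names `clock` and `elapsed` are illustrative assumptions.

```python
import pandas as pd

# Made-up run with wall-clock times as strings.
run = pd.DataFrame(
    {
        "time": ["10:58:07", "10:58:08", "10:58:09", "10:58:10"],
        "press": [1.8, 57.3, 196.0, 293.3],
    }
)

# "HH:MM:SS" strings parse as durations since midnight.
clock = pd.to_timedelta(run["time"])

# Subtract the first reading so the run starts at 0 seconds.
elapsed = (clock - clock.iloc[0]).rename("elapsed")
run = run.set_index(elapsed)

# A time-based condition can then be written against the index.
after_one_second = run.index > pd.Timedelta(seconds=1)
print(run.loc[after_one_second])
```

A mask built from the index like this can be combined with pressure or temperature conditions using `&`, which covers the steps that are triggered by elapsed time.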

If there is a better way to pose this question, or if anything needs more detail, please ask. I am more than willing to add detail or trim. It is hard to predict what information would be helpful to a professional coder.

