Wednesday, July 6, 2016

Indexing a NumPy array with a boolean pd.Series


I found a piece of code I don't really understand. It basically goes like this:

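import numpy as np
import pandas as pd

array = np.ones((5, 4))*np.nan  # 5x4 array filled with NaN
s1 = pd.Series([1,4,0,4,5], index=[0,1,2,3,4])
I = s1 == 4  # boolean Series, True where s1 equals 4
print(I)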

0    False
1     True
2    False
3     True
4    False
dtype: bool

I understand this part: it returns a boolean pd.Series with True at the indexes where the value is 4. Now, the author uses I to index array:

array[I,0] = 3
array[I,1] = 7
array[I,2] = 2
array[I,3] = 5
print(array)

[[  3.   7.   2.   5.]
 [  3.   7.   2.   5.]
 [ nan  nan  nan  nan]
 [ nan  nan  nan  nan]
 [ nan  nan  nan  nan]]

The resulting array makes no sense to me. I would like to get this instead:

[[ nan  nan  nan  nan]
 [  3.   7.   2.   5.]
 [ nan  nan  nan  nan]
 [  3.   7.   2.   5.]
 [ nan  nan  nan  nan]]

Can someone explain what is happening here, and how I can change the code above to return what I need?
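
One possible explanation, offered as a guess rather than a confirmed diagnosis: the output above is consistent with the boolean Series being read as the integer indices [0, 1, 0, 1, 0] instead of as a mask, so only rows 0 and 1 get written. Whether that happens probably depends on the NumPy and pandas versions in use. A minimal sketch of a workaround, assuming the goal is plain boolean masking, is to convert the Series to a NumPy boolean array first. The name mask is introduced here for illustration only, and np.full stands in for the np.ones(...)*np.nan construction.

```python
import numpy as np
import pandas as pd

# 5x4 array filled with NaN (equivalent to np.ones((5, 4)) * np.nan)
array = np.full((5, 4), np.nan)
s1 = pd.Series([1, 4, 0, 4, 5], index=[0, 1, 2, 3, 4])

# Convert the boolean Series to a plain NumPy boolean array so that
# NumPy treats it as a mask rather than as a list of positions.
# "mask" is a hypothetical name, not taken from the original code.
mask = (s1 == 4).values

# Assign all four columns at once; the list is broadcast across
# every row selected by the mask.
array[mask] = [3, 7, 2, 5]
print(array)
```

Column-by-column assignment as in the original, for example array[mask, 0] = 3, should work the same way once the mask is a NumPy array.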

