I am having trouble writing a program that first looks up a date in another data frame and then interpolates a value along the matching row.
Problem:
Let the original data frames look like this:
A = pd.DataFrame({"date":["06/24/2014","06/25/2014","06/26/2014"], "value":[2, 4, 6]})
B = pd.DataFrame({"date":["06/25/2014","06/26/2014","06/24/2014"], "1":[0.1, 0.5, 0.9],"3":[0.2, 0.6, 1.0],"5":[0.3, 0.7, 1.1],"7":[0.4, 0.8, 1.2]})
The idea is that the program should first find the row in B that matches A by "date" and then interpolate, using the column names as the x-values and the values in that row as the y-values.
The output should look like this:
A = pd.DataFrame({"date":["06/24/2014","06/25/2014","06/26/2014"], "value":[2, 4, 6], "interp":[0.95,0.25, 0.75]})
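As a sanity check on the expected numbers, here is a minimal sketch of the interpolation for a single date. It assumes plain linear interpolation and uses numpy.interp purely for illustration; the names x_cols, row_0624 and interp_0624 are made up for this example. For 06/24/2014 the value 2 lies halfway between columns "1" and "3", so the result should be halfway between 0.9 and 1.0.

```python
import numpy as np

x_cols = [1, 3, 5, 7]            # column names of B, read as x-coordinates
row_0624 = [0.9, 1.0, 1.1, 1.2]  # B's row for 06/24/2014 (the y-values)

# Linear interpolation at x = 2, the "value" that A holds for 06/24/2014
interp_0624 = float(np.interp(2, x_cols, row_0624))
print(round(interp_0624, 2))  # 0.95
```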
My approach so far:
import pandas as pd
from scipy.interpolate import interp1d
A = pd.DataFrame({"date":["06/24/2014","06/25/2014","06/26/2014"], "value":[2, 4, 6]})
B = pd.DataFrame({"date":["06/25/2014","06/26/2014","06/24/2014"], "1":[0.1, 0.5, 0.9],"3":[0.2, 0.6, 1.0],"5":[0.3, 0.7, 1.1],"7":[0.4, 0.8, 1.2]})
# Define x as the names of the columns
x_value = (1, 3, 5, 7)
# Define the interpolation function as follows
def interp(row):
    # index label of the row in B whose date matches this row of A
    idx = B[B['date'] == row['date']].index[0]
    # y-values: that row's entries in columns "1", "3", "5", "7"
    z_value = [float(B.loc[idx, str(x)]) for x in x_value]
    f_linear = interp1d(x_value, z_value)  # linear interpolation function
    y_il = float(f_linear(row['value']))  # convert the 0-d array to a plain float
    return y_il
Finally, I would apply the function to each row this way:
A['interp']=A.apply(interp, axis=1)
I get the following output. Is there a better way to do this?
>>> A
         date  interp  value
0  06/24/2014    0.95      2
1  06/25/2014    0.25      4
2  06/26/2014    0.75      6
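One possible alternative, offered as a sketch rather than a definitive answer: merge B onto A by "date" so that each row of A carries its matching y-values, then interpolate row by row with numpy.interp. This assumes every date in A appears exactly once in B; the names x_cols and merged are made up for this example. Note that numpy.interp clamps to the end values outside the x-range, whereas interp1d raises a ValueError by default.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame({"date": ["06/24/2014", "06/25/2014", "06/26/2014"],
                  "value": [2, 4, 6]})
B = pd.DataFrame({"date": ["06/25/2014", "06/26/2014", "06/24/2014"],
                  "1": [0.1, 0.5, 0.9], "3": [0.2, 0.6, 1.0],
                  "5": [0.3, 0.7, 1.1], "7": [0.4, 0.8, 1.2]})

x_cols = ["1", "3", "5", "7"]         # column names in B
x_value = [float(c) for c in x_cols]  # the same names as numeric x-coordinates

# A left merge keeps A's row order and attaches the matching row of B by date.
merged = A.merge(B, on="date", how="left")

# Interpolate each merged row at that row's "value".
A["interp"] = [
    float(np.interp(v, x_value, ys))
    for v, ys in zip(merged["value"], merged[x_cols].to_numpy())
]
print(A)
```

This avoids the repeated boolean lookup in B for every row of A, at the cost of building one temporary merged frame.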