Thursday, June 30, 2016

Look up value from column in table and interpolate in rows (python pandas scipy)


I am having trouble with a program that should first look up a date in another data frame and then interpolate a value along the matching row.

Problem: let the original data frames look like this:

    A = pd.DataFrame({"date": ["06/24/2014", "06/25/2014", "06/26/2014"],
                      "value": [2, 4, 6]})
    B = pd.DataFrame({"date": ["06/25/2014", "06/26/2014", "06/24/2014"],
                      "1": [0.1, 0.5, 0.9],
                      "3": [0.2, 0.6, 1.0],
                      "5": [0.3, 0.7, 1.1],
                      "7": [0.4, 0.8, 1.2]})

The idea is that the program should first find the row in B that matches A by "date" and then interpolate, using the column names as the x values and the values in that row as the y values. The point at which to interpolate is A's "value". The output should look like this:

    A = pd.DataFrame({"date": ["06/24/2014", "06/25/2014", "06/26/2014"],
                      "value": [2, 4, 6],
                      "interp": [0.95, 0.25, 0.75]})

My approach so far:

    import pandas as pd
    from scipy.interpolate import interp1d

    A = pd.DataFrame({"date": ["06/24/2014", "06/25/2014", "06/26/2014"],
                      "value": [2, 4, 6]})
    B = pd.DataFrame({"date": ["06/25/2014", "06/26/2014", "06/24/2014"],
                      "1": [0.1, 0.5, 0.9],
                      "3": [0.2, 0.6, 1.0],
                      "5": [0.3, 0.7, 1.1],
                      "7": [0.4, 0.8, 1.2]})

    # Define x as the names of the columns
    x_value = (1, 3, 5, 7)

    # Define the interpolation function as follows
    def interp(row):
        # Get the index of the row in B with the same date
        idx = B[B['date'] == row['date']].index.tolist()[0]
        # Get the values from that row in B (columns 1 to 4, after "date")
        z_value = []
        for i in range(1, 5):
            z_value.append(float(B.iloc[idx, i]))
        # Define the interpolation function and evaluate it at A's value
        f_linear = interp1d(x_value, z_value)
        y_il = f_linear(row['value'])
        return float(y_il)

Finally, I would apply the function to each row this way:

    A['interp'] = A.apply(interp, axis=1)

I get the following output. Is there a better way to do this?

    >>> A
             date  interp  value
    0  06/24/2014    0.95      2
    1  06/25/2014    0.25      4
    2  06/26/2014    0.75      6
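One possible alternative, offered as a rough sketch and not a definitive answer: align the two frames with a single merge on "date" instead of searching B once per row, then interpolate each row with `np.interp`. This assumes each date appears exactly once in B and that B's non-date column names are numeric strings. The names `x_cols` and `merged` are made up for this illustration.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame({"date": ["06/24/2014", "06/25/2014", "06/26/2014"],
                  "value": [2, 4, 6]})
B = pd.DataFrame({"date": ["06/25/2014", "06/26/2014", "06/24/2014"],
                  "1": [0.1, 0.5, 0.9],
                  "3": [0.2, 0.6, 1.0],
                  "5": [0.3, 0.7, 1.1],
                  "7": [0.4, 0.8, 1.2]})

# x coordinates come from B's column names (assumed to be numeric strings)
x_cols = ["1", "3", "5", "7"]
x_value = np.array([float(c) for c in x_cols])

# Align B's rows with A's rows by date; a left merge keeps A's row order
# (assumes every date occurs at most once in B)
merged = A.merge(B, on="date", how="left")

# Linear interpolation per row: x = column names, y = the matched row of B
A["interp"] = [
    np.interp(v, x_value, y)
    for v, y in zip(merged["value"], merged[x_cols].to_numpy())
]

print(A)
```

One difference to keep in mind: `np.interp` clamps to the end values when "value" falls outside the range 1 to 7, whereas `interp1d` with its default settings raises an error in that case.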
