Monday, July 4, 2016

Pandas: Why is the default column type for numeric values float?


I am using Pandas 0.18.1 with Python 2.7.x. I first read an empty DataFrame from a CSV file. The columns all have dtype object, which is OK. When I assign one row of data, the dtype of the numeric columns changes to float64. I was expecting int or int64. Why does this happen?

Is there a way to set some global option to let Pandas know that numeric values should be treated as int by default unless the data contains a decimal point? For example, given [0, 1.0, 2.], the first column would be int and the other two float64.

For example:

    >>> df = pd.read_csv('foo.csv', engine='python', keep_default_na=False)
    >>> print df.dtypes
    bbox_id_seqno    object
    type             object
    layer            object
    ll_x             object
    ll_y             object
    ur_x             object
    ur_y             object
    polygon_count    object
    dtype: object
    >>> df.loc[0] = ['a', 'b', 'c', 1, 2, 3, 4, 5]
    >>> print df.dtypes
    bbox_id_seqno     object
    type              object
    layer             object
    ll_x             float64
    ll_y             float64
    ur_x             float64
    ur_y             float64
    polygon_count    float64
    dtype: object
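I am not aware of a global option that does this. Here is a minimal sketch of two possible workarounds, reusing the column names and the sample row from the session above. It is written for Python 3 and a current pandas, so behaviour on 0.18.1 may differ. The variable names (`rows`, `built`, `floaty`, `converted`) are made up for illustration. The first option avoids enlarging an empty frame with `df.loc[0] = ...` by collecting the rows and building the frame once; the second casts the numeric columns back to int64 afterwards, which only works when those columns contain no missing values.

```python
import pandas as pd

# Column names and the sample row are taken from the question.
columns = ['bbox_id_seqno', 'type', 'layer',
           'll_x', 'll_y', 'ur_x', 'ur_y', 'polygon_count']
numeric_columns = ['ll_x', 'll_y', 'ur_x', 'ur_y', 'polygon_count']
rows = [['a', 'b', 'c', 1, 2, 3, 4, 5]]

# Option 1: collect the rows first, then build the frame in one step so
# that pandas infers integer columns from the integer values.
built = pd.DataFrame(rows, columns=columns)
print(built.dtypes)

# Option 2: if the columns have already become float64, cast them back.
# Here the float64 state is simulated explicitly for the demonstration.
floaty = built.copy()
for name in numeric_columns:
    floaty[name] = floaty[name].astype('float64')

converted = floaty.copy()
for name in numeric_columns:
    # astype('int64') raises if the column contains NaN.
    converted[name] = converted[name].astype('int64')
print(converted.dtypes)
```

Building the frame once is usually also faster than adding rows one at a time, because each enlargement may copy the existing data.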
