Tuesday, February 23, 2016

Removing String Columns from a DataFrame

Sometimes you want to work with just the numeric columns in a pandas DataFrame. As a rule of thumb, any column whose dtype is object holds something non-numeric, typically strings (you can get fancier with numpy.issubdtype). We're going to use the DataFrame's dtypes attribute with some boolean indexing to accomplish this.

In [1]: import pandas as pd  

In [2]: df = pd.DataFrame([
   ...:     [1, 2, 'a', 3],
   ...:     [4, 5, 'b', 6],
   ...:     [7, 8, 'c', 9],
   ...: ])  

In [3]: df  
   0  1  2  3
0  1  2  a  3
1  4  5  b  6
2  7  8  c  9

In [4]: df.dtypes  
0     int64
1     int64
2    object
3     int64
dtype: object

In [5]: df[df.columns[df.dtypes != object]]
   0  1  3
0  1  2  3
1  4  5  6
2  7  8  9
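
The object rule of thumb is handy, but it only checks for one specific non-numeric dtype. Here is a minimal sketch of two stricter alternatives that ask "is this column numeric?" instead. It swaps numpy.issubdtype for pandas' own helpers, select_dtypes and pandas.api.types.is_numeric_dtype, on the assumption that they are less likely to trip over pandas-specific dtypes. The names numeric_df, mask, and masked_df are just for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same frame as in the session above.
df = pd.DataFrame([
    [1, 2, 'a', 3],
    [4, 5, 'b', 6],
    [7, 8, 'c', 9],
])

# Option 1: let pandas pick the numeric columns directly.
numeric_df = df.select_dtypes(include="number")

# Option 2: build the boolean mask by hand, mirroring the
# df.columns[...] indexing used in the session above.
mask = [is_numeric_dtype(dtype) for dtype in df.dtypes]
masked_df = df[df.columns[mask]]
```

Both options should leave you with columns 0, 1, and 3. One difference to be aware of: is_numeric_dtype counts boolean columns as numeric, while select_dtypes(include="number") leaves them out.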

