This article is a small cheat sheet of Pandas selectors. To illustrate the examples, we will use the following dataset, which contains information about housing prices in Albuquerque. For further information you can check this page.
I will try to regularly update this page with new examples.
Examples of selectors
We have a dataframe df containing the following columns: Price, SquareFeet, AgeYears, NumberFeatures, Northeast, CustomBuild, CornerLot.
You can select a column by using its name:
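Here is a minimal sketch of selecting a single column. It assumes a small made-up dataframe that stands in for the housing data: the values are hypothetical and only a few of the columns are included.

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
    "Northeast": [1, 1, 0, 1],
    "CornerLot": [0, 0, 1, 0],
})

# Bracket notation returns the column as a pandas Series.
prices = df["Price"]

# Attribute access gives the same result when the column name
# is a valid Python identifier.
same_prices = df.Price
```

The bracket form is usually the safer choice, since it also works for column names that contain spaces or clash with dataframe method names.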
You can select a subset of columns by passing a list of their names:
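A short sketch of selecting several columns at once, again on a made-up dataframe that stands in for the housing data (hypothetical values, only some of the columns):

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
    "Northeast": [1, 1, 0, 1],
    "CornerLot": [0, 0, 1, 0],
})

# Passing a list of names (note the double brackets) returns a DataFrame
# with only those columns, in the order given.
subset = df[["Price", "SquareFeet"]]
```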
The first column of our dataframe is the price (the target variable). All the other columns are predictor variables, which we can use to predict the price. To select all of the predictor variables, we can use the following code:
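One possible way to do this is positional indexing with iloc. The sketch below uses a made-up dataframe in place of the housing data (hypothetical values, only some of the columns); it may not be the exact call the original example used.

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
    "Northeast": [1, 1, 0, 1],
    "CornerLot": [0, 0, 1, 0],
})

# iloc indexes by position: keep every row (:) and every column
# from position 1 onward (1:), which skips the Price column.
predictors = df.iloc[:, 1:]

# The target variable is the first column.
target = df.iloc[:, 0]
```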
Alternatively, you can get the names of the columns and use them to select the predictor variables:
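A sketch of the name-based variant, assuming the same kind of made-up stand-in dataframe (hypothetical values, only some of the columns):

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
    "Northeast": [1, 1, 0, 1],
    "CornerLot": [0, 0, 1, 0],
})

# df.columns holds the column labels; convert them to a plain list.
column_names = list(df.columns)

# Drop the first name (the target) and select the remaining columns by name.
predictor_names = column_names[1:]
predictors = df[predictor_names]
```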
We can select the first 50 rows (from 0 to 49):
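A minimal sketch of row slicing. The dataframe here is generated with 60 made-up rows so that the slice has something to cut off; the values are hypothetical.

```python
import pandas as pd

# Hypothetical data: 60 rows of made-up prices and sizes.
df = pd.DataFrame({
    "Price": range(100000, 160000, 1000),
    "SquareFeet": range(1000, 1600, 10),
})

# Positional slice: rows 0 to 49 (the end of the slice is excluded).
first_50 = df.iloc[:50]

# head(50) is an equivalent shortcut.
also_first_50 = df.head(50)
```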
We can select the rows whose price is greater than $135,000:
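This kind of filter is usually written with a boolean mask. A sketch on a made-up stand-in dataframe (hypothetical values, only some of the columns):

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
    "Northeast": [1, 1, 0, 1],
    "CornerLot": [0, 0, 1, 0],
})

# The comparison produces a boolean Series, one value per row.
mask = df["Price"] > 135000

# Indexing with the mask keeps only the rows where it is True.
expensive = df[mask]
```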
We can also filter using two variables:
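Two conditions can be combined with & (and) or | (or). The sketch below filters on price and on the Northeast flag, using a made-up stand-in dataframe (hypothetical values, and the choice of the second variable is only an illustration):

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
    "Northeast": [1, 0, 0, 1],
    "CornerLot": [0, 0, 1, 0],
})

# Each condition needs its own parentheses, because & binds more
# tightly than comparison operators such as > and ==.
selected = df[(df["Price"] > 135000) & (df["Northeast"] == 1)]
```

Note that Python's plain `and` / `or` keywords do not work here; element-wise filters need `&` and `|`.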
You can also combine the previous techniques:
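As one possible combination, the sketch below filters rows by price, keeps two columns, and then takes the first two matching rows. The dataframe is a made-up stand-in (hypothetical values, only some of the columns).

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [129000, 205000, 145000, 208000],
    "SquareFeet": [1300, 2650, 1500, 2600],
    "Northeast": [0, 1, 1, 1],
    "CornerLot": [1, 0, 0, 0],
})

# loc takes a row selector (boolean mask) and a column selector (list of names).
filtered = df.loc[df["Price"] > 135000, ["Price", "SquareFeet"]]

# iloc then slices the filtered result by position.
result = filtered.iloc[:2]
```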
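Be careful: depending on the operation and on the pandas version, a selection can be either a view on the original data or a copy of it. Boolean filters and lists of column names generally return copies, while simple slices may return views, so a change made to our “new” dataframe may or may not impact the original dataframe. If you need an independent dataframe, call .copy() on the selection.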
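If you want a selection that you can modify without any risk of touching df, one option is to make an explicit copy. A minimal sketch on a made-up stand-in dataframe (hypothetical values):

```python
import pandas as pd

# Hypothetical stand-in for the housing data (values are made up).
df = pd.DataFrame({
    "Price": [205000, 208000, 129000, 145000],
    "SquareFeet": [2650, 2600, 1300, 1500],
})

# .copy() returns a new dataframe that owns its own data.
expensive = df[df["Price"] > 135000].copy()

# Changing the copy leaves the original dataframe untouched.
expensive["Price"] = 0
```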
More examples coming soon…