In the video below, I walk through the following three pieces of code in a Jupyter Notebook:
- Remove leading and trailing spaces from the values in dataframe columns.
- Remove rows with `None` values from a dataframe.
- Filter dataframes based on values.
- E.g., put all rows with values above a certain amount in one dataframe and all rows with values below it in another one.
The whole code is on GitHub here, and these are the pieces that make it work:
```python
import pandas as pd

# 1. Remove spaces in dataframe columns.
salary_data = {
    "people": ['John', 'Peter', 'Sam'],
    "salary": [' 50 ', ' 40 ', '33 ']
}
salary = pd.DataFrame(salary_data)

def whitespace_remover(df):
    # Strip leading and trailing whitespace in every object (string) column.
    # str.strip expects every value in such a column to be a string.
    for i in df.columns:
        if df[i].dtype == 'object':
            df[i] = df[i].map(str.strip)
    return df

salary = whitespace_remover(salary)

# 2. Remove `None` values from dataframe rows.
salary_data = {
    "people": ['John', 'Peter', 'Sam'],
    "salary": [' 50 ', None, '33 ']
}
salary = pd.DataFrame(salary_data)
salary_none = salary[salary.isna().any(axis=1)]  # rows with at least one missing value
salary_without_none = salary.dropna()            # rows with no missing values

# 3. Filter dataframes, based on values.
salary_data = {
    "people": ['John', 'Peter', 'Sam', 'Vitosh', 'Karolina'],
    "salary": [50, 44, 33, 101, 230]
}
salary = pd.DataFrame(salary_data)
salary_above_100 = salary.loc[salary['salary'] > 100]
# Note: a salary of exactly 100 would land in neither dataframe.
salary_below_100 = salary.loc[salary['salary'] < 100]
```
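The three pieces can also be chained, since stripped salaries are still strings and need to become numbers before they can be compared with a threshold. Below is a minimal sketch of one way to do that, not the code from the GitHub repository. The sample data and the names `raw`, `clean_salary` and `split_at` are made up for this illustration, and the sketch assumes that every non-missing salary is a string that holds a valid number.

```python
import pandas as pd

# Hypothetical sample data: salaries arrive as padded strings, one is missing.
raw = pd.DataFrame({
    "people": ["John", "Peter", "Sam", "Vitosh", "Karolina"],
    "salary": [" 50 ", None, "33 ", " 101", "230 "],
})

def clean_salary(df, column="salary"):
    """Return a copy with `column` stripped, converted to numbers,
    and with the rows that have no value in `column` removed."""
    out = df.copy()
    # .str.strip() leaves missing values as missing instead of raising.
    out[column] = pd.to_numeric(out[column].str.strip())
    return out.dropna(subset=[column])

def split_at(df, column, threshold):
    """Split df into (above, at_or_below) using one boolean mask,
    so that every row lands in exactly one of the two dataframes."""
    mask = df[column] > threshold
    return df[mask], df[~mask]

clean = clean_salary(raw)
above_100, at_or_below_100 = split_at(clean, "salary", 100)
```

Using one mask and its negation (`~mask`) means that a salary equal to the threshold is not lost: it goes into the second dataframe.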
The YouTube video is below!
Thank you for your interest! 🙂