Python – Building New Columns in a Dataframe, Based on Other Columns

Working with python, pandas and dataframes is quite challenging even for experienced Excel & VBA developer, as I love to consider myself for one. Thus, today I was fighting for quite some time to add new columns to a dataframe, based on other columns and I have decided to summarize the knowledge, so I can use it later (pretty much the purpose of 95% of the articles here).

Creating the table

Imagine starting with the following table with two columns:

Adding computated columns

And the main purpose is initially to put the values from the FUNDS column to the power of 2, 3 and 4, just to see what would happen. As the .insert Python “magic” is to be used ( len(df.columns)), the length of the table should be defined, as the idea is that these columns should go to the right of the existing ones:

Adding columns, based on rows in another column

The idea here is a bit more complicated. Let’s consider that we are currently in the first month of 2020. Thus, based on our data, we would like to have structured information for the previous 4 periods on the same row with the data from 2020-01. Something like this:

This can be done with a loop, concerning the fact that the data which should be taken is sorted and is present at a specific row, below ours. If there are no more rows, the fill_value = 0 parameter comes to help:

Adding the columns on the left side

Ok, as far as adding to the right is done with len(df.columns), then you may need a couple of seconds to realize that for the opposite action, a is needed in the parameter:

The pandas.DataFrame documentation is here.

The examples are in Jupyter notebook in GitHub here.


Tagged with: , , , , , , , ,