To create a new field based on other numerical fields in a pandas DataFrame, you can use the pandas.DataFrame.assign() method. This method takes a new column name and the values for that column, which can be derived from one or more existing columns, and returns a new DataFrame with the column added.
For example, suppose you have a DataFrame with columns A, B, and C, and you want to create a new column called D that is equal to the sum of columns A and B. You can use the following code to do that:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
df = df.assign(D = df['A'] + df['B'])
print(df)
This code creates a new column D in the DataFrame containing the row-by-row sum of columns A and B. The resulting DataFrame will look like this:
   A  B  C  D
0  1  4  7  5
1  2  5  8  7
2  3  6  9  9
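As a minimal sketch of the same idea, assign() also accepts a callable that receives the DataFrame, which is convenient when chaining and, in recent pandas versions, lets one derived column build on another defined in the same call. The column E and the factor of 2 are illustrative assumptions, not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# assign() returns a new DataFrame; the original df is left unchanged.
result = df.assign(
    D=lambda d: d['A'] + d['B'],  # callable receives the DataFrame being built
    E=lambda d: d['D'] * 2,       # hypothetical column E, reusing D from the line above
)
print(result)
```

Because df itself is not modified, the result has to be bound to a name (or reassigned to df) to keep the new columns.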
Alternatively, you could use the pandas.DataFrame.apply() method to apply a custom function to each row of the DataFrame and create the new column from the output of that function. For example, the following code accomplishes the same task as the code above:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
def sum_columns(row):
    # Each row is passed in as a Series indexed by column name
    return row['A'] + row['B']
df = df.assign(D = df.apply(sum_columns, axis=1))
print(df)
This code also creates a new column D in the DataFrame containing the sum of columns A and B. The resulting DataFrame will be the same as the one shown above.
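To check that the two approaches agree, the following sketch builds D both ways and compares the results. The variable names vectorized and row_wise are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Column arithmetic: operates on whole columns at once.
vectorized = df.assign(D=df['A'] + df['B'])

# Row-wise apply: calls the Python function once per row.
row_wise = df.assign(D=df.apply(lambda row: row['A'] + row['B'], axis=1))

# Compare the two derived columns value by value.
print(vectorized['D'].tolist() == row_wise['D'].tolist())
```

On large DataFrames the column arithmetic is usually much faster, because apply() with axis=1 invokes the function separately for every row. The row-wise form is mainly useful when the logic cannot be expressed as operations on whole columns.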