Question or problem about Python programming:
I have a DataFrame in pandas, for example:
Col1  Col2
A     1
B     2
C     3
Now I would like to add one more column named Col3 whose value is based on Col2: if Col2 > 1, then Col3 is 0; otherwise it is 1. For the example above, the output would be:
Col1  Col2  Col3
A     1     1
B     2     0
C     3     0
Any idea on how to achieve this?
How to solve the problem:
Solution 1:
You just do the opposite comparison: Col2 <= 1. This returns a boolean Series with False for the values greater than 1 and True for the others. If you convert it to an integer dtype (typically int64), True becomes 1 and False becomes 0:
df['Col3'] = (df['Col2'] <= 1).astype(int)
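To make the two steps visible, here is a minimal sketch that rebuilds the sample frame from the question and keeps the intermediate boolean Series in a variable. The name mask is only illustrative and does not appear in the original answer.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"Col1": ["A", "B", "C"], "Col2": [1, 2, 3]})

# Boolean Series: True where Col2 <= 1, False otherwise.
mask = df["Col2"] <= 1

# Casting booleans to integers maps True -> 1 and False -> 0.
df["Col3"] = mask.astype(int)

print(df)
```

Only the first row has Col2 <= 1, so it is the only row that gets a 1 in Col3.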
If you want a more general solution, where you can assign any number to Col3 depending on the value of Col2, you should do something like:
df['Col3'] = df['Col2'].map(lambda x: 42 if x > 1 else 55)
Or:
df['Col3'] = 0
condition = df['Col2'] > 1
df.loc[condition, 'Col3'] = 42
df.loc[~condition, 'Col3'] = 55
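As a quick check, the sketch below runs both variants on the sample frame from the question and compares the results. The variable name via_map is only illustrative, and the values 42 and 55 are the placeholders used in the answer.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"Col1": ["A", "B", "C"], "Col2": [1, 2, 3]})

# Variant 1: element-wise mapping with a lambda.
via_map = df["Col2"].map(lambda x: 42 if x > 1 else 55)

# Variant 2: boolean-mask assignment with .loc.
df["Col3"] = 0
condition = df["Col2"] > 1
df.loc[condition, "Col3"] = 42   # rows where Col2 > 1
df.loc[~condition, "Col3"] = 55  # all remaining rows

print(df)
print(via_map.tolist() == df["Col3"].tolist())
```

Both variants give the same values here. The .loc version stays vectorized, which tends to matter on large frames, while the map version calls the lambda once per row.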
Solution 2:
The easiest way that I found for adding a column to a DataFrame was to use the "add" method with fill_value=0. Here's a snippet of code that also writes the result to a CSV file. Note that the "columns" argument lets you set the name of the column (which here happens to be the same as the name of the np.array that I used as the source of the data).
import pandas as pd

# FF_maxRSSBasal, FF_maxRSSPrism, etc. are assumed to be np.arrays defined earlier
# now to create a PANDAS data frame
df = pd.DataFrame(data = FF_maxRSSBasal, columns=['FF_maxRSSBasal'])
# from here on, we use the trick of creating a new dataframe and then "add"ing it
df2 = pd.DataFrame(data = FF_maxRSSPrism, columns=['FF_maxRSSPrism'])
df = df.add( df2, fill_value=0 )
df2 = pd.DataFrame(data = FF_maxRSSPyramidal, columns=['FF_maxRSSPyramidal'])
df = df.add( df2, fill_value=0 )
df2 = pd.DataFrame(data = deltaFF_strainE22, columns=['deltaFF_strainE22'])
df = df.add( df2, fill_value=0 )
df2 = pd.DataFrame(data = scaled, columns=['scaled'])
df = df.add( df2, fill_value=0 )
df2 = pd.DataFrame(data = deltaFF_orientation, columns=['deltaFF_orientation'])
df = df.add( df2, fill_value=0 )
#print(df)
df.to_csv('FF_data_frame.csv')
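The trick works because add is element-wise addition that aligns both frames on the union of their columns, and fill_value=0 stands in for whichever side is missing. A minimal sketch follows, using two small hypothetical arrays (basal and prism are stand-ins, not the author's data), next to the direct column assignment used in Solution 1.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the author's arrays.
basal = np.array([1.0, 2.0, 3.0])
prism = np.array([10.0, 20.0, 30.0])

df = pd.DataFrame(data=basal, columns=["basal"])
df2 = pd.DataFrame(data=prism, columns=["prism"])

# The column sets are disjoint, so every value is added to the fill value 0
# and comes through unchanged.
combined = df.add(df2, fill_value=0)

# Direct assignment builds the same two columns without the arithmetic.
direct = pd.DataFrame({"basal": basal})
direct["prism"] = prism

print(combined)
print(direct)
```

One caveat: since this is real addition, two frames that share a column name would have that column summed rather than appended, so the trick relies on every column name being unique.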