Question or problem about Python programming:
I want to apply a lambda function to a DataFrame column, using if...elif...else within the lambda function.
The DataFrame and the code look something like this:
df=pd.DataFrame({"one":[1,2,3,4,5],"two":[6,7,8,9,10]})
df["one"].apply(lambda x: x*10 if x<2 elif x<4 x**2 else x+10)
Obviously, this does not work.
Is there a way to use if...elif...else in a lambda?
How can I get the same result with a list comprehension?
Thanks for any response.
How to solve the problem:
Solution 1:
Nest the if ... else expressions:
lambda x: x*10 if x<2 else (x**2 if x<4 else x+10)
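Applied to the DataFrame from the question, the nested expression could be used roughly as follows. This is a minimal sketch; the variable name `result` is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5], "two": [6, 7, 8, 9, 10]})

# A conditional expression has no elif, so the second test is nested
# inside the else branch of the first one.
result = df["one"].apply(
    lambda x: x * 10 if x < 2 else (x ** 2 if x < 4 else x + 10)
)
print(result.tolist())  # [10, 4, 9, 14, 15]
```

The parentheses around the inner expression are optional, but they make the nesting easier to read.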
Solution 2:
I do not recommend using apply here: it should be avoided if there are better alternatives.
For example, suppose you are performing the following operation on a Series:

if cond1:
    exp1
elif cond2:
    exp2
else:
    exp3

This is usually a good use case for np.where or np.select.
numpy.where
The if-else chain above can be written using
np.where(cond1, exp1, np.where(cond2, exp2, ...))
np.where allows nesting. With one level of nesting, your problem can be solved with:
df['three'] = (
    np.where(
        df['one'] < 2,
        df['one'] * 10,
        np.where(df['one'] < 4, df['one'] ** 2, df['one'] + 10)))
df

   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15
numpy.select
Allows for flexible syntax and is easily extensible. It follows the form,
np.select([cond1, cond2, ...], [exp1, exp2, ...])
Or, in this case,
np.select([cond1, cond2], [exp1, exp2], default=exp3)
df['three'] = (
    np.select(
        condlist=[df['one'] < 2, df['one'] < 4],
        choicelist=[df['one'] * 10, df['one'] ** 2],
        default=df['one'] + 10))
df

   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15
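np.select behaves like an if/elif chain because, when several conditions are true for an element, the first matching one in the condition list wins. The sketch below illustrates this by reversing the order of the conditions; the names `conds`, `choices`, `in_order`, and `reversed_order` are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5]})

conds = [df["one"] < 2, df["one"] < 4]
choices = [df["one"] * 10, df["one"] ** 2]
default = df["one"] + 10

# Narrow condition first: behaves like if x < 2 ... elif x < 4 ... else ...
in_order = np.select(conds, choices, default=default)

# Broad condition first: x < 4 also matches x = 1, so the x < 2 branch
# is never selected.
reversed_order = np.select(conds[::-1], choices[::-1], default=default)

print(in_order.tolist())        # [10, 4, 9, 14, 15]
print(reversed_order.tolist())  # [1, 4, 9, 14, 15]
```

So the conditions should be listed from most specific to least specific, just as in a hand-written elif chain.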
and/or (similar to the if/else)
Similar to if-else, this requires the lambda:
df['three'] = df["one"].apply(
    lambda x: (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10)
df

   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15
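One caveat with the and/or form: it only matches if-else when every branch result is truthy. If a branch evaluates to 0 (or any other falsy value), evaluation falls through to the next `or`. The sketch below shows this with an input of 0, which does not appear in the sample data; the helper names `with_and_or` and `with_conditional` are hypothetical.

```python
def with_and_or(x):
    # Short-circuit version: a falsy branch result falls through to the next "or".
    return (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10


def with_conditional(x):
    # Nested conditional expressions: the branch is chosen by the test alone.
    return x * 10 if x < 2 else (x ** 2 if x < 4 else x + 10)


# For x = 0 the first branch yields 0 * 10 == 0, which is falsy.
print(with_and_or(0))       # 10
print(with_conditional(0))  # 0

# For the values in the question (1 to 5) both versions agree.
print([with_and_or(x) for x in range(1, 6)])  # [10, 4, 9, 14, 15]
```

For the data in the question the two forms give identical results, but the conditional expression is the safer choice in general.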
List Comprehension
A loopy solution that is still faster than apply.
df['three'] = [x*10 if x<2 else (x**2 if x<4 else x+10) for x in df['one']]
# df['three'] = [
#     (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10 for x in df['one']
# ]
df

   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15
Solution 3:
For readability I prefer to write a function, especially if you are dealing with many conditions. For the original question:
def parse_values(x):
    if x < 2:
        return x * 10
    elif x < 4:
        return x ** 2
    else:
        return x + 10

df['one'].apply(parse_values)
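A named function like this can also be reused in a list comprehension, and the result can be assigned to a new column. The following is a small sketch under that assumption; the column names `three` and `three_lc` are illustrative only.

```python
import pandas as pd


def parse_values(x):
    if x < 2:
        return x * 10
    elif x < 4:
        return x ** 2
    else:
        return x + 10


df = pd.DataFrame({"one": [1, 2, 3, 4, 5], "two": [6, 7, 8, 9, 10]})

# Store the result of apply in a new column.
df["three"] = df["one"].apply(parse_values)

# The same function works unchanged inside a list comprehension.
df["three_lc"] = [parse_values(x) for x in df["one"]]

print(df["three"].tolist())  # [10, 4, 9, 14, 15]
```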