# Why is ‘x’ in (‘x’,) faster than ‘x’ == ‘x’?

## Question or problem about Python programming:

```>>> timeit.timeit("'x' in ('x',)")
0.04869917374131205
>>> timeit.timeit("'x' == 'x'")
0.06144205736110564
```

Also works for tuples with multiple elements, both versions seem to grow linearly:

```>>> timeit.timeit("'x' in ('x', 'y')")
0.04866674801541748
>>> timeit.timeit("'x' == 'x' or 'x' == 'y'")
0.06565782838087131
>>> timeit.timeit("'x' in ('y', 'x')")
0.08975995576448526
>>> timeit.timeit("'x' == 'y' or 'x' == 'y'")
0.12992391047427532
```

Based on this, I think I should totally start using in everywhere instead of ==!

## How to solve the problem:

### Solution 1:

As I mentioned to David Wolever, there’s more to this than meets the eye; both methods dispatch to `is`; you can prove this by doing

```min(Timer("x == x", setup="x = 'a' * 1000000").repeat(10, 10000))
#>>> 0.00045456900261342525

min(Timer("x == y", setup="x = 'a' * 1000000; y = 'a' * 1000000").repeat(10, 10000))
#>>> 0.5256857610074803
```

The first can only be so fast because it checks by identity.

To find out why one would take longer than the other, let’s trace through execution.

They both start in `ceval.c`, from `COMPARE_OP` since that is the bytecode involved

```TARGET(COMPARE_OP) {
PyObject *right = POP();
PyObject *left = TOP();
PyObject *res = cmp_outcome(oparg, left, right);
Py_DECREF(left);
Py_DECREF(right);
SET_TOP(res);
if (res == NULL)
goto error;
PREDICT(POP_JUMP_IF_FALSE);
PREDICT(POP_JUMP_IF_TRUE);
DISPATCH();
}
```

This pops the values from the stack (technically it only pops one)

```PyObject *right = POP();
PyObject *left = TOP();
```

and runs the compare:

```PyObject *res = cmp_outcome(oparg, left, right);
```

`cmp_outcome` is this:

```static PyObject *
cmp_outcome(int op, PyObject *v, PyObject *w)
{
int res = 0;
switch (op) {
case PyCmp_IS: ...
case PyCmp_IS_NOT: ...
case PyCmp_IN:
res = PySequence_Contains(w, v);
if (res < 0)
return NULL;
break;
case PyCmp_NOT_IN: ...
case PyCmp_EXC_MATCH: ...
default:
return PyObject_RichCompare(v, w, op);
}
v = res ? Py_True : Py_False;
Py_INCREF(v);
return v;
}
```

This is where the paths split. The `PyCmp_IN` branch does

```int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
Py_ssize_t result;
PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
if (sqm != NULL && sqm->sq_contains != NULL)
return (*sqm->sq_contains)(seq, ob);
result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
```

Note that a tuple is defined as

```static PySequenceMethods tuple_as_sequence = {
...
(objobjproc)tuplecontains,                  /* sq_contains */
};

PyTypeObject PyTuple_Type = {
...
&tuple_as_sequence,                         /* tp_as_sequence */
...
};
```

So the branch

```if (sqm != NULL && sqm->sq_contains != NULL)
```

will be taken and `*sqm->sq_contains`, which is the function `(objobjproc)tuplecontains`, will be taken.

This does

```static int
tuplecontains(PyTupleObject *a, PyObject *el)
{
Py_ssize_t i;
int cmp;

for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
cmp = PyObject_RichCompareBool(el, PyTuple_GET_ITEM(a, i),
Py_EQ);
return cmp;
}
```

...Wait, wasn't that `PyObject_RichCompareBool` what the other branch took? Nope, that was `PyObject_RichCompare`.

That code path was short so it likely just comes down to the speed of these two. Let's compare.

```int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
PyObject *res;
int ok;

/* Quick result when objects are the same.
Guarantees that identity implies equality. */
if (v == w) {
if (op == Py_EQ)
return 1;
else if (op == Py_NE)
return 0;
}

...
}
```

The code path in `PyObject_RichCompareBool` pretty much immediately terminates. For `PyObject_RichCompare`, it does

```PyObject *
PyObject_RichCompare(PyObject *v, PyObject *w, int op)
{
PyObject *res;

assert(Py_LT <= op && op <= Py_GE);
if (v == NULL || w == NULL) { ... }
if (Py_EnterRecursiveCall(" in comparison"))
return NULL;
res = do_richcompare(v, w, op);
Py_LeaveRecursiveCall();
return res;
}
```

The `Py_EnterRecursiveCall`/`Py_LeaveRecursiveCall` combo are not taken in the previous path, but these are relatively quick macros that'll short-circuit after incrementing and decrementing some globals.

`do_richcompare` does:

```static PyObject *
do_richcompare(PyObject *v, PyObject *w, int op)
{
richcmpfunc f;
PyObject *res;
int checked_reverse_op = 0;

if (v->ob_type != w->ob_type && ...) { ... }
if ((f = v->ob_type->tp_richcompare) != NULL) {
res = (*f)(v, w, op);
if (res != Py_NotImplemented)
return res;
...
}
...
}
```

This does some quick checks to call `v->ob_type->tp_richcompare` which is

```PyTypeObject PyUnicode_Type = {
...
PyUnicode_RichCompare,      /* tp_richcompare */
...
};
```

which does

```PyObject *
PyUnicode_RichCompare(PyObject *left, PyObject *right, int op)
{
int result;
PyObject *v;

if (!PyUnicode_Check(left) || !PyUnicode_Check(right))
Py_RETURN_NOTIMPLEMENTED;

return NULL;

if (left == right) {
switch (op) {
case Py_EQ:
case Py_LE:
case Py_GE:
/* a string is equal to itself */
v = Py_True;
break;
case Py_NE:
case Py_LT:
case Py_GT:
v = Py_False;
break;
default:
...
}
}
else if (...) { ... }
else { ...}
Py_INCREF(v);
return v;
}
```

Namely, this shortcuts on `left == right`... but only after doing

```    if (!PyUnicode_Check(left) || !PyUnicode_Check(right))

```

All in all the paths then look something like this (manually recursively inlining, unrolling and pruning known branches)

```POP()                           # Stack stuff
TOP()                           #
#
case PyCmp_IN:                  # Dispatch on operation
#
sqm != NULL                     # Dispatch to builtin op
sqm->sq_contains != NULL        #
*sqm->sq_contains               #
#
cmp == 0                        # Do comparison in loop
i < Py_SIZE(a)                  #
v == w                          #
op == Py_EQ                     #
++i                             #
cmp == 0                        #
#
res < 0                         # Convert to Python-space
res ? Py_True : Py_False        #
Py_INCREF(v)                    #
#
Py_DECREF(left)                 # Stack stuff
Py_DECREF(right)                #
SET_TOP(res)                    #
res == NULL                     #
DISPATCH()                      #
```

vs

```POP()                           # Stack stuff
TOP()                           #
#
default:                        # Dispatch on operation
#
Py_LT <= op                     # Checking operation
op <= Py_GE                     #
v == NULL                       #
w == NULL                       #
Py_EnterRecursiveCall(...)      # Recursive check
#
v->ob_type != w->ob_type        # More operation checks
f = v->ob_type->tp_richcompare  # Dispatch to builtin op
f != NULL                       #
#
!PyUnicode_Check(left)          # ...More checks
!PyUnicode_Check(right))        #
left == right                   # Finally, doing comparison
case Py_EQ:                     # Immediately short circuit
Py_INCREF(v);                   #
#
res != Py_NotImplemented        #
#
Py_LeaveRecursiveCall()         # Recursive check
#
Py_DECREF(left)                 # Stack stuff
Py_DECREF(right)                #
SET_TOP(res)                    #
res == NULL                     #
DISPATCH()                      #
```

Now, `PyUnicode_Check` and `PyUnicode_READY` are pretty cheap since they only check a couple of fields, but it should be obvious that the top one is a smaller code path, it has fewer function calls, only one switch
statement and is just a bit thinner.

###### TL;DR:

Both dispatch to `if (left_pointer == right_pointer)`; the difference is just how much work they do to get there. `in` just does less.

### Solution 2:

There are three factors at play here which, combined, produce this surprising behavior.

First: the `in` operator takes a shortcut and checks identity (`x is y`) before it checks equality (`x == y`):

```>>> n = float('nan')
>>> n in (n, )
True
>>> n == n
False
>>> n is n
True
```

Second: because of Python's string interning, both `"x"`s in `"x" in ("x", )` will be identical:

```>>> "x" is "x"
True
```

(big warning: this is implementation-specific behavior! `is` should never be used to compare strings because it will give surprising answers sometimes; for example `"x" * 100 is "x" * 100 ==> False`)

Third: as detailed in Veedrac's fantastic answer, `tuple.__contains__` (`x in (y, )` is roughly equivalent to `(y, ).__contains__(x)`) gets to the point of performing the identity check faster than `str.__eq__` (again, `x == y` is roughly equivalent to `x.__eq__(y)`) does.

You can see evidence for this because `x in (y, )` is significantly slower than the logically equivalent, `x == y`:

```In [18]: %timeit 'x' in ('x', )
10000000 loops, best of 3: 65.2 ns per loop

In [19]: %timeit 'x' == 'x'
10000000 loops, best of 3: 68 ns per loop

In [20]: %timeit 'x' in ('y', )
10000000 loops, best of 3: 73.4 ns per loop

In [21]: %timeit 'x' == 'y'
10000000 loops, best of 3: 56.2 ns per loop
```

The `x in (y, )` case is slower because, after the `is` comparison fails, the `in` operator falls back to normal equality checking (i.e., using `==`), so the comparison takes about the same amount of time as `==`, rendering the entire operation slower because of the overhead of creating the tuple, walking its members, etc.

Note also that `a in (b, )` is only faster when `a is b`:

```In [48]: a = 1

In [49]: b = 2

In [50]: %timeit a is a or a == a
10000000 loops, best of 3: 95.1 ns per loop

In [51]: %timeit a in (a, )
10000000 loops, best of 3: 140 ns per loop

In [52]: %timeit a is b or a == b
10000000 loops, best of 3: 177 ns per loop

In [53]: %timeit a in (b, )
10000000 loops, best of 3: 169 ns per loop
```

(why is `a in (b, )` faster than `a is b or a == b`? My guess would be fewer virtual machine instructions — `a in (b, )` is only ~3 instructions, where `a is b or a == b` will be quite a few more VM instructions)

Veedrac's answer — https://stackoverflow.com/a/28889838/71522 — goes into much more detail on specifically what happens during each of `==` and `in` and is well worth the read.