Sets in Python

Sets #

A set is a mutable, unordered collection type used to store unique items. You can think of a set as being similar to a list. However, sets differ from lists in the following ways:

  1. Each element inside a set must be unique.
  2. Elements in a set are stored in no particular order.

If your application doesn't care about the order in which elements are stored, consider using a set instead of a list: testing whether an element is present in a set takes roughly constant time on average, whereas a list has to be scanned element by element.
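As a rough illustration of the efficiency point, the following sketch times the same membership test against a list and a set. The names numbers_list, numbers_set and target are made up for this example, and the measured times will vary from machine to machine, so treat the printed numbers as indicative only.

```python
import timeit

# Hypothetical sample data: the same 10,000 integers in a list and in a set.
numbers_list = list(range(10_000))
numbers_set = set(numbers_list)

# Both containers give the same answer to a membership test.
target = 9_999
in_list = target in numbers_list
in_set = target in numbers_set

# Time 100 lookups in each container; exact numbers depend on the machine.
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)
print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On most machines the set lookup should come out noticeably faster, because the list has to be searched from the beginning each time.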

Creating Sets #

We can create a set using the following syntax:

Syntax: a_set = { item1, item2, item3 ..., itemN }

Notice that we are using curly braces ({}) here.

>>>
>>> s1 = {3, 31, 12, 1}    # creating a set s1
>>>
>>> type(s1)   
<class 'set'>       # The set class
>>>
>>> s1              # print set s1
{1, 3, 12, 31}
>>>
>>>
>>> s2 = {'one', 1, 'alpha', 3.14}    # creating a set s2
>>>
>>> s2                # print set s2
{1, 3.14, 'one', 'alpha'}
>>>

We can also use the set() constructor to create sets. Here are some examples:

>>>
>>> s3 = set({77, 23, 91, 271})
>>> s3
{271, 91, 77, 23}
>>>
>>>
>>> s4 = set("123abc")   # creating set from a string
>>> s4
{'b', '3', 'a', '1', '2', 'c'}
>>>
>>>
>>> s5 = set(['one', 'two', 'nine', 33, 13])   # creating set from a list
>>> s5
{33, 'two', 'one', 13, 'nine'}
>>>
>>>
>>> s6 = set(("alpha", "beta", "gamma"))   # creating set from a tuple
>>> s6
{'beta', 'gamma', 'alpha'}
>>>
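One pitfall worth knowing when choosing between curly braces and set(): an empty pair of curly braces creates a dictionary, not a set. The short sketch below shows the difference; the variable names are made up for illustration.

```python
empty_braces = {}      # this is an empty dict, not a set
empty_set = set()      # this is how to create an empty set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
```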

A list comprehension can also be used to create a set.

>>>
>>> s7 = set([x*2+3 for x in range(0, 5)])
>>> s7
{9, 3, 11, 5, 7}
>>>
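Python also supports set comprehensions, which use curly braces and build the set directly without an intermediate list. The sketch below computes the same values as s7 in both ways; the variable names are chosen only for this example.

```python
# Build the set from a list comprehension, as in the example above.
from_list = set([x * 2 + 3 for x in range(0, 5)])

# Build the same set with a set comprehension (note the curly braces).
from_comprehension = {x * 2 + 3 for x in range(0, 5)}

print(from_list == from_comprehension)  # True
```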

As already mentioned, sets can only contain unique values. If you try to create a set with duplicate values, Python automatically discards the duplicates. For example:

>>>
>>> sd1 = {1, 2, "one", 3, 1, "one"} # two duplicate elements 1 and "one"
>>> sd1
{1, 2, 3, 'one'}
>>>
>>>
>>> sd2 = set("abcdabc") # three duplicate elements a, b and c
>>> sd2
{'b', 'c', 'd', 'a'}
>>>

Note that although we used duplicate elements while creating these sets, each value appears only once when the sets are printed, because a set doesn't store duplicate elements.
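This behavior is handy for removing duplicates from a list. The following is a minimal sketch with made-up data (the visits list is hypothetical); sorted() is used only to get a predictable order for display, since a set itself has none.

```python
# Hypothetical list of page names containing repeats.
visits = ["home", "about", "home", "contact", "about", "home"]

# Converting to a set keeps one copy of each value.
unique_pages = set(visits)

print(len(visits))           # 6
print(len(unique_pages))     # 3
print(sorted(unique_pages))  # ['about', 'contact', 'home']
```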

Getting Length of Sets #

To determine the number of items in a set, use the len() function.

>>>
>>> colors = {"blue", "orange", "green"}
>>> len(colors)
3
>>>

The max(), min(), sum() built-in functions #

As with lists and tuples, we can also use these functions with sets.

>>>
>>> s1 = {33,11,88,55}
>>> s1
{88, 33, 11, 55}
>>>
>>> max(s1)  # get the largest element from the set
88
>>> min(s1)  # get the smallest element from the set
11
>>> sum(s1)  # get the sum of all the elements in a set
187
>>>

Adding and removing elements #

Remember that sets are mutable objects, so we can add or delete elements without creating a new set object in the process. We use the add() and remove() methods to add and remove an element from a set, respectively.

>>>
>>> s1 = {4, 1, 9, 6}
>>> s1            # print the original set
{9, 1, 4, 6}
>>>
>>> id(s1)   # address of s1
35927880
>>>
>>> s1.add(0)   # add 0 to the set
>>> s1.add(10)  # add 10 to the set
>>>
>>> s1            # print the modified set
{0, 1, 4, 6, 9, 10}  
>>>
>>> id(s1)    # as sets are mutable objects, the address of s1 remains the same
35927880
>>>
>>> s1.remove(0)   # remove 0 from the set
>>> s1.remove(6)   # remove 6 from the set
>>>
>>> s1    # print s1 again
{1, 4, 9, 10}
>>>

If you try to remove an element that doesn't exist in the set, the remove() method raises a KeyError exception.

>>>
>>> s1.remove(61)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 61
>>>

If this behavior is not desired, use the discard() method. The discard() method also removes the element from the set, but if the element is not found, it does nothing instead of raising an error.

>>>
>>> s1.discard(61)   # remove 61 from the set
>>>
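The difference between the two methods can be sketched side by side. Here remove() is wrapped in a try/except block so the script keeps running; the removed and result variables exist only for this illustration.

```python
s1 = {1, 4, 9, 10}

# remove() raises KeyError for a missing element, so guard it if needed.
try:
    s1.remove(61)
    removed = True
except KeyError:
    removed = False

# discard() returns None whether or not the element was present.
result = s1.discard(61)

print(removed)               # False
print(result)                # None
print(s1 == {1, 4, 9, 10})   # True, the set is unchanged
```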

We can also add multiple elements to a set at once using the update() method. The update() method accepts an iterable object, such as a list, tuple, or string.

>>>
>>> s1 = {1, 4, 9, 10}
>>>
>>> s1.update([33, 44])
>>>
>>> s1
{1, 33, 4, 9, 10, 44}
>>>
>>> s1.update(('a', 'b'))
>>>
>>> s1
{'b', 1, 33, 4, 9, 10, 44, 'a'}
>>>

Note that it is the individual elements of the iterable that become elements of the set, not the iterable object itself.
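Strings make this distinction easy to see: update() treats a string as an iterable of characters, while add() stores the whole string as one element. A small sketch, with set names invented for the example:

```python
letters = set()
letters.update("hi")   # adds the characters 'h' and 'i'

words = set()
words.add("hi")        # adds the single string 'hi'

print(letters == {"h", "i"})  # True
print(words == {"hi"})        # True
```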

To remove all the elements from a set, use the clear() method.

>>>
>>> s1.clear()
>>>
>>> s1
set()
>>>

Looping through sets #

Just as with other iterable types, we can use a for loop to iterate over the elements of a set.

>>>
>>> a_set = {99, 33, 44, 124, 25}
>>>
>>> for i in a_set:
...    print(i, end=" ")
...
25 33 99 124 44 >>>
>>>
>>>
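Because a set has no defined order, the loop above may print the elements in a different order on your machine. If a predictable order matters, one option is to loop over sorted(a_set), which returns a new sorted list and leaves the set untouched. The ordered list below is just a helper for this sketch.

```python
a_set = {99, 33, 44, 124, 25}

# sorted() returns a list of the set's elements in ascending order.
ordered = []
for i in sorted(a_set):
    ordered.append(i)

print(ordered)  # [25, 33, 44, 99, 124]
```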

Membership Operators in and not in #

As usual, we can use the in and not in operators to test whether an element exists in a set.

>>>
>>> 100 in a_set
False
>>> 200 not in a_set
True
>>>

Subsets and Supersets #

A set is a subset of another set if all of its elements are also elements of the other set. For example:

A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
B = {1, 2, 3}

A is the set of the first 10 natural numbers and B is the set of the first 3 natural numbers. All the elements of set B are also elements of set A. Hence B is a subset of A. Equivalently, we can say that set A is a superset of B.

We can test whether a set is a subset or superset of another set using the issubset() and issuperset() methods.

>>>
>>> A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
>>> B = {1, 2, 3}
>>>
>>> A.issubset(B)   #  Is A subset of set B  ?
False
>>>
>>> B.issubset(A)   #  Is B subset of set A  ?
True
>>>
>>> B.issuperset(A)  #  Is B superset of set A  ?
False
>>>
>>> A.issuperset(B)  #  Is A superset of set B  ?
True
>>>

Comparing Sets #

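We can also use relational operators to test whether a set is a subset or superset of another set. Note that < and > test for a proper subset and a proper superset (the two sets must not be equal), while <= and >= also return True for equal sets. For example: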

>>>
>>> A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
>>> B = {1, 2, 3}
>>>
>>> B < A    # Is B subset of A  ?
True
>>>
>>> A < B    # Is A subset of B  ? 
False
>>>
>>> B > A    # Is B superset of A  ?
False
>>>
>>> A > B    # Is A superset of B  ?
True
>>>

The == and != operators can be used to test whether two sets are the same, i.e. whether they contain the same elements.

>>>
>>> s1 = {1,3,2}
>>> s2 = {1,2,3}
>>>
>>> s1 == s2   # Are sets s1 and s2 equal ?
True
>>> s1 != s2   # Are sets s1 and s2 different ?
False
False
>>>
>>> B >= {1,2,3}   # Is B superset or equal to {1,2,3}  ?
True
>>>
>>> A <= {1, 2}    # Is A subset or equal to {1,2} ?
False
>>>
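The difference between < and <= only shows up when the two sets are equal. The sketch below compares B with a set holding the same elements; the name same is made up for this example.

```python
B = {1, 2, 3}
same = {3, 2, 1}   # same elements as B, written in a different order

print(B <= same)          # True: every element of B is in same
print(B < same)           # False: the sets are equal, so B is not a proper subset
print(B.issubset(same))   # True: issubset() behaves like <=
print({1, 2} < B)         # True: {1, 2} is a proper subset of B
```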

Union and Intersection of Sets #

Union of Sets: Say we have two sets A and B. The union of A and B is the set consisting of all the elements from A and all the elements from B. Duplicate elements are included only once. In mathematics, we use the ∪ symbol to denote union. Symbolically, we write A union B as A ∪ B.

For example:

A = {10, 20, 30, 40}
B = {1000, 2000}

A ∪ B => {10, 20, 30, 40, 1000, 2000}

To perform the union operation in Python, we can use the union() method or the | operator. For example:

>>>
>>> n1 = {2, 3, 4, 10}
>>> n2 = {10, 2, 100, 2000}
>>>
>>> n3 = n1.union(n2)
>>> n3
{2, 3, 4, 100, 10, 2000}
>>>
>>> n4 = n1 | n2
>>> n4
{2, 3, 4, 100, 10, 2000}
>>>
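The method and the operator are not fully interchangeable: union() accepts any iterable, while the | operator requires both operands to be sets. A short sketch of the difference, where extra and operator_ok are names invented for the example:

```python
n1 = {2, 3, 4, 10}
extra = [10, 2, 100, 2000]   # a list, not a set

# union() accepts any iterable and returns a new set.
merged = n1.union(extra)

# The | operator raises TypeError when one operand is not a set.
try:
    n1 | extra
    operator_ok = True
except TypeError:
    operator_ok = False

print(merged == {2, 3, 4, 10, 100, 2000})  # True
print(operator_ok)                         # False
```

The same rule applies to intersection(), difference() and symmetric_difference() versus their operator forms.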

Intersection of sets: The intersection of sets A and B is the set consisting of all the elements common to both A and B. The ∩ symbol denotes intersection. Symbolically, A intersection B is written as A ∩ B. For example:

A = {100, 200, 1, 2, 3, 4}
B = {100, 200}

A ∩ B => {100, 200}

To perform the intersection operation in Python, we use the intersection() method or the & operator. For example:

>>>
>>> s1 = {20, 40}
>>> s2 = {10, 20, 30, 40, 50}
>>>
>>> s3 = s1.intersection(s2)
>>>
>>> s3
{40, 20}
>>>
>>> s4 = s1 & s2
>>>
>>> s4
{40, 20}
>>>
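A common practical use of intersection is finding the values two collections share. The sketch below uses two made-up lists (alice_langs and bob_langs are hypothetical), converts each to a set, and intersects them.

```python
# Hypothetical data: languages known by two people, with a repeat in the first.
alice_langs = ["Python", "Go", "Rust", "Python"]
bob_langs = ["Java", "Rust", "Python"]

# Convert both lists to sets, then keep only the common elements.
shared = set(alice_langs) & set(bob_langs)

print(sorted(shared))  # ['Python', 'Rust']
```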

Difference and Symmetric Difference of Sets #

Difference of sets: The difference of sets A and B is the set containing all the elements that are in A but not in B. We use the - symbol to denote the difference operation. Symbolically, A minus B is written as A - B. For example:

s1 = {2, 4, 6, 8, 10}
s2 = {2, 3, 5, 8}

s1 - s2 = {4, 6, 10}
s2 - s1 = {3, 5}

In Python, we use the difference() method or the - operator to perform the set difference.

>>>
>>> s1 = {'a', 'e','i', 'o', 'u'}
>>> s2 = {'e', 'o'}
>>>
>>> s3 = s1.difference(s2)
>>> s3
{'i', 'u', 'a'}
>>>
>>> s4 = s1 - s2
>>> s4
{'i', 'u', 'a'}
>>>
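Unlike union and intersection, the difference depends on the order of the operands. The following sketch reuses the numbers from the worked example at the start of this section to show that s1 - s2 and s2 - s1 give different results.

```python
s1 = {2, 4, 6, 8, 10}
s2 = {2, 3, 5, 8}

print(s1 - s2 == {4, 6, 10})   # True: elements of s1 that are not in s2
print(s2 - s1 == {3, 5})       # True: elements of s2 that are not in s1

# The two directions are different sets, so the operation is not commutative.
print(s1.difference(s2) == s2.difference(s1))  # False
```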

Symmetric Difference of sets: The symmetric difference of two sets A and B is the set consisting of the elements that are in either set but not in both. The △ symbol denotes symmetric difference. For example:

s1 = {20, 30, 40}
s2 = {30, 40, 200, 300}

s1 △ s2 = {200, 300, 20}

In Python, we use the symmetric_difference() method or the ^ operator to perform this operation.

>>>
>>> s1 = {'a', 'e','i', 'o', 'u'}
>>> s2 = {'e', 'o'}
>>>
>>> s3 = s1 ^ s2
>>> s3
{'i', 'u', 'a'}
>>>
>>> s4 = s2 ^ s1
>>> s4
{'i', 'u', 'a'}
>>>
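The session above only uses the ^ operator, and because s2 is a subset of s1 there, the result happens to equal s1 - s2. As a complementary sketch, the code below applies the method form to the numeric sets from the worked example (with 20 written once) and checks the result against the definition: everything in the union that is not in the intersection. The variable names are chosen only for illustration.

```python
s1 = {20, 30, 40}
s2 = {30, 40, 200, 300}

by_method = s1.symmetric_difference(s2)
by_operator = s1 ^ s2

# Definition check: elements in either set, minus the elements in both.
by_definition = (s1 | s2) - (s1 & s2)

print(by_method == {20, 200, 300})   # True
print(by_method == by_operator)      # True
print(by_method == by_definition)    # True
```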