Sets in Python

Sets #

A set is a mutable, unordered collection type used to store unique items (other built-in collection types include list, tuple and string). You can think of a set as being similar to a list. However, sets differ from lists in the following ways:

  1. Each element inside a set must be unique.
  2. Elements in a set are stored in no particular order.

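If your application doesn't care about the order in which elements are stored, and mainly needs to check whether an item is present or to eliminate duplicates, consider using a set instead of a list. A membership test on a set takes roughly constant time on average, whereas a list has to be scanned element by element.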
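
One rough way to see the difference is to time a membership test on a list and on a set holding the same values. The sketch below uses the standard timeit module. The names numbers_list, numbers_set and target, as well as the sizes, are illustrative assumptions, and the actual timings depend on your machine.

```python
import timeit

# Illustrative data: the same 100,000 integers stored as a list and as a set.
numbers_list = list(range(100_000))
numbers_set = set(numbers_list)

target = 99_999  # worst case for the list: the last element

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On most machines the set lookup should come out noticeably faster, but treat the numbers as indicative only.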

Creating Sets #

We can create a set using the following syntax:

Syntax: a_set = {item1, item2, item3 ..., itemN}

Notice that we are using curly braces here.

>>>
>>> s1 = {3,31,12,1}      # creating a set s1
>>>
>>> type(s1)   
<class 'set'>       # The set class
>>>
>>> s1              # print set s1
{1, 3, 12, 31}
>>>
>>>
>>> s2 = {'one', 1, 'alpha', 3.14}   # creating a set s2
>>>
>>> s2         # printing set s2
{1, 3.14, 'one', 'alpha'}
>>>
>>>
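
One caveat with the curly brace syntax is worth a quick check: empty braces create a dictionary, not a set. The following minimal sketch (the variable names are illustrative) shows that an empty set has to be created with set().

```python
empty_dict = {}      # empty braces create a dict, not a set
empty_set = set()    # set() with no arguments creates an empty set

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
print(len(empty_set))    # 0
```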

We can also use the set() constructor to create sets from any iterable. Here are some examples:

>>>
>>> s3 = set({77, 23, 91, 271})
>>> s3
{271, 91, 77, 23}
>>>

>>>
>>> s4 = set("123abc")   # creating set from string
>>>
>>> s4
{'b', '3', 'a', '1', '2', 'c'}
>>>

>>>
>>> s5 = set(['one', 'two', 'nine', 33, 13])   # creating set from list
>>>
>>> s5
{33, 'two', 'one', 13, 'nine'}
>>>

>>>
>>> s6 = set(("alpha", "beta", "gamma"))   # creating set from tuple
>>> s6
{'beta', 'gamma', 'alpha'}
>>>

A list comprehension can also be passed to set() to create a set.

>>>
>>> s7 = set([x*2+3 for x in range(0, 5)])
>>> s7
{9, 3, 11, 5, 7}
>>>
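
Python also has a dedicated set comprehension syntax, which uses curly braces and avoids building the intermediate list. As a small sketch, the same set as above can be written like this:

```python
# Set comprehension: curly braces instead of set([...])
s7 = {x * 2 + 3 for x in range(0, 5)}

print(s7 == {3, 5, 7, 9, 11})  # True
```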

As already mentioned, sets can only contain unique values. If you try creating a set with duplicate values, Python will automatically keep only one copy of each value.

>>>
>>> sd1 = {1, 2, "one", 3, 1, "one"} # two duplicate elements 1 and "one"
>>> sd1
{1, 2, 3, 'one'}
>>>

>>>
>>> sd2 = set("abcdabc") # three duplicate elements a, b and c
>>> sd2
{'b', 'c', 'd', 'a'}
>>>
>>>

Note that although we used some duplicate elements while creating the sets, each value appears only once when the sets are printed, because a set doesn't store duplicate elements.
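
This behavior makes a set a convenient way to remove duplicates from a list. The sketch below uses a made-up list named visits. Note that the original order of the items is not preserved, so sorted() is used here only to get a predictable printout.

```python
# Hypothetical input list containing repeated values.
visits = ["home", "about", "home", "contact", "about"]

unique_pages = set(visits)   # duplicates are dropped; order is not guaranteed

print(len(visits))           # 5
print(len(unique_pages))     # 3
print(sorted(unique_pages))  # ['about', 'contact', 'home']
```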

Getting Length of Sets #

To determine the number of items in a set, use the len() function.

>>>
>>>
>>> colors = {"blue", "orange", "green"}
>>>
>>> len(colors)
3
>>>
>>>

The max(), min(), sum() built-in functions #

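As with lists and tuples, we can also use these functions on sets, provided the elements can be compared with each other (for max() and min()) or added together (for sum()).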

>>>
>>> s1 = {33,11,88,55}
>>>
>>> s1
{88, 33, 11, 55}
>>>
>>> max(s1)  # find the element with the maximum value
88
>>> min(s1)  # find the element with the minimum value
11
>>> sum(s1)  # find the sum of all the elements in a set
187
>>>

Adding, removing and updating elements #

Remember that sets are mutable objects, so we can add and delete elements without creating additional set objects in the process. We use the add() and remove() methods to add and remove an element from a set, respectively.

>>>
>>> s1 = {4, 1, 9, 6}
>>> s1            # print the original set
{9, 1, 4, 6}
>>>
>>> id(s1)   # address of s1
35927880
>>>
>>> s1.add(0)   # add 0 to the set
>>> s1.add(10)  # add 10 to the set
>>>
>>> s1            # print the modified set
{0, 1, 4, 6, 9, 10}  
>>>
>>> id(s1)    # as sets are mutable objects, the address of s1 remains the same
35927880
>>>
>>> s1.remove(0)   # remove 0 from the set
>>> s1.remove(6)   # remove 6 from the set
>>>
>>>
>>> s1    # print s1 again
{1, 4, 9, 10}
>>>

If you try to remove an element which doesn't exist in the set, the remove() method will raise a KeyError exception.

>>>
>>> s1.remove(61)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 61
>>>

If this behavior is not desired, use the discard() method. The discard() method also removes the item from the set, but if the argument is not found, it does nothing and raises no error.

>>>
>>> s1.discard(61)   # remove 61 from the set
>>>
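
The two approaches can be compared side by side in a short script. This is only a sketch of one way to handle a missing element. Which method to prefer depends on whether a missing element should count as an error in your program.

```python
s1 = {1, 4, 9, 10}

# remove() raises KeyError for a missing element, so guard it explicitly.
try:
    s1.remove(61)
except KeyError:
    print("61 is not in the set")

# discard() does nothing when the element is missing.
s1.discard(61)

print(s1 == {1, 4, 9, 10})  # True: the set is unchanged
```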

We can also add a group of elements to a set using the update() method. The update() method accepts any iterable object, such as a list, a tuple or a string.

>>>
>>> s1 = {1, 4, 9, 10}
>>>
>>> s1.update([33, 44])
>>>
>>> s1
{1, 33, 4, 9, 10, 44}
>>>
>>> s1.update(('a', 'b'))
>>>
>>> s1
{'b', 1, 33, 4, 9, 10, 44, 'a'}
>>>

Note that it is the individual elements of the iterable that become elements of the set, not the iterable object itself.
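
The distinction is easiest to see by contrasting update() with add(). In the sketch below (the variable names letters and pairs are illustrative), update() unpacks a string into characters, while add() stores a tuple as a single element.

```python
letters = set()
letters.update("ab")    # the string is iterated: 'a' and 'b' are added separately
print(letters == {"a", "b"})     # True

pairs = set()
pairs.add(("a", "b"))   # add() stores the tuple itself as one element
print(pairs == {("a", "b")})     # True

print(len(letters), len(pairs))  # 2 1
```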

To remove every element from a set, use the clear() method.

>>>
>>> s1.clear()
>>>
>>> s1
set()
>>>

Looping through sets #

Just as with lists and other iterable types, we can use a for loop to iterate over the elements of a set. Keep in mind that the iteration order is arbitrary.

>>>
>>> a_set = {99, 33, 44, 124, 25}
>>>
>>> for i in a_set:
...    print(i, end=" ")
...
25 33 99 124 44 >>>
>>>
>>>
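
If a predictable order is needed while looping, one option is to iterate over sorted(a_set), which returns a new sorted list and leaves the set itself untouched. A minimal sketch:

```python
a_set = {99, 33, 44, 124, 25}

# sorted() returns a new list, so the loop order is predictable.
for i in sorted(a_set):
    print(i, end=" ")
print()
# prints: 25 33 44 99 124
```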

Membership Operator in and not in #

As usual, we can use the in and not in operators to check whether an element exists in a set.

>>>
>>> 100 in a_set
False
>>> 200 not in a_set
True
>>>

Subset and Supersets #

Set A is a subset of set B if all the elements of A are also elements of B. For example:

A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
B = {1, 2, 3}

A is the set of the first 10 natural numbers and B is the set of the first 3 natural numbers. All the elements in set B are also elements of set A. Hence B is a subset of A. In other words, we can also say that set A is a superset of B.

We can test whether a set is a subset or superset of another set using the issubset() and issuperset() methods.

>>>
>>> A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
>>> B = {1, 2, 3}
>>>
>>>
>>> A.issubset(B)   #  Is A subset of set B  ?
False
>>>
>>> B.issubset(A)   #  Is B subset of set A  ?
True
>>>
>>> B.issuperset(A)  #  Is B superset of set A  ?
False
>>>
>>> A.issuperset(B)  #  Is A superset of set B  ?
True
>>>

Comparing Sets #

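We can also use comparison operators to test whether a set is a subset or superset of another set. The < and > operators test for a proper subset and a proper superset (the two sets must not be equal), while <= and >= also return True when the sets are equal. For example: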

>>>
>>> A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
>>> B = {1, 2, 3}
>>>
>>> B < A    # Is B a proper subset of A  ?
True
>>>
>>> A < B    # Is A a proper subset of B  ?
False
>>>

>>>
>>> B > A    # Is B a proper superset of A  ?
False
>>>
>>> A > B    # Is A a proper superset of B  ?
True
>>>

The == and != operators can be used to test whether two sets are the same, i.e. whether they contain the same elements.

>>>
>>> s1 = {1,3,2}
>>> s2 = {1,2,3}
>>>
>>> s1 == s2   # Are sets s1 and s2 equal ?
True
>>> s1 != s2   # Are sets s1 and s2 not equal ?
False
>>>

>>>
>>> B >= {1,2,3}   # Is B superset or equal to {1,2,3}  ?
True
>>>
>>> A <= {1, 2}    # Is A subset or equal to {1,2} ?
False
>>>
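
The difference between < and <= only shows up when the two sets are equal. The short sketch below (the name same is illustrative) compares B with an identical set:

```python
B = {1, 2, 3}
same = {1, 2, 3}

print(B <= same)         # True: every element of B is in same
print(B < same)          # False: B is not a *proper* subset because the sets are equal
print(B.issubset(same))  # True: issubset() behaves like <=
```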

Union and Intersection of Sets #

Union of Sets: Say we have two sets A and B. The union of A and B is a set which consists of all the elements from A and all the elements from B. Duplicate elements are included only once. In mathematics, the symbol ∪ denotes union. Symbolically, we write A union B as A ∪ B.

For example:

n1 = {10, 20, 30, 40}
n2 = {1000, 2000}

n1 ∪ n2 => {10, 20, 30, 40, 1000, 2000}

To perform the union operation in Python, we can use the union() method or the | operator. For example:

>>>
>>> n1 = {2, 3, 4, 10}
>>> n2 = {10, 2, 100, 2000}
>>>
>>>
>>> n3 = n1.union(n2)
>>> n3
{2, 3, 4, 100, 10, 2000}
>>>
>>> n4 = n1 | n2
>>> n4
{2, 3, 4, 100, 10, 2000}
>>>
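
The method and the operator are not fully interchangeable. The union() method accepts any iterables (and more than one of them), while the | operator requires both operands to be sets. The following sketch illustrates this; the name n5 is just an illustrative choice.

```python
n1 = {2, 3, 4, 10}

# union() accepts any iterables, and more than one of them.
n5 = n1.union([10, 100], (2000,))
print(n5 == {2, 3, 4, 10, 100, 2000})  # True

# The | operator only works between sets.
try:
    n1 | [10, 100]
except TypeError:
    print("| requires both operands to be sets")
```

The same rule applies to the other set methods and operators covered below.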

Intersection of sets: The intersection of sets A and B is a set which consists of all the elements common to both A and B. The symbol ∩ denotes intersection. Symbolically, A intersection B is written as A ∩ B. For example:

s1 = {100, 200, 1, 2, 3, 4}
s2 = {100, 200}

s1 ∩ s2 => {100, 200}

To perform the intersection operation in Python, we use the intersection() method or the & operator. For example:

>>>
>>> s2 = {20, 40}
>>> s1 = {10, 20, 30, 40, 50}
>>>
>>> s3 = s1.intersection(s2)
>>>
>>> s3
{40, 20}
>>>
>>> s4 = s1 & s2
>>>
>>> s4
{40, 20}
>>>
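
A typical use of intersection is finding what two collections have in common. The data below is made up purely for illustration.

```python
# Hypothetical data: skills listed by two people.
alice_skills = {"python", "sql", "git"}
bob_skills = {"git", "java", "python"}

shared = alice_skills & bob_skills  # elements present in both sets

print(sorted(shared))  # ['git', 'python']
```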

Difference and Symmetric Difference of Sets #

Difference of sets: The difference of sets A and B is the set of elements which belong to A but not to B. As usual, we use the - symbol to denote the difference operation. Symbolically, A minus B is written as A - B. For example:

s1 = {2, 4, 6, 8, 10}
s2 = {2, 3, 5, 8}

s1 - s2 = {4, 6, 10}
s2 - s1 = {3, 5}

In Python, we use the difference() method or the - operator to perform the set difference.

>>>
>>> s1 = {'a', 'e','i', 'o', 'u'}
>>> s2 = {'e', 'o'}
>>>
>>> s3 = s1.difference(s2)
>>> s3
{'i', 'u', 'a'}
>>>
>>> s4 = s1 - s2
>>> s4
{'i', 'u', 'a'}
>>>
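
Unlike union and intersection, the order of the operands matters for difference. The sketch below reuses the numeric sets from the beginning of this section to check both directions:

```python
s1 = {2, 4, 6, 8, 10}
s2 = {2, 3, 5, 8}

print(s1 - s2 == {4, 6, 10})  # True: elements in s1 but not in s2
print(s2 - s1 == {3, 5})      # True: elements in s2 but not in s1
```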

Symmetric Difference of sets: The symmetric difference of two sets A and B is a set which consists of the elements that are in either one of the sets but not in both. The symbol △ denotes symmetric difference. For example:

s1 = {20, 30, 40}
s2 = {30, 40, 200, 300}

s1 △ s2 = {200, 300, 20}
s2 △ s1 = {200, 300, 20}

In Python, we use the symmetric_difference() method or the ^ operator to perform this operation.

>>>
>>> s1 = {'a', 'e','i', 'o', 'u'}
>>> s2 = {'e', 'o'}
>>>
>>> s1 ^ s2
{'i', 'u', 'a'}
>>>
>>> s2 ^ s1
{'i', 'u', 'a'}
>>>
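
The symmetric difference can also be expressed in terms of the operations covered earlier: it is the union of the two sets minus their intersection. The following sketch checks this on a small numeric example; the name sym is illustrative.

```python
s1 = {20, 30, 40}
s2 = {30, 40, 200, 300}

sym = s1 ^ s2

print(sym == {20, 200, 300})               # True
print(sym == (s1 | s2) - (s1 & s2))        # True: union minus intersection
print(sym == s1.symmetric_difference(s2))  # True: method and operator agree
```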