After writing the article "Scraping a web site up to the N-th level with Python", I decided to summarize what can be achieved with set() in this language. Pretty much, sets can be described as lists with a few differences:
Set == List
- Set is iterable
- Set is mutable
- The exception is frozenset, which is immutable
Set != List
- Set is much faster for checking whether an element exists (it is implemented as a hash table)
- Set is unordered
- Set can only have unique elements
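The last differences can be seen in a minimal sketch. The names visited_urls and unique_urls are made up for illustration: converting a list to a set drops the duplicates, and the in operator becomes a hash lookup instead of a scan over every element.

```python
# Hypothetical data: a list of visited URLs that contains duplicates.
visited_urls = ["/home", "/about", "/home", "/contact", "/about"]

# Converting to a set keeps each value once (the order is not preserved).
unique_urls = set(visited_urls)
print(len(visited_urls))  # 5
print(len(unique_urls))   # 3

# Membership test: a hash lookup on the set, a linear scan on the list.
print("/contact" in unique_urls)  # True
print("/missing" in unique_urls)  # False
```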
Creating, Adding, Removing from Set
    first_set = set([1, 2, 3, 3, 3, 3, 4])
    first_set.add(5)
    print(max(first_set))
    first_set.remove(min(first_set))
    print(first_set)

Result:

    5
    {2, 3, 4, 5}
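One detail worth knowing when removing elements: remove() raises a KeyError if the element is not in the set, while discard() does nothing in that case. A short sketch (the removed flag is only there for illustration):

```python
first_set = {1, 2, 3}

# discard() silently does nothing if the element is absent.
first_set.discard(10)

# remove() raises KeyError for a missing element.
try:
    first_set.remove(10)
    removed = True
except KeyError:
    removed = False

print(removed)            # False
print(sorted(first_set))  # [1, 2, 3]
```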
Frozenset
A frozenset is an immutable set: once it is created, it cannot be mutated. frozenset(first_set) builds a new, independent object, so even without calling .copy() the frozen set does not change when the initial set is changed later. This is not the case with test_set, which is just a second name for the same set object as first_set:
    first_set = set([1, 2, 3, 3, 3, 3, 4])
    frozen_first_set = frozenset(first_set)
    test_set = first_set
    test_set_copy = first_set.copy()
    first_set.remove(min(first_set))
    print(first_set)
    print(test_set)
    print(test_set_copy)
    print(frozen_first_set)

Result:

    {2, 3, 4}
    {2, 3, 4}
    {1, 2, 3, 4}
    frozenset({1, 2, 3, 4})
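To illustrate what immutable means in practice, here is a small sketch with made-up names (frozen_set, labels). A frozenset has no add() method at all, and because it cannot change it is hashable, so it can serve as a dictionary key, which a regular set cannot.

```python
frozen_set = frozenset([1, 2, 3])

# A frozenset has no add() or remove(), so the attempt fails.
try:
    frozen_set.add(4)
    mutated = True
except AttributeError:
    mutated = False
print(mutated)  # False

# Being immutable makes it hashable, so it can be a dictionary key.
labels = {frozen_set: "small numbers"}
print(labels[frozenset([3, 2, 1])])  # small numbers
```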
Set Union
Uniting two sets gives a new set that consists of the unique values from both sets. Makes sense.
    first_set = set([1, 2, 3, 3, 4])
    second_set = set([5, 6, 7, 8])
    new_set = second_set.union(first_set)
    print(new_set)

Result:

    {1, 2, 3, 4, 5, 6, 7, 8}
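The same union can also be written with the | operator, and union() accepts more than one argument. A brief sketch, where third_set and all_values are names added just for this example:

```python
first_set = {1, 2, 3, 4}
second_set = {5, 6, 7, 8}
third_set = {8, 9}

# The | operator is equivalent to union() for two sets.
print(first_set | second_set == first_set.union(second_set))  # True

# union() accepts several iterables at once; the originals are unchanged.
all_values = first_set.union(second_set, third_set)
print(sorted(all_values))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sorted(first_set))   # [1, 2, 3, 4]
```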
Set Intersection
Intersection is tricky. It returns only the elements that are present in both sets.
    first_set = set([1, 2, 3, 3, 4, 88])
    second_set = set([5, 3, 3, 8, 88])
    new_set = second_set.intersection(first_set)
    print(new_set)

Result:

    {88, 3}
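As a small addition, the & operator gives the same result, the order of the two sets does not matter, and isdisjoint() tells whether two sets have nothing in common. The name common is used only in this sketch:

```python
first_set = {1, 2, 3, 4, 88}
second_set = {5, 3, 8, 88}

# The & operator gives the same result as intersection().
common = first_set & second_set
print(sorted(common))  # [3, 88]

# The order of the operands does not matter for intersection.
print(common == second_set.intersection(first_set))  # True

# isdisjoint() reports whether the intersection would be empty.
print(first_set.isdisjoint({100, 200}))  # True
```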
Set Symmetric Difference
This is the symmetric difference of the two sets, i.e., everything from both sets that is not part of their intersection.
    first_set = set([1, 2, 3, 3, 4, 88])
    second_set = set([5, 3, 3, 8, 88])
    new_set = second_set.symmetric_difference(first_set)
    print(new_set)

Result:

    {1, 2, 4, 5, 8}
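Put differently, the symmetric difference is the union minus the intersection. The sketch below checks this with the ^ operator, which is the operator form of symmetric_difference(); sym_diff is a name introduced only here:

```python
first_set = {1, 2, 3, 4, 88}
second_set = {5, 3, 8, 88}

# The ^ operator is equivalent to symmetric_difference().
sym_diff = first_set ^ second_set
print(sorted(sym_diff))  # [1, 2, 4, 5, 8]

# Same result as "union minus intersection".
print(sym_diff == (first_set | second_set) - (first_set & second_set))  # True
```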
Set Difference
Difference is a bit tricky, because the order matters. a.difference(b) returns everything from a that is not in b. In the example below the method is called on second_set, so the returned set contains everything from second_set which is not in first_set.
    first_set = set([1, 2, 3, 3, 4, 88])
    second_set = set([5, 3, 3, 8, 88])
    new_set = second_set.difference(first_set)
    print(new_set)

Result:

    {8, 5}
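Since the direction is the tricky part, a quick sketch with the - operator (the operator form of difference()) shows that swapping the two sets gives a different result:

```python
first_set = {1, 2, 3, 4, 88}
second_set = {5, 3, 8, 88}

# Elements of second_set that are not in first_set.
print(sorted(second_set - first_set))  # [5, 8]

# Swapping the operands gives the elements of first_set not in second_set.
print(sorted(first_set - second_set))  # [1, 2, 4]
```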
That’s all!