Skip to main content

Lesson 2: Set Operations

Now that you understand what sets are and how to create them, it's time to discover their real power. Sets aren't just useful for storing unique items—they're designed to perform mathematical operations on collections. In this lesson, you'll learn the four fundamental set operations that make sets essential for data comparison, filtering, and analysis.

When you need to find common interests between people, identify differences in datasets, or combine information without duplicates, set operations are your answer. And unlike many technical concepts, these operations have intuitive real-world meanings.


Set Operations: The Mathematical Foundations

Sets are based on mathematical set theory, which gives us four core operations. The beautiful part? Python makes these operations simple with elegant operator syntax.

Let's build our understanding progressively: start with the concepts, see them in code, then apply them to real problems.


Union: Combining Sets

Union combines elements from multiple sets into one set, keeping only unique items. It answers the question: "What items appear in any of these sets?"

Think of a music playlist: if you combine your favorite songs with your friend's favorite songs, union gives you all unique songs from both lists—no duplicates.

Code Example 1: Union Operations

# Union - combine all unique elements from both sets
team_a: set[str] = {"Alice", "Bob", "Charlie"}
team_b: set[str] = {"Bob", "David", "Eve"}

# Using | operator (most Pythonic)
all_members: set[str] = team_a | team_b
print(f"Union with |: {all_members}") # {'Alice', 'Bob', 'Charlie', 'David', 'Eve'}

# Using .union() method (equivalent)
all_members_method: set[str] = team_a.union(team_b)
print(f"Union with method: {all_members_method}") # Same result

# Union of multiple sets
team_c: set[str] = {"Frank", "Alice"}
everyone: set[str] = team_a | team_b | team_c
print(f"All three teams: {everyone}") # {'Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank'}

# Union is commutative (order doesn't matter)
print(f"Commutative: {team_a | team_b == team_b | team_a}") # True

Key observation: Notice that "Alice" and "Bob" appear in both team_a and team_b, but in the result, they appear only once. That's the uniqueness property at work.

💬 AI Colearning Prompt

"In a real company with multiple departments, how would you use set union to find all employees across department A and department B? Could you have duplicates?"

Your AI can walk you through practical applications and help you see why union matters in business scenarios.

🎓 Instructor Commentary

In AI-native development, you don't memorize whether union is | or .union()—both are correct. Your AI instantly reminds you. What matters is recognizing "I need to combine collections without duplicates" and choosing union.


Intersection: Finding Common Elements

Intersection finds elements that appear in all sets. It answers: "What items are in both sets?"

Imagine you're looking for bilingual developers. You have one set of Java developers and another set of Python developers. Intersection tells you who knows both languages.

Code Example 2: Intersection Operations

# Intersection - elements common to all sets
java_devs: set[str] = {"Alice", "Bob", "Charlie", "David"}
python_devs: set[str] = {"Bob", "David", "Eve", "Frank"}

# Using & operator
bilingual: set[str] = java_devs & python_devs
print(f"Know both languages: {bilingual}") # {'Bob', 'David'}

# Using .intersection() method
bilingual_method: set[str] = java_devs.intersection(python_devs)
print(f"Know both (method): {bilingual_method}") # Same result

# Intersection is commutative
print(f"Commutative: {java_devs & python_devs == python_devs & java_devs}") # True

# Real-world example: Find common interests
alice_interests: set[str] = {"hiking", "photography", "cooking"}
bob_interests: set[str] = {"cooking", "gaming", "hiking"}
common: set[str] = alice_interests & bob_interests
print(f"Alice and Bob both like: {common}") # {'hiking', 'cooking'}

Key observation: Intersection is commutativeA & B equals B & A. The order doesn't matter because we're finding what's in both.

🚀 CoLearning Challenge

Ask your AI Co-Teacher:

"I have two lists of customer IDs from two different marketing campaigns. I want to find which customers were in both campaigns. Use sets and intersection to solve this. Then explain why sets are better than lists for this task."

Expected Outcome: You'll see how intersection solves real business problems and understand why sets make this operation efficient.


Difference: Excluding Elements

Difference finds elements in the first set but not in the second. It answers: "What items are unique to the first set?"

Imagine a school where some students are graduating. Difference tells you which students are still enrolled (all students minus those who graduated).

Code Example 3: Difference Operations

# Difference - elements in first set but not in second
all_students: set[str] = {"Alice", "Bob", "Charlie", "David", "Eve"}
graduated: set[str] = {"Bob", "David"}

# Using - operator
still_enrolled: set[str] = all_students - graduated
print(f"Still enrolled: {still_enrolled}") # {'Alice', 'Charlie', 'Eve'}

# Using .difference() method
still_enrolled_method: set[str] = all_students.difference(graduated)
print(f"Still enrolled (method): {still_enrolled_method}") # Same result

# IMPORTANT: Order matters! Difference is NOT commutative
print(f"all - graduated: {all_students - graduated}") # {'Alice', 'Charlie', 'Eve'}
print(f"graduated - all: {graduated - all_students}") # {} (empty - all graduated students ARE in all_students)

# Real-world example: Finding items to order
in_inventory: set[str] = {"Apples", "Bananas", "Oranges", "Grapes"}
already_ordered: set[str] = {"Apples", "Oranges"}
need_to_order: set[str] = in_inventory - already_ordered
print(f"Items to order: {need_to_order}") # {'Bananas', 'Grapes'}

Key observation: Notice that A - B is different from B - A. The order matters because we're asking "what's in the first but not the second?"

💬 AI Colearning Prompt

"Explain when you'd use difference instead of intersection. Give me three real scenarios where difference helps you filter data."

Your AI can help you build intuition about when each operation solves your problem.

✨ Teaching Tip

Use Claude Code to verify operations: "Show me what happens with A - B vs. B - A using these specific sets. Explain why they're different."


Symmetric Difference: Elements in Either Set But Not Both

Symmetric Difference finds elements that appear in either set but not in both. It answers: "What items are unique to one set or the other?"

Think of two groups working different shifts. Symmetric difference tells you who never works with anyone from the other shift.

Code Example 4: Symmetric Difference

# Symmetric difference - elements in either set but not both
morning_shift: set[str] = {"Alice", "Bob", "Charlie"}
afternoon_shift: set[str] = {"Charlie", "David", "Eve"}

# Using ^ operator (exclusive or)
never_overlap: set[str] = morning_shift ^ afternoon_shift
print(f"Never work together: {never_overlap}") # {'Alice', 'Bob', 'David', 'Eve'}

# Using .symmetric_difference() method
never_overlap_method: set[str] = morning_shift.symmetric_difference(afternoon_shift)
print(f"Never work together (method): {never_overlap_method}") # Same result

# Symmetric difference is commutative
print(f"Commutative: {morning_shift ^ afternoon_shift == afternoon_shift ^ morning_shift}") # True

# Mathematical relationship: A ^ B == (A - B) | (B - A)
# "Everything unique to each side"
print(f"Using - and |: {(morning_shift - afternoon_shift) | (afternoon_shift - morning_shift)}")
# Same result!

# Real-world example: Finding differences between product catalogs
catalog_a: set[str] = {"Widget A", "Widget B", "Widget C"}
catalog_b: set[str] = {"Widget B", "Widget C", "Widget D"}
unique_to_either: set[str] = catalog_a ^ catalog_b
print(f"Products unique to one catalog: {unique_to_either}") # {'Widget A', 'Widget D'}

Key observation: Symmetric difference is commutative—order doesn't matter. It's like XOR in logic: true if one or the other, but not both.

🎓 Instructor Commentary

Symmetric difference seems more complex than the others, but it answers a practical question: "What's different between these two datasets?" It's powerful for data reconciliation.


Set Comprehensions: Creating Filtered Sets

Just like list comprehensions, set comprehensions let you create new sets by filtering and transforming. They're a powerful pattern for creating derived sets.

Code Example 5: Set Comprehensions with Filtering

# Set comprehension - create new set with filtering
numbers: set[int] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

# Filter even numbers
even_numbers: set[int] = {n for n in numbers if n % 2 == 0}
print(f"Even numbers: {even_numbers}") # {2, 4, 6, 8, 10}

# Filter and transform: squares of odd numbers
squares_of_odd: set[int] = {n**2 for n in numbers if n % 2 == 1}
print(f"Squares of odd numbers: {squares_of_odd}") # {1, 9, 25, 49, 81}

# Filter strings by length
words: set[str] = {"apple", "pie", "banana", "cat", "dog", "elephant"}
long_words: set[str] = {word for word in words if len(word) > 4}
print(f"Words with 5+ letters: {long_words}") # {'apple', 'banana', 'elephant'}

# Create set from list with built-in filter
user_ages: list[int] = [25, 32, 19, 45, 32, 25, 19]
adults: set[int] = {age for age in user_ages if age >= 18}
print(f"Adult ages (unique): {adults}") # {19, 25, 32, 45}

Key advantage: Set comprehensions automatically eliminate duplicates. You get both filtering and deduplication in one expression.

🚀 CoLearning Challenge

Ask your AI Co-Teacher:

"Write a set comprehension that extracts all vowels from a long sentence. Then explain how set comprehensions differ from list comprehensions."

Expected Outcome: You'll understand set comprehensions as a filtering pattern and see why they're useful for deduplication.

✨ Teaching Tip

Remember: set comprehensions have the same syntax as list comprehensions—just curly braces instead of square brackets. Your AI can instantly show you the transformation.


Real-World Problem: Finding Common Interests

Let's combine everything we've learned to solve a realistic problem: matching users based on shared interests.

Code Example 6: Real-World Problem — Finding Common Interests

# Scenario: Match users with shared interests (friend recommendation system)
alice_interests: set[str] = {"hiking", "photography", "cooking", "reading"}
bob_interests: set[str] = {"cooking", "gaming", "reading", "music"}
charlie_interests: set[str] = {"hiking", "cooking", "yoga"}

# Question 1: What do Alice and Bob have in common?
alice_bob_common: set[str] = alice_interests & bob_interests
print(f"Alice & Bob share: {alice_bob_common}") # {'cooking', 'reading'}

# Question 2: What do all three share?
all_three_common: set[str] = alice_interests & bob_interests & charlie_interests
print(f"All three share: {all_three_common}") # {'cooking'}

# Question 3: What interests are unique to Alice?
alice_unique: set[str] = alice_interests - (bob_interests | charlie_interests)
print(f"Alice's unique interests: {alice_unique}") # {'photography'}

# Question 4: What interests are in Alice or Charlie but not Bob?
alice_or_charlie_not_bob: set[str] = (alice_interests | charlie_interests) - bob_interests
print(f"In Alice/Charlie but not Bob: {alice_or_charlie_not_bob}") # {'photography', 'yoga', 'hiking'}

# Question 5: Find all interests across all users
all_interests: set[str] = alice_interests | bob_interests | charlie_interests
print(f"All interests: {all_interests}") # All unique interests combined

# Question 6: What makes Alice and Bob different from each other?
alice_bob_different: set[str] = alice_interests ^ bob_interests
print(f"Interests unique to either Alice or Bob: {alice_bob_different}")
# {'photography', 'hiking', 'gaming', 'music'}

This demonstrates: How set operations combine to solve real problems. Notice how we're chaining operations with | (union) and - (difference) to answer complex questions about relationships.

💬 AI Colearning Prompt

"Design a friend recommendation feature: given one user's interests and all users' interests, find the top 3 users with the most common interests. How would you use set operations?"

Your AI can help you think through the algorithm using intersection and cardinality.


Practice: Master Set Operations

Try these exercises to build confidence with all four operations:

Exercise 1: Intersection Practice

You have two lists of students:

  • Morning class: ["Alice", "Bob", "Charlie", "David"]
  • Afternoon class: ["Bob", "David", "Eve", "Frank"]

Convert to sets and find students in both classes using intersection.

Exercise 2: Union + Difference Chaining

You have inventory from three warehouses:

  • Warehouse A: {"Widget1", "Widget2", "Widget3"}
  • Warehouse B: {"Widget2", "Widget4", "Widget5"}
  • Warehouse C: {"Widget1", "Widget5", "Widget6"}

Find all unique widgets available across all warehouses (union), then find widgets that are in only one warehouse.

Exercise 3: Set Comprehension Challenge

Create a set of all odd numbers from 1 to 50 that are also greater than 20 using a set comprehension.

Exercise 4: Real-World Inventory Comparison

Two stores have product catalogs:

  • Store A: ["Apple", "Banana", "Orange", "Grape"]
  • Store B: ["Banana", "Orange", "Mango", "Pineapple"]

Find:

  1. Products both stores carry (intersection)
  2. All unique products across both stores (union)
  3. Products only Store A carries (difference)

🚀 CoLearning Challenge

Ask your AI Co-Teacher:

"Create a more complex set operations scenario: three datasets representing browser history, bookmarks, and recently visited sites. Use intersection and union to find sites that appear in all three categories, and sites that are unique to one category."

Expected Outcome: You'll practice combining multiple operations and see how sets elegantly solve data comparison problems.


Try With AI

Setup

Use your preferred AI companion (ChatGPT web, Claude Code, or Gemini CLI if you've set one up from previous lessons).

Prompt Set (Bloom's Progression)

Prompt 1 - Concept Visualization (Understand Level):

"Create a Venn diagram explanation of union, intersection, difference, and symmetric difference. Use overlapping circles and shade the regions. For each operation, give me a real-world example."

Expected Outcome: You visualize how the operations work mathematically and see practical business scenarios for each. Venn diagrams make abstract set theory concrete.


Prompt 2 - Operator vs. Method Comparison (Apply Level):

"I can do set_a | set_b or set_a.union(set_b). Show me both approaches for union, intersection, and difference. Are these exactly the same? When would I choose one over the other?"

Expected Outcome: You understand both syntaxes, learn the trade-off (operators are concise, methods are explicit), and make informed choices based on readability.


Prompt 3 - Symmetric Difference Deep Dive (Analyze Level):

"I'm confused about symmetric difference. Show me a practical example where I'd use A ^ B vs. (A - B) | (B - A). How are these related? When would symmetric difference be the clearest choice?"

Expected Outcome: You grasp the "either but not both" concept deeply and understand when symmetric difference is the most elegant solution to a problem.


Prompt 4 - Set Comprehension Challenge (Evaluate Level):

"Write a set comprehension that filters strings longer than 4 characters from a list, then converts them to uppercase. Explain the syntax step-by-step. Then show me how this differs from the list comprehension version."

Expected Outcome: You master set comprehension syntax with transformation, understand the automatic deduplication benefit, and see why it's different from list comprehensions.

Safety Note: "Set comprehensions are safe and powerful. The automatic deduplication is a feature, not a limitation. Use them whenever you want filtered, unique results."


Closing Reflection:

You now understand all four mathematical set operations—union, intersection, difference, and symmetric difference. You can combine them to solve complex data problems. You can write set comprehensions to create filtered, deduplicated collections. Most importantly, you understand when each operation answers your question: "What items do these collections have in common?" (intersection), "What items are in any of them?" (union), "What's missing from one?" (difference), or "What's different between them?" (symmetric difference).

In the next lesson, you'll dive deeper into why sets are fast—exploring the hashing mechanism that makes O(1) lookup possible and understanding the trade-offs between sets and other collection types.