grid47 Exploring patterns and algorithms
Sep 24, 2024
7 min read
Solution to LeetCode 433: Minimum Genetic Mutation Problem
You are given two gene strings, startGene and endGene, both consisting of 8 characters. You also have a gene bank of valid gene strings that can be mutated into one another. A mutation is defined as changing one character at a time, and the mutated gene string must exist in the gene bank. Your task is to determine the minimum number of mutations needed to transform startGene into endGene. If transformation is impossible, return -1.
Problem
Approach
Steps
Complexity
Input: You are given three inputs: `startGene` (an 8-character gene string), `endGene` (an 8-character gene string), and a `bank` (a list of valid gene strings).
Example: startGene = 'AACCGGTT', endGene = 'AACCGGTA', bank = ['AACCGGTA']
• Explanation: The solution involves two mutations: first from 'AACCGGTT' to 'AACCGGTA', and then from 'AACCGGTA' to 'AAACGGTA'.
Approach: We will use a breadth-first search (BFS) approach to find the shortest path from `startGene` to `endGene` by exploring all possible valid mutations in the gene bank.
Observations:
• This is a shortest path problem, and BFS is ideal because it explores all nodes at the present depth level before moving to nodes at the next depth level.
• We need to generate all possible gene strings from the current string by changing one character at a time, and check if the mutated string is in the gene bank.
Steps:
• 1. Initialize a queue for BFS with `startGene`.
• 2. Use a set for fast look-up of valid mutations from the bank.
• 3. For each gene string, generate all possible mutations and push valid ones to the queue.
• 4. Track the number of mutations and stop when `endGene` is found.
Empty Inputs:
• If the gene bank is empty, return -1 unless `startGene` is equal to `endGene`.
Large Inputs:
• Ensure that the BFS approach handles up to 10 valid mutations efficiently.
Special Values:
• If the `startGene` is equal to `endGene`, return 0 because no mutation is needed.
Constraints:
• Handle cases where no valid mutations exist or where the gene bank is empty.
Initializes a set with the bank to allow O(1) lookups for gene sequences.
3 : Base Condition
if(!st.count(endGene)) return-1;
Checks if the end gene is in the bank; if not, returns -1 as transformation is impossible.
4 : Queue Initialization
queue<string> q;
Declares a queue for BFS traversal of gene mutations.
5 : Enqueue
q.push(startGene);
Adds the starting gene to the BFS queue.
6 : Variable Initialization
int step =0, s;
Initializes variables for BFS steps and queue size.
7 : Variable Declaration
string cur, t;
Declares strings to hold the current gene and temporary mutations.
8 : While Loop
while(!q.empty()) {
Iterates as long as the queue is not empty.
9 : Queue Size
s = q.size();
Stores the current queue size for level-wise processing.
10 : For Each Level
while(s--) {
Processes all nodes at the current level in the BFS.
11 : Front Element
cur = q.front();
Gets the current gene at the front of the queue.
12 : Dequeue
q.pop();
Removes the processed gene from the queue.
13 : Check Goal
if(cur == endGene) return step;
Returns the number of steps if the current gene matches the end gene.
14 : Set Erase
st.erase(cur);
Removes the current gene from the set to avoid revisiting.
15 : Loop
for(int i=0; i <8; i++) {
Iterates through each character of the current gene to mutate.
16 : Copy String
t = cur;
Creates a copy of the current gene for mutation.
17 : Character Mutation
t[i] ='A';
Mutates the ith character of the gene to 'A'.
18 : Conditional Insertions
if(st.count(t)) q.push(t);
Checks if the mutated gene is in the set and enqueues it.
19 : Character Mutation
t[i] ='T';
Mutates the ith character of the gene to 'T'.
20 : Conditional Insertions
if(st.count(t)) q.push(t);
Checks if the mutated gene is in the set and enqueues it.
21 : Character Mutation
t[i] ='G';
Mutates the ith character of the gene to 'G'.
22 : Conditional Insertions
if(st.count(t)) q.push(t);
Checks if the mutated gene is in the set and enqueues it.
23 : Character Mutation
t[i] ='C';
Mutates the ith character of the gene to 'C'.
24 : Conditional Insertions
if(st.count(t)) q.push(t);
Checks if the mutated gene is in the set and enqueues it.
25 : Increment Step
step++;
Increments the step counter for the next level of BFS.
26 : Return Result
return-1;
Returns -1 if the end gene is not reachable from the start gene.
Best Case: O(1), if startGene is equal to endGene.
Average Case: O(N * 8), where N is the number of gene strings in the bank and each mutation takes O(8) time (since the length of each gene string is 8).
Worst Case: O(N * 8), where N is the number of possible gene mutations.
Description: BFS explores each possible mutation in the gene bank.
Best Case: O(1), if no mutations are required.
Worst Case: O(N), where N is the number of gene strings stored in the queue and the bank.
Description: The space complexity depends on the number of nodes in the BFS queue.