Leetcode 609: Find Duplicate File in System

grid47
Exploring patterns and algorithms
Sep 7, 2024 5 min read

Solution to LeetCode 609: Find Duplicate File in System Problem

Given a list of directory paths, each containing file names and their content, return the groups of duplicate files that have the same content.
Input: Each string holds a directory path followed by one or more space-separated files, each written in the format 'filename(content)'.
Example: paths = ["root/x 1.txt(abc) 2.txt(def)", "root/y 3.txt(abc)", "root/y/z 4.txt(def)", "root 5.txt(ghj)"]
Constraints:
• 1 <= paths.length <= 2 * 10^4
• 1 <= paths[i].length <= 3000
• 1 <= sum(paths[i].length) <= 5 * 10^5
• paths[i] consist of English letters, digits, '/', '.', '(', ')', and ' '.
• No two files in the same directory have the same name.
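The input format can be illustrated with a minimal parsing sketch. The helper name parse_entry is hypothetical (it does not come from the problem statement), and the sketch assumes that file names and contents contain no spaces or parentheses, so splitting on a space and on the first '(' is safe.

```python
def parse_entry(entry: str) -> list[tuple[str, str]]:
    """Split one input string into (full_path, content) pairs.

    Assumption: names and contents contain no spaces or parentheses.
    """
    # The first token is the directory; every later token is "name(content)".
    directory, *files = entry.split(" ")
    pairs = []
    for item in files:
        # "1.txt(abc)" -> name "1.txt", rest "abc)"
        name, _, rest = item.partition("(")
        content = rest[:-1]  # drop the trailing ")"
        pairs.append((f"{directory}/{name}", content))
    return pairs


print(parse_entry("root/x 1.txt(abc) 2.txt(def)"))
```

For the first entry of the example, this yields the pairs ("root/x/1.txt", "abc") and ("root/x/2.txt", "def").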
Output: Return a list of groups of file paths that have identical content.
Example: [["root/x/1.txt", "root/y/3.txt"], ["root/x/2.txt", "root/y/z/4.txt"]]
Constraints:
• Each group contains at least two file paths whose content is identical; files with unique content are omitted.
• The groups, and the paths within each group, may be returned in any order.
Goal: Identify and group files that share identical content.
Steps:
• Parse each directory info to extract file paths and content.
• Use a map to store file contents as keys and their corresponding file paths as values.
• Return every list of paths that holds more than one entry; each such list is one group of duplicates.
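The steps above can be sketched in Python as follows. This is one possible implementation, not necessarily the one the original solution uses; the function name find_duplicate is illustrative, and the parsing assumes that file names and contents contain no spaces or parentheses.

```python
from collections import defaultdict


def find_duplicate(paths: list[str]) -> list[list[str]]:
    # Map each file content to the full paths of the files holding it.
    by_content: dict[str, list[str]] = defaultdict(list)
    for entry in paths:
        # The first token is the directory; the rest are "name(content)".
        directory, *files = entry.split(" ")
        for item in files:
            name, _, rest = item.partition("(")
            content = rest[:-1]  # drop the trailing ")"
            by_content[content].append(f"{directory}/{name}")
    # Only contents shared by two or more files count as duplicates.
    return [group for group in by_content.values() if len(group) > 1]


example = [
    "root/x 1.txt(abc) 2.txt(def)",
    "root/y 3.txt(abc)",
    "root/y/z 4.txt(def)",
    "root 5.txt(ghj)",
]
print(find_duplicate(example))
```

Each character of the input is visited a constant number of times, so the running time and the extra space are both linear in the total length of the input strings.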
Constraints: The directory structure is valid and files have unique names within the same directory.
• No two files in the same directory share the same name.
• The input contains at most 2 * 10^4 directory entries.
Assumptions:
• File contents are represented as strings enclosed in parentheses.
• File names are unique within a given directory.
Input: paths = ["root/x 1.txt(abc) 2.txt(def)", "root/y 3.txt(abc)", "root/y/z 4.txt(def)", "root 5.txt(ghj)"]
Explanation: File '1.txt' in 'root/x' and file '3.txt' in 'root/y' both have the content 'abc', so they are grouped together. Likewise, '2.txt' in 'root/x' and '4.txt' in 'root/y/z' share the content 'def' and form a second group. File '5.txt' in 'root' is the only file with the content 'ghj', so it does not appear in the output.
