题解 | #不重复数字#

题目链接

不重复数字

题目描述

给定一个长度为 $n$ 的整数序列 $A$ 。你需要构建一个新的序列 $B$ ，规则如下：

遍历序列 $A$ 中的每个元素，从 $a_1$ 到 $a_n$ 。
对于当前元素，如果它在 $B$ 中尚未出现，则将其追加到 $B$ 的末尾。
如果已经出现过，则忽略。求最终得到的序列 $B$ 。

输入描述:

第一行包含一个整数 $t$ （ $1 \le t \le 100$ ），表示测试数据的组数。
接下来 $t$ 组数据，每组数据包含两行：
- 第一行是一个整数 $n$ （ $1 \le n \le 50000$ ）。
- 第二行是 $n$ 个整数，表示序列 $A$ 的元素。

输出描述:

对每组测试数据，输出一行，即最终的序列 $B$ 。元素之间用一个空格分隔。

示例输入:

3
6
1 2 1 3 2 4
5
5 5 5 5 5
4
10 20 10 20

输出:

1 2 3 4
5
10 20

解题思路

这道题要求我们对一个序列进行去重，同时要保持元素首次出现的相对顺序。这是一个经典的数据处理问题，通常结合使用哈希集合和列表来解决。

核心数据结构：
- 哈希集合 (Set): 用于高效地记录哪些数字已经出现过。它的优势在于近乎 $\mathcal{O}(1)$ 的插入和查找时间复杂度。
- 列表 (List/Vector): 用于按顺序存储最终结果。
算法流程:
- 本题包含多组测试数据，所以最外层需要一个循环来处理每一组。
- 对于每一组测试数据： a. 初始化一个空的哈希集合 seen 和一个空的结果列表 result_list。 b. 遍历输入的序列 $A$ 中的每一个数字 num。 c. 对于每个 num，检查它是否存在于 seen 集合中。 d. 如果 num 不在 seen 中，这表示它是第一次出现。此时，我们将 num 添加到 seen 集合中，并将其追加到 result_list 的末尾。 e. 如果 num 已经在 seen 中，我们什么也不做，继续处理下一个数字。
- 当该组测试数据的所有数字都处理完毕后，result_list 中就存储了按首次出现顺序排列的唯一元素。我们将这个列表格式化输出即可。

这个方法的时间复杂度由对输入序列的单次遍历主导，非常高效。

代码

cpp
java
python

#include <iostream>
#include <vector>
#include <set>

using namespace std;

void solve() {
    int n;
    cin >> n;
    set<int> seen;
    vector<int> result;
    for (int i = 0; i < n; ++i) {
        int num;
        cin >> num;
        if (seen.find(num) == seen.end()) {
            seen.insert(num);
            result.push_back(num);
        }
    }
    
    for (size_t i = 0; i < result.size(); ++i) {
        cout << result[i] << (i == result.size() - 1 ? "" : " ");
    }
    cout << endl;
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int t;
    cin >> t;
    while (t--) {
        solve();
    }
    return 0;
}

import java.util.Scanner;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int t = sc.nextInt();
        while (t-- > 0) {
            int n = sc.nextInt();
            Set<Integer> seen = new HashSet<>();
            List<Integer> result = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                int num = sc.nextInt();
                if (seen.add(num)) { // set.add() returns true if the element was new
                    result.add(num);
                }
            }

            for (int i = 0; i < result.size(); i++) {
                System.out.print(result.get(i) + (i == result.size() - 1 ? "" : " "));
            }
            System.out.println();
        }
        sc.close();
    }
}

def solve():
    n = int(input())
    nums = map(int, input().split())
    
    seen = set()
    result = []
    
    for num in nums:
        if num not in seen:
            seen.add(num)
            result.append(num)
            
    print(*result)

def main():
    t = int(input())
    for _ in range(t):
        solve()

if __name__ == "__main__":
    main()

算法及复杂度

算法: 哈希集合/有序集合 + 列表/动态数组
时间复杂度:
- C++: 对于每组测试数据，时间复杂度为 $\mathcal{O}(N \log K)$ 。其中 N 是输入序列的长度，K 是唯一元素的数量。这是因为 std::set 的操作是基于平衡二叉树的，每次操作为对数时间。
- Java/Python: 对于每组测试数据，时间复杂度为 $\mathcal{O}(N)$ 。HashSet/set 的操作平均时间为 $\mathcal{O}(1)$ 。
空间复杂度: $\mathcal{O}(K)$ ，其中 K 是唯一元素的数量。用于存储 seen 集合和结果列表。