Preface

This post shares a Java code example for deduplicating the objects in a list, using several fields together as the unique identifier.

1. Wrapping it in a utility class

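import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ListUtils {

    /**
     * Returns the elements of {@code list} whose combination of extracted key
     * values has not appeared earlier in the list (the first occurrence is kept).
     */
    @SafeVarargs
    public static <T> List<T> distinctList(List<T> list, Function<? super T, ?>... keyExtractors) {
        return list.stream()
                .filter(distinctByKeys(keyExtractors))
                .collect(Collectors.toList());
    }

    // Builds a stateful predicate that accepts an element only the first time its key combination shows up.
    @SafeVarargs
    private static <T> Predicate<T> distinctByKeys(Function<? super T, ?>... keyExtractors) {
        // Composite keys seen so far; List.equals/hashCode compare element by element.
        final Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();
        return t -> {
            // One list entry per extractor, in the order the extractors were passed.
            final List<?> keys = Arrays.stream(keyExtractors)
                    .map(ke -> ke.apply(t))
                    .collect(Collectors.toList());
            // putIfAbsent returns null only when this key combination was not present yet.
            return seen.putIfAbsent(keys, Boolean.TRUE) == null;
        };
    }
}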
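Why a list of extracted values works as a composite key: two Java lists are equal when they hold equal elements in the same order, and their hash codes follow the same rule, so the `seen` map treats two objects with the same field values as the same key. The sketch below is only an illustration of that point; `CompositeKeyDemo`, `keyOf`, and the `String[]` rows are hypothetical names that do not appear in the utility above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CompositeKeyDemo {

    // Hypothetical helper: builds the composite key for one item, one entry per extractor.
    @SafeVarargs
    public static <T> List<Object> keyOf(T item, Function<? super T, ?>... keyExtractors) {
        return Arrays.stream(keyExtractors)
                .<Object>map(ke -> ke.apply(item))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Each row is {database, table, sql}; only the first two columns form the key.
        String[] a = {"db1", "orders", "SELECT 1"};
        String[] b = {"db1", "orders", "SELECT 2"};
        String[] c = {"db1", "users", "SELECT 1"};

        Function<String[], Object> db = row -> row[0];
        Function<String[], Object> table = row -> row[1];

        // Same database and table, different SQL: the keys are equal.
        System.out.println(keyOf(a, db, table).equals(keyOf(b, db, table))); // true
        // Different table: the keys differ.
        System.out.println(keyOf(a, db, table).equals(keyOf(c, db, table))); // false
    }
}
```

The order of the extractors matters, because list equality is positional: a key built from (database, table) is never equal to one built from (table, database) unless both values happen to be identical.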

1.1 Test example


    private List<DataLineage> list;

    // Builds 20 records that share one database name and get a random table name and SQL text.
    @Before
    public void init() {
        List<DataLineage> ll = Lists.newArrayList();
        for (int i = 0; i < 20; i++) {
            DataLineage p1 = new DataLineage();
            p1.setId(Long.valueOf(i));
            // Constant on purpose: every record ends up with "BaseName1".
            p1.setSourceDataBaseName("BaseName" + 1);
            p1.setSourceTableName("TableName" + RandomUtil.randomInt(5));
            p1.setSqlQuery("SqlQuery" + RandomUtil.randomInt(5));
            ll.add(p1);
        }
        list = ll;
    }

    // Deduplicates on the (sourceDataBaseName, sourceTableName) pair and prints what remains.
    @Test
    public void t1() {
        List<DataLineage> distinctDataLineageList = ListUtils.distinctList(list,
                DataLineage::getSourceDataBaseName,
                DataLineage::getSourceTableName);
        distinctDataLineageList.forEach(System.out::println);
    }

Printed output

DataLineage(id=0, sourceDataBaseName=BaseName1, sourceTableName=TableName1, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery2)
DataLineage(id=4, sourceDataBaseName=BaseName1, sourceTableName=TableName4, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery2)
DataLineage(id=6, sourceDataBaseName=BaseName1, sourceTableName=TableName3, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery4)
DataLineage(id=12, sourceDataBaseName=BaseName1, sourceTableName=TableName2, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery4)
DataLineage(id=19, sourceDataBaseName=BaseName1, sourceTableName=TableName0, sourceFieldName=null, sourceFieldType=null, sourceFieldComment=null, targetDataBaseName=null, targetTableName=null, targetFieldName=null, sqlQuery=SqlQuery0)

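This confirms that SourceDataBaseName and SourceTableName together serve as the unique key: the output is the list left after filtering on that key, with one record per distinct combination. Because the table names are generated randomly, the exact ids printed will differ from run to run.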
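Since the test data above is random, a deterministic run makes it easier to see which record survives for each key. The following is a minimal, self-contained sketch under a few assumptions: `Row` is a hypothetical stand-in for DataLineage that carries only three fields, `DistinctDemo` and `demoIds` are illustrative names, and the set of seen keys is a `ConcurrentHashMap.newKeySet()` rather than the `Map` with `putIfAbsent` used in the utility class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctDemo {

    // Hypothetical stand-in for DataLineage with only the fields this demo needs.
    public static final class Row {
        private final long id;
        private final String db;
        private final String table;

        Row(long id, String db, String table) {
            this.id = id;
            this.db = db;
            this.table = table;
        }

        long getId() { return id; }
        String getDb() { return db; }
        String getTable() { return table; }
    }

    // Compact variant of the multi-key filter: Set.add returns false for a key seen before.
    @SafeVarargs
    public static <T> List<T> distinctList(List<T> list, Function<? super T, ?>... keyExtractors) {
        Set<List<Object>> seen = ConcurrentHashMap.newKeySet();
        return list.stream()
                .filter(t -> seen.add(Arrays.stream(keyExtractors)
                        .<Object>map(ke -> ke.apply(t))
                        .collect(Collectors.toList())))
                .collect(Collectors.toList());
    }

    // Returns the ids that survive deduplication on (db, table) for a fixed input.
    public static List<Long> demoIds() {
        List<Row> rows = Arrays.asList(
                new Row(0, "BaseName1", "TableName1"),
                new Row(1, "BaseName1", "TableName1"),
                new Row(2, "BaseName1", "TableName4"),
                new Row(3, "BaseName1", "TableName1"),
                new Row(4, "BaseName1", "TableName4"));
        return distinctList(rows, Row::getDb, Row::getTable).stream()
                .map(Row::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demoIds()); // prints [0, 2]
    }
}
```

With a sequential stream the elements are tested in list order, so the first record carrying each key combination is the one that is kept. On a parallel stream that guarantee no longer holds, which is worth keeping in mind before reusing the filter there.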