开发者社区> 问答> 正文

Java数组,查找重复项

我有一个数组,正在寻找重复项。

duplicates = false; for(j = 0; j < zipcodeList.length; j++){ for(k = 0; k < zipcodeList.length; k++){ if (zipcodeList[k] == zipcodeList[j]){ duplicates = true; } } } 但是,当没有重复项时,此代码不起作用。为什么? 问题来源于stack overflow

展开
收起
保持可爱mmm 2020-02-09 14:09:43 606 0
1 条回答
写回答
取消 提交回答
  • 在鼻子上回答。 duplicates=false; for (j=0;j<zipcodeList.length;j++) for (k=j+1;k<zipcodeList.length;k++) if (k!=j && zipcodeList[k] == zipcodeList[j]) duplicates=true; 编辑后切换.equals()回原来的位置,==因为我在您正在使用的地方阅读int过,最初的问题尚不清楚。还要设置k=j+1,以将执行时间减半,但仍为O(n 2)。

    一种更快的方式 这是一种基于哈希的方法。您需要为自动装箱付款,但是它是O(n)而不是O(n 2)。一个进取的人会去寻找一个基于int的原始哈希集(Apache或Google Collections有这样的东西,methinks。)

    boolean duplicates(final int[] zipcodelist) { Set lump = new HashSet (); for (int i : zipcodelist) { if (lump.contains(i)) return true; lump.add(i); } return false; } 向HuyLe鞠躬 请参阅HuyLe的答案以了解或多或少的O(n)解决方案,我认为这需要几个附加步骤:

    static boolean duplicates(final int[] zipcodelist) { final int MAXZIP = 99999; boolean[] bitmap = new boolean[MAXZIP+1]; java.util.Arrays.fill(bitmap, false); for (int item : zipcodeList) if (!bitmap[item]) bitmap[item] = true; else return true; } return false; } 或者只是为了紧凑 static boolean duplicates(final int[] zipcodelist) { final int MAXZIP = 99999; boolean[] bitmap = new boolean[MAXZIP+1]; // Java guarantees init to false for (int item : zipcodeList) if (!(bitmap[item] ^= true)) return true; return false; } 有关系吗? 好吧,所以我运行了一个基准测试,到处都比较麻烦,但这是代码:

    import java.util.BitSet;

    class Yuk { static boolean duplicatesZero(final int[] zipcodelist) { boolean duplicates=false; for (int j=0;j<zipcodelist.length;j++) for (int k=j+1;k<zipcodelist.length;k++) if (k!=j && zipcodelist[k] == zipcodelist[j]) duplicates=true;

    return duplicates;
    

    }

    static boolean duplicatesOne(final int[] zipcodelist) { final int MAXZIP = 99999; boolean[] bitmap = new boolean[MAXZIP + 1]; java.util.Arrays.fill(bitmap, false); for (int item : zipcodelist) { if (!(bitmap[item] ^= true)) return true; } return false; }

    static boolean duplicatesTwo(final int[] zipcodelist) { final int MAXZIP = 99999;

    BitSet b = new BitSet(MAXZIP + 1);
    b.set(0, MAXZIP, false);
    for (int item : zipcodelist) {
      if (!b.get(item)) {
        b.set(item, true);
      } else
        return true;
    }
    return false;
    

    }

    enum ApproachT { NSQUARED, HASHSET, BITSET};

    /** * @param args */ public static void main(String[] args) { ApproachT approach = ApproachT.BITSET;

    final int REPS = 100;
    final int MAXZIP = 99999;
    
    int[] sizes = new int[] { 10, 1000, 10000, 100000, 1000000 };
    long[][] times = new long[sizes.length][REPS];
    
    boolean tossme = false;
    
    for (int sizei = 0; sizei < sizes.length; sizei++) {
      System.err.println("Trial for zipcodelist size= "+sizes[sizei]);
      for (int rep = 0; rep < REPS; rep++) {
        int[] zipcodelist = new int[sizes[sizei]];
        for (int i = 0; i < zipcodelist.length; i++) {
          zipcodelist[i] = (int) (Math.random() * (MAXZIP + 1));
        }
        long begin = System.currentTimeMillis();
        switch (approach) {
        case NSQUARED :
          tossme ^= (duplicatesZero(zipcodelist));
          break;
        case HASHSET :
          tossme ^= (duplicatesOne(zipcodelist));
          break;
        case BITSET :
          tossme ^= (duplicatesTwo(zipcodelist));
          break;
    
        }
        long end = System.currentTimeMillis();
        times[sizei][rep] = end - begin;
    
    
      }
      long avg = 0;
      for (int rep = 0; rep < REPS; rep++) {
        avg += times[sizei][rep];
      }
      System.err.println("Size=" + sizes[sizei] + ", avg time = "
            + avg / (double)REPS + "ms");
    }
    

    }

    } 使用NSQUARED:

    Trial for size= 10 Size=10, avg time = 0.0ms Trial for size= 1000 Size=1000, avg time = 0.0ms Trial for size= 10000 Size=10000, avg time = 100.0ms Trial for size= 100000 Size=100000, avg time = 9923.3ms 与HashSet

    Trial for zipcodelist size= 10 Size=10, avg time = 0.16ms Trial for zipcodelist size= 1000 Size=1000, avg time = 0.15ms Trial for zipcodelist size= 10000 Size=10000, avg time = 0.0ms Trial for zipcodelist size= 100000 Size=100000, avg time = 0.16ms Trial for zipcodelist size= 1000000 Size=1000000, avg time = 0.0ms 使用BitSet

    Trial for zipcodelist size= 10 Size=10, avg time = 0.0ms Trial for zipcodelist size= 1000 Size=1000, avg time = 0.0ms Trial for zipcodelist size= 10000 Size=10000, avg time = 0.0ms Trial for zipcodelist size= 100000 Size=100000, avg time = 0.0ms Trial for zipcodelist size= 1000000 Size=1000000, avg time = 0.0ms BITSET赢了! 但是只有一个头发.... 15ms的误差在内currentTimeMillis(),我的基准测试中有一些空白。请注意,对于任何超过100000的列表,您都可以简单地返回,true因为会有重复项。实际上,如果列表是随机的,则可以为更短的列表返回true WHP。道德是什么?在极限情况下,最有效的实现是:

    return true; 而且您不会经常犯错。

    2020-02-09 14:09:57
    赞同 展开评论 打赏
问答排行榜
最热
最新

相关电子书

更多
Spring Cloud Alibaba - 重新定义 Java Cloud-Native 立即下载
The Reactive Cloud Native Arch 立即下载
JAVA开发手册1.5.0 立即下载