Code Review Asked by user2652379 on December 30, 2020
I’m coding a random strings generator. It’s working, but I’m not sure if this is efficient.
/**
* @param n total number of strings to generate
* @param length length of generated strings
* @param charsetStr charset to be used when generate
*/
public static List<String> generateRandomStrings(int n, int length, @NonNull String charsetStr) {
if (n < 1 || length < 1 || charsetStr.length() < 1) {
throw new IllegalArgumentException(String.format("Illegal argument(s): %d, %d, %s", n, length, charsetStr));
}
//remove duplicate chars
Set<Character> charset = new HashSet<>(charsetStr.length());
StringBuilder sb = new StringBuilder();
for (int i = 0; i < charsetStr.length(); i++) {
char c = charsetStr.charAt(i);
if (charset.add(c)) {
sb.append(c);
}
}
charsetStr = sb.toString();
BigDecimal caseNum = BigDecimal.valueOf(charsetStr.length()).pow(length);
BigDecimal nBd = BigDecimal.valueOf(n);
if (caseNum.compareTo(nBd) < 0) {
throw new IllegalArgumentException(
String.format("Number of possible strings cannot exceed the requested size: (length of '%s') ^ %d < %d", charsetStr, length, n));
}
if (nBd.divide(caseNum, 1, RoundingMode.HALF_UP).compareTo(BigDecimal.valueOf(5, 1)) < 0) {
//when fill factor is below 50%
//generate until target size is reached
Set<String> results = new HashSet<>(n);
while (results.size() < n) {
results.add(RandomStringUtils.random(length, charsetStr));
}
return new ArrayList<>(results);
}
// when fill factor is above 50%
// pre-generate all possible strings
List<String> results = new ArrayList<>(caseNum.intValue());
generateAllPossibleStrings(results, "", charsetStr, length);
// then shuffle the collection and pick target sized sublist
Collections.shuffle(results);
return results.subList(0, n);
}
public static void generateAllPossibleStrings(@NonNull List<String> results, @NonNull String prefix, @NonNull String charset, int length) {
if (prefix.length() >= length) {
results.add(prefix);
return;
}
for (int i = 0; i < charset.length(); i++) {
generateAllPossibleStrings(results, prefix + charset.charAt(i), charset, length);
}
}
Any advice would be grateful, but my main focus here is the collisions. I’m using org.apache.commons.lang3.RandomStringUtils
to generate strings if n
is less than 50% of the number of all possible strings(let’s call it "fill factor"). If not, to avoid unnecessary collisions, I generate all possible strings using generateAllPossibleStrings
, then shuffle the result and get a sublist from it.
The reasons I used 2 methods for generating strings are:
generateAllPossibleStrings
seemed overkill. So I used org.apache.commons.lang3.RandomStringUtils
, hoping collisions not matter much.org.apache.commons.lang3.RandomStringUtils
seems inefficient since a lot of collisions will occur as time goes by. So I used generateAllPossibleStrings
in this case, dropping without any collision looks more efficient than generating.There is no big reason why I choose 50% as a diverging point; I just used my hunch. This too may need to be fixed.
How can I make this more time-efficient?
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP