Stack Overflow Asked on December 22, 2021
I want to randomly select an element from each list in a Series of lists.
import pandas as pd
import numpy as np
l=[['a','b','c'],['d','e','f'],['g','h','i'],['j','k','l'],['m','n','o']]
s = pd.Series(l)
So s is:
0 [a, b, c]
1 [d, e, f]
2 [g, h, i]
3 [j, k, l]
4 [m, n, o]
dtype: object
I know I can do the following:
s = pd.Series([np.random.choice(i) for i in s])
Which does work:
0 a
1 e
2 h
3 j
4 m
dtype: object
But I am wondering if there is a non-loop approach to do this?
For instance (assuming each list is of equal size), you could make an array of random indices to try to pick a different element from each list:
i = np.random.randint(3, size=len(l))
#array([2, 2, 0, 1, 0])
But doing, say, s[i] doesn't work because that indexes s itself rather than each list:
2 [g, h, i]
2 [g, h, i]
0 [a, b, c]
1 [d, e, f]
0 [a, b, c]
dtype: object
My motivation is to have something that works on a large number of lists, hence the avoidance of a loop. But if my list comprehension seems like the most reasonable approach, or there is no builtin pandas/numpy function for this, please tell me.
This is the only way I can think of, though performance may be a problem:
np.array(s.tolist())[np.arange(len(s)), np.random.randint(3, size=len(s))]
array(['c', 'e', 'i', 'k', 'n'], dtype='<U1')
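As a rough sketch of the same idea wrapped back into a Series: the version below uses NumPy's newer Generator API instead of np.random.randint, which is a swap on my part, and the seed is an arbitrary choice for repeatability. It assumes every list has the same length, since np.array(s.tolist()) only yields a 2-D array in that case.

```python
import numpy as np
import pandas as pd

s = pd.Series([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i'],
               ['j', 'k', 'l'], ['m', 'n', 'o']])

# Assumption: a seeded Generator, used here only for repeatability.
rng = np.random.default_rng(0)

# Stack the lists into a 2-D array; this needs equal-length lists.
arr = np.array(s.tolist())

# Draw one random column index per row, then pair it with the row index.
cols = rng.integers(arr.shape[1], size=len(arr))
picked = pd.Series(arr[np.arange(len(arr)), cols], index=s.index)

print(picked)
```

Pairing np.arange(len(arr)) with cols selects exactly one element per row, which is what s[i] in the question could not do.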
Some timings:
%timeit s.explode().sample(frac=1, random_state=1)
5.05 ms ± 294 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit pd.Series([np.random.choice(i) for i in s])
23.1 ms ± 184 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit np.array(s.tolist())[np.arange(len(s)), np.random.randint(3, size=len(s))]
1.63 ms ± 50.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
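To rerun this comparison outside IPython, something like the sketch below may help. The Series size n and the integer test data are my own assumptions, not the data behind the numbers above, so the absolute timings will differ. Note also that the first %timeit line above covers only explode and sample, while this sketch times the full chain including groupby and head.

```python
import timeit

import numpy as np
import pandas as pd

n = 2_000  # hypothetical size, chosen only to keep the run short
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 100, size=(n, 3)).tolist())  # n lists of length 3

candidates = {
    "list comprehension": lambda: pd.Series([np.random.choice(i) for i in s]),
    "explode + sample + head": lambda: (
        s.explode().sample(frac=1).groupby(level=0).head(1)
    ),
    "numpy indexing": lambda: np.array(s.tolist())[
        np.arange(len(s)), np.random.randint(3, size=len(s))
    ],
}

# Best of three single runs per approach, in seconds.
timings = {
    name: min(timeit.repeat(fn, number=1, repeat=3))
    for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms")
```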
Answered by BENY on December 22, 2021
You could try explode: shuffle the exploded series with sample, then take the first row of each group. This doesn't even require that the lists have the same length.
(s.explode()
.sample(frac=1, random_state=1) # random_state added for repeatability, drop if needed
.groupby(level=0).head(1)
)
Output:
1 d
2 h
0 c
3 k
4 n
dtype: object
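The output above comes back in shuffled order. A small variation, sketched here on a made-up Series of unequal-length lists, adds sort_index() to restore the original row order; the ragged data and that extra step are my additions, not part of the answer.

```python
import pandas as pd

# Hypothetical input with lists of different lengths.
ragged = pd.Series([['a', 'b', 'c'], ['d'], ['e', 'f'], ['g', 'h', 'i', 'j']])

picked = (ragged.explode()                        # one row per list element
          .sample(frac=1, random_state=1)         # shuffle all rows
          .groupby(level=0).head(1)               # first shuffled row per original index
          .sort_index())                          # back to the original order

print(picked)
```

Because each element keeps its original index label after explode, grouping on level 0 yields exactly one pick per source list.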
Answered by Quang Hoang on December 22, 2021