Stack Overflow Asked by Commander on December 5, 2021
I am trying to create data bins in Python that produce the following output.
binsize = 5
data = 0.4, 1.7, 10.7, 8.0, 3.2, 6.7, 11.4, 10.4
(bin_lower_bound - bin_higher_bound) as a tuple: num_frequency
0.4 - 5.4: 3
5.4 - 10.4: 2
10.4 - 15.4: 3
I attempted a for loop that uses the lowest value in data as the lower bound of the first bin and then creates a new bin every binsize until the maximum value has been reached, but had no luck.
I would also like to store the result in a dictionary, and to achieve this without NumPy.
bins: {
    0.4 - 5.4: 3
    5.4 - 10.4: 2
    10.4 - 15.4: 3
}
Any help would be appreciated.
The approach below should be quite efficient and doesn't use any imports (as requested). Note that a bin with no contents will not show up in the result. If you would rather see a 0 for an empty bin, you'll have to make a quick pass from the min to the max and seed all of the bins with zero. Right now the bins are made on the fly from the data.
binsize = 5
data = [0.4, 1.7, 10.7, 8.0, 3.2, 6.7, 11.4, 10.4]
min_val = min(data)  # needed to anchor the first bin
bins = {}
for value in data:
    bin_num = int((value - min_val) // binsize)  # floor division to find bin
    bins[bin_num] = bins.get(bin_num, 0) + 1
# pretty up the labels...optional
bins2 = {(round(k*binsize+min_val, 1), round((k+1)*binsize+min_val, 1)):
         bins[k] for k in bins}
# or with string-based labels
bins3 = {f'{round(k*binsize+min_val, 1)} - {round((k+1)*binsize+min_val, 1)}':
         bins[k] for k in bins}
print(bins2)
# {(0.4, 5.4): 3, (10.4, 15.4): 3, (5.4, 10.4): 2}
print(bins3)
# {'0.4 - 5.4': 3, '10.4 - 15.4': 3, '5.4 - 10.4': 2}
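The zero-seeding step mentioned above could look roughly like the sketch below. It is only an illustration: the helper name bin_counts is hypothetical, and it assumes a positive binsize and non-empty data. Every bin between the minimum and the maximum is created up front with a count of 0, so empty bins appear in the result and the keys come out in ascending order.

```python
def bin_counts(data, binsize):
    """Count values per bin, including empty bins (hypothetical helper)."""
    min_val = min(data)  # anchors the first bin
    # index of the last bin that will contain data
    last_bin = int((max(data) - min_val) // binsize)
    # seed every bin between the min and the max with zero
    counts = {k: 0 for k in range(last_bin + 1)}
    for value in data:
        counts[int((value - min_val) // binsize)] += 1
    # label each bin with its (lower, upper) bounds
    return {
        (round(k * binsize + min_val, 1), round((k + 1) * binsize + min_val, 1)): n
        for k, n in counts.items()
    }

print(bin_counts([0.4, 1.7, 13.2], 5))
# {(0.4, 5.4): 2, (5.4, 10.4): 0, (10.4, 15.4): 1}
```

With the data from the question, the same call returns the three bins with counts 3, 2 and 3.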
Answered by AirSquid on December 5, 2021
This should work for any data and any positive binsize, and it keeps the values that fall in each bin rather than only the count.
from collections import defaultdict

data = [0.4, 1.7, 10.7, 8.0, 3.2, 6.7, 11.4, 10.4]
data.sort()  # the loop below relies on ascending order
binsize = 5
minval = min(data)
maxval = max(data)

def create_bins(minval, maxval):
    bins = []
    # <= so that the maximum value always lands inside a bin
    while minval <= maxval:
        bins.append(f"{minval} - {minval + binsize}")
        minval += binsize
    return bins

bins = create_bins(minval, maxval)
bins_with_values = defaultdict(list)
i = 0
for val in data:
    # skip every bin whose upper bound is at or below val (handles empty bins)
    while val >= float(bins[i].split()[2]):
        i += 1
    bins_with_values[bins[i]].append(val)
print(bins_with_values)
Output:
defaultdict(<class 'list'>, {'0.4 - 5.4': [0.4, 1.7, 3.2], '5.4 - 10.4': [6.7, 8.0], '10.4 - 15.4': [10.4, 10.7, 11.4]})
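Since the question asks for a frequency per bin rather than the values themselves, the lists can be reduced to counts. A minimal sketch, assuming the dictionary printed above (copied here as a plain dict so the snippet runs on its own; the name frequencies is just for illustration):

```python
# bins_with_values as printed in the output above, written as a plain dict
bins_with_values = {
    '0.4 - 5.4': [0.4, 1.7, 3.2],
    '5.4 - 10.4': [6.7, 8.0],
    '10.4 - 15.4': [10.4, 10.7, 11.4],
}
# replace each list of values with its length to get the frequency
frequencies = {label: len(values) for label, values in bins_with_values.items()}
print(frequencies)
# {'0.4 - 5.4': 3, '5.4 - 10.4': 2, '10.4 - 15.4': 3}
```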
Answered by Kakarot_7 on December 5, 2021