Stack Overflow Asked by gsoft on November 12, 2021
I am trying to extract a section (a sub-matrix) of the numbers in a pandas DataFrame, as marked in the picture above.
Could anyone help me with this? I want to run analytics on such a section (matrix) of a bigger DataFrame. Thank you in advance!
Let sub_rows and sub_cols be the dimensions of the DataFrame to be extracted:
import pandas as pd
# df is the original (bigger) DataFrame and must already be defined
sub_rows = 10  # Number of rows to be extracted
sub_cols = 3   # Number of columns to be extracted
if sub_rows > len(df.index):
    print("The sub-DataFrame has more rows than the original DataFrame")
elif sub_cols > len(df.columns):
    print("The sub-DataFrame has more columns than the original DataFrame")
else:
    # The + 1 includes the last window that still fits completely
    for i in range(0, len(df.index) - sub_rows + 1):
        for j in range(0, len(df.columns) - sub_cols + 1):
            sub_df = df.iloc[i:i + sub_rows, j:j + sub_cols]  # Extracted DataFrame
            # Put here the code you need for your analysis
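As a rough sketch of what the analysis step could look like: the example below assumes the analysis is simply the mean of each window and collects one result per window position. The sample data, the small window size, and the names records and results are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a 6x5 DataFrame holding the numbers 0..29.
df = pd.DataFrame(np.arange(30).reshape(6, 5))

sub_rows = 3  # rows per window
sub_cols = 2  # columns per window

records = []
for i in range(len(df.index) - sub_rows + 1):
    for j in range(len(df.columns) - sub_cols + 1):
        window = df.iloc[i:i + sub_rows, j:j + sub_cols]
        # Assumed analysis: the mean of all values in the window.
        records.append({"row": i, "col": j, "mean": window.to_numpy().mean()})

# One row per window position, with the top-left corner and the result.
results = pd.DataFrame(records)
print(results.head())
```

Keeping the top-left position next to each result makes it easy to trace a value back to the section it came from.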
Answered by David Felipe Medina Mayorga on November 12, 2021
You can use the .iloc[] indexer to select the rows and columns you want by integer position.
dataframe.iloc[5:15,6:15]
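This selects the rows at positions 5-14 and the columns at positions 6-14 (positions are zero-based and the end of each slice is excluded). I am not sure these numbers match your picture, but I think this method is what you were looking for.
edit: changed .loc[] to .iloc[] because we're using integer positions rather than labels, and cleaned it up a bit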
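To make the difference between the two indexers concrete, here is a small sketch. It assumes a made-up 20x20 DataFrame whose row labels start at 100, so that labels and positions are not the same thing.

```python
import numpy as np
import pandas as pd

# Hypothetical 20x20 DataFrame; row labels run from 100 to 119,
# column labels keep the default 0..19.
df = pd.DataFrame(np.arange(400).reshape(20, 20), index=range(100, 120))

# .iloc slices by position and excludes the end of the slice.
section = df.iloc[5:15, 6:15]
print(section.shape)  # (10, 9)

# .loc slices by label and includes the end label, so the same section
# needs the labels 105..114 for the rows and 6..14 for the columns.
by_label = df.loc[105:114, 6:14]
print(by_label.equals(section))
```

With the default integer labels the two look deceptively similar, which is why .iloc is the safer choice when you mean "the n-th row".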
Here is the code to iterate over the whole DataFrame:
# df = the big DataFrame
shape = (10, 10)  # shape (rows, columns) of the matrix to be analyzed, here 10x10
step = 1  # step size: move the matrix one row/column at a time
# or
step = 10  # step size: iterate block by block
# Keep in mind that iterating block by block leaves some data out at the end of the rows and columns when the DataFrame size is not a multiple of the step.
# If the matrix isn't square you can set step = shape; just be sure to change step in the loops below to step[0] and step[1] respectively.
for row in range(0, len(df.index) - shape[0] + 1, step):  # number of rows of the big DataFrame - number of rows of the matrix
    for col in range(0, len(df.columns) - shape[1] + 1, step):  # number of columns of the big DataFrame - number of columns of the matrix
        matrix = df.iloc[row:row + shape[0], col:col + shape[1]]  # slice out the matrix and assign it to 'matrix'
        # analyze matrix here
This is basically the same as what @dafmedinama posted; I just added more comments, simplified how the shape of the matrix is specified, and included a step variable in case you don't want to move the matrix one cell at a time.
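For the non-square case mentioned in the comments, a minimal sketch with a separate step per axis might look like this. iter_windows is a hypothetical helper (not part of pandas), and the 7x10 sample data is made up.

```python
import numpy as np
import pandas as pd


def iter_windows(df, shape, step):
    """Yield (row, col, window) for every full window of `shape` in `df`.

    `shape` and `step` are (rows, columns) tuples.
    Hypothetical helper for illustration, not a pandas function.
    """
    n_rows, n_cols = df.shape
    for row in range(0, n_rows - shape[0] + 1, step[0]):
        for col in range(0, n_cols - shape[1] + 1, step[1]):
            yield row, col, df.iloc[row:row + shape[0], col:col + shape[1]]


# Hypothetical sample data: 7 rows by 10 columns.
df = pd.DataFrame(np.arange(70).reshape(7, 10))

shape = (3, 4)  # non-square matrix
step = shape    # block by block, so the matrices do not overlap

origins = [(row, col) for row, col, _ in iter_windows(df, shape, step)]
print(origins)  # [(0, 0), (0, 4), (3, 0), (3, 4)]
```

Here the last row and the last two columns never end up in a matrix, which is the leftover data the block-by-block comment warns about.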
Answered by George Sebastiaan van Heerden on November 12, 2021