Stack Overflow Asked by Maulik Patel on November 24, 2021
Question:
Relabel the marital status variable DMDMARTL to have brief but informative character labels. Then construct a frequency table of these values for all people, then for women only, and for men only. Then construct these three frequency tables using only people whose age is between 30 and 40.
I have finished everything except the frequency tables for men and women aged between 30 and 40.
Below is the whole code so far and this is the link to the dataset: https://raw.githubusercontent.com/Mauliklm10/Cartwheel.csv/master/datasetNHANES.csv
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import statsmodels.api as sm
import numpy as np
da = pd.read_csv("nhanes_2015_2016.csv") # this is where the dataset link will be entered
# counts how often each marital status code occurs, most frequent first
da.DMDMARTL.value_counts()
# We now give the numeric codes descriptive labels.
# The relabeled variable is stored as strings in a new column.
# The data is coded as numbers (1, 2, 3, ...), which we map to meaningful labels like Married, Divorced etc.
da["DMDMARTLV2"] = da.DMDMARTL.replace({1:"Married",2:"Widowed",3:"Divorced",4:"Separated",5:"Never_Married",
6:"Living_With_Partner",77:"Refused",99:"Dont_Know"})
da.DMDMARTLV2.value_counts()
# Count how many values are missing
pd.isnull(da.DMDMARTLV2).sum()
# We relabel the gender variable as well, since we will be working with it too.
# The labels go into a new column so that the original column is left unchanged;
# the data is coded as numbers (1, 2), which we map to meaningful labels like Male and Female
da["RIAGENDRV2"] = da.RIAGENDR.replace({1: "Male", 2: "Female"})
# The counts do not add up to the number of rows, meaning there are some missing values,
# so we give those an explicit "Missing" label with the .fillna method
da["DMDMARTLV2"] = da.DMDMARTLV2.fillna("Missing")
da.DMDMARTLV2.value_counts()
# this is to get the frequency table for Females and Males individually
da.groupby("RIAGENDRV2")["DMDMARTLV2"].value_counts()
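# this is to get the age group 30 to 40
# (note: pd.cut with bins [30, 40] produces the single interval (30, 40], so age 30 itself is excluded)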
da["agegrp"] = pd.cut(da.RIDAGEYR, [30, 40])
da.groupby("agegrp")["DMDMARTLV2"].value_counts()
# this is to get the agegroup 30 to 40 with males and females
da["agegrp"] = pd.cut(da.RIDAGEYR, [30, 40])
da.groupby("agegrp")("RIAGENDRV2")["DMDMARTLV2"].value_counts()
The above code gives me a TypeError: 'DataFrameGroupBy' object is not callable.
I found the answer, so there is no need to answer this post anymore. Both grouping columns have to be passed to groupby as a single list; the working line of code was: da.groupby(["agegrp", "RIAGENDRV2"])["DMDMARTLV2"].value_counts()
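To illustrate why the original line fails and the list form works, here is a minimal sketch on a tiny made-up frame. The rows in `toy` are invented for illustration and are not taken from the NHANES file; only the column names follow the code above.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the NHANES file.
toy = pd.DataFrame({
    "agegrp": ["(30, 40]", "(30, 40]", "(30, 40]", "(30, 40]"],
    "RIAGENDRV2": ["Male", "Female", "Female", "Male"],
    "DMDMARTLV2": ["Married", "Married", "Divorced", "Married"],
})

# groupby("agegrp") returns a DataFrameGroupBy object; writing
# ("RIAGENDRV2") right after it tries to call that object like a function.
try:
    toy.groupby("agegrp")("RIAGENDRV2")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing both keys as one list groups by every (agegrp, gender) pair.
counts = toy.groupby(["agegrp", "RIAGENDRV2"])["DMDMARTLV2"].value_counts()

print(error_message)
print(counts)
```

The result is a Series indexed by age group, gender, and marital status, so a single count can be looked up with a tuple such as `counts.loc[("(30, 40]", "Male", "Married")]`.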
Answered by Maulik Patel on November 24, 2021
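As a possible alternative for the "age between 30 and 40" part, the rows can be filtered first and then tabulated. This is only a sketch under assumptions: the rows in `toy` are made up, and it assumes "between 30 and 40" is meant to include both 30 and 40. With bins `[30, 40]`, `pd.cut` yields the interval (30, 40] and leaves age 30 out, while `Series.between` is inclusive on both ends by default.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the NHANES file.
toy = pd.DataFrame({
    "RIDAGEYR": [25, 30, 35, 40, 41, 33],
    "RIAGENDRV2": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "DMDMARTLV2": ["Never_Married", "Married", "Married",
                   "Divorced", "Widowed", "Married"],
})

# pd.cut with bins [30, 40] gives the interval (30, 40], so age 30 becomes NaN.
cut_groups = pd.cut(toy.RIDAGEYR, [30, 40])

# between() keeps both endpoints by default, so ages 30 and 40 are included.
in_range = toy[toy.RIDAGEYR.between(30, 40)]

# Frequency table for everyone in the age range, then split by gender.
all_people = in_range.DMDMARTLV2.value_counts()
by_gender = in_range.groupby("RIAGENDRV2")["DMDMARTLV2"].value_counts()

print(all_people)
print(by_gender)
```

Filtering once also means the same `in_range` frame serves all three tables (everyone, women only, men only), instead of repeating the age grouping for each.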