"""
clear
set obs 200
gen female = 0
replace female = 1 if _n>100

** Education is costly, as in a Spence model.
** The cost of getting an education is distributed uniformly from 0 to 1, in both genders.
** Having an education doubles your wage.
** You only pursue an education if the benefit is worth the cost.
gen education_cost = mod(_n, 100)/100

** Women are discriminated against, and earn 20% less than men.
** Occupation 1 pays $1 for men, and $0.80 for women.
** Only people with an education can be in Occupation 1.
** Occupation 2 pays $0.50 for men, and $0.40 for women.
** Women are less likely to get an education because of the discrimination.
** Half of men get an education vs. 40% of women.
gen wage = 1 if !female & education_cost>=0.5
replace wage = 0.5 if !female & education_cost<0.5
replace wage = 0.8 if female & education_cost>=0.6
replace wage = 0.4 if female & education_cost<0.6

gen goteducation = 0
replace goteducation=1 if wage>0.7

reg wage female
reg wage female goteducation
"""

import pandas as pd
import numpy as np
import statsmodels.formula.api as smf


def run():
    # Create a DataFrame with 1000 observations
    df = pd.DataFrame(index=range(1000))

    # Generate 'female' column: rows 0-499 are men, rows 500-999 are women
    df['female'] = 0
    df.loc[500:, 'female'] = 1

    # Generate 'education_cost' column, uniform on [0, 1) for both genders
    df['education_cost'] = np.random.uniform(0, 1, 1000)

    # Generate 'wage' column. Men get the high wage at a threshold of 0.5,
    # women at a threshold of 0.7; wage levels are the same for both genders.
    df.loc[(df['female'] == 0) & (df['education_cost'] >= 0.5), 'wage'] = 10
    df.loc[(df['female'] == 0) & (df['education_cost'] < 0.5), 'wage'] = 5
    df.loc[(df['female'] == 1) & (df['education_cost'] >= 0.7), 'wage'] = 10
    df.loc[(df['female'] == 1) & (df['education_cost'] < 0.7), 'wage'] = 5

    # Generate 'goteducation' column, anyone with wage > 5 must have gotten education
    df['goteducation'] = 0
    df.loc[df['wage'] > 7, 'goteducation'] = 1

    # Log-transform 'wage'
    df['log_wage'] = np.log(df['wage'])

    # Run regressions: raw gender gap, then the gap controlling for education
    model1 = smf.ols(formula='log_wage ~ female', data=df).fit()
    model2 = smf.ols(formula='log_wage ~ female + goteducation', data=df).fit()

    print(model1.summary())
    print(model2.summary())


if __name__ == "__main__":
    run()
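
# A minimal sketch of the deterministic design described in the Stata notes in
# the docstring (200 observations, evenly spaced education costs, women paid
# 20% less and facing a 0.6 threshold). It uses NumPy only; the helper names
# stata_design and ols are hypothetical and not part of the original script,
# and the wage levels follow the Stata notes rather than the Python run() code.

```python
import numpy as np


def stata_design(n_per_group=100):
    """Build the deterministic data set described in the Stata notes."""
    # Costs 0.00, 0.01, ..., 0.99 for each gender (same set as mod(_n, 100)/100)
    cost = np.tile(np.arange(n_per_group) / n_per_group, 2)
    female = np.repeat([0, 1], n_per_group)

    # Men are educated at cost >= 0.5, women at cost >= 0.6
    threshold = np.where(female == 1, 0.6, 0.5)
    educated = (cost >= threshold).astype(int)

    # Education doubles the wage; women earn 20% less in either occupation
    base = np.where(educated == 1, 1.0, 0.5)
    wage = base * np.where(female == 1, 0.8, 1.0)
    return female, educated, wage


def ols(y, *regressors):
    """Least-squares coefficients, intercept first (assumed helper)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    female, educated, wage = stata_design()
    print("share educated, men:  ", educated[female == 0].mean())
    print("share educated, women:", educated[female == 1].mean())
    print("wage ~ female:               ", ols(wage, female))
    print("wage ~ female + goteducation:", ols(wage, female, educated))
```

# With these assumptions the raw gap is 0.56 - 0.75 = -0.19 in wage levels.
# Controlling for education shrinks the female coefficient toward the
# within-education gaps (-0.1 and -0.2), because part of the raw gap works
# through women's lower education rate.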