**h) Given your analysis, write up a short paragraph describing what the causal mechanism between wealth and corruption could be that would explain the effect that you observe in the data.**

\newpage


## Question 2: Wealth and Infant Mortality

**a) Examine the distribution of per capita income and infant mortality. Make a scatter plot of per capita income and infant mortality. Then compare this to a scatter plot of logged per capita income and logged infant mortality. Which plot appears to represent a linear relationship between the variables?**

```{r}
#| message: false
#| warning: false

infmort <- read_dta("Data/infantmortality.dta")

ggplot(infmort, aes(x = income, y = infant)) +
  geom_point() +
  labs(x = "GDP/capita", y = "Infant Mortality") +
  theme_bw()

ggplot(infmort, aes(x = log(income), y = log(infant))) +
  geom_point() +
  labs(x = "Log GDP/capita", y = "Log Infant Mortality") +
  theme_bw()
```
The second plot, which uses the logged values of both variables, shows the more clearly linear relationship.
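
The question also asks about the distribution of each variable, which the scatter plots only show indirectly. A minimal sketch of one way to inspect it is given here. `plot_distribution()` is a hypothetical helper, not part of the original analysis, and it assumes the column names `income` and `infant` used in the chunks above.

```{r}
#| message: false
#| warning: false

library(ggplot2)

# Hypothetical helper (not in the original analysis): histogram of one numeric
# column, to check for skew before deciding whether to log-transform it.
plot_distribution <- function(data, var, label, bins = 30) {
  ggplot(data, aes(x = .data[[var]])) +   # .data[[var]] looks the column up by name
    geom_histogram(bins = bins) +
    labs(x = label, y = "Number of countries") +
    theme_bw()
}
```

With the data loaded above, `plot_distribution(infmort, "income", "GDP/capita")` and `plot_distribution(infmort, "infant", "Infant Mortality")` would draw one histogram per variable. A long right tail in either plot would be consistent with the log transformation helping.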

**b) Run a regression of log infant mortality on log income controlling for the region of the world (using Asia as the baseline) and whether countries are oil-exporting or not. Interpret the coefficients carefully.**

```{r}
#| message: false
#| warning: false

# Use Asia as the baseline: convert region to a factor, then relevel
infmort$region <- as.factor(infmort$region)
infmort$region <- relevel(infmort$region, ref = "Asia")

model_q2b <- lm(log(infant) ~ log(income) + region + oil, data = infmort)

# Only coefficients are renamed; the outcome log(infant) never appears as a row
modelsummary(model_q2b,
             coef_rename = 
               c("(Intercept)" = "Intercept",
                 "log(income)" = "Log GDP/Capita",
                 "regionAfrica" = "Africa",
                 "regionAmericas" = "Americas",
                 "regionEurope" = "Europe",
                 "oilyes" = "Oil exporting country"),
             statistic = "p.value",
             stars = TRUE)
```
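
As a reminder of how the coefficients of this specification are read, the model can be written out as follows. This is a general property of log-log regressions with dummy variables, not a result from these data. The symbols $\beta$, $\gamma_r$ and $\delta$ are notation introduced here for illustration.

$$
\log(\text{infant}_i) = \beta_0 + \beta_1 \log(\text{income}_i) + \sum_{r \in \{\text{Africa},\, \text{Americas},\, \text{Europe}\}} \gamma_r \, \mathbb{1}[\text{region}_i = r] + \delta \, \text{oil}_i + \varepsilon_i
$$

Because both infant mortality and income are logged, $\beta_1$ is an elasticity: a 1% increase in income is associated with approximately a $\beta_1$% change in infant mortality, holding region and oil status constant. Each $\gamma_r$ compares region $r$ with Asia, and $\delta$ compares oil exporters with non-exporters; in both cases the implied percentage difference in infant mortality is $100 \, (e^{\gamma_r} - 1)$% or $100 \, (e^{\delta} - 1)$%, which is close to $100 \, \gamma_r$% or $100 \, \delta$% only when the coefficient is small.
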
**c) Now include an interaction between the oil dummy and income. Interpret the results and try to include informative plots in your writeup. Which model specification do you prefer?**

```{r}
# Interact the oil dummy with log(income), the same scale as the main effect,
# so the interaction is the difference in the income elasticity for oil exporters
model_q2c <- lm(log(infant) ~ log(income) + region + oil + log(income):oil, data = infmort)

modelsummary(model_q2c,
             coef_rename = 
               c("(Intercept)" = "Intercept",
                 "log(income)" = "Log GDP/Capita",
                 "regionAfrica" = "Africa",
                 "regionAmericas" = "Americas",
                 "regionEurope" = "Europe",
                 "oilyes" = "Oil exporting country",
                 "log(income):oilyes" = "Log GDP/Capita × Oil exporting country"),
             statistic = "p.value",
             stars = TRUE)
```
**d) Based on your preferred model specification, calculate the expected levels of infant mortality in European countries for mean levels of income and oil export (including confidence intervals). Describe this result in one or two short sentences. Compare the model prediction with the actual average among European countries. Are there discrepancies? Why or why not?**
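
One possible way to obtain such a prediction is sketched below, under stated assumptions. `predict_infant()` is a hypothetical helper, not part of the original analysis. It assumes an `lm` fit whose outcome is `log(infant)` and simply back-transforms the fitted value and its confidence bounds with `exp()`. It does not settle what a "mean level of oil export" means for a binary `oil` variable; that choice is left to whoever builds `newdata`.

```{r}
# Hypothetical helper (not in the original analysis): predict from a model
# with a logged outcome and return the result on the original scale.
predict_infant <- function(model, newdata, level = 0.95) {
  # Confidence interval for the expected value of the logged outcome
  pred_log <- predict(model, newdata = newdata,
                      interval = "confidence", level = level)
  # exp() maps the fit and both bounds back to the scale of `infant`
  out <- as.data.frame(exp(pred_log))
  names(out) <- c("estimate", "conf_low", "conf_high")
  cbind(newdata, out)
}
```

With the objects from the earlier chunks, `newdata` could be a one-row data frame with `income` set to the mean income among European countries, `region = "Europe"`, and one chosen level of `oil`, passed as `predict_infant(model_q2c, newdata)`. Note that `exp()` of a predicted mean of logs is closer to a geometric than an arithmetic mean, which is one reason the prediction can differ from the raw average of `infant` among European countries.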

**e) A journalist working for The Economist has heard about your fascinating work and is interested in reporting your result in an article. He approaches you and asks you to provide him with an easily understandable plot of how income in dollars affects infant mortality rates in oil exporting vs. non-oil exporting countries including any uncertainty in your estimates. Create such a visualization and explain it in a few sentences.**
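
A minimal sketch of such a plot follows, assuming a fitted `lm` with outcome `log(infant)` and predictors `income`, `oil` and `region`, as in the chunks above. `plot_income_effect()` is a hypothetical helper, not part of the original analysis. It holds the region fixed (the default `"Asia"` is only the baseline used earlier, an arbitrary choice here), predicts over a grid of incomes in dollars for each oil status, and back-transforms the predictions and 95% confidence bounds with `exp()`.

```{r}
#| message: false
#| warning: false

library(ggplot2)

# Hypothetical helper (not in the original analysis): predicted infant
# mortality against income in dollars, one line and band per oil status.
plot_income_effect <- function(model, data, region = "Asia", n = 100) {
  oil_levels <- as.character(data$oil)
  oil_levels <- unique(oil_levels[!is.na(oil_levels)])

  # Grid of incomes over the observed range, crossed with each oil status
  grid <- expand.grid(
    income = seq(min(data$income, na.rm = TRUE),
                 max(data$income, na.rm = TRUE), length.out = n),
    oil = oil_levels,
    region = region,
    stringsAsFactors = FALSE
  )

  # Predict on the log scale, then back-transform fit and bounds
  pred_log <- predict(model, newdata = grid, interval = "confidence")
  grid$fit <- exp(pred_log[, "fit"])
  grid$lwr <- exp(pred_log[, "lwr"])
  grid$upr <- exp(pred_log[, "upr"])

  ggplot(grid, aes(x = income, y = fit, colour = oil, fill = oil)) +
    geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, colour = NA) +
    geom_line() +
    labs(x = "GDP/capita (dollars)", y = "Predicted infant mortality",
         colour = "Oil exporter", fill = "Oil exporter") +
    theme_bw()
}
```

Calling `plot_income_effect(model_q2c, infmort)` would draw the figure from the objects created earlier. The grid spans the full observed income range for both groups, so where one group has no countries at a given income level its line is an extrapolation and should be read with caution.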