Since the data is not normally distributed, I decided to perform a Monte Carlo test.

I started by preparing the data, separating the ages of black and white individuals killed by the police into two distinct arrays, ensuring any missing values were removed.

Next, I calculated the observed mean age difference between white and black individuals from the original data.
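The preparation and observed-difference steps can be sketched as follows. This is a minimal illustration using only the Python standard library; the record layout, the field names `race` and `age`, the group codes, and the helper `ages_for` are all assumptions, since the post does not show the dataset's structure.

```python
from statistics import mean

# Hypothetical records; the field names "race" and "age" are assumptions.
records = [
    {"race": "W", "age": 45},
    {"race": "B", "age": 28},
    {"race": "W", "age": None},  # missing age, dropped below
    {"race": "B", "age": 31},
    {"race": "W", "age": 52},
    {"race": "H", "age": 40},    # other groups are ignored here
]

def ages_for(records, race):
    """Return the non-missing ages for one group."""
    return [r["age"] for r in records
            if r["race"] == race and r["age"] is not None]

white_ages = ages_for(records, "W")
black_ages = ages_for(records, "B")

# Observed statistic: mean white age minus mean black age.
observed_diff = mean(white_ages) - mean(black_ages)
print(observed_diff)
```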

I then set up a Monte Carlo simulation, specifying 10,000 simulations to be performed. For each simulation, I randomly sampled ages with replacement from the combined dataset of black and white individuals. In each sample, I calculated the mean age difference between the sampled white and black individuals. These mean differences were stored in a list called mean_diffs.
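One way the simulation loop might look is sketched below. It assumes that each simulated "white" and "black" group is drawn with replacement from the pooled ages and has the same size as the original group; the post does not spell this out, so treat it as one plausible reading. The function name `simulate_mean_diffs`, the `seed` parameter, and the sample ages are made up for illustration.

```python
import random
from statistics import mean

def simulate_mean_diffs(white_ages, black_ages, n_simulations=10_000, seed=0):
    """Resample both groups from the pooled ages and record mean differences."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    combined = list(white_ages) + list(black_ages)
    mean_diffs = []
    for _ in range(n_simulations):
        # Draw with replacement, keeping the original group sizes (assumption).
        sim_white = rng.choices(combined, k=len(white_ages))
        sim_black = rng.choices(combined, k=len(black_ages))
        mean_diffs.append(mean(sim_white) - mean(sim_black))
    return mean_diffs

# Small made-up samples, only to show the call.
white_ages = [45, 52, 38, 61, 47, 55]
black_ages = [28, 31, 24, 36, 30, 27]
mean_diffs = simulate_mean_diffs(white_ages, black_ages, n_simulations=1_000)
print(len(mean_diffs))
```

Because both simulated groups come from the same pool, the simulated differences cluster around zero, which is what makes them a reference distribution for "no real difference".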

After running the simulations, I measured the time taken for the entire process. I also filtered out any NaN values from the list of mean differences to ensure accurate analysis.
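A rough sketch of the timing and NaN-filtering step, again with the standard library only. The list of differences here is a made-up stand-in for the real simulation output, and `clean_diffs` is a hypothetical name.

```python
import math
import time

start = time.perf_counter()
# Stand-in for the simulation loop; these values are made up.
mean_diffs = [0.8, float("nan"), -1.2, 0.3, float("nan")]
elapsed = time.perf_counter() - start

# NaN compares unequal to everything, so test for it explicitly.
clean_diffs = [d for d in mean_diffs if not math.isnan(d)]
print(f"{len(clean_diffs)} usable differences, {elapsed:.4f} s elapsed")
```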

For a visual representation, I plotted a histogram of the mean differences obtained from the Monte Carlo simulations. To provide context, I included a vertical dashed line indicating the observed mean difference from the original data.

To draw meaningful conclusions, I calculated and printed the mean age difference for white and black individuals based on the original data. Additionally, I determined the number of simulated samples with a mean difference greater than the observed difference. This analysis helps in understanding whether the observed difference is statistically significant or if it could have occurred by chance.

The fact that none of the 10,000 random samples had a mean age difference greater than 7.41 years strongly suggests that the observed difference is not due to random chance. This corresponds to a very low empirical p-value (fewer than 1 in 10,000 simulations, so essentially zero), indicating a statistically significant result.
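The counting step and the resulting p-value can be sketched as below. Two choices here are assumptions rather than anything stated in the post: the comparison uses "at least as large" instead of "greater than", and the p-value uses the common +1 correction so that it never comes out as exactly zero. The function name `empirical_p_value` and the simulated values are made up.

```python
def empirical_p_value(mean_diffs, observed_diff):
    """One-sided share of simulated differences at least as large as observed."""
    extreme = sum(1 for d in mean_diffs if d >= observed_diff)
    # +1 correction: counts the observed value itself as one draw,
    # which keeps the estimate strictly above zero.
    return extreme, (extreme + 1) / (len(mean_diffs) + 1)

# Made-up simulated differences, only to show the call.
mean_diffs = [-1.5, 0.2, 0.9, -0.4, 2.1, -2.3, 1.0, 0.0, -0.7]
extreme, p = empirical_p_value(mean_diffs, observed_diff=7.41)
print(extreme, p)
```

With 0 extreme samples out of 10,000, this formula would give 1/10,001, or roughly 0.0001, rather than a literal zero.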

I also computed Cohen’s d, which provides a standardized measure of effect size and so helps interpret the practical significance of the difference between the groups. This gave a value of 0.589321.
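For reference, a minimal sketch of Cohen's d using the pooled sample standard deviation. The post does not say which variant was used, so the pooled form is an assumption, and the function name `cohens_d` and the input numbers are made up.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up numbers, only to show the call.
d = cohens_d([2, 4, 6, 8], [1, 3, 5, 7])
print(round(d, 4))
```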

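Interpreting this effect size, a value of about 0.59 counts as a medium effect by Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). It means that the average age of white people killed by police is meaningfully higher than the average age of black people killed by police: the two means differ by roughly 0.59 standard deviations, with white individuals generally being older than black individuals in these incidents.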

 

 
