Assignment instructions: Following completion of your first program, create a blog entry where you post 1) your program 2) the output that displays three of your variables as frequency tables and 3) a few sentences describing your frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.
(Apologies to those of you not interested in reading raw SAS code; for those of you who are, I’m open to suggestions for how to improve it. This is fairly basic code, but it may be useful for someone who has never used SAS.)
All the details are behind the jump.
My program so far:
LIBNAME mydata "/courses/d1406ae5ba27fe300" access=readonly;
/* mydata is the local name for the database */
/* Research question: Race and perception of law enforcement and opportunity for achievement between Blacks and Whites during the beginning of the #BlackLivesMatter movement
SPECIFICALLY H1: Are non-Hispanic Blacks less likely to trust the federal government, the police, and/or the legal system than non-Hispanic Whites?
H1a: Are non-Hispanic Blacks less likely to trust the federal government than non-Hispanic Whites?
H1b: Are non-Hispanic Blacks less likely to trust the police than non-Hispanic Whites?
H1c: Are non-Hispanic Blacks less likely to trust the legal system than non-Hispanic Whites?
SPECIFICALLY H2: Does income level influence levels of trust in the federal government, the police, and/or the legal system among both Blacks and Whites? */
DATA new; SET mydata.ool_pds;
LABEL ppethm="Race / Ethnicity"
pprent="Ownership Status of Living Quarters"
ppwork="Current Employment Status"
w1_h1="Society has reached the point where Blacks and Whites have equal opportunities for achievement."
w1_h6="It's really a matter of some people not trying hard enough; if Blacks would only try harder they could be just as well off as Whites."
w1_h7="Generations of slavery and discrimination have created conditions that make it difficult for Blacks to work their way out of the lower class."
w1_h8="Discrimination against Blacks is no longer a problem in the U.S."
w1_k1_a="[The government in Washington] How much do you think you can trust the following institutions?"
w1_k1_b="[The police] How much do you think you can trust the following institutions?"
w1_k1_c="[The legal system] How much do you think you can trust the following institutions?";
IF ppethm=1 OR ppethm=2;
/* The subsetting IF statement limits the cases included in the analysis; it keeps only those who
indicated race/ethnicity of "White, Non-Hispanic" or "Black, Non-Hispanic" */
RUN;
PROC SORT; BY CASEID; RUN;
PROC FREQ; TABLES ppethm ppincimp pprent ppwork w1_h1 w1_h6 w1_h7 w1_h8 w1_k1_a w1_k1_b w1_k1_c;
RUN;
Output for the first four variables of interest, with notes and commentary embedded between tables:
The FREQ Procedure
|Race / Ethnicity|
I restricted the cases in the OOL dataset to respondents who reported their race/ethnicity as non-Hispanic White or non-Hispanic Black, given my specific interest in differences between Blacks and Whites in their perceptions of the police and legal system. This limits the cases available for my analysis to 2,092 of the 2,294 observations in the dataset. Because the OOL survey intentionally oversampled Blacks, 61% of the remaining respondents in my sample are Black (coded in the data as 2) and 39% are White (coded in the data as 1).
The values for household income cover a wide range and do not increment evenly. The lowest range (coded above as 1) represents household incomes below $5,000, and the next four values (coded above as 2, 3, 4, and 5) each span $2,499. From household incomes of $15,000 through $39,999, each value spans $4,999 (codes 6 through 10), and the next two ranges span $9,999 each (codes 11 and 12), up to household incomes of $59,999. The next three groupings break with this pattern for reasons not provided: household incomes of $60,000 to $74,999 and $85,000 to $99,999 (both $14,999 spans) are coded as 13 and 15 respectively, but household incomes between $75,000 and $84,999 (a $9,999 span) are coded as 14. The remaining values, up to household incomes of $174,999, span $24,999 each (codes 16 through 18), and the final code (19) represents household incomes of $175,000 or more.
I will need to aggregate some of the response values to make the data easier to work with for my second research question, which asks whether persons from lower-income households have more negative perceptions of the police and legal system. I will need to investigate further to determine the family size for each respondent and assign them to a group based on the federal poverty threshold appropriate for their family size. I may also divide the sample into multiple groups to see whether the relationships differ for low-income, middle-income, and high-income households.
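As a rough sketch of what that aggregation might look like, the step below collapses the 19 income codes into three groups. The cut points, the dataset name new2, and the variable name incgroup are placeholders I made up for illustration; they are not poverty-based thresholds, and the final groups will likely differ once family size is taken into account.

```sas
/* Sketch only: collapse ppincimp into three illustrative groups.
   The cut points below are assumptions, not poverty-based thresholds. */
DATA new2; SET new;
/* In SAS a missing numeric value sorts below any number, so this first
   test catches missing values as well as any non-positive codes. */
IF ppincimp < 1 THEN incgroup = .;
ELSE IF ppincimp <= 7 THEN incgroup = 1;    /* codes 1-7: under $25,000 */
ELSE IF ppincimp <= 13 THEN incgroup = 2;   /* codes 8-13: $25,000 to $74,999 */
ELSE incgroup = 3;                          /* codes 14-19: $75,000 or more */
LABEL incgroup = "Household income group (illustrative)";
RUN;

/* Check the recode by listing each original code next to its new group */
PROC FREQ DATA=new2;
TABLES incgroup ppincimp*incgroup / LIST MISSING;
RUN;
```

The LIST option prints the cross-tabulation as one row per combination, which makes it easy to confirm that every original code landed in the intended group.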
|Ownership Status of Living Quarters|
Another option to consider when constructing my income groups is whether respondents own or rent their living quarters. The distribution above shows that nearly two-thirds (64%; coded as 1 above) of the sample own their living quarters, with just over one-third (34%; coded as 2 above) renting. The remaining respondents (3%; coded as 3 above) reported that they neither own nor rent but instead occupy their living quarters without payment. (The percentages sum to 101% because of rounding.)
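Since I keep having to explain what each numeric code means, one improvement I could make is to attach value labels so the tables print text instead of numbers. This is a minimal sketch using PROC FORMAT; the format names rentfmt and ethfmt are my own inventions, and the label text is paraphrased from the code meanings described in this post rather than copied from the codebook.

```sas
/* Sketch only: value labels for two of the variables.
   Format names and label wording are assumptions for illustration. */
PROC FORMAT;
VALUE ethfmt  1 = "White, Non-Hispanic"
              2 = "Black, Non-Hispanic";
VALUE rentfmt 1 = "Own"
              2 = "Rent"
              3 = "Occupied without payment";
RUN;

/* Apply the formats only for this procedure; the stored values stay numeric */
PROC FREQ DATA=new;
TABLES ppethm pprent;
FORMAT ppethm ethfmt. pprent rentfmt.;
RUN;
```

Putting the FORMAT statement inside the procedure keeps the labels temporary, so later recodes can still test the numeric codes directly.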
|Current Employment Status|
It may also be worthwhile to factor in current employment status when creating the comparison groups for my second research question. Not quite half (46%; coded above as 1) of the respondents reported working as a paid employee, with another 6% (coded above as 2) reporting that they are self-employed. The remaining respondents (48%) reported not working for a variety of reasons: on temporary layoff (1%; coded as 3 above); looking for work (11%; coded as 4 above); retired (22%; coded as 5 above); disabled (8%; coded as 6 above); or other (6%; coded as 7 above). Given the relatively large number of respondents who reported not working, specifically those who are retired, I will have to consider whether to include this variable in the analysis or to use it instead to further limit the cases in my sample, so that the results are not unintentionally skewed. Given the timing of the data collection (during the period of slow recovery from the 2008 recession), I will also need to consider whether current household income is an appropriate proxy for social class.
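To help with that decision, something like the following could show how employment status breaks down within each racial group. It is only a sketch: the three-way grouping, the dataset name new3, and the variable name workgrp are assumptions I chose for illustration, and a different grouping may turn out to be more useful.

```sas
/* Sketch only: collapse ppwork into three illustrative groups.
   The grouping itself is an assumption, not a final analytic decision. */
DATA new3; SET new;
IF ppwork IN (1, 2) THEN workgrp = 1;              /* paid employee or self-employed */
ELSE IF ppwork = 5 THEN workgrp = 2;               /* retired */
ELSE IF ppwork IN (3, 4, 6, 7) THEN workgrp = 3;   /* not working for other reasons */
ELSE workgrp = .;                                  /* missing or unexpected codes */
LABEL workgrp = "Employment group (illustrative)";
RUN;

/* Compare the employment groups across race/ethnicity */
PROC FREQ DATA=new3;
TABLES workgrp ppethm*workgrp / MISSING;
RUN;
```

If retirees turn out to be distributed very differently across the two groups, that would be a reason to treat them separately rather than fold them into a single "not working" category.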