North Carolina trends: geography, population, income, and education

6.1 Introduction

Income is associated with school performance, but it doesn’t tell the whole story. There are significant disparities in educational outcomes for racial/ethnic subgroups and other “accountability groups”, as noted in the chapter NC Education, Section 5.5.1. Below I look at the associations of income, subgroup, and region with school performance, both visually and numerically.

6.1.1 Income and earnings

The five-year ACS ending in 2019 includes a number of variables related to income and earnings. I use the following:

B07011_001 (Median income in the past 12 months; universe: Population 15 years and over in the United States with income)
B19301_001 (Per capita income in the past 12 months; universe: Total population). For per capita income by race I use “B19301A_001” through “B19301G_001”.
B20004_001 and related variables (Median earnings in the past 12 months; universe: Population 25 years and over with earnings) by level of educational attainment (and all education levels)

All are estimates in 2019 dollars with geographical boundary of “county” or “school district (unified)” as indicated.¹

Note on estimates

Data are estimates from the US Census five-year American Community Survey (ACS) results ending in 2019. At the time of this analysis (Nov/Dec 2022), this was the latest reliable data available: detailed 2020 decennial census income data had not been released, and 2020 ACS results were less reliable than usual due to the COVID-19 pandemic. ACS estimates have much larger margins of error (MOE) than the decennial census. In small sub-populations, MOEs can exceed the estimated values. To simplify the presentation in this chapter, I have not represented this uncertainty in most cases; instead I use the estimates as given. An exception is Figure 6.4.
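To make the point about margins of error concrete, here is a minimal sketch of how an ACS margin of error can be turned into a standard error and a relative measure of reliability. It relies on the fact that ACS MOEs are published at the 90% confidence level (z of about 1.645). The function names and the two example groups are hypothetical and are not taken from the tables used in this chapter.

```python
Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def standard_error(moe: float) -> float:
    """Convert a 90% margin of error into a standard error."""
    return moe / Z_90


def coefficient_of_variation(estimate: float, moe: float) -> float:
    """Standard error as a fraction of the estimate (smaller is more reliable)."""
    return standard_error(moe) / estimate


def moe_exceeds_estimate(estimate: float, moe: float) -> bool:
    """True when the margin of error is larger than the estimate itself."""
    return moe > abs(estimate)


# Hypothetical values for illustration only (assumed, not real ACS figures)
large_group = {"estimate": 30_000.0, "moe": 1_500.0}
small_group = {"estimate": 12_000.0, "moe": 15_000.0}

for name, g in [("large group", large_group), ("small group", small_group)]:
    cv = coefficient_of_variation(g["estimate"], g["moe"])
    flag = moe_exceeds_estimate(g["estimate"], g["moe"])
    print(f"{name}: CV = {cv:.2f}, MOE exceeds estimate: {flag}")
```

With these made-up numbers, the large group has a coefficient of variation of about 0.03, while the small group's is about 0.76 and its MOE exceeds the estimate.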

Relevant ACS definitions of some measures of income²:

Income of Individuals – Income for individuals is obtained by summing the eight types of income for each person 15 years old and over. The characteristics of individuals are based on the time of interview even though the amounts are for the past 12 months.

Median Income – The median divides the income distribution into two equal parts: one-half of the cases falling below the median income and one-half above the median. For households and families, the median income is based on the distribution of the total number of households and families including those with no income. The median income for individuals is based on individuals 15 years old and over with income. Median income for households, families, and individuals is computed on the basis of a standard distribution.

Per Capita Income – Per capita income is the mean income computed for every man, woman, and child in a particular group including those living in group quarters. It is derived by dividing the aggregate income of a particular group by the total population in that group. (The aggregate used to calculate per capita income is rounded.)

Median Earnings – The median divides the earnings distribution into two equal parts: one-half of the cases falling below the median and one-half above the median. Median earnings is restricted to individuals 16 years old and over with earnings and is computed on the basis of a standard distribution. Median earnings figures are calculated using linear interpolation.

And as noted at censusreporter.org³:

…in Census Bureau terminology, earnings are a subset of income. Specifically, earnings are wages or salary from a job, or income from being self-employed. Other kinds of income, not included in earnings, include social security payments, interest and dividends, income from property rental, pensions, public assistance, and child support.

6.2 Income and earnings (county aggregation)

The estimates of each income measure vary across the counties of NC, as the cumulative distributions below show.

Figure 6.1: Cumulative distribution of median and per capita income
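For readers unfamiliar with cumulative distribution plots, the following sketch shows how the points of an empirical cumulative distribution can be computed: sort the values, then pair each with the fraction of observations at or below it. The function name and the county values are hypothetical; this is not the code used to produce the figure.

```python
def ecdf(values):
    """Return (value, cumulative fraction) pairs for an empirical CDF."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Hypothetical county-level median incomes in dollars (assumed values)
county_median_income = [24_000, 31_000, 27_500, 36_000, 29_000]

for value, fraction in ecdf(county_median_income):
    print(f"{value:>7,d}  {fraction:.1f}")
```

Reading such a plot from left to right, a curve shifted to the right indicates a group of counties with higher incomes.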

All three measures point to higher income in the urban areas, with per capita income showing the largest differences.

And last in this section, a look at median earnings by educational attainment (county aggregation). Again we see (1) the strong association between higher levels of educational attainment and median earnings, and (2) the differences generally are largest in urban areas.

6.3 Years of education (school district aggregation)

To simplify plotting and linear regression, I converted categorical levels of educational attainment to numerical years of education in var_new as shown below. The XX_years_ed is my generalization of the average number of years required to achieve this level of education.

description	var_new
Less than high school graduate	10_years_ed
High school graduate (includes equivalency)	12_years_ed
Some college or associate's degree	14_years_ed
Bachelor's degree	16_years_ed
Graduate or professional degree	19_years_ed
Median earnings - all education levels (age 25 and over with earnings)	all_years_ed
Median income in the past 12 months (age 15 and over)	med_income
Per capita income in the past 12 months (whole population)	pci
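The mapping in the table can be expressed as a small lookup. The sketch below also centers the years on their mean, which is the form (\(years\_ed\_centered\)) used in the regressions later in this chapter. The unweighted mean of the five mapped values is 14.2, which matches the mean reported in Section 6.5.5; that the original analysis computed the mean in exactly this way is an assumption, and the variable names are mine.

```python
# Years of education assigned to each attainment level (from the table above)
YEARS_OF_EDUCATION = {
    "Less than high school graduate": 10,
    "High school graduate (includes equivalency)": 12,
    "Some college or associate's degree": 14,
    "Bachelor's degree": 16,
    "Graduate or professional degree": 19,
}

# Unweighted mean of the five levels (an assumption about how the mean was taken)
mean_years = sum(YEARS_OF_EDUCATION.values()) / len(YEARS_OF_EDUCATION)

# Centered value: years above (+) or below (-) the mean
years_ed_centered = {
    level: years - mean_years for level, years in YEARS_OF_EDUCATION.items()
}

print(f"mean years of education: {mean_years:.1f}")
for level, centered in years_ed_centered.items():
    print(f"{centered:+.1f}  {level}")
```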

Figure 6.4: Income by years of education - cumulative distribution with margins of error

6.4 School performance score and median income (school district aggregation)

Is there a change in score associated with a change in median income of the people living in the school district catchment area? Yes. As seen in Figure 6.6, scores increase as median income increases, but this is a smaller factor than racial/ethnic group, being economically disadvantaged, or being a student with a disability. Income is confounded with membership in the racial/ethnic and economically disadvantaged subgroups, which are partially a proxy for income (which is not visible in this data set). And as we can see in Figure 6.7, region is a partial proxy for income too.

There are three clusters: (1) students with disabilities (SWD) have a unique set of challenges, and their trend line is distinct from the other subgroups; (2) black (BL7), Native American (AM7), Hispanic (HI7), multiple races (MU7), and economically disadvantaged (EDS); and (3) Asian (AS7) and white (WH7). These categories overlap: a student may belong to two or more subgroups.
The composite “All” subgroup is closest to the white (WH7) grouping, since in the majority of schools the white population is a large majority.

6.4.1 Score by median income (school district aggregation)

Figure 6.6: School performance score by median income

The patterns generally hold up when faceting by region.

Figure 6.7: School performance score by median income by region

6.4.2 Score by per capita income (school district aggregation)

The picture is different when looking at per capita income (PCI) instead of median income. In contrast to median income, (1) PCI varies by subgroup; and (2) there is less overlap in income ranges. Note that the PCI universe is not bounded by age, so subgroups that on average have larger families will have lower PCI. These differences in PCI by subgroup may be confounded by the mix of education levels in each group (data that would confirm this is not at hand).
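The effect of family size on PCI follows directly from the definition (aggregate income divided by total population, children included). A minimal sketch with two hypothetical groups whose adults have identical incomes but different numbers of children:

```python
def per_capita_income(aggregate_income: float, population: int) -> float:
    """Aggregate income of a group divided by everyone in the group."""
    return aggregate_income / population


# Hypothetical groups: 100 adults each with $40,000 income (assumed values)
adults = 100
income_per_adult = 40_000.0
aggregate = adults * income_per_adult

pci_small_families = per_capita_income(aggregate, adults + 50)   # 50 children
pci_large_families = per_capita_income(aggregate, adults + 150)  # 150 children

print(f"smaller families: {pci_small_families:,.0f}")
print(f"larger families:  {pci_large_families:,.0f}")
```

In this made-up case the group with more children has a PCI of 16,000 compared with roughly 26,667, even though income per adult is identical.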

Figure 6.8: School performance score for subgroups by per capita income

The patterns generally hold up when faceting by region, but the trend lines do not; this is likely due to the narrower income bands of the subgroups.

Figure 6.9: School performance score for subgroups by per capita income by region

6.5 Quantifying associations (school district aggregation)

Linear regression estimates quantify associations among variables. Can we find any terms in addition to years of education \(years\_ed\_centered\) that are strongly associated with median \(income\)? Yes.

A basic linear regression formula is of the form \[dependent\_variable \sim independent\_var_1 + independent\_var_2\]

Regression models that use \(subgroup\) as a term provide an estimate (for each subgroup) of the difference it makes relative to the reference subgroup, which does not show up in the list of regression estimates. In this case the reference subgroup is “AM7”, American Indian and Alaska Native (aka “native_am”).
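The role of the reference subgroup is easiest to see in a model with a single categorical term. With treatment (dummy) coding, the least-squares intercept equals the mean of the reference group, and each remaining coefficient equals that group's mean minus the reference mean. The sketch below illustrates this with made-up scores; the function name and data are hypothetical, and only the subgroup codes come from this chapter.

```python
from statistics import mean

# Hypothetical scores per subgroup (assumed values for illustration)
scores = {
    "AM7": [55.0, 57.0, 59.0],
    "AS7": [80.0, 84.0],
    "SWD": [36.0, 38.0, 40.0],
}


def treatment_coded_estimates(groups, reference):
    """Coefficients of y ~ group under treatment coding.

    The intercept is the reference group's mean; every other coefficient
    is that group's mean minus the reference mean.
    """
    ref_mean = mean(groups[reference])
    estimates = {"(Intercept)": ref_mean}
    for name, values in groups.items():
        if name != reference:
            estimates[f"subgroup{name}"] = mean(values) - ref_mean
    return estimates


for term, estimate in treatment_coded_estimates(scores, "AM7").items():
    print(f"{term:12s} {estimate:+.1f}")
```

Note that the reference group (here AM7) has no row of its own: its level is carried by the intercept.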

How much of the variance in the dependent variable can be explained by the independent variables? I use \(adjusted\ R^2\) as the metric to evaluate various linear regressions. Simpler explanations (i.e., fewer independent variables) are better than more complex ones and are less likely to overfit the data.
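For reference, the standard definition of adjusted \(R^2\) penalizes plain \(R^2\) for each additional predictor. The sketch below is a direct transcription of that textbook formula; the function name and the example inputs are mine and are not values from the models in this chapter.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Small sample: the penalty for extra predictors is noticeable
print(adjusted_r_squared(0.5, n_obs=11, n_predictors=2))  # 0.375

# Large sample: the penalty is tiny, so adjusted and plain R^2 nearly coincide
print(round(adjusted_r_squared(0.5, n_obs=2_500, n_predictors=2), 4))
```

With thousands of observations, as in the models below, the adjustment changes little; it matters mostly as a guard when comparing models with different numbers of terms.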

6.5.1 Summary: strongest associations

The strongest associations are the following:

Median income (\(income\)) with years of education (\(years\_ed\_centered\)). \(R^2 = 82\%\)
Per capita income (\(pci\)) with \(subgroup\). \(R^2 = 52\%\)
School performance score (\(spg\_score\)) with \(subgroup\). \(R^2 = 52\%\)

The details are below.

6.5.2 Median income

The baseline model \(income \sim years\_ed\_centered\) explains 82% of the variance, and adding \(subgroup\) or \(n\_student\) doesn’t help much. On their own, neither \(region\) nor \(subgroup\) is useful (adjusted \(R^2\) of 0.03 and 0.00 respectively). The bars show 90% confidence intervals.
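As a worked example of what the baseline model implies, the sketch below plugs the model 1 estimates from Section 6.5.5 (intercept 38,373.7 and slope 4,166.9 per year) and the mean of 14.2 years into the fitted line. The function and constant names are mine; the outputs are point predictions only and ignore the uncertainty in the estimates.

```python
# Estimates for income ~ years_ed_centered (model 1 in Section 6.5.5)
INTERCEPT = 38_373.7       # predicted income at the mean years of education
SLOPE_PER_YEAR = 4_166.9   # change in income per additional year
MEAN_YEARS_ED = 14.2       # mean of the mapped years of education


def predicted_income(years_ed: float) -> float:
    """Point prediction from the fitted line, after centering years_ed."""
    return INTERCEPT + SLOPE_PER_YEAR * (years_ed - MEAN_YEARS_ED)


for years in (12, 14.2, 16):
    print(f"{years:>5} years: {predicted_income(years):,.0f}")
```

At the mean (14.2 years) the prediction is simply the intercept; 12 years gives about 29,207 and 16 years about 45,874.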

Figure 6.10: Models predicting median income

6.5.3 Per capita income

\(pci\) has the potential to be interesting, since it varies by racial \(subgroup\). Do we see it in the numbers? Yes. The baseline model \(pci \sim subgroup\) explains 52% of the variance, and adding \(n\_student\) doesn’t help much. \(years\_ed\_centered\) is not useful.

6.5.4 School performance score

Can we find meaningful associations with school performance grade score? Yes, but like \(pci\), not as strongly as median \(income\).

A very simple model is again the best: \(spg\_score \sim subgroup\) explains 52% of the variance.

Here we see the same three clusters of subgroups as noted in the plots in Section 6.4.
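The clusters can also be read off the model 12 estimates in Section 6.5.5. Assuming the usual treatment coding with AM7 as the reference, each subgroup's implied mean score is the intercept plus its coefficient. The sketch below does that arithmetic with the rounded estimates from the table; the variable names are mine.

```python
# Estimates for spg_score ~ subgroup (model 12 in Section 6.5.5)
INTERCEPT_AM7 = 56.7  # reference subgroup AM7
OFFSETS = {
    "AS7": 25.5, "BL7": 2.9, "EDS": 6.9, "HI7": 6.3,
    "MU7": 6.0, "SWD": -19.3, "WH7": 21.2,
}

# Implied mean score per subgroup = intercept + subgroup coefficient
implied_mean = {"AM7": INTERCEPT_AM7}
for subgroup, offset in OFFSETS.items():
    implied_mean[subgroup] = round(INTERCEPT_AM7 + offset, 1)

for subgroup in sorted(implied_mean, key=implied_mean.get):
    print(f"{subgroup}: {implied_mean[subgroup]:.1f}")
```

Sorted this way, SWD sits alone at about 37, five subgroups (AM7, BL7, MU7, HI7, EDS) fall between roughly 57 and 64, and WH7 and AS7 are near 78 and 82.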

What about \(pci\) instead of median \(income\)? For \(spg\_score \sim pci\), adjusted \(R^2\) is 0.31; adding \(subgroup\) raises it to 0.42, which is still less explanatory than using \(subgroup\) alone (0.52).
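In the table in Section 6.5.5 the \(pci\) coefficient displays as 0.0. This is most likely a matter of units rather than a zero effect: the slope is expressed per dollar, so it rounds to zero at one decimal place. The sketch below uses made-up data and a hand-written one-predictor least-squares fit (not the code behind the table) to show that rescaling the predictor to thousands of dollars multiplies the slope by 1,000 and leaves the intercept unchanged.

```python
def ols_slope_intercept(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical per capita incomes (dollars) and scores (assumed values)
pci_dollars = [18_000.0, 22_000.0, 27_000.0, 33_000.0]
spg_score = [58.0, 63.0, 66.0, 73.0]

slope_per_dollar, intercept_a = ols_slope_intercept(pci_dollars, spg_score)

pci_thousands = [v / 1_000 for v in pci_dollars]
slope_per_thousand, intercept_b = ols_slope_intercept(pci_thousands, spg_score)

print(f"per dollar:  {slope_per_dollar:.1f}")    # displays as 0.0 at one decimal
print(f"per $1,000:  {slope_per_thousand:.2f}")
```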

Figure 6.12: Models predicting spg_score

What if we look only at economically disadvantaged students (EDS)? There is a weak association: the higher the percentage of economically disadvantaged students in a school, the lower the \(spg\_score\). As seen in Section 6.5.5 (Models and model parameters), the model \(spg\_score \sim pct\_student\) using the data_grouping EDS explains only 4% of the variance in \(spg\_score\).

Figure 6.13: Considering only the economically disadvantaged student subgroup

6.5.5 Models and model parameters

The table below lists regression models and their key parameters. Some notes:

The mean years of education is 14.2 based on the informal mapping I created (see Years of education (school district aggregation)); \(years\_ed\_centered\) is the number of years +/- from this average. Thus when using \(years\_ed\_centered\) in a regression model, for example \(income \sim years\_ed\_centered\), the intercept indicates the income at the mean years of education.
Regression models that use \(subgroup\) as a term provide an estimate (for each subgroup) of the difference it makes relative to the reference subgroup, which does not show up in the list of regression estimates. In this case the reference subgroup is “AM7”, American Indian and Alaska Native (aka “native_am”).
\(n\_student\) is the number of students in a school taking the end of course (EOC) exams. In addition to using it in some regressions, it provides a relative size of the schools and is helpful when calculating district-wide summaries using weighted means.
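To illustrate the weighted means mentioned in the last note, here is a minimal sketch of a district-wide score in which each school's score is weighted by its \(n\_student\). The function name and the three schools are hypothetical.

```python
def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical district with three schools (assumed values)
spg_scores = [60.0, 70.0, 90.0]
n_student = [100, 300, 600]

simple = sum(spg_scores) / len(spg_scores)
weighted = weighted_mean(spg_scores, n_student)

print(f"unweighted mean: {simple:.1f}")
print(f"weighted mean:   {weighted:.1f}")
```

Because the largest school has the highest score in this made-up district, the weighted mean (81.0) is higher than the unweighted mean (about 73.3).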

Model summaries
mod_id	data_grouping	term	estimate	p.value	conf.low¹	conf.high¹	adj.r.squared	sigma	nobs
income ~ years_ed_centered
1	all	(Intercept)	38,373.7	0.00000	38,177.1	38,570.3	0.82	6,069	2581
1	all	years_ed_centered	4,166.9	0.00000	4,103.9	4,230.0	0.82	6,069	2581
income ~ years_ed_centered + n_student
2	all	(Intercept)	36,102.8	0.00000	35,790.0	36,415.5	0.84	5,822	2581
2	all	years_ed_centered	4,165.8	0.00000	4,105.4	4,226.3	0.84	5,822	2581
2	all	n_student	2.9	0.00000	2.6	3.2	0.84	5,822	2581
income ~ region
3	subgroup	(Intercept)	36,976.0	0.00000	36,226.0	37,726.0	0.03	14,101	2581
3	subgroup	regionCoastal	216.8	0.85905	−1,792.0	2,225.7	0.03	14,101	2581
3	subgroup	regionManufacturing	−1,015.3	0.36470	−2,858.1	827.4	0.03	14,101	2581
3	subgroup	regionMountains	−2,250.0	0.01164	−3,716.4	−783.5	0.03	14,101	2581
3	subgroup	regionUrban crescent	4,786.8	0.00000	3,721.3	5,852.2	0.03	14,101	2581
7	subgroup	(Intercept)	36,976.0	0.00000	36,226.0	37,726.0	0.03	14,101	2581
7	subgroup	regionCoastal	216.8	0.85905	−1,792.0	2,225.7	0.03	14,101	2581
7	subgroup	regionManufacturing	−1,015.3	0.36470	−2,858.1	827.4	0.03	14,101	2581
7	subgroup	regionMountains	−2,250.0	0.01164	−3,716.4	−783.5	0.03	14,101	2581
7	subgroup	regionUrban crescent	4,786.8	0.00000	3,721.3	5,852.2	0.03	14,101	2581
income ~ years_ed_centered + subgroup
4	subgroup	(Intercept)	35,192.1	0.00000	33,905.1	36,479.2	0.83	6,061	10639
4	subgroup	years_ed_centered	4,310.9	0.00000	4,279.9	4,341.9	0.83	6,061	10639
4	subgroup	subgroupAS7	6,898.9	0.00000	5,482.8	8,315.0	0.83	6,061	10639
4	subgroup	subgroupBL7	3,751.3	0.00000	2,441.7	5,060.9	0.83	6,061	10639
4	subgroup	subgroupEDS	3,193.5	0.00006	1,890.2	4,496.8	0.83	6,061	10639
4	subgroup	subgroupHI7	3,947.5	0.00000	2,637.7	5,257.4	0.83	6,061	10639
4	subgroup	subgroupMU7	5,107.4	0.00000	3,755.4	6,459.4	0.83	6,061	10639
4	subgroup	subgroupSWD	3,646.0	0.00000	2,336.4	4,955.6	0.83	6,061	10639
4	subgroup	subgroupWH7	3,346.7	0.00002	2,042.7	4,650.6	0.83	6,061	10639
income ~ years_ed_centered + subgroup + n_student
5	subgroup	(Intercept)	34,175.5	0.00000	32,900.2	35,450.7	0.84	5,986	10639
5	subgroup	years_ed_centered	4,310.6	0.00000	4,280.0	4,341.2	0.84	5,986	10639
5	subgroup	subgroupAS7	7,346.3	0.00000	5,947.1	8,745.6	0.84	5,986	10639
5	subgroup	subgroupBL7	3,593.5	0.00000	2,300.0	4,887.0	0.84	5,986	10639
5	subgroup	subgroupEDS	2,676.9	0.00063	1,388.6	3,965.1	0.84	5,986	10639
5	subgroup	subgroupHI7	4,158.7	0.00000	2,864.9	5,452.6	0.84	5,986	10639
5	subgroup	subgroupMU7	5,813.2	0.00000	4,476.0	7,150.3	0.84	5,986	10639
5	subgroup	subgroupSWD	4,152.2	0.00000	2,857.9	5,446.6	0.84	5,986	10639
5	subgroup	subgroupWH7	2,631.1	0.00079	1,341.3	3,920.9	0.84	5,986	10639
5	subgroup	n_student	4.1	0.00000	3.7	4.5	0.84	5,986	10639
income ~ subgroup
6	subgroup	(Intercept)	35,192.1	0.00000	32,058.3	38,326.0	0.00	14,757	10639
6	subgroup	subgroupAS7	6,898.9	0.00100	3,450.9	10,346.9	0.00	14,757	10639
6	subgroup	subgroupBL7	3,746.8	0.05328	558.0	6,935.5	0.00	14,757	10639
6	subgroup	subgroupEDS	3,189.1	0.09833	15.7	6,362.6	0.00	14,757	10639
6	subgroup	subgroupHI7	3,935.2	0.04241	745.8	7,124.6	0.00	14,757	10639
6	subgroup	subgroupMU7	5,107.4	0.01072	1,815.4	8,399.3	0.00	14,757	10639
6	subgroup	subgroupSWD	3,643.0	0.06022	454.3	6,831.6	0.00	14,757	10639
6	subgroup	subgroupWH7	3,351.2	0.08253	176.3	6,526.2	0.00	14,757	10639
pci ~ years_ed_centered
8	all	(Intercept)	29,001.7	0.00000	28,809.8	29,193.5	0.00	5,957	2610
8	all	years_ed_centered	0.0	1.00000	−61.4	61.4	0.00	5,957	2610
pci ~ subgroup
9	subgroup	(Intercept)	18,014.3	0.00000	16,695.1	19,333.6	0.52	6,212	4955
9	subgroup	subgroupAS7	12,975.5	0.00000	11,524.0	14,427.1	0.52	6,212	4955
9	subgroup	subgroupBL7	3,379.7	0.00003	2,037.6	4,721.7	0.52	6,212	4955
9	subgroup	subgroupMU7	−1,733.0	0.03964	−3,118.2	−347.7	0.52	6,212	4955
9	subgroup	subgroupWH7	14,715.8	0.00000	13,379.4	16,052.2	0.52	6,212	4955
pci ~ subgroup + n_student
10	subgroup	(Intercept)	16,721.9	0.00000	15,431.8	18,012.1	0.54	6,045	4955
10	subgroup	subgroupAS7	13,544.4	0.00000	12,130.7	14,958.0	0.54	6,045	4955
10	subgroup	subgroupBL7	3,178.6	0.00006	1,872.5	4,484.8	0.54	6,045	4955
10	subgroup	subgroupMU7	−832.8	0.31058	−2,183.8	518.2	0.54	6,045	4955
10	subgroup	subgroupWH7	13,816.0	0.00000	12,512.4	15,119.5	0.54	6,045	4955
10	subgroup	n_student	5.2	0.00000	4.6	5.7	0.54	6,045	4955
spg_score ~ income
11	all	(Intercept)	71.4	0.00000	70.2	72.7	0.00	14	2581
11	all	income	0.0	0.08858	0.0	0.0	0.00	14	2581
spg_score ~ subgroup
12	subgroup	(Intercept)	56.7	0.00000	54.0	59.3	0.52	12	10665
12	subgroup	subgroupAS7	25.5	0.00000	22.6	28.4	0.52	12	10665
12	subgroup	subgroupBL7	2.9	0.07822	0.2	5.5	0.52	12	10665
12	subgroup	subgroupEDS	6.9	0.00002	4.2	9.6	0.52	12	10665
12	subgroup	subgroupHI7	6.3	0.00010	3.7	9.0	0.52	12	10665
12	subgroup	subgroupMU7	6.0	0.00038	3.2	8.7	0.52	12	10665
12	subgroup	subgroupSWD	−19.3	0.00000	−22.0	−16.6	0.52	12	10665
12	subgroup	subgroupWH7	21.2	0.00000	18.6	23.9	0.52	12	10665
spg_score ~ subgroup + n_student
13	subgroup	(Intercept)	57.7	0.00000	55.1	60.4	0.52	12	10665
13	subgroup	subgroupAS7	25.0	0.00000	22.1	27.9	0.52	12	10665
13	subgroup	subgroupBL7	3.0	0.06190	0.4	5.7	0.52	12	10665
13	subgroup	subgroupEDS	7.4	0.00000	4.8	10.1	0.52	12	10665
13	subgroup	subgroupHI7	6.1	0.00016	3.5	8.8	0.52	12	10665
13	subgroup	subgroupMU7	5.2	0.00183	2.5	8.0	0.52	12	10665
13	subgroup	subgroupSWD	−19.8	0.00000	−22.5	−17.2	0.52	12	10665
13	subgroup	subgroupWH7	22.0	0.00000	19.3	24.6	0.52	12	10665
13	subgroup	n_student	0.0	0.00000	0.0	0.0	0.52	12	10665
spg_score ~ pct_student
14	EDS	(Intercept)	71.4	0.00000	69.9	72.8	0.04	13	2370
14	EDS	pct_student	−16.3	0.00000	−19.1	−13.4	0.04	13	2370
spg_score ~ pci
15	subgroup	(Intercept)	44.5	0.00000	43.6	45.4	0.31	13	4955
15	subgroup	pci	0.0	0.00000	0.0	0.0	0.31	13	4955
spg_score ~ pci + subgroup
16	subgroup	(Intercept)	46.9	0.00000	44.4	49.5	0.42	12	4955
16	subgroup	pci	0.0	0.00000	0.0	0.0	0.42	12	4955
16	subgroup	subgroupAS7	18.5	0.00000	15.7	21.2	0.42	12	4955
16	subgroup	subgroupBL7	1.0	0.52798	−1.5	3.5	0.42	12	4955
16	subgroup	subgroupMU7	6.9	0.00001	4.3	9.5	0.42	12	4955
16	subgroup	subgroupWH7	13.3	0.00000	10.7	15.8	0.42	12	4955
¹ Confidence interval: 90%

Figure 6.14: Model results