Research In Progress

Kim, B. H., Park, J., Lo, P., Baker, D., Wong, N., Breen, S., Truong, H., Zheng, J., Rosinger, K., & Poon, O. (2024). Inequity and College Applications: Assessing Differences and Disparities in Letters of Recommendation from School Counselors with Natural Language Processing. Work generously supported by the Bill and Melinda Gates Foundation. [Open-access working paper version]

    Letters of recommendation from school counselors are required to apply to many selective colleges and universities. Still, relatively little is known about how this non-standardized component may affect equity in admissions. We use cutting-edge natural language processing techniques to algorithmically analyze a national dataset of over 600,000 student applications and counselor recommendation letters submitted via the Common App platform. We examine how the length and topical content of letters (e.g., sentences about Personal Qualities, Athletics, Intellectual Promise, etc.) relate to student self-identified race/ethnicity, sex, and proxies for socioeconomic status. Paired with regression analyses, we explore whether demographic differences in letter characteristics persist when accounting for additional student, school, and counselor characteristics, as well as among letters written by the same counselor and among students with comparably competitive standardized test scores. We ultimately find large and noteworthy naïve differences in letter length and content across nearly all demographic groups, many in alignment with known inequities (e.g., many more sentences about Athletics among White and higher-SES students, longer letters and more sentences on Personal Qualities for private school students). However, these differences vary drastically based on the exact controls and comparison groups included – demonstrating that the ultimate implications of these letter differences for equity hinge on exactly how and when letters are used in admissions processes (e.g., are letters evaluated at face value across all students, or are they mostly compared to other letters from the same high school or counselor?). Findings do not point to a clear recommendation on whether institutions should keep or discard letter requirements, but reflect the importance of reading letters and overall applications in the context of structural opportunity. We discuss additional implications and possible recommendations for college access and admissions policy/practice.

Kim, B. H. (2022). What’s in a Letter? Using Natural Language Processing to Investigate Systematic Differences in Teacher Letters of Recommendation. Work generously supported by the National Academy of Education and Spencer Foundation Dissertation Fellowship. [Open-source codebase via GitHub] [Pre-analysis plan] [Dissertation manuscript]

    While scholars have already uncovered many ways that inequities can manifest across the postsecondary application portfolio – from standardized tests to advanced course-taking opportunities – we know almost nothing about whether teacher letters of recommendation also present differential barriers to students’ college aspirations. This blind spot is especially concerning given mounting evidence that recommendation letters in other contexts can contain biased language, that teachers can form biased perceptions of their students’ abilities, and that narrative application components more generally may contribute to racial discrimination in selective college admissions. In this paper, I conduct the first system-wide, large-scale text analysis of teacher recommendation letters in U.S. postsecondary applications using data from 1.6 million students, 540,000 teachers, and 800 postsecondary institutions. I use sophisticated natural language processing methods to examine the prevalence of potential inequities within these letters: whether students are described by teachers in systematically different ways across race and gender groups, even after accounting for salient confounding factors like student academic and extracurricular qualifications, teacher fixed effects, and institution fixed effects. I find evidence of salient linguistic differences in letters across gender, but less evidence for differences across race – except in the case of highly competitive admissions, where both Black and Asian students tend to have markedly different letters than White students. Moreover, these differences are generally most meaningful in terms of the topical content of letters; differences in terms of the positivity of letters are far smaller in relative magnitude and thus are less likely to be perceptible in the actual reading of letters. Taken together, these findings have broad implications for the use of recommendation letters in selective admissions, affirmative action policies, and gender diversity in STEM fields.

Peer-Reviewed Publications

Kim, B. H., Bird, K. A., & Castleman, B. L. (2022). Crossing the Finish Line but Losing the Race? Socioeconomic Inequalities in the Labor Market Trajectories of Community College Graduates. Forthcoming at Education Finance and Policy. [Journal article link] [Open-access working paper version]

    Despite decades and hundreds of billions of dollars of federal and state investment in policies to promote postsecondary educational attainment as a key lever for increasing the economic mobility of lower-income populations, research continues to show large and meaningful differences in the mid-career earnings of students from families in the bottom and top income quintiles. Prior research has not disentangled whether these disparities are due to differential sorting into colleges and majors, or due to barriers lower socioeconomic status (SES) graduates encounter during the college-to-career transition. Using linked individual-level higher education and Unemployment Insurance (UI) records for nearly a decade of students from the Virginia Community College System (VCCS), we compare the labor market outcomes of higher- and lower-SES community college graduates within the same college, program, and academic performance level. Our analyses show that, conditional on employment, lower-SES graduates earn nearly $500/quarter less than their higher-SES peers one year after graduation, relative to a higher-SES graduate average of $10,846/quarter. The magnitude of this disparity persists through at least three years after graduation. Disparities are concentrated among non-Nursing programs, in which gaps persist seven years from graduation. Our results highlight the importance of greater focus on the college-to-career transition.

Rodriguez-Segura, D. & Kim, B. H. (2021). The Last Mile in School Access: Mapping Education Deserts in Developing Countries. Development Engineering, 100064. [Open-source codebase via GitHub] [Open-access journal article]

    With recent advances in high-resolution satellite imagery and machine vision algorithms, fine-grain geospatial data on population are now widely available: kilometer-by-kilometer, worldwide. In this paper, we showcase how researchers and policymakers in developing countries can leverage these novel data to precisely identify “education deserts” – localized areas where families lack physical access to education – at unprecedented scale, detail, and cost-effectiveness. We demonstrate how these analyses could valuably inform educational access initiatives like school construction and transportation investments, and outline a variety of analytic extensions to gain deeper insight into the state of school access across a given country. We conduct a proof-of-concept analysis in the context of Guatemala, which has historically struggled with educational access, as a demonstration of the utility, viability, and flexibility of our proposed approach. We find that the vast majority of the Guatemalan population lives within 3 km of a public primary school, indicating a generally low incidence of distance as a barrier to education in that context. However, we still identify concentrated pockets of population for whom the distance to school remains prohibitive, revealing important geographic variation within the strong countrywide average. Finally, we show how even a small number of optimally-placed schools in these areas, using a simple algorithm we develop, could substantially reduce the incidence of “education deserts” in this context. We make our entire codebase available to the public – fully free, open-source, heavily documented, and designed for broad use – allowing analysts across contexts to easily replicate our proposed analyses for other countries, educational levels, and public goods more generally.

Working Papers

Bartanen, B., Kwok, A., Avitabile, A., & Kim, B. H. (2023). Why Do You Want to Be a Teacher? A Natural Language Processing Approach. [Open-access working paper]

    Heightened concerns about the health of the teaching profession highlight the importance of studying the early teacher pipeline. This exploratory, descriptive paper examines preservice teachers’ (PST) expressed motivation for pursuing a teaching career and its relationship with PST characteristics and outcomes. Using data from one of the largest teacher education programs in Texas, we use a natural language processing algorithm to categorize into topical groups roughly 2,800 essay responses to the prompt, “Explain why you decided to become a teacher.” We identify 11 topics that largely reflect altruistic and intrinsic (though not extrinsic) reasons for teaching. The frequency of motivation topics varied substantially by PST gender, race/ethnicity, and certification area. While topics collectively explained little of the variance in PST outcomes, we found preliminary evidence that intrinsic enjoyment of teaching and prior experiences with adversity predicted higher performance during clinical teaching and lower attrition as a full-time K–12 teacher.

Park, J., Kim, B. H., Wong, N., Zheng, J., Breen, S., Lo, P., Baker, D., Rosinger, K., Nguyen, M. H., & Poon, O. (2023). Inequality Beyond Standardized Tests: Trends in Extracurricular Activity Reporting in College Applications Across Race and Class. [Open-source codebase via GitHub] [Open-access working paper]

    Inequality related to standardized tests in college admissions has long been a subject of discussion; less is known about inequality in non-standardized components of the college application. We analyzed extracurricular activity descriptions in 5,967,920 applications submitted through the Common Application platform. Using human-crafted keyword dictionaries combined with text-as-data (natural language processing) methods, we found that White, Asian American, high-SES, and private school students reported substantially more activities, more activities with top-level leadership roles, and more activities with distinctive accomplishments (e.g., honors, awards). Disparities decrease when accounting for other applicant demographics, school fixed effects, and standardized test scores. Still, salient differences remain, especially those related to first-generation applicants. Implications and recommendations for college admissions policy and practice are discussed.

Kim, B. H., Meyer, K., & Choe, A. (2023). Gauging Engagement: Measuring Student Response to a Large-Scale College Advising Field Experiment. [Open-source codebase via GitHub] [Open-access working paper]

    Interactive, text message-based advising programs have become an increasingly common strategy to support college access and success for underrepresented student populations. Because text conversations between students and advisors are flexible and responsive to student input, students engaged in advising interventions of this kind may experience different treatments from one another. Given the unstructured, textual nature of these interactions, it has historically been difficult to characterize this variation. In this paper, we revisit data from a large-scale text advising experiment designed to improve college completion and measure treatment variation using automated text analysis techniques. We examine text interactions between a student and their advisor using natural language processing to quantify variation in the intensity (e.g., number and length of student replies), tone (e.g., positivity/negativity), and topics (e.g., financial aid) of messages. Our results reveal substantial variation in sentiment and topics discussed among student- or advisor-initiated messages (i.e., non-scheduled messages), but little variation in the scheduled messages. These findings highlight the potential for treatment variability to increase as advising models encourage greater personalization or advisor agency, and demonstrate the importance of measuring such treatment variation to better understand program implementation across sites and students.

Technical Reports & Policy Briefs

Kim, B. H., Armstrong, E., Freeman, M., Hughes, R., Kajikawa, T., & Nolan, S. (2024). First-generation Status in Context. Common App Research Briefs. [Research brief part one] [Research brief part two] [Research brief part three] [Press: Inside Higher Ed] [Press: The Chronicle of Higher Education] [Press: The Hechinger Report]

    As policymakers and the public continue to lean on colleges as important engines for socioeconomic mobility and opportunity in our society, it is increasingly crucial to ask: how can we ensure that these institutions are accessible to students and families who have limited or no college exposure in their background? While supporting “first-generation” students has become an increasing policy and programmatic priority across the United States, organizations can differ widely in terms of who they actually mean when they talk about first-generation students, and, commensurately, what the actual accessibility needs of this population are. Through this three-part research series, we take a deep dive into first-generation status, parental education, and a host of related student characteristics. At the center of this examination are nearly a decade of application data for over 9 million domestic applicants from the Common App data warehouse. Across these three briefs, we ask the following primary questions: How have key components for defining first-generation status, like household structure, parental degree attainment, and related family structure details, changed over time? How does the exact definition of first-generation being used change who is considered a part of this population? And, finally: What more can we learn about applicants’ college readiness, socioeconomic status, and application behaviors when we look at finer-grain combinations of parental educational attainment groups versus the binary of first-generation and continuing-generation? We ultimately show that first-generation status and parental education backgrounds are exceptionally complex and nuanced constructs that require commensurate care, transparency, and intentionality in their use.

Kim, B. H., Freeman, M., Kajikawa, T., Karimi, H., & Magouirk, P. (2022). Unpacking Applicant Race and Ethnicity. Common App Research Briefs. [Research brief part one] [Research brief part two] [Press: Higher Ed Dive] [Press: Inside Higher Ed]

    Improving access, equity, and integrity in the college admissions process is the core mission of the Common App. Critical to understanding our progress in this mission is measuring how different students – especially those from diverse racial and ethnic backgrounds – access and navigate the complex college application process. In this research brief series, we use the detailed race and ethnicity data that first-year domestic applicants submit through the Common App to offer one of the most nuanced examinations of demographic trends in college applications to date. Whereas past analyses tend to use industry-standard race and ethnicity categories as defined by the U.S. Office of Management and Budget and used by the U.S. Census Bureau (e.g., exclusively White, Black or African American, Hispanic/Latinx, Two or More Races, etc.) that simplify and conceal nuance, we are able to unpack these groupings further: for example, applicants who identify as Asian are further invited to describe their background as Cambodian, Chinese, Malaysian, and so on, while applicants who identify as Black or African American can further describe their background as African, Caribbean, and more. We ultimately use these data to shed light on how industry-standard racial/ethnic categories can: (a) conceal the importantly distinct populations within these standard categories (e.g., within the Hispanic/Latinx group), (b) conceal how the composition of distinct groups within each category of applicant race/ethnicity is constantly shifting in meaningful ways over time and across regions, and (c) conceal the fact that applicants’ individual resources (e.g., low-income status and household income estimates), college readiness (e.g., reported GPA and standardized test scores) and application behaviors (e.g., number of applications sent and selectivity of institutions applied to) within these groups can differ radically across the more detailed background groupings. These observations ultimately highlight some of the issues inherent in simplifying applicants’ race/ethnicity into standard categories and emphasize the critical importance of incorporating greater context about applicants’ identities whenever possible.

Kim, B. H., Castleman, B. L., Song, Y., & Choe, A. (2022). New Strategies to Support Career Entry for Community College Graduates: Augmenting Intensive Career Advising Services with a Novel Job Recommendation Algorithm and Machine Learning. Work generously supported by the Ascendium Education Group. [Project launch press release] [Mid-project technical update] [Final project report] [Open-source codebase via GitHub]

    Individuals with college degrees experience better average labor market outcomes than non-degree holders, especially in times of economic downturn; the labor market premium associated with a college degree is particularly stark in the midst of the economic fallout from the COVID-19 health crisis. That said, research continues to show large and meaningful differences in the mid-career earnings of college students from higher- and lower-income families (e.g., Chetty et al., 2017). Such disparities in economic well-being among students completing college and even graduating from the same college with the same GPA (Kim et al., 2022) raise a fundamental question: Are investments to increase degree attainment among lower-income students sufficient to narrow longer-run economic inequality, or are investments to ameliorate the barriers that graduates encounter after college also necessary to ensure positive labor market outcomes and upward economic mobility? Much like how policy and practice coalesced around the importance of supporting historically underserved students through the complex decision-making process of college applications, we anticipate similar interventions will be necessary to support these same students through the complex job market for college graduates. In this project, we developed (but ultimately did not implement) a large-scale, intrusive career advising intervention for community college students: one that leverages rich longitudinal workforce data and machine learning/predictive analytics to guide students towards available jobs relevant to their degree and with a track record of high-paying wages for similarly qualified graduates. By pairing these job recommendations with intensive career advising, we also intended to provide students explicit support in navigating the nuances and informal expectations of the college graduate job market. While we ultimately opted not to implement the intervention, our final algorithm code is available open-source, and our final report articulating our decision-making, design process, and final evaluation of the algorithm (including assessments for algorithmic bias) will be available soon.

Kim, B. H. (2021). Supporting Students at Any Cost? Examining the Dynamics of Teacher Out-of-Pocket Spending, Student Demographics, and Teacher Autonomy. [Pre-analysis plan] [Open-source codebase via GitHub] [Research report]

    Nearly every public school teacher in the country regularly spends their own personal funds to purchase classroom supplies, with amounts ranging from tens of dollars to well over a thousand each year. Past descriptive work on the subject suggests that teachers are often attempting to support students in ways their pre-existing school budgets either can’t or won’t, indicating that higher teacher out-of-pocket spending may be a useful proxy to understand the degree of student need otherwise going unmet in our classrooms. In this report, I explore this link further by examining the relationship between teacher out-of-pocket spending, student race/ethnicity, and self-reported teacher autonomy over classroom instruction and materials, with data from the NCES Schools and Staffing Survey. I find that as the share of racial/ethnic minority students in a school increases, teacher spending also increases, and this relationship holds when accounting for factors like school urbanicity, teacher experience, and interactions with the share of students qualifying for free and reduced-price lunch. For example, teachers in schools with 75-100% racial/ethnic minority students spend about $130 more per year than peer teachers in schools with 0-24% racial/ethnic minority students – an approximately 31% difference. Indeed, the results offer suggestive evidence that the link between student race/ethnicity and teacher spending is more influential than the well-studied link between student poverty and teacher spending. I also show that higher levels of teacher autonomy are negatively associated with teacher spending at a comparable magnitude, independent of student demographics – in other words, that higher levels of teacher autonomy over classroom supplies predict substantially lower teacher spending. Altogether, these results offer additional evidence that teacher spending may represent a useful proxy for unmet student need, and that teachers in schools with greater shares of racial/ethnic minority students and lower autonomy may struggle the most to deliver the high-quality instruction they strive towards.

Castleman, B. L., Bird, K. A., & Kim, B. H. (2019). Pathways to Success: Analyzing Program-Level Heterogeneity in Labor Market Outcomes for a State Community College System. Working paper available upon request.

    Despite a significant body of evidence demonstrating program-level heterogeneity in the wage returns to a community college degree, we currently know little about the extent of program-level heterogeneity in non-wage labor market outcomes for community college graduates. We build on an existing literature by investigating the degree of institution- and program-level heterogeneity across several measures of employability, employment stability, and earnings for graduates of a large state community college system. We further examine whether the relative performance of colleges and programs is sensitive to the specific labor market metric we consider. Our descriptive results indicate substantial changes in the rank ordering of institutions or programs based on the labor market metric we employ. These findings demonstrate the importance of considering – and potentially increasing public sharing of – a broader range of labor market outcomes when assessing community college institutions or program returns.

Additional Research

Kim, B. H. (2020). Assessing the Role of Class Size Restrictions in Mitigating Community College Student COVID-19 Exposure through Student Network Analysis. Internal research report unavailable to public; please reach out for more information.

    At the outset of the COVID-19 pandemic, higher education institution leaders were faced with difficult decisions about whether and how to operate their academic programs safely given incomplete information on the dangers, infectiousness, and transmission vectors of coronavirus at the time. As leaders look ahead to future semesters, class size restrictions are an increasingly attractive policy option for institutions seeking to maintain some level of in-person instruction. Building on work by Cornwell and Weeden (2020), I use network analysis with students’ course-taking patterns to assess the extent to which varying in-person class size restrictions meaningfully reduces the connectedness of students’ in-person interaction networks for a large state community college network at the campus-by-campus level. I moreover investigate the extent to which varying class size restrictions impacts students’ ability to attend any in-person classes at all, with specific interest in how students of various demographics are differentially forced entirely online by such class size restriction regimes. My main conclusion is that class size limitations would need to be far more aggressive in this context — allowing in-person meetings only for classes of approximately 25 students or fewer — than most of the popular policies being considered to meaningfully reduce students’ exposure to one another through coursework. I also find no evidence of concerning patterns in students being forced entirely online along several salient lines of equity: race/ethnicity, sex, and first-generation status. In other words, class size limitations seem to impact students’ course-taking modality similarly regardless of their demographics on these dimensions.

Kim, B. H., & Castleman, B. L. (2020). Can Predictive Analytics Improve the Efficiency of High-Cost Interventions? Evidence from an Intensive College Advising Program. Internal research report unavailable to public; please reach out for more information.

    Education leaders seeking to improve equity in their institutions are often caught in an intractable bind: evidence-based interventions that successfully support improved outcomes among historically underserved students are often logistically intensive to implement and extremely expensive on a per-pupil basis, and budgets often don’t permit broad access to these programs as a result. While “nudge” style interventions were slated to help leaders at least partially resolve this cost-effectiveness quandary, recent evidence has found that maintaining the efficacy of these interventions is difficult as programs scale. Advances in predictive analytics and machine learning techniques now offer another possible solution: implement these costly interventions, but target them only to those students who stand to benefit most from them. In this report, we evaluate the effectiveness of one such intervention at a large public university system that provided more intensive college advising resources to students predicted to be at higher risk of dropping out by a machine learning algorithm. Because students were assigned a continuous risk score and then categorized into discrete risk groups (e.g., highest risk, high risk, etc.) based on strict score thresholds, we employ a regression discontinuity design to evaluate the effectiveness of these additional intensive advising supports at each risk group threshold. Our results reveal that the added intervention supports did not improve student completion, credit accumulation, or grade point accumulation at any of the investigated thresholds, but we often cannot rule out the presence of meaningful effect sizes due to low precision. Examinations of program take-up measures around each threshold (e.g., number of advising meetings attended) reveal that these null effects are likely driven by the fact that students above each cutoff tended not to utilize the additional services available to them.

Kim, B. H. & Castleman, B. L. (2019). Can Earnings Outcomes Drive Student Enrollment to High-Earnings Community College Programs? The Impact of Integrating Earnings Data into a Popular Search Engine Platform. Internal research report unavailable to public; please reach out for more information.

    Enrollment and re-enrollment into postsecondary education can be a highly complex and difficult process for prospective students despite the consistently high economic returns to postsecondary degrees. Even with broad federal efforts to simplify the relevant information (e.g., the U.S. Department of Education’s College Scorecard) and provide additional college advising supports to historically underserved student populations, enrollment among adults without postsecondary degrees remains low. Through a novel partnership with a large state-wide community college network and a popular search engine platform, we develop a data tool that presents search engine users with the mid-career earnings of graduates from high-earnings programs at their local community college, alongside relevant links for more information on enrollment, when users search for related terms. Using a difference-in-differences design, we compare the applications and enrollment for “target” programs at participating institutions against similar programs at non-participating institutions to examine whether the roll-out of this data tool impacted public interest in these programs. In brief, we find that the data tool did not produce detectable effects on either application volume or enrollment rates, driven largely by a relatively diffuse treatment timing window and low general precision given the noisiness and idiosyncratic nature of application and enrollment rates over time.

Kim, B. H. (2019). Pathways to Success: Improving the Transparency of Student Outcomes in a Large State Community College System. Internal research report unavailable to public; please reach out for more information.

    One of the most consequential decisions for a community college student is their choice of major: the best causal evidence we have available shows that, depending on the major, an associate’s degree can increase the yearly earnings of graduates by as much as 103% and as little as 0%. Unfortunately, research also demonstrates that students severely misjudge the earnings associated with different majors, and detailed post-graduation employment outcomes by major tend to be difficult to obtain or otherwise inaccessible to prospective students – these factors combined making it difficult for students to make informed decisions about the critical question of their major. In this analysis, I present four policy alternatives that a large state community college partner could implement to address this lack of outcomes transparency within its network and improve the extent to which students are able to incorporate these valuable data into their enrollment decisions. I go on to estimate the possible repercussions of each of these alternatives in terms of their implementation costs, their utility to students, and their likely influence on student major decisions. I conclude by recommending that the partner expand college advising services with these data, and offer explicit guidance on how this recommendation could be implemented.

Kim, B. H., & Castleman, B. L. (2018). Exploring Heterogeneous Treatment Effects with Causal Forests: Evidence from a Large-Scale College Advising Nudge Experiment. Internal research report unavailable to public; please reach out for more information.

    In the context of policy research, policymakers are often interested not just in whether a policy works, but more specifically for whom it works. Moreover, commonly reported average treatment effects can often mask meaningful differences by individual demographics that fundamentally change the value proposition of an intervention measured to be either effective or ineffective on average. Even so, examining these heterogeneous treatment effects can be complicated without strong theoretical priors – especially if interaction effects are likely – given the need to explore many possibilities and thus the potential for finding spurious relationships just by statistical chance. Causal forests (Wager & Athey, 2018) present an attractive potential solution to this problem by leveraging machine learning methods to estimate these heterogeneous treatment effects in an iterative but principled manner. In this report, I revisit the results of a large-scale college advising nudge experiment that previously found precise null average effects across multiple years of data and tens of thousands of subjects. Using an implementation of causal forests, I explore the extent to which these null effects potentially mask heterogeneous treatment effects across a rich array of student demographic data. In sum, I find that the average null effects were estimated to be broadly applicable across student subgroups, and that heterogeneous treatment effects are unlikely to be a factor in this context.

Kim, B. H., Castleman, B. L., & Song, Y. (2018). Do Coaching Styles Matter for Principal Improvement? An Application of Natural Language Processing Methods to a Principal Improvement Coaching Intervention. Internal research report unavailable to public; please reach out for more information.

    K-12 principals are positioned as critical change-making agents within schools, and the policy research realm is only just beginning to explore their impact on student outcomes, teacher outcomes, and broader community outcomes. As the wave of high-profile teacher quality improvement policies continues in earnest, policymakers are increasingly looking for effective strategies to similarly improve the quality of their principals. Yet to date, we know relatively little about what interventions — if any — can support the development of principals. As part of an impact evaluation for a multi-site principal coaching program, we examine the extent to which specific coaching session content varies meaningfully with eventual principal improvement on a series of principal quality measures. We leverage a natural language processing technique known as word vector cluster analysis to analyze the detailed session notes written by coaches and measure how often coaches focused on varying skills, topics, and leadership frameworks. Our analysis reveals relatively little variation in coaching session content overall across coach-principal pairings, and similarly a lack of relationship between these measures and eventual principal quality outcomes. We conclude by offering recommendations for future applications of these automated content analysis techniques and coaching interventions of this style.