<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Katie Johnson</title>
 <link href="https://katiejohnson.github.io/atom.xml" rel="self"/>
 <link href="https://katiejohnson.github.io/"/>
 <updated>2026-07-04T16:53:10-07:00</updated>
 <id>https://katiejohnson.github.io</id>
 <author>
   <name>Katie Johnson</name>
   <email>thisiskatiejohnson@proton.me</email>
 </author>

 
 <entry>
   <title>Remote Sensing–ML Approach for Household Wealth Index Estimation</title>
   <link href="https://katiejohnson.github.io/2025/11/01/wealth-index-estimation/"/>
   <updated>2025-11-01T00:00:00-07:00</updated>
   <id>https://katiejohnson.github.io/2025/11/01/wealth-index-estimation</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/images/projects/wealth-index-estimation.png&quot; alt=&quot;Median Household Income by Block Group and Dasymetric Downscaled Median HH Income, Rhode Island&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;U.S. Census Bureau — xD / SEHSD / AI and Global Development Lab&lt;/strong&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;the-problem&quot;&gt;The Problem&lt;/h3&gt;

&lt;p&gt;The Census Bureau’s mission is to serve as the nation’s leading provider of quality data about its people and economy — but existing tools for understanding economic wellbeing at fine spatial resolution are constrained by fundamental survey design limits. Traditional small area estimation methods become statistically unreliable below the block group level, carry high compliance overhead, and can take years to reflect ground conditions.&lt;/p&gt;

&lt;p&gt;Existing wealth measures capture only part of the picture. The Urban Institute’s True Cost of Economic Security (TCES) — which compares a family’s total costs (housing, health care, food, transportation, child care, savings, student debt, taxes, and other costs) against total resources — reveals that even families with meaningful incomes may not achieve genuine economic security when the full cost of living is accounted for. This cost-resource gap varies dramatically by geography, but current data tools cannot resolve it below the county or state level. The result: economic development interventions are targeted with a blunt instrument when a scalpel is needed.&lt;/p&gt;

&lt;p&gt;The map produced in this project makes the problem legible at a finer grain. Rhode Island’s 2023 ACS 5-year block group income data, one of the primary ground truth inputs, reveals stark spatial inequality even within a small state: dense urban cores (Providence and surrounding areas) show median household incomes concentrated below $100K, and in some block groups below $50K, while suburban and coastal areas reach $150K–$250K. These differences matter enormously for families trying to meet basic costs, yet they are invisible to policy tools operating at the county level.&lt;/p&gt;

&lt;p&gt;The core observation motivating this project: from Earth’s orbit, it’s possible to observe economic development directly — through construction and maintenance of housing, farmland, roads, and infrastructure. A temporally aware model tracking this development over time can estimate not just current wealth levels but trajectories of change — something surveys fundamentally cannot do between collection cycles.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;the-approach&quot;&gt;The Approach&lt;/h3&gt;

&lt;p&gt;This project developed a geo-temporal EO-ML pipeline to predict average material wealth at the 1km grid level, designed to support targeting of social and economic policies at the neighborhood level. The wealth index is framed as a gap measure: absolute income minus a regionally relative cost of living, using TCES or the Self-Sufficiency Standard as the cost benchmark. This framing makes outputs comparable across different regional contexts. Two phases were completed, with GEE satellite image ingestion as the planned next step.&lt;/p&gt;
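
&lt;p&gt;As a minimal sketch of the gap framing, the snippet below subtracts an annual cost benchmark from income for each grid cell. The function name, field names, and dollar figures are hypothetical illustrations, not project data or the project’s actual code.&lt;/p&gt;

```python
def wealth_gap(income, cost_benchmark):
    """Return income minus a cost-of-living benchmark (same units, e.g. USD/year)."""
    return income - cost_benchmark


# Hypothetical cells: identical income, different local cost benchmarks.
cells = [
    {"cell_id": "A", "income": 80000, "cost": 95000},
    {"cell_id": "B", "income": 80000, "cost": 60000},
]

# Map each cell to its gap; a negative value means income falls short of costs.
gaps = {c["cell_id"]: wealth_gap(c["income"], c["cost"]) for c in cells}
print(gaps)
```

&lt;p&gt;A negative gap marks a cell where typical income falls short of the benchmark, even though the income figure is the same in both cells.&lt;/p&gt;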

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Phase 1 — Survey data preparation and visualization (R)&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Phase 2 — Survey data formatting for model input (Python)&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Phase 3 — Satellite image export (configured, next step)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Disclosure risk was addressed throughout using xD pre-mortem frameworks.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;what-it-makes-possible&quot;&gt;What It Makes Possible&lt;/h3&gt;

&lt;p&gt;The Rhode Island block group map produced in this work is itself a demonstration of the problem: meaningful income inequality is visible at the neighborhood level, but existing federal data products can’t track how it changes over time or predict where it’s heading. A completed version of this pipeline would produce historical, present, and projected wealth index estimates at 1km resolution — enabling:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Program targeting by NGOs and governments&lt;/li&gt;
  &lt;li&gt;Public infrastructure planning&lt;/li&gt;
  &lt;li&gt;Disaster preparedness&lt;/li&gt;
  &lt;li&gt;Causal analysis of policy treatment effects on wealth trajectories&lt;/li&gt;
  &lt;li&gt;Health and workforce development planning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The TCES framework provides a particularly powerful lens for interpreting outputs: a neighborhood where incomes appear adequate at the county level may still show a persistent cost-resource gap when housing, child care, student debt, and savings needs are fully accounted for. This project was designed to make that gap visible at the spatial resolution where interventions can actually be made.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;what-i-learned--whats-next&quot;&gt;What I Learned / What’s Next&lt;/h3&gt;

&lt;p&gt;Implementation reached a meaningful and reproducible milestone: a complete data preparation pipeline producing model-ready input data across three survey vintages, a validated block group income visualization, and a configured GEE exporter ready to run. The work that remains — satellite export, model training, and validation against held-out Rhode Island data — is well-scoped and picks up from clear documentation.&lt;/p&gt;

&lt;p&gt;The broader open questions are institutional: at what level of estimated uncertainty does this approach become appropriate for official federal statistics, and which downstream applications are ready to use outputs at the current validation level?&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;stack--methods&quot;&gt;Stack &amp;amp; Methods&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Languages &amp;amp; Libraries:&lt;/strong&gt; R (tidycensus, terra, sf, exactextractr, sfarrow, ggplot2, dplyr), Python (pandas, numpy, ee, configparser, multiprocessing)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data &amp;amp; Infrastructure:&lt;/strong&gt; Google Earth Engine (Landsat 5/7/8, high-volume endpoint), LandScan Global population raster, ACS 5-year estimates (table B19013, median household income)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Methods:&lt;/strong&gt; Dasymetric downscaling, TCES/Self-Sufficiency Standard wealth gap framing, LSTM/ResNet-18 architecture (Daoud/Adel-Petterson), Pearson’s r² evaluation, xD/Data &amp;amp; Society disclosure pre-mortem&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scope:&lt;/strong&gt; Rhode Island PoC — 11,926 grid cells, 2013/2018/2023 vintages&lt;/p&gt;
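
&lt;p&gt;For readers unfamiliar with the dasymetric downscaling listed under Methods, here is a minimal sketch of the generic idea: a zone-level total is split across grid cells in proportion to an ancillary weight such as gridded population. It assumes an additive quantity (a count or aggregate). A median such as B19013 is not additive, so this illustrates the general technique rather than the exact procedure used in this project. The function name and numbers are hypothetical.&lt;/p&gt;

```python
def dasymetric_allocate(zone_total, cell_populations):
    """Split a zone-level total across cells in proportion to cell population."""
    pop_sum = sum(cell_populations)
    if pop_sum == 0:
        # No population signal: fall back to an even split across cells.
        return [zone_total / len(cell_populations)] * len(cell_populations)
    return [zone_total * p / pop_sum for p in cell_populations]


# Hypothetical block group: a total of 1,000 households to distribute over
# four grid cells, weighted by each cell's gridded population count.
cell_pop = [500, 300, 200, 0]
allocated = dasymetric_allocate(1000, cell_pop)
print(allocated)
```

&lt;p&gt;The allocation preserves the zone total, and unpopulated cells receive nothing, which is what distinguishes it from a simple area-weighted split.&lt;/p&gt;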
</content>
 </entry>
 

</feed>
