Case Study · SQL & Visualization

Medicare Hospital SQL Analysis

A 28-query PostgreSQL analysis of 6,000+ Medicare-registered U.S. hospitals — exploring quality ratings, emergency service coverage, and geographic disparities, translated into a 5-dashboard Tableau story.

SQL PostgreSQL Tableau Python Pandas Window Functions CTEs Healthcare Analytics CMS Dataset
6K+Hospitals Analyzed
28SQL Queries
5Tableau Dashboards
50+States & Territories
Live Dashboard
CMS Hospital Dashboard — Tableau Public
↗ Open full screen
Loading dashboard...
⚡ Fully interactive — navigate all 5 story pages using the tabs above the dashboard · open in Tableau Public for best experience
SQL Query Showcase
Query 1 of 28
⌥ Full notebook
1 / 28
The Problem

Federal healthcare data is publicly available but rarely made accessible. The CMS Hospital General Information dataset covers every Medicare-registered hospital in the U.S. — but in its raw form it's just a flat file. No stakeholder can act on a CSV with 6,000 rows.

The goal was to use SQL to ask the questions a healthcare analyst, policy researcher, or hospital administrator would actually care about: Which states are underserved? Where are emergency services lacking? How does ownership structure shape what types of hospitals exist — and where? Then make those answers interactive.

Approach
01
Data Ingestion & Cleaning
Loaded the CMS Hospital General Information dataset into PostgreSQL. Used Python and Pandas to scaffold and clean the data — handling missing ratings, standardizing state codes, and preparing the table for SQL analysis.
PostgreSQLPandasData CleaningPython
02
28-Query SQL Analysis
Built a structured notebook of 28 interview-ready SQL queries covering the full spectrum — from basic aggregations to window functions, CTEs, pivot-style queries, set operations, and string/date functions. Each query is framed around a real stakeholder question.
Window FunctionsCTEsPivot QueriesLAG / LEADSet Operations
03
Geographic & Ownership Analysis
Aggregated hospitals by state, type, and ownership to surface geographic disparities. Calculated each state's deviation from the national average in emergency hospital count — revealing which states are structurally underserved.
GROUP BYDeviation AnalysisGeographic Aggregation
04
5-Dashboard Tableau Story
Translated SQL outputs into a 5-page interactive Tableau story — a choropleth quality map, emergency coverage rankings, diverging deviation bars, type/state treemap, and an ownership cross-tab matrix. Designed for non-technical stakeholders.
TableauStory PointsChoropleth MapDashboard Design
Key Findings
Texas (459) and California lead in Medicare-registered hospital count — but their emergency coverage rates vary widely, showing that volume alone doesn't equal access.
Rural and tribal hospitals are significantly less likely to offer emergency services, highlighting a structural gap in healthcare access that federal data makes visible but policy rarely addresses.
Private non-profit hospitals dominate most facility types with 1,472 acute care facilities alone — while government-run hospitals concentrate in psychiatric and critical access care.
Utah leads in average quality rating (4.29), while the U.S. Virgin Islands sits at the bottom (1.0 average across 2 hospitals) — a disparity that points to resource allocation gaps in territories.
The deviation analysis shows Texas (+295) and California (+201) far exceed the national average in emergency hospitals, while states like Arkansas (−11) and Arizona fall well below — actionable insight for healthcare infrastructure planning.
Business Implication

This project demonstrates the full analyst workflow: raw federal data → structured SQL questions → visual story for decision-makers. The 28 queries aren't just analysis — they're a reusable template for any healthcare stakeholder asking the same questions about access, quality, and coverage.

For a policy team or hospital network, the geographic deviation analysis alone could inform where to prioritize infrastructure investment, which states need emergency capacity, and how ownership type shapes care availability — all derived from a single public dataset with SQL.