Resources
Resources for public interest data science
Uncommoned Goose draws on many types of data to illustrate the stories and challenges associated with public interest data. This page provides an overview of common and important datasets across a range of subject areas. Whether you are a student, data scientist, journalist, activist, or just curious, these resources can connect you with data and tools.
By making datasets more approachable, we hope to uncover opportunities for new analysis methods, visualization techniques, and storytelling strategies that work in the public interest.
Data is never neutral, but responsible data analysis can advance the public good. We invite you to use and share these resources as tools for inspiration and action.
Email me at brian@uncommonedgoose.com if there is a data resource you would like to see added.
Newsletters
This is a newsletter, but it's not the first or best. Check out some of these newsletters about data storytelling, visualization, and journalism.
- Jeremy Singer-Vine's Data Is Plural
- Data Journalism's Mailbrew
- Walt Hickey's Numlock News
- The Economist's Graphic Detail and Off the Charts
- Data Elixir
- Giuseppe Sollazzo's Quantum of Sollazzo
- Nathan Yau's FlowingData
- Data Science Community Newsletter
- Storybench
Portals
There are many interesting collections of datasets worth bookmarking.
- Data.gov
- Awesome Public Datasets
- ArcGIS Data Hub
- Dariusk's corpora
- ICPSR
- Harvard Dataverse
- FiveThirtyEight's data
- Kaggle
- Google Dataset Search
Tools
These are tools for supporting data analysis and visualization.
- Python
  - Analysis: pandas, scipy, scikit-learn, statsmodels, UMAP
  - Visualization: matplotlib, seaborn, plotly, altair, bokeh
- Dashboards: Tableau, PowerBI, ObservableHQ
- Interactives: Flourish, Datawrapper, Highcharts
Datasets
These datasets are organized by theme and listed in no particular order.
Politics
- Federal Election Commission
- OpenSecrets
- University of Florida's Election Lab
- MIT's Election Lab
- Supreme Court Database
- Local Open Data Portals: New York City, Chicago, Los Angeles, etc.
Social
Technology
- Google Trends
- Wikipedia API, Analytics, and dumps
- GDELT Project
- GitHub Archive
- Internet Archive
- Reddit API and PRAW
- AT Protocol (Bluesky)
- Mastodon
- StackExchange archive
Environment
- Notre Dame Global Adaptation Initiative
Health
Economics
- International Monetary Fund
- U.S. Bureau of Economic Analysis
- U.S. Bureau of Labor Statistics
- U.S. Federal Reserve Economic Data
- Zillow Housing Data
Criminal Justice
- U.S. Bureau of Justice Statistics
- FBI Uniform Crime Reporting
- Police Data Accessibility
- Prison Policy Initiative
- Eviction Lab
Education
- U.S. Department of Education, NCES, College Scorecard
- UNESCO Institute for Statistics
- Common Data Set
- Chronicle of Higher Education
- NACUBO Historical Endowments
Spatial
- U.S. Census TIGER/Line Shapefiles
- Bureau of Transportation Statistics
- NYC Taxi trip records
- U.S. Department of Housing and Urban Development
Books
These books are helpful for data cleaning, visualization, and storytelling.
Cleaning
- Chen, D. (2018). Pandas for Everyone: Python Data Analysis.
- McKinney, W. (2017). Python for Data Analysis.
- Mertz, D. (2021). Cleaning Data for Effective Data Science.
- Osborne, J. (2013). Best Practices in Data Cleaning.
- VanderPlas, J. (2016). Python Data Science Handbook.
Visualization
- Berinato, S. (2016). Good Charts: The HBR Guide to Making Smarter, More Persuasive Data Visualizations.
- Engebretsen, M. (2020). Data Visualization in Society.
- Kirk, A. (2012). Data Visualization: A Successful Design Process.
- Lima, M. (2011). Visual Complexity: Mapping Patterns of Information.
- Meyer, M. & Fisher, D. (2018). Making Data Visual: A Practical Guide to Using Visualization for Insight.
- Schwabish, J. (2021). Better Data Visualizations: A Guide for Scholars, Researchers, and Wonks.
- Tufte, E. (2001). The Visual Display of Quantitative Information.
- Wilke, C. (2019). Fundamentals of Data Visualization.
- Yau, N. (2011). Visualize This: The FlowingData Guide to Design, Visualization, and Statistics.
Storytelling
- Abela, A. (2013). Advanced Presentations by Design: Creating Communication that Drives Action.
- Allchin, C. (2021). Communicating with Data: Making Your Case with Data.
- Andrews, R. (2019). Info We Trust: How to Inspire the World with Data.
- Cairo, A. (2016). The Truthful Art: Data, Charts, and Maps for Communication.
- Cairo, A. (2019). How Charts Lie: Getting Smarter about Visual Information.
- Dykes, B. (2019). Effective Data Storytelling: How to Drive Change with Data, Narrative, and Visuals.
- Jones, B. (2020). Avoiding Data Pitfalls.
- Knaflic, C. (2015). Storytelling with Data: A Data Visualization Guide for Business Professionals.
- Nolan, D. & Stoudt, S. (2021). Communicating with Data: The Art of Writing for Data Science.
- Riche, N., Hunter, C., Diakopoulos, N., & Carpendale, S. (2018). Data-Driven Storytelling.
- Vora, S. (2019). The Power of Data Storytelling.
- Yau, N. (2013). Data Points: Visualization that Means Something.