A simple reminder of how hard it is to collect good data

I’m always harping on how your analysis is only as good as the data you have, and it appears I’ve found a soul mate. In trying to collect the most basic of data (the number of steps he walked each day, measured with a pedometer), this data nut discovered just how hard it is to collect good, reliable data:

http://flowingdata.com/2007/08/13/my-mission-is-to-collect-basic-data/

Reminder: keep your data safe!

I was blown away by the following information from Attrition.org’s Data Loss Archive and Database (featured on Flowingdata.com):

Top 10 data breaches since 2000

Guide to Excel charts

I haven’t had a chance to look through this website thoroughly yet, but on the surface, it looks quite cool:

http://peltiertech.com/WordPress/jons-excel-and-chart-pages/

It is self-described as: “Featuring one of the Internet’s most extensive collections of information, tutorials, tips, and tricks relating to effective and innovative charting in Microsoft Excel, with a few general Excel items as well.”

Walmart takes over the US, one green dot at a time

My friend, Ms. L (a fellow data nut), sent along this wonderful link. Much like the gapminder.com software, this animated map traces the US-based growth of Walmart since 1962. The rapid multiplication of green dots makes it look like a disease spreading across the US, which inspires quite a visceral reaction in the viewer.

http://projects.flowingdata.com/walmart/

Easy-access stats on metropolitan areas, courtesy of Harvard

A wide array of indicators (education, demographics, health, crime, poverty, housing and economic opportunities, etc.) is available by metropolitan area, including reports for the top 100 MSAs and for all 331 MSAs, from Harvard’s School of Public Health: www.diversitydata.org

One quibble: some of their data is a bit old. For example, for live births to teens, they have only 2001-2002 data available. I believe more recent data is available from the National Center for Health Statistics.

They recently published an interesting, useful report (“Children Left Behind”) on how metropolitan areas are failing children.  Nice array of statistics included.

kidsdata.org: stats on Bay Area kids

Just found this excellent data website: kidsdata.org, sponsored by the Lucile Packard Foundation. It contains all sorts of useful data on kids (related to poverty, physical health, emotional and behavioral health, education, language, immigration, family income, etc.) that is clearly organized and can be presented in different, easy-to-read formats with the click of a button (toggle between table, bar chart, pie chart, and trends).

PLUS! It has a great list of other California-specific and national data sources related to children and families (and beyond):

http://www.kidsdata.org/data_sources.jsp?csid=0

The reality of “western” diseases

This TED talk challenges conventional wisdom about the “western” predominance of certain diseases.

http://www.ted.com/index.php/talks/view/id/249

Mapping disease across genes

http://www.nytimes.com/interactive/2008/05/05/science/20080506_DISEASE.html#

An interesting article and an amazing visual showing which human diseases share genes and thus, regardless of wildly different symptoms, may be tied together. As the article reports, there could be implications for how diseases are grouped and split. Particularly interesting, given that in 1909, one cause of death was listed as “visitation by God.”

Data Quality Campaign

I’ve been learning a lot about state and school district-level data systems for work recently, and wanted to post a link to a fascinating survey of state data systems that the Data Quality Campaign conducted in 2007. The DQC is a sorely needed effort in the world of educational data systems: they are working to ensure that all states have data systems in place with 10 essential elements (e.g. unique student and teacher identifiers that can be linked). Check out their data survey. It is fascinating stuff, and it is interesting to see how fractured the market of providers is and which of the 10 essential elements are really lacking in the field: www.dataqualitycampaign.org

Health (and education) statistics

A comprehensive compilation of links to health statistics, data sets, and tools for data collection, all at your fingertips.  Just wish the education sector had something comparable…*

http://phpartners.org/health_stats.html 

*It is worth noting that UMichigan does have its own site for the education sector:

http://www.lib.umich.edu/govdocs/steduc.html