Saturday, January 11, 2020

Puddles in Australia

Nathan Yau of FlowingData gives us the "Best Data Visualization Projects of 2019".

Check out his work; it is awesome.

Visualization is a relatively new field, but we seem to be developing an understanding of what it is and what it can be used for. This year, we refined existing methods. With less emphasis on novelty of visual forms, we focused more on what we wanted to communicate.

He does prove his first sentence wrong with his first example, "Best Blend of New and Vintage" with the 3D elevation + 1878 USGS Yellowstone Geology Map.  

Here's a map from 1878 showing my ancestors' land near Buck Lake. (Not in 3D.)


1878 Map Of Loughborough

Notice the visualizations of lot numbers, landowner names, all nicely laid out in somewhat-to-scale quadrants of land ownership. 

The field of Visualization is as old as cave drawings, or pictographs, and carvings, or petroglyphs.  Take Milestones in the History of Thematic Cartography, Statistical Graphics, and Data Visualization for example.  If you type that title out you will probably get carpal tunnel syndrome.  It could probably be better described with a pictogram or symbol.


eye with apple slicer, 2020, some pictogram artist
The earliest known map, Çatalhöyük, which of course is disputed, is a Babylonian clay tablet with a nice 3D visualization.  And it isn't even a map.  My Cartocacoethes is kicking in again.

RIP Australian puddle map, is this Sudan or Australia?
Speaking of Australian puddle maps, another example on the FlowingData web site is Climate Coverage by The New York Times, with an augmented reality view.  I suppose you could argue that this is a visualization which could only have happened in the last decade?  Searching for 16th century augmented reality, Google comes through again with An Eerie Augmented Reality Illusion from the 1850s is Still Being Used Today.

Smoke and mirrors, still in use today by climate change deniers.  Mostly the smoke part.

I really like the Best Comic Chart winner, Something's Wrong.  
Fund him on Patreon.com/WillikinWolf
The CPI should have been added to this chart.  

Since all things revert to the mean, I would expect productivity to start going down and level out with wages once we get our robot overlords in place. Patreon has a great example of the job titles of the future, which of course we are now living in.

Podcaster
Video Creator
Musician
Visual Artist
Writer & Journalist
Community Coordinator
Gaming Creator
Nonprofit
Tutor and Educator
Creator-of-all-kinds

My 7-year-old asked me to set up his web site, which really means YouTube channel to him.  Unlike the boomers who think that the internet is Internet Explorer, and a Relational Database is something hosted inside Internet Explorer, this generation will think that old-school television is Netflix, and the internet is some kind of I Can Has CHEEZburger meme browser running in your television next to Netflix.

Eventually this will revert to the mean and tools like IRC and Gopher will come back to us.  

What the crap? There's a Gophercon?  That would explain all the ASCII & ANSI art generators which are getting more popular.

I can't wait for squealie modems to come back to us.
I expect the 2020 visualization winner to be some ANSI terminal art depicting the trend of vandalizing web sites by setting their baud rates to 2400bps.




Saturday, January 3, 2015

Melting Columns to Rows in R

Pivoting data is a common issue when dealing with Excel or CSV input files.  Excel users generally like "dirty" yet presentable data.  Data dumps commonly place multiple tables' worth of data into a single massive multi-column table.  This style of multi-column data lends itself to problems when trying to analyze results using tools such as pivot tables.

For example, 12 monthly columns should be converted from Wide to Long format, into 2 columns (month name, value), so the data can be pivoted, filtered, and sorted efficiently.
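
To make the wide-to-long idea concrete, here is a minimal sketch in plain Python rather than R, so it is not the code from the links in this post. The product names, the three months, and the numbers are all made up for illustration, and the `melt` helper is a hypothetical function written for this example, not a library call.

```python
# Hypothetical wide-format rows: one column per month (all values are made up).
MONTHS = ["Jan", "Feb", "Mar"]

wide_rows = [
    {"product": "A", "Jan": 10, "Feb": 12, "Mar": 9},
    {"product": "B", "Jan": 7, "Feb": 8, "Mar": 11},
]


def melt(rows, id_cols, value_cols, var_name="month", value_name="value"):
    """Emit one long-format row per (input row, value column) pair."""
    long_rows = []
    for row in rows:
        for col in value_cols:
            # Carry the identifying columns over unchanged.
            long_row = {key: row[key] for key in id_cols}
            # The old column header becomes data in the var_name column.
            long_row[var_name] = col
            long_row[value_name] = row[col]
            long_rows.append(long_row)
    return long_rows


long_rows = melt(wide_rows, id_cols=["product"], value_cols=MONTHS)
for long_row in long_rows:
    print(long_row)
```

Each wide row with three month columns turns into three long rows, which is the shape a pivot table or a filter wants to work with.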

Excel can unpivot results with some macro functionality, or with an add-in like PrepYourData or Microsoft Power Query.
 
R provides this kind of functionality in 3 lines of code, in this case melting subjects and scores from columns to rows.

http://stackoverflow.com/questions/18446668/using-r-to-reformat-data-from-cross-tab-to-one-datum-per-line-format

http://seananderson.ca/2013/10/19/reshape.html

Tidy Data

Monday, November 10, 2014

Lifecycle of Data Science Project

The Lifecycle Summarized

1. Identify & Define the problem
2. Define and document data sources
3. Statistical data profiling
4. Implementation
5. Sharing and collaboration
6. Maintenance & Support

http://www.datasciencecentral.com/profiles/blogs/life-cycle-of-data-science-projects

Plus M&Ms, Jackknife (Swiss Army Style?) logistic and linear regression.
http://www.datasciencecentral.com/profiles/blogs/jackknife-logistic-and-linear-regression

Plus jackknifing your results in about 4 lines of R code.
http://ryouready.wordpress.com/2008/12/19/r-jackknife-the-coefficients-of-a-linear-regression-model/
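
For anyone who wants to see what jackknifing a regression coefficient involves, here is a rough sketch in plain Python. It is not the R code from the link above. It assumes a simple one-predictor least-squares fit, and the data points are made up. `fit_line` and `jackknife_slope` are hypothetical helper names for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def jackknife_slope(xs, ys):
    """Leave-one-out jackknife: bias-corrected slope and its standard error."""
    n = len(xs)
    _, full_slope = fit_line(xs, ys)
    # Refit the line n times, dropping one observation each time.
    loo_slopes = [
        fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])[1]
        for i in range(n)
    ]
    mean_loo = sum(loo_slopes) / n
    # Standard jackknife bias correction and variance formulas.
    estimate = n * full_slope - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((s - mean_loo) ** 2 for s in loo_slopes)
    return estimate, variance ** 0.5


# Made-up data that sits roughly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

estimate, std_err = jackknife_slope(xs, ys)
print("jackknife slope:", estimate, "standard error:", std_err)
```

The spread of the leave-one-out slopes is what gives the standard error, so a single influential point shows up as a wider interval.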

Decision Trees in Tableau with R
http://boraberan.wordpress.com/2014/02/07/decision-trees-in-tableau-using-r/

Finally, using a jackknife to cut down some Hidden Decision Trees.
http://www.datasciencecentral.com/profiles/blogs/hidden-decision-trees-revisited

Monday, May 26, 2014

FRED Add-In for Excel and some Torontoist Centric Economic Data

The Federal Reserve Bank of St. Louis Economic Data (FRED) Add-In is free software that will significantly reduce the amount of time spent collecting and organizing macroeconomic data. The FRED add-in provides free access to over 210,000 data series from various sources (e.g., BEA, BLS, Census, and OECD) directly through Microsoft Excel.

Get it here
http://research.stlouisfed.org/fred-addin/

Are you looking for GDP, CPI, or microeconomic data from the US FED?  Stats Canada?

Some interesting visualizations and interpretations of this type of data.

How much you make vs. how much it really feels like per US city with the supporting data released April 2014.

Canadian Cities Where An Average Income Will No Longer Buy You a House 

Numbeo, Cost of Living In Canada

It will cost you about 6% more to eat at McDonald's in Barrie than in Toronto.
It costs 90% more to buy a bag of potatoes in Oshawa than in Toronto.

There must be a potato famine in Oshawa... or people there really like french fries.

Friday, August 10, 2012

Speedometer Design: Why It Works | DATA + DESIGN by Paul Van Slembrouck

Gauge controls may have some comfortable familiarity for certain business users (think oil & gas or automotive/transportation).

However, do they convey business information in a quick, actionable way?