Hobby: Using data to trend San Francisco's housing market
Even data that arrives in a tweet or an IM reply can hold a wealth of insight.
Sometime in 2013, a friend shared the data in Figure 1 with me and the rest of his Twitter followers. Housing is a hot topic in our beloved city, so I was interested in seeing what the data had to say. I was eating lunch at my desk that day, so I decided to see what I could wrangle.
Figure 2 shows a multivariate analysis of the data in Figure 1. As one might expect, "Units completed" has the strongest correlation with net change in number of units. The number of units gained or lost from alterations is more weakly associated. Interestingly, year (a linear trend) is more strongly associated with net change than some of the other variables, a reminder that correlation does not imply causation.
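For readers who want to try this kind of comparison themselves, here is a minimal sketch of ranking columns by their Pearson correlation with net change. It is not the tool that produced Figure 2. The column names, the `pearson` and `rank_by_correlation` helpers, and every number in `table` are placeholders I made up for illustration, picked only so the ordering resembles the pattern described above; they are not the values from the housing inventory report.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(table, target):
    """Sort every non-target column by |r| against the target column."""
    scores = {
        name: pearson(column, table[target])
        for name, column in table.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# PLACEHOLDER values for illustration only, NOT the figures from the report.
table = {
    "year": [2001, 2002, 2003, 2004, 2005],
    "units_completed": [100, 300, 200, 500, 400],
    "alterations": [10, 5, 20, 15, 30],
    "net_change": [105, 300, 215, 510, 425],
}

ranking = rank_by_correlation(table, "net_change")
for name, r in ranking:
    print(f"{name:>16}: r = {r:+.3f}")
```

Swapping the placeholder lists for the real columns from Figure 1 should give a ranking comparable to Figure 2, assuming that analysis also used plain Pearson correlation.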
Finally, Figure 3 charts those variables together: net change in number of units against year, with units gained or lost from alterations shown as point size and units completed shown as color. As a sanity check, the sum statistic confirms there are no data entry errors in the net change. The strongest correlation, between net change and units completed, appears as a color transition from blue to red. The number gained or lost from alterations, shown as point size, is not as strongly correlated, but there is a weak trend of dot size increasing as net change in housing increases.
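One way to run a similar sanity check by hand is to recompute net change for each year and flag any row that disagrees with the recorded value. The sketch below assumes net change equals units completed, minus units demolished, plus units gained or lost from alterations. That formula, the "demolished" field, and the `find_net_change_mismatches` helper are my assumptions, so check them against the column definitions in the report. The rows are placeholders, and one contains a deliberate typo to show what a failure looks like.

```python
def find_net_change_mismatches(rows):
    """Return (year, recomputed, recorded) for rows whose net change does not add up."""
    mismatches = []
    for row in rows:
        # Assumed identity: completed - demolished + alterations == net change.
        recomputed = row["completed"] - row["demolished"] + row["alterations"]
        if recomputed != row["net_change"]:
            mismatches.append((row["year"], recomputed, row["net_change"]))
    return mismatches


# PLACEHOLDER rows for illustration only, NOT the figures from the report.
rows = [
    {"year": 2001, "completed": 100, "demolished": 5, "alterations": 10, "net_change": 105},
    {"year": 2002, "completed": 300, "demolished": 5, "alterations": 5, "net_change": 300},
    # Deliberate transposition typo: the recomputed value is 215, not 251.
    {"year": 2003, "completed": 200, "demolished": 5, "alterations": 20, "net_change": 251},
]

mismatches = find_net_change_mismatches(rows)
for year, recomputed, recorded in mismatches:
    print(f"{year}: recorded {recorded}, recomputed {recomputed}")
```

An empty result means every row is internally consistent under the assumed formula. It does not rule out errors that happen to cancel within a row.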
It would be interesting to compare this with cities that build more units per year, and to regress units built against something that characterizes the market; if supply and demand holds, median house price is the natural candidate. These results suggest that the next time someone says SF needs to build more housing to accommodate the market, they are probably right. Some of us may feel this in our gut, but these numbers support the claim, which matters in a competitive, revenue-driven space like real estate in SF.
Figure 1: data from San Francisco's 2011 housing inventory. Source: http://sf-planning.org/ftp/files/publications_reports/2011_Housing_Inventory_Report.pdf
Figure 2: Multivariate analysis of the aggregate data
Figure 3: Charting the data