


          

Yesterday, after driving 475 miles, I got a Dr. Pepper and some tacos, and I sat down with #FTMSQL2. My task was to track down a basket of 65 political consulting firms. These 65 companies appear 1,213 different ways in the records, but I've aggregated them. That took maybe four hours, and I left two audit trails for my work. The Advance Group, for instance, appears 42 different ways in the records, and Pitta Bishop et al. appears a dizzying 111 ways in the files. That's an average of nearly 19 aliases per company. (These are filers typing names in ways that vary lexically.)

And while you might attain records this clean, covering the entire universe back to 1999, after a couple dozen weeks in a patchwork of spreadsheets, you would have to begin your climb again with each fresh filing ... I've replaced your low-level campaign finance employee or intern, or your weekend spent crashing through fresh data, with a shell script that runs in seconds.

These 1,213 different aliases invoke 2,475 different street addresses, but keep in mind that addresses are themselves badly aliased in the same way. Many of them refer to the same building or office in different ways and are thus duplicative (and in need of aggregation). Those 2,475 aliased addresses are in turn used by 130,293 contributor aliases. There may even be other contributors who use those same physical addresses but enter them in forms that are not among the 2,475 locations we are sweeping. So this is conservative.

There are a total of 2,123,439 contributor keys and 2,257,227 location keys under management, and that is just in our NYSBOE universe. These 65 firms claim about 0.11% of the address space (2,475 of 2,257,227 location keys) and about 0.06% of the namespace (1,213 of 2,123,439 contributor keys), but they share their locations, conservatively, with about 6% of the namespace (130,293 contributor aliases). It sounds abstract and dry, but consider the implications of being able to sample these valences everywhere at once.
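To make the alias aggregation concrete, here is a minimal Python sketch. It is not the shell script or the SQL mentioned in this post. It only assumes that each filer-typed name can be reduced to a rough comparison key and looked up in a hand-curated table of canonical firms. The function names, the suffix list, and the sample spellings are all hypothetical illustrations, not values taken from the records.

```python
import re
from collections import defaultdict

# Hypothetical list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "llp", "corp", "co", "ltd"}


def name_key(raw: str) -> str:
    """Reduce a filer-typed name to a rough comparison key."""
    text = raw.lower().replace("&", " and ")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)  # punctuation becomes spaces
    words = [w for w in text.split() if w not in SUFFIXES and w != "the"]
    return " ".join(sorted(words))  # sorting makes word order irrelevant


def aggregate(aliases, canonical_by_key):
    """Group raw aliases under canonical names and keep an audit trail."""
    groups = defaultdict(set)
    audit = []      # one (raw alias, key, canonical name) row per match
    unmatched = []  # aliases left for manual review
    for raw in aliases:
        key = name_key(raw)
        canonical = canonical_by_key.get(key)
        if canonical is None:
            unmatched.append(raw)
        else:
            groups[canonical].add(raw)
            audit.append((raw, key, canonical))
    return dict(groups), audit, unmatched


# Hand-curated lookup table; only one firm is shown here.
canonical_by_key = {name_key("The Advance Group"): "The Advance Group"}

# Invented spellings for illustration only.
sample = [
    "ADVANCE GROUP, THE",
    "The Advance Group Inc.",
    "advance  group",
    "Some Other Vendor",
]

groups, audit, unmatched = aggregate(sample, canonical_by_key)
for canonical, raw_names in groups.items():
    print(canonical, "<-", len(raw_names), "aliases")
print("needs review:", unmatched)
```

The audit rows record which raw spelling was folded into which firm, so each merge can be checked or reversed later. Anything that misses the lookup table is set aside for review instead of being guessed at.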
In the short term, aggregating all these key strategic firms means that showing their relationships, their scale, and their priorities is pretty easy now. Once we've brought clarity to the major entities despite the aliasing problem, and converted every address into canonical form, we can begin to answer truly sophisticated questions about the political landscape. What do you want to know?
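As a rough illustration of what converting an address into canonical form can mean, here is a small Python sketch. It assumes a simple token-level approach: uppercase everything, drop punctuation, and replace a few common words with standard abbreviations. The abbreviation table and the sample addresses are hypothetical, and a real address cleaner would need many more rules than this.

```python
import re

# Hypothetical, deliberately tiny abbreviation table.
ABBREVIATIONS = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "ROAD": "RD",
    "SUITE": "STE",
    "FLOOR": "FL",
}


def canonical_address(raw: str) -> str:
    """Return an uppercase, punctuation-free, abbreviated form of an address."""
    text = re.sub(r"[^A-Z0-9 ]+", " ", raw.upper())
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)


# Invented variants of one made-up address.
variants = [
    "123 Example Street, Suite 4",
    "123 example st. ste 4",
    "123 EXAMPLE STREET STE. 4",
]

distinct = {canonical_address(v) for v in variants}
print(len(variants), "raw forms ->", len(distinct), "canonical form(s)")
```

Counting distinct canonical forms instead of raw strings is what lets several aliased addresses collapse into a single location.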
Posted on: Thu, 31 Jul 2014 15:07:34 +0000
