OpenNews/hackdays/storyandalgorithm/departmentofdefensedatadig

From MozillaWiki
  • Project name: Department of Defense data dig
  • One-line description of project: Checking defense contracts against campaign donations.

Dept. of Defense Data Dig

  • Your team: Please list project team members

Nicola Hughes
Peter Richardson
Al Shaw

  • Project URL(s), if applicable:
  • Hashtag, if #relevant:
  • What are you building: What will the thing you are creating do, enable or solve, in 1 human-readable paragraph

Command-line tools to compare giant CSV files containing US Department of Defense contracts against data from open government APIs showing political contributions.
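
As a rough illustration of the comparison step, here is a minimal sketch, not the team's actual tool. It assumes the contracts CSV has columns named `vendor_name` and `dollars_obligated` (hypothetical names; the real export may differ) and that contribution totals have already been fetched from an API into a plain dict keyed by donor organization name. Names are matched after a crude normalization, which will miss subsidiaries and alternate spellings.

```python
import csv
import re
from collections import defaultdict

# Hypothetical column names; check them against the real CSV header.
VENDOR_COL = "vendor_name"
AMOUNT_COL = "dollars_obligated"

# Common corporate suffixes dropped before comparing names.
_SUFFIXES = {"inc", "corp", "corporation", "co", "company", "llc", "ltd"}


def normalize(name):
    """Lowercase a name, strip punctuation and common corporate suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in _SUFFIXES)


def contract_totals(fileobj):
    """Stream a contracts CSV and sum obligated dollars per normalized vendor."""
    totals = defaultdict(float)
    for row in csv.DictReader(fileobj):
        totals[normalize(row[VENDOR_COL])] += float(row[AMOUNT_COL] or 0)
    return dict(totals)


def match(totals, contributions):
    """Return {name: (contract_total, donation_total)} for names in both sets.

    `contributions` maps a donor organization name to the total it donated.
    """
    donors = defaultdict(float)
    for name, amount in contributions.items():
        donors[normalize(name)] += amount
    return {k: (totals[k], donors[k]) for k in totals.keys() & donors.keys()}
```

Because the CSV is read row by row, only the per-vendor totals are held in memory, not the whole file.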

  • Who is it for: Describe your target user audience, e.g. "Everyone who uses the internet", "people in high risk environments", "People who still read 'Peanuts'"

Muck-raking trouble-makers who are likely to wind up on a list somewhere

  • Your goal for this weekend: where are you trying to get to by 2:45pm Sunday in terms of features, functionality or other creations

A working pipeline: data in -> data out

  • Your starting point: Are you writing from scratch? Designing a new concept? Modeling a data space? Extending a codebase or library? Adding a feature to a platform? Combining tools?

We have a large dataset from http://usaspending.gov/data and enough knowledge of Python and shell scripting to get into trouble

  • Anything else we should know: At most 1 non-long paragraph of useful additional context

It's unlikely we'll find anything suspicious; I've always found politicians and corporate types to be an honorable bunch of super-nice people

  • How is this project useful? The project helps show whether defense contracts are connected to campaign donations. It is the first step in a larger project looking at lobbying influence between politicians and corporations.
  • Where is this project going and what lessons/concepts can be applied to other projects? Nicola will be following up on the project. It could lead to a bigger-picture project (expanded to lobbying in other areas, for example) or to a narrower one following a single contract. The project also involved working out techniques for manipulating large datasets, using APIs with large datasets, and putting together command-line tools and scripts to get a handle on large amounts of data. Lessons can be shared on how large a dataset has to be before a tool crashes.
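
One common way to keep a script from crashing on a very large file is to process it in fixed-size batches rather than loading it all at once. The sketch below is an assumption about how that could look, not a record of what the team built; the `batches` helper and its default batch size are hypothetical.

```python
import csv
from itertools import islice


def batches(fileobj, size=10000):
    """Yield lists of at most `size` rows from a CSV file object.

    Only one batch is held in memory at a time, so memory use depends on
    `size`, not on the length of the file.
    """
    reader = csv.DictReader(fileobj)
    while True:
        batch = list(islice(reader, size))
        if not batch:
            return
        yield batch
```

Each batch can then be summarized, or sent to an API in one request, before the next one is read.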