OpenNews/hackdays/insideroutsider: Difference between revisions

Line 42: Line 42:
* [http://www.state.gov/r/pa/prs/appt/2013/index.htm U.S. State Department Public Schedule]
* [http://www.whitehouse.gov/omb/oira_meetings/ White House Office of Management and Budget Meeting Records]

'''From Waldo Jaquith, Virginia Decoded:'''
Line 49: Line 50:
Waldo notes: It's a huge obstacle that I simply haven't put any time into dealing with. Every few months I spend half an hour on trying to put together a system to systematically scrape data out, get discouraged, and give up. Footnotes, blockquotes, and page numbers just kill me, although even if I could get the raw text decently, rendered terribly, I could still extract great metadata from them.

From John Keefe & Stephen Menendez, WNYC:  
 
'''From John Keefe & Stephen Menendez, WNYC:'''
* [https://dl.dropboxusercontent.com/u/6682410/FY%202013%20Schedule%20C%20-%20Merge%20Final1.pdf 2013 New York City Council budget document (warning large PDF download)]
* [http://www.nyc.gov/html/nypd/html/traffic_reports/motor_vehicle_accident_data.shtml NYPD Motor Vehicle Accident Data]

From Daniel X O'Neil, Smart Chicago Collaborative/Everyblock:  
 
'''From Daniel X O'Neil, Smart Chicago Collaborative/Everyblock:'''
* [http://www.nyc.gov/html/nypd/html/analysis_and_planning/stop_question_and_frisk_report.shtml NYPD Stop, Question and Frisk Report Database]
The data is amazingly detailed ([http://www.jjay.cuny.edu/web_images/PRIMER_electronic_version.pdf here's a great primer]), and lends itself to great visualizations ([http://www.nytimes.com/interactive/2010/07/11/nyregion/20100711-stop-and-frisk.html?ref=stopandfrisk here's one re: 2009 data]). The data itself is published in a format that is highly inaccessible to regular people (notwithstanding the fact that it is extremely well-structured as an SPSS portable file). Publishing this info as an easy-to-search, RSS-ready list of items would be of high value.
Line 62: Line 65:
* [ftp://66.97.146.93/ Dallas FTP Bulk Crime Database]
This is an enormous, underutilized cache of crime data. Chicago gets lots of attention and plaudits for their crime data, but the Dallas stuff goes even farther back (2000!) and contains narrative that will make your eyes bleed. They have the actual comments typed into the system by actual police officers, including graphic details about horrible crimes and a huge amount of profanity. This is a researcher's treasure chest.

=== Tools & APIs ===