Drumbeat/MoJo/hackfest/berlin/projects/MetaProject: Difference between revisions

From MozillaWiki

Revision as of 13:22, 26 September 2011

The Meta Project is a tool that provides one simple service: take in any piece of media and return as much metadata about it as possible.
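As a rough illustration of the "take in any piece of media" step, the sketch below guesses which kind of media a URL or file name points to, using the file extensions named under Desired Functionality. The function name `guess_media_kind`, the table `EXTENSION_TO_KIND`, and the `.txt` and `.html` mappings are assumptions made for this example; the page leaves the video format open, so no video extensions are included.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Audio and image extensions come from the Desired Functionality lists.
# ".txt" and ".html" are assumed stand-ins for "Plain Text" and "HTML".
# No video extensions are listed because the page marks the format as open.
EXTENSION_TO_KIND = {
    ".mp3": "audio", ".wav": "audio",
    ".jpg": "image", ".gif": "image", ".bmp": "image", ".png": "image",
    ".txt": "text", ".html": "text",
}


def guess_media_kind(source: str) -> str:
    """Guess the media kind from a URL or file name; 'unknown' if unsure."""
    # For a URL, look only at the path so query strings are ignored.
    path = urlparse(source).path or source
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_KIND.get(suffix, "unknown")


if __name__ == "__main__":
    for example in ("http://example.org/a/b.mp3?x=1", "photo.PNG", "clip.ogv"):
        print(example, "->", guess_media_kind(example))
```

A real service would probably inspect the content type or the bytes themselves rather than trust the extension; the lookup table is only the simplest possible starting point.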

Meta Standards Resources

(Add links and summaries to documents discussing metadata)

Known APIs and Tools

(Add links and summaries of toolkits and APIs which can help generate data!)

Desired Functionality

TEXT MEDIA

Valid Inputs: URL, Plain Text, HTML

Optional Inputs: Known Metadata

Returned Metadata:

- Primary Themes (Document-wide)
- Primary Themes (Per-paragraph)
- Suggested Tags
- Entities (Names, Locations) and their positions in the text
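One possible shape for the text result is sketched below. The class names `EntityMention` and `TextMetadata`, their fields, and the helper `locate_mentions` are hypothetical, not part of any agreed design. The helper does not recognise entities on its own; it only records character offsets for names that are already known, for example names passed in as Known Metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EntityMention:
    name: str
    kind: str   # "Name" or "Location", following the list above
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


@dataclass
class TextMetadata:
    document_themes: List[str] = field(default_factory=list)
    paragraph_themes: List[List[str]] = field(default_factory=list)
    suggested_tags: List[str] = field(default_factory=list)
    entities: List[EntityMention] = field(default_factory=list)


def locate_mentions(text: str, known: Dict[str, str]) -> List[EntityMention]:
    """Find every occurrence of each known entity name in the text."""
    mentions = []
    for name, kind in known.items():
        if not name:
            continue  # an empty name would match at every position
        pos = text.find(name)
        while pos != -1:
            mentions.append(EntityMention(name, kind, pos, pos + len(name)))
            pos = text.find(name, pos + len(name))
    return sorted(mentions, key=lambda m: m.start)


if __name__ == "__main__":
    sample = "Angela Merkel spoke in Berlin. Berlin was cold."
    known = {"Berlin": "Location", "Angela Merkel": "Name"}
    result = TextMetadata(entities=locate_mentions(sample, known))
    for mention in result.entities:
        print(mention)
```

Character offsets are used here so that a client can highlight each mention in the original text; per-paragraph offsets would work equally well if that suits the callers better.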

VIDEO MEDIA

Valid Inputs: URL, Video (format?)

Optional Inputs: Transcript, Faces, Known Metadata

Returned Metadata:

- Transcript
- Moments of audio transition (new speaker)
- Moments of video transition (new scene)
- OCR data (any text that appears on screen) and its timestamps
- Entities (Names, Locations) and their timestamps
- Suggested Tags
- Face identifications and their timestamp ranges [only done if faces are provided]
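The "timestamp ranges" for faces could be derived from frame-by-frame sightings. The sketch below assumes some face matcher (not specified on this page) has already produced `(timestamp, person)` pairs; the function name `merge_sightings` and the `max_gap` parameter are hypothetical. It merges sightings of the same person that are close together into a single range.

```python
from typing import Dict, List, Tuple


def merge_sightings(
    sightings: List[Tuple[float, str]], max_gap: float = 1.0
) -> Dict[str, List[Tuple[float, float]]]:
    """Collapse (timestamp, person) sightings into per-person time ranges.

    Two sightings of the same person that are at most max_gap seconds
    apart are treated as part of the same range.
    """
    ranges: Dict[str, List[Tuple[float, float]]] = {}
    for timestamp, person in sorted(sightings):
        person_ranges = ranges.setdefault(person, [])
        if person_ranges and timestamp - person_ranges[-1][1] <= max_gap:
            # Extend the most recent range for this person.
            person_ranges[-1] = (person_ranges[-1][0], timestamp)
        else:
            # Start a new range at this sighting.
            person_ranges.append((timestamp, timestamp))
    return ranges


if __name__ == "__main__":
    frames = [(0.0, "A"), (0.5, "A"), (0.5, "B"), (1.0, "A"), (5.0, "A")]
    print(merge_sightings(frames))
```

The same merging step would apply to on-screen OCR text or entities, since each of those is also requested together with timestamps.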

AUDIO MEDIA

Valid Inputs: URL, Audio (mp3, wav)

Optional Inputs: Transcript, Voice Samples, Known Metadata

Returned Metadata:

- Transcript
- Moments of audio transition (new speaker)
- Entities (Names, Locations) and their timestamps
- Suggested Tags
- Voice identifications and their timestamp ranges [only done if voice samples are provided]
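"Moments of audio transition (new speaker)" can be computed once the audio has been split into speaker-labelled segments. The sketch below assumes that labelling has already been done by some other tool and only shows the final step; `speaker_transitions` and the `(start_time, speaker)` segment format are assumptions made for this example.

```python
from typing import List, Tuple


def speaker_transitions(segments: List[Tuple[float, str]]) -> List[float]:
    """Return the start times at which the speaker changes.

    Each segment is (start_time_in_seconds, speaker_label). The first
    segment is not reported, since nobody was speaking before it.
    """
    transitions = []
    previous = None
    for start, speaker in sorted(segments):
        if previous is not None and speaker != previous:
            transitions.append(start)
        previous = speaker
    return transitions


if __name__ == "__main__":
    segments = [(0.0, "anchor"), (12.5, "anchor"), (30.0, "guest"), (47.2, "anchor")]
    print(speaker_transitions(segments))
```

When voice samples are supplied, the labels could be real names; without them, anonymous labels such as "speaker 1" are still enough to find the transitions.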

IMAGE MEDIA

Valid Inputs: URL, Image (jpg, gif, bmp, png)

Optional Inputs: Faces, Known Metadata

Returned Metadata:

- OCR data and its coordinate location
- Object identification
- Face identification [only done if faces are provided]
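A minimal sketch of an image result follows, assuming OCR text is located with a pixel bounding box and that face identification is skipped unless reference faces are supplied. `OcrRegion`, `ImageMetadata`, `build_image_metadata`, and the `match_faces` callable are all hypothetical names standing in for whatever OCR and face-matching tools end up being used.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class OcrRegion:
    text: str
    x: int       # left edge of the bounding box, in pixels
    y: int       # top edge of the bounding box, in pixels
    width: int
    height: int


@dataclass
class ImageMetadata:
    ocr: List[OcrRegion] = field(default_factory=list)
    objects: List[str] = field(default_factory=list)
    # None means face identification was skipped (no faces were provided).
    faces: Optional[List[str]] = None


def build_image_metadata(
    ocr: Sequence[OcrRegion],
    objects: Sequence[str],
    reference_faces: Optional[Sequence[str]] = None,
    match_faces: Optional[Callable[[Sequence[str]], Sequence[str]]] = None,
) -> ImageMetadata:
    """Assemble the result; run face matching only if faces were provided."""
    meta = ImageMetadata(ocr=list(ocr), objects=list(objects))
    if reference_faces and match_faces is not None:
        meta.faces = list(match_faces(reference_faces))
    return meta


if __name__ == "__main__":
    sign = OcrRegion("EXIT", x=10, y=20, width=80, height=30)
    print(build_image_metadata([sign], ["door"]))
```

Keeping `faces` as `None` rather than an empty list lets a client tell "no faces were provided" apart from "faces were provided but none matched".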