Drumbeat/MoJo/hackfest/berlin/projects/MetaProject

From MozillaWiki

The Meta Project is a tool that provides one simple service: take in any piece of media and return all the metadata that can be extracted from it.

Meta Standards Resources

(Add links and summaries to documents discussing metadata)

  • rNews is a proposed standard for using RDFa to annotate news-specific metadata in HTML documents.

Known APIs and Tools

(Add links and summaries of toolkits and APIs which can help generate data!)

Desired Functionality

TEXT

Valid Inputs: URL, Plain Text, HTML

Optional Inputs: Known Metadata

Returned Metadata:

- Primary Themes (Document-wide)
- Primary Themes (Per-paragraph)
- Suggested Tags
- Entities (Names, Locations) and their positions in the text
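
As a rough illustration of what "entities and their locations in text" could look like in a response, the sketch below records each entity with its character offsets. The names `EntitySpan` and `locate_entities` are hypothetical, and the list of known names is supplied by the caller; a real service would need an actual entity extractor rather than this exact-match lookup.

```python
import re
from dataclasses import dataclass


@dataclass
class EntitySpan:
    name: str   # entity text as supplied by the caller
    start: int  # character offset of the first character
    end: int    # character offset one past the last character


def locate_entities(text, names):
    """Return every occurrence of each known name, ordered by position."""
    spans = []
    for name in names:
        # Word boundaries avoid matching inside longer words.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            spans.append(EntitySpan(name, match.start(), match.end()))
    return sorted(spans, key=lambda s: s.start)
```

Slicing the input with `text[span.start:span.end]` gives back the entity name, so a client could highlight entities without re-parsing the document.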

VIDEO

Valid Inputs: URL, Video (format?)

Optional Inputs: Transcript, Faces, Known Metadata

Returned Metadata:

- Transcript
- Moments of audio transition (new speaker)
- Moments of video transition (new scene)
- OCR data (any text that appears in the image) and its timestamps
- Entities (Names, Locations) and their timestamps
- Suggested Tags
- Face identifications and their timestamp ranges [only done if faces are provided]
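
The "timestamp ranges" for a face could be derived from individual per-frame detections. The following is a minimal sketch under that assumption: `merge_into_ranges` and its `max_gap` parameter are hypothetical, and how the detections are produced is left out entirely.

```python
def merge_into_ranges(timestamps, max_gap=1.0):
    """Group detection times (in seconds) into (start, end) ranges.

    Two detections fall into the same range when they are at most
    `max_gap` seconds apart. Input order does not matter.
    """
    ranges = []
    for t in sorted(timestamps):
        if ranges and t - ranges[-1][1] <= max_gap:
            ranges[-1][1] = t  # extend the current range
        else:
            ranges.append([t, t])  # start a new range
    return [(start, end) for start, end in ranges]
```

A detection with no neighbors within `max_gap` becomes a range whose start and end are equal.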

AUDIO

Valid Inputs: URL, Audio (mp3, wav)

Optional Inputs: Transcript, Voice Samples, Known Metadata

Returned Metadata:

- Transcript
- Moments of audio transition (new speaker)
- Entities (Names, Locations) and their timestamps
- Suggested Tags
- Voice identifications and their timestamp ranges [only done if voice samples are provided]
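
One way to read "moments of audio transition (new speaker)" is as the start times at which the speaker label changes. The sketch below assumes the audio has already been split into labeled segments by some other step; the function name `speaker_transitions` and the `(start_seconds, speaker_label)` tuple layout are hypothetical.

```python
def speaker_transitions(segments):
    """Return the start times at which a different speaker begins.

    `segments` is a list of (start_seconds, speaker_label) tuples in any
    order. The first segment counts as a transition, since it introduces
    the first speaker.
    """
    moments = []
    previous = None
    for start, speaker in sorted(segments):
        if speaker != previous:
            moments.append(start)
            previous = speaker
    return moments
```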

IMAGE

Valid Inputs: URL, Image (jpg, gif, bmp, png)

Optional Inputs: Faces, Known Metadata

Returned Metadata:

- OCR data and its coordinate location
- Object identification
- Face identification [only done if faces are provided]
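
To make "OCR data and its coordinate location" concrete, here is one possible shape for the returned data, serialized as JSON. The `OcrRegion` type, its field names, and the choice of a pixel bounding box measured from the top-left corner are all assumptions made for illustration; the project does not specify a format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class OcrRegion:
    text: str    # recognized text
    x: int       # left edge, in pixels from the image's left side
    y: int       # top edge, in pixels from the image's top
    width: int   # box width in pixels
    height: int  # box height in pixels


def to_json(regions):
    """Serialize OCR regions as a JSON array of objects."""
    return json.dumps([asdict(r) for r in regions])
```

For video, the same structure could carry an extra timestamp field so that on-screen text is tied to the moment it appears.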