Changes


Community:SummerOfCode18

1,337 bytes added, 16:40, 29 January 2018
Increase quality of infosec gsoc proposal for further clarity on milestones / deliverables.
|-
| Timely Security Analytics
| InfoSec uses the Mozilla Defense Platform (MozDef) to aggregate logs and alert on time series. This project seeks to create a structure for extract, transform, load (ETL) operations that process these time series events using MapReduce (Apache Spark). The project will have three milestones:
* Demonstrated ability to load data from Mozilla SCL3, MDC1, and MDC2 to GCP or AWS and process it in Apache Spark. The student shall have a choice of languages: Scala, Python, etc.
* Demonstrated ability to reason about the MozDef standard data format using MapReduce. Produce anomalies for known incident patterns in AWS CloudTrail logs. (Example: an IAM user or service account used a service they have never used, or logged in from a new GeoLocation / IP address tuple.)
* Demonstrated ability to reason about unstructured data (non-MozDef standard format) and correlate it with MozDef structured events.
For more information see: [https://docs.google.com/document/d/1pzVbFw5TM8gG01Jabi4aOxAhiMvHGUHZTG-hV4dtAQs/edit?usp=sharing Google Doc]
| Languages or skills needed: Python, Scala, JavaScript, ETL, AWS/GCP, Data Science Fundamentals, Apache Spark, Pig, or other big data tools.
| [https://mozillians.org/en-US/u/akrug/ Andrew Krug :akrug (mozilla)]
| [https://mozillians.org/en-US/u/akrug/ Andrew Krug :akrug (mozilla)]
[https://mozillians.org/en-US/u/michalpurzynski/ Michal Purzynski :michal` (mozilla)]
| This project represents a unique opportunity to work on one of the only fully open source SIEM projects in the community: MozDef. The MozDef platform is running in production at Mozilla and ingests 14,000,000 events per year. The current alert system uses time series rules to generate meaningful alerts for the InfoSec team and Mozilla's Operations Center. We're hoping that, with the help of qualified student(s), this rich data set can also generate alerts using basic machine learning techniques. The work involves aggregating disparate sources of data (NSM, CloudTrail, etc.) into Spark and producing real-time alerts over large quantities of data.
|-
| C++ Static Analysis