== Project ideas ==
* Snappy parses symbol files in a separate process (which we informally call the "Symbolication Process") because of the well-known [https://www.youtube.com/watch?v=Obt-vMVdM8s GIL problems]. We could run one Symbolication Process per CPU core, but that raises a tricky problem: the Symbolication Process maintains an in-memory cache of the most recently used symbols, and we have to decide where that cache lives. We could have a single cache shared among all Symbolication Processes, but that would bring contention and hurt parallelism. Alternatively, each Symbolication Process could have its own in-memory cache, but we could waste memory on symbols duplicated across processes, and we could duplicate work, because several subprocesses would parse the same symbol file when similar symbolication requests arrive. One good solution is to maintain the memory cache in the parent process.
* The Symbolication Process requests symbol files from S3. This is an I/O-bound task, so it should happen in the main process, where Snappy could use asynchronous I/O. The problem is that we then have to send the symbol file to the Symbolication Process through IPC, and the IPC overhead could cancel out the performance gain from asynchronous requests. We need performance numbers here to decide which approach is best.
* Too bad we don't have unit tests.
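
The parent-process cache suggested in the first idea could look roughly like the sketch below. It is not Snappy's code: the class name <code>ParentSymbolCache</code>, the <code>SymbolTable</code> type, and the <code>parse</code> callable are assumptions made for illustration. The idea is that only the parent touches the LRU cache, so there is no cross-process contention, and a symbol file is handed to a Symbolication Process only on a cache miss.

```python
from collections import OrderedDict
from typing import Callable, Dict

# Hypothetical representation of one parsed symbol file: address -> symbol name.
SymbolTable = Dict[int, str]


class ParentSymbolCache:
    """LRU cache owned by the parent process (illustrative sketch only)."""

    def __init__(self, parse: Callable[[str], SymbolTable], max_entries: int = 128):
        # `parse` stands in for whatever dispatches work to a Symbolication
        # Process, for example a call that submits to a process pool and
        # waits for the result. Its name and signature are assumptions.
        self._parse = parse
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, SymbolTable]" = OrderedDict()
        self.misses = 0

    def get(self, key: str) -> SymbolTable:
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: parse once, then remember the result.
        self.misses += 1
        table = self._parse(key)
        self._entries[key] = table
        if len(self._entries) > self._max_entries:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return table
```

One caveat with this layout: the parsed table still has to travel from the Symbolication Process back to the parent, so it pays an IPC cost once per miss.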
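
To start collecting the performance numbers mentioned in the second idea, a rough first measurement might look like this. The function name <code>time_pipe_transfer</code> is made up for this sketch, and the reader runs in a thread of the same process rather than in a real Symbolication Process, so the result is only an approximation of the pipe transfer cost, not a full IPC benchmark.

```python
import threading
import time
from multiprocessing import Pipe


def time_pipe_transfer(payload: bytes, repeats: int = 5) -> float:
    """Return the average seconds needed to push `payload` through a pipe."""
    receiver, sender = Pipe(duplex=False)
    received_sizes = []

    def drain() -> None:
        # Read every message so that large sends do not block forever.
        for _ in range(repeats):
            received_sizes.append(len(receiver.recv_bytes()))

    reader = threading.Thread(target=drain)
    reader.start()
    start = time.perf_counter()
    for _ in range(repeats):
        sender.send_bytes(payload)
    reader.join()  # Stop the clock only after everything has arrived.
    elapsed = time.perf_counter() - start
    sender.close()
    receiver.close()
    if received_sizes != [len(payload)] * repeats:
        raise RuntimeError("payload was truncated in transit")
    return elapsed / repeats
```

Calling it with a payload about the size of a typical symbol file, and comparing the result with the time an S3 download takes, would give a first hint about whether the IPC cost matters.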