Sheriffing/Sheriff Duty/Survey


Below is a detailed overview of the results from our first survey:

99% of the community members who took part in the survey had already had contact with Sheriffs.

This contact/interaction was generally felt to be positive. Some comments were:

-> The sheriffs are awesome!! Seriously though, I'm always really happy with how they (you) can (almost always) stay composed in the face of so much orange. I don't know that I'd manage.

-> So far I've only had good experiences with my [checkin-needed] requests. Sadly I've caused some bustage and things had to be backed out, but as an inexperienced contributor--I'm glad the Sheriffs were there to monitor that and help out.

-> I think Firefox development would crawl to a stop without the sheriffs' tireless effort. Thank you! However, that we need sheriffs is a sign that our automation is lacking. We're patching technical problems with manual human work.

-> Things generally go pretty well, imo. The only issue I've run into is getting backed out when I'm around but not watching IRC.

-> Well: - Sheriffs very available on IRC, friendly and helpful with details and explaining when necessary. - The `checkin-needed` pipeline just works, if sheriffs made a mistake once I don't remember it. Wrong: Nothing really, but to help this survey I'll force myself to complain. - I feel my `checkin-needed` flags generally take a long time to be addressed. Not sure if this latency can be improved though, given the volume of complex things a few sheriffs have to deal with. - Also, it was not clear who was currently sheriffing in my time zone, but with a little experience I noticed that that's whoever does the most talking in #developers.

As expected, we also got some comments around backouts:

-> I think the sheriffs do a pretty great job, especially given the complexity of all the different things they're trying to accomplish at once! One thing that can be frustrating is that it's desirable to push many patches as a group, to reduce the burden on our infrastructure. If it turns out that one of those patches causes an intermittent failure or something similar, though, generally all the patches in that push will be backed out. That can be problematic for a variety of reasons, which often pushes me to land important patches separately from anything else.

-> The sheriffs are usually fantastic. I don't have specific complaints. Sometimes backouts can feel overzealous (trivial warning fix, test disablement), but it's understandable since there is so much happening and so much incoming that they can't waste too much time on any one issue.

-> I got backed out, life is tough

It also seems that a lot of the checkin-needed requestors left feedback:

-> I used to use checkin-needed a lot before I had level 3 and the occasional need for backouts. I've always found the process very efficient and done with good humour.

-> I try to use checkin-needed as much as possible because I see it as one of the few ways to land code without a lot of wasted machine time on inbound, however this isn't always possible because it isn't possible to predict when something marked as checkin-needed will land, and sometimes things need to land at a specific time.

We also got some improvement ideas for ourselves (some of which we had already discussed, like uplift documentation to make uplifts easier and offload work from Ryan):

-> Being able to just request approval for backports and have the backporting fairy handle the rest is awesome. On the many instances of when I've broken things and needed backout, the sheriffs have always been polite and helpful. Different sheriffs definitely seem to have different strengths, but I have not yet come across a situation where I was stuck because the wrong sheriff was around (though admittedly that's partly because RyanVM seems to show up at odd times outside of his shift.) philor cracks me up, even when he's targeting his ire at me, and I really appreciate having his eye on various symptoms (eg flaky slaves or useless tests/platforms) that nobody else seems to pay a lot of attention to.

-> More distributed responsibilities. For now, having only Ryan to do uplift to aurora, beta and release is a bottleneck. I know it is a lot to ask but having ETA on the beta landings would help us too.

-> Send me e-mail together with the IRC ping about bustage. A form letter saying "bustage" and nothing else is probably just fine.

For the question "Are you aware that test suites can be hidden on Treeherder?", 66% of the replies were a "yes". Some comments:

-> It has been frustrating when Try server's visibility doesn't exactly match Inbound's. In this case you can either waste time worrying about something that is hidden OR get backed out for missing something that was hidden on try.

-> Yes. Most often the problem I have in this respect is pushing to try, seeing failures, and not realizing that the job is hidden on m-c or the branch that I care about. I then waste some time doing re-pushes or tracking down the failure before I realize I don't need to worry about it.

-> Yes, although the icon in Treeherder doesn't make it clear if hidden results are shown. I often have to toggle it to determine the visibility status. Even the tooltip doesn't reflect the state.

-> I spent a while looking at jetpack errors on try before realizing they were hidden.

We also got a lot of replies to the question about what could be automated. Most of them demand autoland :)

-> Lots of them. Some of them annoyed me personally enough that I automated the little fuckers, if you'll excuse the terminology. (qbackout, bzexport, pieces of qimportbz.) But there's still plenty left. Autolander type stuff. I'd like when someone marks a bug sec-high or sec-crit, for the landing rules to be pasted into a comment since I never remember to request sec-approval. I'd like it to be easier to paw through the wreckage of failed jobs. Optimally for linux jobs, that would be an rr recording of the relevant portion that I can run on a VM or something under my control. More generally, it'd be nice if build directories for failures were walled off from the bloodthirsty purge rabbits for a little while, or maybe even automatically archived for some duration. And if the developer requests it, that it would be frozen on ice for later examination in an environment close enough to the slave machine that you can rerun things and they would have a high probability of reproducing (if reproduction is based on the environment rather than a true intermittent.) Maybe that's a docker image? Or it's a clone of the actual slave with secure keys etc stripped out? A reviewboard setup that doesn't suck. It's getting there, but it's not there yet. Better ways to select related jobs. The TH related jobs pane is cool, but it still seems kind of hard to find the series of builds I really care about, and the filter selection is too abstract. Part of the problem is that the data model it's sitting on is fuzzy and weird. (job names vs job types vs builders vs .... wtf?) Better trychooser. Something with less cryptic syntax, integrated with the actual set of builders available, not relying on weird string matching, capable of selecting based on various properties not just magic builder name strings. Support for more builders (eg pgo). An easier way to push different sorts of jobs (eg only run this test, but run it 100 times. Save out this log file and upload it if the job fails.) Log analysis tools. Why did this job fail but this other one succeed? Lots more, but I need to get to work now.

-> Some kind of automated bisection of the appearance of new intermittent oranges via retriggers on Treeherder itself could be useful.

-> I was super-confused when the build machines run different jobs than are available on the try chooser. Not sure if there can be some automated check, at least to put a note on the trychooser page that it may not be in sync. Probably extremely difficult/nearly impossible to do. (For instance the mozilla telemetry page has a header that is updated that lets you know if there are telemetry problems ATM so you aren't left scratching your head.)

-> I don't know if this is possible, but http://trychooser.pub.build.mozilla.org/ is sort of a hot mess. Anything to simplify choosing the right builds/tests would be a win. Maybe some simple presets that would pre-select the most useful options (which would lower confusion, and perhaps lower test time) - [ ] UI changes for X platform - [ ] platform changes - [ ] telemetry changes

-> Honestly, improving the Bugzilla APIs would help us all improve our process. Filing a bug is bonkers time consuming; how can this not just be an API that I hit with a Python script or something? The android-sync git repo has a scripted-but-manual merge into m-c landing process that could be smoothed out, but there's only 2-3 developers using it so it's not worth improving.
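For context on the last point: Bugzilla does expose a REST API that can file bugs from a script. Below is a minimal sketch, assuming the bugzilla.mozilla.org REST endpoint (POST /rest/bug) and an API key generated in the user's Bugzilla preferences; the product, component, and summary values are placeholders, not a recommended workflow.

 # Minimal sketch: filing a bug through the Bugzilla REST API (POST /rest/bug).
 # Assumes a valid API key; product/component/summary below are placeholders.
 import requests
 
 BUGZILLA_BUG_URL = "https://bugzilla.mozilla.org/rest/bug"
 API_KEY = "your-api-key-here"  # generated under Bugzilla user preferences
 
 def file_bug(product, component, summary, description, version="unspecified"):
     payload = {
         "product": product,
         "component": component,
         "summary": summary,
         "description": description,
         "version": version,
     }
     resp = requests.post(BUGZILLA_BUG_URL, json=payload,
                          headers={"X-BUGZILLA-API-KEY": API_KEY})
     resp.raise_for_status()
     return resp.json()["id"]  # the REST API returns the new bug's id
 
 if __name__ == "__main__":
     bug_id = file_bug("Testing", "General",
                       "Intermittent test_foo.html | timed out",
                       "Filed from a script, as suggested in the survey feedback.")
     print("Filed bug", bug_id)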