Thunderbird:Unit Test Ideas
The following is a culmination of efforts to brainstorm on a unit test framework for Thunderbird. -Gary, June 2008
- 1 Unit tests that may be relevant to Thunderbird
- 2 Items that may not necessarily be relevant to Thunderbird
- 3 Miscellaneous questions
Unit tests that may be relevant to Thunderbird
- Crashtests: always the number 1 priority, as they determine whether Thunderbird is actually usable for the particular function that the crashtest exercises.
- Standard8: So from a quick glance, Firefox loads in html (and other files?) into the browser and hence it shouldn't crash. How do we apply this to mailnews? If we crash on loading an email into the preview window due to a layout issue, then that is core. Crashes within mailnews code I think can normally be covered by xpcshell/make check level tests. Maybe this needs thinking about a bit more.
- dmose: Given that Tinderbox already turns orange when the browser crashes while running one of the other frameworks, it'd be interesting to understand the motivations behind having a separate framework here.
- Standard8: We mainly now just need to encourage devs to write tests.
- dmose: Agreed.
- Fake server tests cover protocols such as POP, IMAP, SMTP, and news. There is some traction in getting these in place: the ones for SMTP and news are already implemented (see Standard8's blog).
- Standard8: These are currently a subset of xpcshell tests, though I expect can be used for other testing architectures. Possibly need a bit more documentation about these on the wiki.
- dmose: Right; the fake server stuff is not its own framework; it's simply a set of functionality that is useful within various other frameworks.
- jcranmer: Both fakeservers still have ways to go. Better documentation is of course necessary (I'm writing a test right now on testing news subscription, and will use that as the basis for a more comprehensive test guide). Even though it is just part of xpcshell framework, it is important enough to be its own subcategory IMO. The last thing I personally have to work out is nsIMsgWindow interactions...
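To make the fake server idea above concrete, here is a minimal sketch of an in-memory SMTP stand-in that records what a client sends so a test can assert on it afterwards. It is written in Python purely for illustration and is not the actual Thunderbird fakeserver code; the class name `FakeSmtpSession`, its attributes, and the reply strings are all assumptions made for this example.

```python
class FakeSmtpSession:
    """Hypothetical in-memory SMTP stand-in (not the Thunderbird fakeserver API)."""

    def __init__(self):
        self.transcript = []      # every line received, in order
        self.sender = None        # argument of MAIL FROM
        self.recipients = []      # arguments of RCPT TO
        self.message_lines = []   # lines received between DATA and "."
        self._in_data = False

    def handle(self, line):
        """Process one client line; return the reply, or None while in DATA."""
        self.transcript.append(line)
        if self._in_data:
            if line == ".":
                self._in_data = False
                return "250 OK message accepted"
            self.message_lines.append(line)
            return None
        verb = line.split(" ", 1)[0].upper()
        if verb in ("HELO", "EHLO"):
            return "250 fake.example"
        if verb == "MAIL":
            self.sender = line.partition(":")[2].strip()
            return "250 OK"
        if verb == "RCPT":
            self.recipients.append(line.partition(":")[2].strip())
            return "250 OK"
        if verb == "DATA":
            self._in_data = True
            return "354 End data with <CR><LF>.<CR><LF>"
        if verb == "QUIT":
            return "221 Bye"
        return "500 Unrecognized command"


# Usage: replay a scripted client conversation, then inspect what was recorded.
session = FakeSmtpSession()
for client_line in ["EHLO client.example",
                    "MAIL FROM:<from@example.com>",
                    "RCPT TO:<to@example.com>",
                    "DATA", "Subject: test", "", "Hello", ".", "QUIT"]:
    session.handle(client_line)
assert session.recipients == ["<to@example.com>"]
```

Keeping the recorded transaction on the session object is what lets the same stand-in be reused from different test frameworks, which matches dmose's point that the fake servers are shared functionality rather than a framework of their own.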
- Chrome tests are relevant, albeit in Thunderbird chrome instead of Firefox chrome or browser/chrome. Chrometests are basically Mochitests with chrome privileges. As with crashtests, I do not believe we have any Thunderbird chrome tests in place at this point in time. Mail chrome tests are more advanced and powerful, but they have a steep learning curve, as they require detailed knowledge of Thunderbird's internals. See the Gristmill notes below.
- Standard8: So in Thunderbird terms, what is the difference between a chrome test and a browser chrome test?
- dmose: My somewhat fuzzy recollection is that there may be more than two types of chrome tests, even. This is probably a good thing to talk to Gavin about.
- Standard8: What support environment do you get?
- dmose: Different for each framework, I think.
- Standard8: Would this be for testing items like the address book results window pane, popup menu contents etc?
- dmose: Yes. Or anything that requires driving the UI in some way.
- Standard8: In Firefox terms, are these all in /browser, or are some of these tests core?
- dmose: Some are in toolkit, I suspect, since there's a bunch of shared UI implemented there.
- Gristmill, a project that Clint and Mikael (from MoCo) are currently working on, can potentially replace manual litmus tests and possibly some browser/chrome tests (hence its relevance to Firefox). It is found here: http://wiki.mozilla.org/QA/TDAI/MozMillTestTool
- In demonstrations, Gristmill automates simple actions such as mouse clicks. For example, it opens a browser window, clicks the Home button, adds a new tab, and runs a search in the search box, all without manual input.
- This may potentially be useful when we apply Gristmill to Thunderbird, to get it to open a Write window, type something in the subject and contents, then send it out. Upon successful sending, the fake server will then report the results.
- We can then tell if a certain action causes a crash, or even an assertion, and in addition to that, fake servers can tell if the message was sent successfully.
- We could also randomize the actions, selecting them from pre-defined lists of objects (e.g. the Write button is found on the main UI but not in the message compose window) based on the ids found in the XUL files, and let Gristmill run continuously.
- Looking at the logs will enable us to see which exact actions led to the crash / assertion. Human intervention can verify the suspects.
- Standard8: This sounds good, though obviously we'll have to wait a while to get it.
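The randomized-action idea above can be sketched as follows. This is an illustration in Python under stated assumptions, not Gristmill's API: the `ACTIONS` table, the element ids in it, and the `perform` callback (which would drive the real UI and raise on a crash or assertion) are all hypothetical. The points it shows are that actions are drawn only from the list for the current window, that a fixed seed makes a run repeatable, and that the log ends at the action that failed.

```python
import random

# Hypothetical per-window action lists; real ids would come from the XUL files.
ACTIONS = {
    "main": ["button-getmsg", "button-newmsg", "button-address"],
    "compose": ["button-send", "button-attach", "button-save"],
}


def run_random_actions(perform, window, steps, seed):
    """Perform up to `steps` random actions in `window`.

    `perform(window, action_id)` is supplied by the caller and is assumed to
    raise an exception when the action fails. Returns (log, error), where
    `log` lists the actions attempted in order and `error` is the exception
    that stopped the run, or None if every step succeeded.
    """
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    log = []
    for _ in range(steps):
        action = rng.choice(ACTIONS[window])
        log.append((window, action))  # log before acting, so a crash is attributed
        try:
            perform(window, action)
        except Exception as exc:
            return log, exc
    return log, None


# Usage: a stand-in `perform` that fails on one particular action.
def flaky_perform(window, action):
    if action == "button-attach":
        raise RuntimeError("simulated assertion")


log, error = run_random_actions(flaky_perform, "compose", steps=20, seed=1)
if error is not None:
    print("failed after", len(log), "actions; last was", log[-1])
```

Logging each action before performing it means the final log entry identifies the suspect even if the process dies mid-action, and rerunning with the same seed reproduces the sequence for a human to verify.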
Items that may not necessarily be relevant to Thunderbird
- Basic Mochitests load an iframe, within which multiple tests are run. Thus, this is likely HTML-related, and as such, layout / Gecko work. Most of it is already covered by existing Mochitests, and I do not see any advantage in having Mochitests for Thunderbird.
- Standard8: Right, so the question here is do we want to be able to run these core tests on a Thunderbird build?
- Standard8: If we were on xulrunner I would say no, leave the xulrunner testing to xulrunner. Given the builds aren't quite the same yet, do we want to be able to run these tests on our build?
- dmose: In some ideal world, sure. But this seems significantly lower priority than almost any other sort of testing work I can think of.
- Reftests deal mostly with layout, and as such are likely to be Gecko-related as well, thus are already covered by existing Firefox unit test infrastructure.
- Standard8: Again, I think this has the same question as Mochitests.
- Compiled-code tests: quoted from one of the reference pages below, "Writing a C++ test is more difficult than writing an xpcshell test, and the code is harder to maintain and modify. Don't write a compiled-code test if you don't have to! You should use these tests only when the functionality being tested depends upon unscriptable interfaces, methods, or properties, as much as possible."
- Thus, I do not believe such tests are necessarily a priority in the preliminary stages of setting up Thunderbird's unit test infrastructure.
- Standard8: The structure is actually already in place (only needs make check), and we do already have one of these tests.
- dmose: I think there are a few compiled code tests in the MIME code as well.
- Standard8: As Clint said, we have some no script interfaces that can't be tested any other way (although we should be looking at reducing those). I'd like to extend it to say we've got some purely c++ objects where the interfaces are not enough (because they don't need to be) for us to test them with xpcshell level tests.
- Standard8: From my own perspective, writing a c++ test for a single class isn't more difficult than an xpcshell test for the equivalent. Obviously if the test is getting complex, then it will be harder.
- dmose: Right, we're probably always going to need some amount of C++ testing, but like with JS code, it's nice to avoid when possible. And I agree with Gary's point that this isn't high priority.
Miscellaneous questions
- Standard8: How much do we want to test the core of our builds?
- dmose: "Fixing" core xpcshell tests has revealed some problems/bugs in our build, partially due to us not being up to standard, and one of them has potentially revealed some actual bugs in how we do things (I'm not 100% sure on that yet though).
- Standard8: Can we leave it up to Firefox testing the core?
- dmose: I think that tests for the core should live in the core. So in the cases where Firefox isn't (and isn't likely to) test core stuff, we should very much feel free to write and contribute tests, but they don't belong in a mailnews specific directory.
- Standard8: How do we run these test architectures, what support/driving tools are required? (e.g. do we need to write a browser add-on, or would a simple overlay do?)
- Standard8: Which test architectures are unstable on Firefox, and do they know why (i.e. why do they keep on getting random test failures)?
- dmose: Yep, these are definitely key questions.
- Standard8: If (with the set of test architectures that we decide on) we wrote enough tests of all the different types, would we (in theory) be able to get 100% code coverage (for mailnews/, mail/, editor/ui and directory/ (ldap) ) without adding more test architectures?
- Standard8: Would we be able to easily simulate any server/client setup to try and debug particular user problems and form a regression test for it?
- dmose: I'm not convinced that these two questions are what we want to spend time on now: it sounds like a bunch of work to figure out, and it's not clear to me what the benefit is. On the other hand, it seems like we have a very pressing need for the ability to test UI-driven stuff. I'd much prefer to spend effort looking into the various chrome testing frameworks and figuring out what the options are there.