DXR Query Language Refresh

The upcoming refresh of DXR's query language aims to make queries briefer, more memorable, and more compact. This document describes our eventual goal; the changes will probably land a little at a time.

Simple Searching

Entering plain text searches the code and pathnames. Each word is taken as a separate substring to match, and the substrings are and-ed together on a per-line basis.

To exclude lines matching a word, precede the word with "-".
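
As a rough illustration of these two rules, here is a minimal Python sketch. It is not DXR's implementation: the function name `line_matches` is hypothetical, it looks only at line text (not pathnames), and it assumes case-sensitive matching, which this document leaves undecided.

```python
def line_matches(line, query):
    """Return True if `line` satisfies a plain-text query.

    Each whitespace-separated word is a substring test, and a leading "-"
    turns the word into an exclusion. All tests must pass (AND).
    Assumption: matching is case-sensitive.
    """
    for word in query.split():
        if word.startswith("-") and len(word) > 1:
            if word[1:] in line:   # excluded substring is present
                return False
        elif word not in line:     # required substring is missing
            return False
    return True


lines = [
    "big angry hamster",
    "big angry hippo",
    "small angry hamster",
]
# Keep lines containing "angry" and "ham" but not "small".
hits = [l for l in lines if line_matches(l, "angry ham -small")]
```

Here `hits` ends up holding only "big angry hamster": the second line lacks "ham" and the third contains the excluded "small".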

Filters

Other, more specific search filters are also available:

path
Pathname with shell-style globbing
fn (or fun or func or function?)
Definitions or declarations of a function with the given prefix
fn-ref
Uses of a given function
re (or regex or regexp)
Regular expression
member
Find member functions of a class (or struct?).
id
Identifier of any kind (maybe not necessary if we do search ranking better)
ref
A reference to any kind of identifier
caller or callers
Functions that call a given function
called-by
Functions called by a given function
type
The definition or declaration of a given type
type-ref
Uses of a given type
var
Definitions or declarations of a variable
var-ref
Uses of a given variable
namespace
Definition or declaration of a namespace
namespace-ref
Uses of a namespace
namespace-alias or namespace-alias-ref
Should these merge into the above?
macro
Definition or declaration of a macro
macro-ref
Use of a macro
subclass or sub
Subclass of a class
superclass or super
Superclass of a class
warning
Compiler warnings?
warning-opt
More compiler warnings?

Again, query terms are and-ed together and matched against individual lines of the codebase, like grep's single-line mode. This query, for example, finds all the lines from `.h` files containing the words "big", "angry", and either "hamster" or "hippo".

   path:*.h big angry re:hamster|hippo

You can negate a filter by preceding it with "-":

   -path:*.cpp -path:*.c fn:foo
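
To make the and-ing and negation concrete, here is a small Python sketch that evaluates a query against one (path, line) pair. It is only an approximation under stated assumptions: the names `term_matches` and `matches` are hypothetical, only `path:`, `re:`, and plain words are handled (structural filters such as `fn:` need a code index and are out of scope), and terms are split on whitespace, so quoting is ignored.

```python
import fnmatch
import re


def term_matches(term, path, line):
    """Evaluate one query term against a (path, line) pair."""
    negated = term.startswith("-") and len(term) > 1
    if negated:
        term = term[1:]
    if term.startswith("path:"):
        # Shell-style globbing on the pathname.
        result = fnmatch.fnmatchcase(path, term[len("path:"):])
    elif term.startswith("re:"):
        # Regular expression searched within the line.
        result = re.search(term[len("re:"):], line) is not None
    else:
        # A plain word is a substring test on the line.
        result = term in line
    return not result if negated else result


def matches(query, path, line):
    """All terms must hold for the same line (AND)."""
    return all(term_matches(t, path, line) for t in query.split())


found = matches("path:*.h big angry re:hamster|hippo",
                "src/zoo.h", "a big angry hippo")
```

Python's `fnmatch` lets `*` match across `/`, so `*.h` matches `src/zoo.h`; whether DXR's globbing behaves the same way is not stated here.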

Let's try to standardize with https://code.google.com/p/chromium/codesearch as much as possible.

Obsolete

  • `ext:` goes away. It's covered by `path:*.c`
  • `*-decl:` goes away until somebody asks for it. It's merged into `*`.

To Be Determined

  • a way to express case-sensitivity. Possibilities include quotes and +.

More Explicit Operators

Some people expect runs of barewords to be treated as phrase matches. Here's how we could do that.

Take any contiguous sequence of barewords (implicit text: filters) to be a single string. Otherwise, AND everything.

For example...

   three blind mice path:*.c def

…would search for "three blind mice" AND path:*.c AND "def".

You can go back to the old behavior by saying explicitly...

   three AND blind AND mice
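
The grouping rule might be sketched in Python as follows. This is an illustration, not a proposed implementation: `group_terms`, `is_filter`, and the `FILTER_NAMES` subset are hypothetical names, and the sketch splits on whitespace without handling quotes or negated barewords.

```python
# Assumption: a small subset of the filter names listed earlier.
FILTER_NAMES = {"path", "fn", "fn-ref", "re", "type", "var", "macro"}


def is_filter(token):
    """True for tokens like path:*.c or -path:*.c with a known filter name."""
    name, sep, _ = token.lstrip("-").partition(":")
    return bool(sep) and name in FILTER_NAMES


def group_terms(query):
    """Merge runs of adjacent barewords into phrases; AND splits a run."""
    terms, run = [], []

    def flush():
        # Emit the pending run of barewords as one phrase.
        if run:
            terms.append(" ".join(run))
            run.clear()

    for token in query.split():
        if token == "AND" or is_filter(token):
            flush()
            if token != "AND":
                terms.append(token)
        else:
            run.append(token)
    flush()
    return terms


grouped = group_terms("three blind mice path:*.c def")
split_up = group_terms("three AND blind AND mice")
```

With these inputs, `grouped` is `["three blind mice", "path:*.c", "def"]` and `split_up` is `["three", "blind", "mice"]`; every element of the returned list is then AND-ed with the others.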

Maybe we could even do…

   mice IN fn:main

…or…

   fn:frob IN type:Frobulator

Quoting and Escaping

To do phrase matching or include spaces in a term, use single or double quotes. Doubles can contain singles, and vice versa. You can also backslash-escape them.

   "big, bad wolf"
   'That "wolf" is a hamster.'
   'Don\'t call my wolf a "hamster".'
   re:"big old|great big"
   -"not this phrase"

You can use a literal quote without enclosing it in other quotes, as long as it isn't a leading one:

   path:/users/erik's/*.py

What about backslashes in unquoted strings, and backslashes that precede something other than a quote?

  • In a quoted string, a backslash before a quote of the same type escapes that quote. Otherwise, it's a literal backslash.
   "one long \"quoted\" string"
   "literal \backslash"
   "\\" is backslash-quote"
  • Later, we may let backslashes in unquoted strings escape spaces:
   one\ long\ string

These rules are akin to common shell syntax and designed so you don't need to plan ahead (or backtrack) when typing a query.
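
One way to read the quoting rules above is the following tokenizer sketch in Python. It rests on assumptions this document does not settle: the name `tokenize` and the (prefix, text) return shape are hypothetical, a term ends at its closing quote, an unterminated quote runs to the end of the query, and backslashes in unquoted terms are kept literally.

```python
import re

# Optional leading "-" and optional "name:" filter prefix (may match nothing).
PREFIX = re.compile(r"-?(?:[a-z][a-z-]*:)?")


def tokenize(query):
    """Split a query into (prefix, text) pairs following the quoting rules."""
    terms, i, n = [], 0, len(query)
    while i < n:
        if query[i].isspace():
            i += 1
            continue
        prefix = PREFIX.match(query, i).group()
        i += len(prefix)
        text = []
        if i < n and query[i] in "'\"":
            # A leading quote opens a quoted string.
            quote = query[i]
            i += 1
            while i < n and query[i] != quote:
                # A backslash escapes only a quote of the enclosing type;
                # before anything else it stays a literal backslash.
                if query[i] == "\\" and i + 1 < n and query[i + 1] == quote:
                    i += 1
                text.append(query[i])
                i += 1
            i += 1  # skip the closing quote, if there was one
        else:
            # Unquoted term: runs to the next whitespace, and any
            # non-leading quote is taken literally.
            while i < n and not query[i].isspace():
                text.append(query[i])
                i += 1
        terms.append((prefix, "".join(text)))
    return terms


parsed = tokenize(
    r'''re:"big old|great big" -"not this phrase" path:/users/erik's/*.py'''
)
```

Because each character is consumed left to right with at most one character of lookahead, the sketch never needs to backtrack, which matches the "no planning ahead" goal stated above.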

See Also

GitHub's issue query language might provide some inspiration.