User:Jorend/Python set usage: Difference between revisions

 
(9 intermediate revisions by the same user not shown)
Line 3: Line 3:
A <code>Set()</code> constructor is proposed for a future edition of the ECMAScript programming language standard. Should the constructor take a single iterable argument or any number of arguments? One factor in such a design decision is usefulness. Variable arguments are nicer when the number of values to be added to the set is known at the call site, as in <code>Set(1, 2, 3)</code> vs. <code>Set([1, 2, 3])</code>. A single iterable argument is nicer when the number of values is not statically known, as in <code>Set(names)</code> vs. <code>Set(...names)</code>. Which need is more common?
A <code>Set()</code> constructor is proposed for a future edition of the ECMAScript programming language standard. Should the constructor take a single iterable argument or any number of arguments? One factor in such a design decision is usefulness. Variable arguments are nicer when the number of values to be added to the set is known at the call site, as in <code>Set(1, 2, 3)</code> vs. <code>Set([1, 2, 3])</code>. A single iterable argument is nicer when the number of values is not statically known, as in <code>Set(names)</code> vs. <code>Set(...names)</code>. Which need is more common?


A script was used to search the Python standard library for expressions that create nonempty sets. Of these expressions, the number of values in the new set is statically apparent in only 41% of the cases.
A script was used to search several Python codebases (including the Python standard library) for expressions that create nonempty sets. Of these expressions, the number of values to be added to the new set is statically apparent in less than half (41% in the standard library, 26.8% overall).


=Background=
=Background=
Line 15: Line 15:
* Calling the <code>set</code> constructor. The constructor takes an optional iterable argument which provides the values. The number of values in a call to <code>set</code> is sometimes statically apparent, as in <code>set([SUCCESS, FAILURE])</code>, and sometimes not, as in <code>set(self.get_labels())</code>.
* Calling the <code>set</code> constructor. The constructor takes an optional iterable argument which provides the values. The number of values in a call to <code>set</code> is sometimes statically apparent, as in <code>set([SUCCESS, FAILURE])</code>, and sometimes not, as in <code>set(self.get_labels())</code>.


=Methodology=
=Method=


A copy of the Python standard library and a corresponding Python executable were obtained using the commands:
A copy of the Python standard library and a corresponding Python executable were obtained using the commands:
Line 28: Line 28:
(On Mac OS X Lion, configure with <code>CC=gcc-apple-4.2 ./configure</code> instead. Apple’s LLVM-based gcc that ships with Lion can’t build Python.)
(On Mac OS X Lion, configure with <code>CC=gcc-apple-4.2 ./configure</code> instead. Apple’s LLVM-based gcc that ships with Lion can’t build Python.)


The script below was then run in the same directory. The script searches the <code>Lib</code> directory for Python files. It disregards unit tests, as they typically contain very unusual code. It parses each other file using the Python parser and finds each set-creation expression. The expression <code>set()</code>, which creates an empty set, is disregarded; the rest are classified as one of the following:
The script below was then run in the <code>Lib</code> directory. The script searches the current directory for Python files. It disregards unit tests, as they typically contain very unusual code. It parses each other file using the Python parser and finds each set-creation expression. The expression <code>set()</code>, which creates an empty set, is disregarded; the rest are classified as one of the following:


* A - statically known number of values (supporting variable arguments)
* A - statically known number of values (supporting variable arguments)
Line 34: Line 34:
* C - using a comprehension (also supporting a single iterable argument, assuming ECMAScript gets array comprehensions)
* C - using a comprehension (also supporting a single iterable argument, assuming ECMAScript gets array comprehensions)


import os, ast, collections
<pre>
from __future__ import division
def target_files(dir):
import os, ast, collections
    for dirpath, dirnames, filenames in os.walk(dir):
 
        dirnames[:] = [d for d in dirnames if d not in ('test', 'tests')]
def target_files(dir):
        for f in filenames:
    for dirpath, dirnames, filenames in os.walk(dir):
            if f.endswith('.py'):
        dirnames[:] = [d for d in dirnames if d not in ('test', 'tests', '_tests')]
                yield os.path.join(dirpath, f)
        for f in filenames:
            if f.endswith('.py') and not f.endswith('Tests.py'):
def main():
                yield os.path.join(dirpath, f)
    counts = collections.defaultdict(int)
 
def main():
    for filename in target_files('./Lib'):
    counts = collections.defaultdict(int)
        with open(filename) as f:
    loc = 0
            src = f.read()
    for filename in target_files('.'):
        lines = src.splitlines()
        with open(filename) as f:
        parse_tree = ast.parse(src, filename)
            src = f.read()
        lines = src.splitlines()
        def log(node, code):
        loc += len(lines)
            counts[code] += 1
        parse_tree = ast.parse(src, filename)
            print('{0} {1}:{2}:{3}'.format(code, filename, node.lineno, lines[node.lineno-1]))
 
        def log(node, code):
        for node in ast.walk(parse_tree):
            counts[code] += 1
            if isinstance(node, ast.Call):
            print('{0} {1}:{2}:{3}'.format(code, filename, node.lineno, lines[node.lineno-1]))
                if isinstance(node.func, ast.Name) and node.func.id == 'set':
 
                    assert len(node.args) in (0, 1)
        for node in ast.walk(parse_tree):
                    assert len(node.keywords) == 0
            if isinstance(node, ast.Call):
                    assert node.starargs is None
                if isinstance(node.func, ast.Name) and node.func.id == 'set':
                    assert node.kwargs is None
                    assert len(node.args) in (0, 1)
                    if len(node.args) == 1:
                    assert len(node.keywords) == 0
                        arg = node.args[0]
                    assert node.starargs is None
                        if isinstance(arg, (ast.Str, ast.List, ast.Tuple)):
                    assert node.kwargs is None
                            log(node, 'A')
                    if len(node.args) == 1:
                        elif isinstance(arg, (ast.ListComp, ast.SetComp, ast.GeneratorExp)):
                        arg = node.args[0]
                            log(node, 'C')
                        if isinstance(arg, (ast.Str, ast.List, ast.Tuple)):
                        else:
                            log(node, 'A')
                            log(node, 'B')
                        elif isinstance(arg, (ast.ListComp, ast.SetComp, ast.GeneratorExp)):
            elif isinstance(node, ast.Set):
                            log(node, 'C')
                log(node, 'A')
                        else:
            elif isinstance(node, ast.SetComp):
                            log(node, 'B')
                log(node, 'C')
            elif isinstance(node, ast.Set):
                log(node, 'A')
    total = sum(counts.values())
            elif isinstance(node, ast.SetComp):
    print()
                log(node, 'C')
    for t, n in sorted(counts.items()):
 
        print("Type {0}: {1} ({2:.1%})".format(t, n, n / total))
    total = sum(counts.values())
    print()
main()
    for t, n in sorted(counts.items()):
        print("Type {0}: {1} ({2:.1%})".format(t, n, n / total))
    print("total: {0}".format(total))
    print("set expressions per 1000 lines of code: {0}".format(1000 * total / loc))
 
main()
</pre>


= Results =
= Results =


The full output is shown below. In short there are 117 qualifying set-creation expressions in the standard library. Of these:
The full output from parsing the Python standard library is shown below. There are 117 qualifying set-creation expressions in the standard library. Of these:


* 48 (41%) are type A (supporting a constructor with variable arguments),
* 48 (41.0%) are type A (supporting a constructor with variable arguments),
* 59 (50.4%) are type B (supporting a constructor with a single iterable argument), and
* 59 (50.4%) are type B (supporting a constructor with a single iterable argument), and
* 10 (8.5%) are type C (comprehensions, supporting a constructor with a single iterable argument)
* 10 (8.5%) are type C (comprehensions, supporting a constructor with a single iterable argument)


A ./Lib/_markupbase.py:152:        if sectName in {"temp", "cdata", "ignore", "include", "rcdata"}:
The numbers for all codebases are:
A ./Lib/_markupbase.py:155:        elif sectName in {"if", "else", "endif"}:
<table class="data">
  A ./Lib/_markupbase.py:210:                if name not in {"attlist", "element", "entity", "notation"}:
  <tr> <th>Codebase        <th>A          <th>B          <th>C          <th>total <th>density
  A ./Lib/_markupbase.py:129:                elif decltype in {"attlist", "linktype", "link", "element"}:
  <tr> <td>standard library <td> 48 (41.0%) <td> 59 (50.4%) <td>10 (8.5%) <td>117  <td>0.45
B ./Lib/_pyio.py:158:    modes = set(mode)
  <tr> <td>Django          <td> 9 (13.6%) <td> 36 (54.5%) <td>21 (31.8%) <td>66    <td>0.65
A ./Lib/_pyio.py:159:    if modes - set("axrwb+tU") or len(mode) > len(modes):
  <tr> <td>Mercurial        <td> 23 (14.0%) <td>110 (67.1%) <td>31 (18.9%) <td>164  <td>2.24
C ./Lib/_weakrefset.py:175:        return self.data <= set(ref(item) for item in other)
  <tr> <td>MoinMoin        <td> 21 (72.4%) <td>  7 (24.1%) <td> (3.4%) <td>29    <td>0.76
  C ./Lib/_weakrefset.py:182:        return self.data >= set(ref(item) for item in other)
  <tr> <td>scons            <td>  2 (28.6%) <td> 5 (71.4%) <td> 0        <td>7    <td>0.11
C ./Lib/_weakrefset.py:187:        return self.data == set(ref(item) for item in other)
  <tr> <td>'''Total'''      <td>103 (26.8%) <td>217 (56.6%) <td>63 (16.4%) <td>383  <td>-
C ./Lib/abc.py:132:        abstracts = {name
</table>
A ./Lib/abc.py:189:       return any(cls.__subclasscheck__(c) for c in {subclass, subtype})
 
B ./Lib/ast.py:60:           return set(map(_convert, node.elts))
(The density column tells how many qualifying set-expressions each project has per 1000 lines of code.)
B ./Lib/bdb.py:22:       self.skip = set(skip) if skip else None
 
C ./Lib/cgi.py:627:       return list(set(item.name for item in self.list))
(I also ran the analysis on Zope, but it apparently does not use sets at all.)
B ./Lib/cmd.py:287:        commands = set(self.completenames(*args))
 
C ./Lib/cmd.py:288:        topics = set(a[5:] for a in self.get_names()
<pre>
A ./Lib/difflib.py:1221:           if tag in {'replace', 'delete'}:
A ./Lib/_markupbase.py:152:        if sectName in {"temp", "cdata", "ignore", "include", "rcdata"}:
A ./Lib/difflib.py:1224:           if tag in {'replace', 'insert'}:
A ./Lib/_markupbase.py:155:        elif sectName in {"if", "else", "endif"}:
A ./Lib/difflib.py:1305:       if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group):
A ./Lib/_markupbase.py:210:               if name not in {"attlist", "element", "entity", "notation"}:
A ./Lib/difflib.py:1314:        if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group):
A ./Lib/_markupbase.py:129:               elif decltype in {"attlist", "linktype", "link", "element"}:
A ./Lib/ftplib.py:931:   if treply[:3] not in {'125', '150'}: raise error_proto  # RFC 959
B ./Lib/_pyio.py:158:   modes = set(mode)
A ./Lib/ftplib.py:933:   if sreply[:3] not in {'125', '150'}: raise error_proto  # RFC 959
A ./Lib/_pyio.py:159:   if modes - set("axrwb+tU") or len(mode) > len(modes):
A ./Lib/ftplib.py:178:        if s[:5] in {'pass ', 'PASS '}:
C ./Lib/_weakrefset.py:175:        return self.data <= set(ref(item) for item in other)
A ./Lib/ftplib.py:228:       if c in {'1', '2', '3'}:
C ./Lib/_weakrefset.py:182:        return self.data >= set(ref(item) for item in other)
A ./Lib/ftplib.py:252:       if resp[:3] not in {'426', '225', '226'}:
C ./Lib/_weakrefset.py:187:        return self.data == set(ref(item) for item in other)
A ./Lib/ftplib.py:570:        if resp[:3] in {'250', '200'}:
C ./Lib/abc.py:132:       abstracts = {name
A ./Lib/ftplib.py:388:        if user == 'anonymous' and passwd in {'', '-'}:
A ./Lib/abc.py:189:       return any(cls.__subclasscheck__(c) for c in {subclass, subtype})
A ./Lib/ftplib.py:819:           if resp[:3] not in {'426', '225', '226'}:
B ./Lib/ast.py:60:           return set(map(_convert, node.elts))
B ./Lib/hashlib.py:59:algorithms_guaranteed = set(__always_supported)
B ./Lib/bdb.py:22:        self.skip = set(skip) if skip else None
B ./Lib/hashlib.py:60:algorithms_available = set(__always_supported)
C ./Lib/cgi.py:627:       return list(set(item.name for item in self.list))
B ./Lib/inspect.py:994:   possible_kwargs = set(args + kwonlyargs)
B ./Lib/cmd.py:287:       commands = set(self.completenames(*args))
B ./Lib/mailbox.py:1614:        flags = set(flags)
C ./Lib/cmd.py:288:        topics = set(a[5:] for a in self.get_names()
B ./Lib/mailbox.py:1110:           all_keys = set(self.keys())
A ./Lib/difflib.py:1221:           if tag in {'replace', 'delete'}:
B ./Lib/mailbox.py:1646:           flags = set(self.get_flags())
A ./Lib/difflib.py:1224:           if tag in {'replace', 'insert'}:
B ./Lib/mailbox.py:1733:            sequences = set(self.get_sequences())
A ./Lib/difflib.py:1305:        if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group):
B ./Lib/mailbox.py:1823:            labels = set(self.get_labels())
A ./Lib/difflib.py:1314:        if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group):
B ./Lib/mailbox.py:1547:           flags = set(self.get_flags())
A ./Lib/ftplib.py:931:   if treply[:3] not in {'125', '150'}: raise error_proto  # RFC 959
B ./Lib/mailbox.py:1744:           sequences = set(self.get_sequences())
A ./Lib/ftplib.py:933:    if sreply[:3] not in {'125', '150'}: raise error_proto  # RFC 959
B ./Lib/mailbox.py:1836:           labels = set(self.get_labels())
A ./Lib/ftplib.py:178:        if s[:5] in {'pass ', 'PASS '}:
B ./Lib/mailbox.py:1141:               for key in sorted(set(keys)):
A ./Lib/ftplib.py:228:        if c in {'1', '2', '3'}:
B ./Lib/mailbox.py:1511:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
A ./Lib/ftplib.py:252:        if resp[:3] not in {'426', '225', '226'}:
B ./Lib/mailbox.py:1511:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
A ./Lib/ftplib.py:570:        if resp[:3] in {'250', '200'}:
B ./Lib/mailbox.py:1560:            flags = set(self.get_flags())
A ./Lib/ftplib.py:388:       if user == 'anonymous' and passwd in {'', '-'}:
B ./Lib/mailbox.py:1636:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
A ./Lib/ftplib.py:819:            if resp[:3] not in {'426', '225', '226'}:
B ./Lib/mailbox.py:1636:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
B ./Lib/hashlib.py:59:algorithms_guaranteed = set(__always_supported)
B ./Lib/mailbox.py:1669:            flags = set(self.get_flags())
B ./Lib/hashlib.py:60:algorithms_available = set(__always_supported)
B ./Lib/mailbox.py:1846:            labels = set(self.get_labels())
B ./Lib/inspect.py:994:   possible_kwargs = set(args + kwonlyargs)
B ./Lib/mailbox.py:1516:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
B ./Lib/mailbox.py:1614:       flags = set(flags)
B ./Lib/mailbox.py:1516:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
B ./Lib/mailbox.py:1110:           all_keys = set(self.keys())
B ./Lib/mailbox.py:1568:           flags = set(self.get_flags())
B ./Lib/mailbox.py:1646:           flags = set(self.get_flags())
B ./Lib/mailbox.py:1641:            self.set_flags(''.join(set(self.get_flags()) - set(flag)))
B ./Lib/mailbox.py:1733:            sequences = set(self.get_sequences())
B ./Lib/mailbox.py:1641:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
B ./Lib/mailbox.py:1823:           labels = set(self.get_labels())
B ./Lib/mailbox.py:1679:            flags = set(self.get_flags())
B ./Lib/mailbox.py:1547:           flags = set(self.get_flags())
B ./Lib/mailbox.py:1757:            sequences = set(self.get_sequences())
B ./Lib/mailbox.py:1744:            sequences = set(self.get_sequences())
A ./Lib/netrc.py:74:                   tt in {'', 'machine', 'default', 'macdef'}):
B ./Lib/mailbox.py:1836:            labels = set(self.get_labels())
A ./Lib/nntplib.py:124:_LONGRESP = {
B ./Lib/mailbox.py:1141:               for key in sorted(set(keys)):
A ./Lib/pydoc.py:166:   if name in {'__builtins__', '__doc__', '__file__', '__path__',
B ./Lib/mailbox.py:1511:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
A ./Lib/pydoc.py:2022:            if modname in {'test.badsyntax_pep3120', 'badsyntax_pep3120'}:
B ./Lib/mailbox.py:1511:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
B ./Lib/shutil.py:204:       return set(ignored_names)
B ./Lib/mailbox.py:1560:            flags = set(self.get_flags())
B ./Lib/site.py:84:   for m in set(sys.modules.values()):
B ./Lib/mailbox.py:1636:        self.set_flags(''.join(set(self.get_flags()) | set(flag)))
A ./Lib/socket.py:233:_blocking_errnos = { EAGAIN, EWOULDBLOCK }
B ./Lib/mailbox.py:1636:       self.set_flags(''.join(set(self.get_flags()) | set(flag)))
A ./Lib/socket.py:152:            if c not in {"r", "w", "b"}:
B ./Lib/mailbox.py:1669:            flags = set(self.get_flags())
A ./Lib/sre_compile.py:27:_LITERAL_CODES = set([LITERAL, NOT_LITERAL])
B ./Lib/mailbox.py:1846:            labels = set(self.get_labels())
A ./Lib/sre_compile.py:28:_REPEATING_CODES = set([REPEAT, MIN_REPEAT, MAX_REPEAT])
B ./Lib/mailbox.py:1516:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
A ./Lib/sre_compile.py:29:_SUCCESS_CODES = set([SUCCESS, FAILURE])
B ./Lib/mailbox.py:1516:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
A ./Lib/sre_compile.py:30:_ASSERT_CODES = set([ASSERT, ASSERT_NOT])
B ./Lib/mailbox.py:1568:           flags = set(self.get_flags())
A ./Lib/sre_parse.py:22:DIGITS = set("0123456789")
B ./Lib/mailbox.py:1641:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
A ./Lib/sre_parse.py:24:OCTDIGITS = set("01234567")
B ./Lib/mailbox.py:1641:           self.set_flags(''.join(set(self.get_flags()) - set(flag)))
A ./Lib/sre_parse.py:25:HEXDIGITS = set("0123456789abcdefABCDEF")
B ./Lib/mailbox.py:1679:            flags = set(self.get_flags())
A ./Lib/sre_parse.py:27:WHITESPACE = set(" \t\n\r\v\f")
B ./Lib/mailbox.py:1757:           sequences = set(self.get_sequences())
A ./Lib/sre_parse.py:381:_PATTERNENDERS = set("|)")
A ./Lib/netrc.py:74:                   tt in {'', 'machine', 'default', 'macdef'}):
A ./Lib/sre_parse.py:382:_ASSERTCHARS = set("=!<")
A ./Lib/nntplib.py:124:_LONGRESP = {
A ./Lib/sre_parse.py:383:_LOOKBEHINDASSERTCHARS = set("=!")
A ./Lib/pydoc.py:166:   if name in {'__builtins__', '__doc__', '__file__', '__path__',
A ./Lib/sre_parse.py:384:_REPEATCODES = set([MIN_REPEAT, MAX_REPEAT])
A ./Lib/pydoc.py:2022:            if modname in {'test.badsyntax_pep3120', 'badsyntax_pep3120'}:
B ./Lib/stringprep.py:19:b1_set = set([173, 847, 6150, 6155, 6156, 6157, 8203, 8204, 8205, 8288, 65279] + list(range(65024,65040)))
B ./Lib/shutil.py:204:       return set(ignored_names)
B ./Lib/stringprep.py:220:c22_specials = set([1757, 1807, 6158, 8204, 8205, 8232, 8233, 65279] + list(range(8288,8292)) + list(range(8298,8304)) + list(range(65529,65533)) + list(range(119155,119163)))
B ./Lib/site.py:84:   for m in set(sys.modules.values()):
B ./Lib/stringprep.py:247:c6_set = set(range(65529,65534))
A ./Lib/socket.py:233:_blocking_errnos = { EAGAIN, EWOULDBLOCK }
B ./Lib/stringprep.py:252:c7_set = set(range(12272,12284))
A ./Lib/socket.py:152:           if c not in {"r", "w", "b"}:
B ./Lib/stringprep.py:257:c8_set = set([832, 833, 8206, 8207] + list(range(8234,8239)) + list(range(8298,8304)))
A ./Lib/sre_compile.py:27:_LITERAL_CODES = set([LITERAL, NOT_LITERAL])
B ./Lib/stringprep.py:262:c9_set = set([917505] + list(range(917536,917632)))
A ./Lib/sre_compile.py:28:_REPEATING_CODES = set([REPEAT, MIN_REPEAT, MAX_REPEAT])
B ./Lib/subprocess.py:1280:                   fds_to_keep = set(pass_fds)
A ./Lib/sre_compile.py:29:_SUCCESS_CODES = set([SUCCESS, FAILURE])
B ./Lib/sysconfig.py:724:               archs = tuple(sorted(set(archs)))
A ./Lib/sre_compile.py:30:_ASSERT_CODES = set([ASSERT, ASSERT_NOT])
A ./Lib/tarfile.py:129:PAX_NAME_FIELDS = {"path", "linkpath", "uname", "gname"}
A ./Lib/sre_parse.py:22:DIGITS = set("0123456789")
B ./Lib/trace.py:133:       self._mods = set() if not modules else set(modules)
A ./Lib/sre_parse.py:24:OCTDIGITS = set("01234567")
B ./Lib/collections/abc.py:421:       return set(it)
A ./Lib/sre_parse.py:25:HEXDIGITS = set("0123456789abcdefABCDEF")
B ./Lib/collections/abc.py:437:        return set(it)
A ./Lib/sre_parse.py:27:WHITESPACE = set(" \t\n\r\v\f")
C ./Lib/concurrent/futures/_base.py:192:        finished = set(
A ./Lib/sre_parse.py:381:_PATTERNENDERS = set("|)")
C ./Lib/concurrent/futures/_base.py:254:        done = set(f for f in fs
A ./Lib/sre_parse.py:382:_ASSERTCHARS = set("=!<")
B ./Lib/concurrent/futures/_base.py:195:        pending = set(fs) - finished
A ./Lib/sre_parse.py:383:_LOOKBEHINDASSERTCHARS = set("=!")
B ./Lib/concurrent/futures/_base.py:256:        not_done = set(fs) - done
A ./Lib/sre_parse.py:384:_REPEATCODES = set([MIN_REPEAT, MAX_REPEAT])
B ./Lib/concurrent/futures/_base.py:275:    return DoneAndNotDoneFutures(done, set(fs) - done)
B ./Lib/stringprep.py:19:b1_set = set([173, 847, 6150, 6155, 6156, 6157, 8203, 8204, 8205, 8288, 65279] + list(range(65024,65040)))
A ./Lib/distutils/msvc9compiler.py:255:    interesting = set(("include", "lib", "libpath", "path"))
B ./Lib/stringprep.py:220:c22_specials = set([1757, 1807, 6158, 8204, 8205, 8232, 8233, 65279] + list(range(8288,8292)) + list(range(8298,8304)) + list(range(65529,65533)) + list(range(119155,119163)))
B ./Lib/distutils/util.py:153:                archs = tuple(sorted(set(archs)))
B ./Lib/stringprep.py:247:c6_set = set(range(65529,65534))
A ./Lib/idlelib/CodeContext.py:18:BLOCKOPENERS = set(["class", "def", "elif", "else", "except", "finally", "for",
B ./Lib/stringprep.py:252:c7_set = set(range(12272,12284))
A ./Lib/idlelib/EditorWindow.py:1614:    if (not keylist) or (macosxSupport.runningAsOSXApp() and eventname in {
B ./Lib/stringprep.py:257:c8_set = set([832, 833, 8206, 8207] + list(range(8234,8239)) + list(range(8298,8304)))
C ./Lib/idlelib/MultiCall.py:127:        substates = list(set(state & x for x in states))
B ./Lib/stringprep.py:262:c9_set = set([917505] + list(range(917536,917632)))
A ./Lib/lib2to3/fixer_util.py:167:consuming_calls = set(["sorted", "list", "set", "any", "all", "tuple", "sum",
B ./Lib/subprocess.py:1280:                    fds_to_keep = set(pass_fds)
A ./Lib/lib2to3/fixer_util.py:339:_def_syms = set([syms.classdef, syms.funcdef])
B ./Lib/sysconfig.py:724:                archs = tuple(sorted(set(archs)))
A ./Lib/lib2to3/fixer_util.py:382:_block_syms = set([syms.funcdef, syms.classdef, syms.trailer])
A ./Lib/tarfile.py:129:PAX_NAME_FIELDS = {"path", "linkpath", "uname", "gname"}
B ./Lib/lib2to3/main.py:220:    avail_fixes = set(refactor.get_fixers_from_package(fixer_pkg))
B ./Lib/trace.py:133:        self._mods = set() if not modules else set(modules)
C ./Lib/lib2to3/main.py:221:    unwanted_fixes = set(fixer_pkg + ".fix_" + fix for fix in options.nofix)
B ./Lib/collections/abc.py:421:        return set(it)
A ./Lib/lib2to3/patcomp.py:35:    skip = set((token.NEWLINE, token.INDENT, token.DEDENT))
B ./Lib/collections/abc.py:437:        return set(it)
A ./Lib/lib2to3/refactor.py:60:        return set([pat.type])
C ./Lib/concurrent/futures/_base.py:192:        finished = set(
A ./Lib/lib2to3/fixes/fix_dict.py:39:iter_exempt = fixer_util.consuming_calls | set(["iter"])
C ./Lib/concurrent/futures/_base.py:254:        done = set(f for f in fs
B ./Lib/lib2to3/fixes/fix_numliterals.py:25:        elif val.startswith('0') and val.isdigit() and len(set(val)) > 1:
B ./Lib/concurrent/futures/_base.py:195:        pending = set(fs) - finished
B ./Lib/multiprocessing/managers.py:397:            self.id_to_obj[ident] = (obj, set(exposed), method_to_typeid)
B ./Lib/concurrent/futures/_base.py:256:        not_done = set(fs) - done
B ./Lib/packaging/run.py:237:    for dist in set(opts['args']):
B ./Lib/concurrent/futures/_base.py:275:    return DoneAndNotDoneFutures(done, set(fs) - done)
A ./Lib/packaging/compiler/msvc9compiler.py:243:    interesting = set(("include", "lib", "libpath", "path"))
A ./Lib/distutils/msvc9compiler.py:255:    interesting = set(("include", "lib", "libpath", "path"))
B ./Lib/packaging/pypi/simple.py:138:        self._mirrors = set(mirrors)
B ./Lib/distutils/util.py:153:                archs = tuple(sorted(set(archs)))
B ./Lib/packaging/pypi/xmlrpc.py:184:        return [self._projects[name.lower()] for name in set(projects)]
A ./Lib/idlelib/CodeContext.py:18:BLOCKOPENERS = set(["class", "def", "elif", "else", "except", "finally", "for",
B ./Lib/packaging/pypi/xmlrpc.py:95:                hidden_versions = set(all_versions) - set(existing_versions)
A ./Lib/idlelib/EditorWindow.py:1614:    if (not keylist) or (macosxSupport.runningAsOSXApp() and eventname in {
B ./Lib/packaging/pypi/xmlrpc.py:95:                hidden_versions = set(all_versions) - set(existing_versions)
C ./Lib/idlelib/MultiCall.py:127:        substates = list(set(state & x for x in states))
A ./Lib/wsgiref/handlers.py:25:_is_request = {
A ./Lib/lib2to3/fixer_util.py:167:consuming_calls = set(["sorted", "list", "set", "any", "all", "tuple", "sum",
B ./Lib/xml/etree/ElementTree.py:990:    HTML_EMPTY = set(HTML_EMPTY)
A ./Lib/lib2to3/fixer_util.py:339:_def_syms = set([syms.classdef, syms.funcdef])
B ./Lib/xmlrpc/server.py:279:        methods = set(self.funcs.keys())
A ./Lib/lib2to3/fixer_util.py:382:_block_syms = set([syms.funcdef, syms.classdef, syms.trailer])
B ./Lib/xmlrpc/server.py:284:                methods |= set(self.instance._listMethods())
B ./Lib/lib2to3/main.py:220:    avail_fixes = set(refactor.get_fixers_from_package(fixer_pkg))
B ./Lib/xmlrpc/server.py:289:                methods |= set(list_public_methods(self.instance))
C ./Lib/lib2to3/main.py:221:    unwanted_fixes = set(fixer_pkg + ".fix_" + fix for fix in options.nofix)
A ./Lib/lib2to3/patcomp.py:35:    skip = set((token.NEWLINE, token.INDENT, token.DEDENT))
Type A: 48 (41.0%)
A ./Lib/lib2to3/refactor.py:60:        return set([pat.type])
Type B: 59 (50.4%)
A ./Lib/lib2to3/fixes/fix_dict.py:39:iter_exempt = fixer_util.consuming_calls | set(["iter"])
Type C: 10 (8.5%)
B ./Lib/lib2to3/fixes/fix_numliterals.py:25:        elif val.startswith('0') and val.isdigit() and len(set(val)) > 1:
B ./Lib/multiprocessing/managers.py:397:            self.id_to_obj[ident] = (obj, set(exposed), method_to_typeid)
B ./Lib/packaging/run.py:237:    for dist in set(opts['args']):
A ./Lib/packaging/compiler/msvc9compiler.py:243:    interesting = set(("include", "lib", "libpath", "path"))
B ./Lib/packaging/pypi/simple.py:138:        self._mirrors = set(mirrors)
B ./Lib/packaging/pypi/xmlrpc.py:184:        return [self._projects[name.lower()] for name in set(projects)]
B ./Lib/packaging/pypi/xmlrpc.py:95:                hidden_versions = set(all_versions) - set(existing_versions)
B ./Lib/packaging/pypi/xmlrpc.py:95:                hidden_versions = set(all_versions) - set(existing_versions)
A ./Lib/wsgiref/handlers.py:25:_is_request = {
B ./Lib/xml/etree/ElementTree.py:990:    HTML_EMPTY = set(HTML_EMPTY)
B ./Lib/xmlrpc/server.py:279:        methods = set(self.funcs.keys())
B ./Lib/xmlrpc/server.py:284:                methods |= set(self.instance._listMethods())
B ./Lib/xmlrpc/server.py:289:                methods |= set(list_public_methods(self.instance))
 
Type A: 48 (41.0%)
Type B: 59 (50.4%)
Type C: 10 (8.5%)
</pre>
638

edits