This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [ping5][PATCH v2] Implement benchmark script in python


On Wed, Mar 19, 2014 at 08:57:47AM +0000, Will Newton wrote:
> Hi Siddesh,
> 
> This patch looks ok to me apart from very minor nits. It runs, the
> results are comparable to the old code and the Python code looks very
> clean to me (although I think Mike has a keener eye for this than
> me!).

Thanks for the review Will!  I think I'll commit this in 48 hours
after fixing the nits you pointed out, since it's blocking a bunch of
other changes I've had in the pipeline for a while (see
siddhesh/benchmarks if you want a sneak peek).  If you think I should
wait for Mike or another maintainer to review it then let me know and
I'll hold.

Siddhesh

> 
> On 10 March 2014 07:20, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
> > Ping!
> >
> > Sending the patch again to track on patchwork.
> >
> > Siddhesh
> >
> >         * scripts/bench.pl: Remove file.
> >         * scripts/bench.py: New benchmark script.
> >         * benchtests/Makefile ($(objpfx)bench-%.c): Use it.
> >         * benchtests/README: Mention python dependency.
> >         * scripts/pylintrc: New file.
> >         * scripts/pylint: New file.
> >
> > commit 9656e13b7c71ee100e744546626d8c73cf91e8d9
> > Author: Siddhesh Poyarekar <siddhesh@redhat.com>
> > Date:   Fri Dec 6 13:51:09 2013 +0530
> >
> >     Implement benchmarking script in python
> >
> > diff --git a/benchtests/Makefile b/benchtests/Makefile
> > index 8bfb039..792f61f 100644
> > --- a/benchtests/Makefile
> > +++ b/benchtests/Makefile
> > @@ -127,5 +127,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps)
> >         { if [ -n "$($*-INCLUDE)" ]; then \
> >           cat $($*-INCLUDE); \
> >         fi; \
> > -       $(..)scripts/bench.pl $(patsubst %-inputs,%,$<); } > $@-tmp
> > +       $(..)scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp
> >         mv -f $@-tmp $@
> > diff --git a/benchtests/README b/benchtests/README
> > index a5fd8da..2a940fa 100644
> > --- a/benchtests/README
> > +++ b/benchtests/README
> > @@ -8,7 +8,9 @@ basic performance properties of the function.
> >  Running the benchmark:
> >  =====================
> >
> > -The benchmark can be executed by invoking make as follows:
> > +The benchmark needs python 2.7 or later in addition to the
> > +dependencies required to build the GNU C Library.  One may run the
> > +benchmark by invoking make as follows:
> >
> >    $ make bench
> >
> > diff --git a/scripts/bench.pl b/scripts/bench.pl
> > deleted file mode 100755
> > index 569cd51..0000000
> > diff --git a/scripts/bench.py b/scripts/bench.py
> > new file mode 100755
> > index 0000000..81ffc15
> > --- /dev/null
> > +++ b/scripts/bench.py
> > @@ -0,0 +1,299 @@
> > +#!/usr/bin/python
> > +# Copyright (C) 2014 Free Software Foundation, Inc.
> > +# This file is part of the GNU C Library.
> > +#
> > +# The GNU C Library is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU Lesser General Public
> > +# License as published by the Free Software Foundation; either
> > +# version 2.1 of the License, or (at your option) any later version.
> > +#
> > +# The GNU C Library is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +# Lesser General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU Lesser General Public
> > +# License along with the GNU C Library; if not, see
> > +# <http://www.gnu.org/licenses/>.
> > +
> > +"""Benchmark program generator script
> > +
> > +This script takes a function name as input and generates a program using
> > +an input file located in the benchtests directory.  The name of the
> > +input file should be of the form foo-inputs where 'foo' is the name of
> > +the function.
> > +"""
> > +
> > +from __future__ import print_function
> > +import sys
> > +import os
> > +import itertools
> > +
> > +# Macro definitions for functions that take no arguments.  For functions
> > +# that take arguments, the STRUCT_TEMPLATE, ARGS_TEMPLATE and
> > +# VARIANTS_TEMPLATE are used instead.
> > +DEFINES_TEMPLATE = '''
> > +#define CALL_BENCH_FUNC(v, i) %(func)s ()
> > +#define NUM_VARIANTS (1)
> > +#define NUM_SAMPLES(v) (1)
> > +#define VARIANT(v) FUNCNAME "()"
> > +'''
> > +
> > +# Structures to store arguments for the function call.  A function may
> > +# have its inputs partitioned to represent distinct performance
> > +# characteristics or distinct flavors of the function.  Each such
> > +# variant is represented by the _VARIANT structure.  The ARGS structure
> > +# represents a single set of arguments.
> > +STRUCT_TEMPLATE = '''
> > +#define CALL_BENCH_FUNC(v, i) %(func)s (%(func_args)s)
> > +
> > +struct args
> > +{
> > +%(args)s
> > +};
> > +
> > +struct _variants
> > +{
> > +  const char *name;
> > +  int count;
> > +  struct args *in;
> > +};
> > +'''
> > +
> > +# The actual input arguments.
> > +ARGS_TEMPLATE = '''
> > +struct args in%(argnum)d[%(num_args)d] = {
> > +%(args)s
> > +};
> > +'''
> > +
> > +# The actual variants, along with macros defined to access the variants.
> > +VARIANTS_TEMPLATE = '''
> > +struct _variants variants[%(num_variants)d] = {
> > +%(variants)s
> > +};
> > +
> > +#define NUM_VARIANTS %(num_variants)d
> > +#define NUM_SAMPLES(i) (variants[i].count)
> > +#define VARIANT(i) (variants[i].name)
> > +'''
> > +
> > +# Epilogue for the generated source file.
> > +EPILOGUE = '''
> > +#define BENCH_FUNC(i, j) ({%(getret)s CALL_BENCH_FUNC (i, j);})
> > +#define FUNCNAME "%(func)s"
> > +#include "bench-skeleton.c"'''
> > +
> > +
> > +def gen_source(func, directives, all_vals):
> > +    """Generate source for the function
> > +
> > +    Generate the C source for the function from the values and
> > +    directives.
> > +
> > +    Args:
> > +      func: The function name
> > +      directives: A dictionary of directives applicable to this function
> > +      all_vals: A dictionary input values
> > +    """
> > +    # The includes go in first.
> > +    for header in directives['includes']:
> > +        print('#include <%s>' % header)
> > +
> > +    for header in directives['include-sources']:
> > +        print('#include "%s"' % header)
> > +
> > +    # Print macros.  This branches out to a separate routine if
> > +    # the function takes arguments.
> > +    if not directives['args']:
> > +        print(DEFINES_TEMPLATE % {'func': func})
> > +        outargs = []
> > +    else:
> > +        outargs = _print_arg_data(func, directives, all_vals)
> > +
> > +    # Print the output variable definitions if necessary.
> > +    for out in outargs:
> > +        print(out)
> > +
> > +    # If we have a return value from the function, make sure it is
> > +    # assigned to prevent the compiler from optimizing out the
> > +    # call.
> > +    if directives['ret']:
> > +        print('static %s volatile ret;' % directives['ret'])
> > +        getret = 'ret = '
> > +    else:
> > +        getret = ''
> > +
> > +    print(EPILOGUE % {'getret': getret, 'func': func})
> > +
> > +
> > +def _print_arg_data(func, directives, all_vals):
> > +    """Print argument data
> > +
> > +    This is a helper function for gen_source that prints structure and
> > +    values for arguments and their variants and returns output arguments
> > +    if any are found.
> > +
> > +    Args:
> > +      func: Function name
> > +      directives: A dictionary of directives applicable to this function
> > +      all_vals: A dictionary input values
> > +
> > +    Returns:
> > +      Returns a list of definitions for function arguments that act as
> > +      output parameters.
> > +    """
> > +    # First, all of the definitions.  We process writing of
> > +    # CALL_BENCH_FUNC, struct args and also the output arguments
> > +    # together in a single traversal of the arguments list.
> > +    func_args = []
> > +    arg_struct = []
> > +    outargs = []
> > +
> > +    for arg, i in zip(directives['args'], itertools.count()):
> > +        if arg[0] == '<' and arg[-1] == '>':
> > +            pos = arg.rfind('*')
> > +            if pos == -1:
> > +                die('Output argument must be a pointer type')
> > +
> > +            outargs.append('static %s out%d;' % (arg[1:pos], i))
> > +            func_args.append(' &out%d' % i)
> > +        else:
> > +            arg_struct.append('  %s volatile arg%d;' % (arg, i))
> > +            func_args.append('variants[v].in[i].arg%d' % i)
> > +
> > +    print(STRUCT_TEMPLATE % {'args' : '\n'.join(arg_struct), 'func': func,
> > +                             'func_args': ', '.join(func_args)})
> > +
> > +    # Now print the values.
> > +    variants = []
> > +    for (k, vals), i in zip(all_vals.items(), itertools.count()):
> > +        out = ['  {%s},' % v for v in vals]
> > +
> > +        # Members for the variants structure list that we will
> > +        # print later.
> > +        variants.append('  {"%s(%s)", %d, in%d},' % (func, k, len(vals), i))
> > +        print(ARGS_TEMPLATE % {'argnum': i, 'num_args': len(vals),
> > +                               'args': '\n'.join(out)})
> > +
> > +    # Print the variants and the last set of macros.
> > +    print(VARIANTS_TEMPLATE % {'num_variants': len(all_vals),
> > +                               'variants': '\n'.join(variants)})
> > +    return outargs
> > +
> > +
> > +def _process_directive(d_name, d_val):
> > +    """Process a directive.
> > +
> > +    Evaluate the directive name and value passed and return the
> > +    processed value. This is a helper function for parse_file.
> > +
> > +    Args:
> > +      d_name: Name of the directive
> > +      d_value: The string value to process
> 
> This name does not match the argument name above.
> 
> > +
> > +    Returns:
> > +      The processed value, which may be the string as it is or an object
> > +      that describes the directive.
> > +    """
> > +    # Process the directive values if necessary.  name and ret don't
> > +    # need any processing.
> > +    if d_name.startswith('include'):
> > +        d_val = d_val.split(',')
> > +    elif d_name == 'args':
> > +        d_val = d_val.split(':')
> > +
> > +    # Return the values.
> > +    return d_val
> > +
> > +
> > +def parse_file(func):
> > +    """Parse an input file
> > +
> > +    Given a function name, open and parse an input file for the function
> > +    and get the necessary parameters for the generated code and the list
> > +    of inputs.
> > +
> > +    Args:
> > +      func: The function name
> > +
> > +    Returns:
> > +      A tuple of two elements, one a dictionary of directives and the
> > +      other a dictionary of all input values.
> > +    """
> > +    all_vals = {}
> > +    # Valid directives.
> > +    directives = {
> > +            'name': '',
> > +            'args': [],
> > +            'includes': [],
> > +            'include-sources': [],
> > +            'ret': ''
> > +    }
> > +
> > +    try:
> > +        with open('%s-inputs' % func) as f:
> > +            for line in f:
> > +                # Look for directives and parse it if found.
> > +                if line.startswith('##'):
> > +                    try:
> > +                        d_name, d_val = line[2:].split(':', 1)
> > +                        d_name = d_name.strip()
> > +                        d_val = d_val.strip()
> > +                        directives[d_name] = _process_directive(d_name, d_val)
> > +                    except (IndexError, KeyError):
> > +                        die('Invalid directive: %s' % line[2:])
> > +
> > +                # Skip blank lines and comments.
> > +                line = line.split('#', 1)[0].rstrip()
> > +                if not line:
> > +                    continue
> > +
> > +                # Otherwise, we're an input.  Add to the appropriate
> > +                # input set.
> > +                cur_name = directives['name']
> > +                all_vals.setdefault(cur_name, [])
> > +                all_vals[cur_name].append(line)
> > +    except IOError as ex:
> > +        die("Failed to open input file (%s): %s" % (ex.filename, ex.strerror))
> > +
> > +    return directives, all_vals
> > +
> > +
> > +def die(msg):
> > +    """Exit with an error
> > +
> > +    Prints an error message to the standard error stream and exits with
> > +    a non-zero status.
> > +
> > +    Args:
> > +      msg: The error message to print to standard error.
> 
> Trailing period is inconsistent with other docstrings.
> 
> > +    """
> > +    print('%s\n' % msg, file=sys.stderr)
> > +    sys.exit(os.EX_DATAERR)
> > +
> > +
> > +def main(args):
> > +    """Main function
> > +
> > +    Use the first command line argument as function name and parse its
> > +    input file to generate C source that calls the function repeatedly
> > +    for the input.
> > +
> > +    Args:
> > +      args: The command line arguments with the program name dropped.
> 
> As above.
> 
> > +
> > +    Returns:
> > +      os.EX_USAGE on error and os.EX_OK on success.
> > +    """
> > +    if len(args) != 1:
> > +        print('Usage: %s <function>' % sys.argv[0])
> > +        return os.EX_USAGE
> > +
> > +    directives, all_vals = parse_file(args[0])
> > +    gen_source(args[0], directives, all_vals)
> > +    return os.EX_OK
> > +
> > +
> > +if __name__ == '__main__':
> > +    sys.exit(main(sys.argv[1:]))
> > diff --git a/scripts/pylint b/scripts/pylint
> > new file mode 100755
> > index 0000000..49a775e
> > --- /dev/null
> > +++ b/scripts/pylint
> > @@ -0,0 +1,5 @@
> > +#!/bin/sh
> > +# Simple wrapper around the pylint program that uses the pylintrc file to
> > +# validate the source code in files passed on command line.
> > +
> > +exec pylint --rcfile "${0%/*}/pylintrc" "$@"
> > diff --git a/scripts/pylintrc b/scripts/pylintrc
> > new file mode 100644
> > index 0000000..a05ddfd
> > --- /dev/null
> > +++ b/scripts/pylintrc
> > @@ -0,0 +1,274 @@
> > +[MASTER]
> > +
> > +# Specify a configuration file.
> > +#rcfile=
> > +
> > +# Python code to execute, usually for sys.path manipulation such as
> > +# pygtk.require().
> > +#init-hook=
> > +
> > +# Profiled execution.
> > +profile=no
> > +
> > +# Add files or directories to the blacklist. They should be base names, not
> > +# paths.
> > +ignore=CVS
> > +
> > +# Pickle collected data for later comparisons.
> > +persistent=yes
> > +
> > +# List of plugins (as comma separated values of python modules names) to load,
> > +# usually to register additional checkers.
> > +load-plugins=
> > +
> > +
> > +[MESSAGES CONTROL]
> > +
> > +# Enable the message, report, category or checker with the given id(s). You can
> > +# either give multiple identifier separated by comma (,) or put this option
> > +# multiple time. See also the "--disable" option for examples.
> > +#enable=
> > +
> > +# Disable the message, report, category or checker with the given id(s). You
> > +# can either give multiple identifiers separated by comma (,) or put this
> > +# option multiple times (only on the command line, not in the configuration
> > +# file where it should appear only once).You can also use "--disable=all" to
> > +# disable everything first and then reenable specific checks. For example, if
> > +# you want to run only the similarities checker, you can use "--disable=all
> > +# --enable=similarities". If you want to run only the classes checker, but have
> > +# no Warning level messages displayed, use"--disable=all --enable=classes
> > +# --disable=W"
> > +#disable=
> > +
> > +
> > +[REPORTS]
> > +
> > +# Set the output format. Available formats are text, parseable, colorized, msvs
> > +# (visual studio) and html. You can also give a reporter class, eg
> > +# mypackage.mymodule.MyReporterClass.
> > +output-format=text
> > +
> > +# Put messages in a separate file for each module / package specified on the
> > +# command line instead of printing them on stdout. Reports (if any) will be
> > +# written in a file name "pylint_global.[txt|html]".
> > +files-output=no
> > +
> > +# Tells whether to display a full report or only the messages
> > +reports=yes
> > +
> > +# Python expression which should return a note less than 10 (10 is the highest
> > +# note). You have access to the variables errors warning, statement which
> > +# respectively contain the number of errors / warnings messages and the total
> > +# number of statements analyzed. This is used by the global evaluation report
> > +# (RP0004).
> > +evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)
> > +
> > +# Add a comment according to your evaluation note. This is used by the global
> > +# evaluation report (RP0004).
> > +comment=no
> > +
> > +# Template used to display messages. This is a python new-style format string
> > +# used to format the massage information. See doc for all details
> > +#msg-template=
> > +
> > +
> > +[MISCELLANEOUS]
> > +
> > +# List of note tags to take in consideration, separated by a comma.
> > +notes=FIXME,XXX,TODO
> > +
> > +
> > +[SIMILARITIES]
> > +
> > +# Minimum lines number of a similarity.
> > +min-similarity-lines=4
> > +
> > +# Ignore comments when computing similarities.
> > +ignore-comments=yes
> > +
> > +# Ignore docstrings when computing similarities.
> > +ignore-docstrings=yes
> > +
> > +# Ignore imports when computing similarities.
> > +ignore-imports=no
> > +
> > +
> > +[BASIC]
> > +
> > +# Required attributes for module, separated by a comma
> > +required-attributes=
> > +
> > +# List of builtins function names that should not be used, separated by a comma
> > +bad-functions=map,filter,apply,input
> > +
> > +# Regular expression which should only match correct module names
> > +module-rgx=(([a-z_][a-z0-9_]*)|([A-Z][a-zA-Z0-9]+))$
> > +
> > +# Regular expression which should only match correct module level names
> > +const-rgx=(([A-Z_][A-Z0-9_]*)|(__.*__))$
> > +
> > +# Regular expression which should only match correct class names
> > +class-rgx=[A-Z_][a-zA-Z0-9]+$
> > +
> > +# Regular expression which should only match correct function names
> > +function-rgx=[a-z_][a-z0-9_]{2,30}$
> > +
> > +# Regular expression which should only match correct method names
> > +method-rgx=[a-z_][a-z0-9_]{2,30}$
> > +
> > +# Regular expression which should only match correct instance attribute names
> > +attr-rgx=[a-z_][a-z0-9_]{2,30}$
> > +
> > +# Regular expression which should only match correct argument names
> > +argument-rgx=[a-z_][a-z0-9_]{2,30}$
> > +
> > +# Regular expression which should only match correct variable names
> > +variable-rgx=[a-z_][a-z0-9_]{2,30}$
> > +
> > +# Regular expression which should only match correct attribute names in class
> > +# bodies
> > +class-attribute-rgx=([A-Za-z_][A-Za-z0-9_]{2,30}|(__.*__))$
> > +
> > +# Regular expression which should only match correct list comprehension /
> > +# generator expression variable names
> > +inlinevar-rgx=[A-Za-z_][A-Za-z0-9_]*$
> > +
> > +# Good variable names which should always be accepted, separated by a comma
> > +# f is a useful name for a file descriptor
> > +good-names=f,i,j,k,ex,Run,_
> > +
> > +# Bad variable names which should always be refused, separated by a comma
> > +bad-names=foo,bar,baz,toto,tutu,tata
> > +
> > +# Regular expression which should only match function or class names that do
> > +# not require a docstring.
> > +no-docstring-rgx=__.*__
> > +
> > +# Minimum line length for functions/classes that require docstrings, shorter
> > +# ones are exempt.
> > +docstring-min-length=-1
> > +
> > +
> > +[VARIABLES]
> > +
> > +# Tells whether we should check for unused import in __init__ files.
> > +init-import=no
> > +
> > +# A regular expression matching the beginning of the name of dummy variables
> > +# (i.e. not used).
> > +dummy-variables-rgx=_$|dummy
> > +
> > +# List of additional names supposed to be defined in builtins. Remember that
> > +# you should avoid to define new builtins when possible.
> > +additional-builtins=
> > +
> > +
> > +[FORMAT]
> > +
> > +# Maximum number of characters on a single line.
> > +max-line-length=79
> > +
> > +# Regexp for a line that is allowed to be longer than the limit.
> > +ignore-long-lines=^\s*(# )?<?https?://\S+>?$
> > +
> > +# Maximum number of lines in a module
> > +max-module-lines=1000
> > +
> > +# String used as indentation unit. This is usually " " (4 spaces) or "\t" (1
> > +# tab).
> > +indent-string='    '
> > +
> > +
> > +[TYPECHECK]
> > +
> > +# Tells whether missing members accessed in mixin class should be ignored. A
> > +# mixin class is detected if its name ends with "mixin" (case insensitive).
> > +ignore-mixin-members=yes
> > +
> > +# List of classes names for which member attributes should not be checked
> > +# (useful for classes with attributes dynamically set).
> > +ignored-classes=SQLObject
> > +
> > +# When zope mode is activated, add a predefined set of Zope acquired attributes
> > +# to generated-members.
> > +zope=no
> > +
> > +# List of members which are set dynamically and missed by pylint inference
> > +# system, and so shouldn't trigger E0201 when accessed. Python regular
> > +# expressions are accepted.
> > +generated-members=REQUEST,acl_users,aq_parent
> > +
> > +
> > +[CLASSES]
> > +
> > +# List of interface methods to ignore, separated by a comma. This is used for
> > +# instance to not check methods defines in Zope's Interface base class.
> > +ignore-iface-methods=isImplementedBy,deferred,extends,names,namesAndDescriptions,queryDescriptionFor,getBases,getDescriptionFor,getDoc,getName,getTaggedValue,getTaggedValueTags,isEqualOrExtendedBy,setTaggedValue,isImplementedByInstancesOf,adaptWith,is_implemented_by
> > +
> > +# List of method names used to declare (i.e. assign) instance attributes.
> > +defining-attr-methods=__init__,__new__,setUp
> > +
> > +# List of valid names for the first argument in a class method.
> > +valid-classmethod-first-arg=cls
> > +
> > +# List of valid names for the first argument in a metaclass class method.
> > +valid-metaclass-classmethod-first-arg=mcs
> > +
> > +
> > +[IMPORTS]
> > +
> > +# Deprecated modules which should not be used, separated by a comma
> > +deprecated-modules=regsub,TERMIOS,Bastion,rexec
> > +
> > +# Create a graph of every (i.e. internal and external) dependencies in the
> > +# given file (report RP0402 must not be disabled)
> > +import-graph=
> > +
> > +# Create a graph of external dependencies in the given file (report RP0402 must
> > +# not be disabled)
> > +ext-import-graph=
> > +
> > +# Create a graph of internal dependencies in the given file (report RP0402 must
> > +# not be disabled)
> > +int-import-graph=
> > +
> > +
> > +[DESIGN]
> > +
> > +# Maximum number of arguments for function / method
> > +max-args=5
> > +
> > +# Argument names that match this expression will be ignored. Default to name
> > +# with leading underscore
> > +ignored-argument-names=_.*
> > +
> > +# Maximum number of locals for function / method body
> > +max-locals=15
> > +
> > +# Maximum number of return / yield for function / method body
> > +max-returns=6
> > +
> > +# Maximum number of branch for function / method body
> > +max-branches=12
> > +
> > +# Maximum number of statements in function / method body
> > +max-statements=50
> > +
> > +# Maximum number of parents for a class (see R0901).
> > +max-parents=7
> > +
> > +# Maximum number of attributes for a class (see R0902).
> > +max-attributes=7
> > +
> > +# Minimum number of public methods for a class (see R0903).
> > +min-public-methods=2
> > +
> > +# Maximum number of public methods for a class (see R0904).
> > +max-public-methods=20
> > +
> > +
> > +[EXCEPTIONS]
> > +
> > +# Exceptions that will emit a warning when being caught. Defaults to
> > +# "Exception"
> > +overgeneral-exceptions=Exception
> 
> 
> 
> -- 
> Will Newton
> Toolchain Working Group, Linaro

Attachment: pgppUD0D59_VK.pgp
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]