This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.



Re: C++ wrapper


Gerard Jungman wrote:

> I've been thinking a lot about native C++ implementations
> for some of the GSL functionality,
> so I am interested in what people want to do.  Anyway,
> I see that this is a fairly straightforward wrapper project,
> so maybe they thought it would not be sufficiently whiz-bang
> to bring up in this forum.  But still,
> any sort of general discussion about C++ is welcome here.

It is hard to write numerical application programs in ANSI C
clearly and concisely so that they are easy for other programmers
to read, understand and maintain.  An application program
written in ANSI C will, on average, be THREE TIMES LARGER
than the same application program written in C++ --
any way you choose to measure program size.

> I am especially interested in new object-oriented designs.
> As an example, consider the following.
> You can get these situations
> where you want to calculate a series of values
> or you want both the function and its derivative
> (say for the functions associated to second-order equations,
>  Bessel being the canonical example).
> Then there are often ways to calculate the necessary extra values
> more efficiently than just doing consecutive calculations
> (this is clear for the case of a sequence of values
> related by recursion, for example).  In the procedural world,
> you deal with this by providing multiple interfaces, such as
> 
>     o calculate f(x)
>     o calculate both f(x) and f'(x)
>     o . . .
>     o calculate f_n(x), f_{n+1}(x), ... f_m(x)
>
> The multiplicity of interfaces here
> reminds one of the kind of multiplicity
> which occurs in the standard procedural interfaces
> for things like linear algebra (lapack or blas,
>  with their exponential explosion of interface
>  for all the different types and algorithms).

It isn't exponential.

> It is not nearly so bad here
> but we are still talking about
> an increase in the size of the interface specification
> by a factor of 3 or so, over the "naive" notion
> of "Bessel function".  That's annoying for various reasons.
> 
>   o user perspective: more functions to understand
>     and wade through when looking for what you want
> 
>   o advanced user perspective: may want to reuse
>     the library algorithms in his own setting,
>     but there is no way to look into a library procedure,
>     other than a disagreeable copy-paste-edit-debug cycle
> 
>   o implementor perspective: more isolated
>     but closely related functionalities to support,
>     therefore more opportunity for problems
>     like functional duplication

You should specify as little as possible
about how a numerical class library is to be implemented.
Even the most subtle assumption
about the actual data representation or algorithm implementation
may preclude a better implementation without changing the API
and breaking all of the existing applications which depend upon it.

> The abstract problem is that these more involved calculations
> and composite calculations have state,
> and the basic design problem is encapsulating that state.
> In the procedural design you encapsulate that state
> solely in the procedure local variables.
> I would like to see an object-oriented design
> where that state gets stored in the obvious place:
> some object designed for it.
> 
> Note that there is an implicit assumption here
> about layering the design, so that average end users
> are not exposed to complexity that they are not interested in.
> However, the layer will be there to support users with special needs.
> In particular, the user with the most demanding special needs
> is the implementor himself
> who would ideally be able to compose pieces of the implementation
> from well-encapsulated and orthogonal base parts.
> 
> This may seem like a lot of discussion
> for something as simple as "calculate J0(x)".
> But when you start thinking about efficient and clean implementations,
> you realize it is not so simple to get what you want
> out of the obvious mapping from functions of mathematics
> to functions in a procedural language.
> 
> I have written codes which were naively dominated
> by function evaluation and had to be structured
> in non-naive ways in order to make use of properties
> like recurrences to get the optimal performance
> (we're talking factors of 10 here, not fiddling at the 20% level).
> That sort of restructuring is often not the way
> that one thinks about the problem from a mathematical point of view
> and so you get an extra layer of confusion.
> I wonder if it would be possible to express what you mean
> in these sorts of situations rather than how the computer must do it.
> 
> Furthermore, I don't see anybody talking about this at all.
> Most of the work I see in high-performance C++
> is for vector/matrix/linear-algebra functionality.
> Very important stuff.  But the question of how to create
> efficient, clean and usable high-performance components
> in other, perhaps more classical (and therefore boring?),
> areas of numerical computing is not often addressed.
> 
> Here is a basic question: Is this going to be harder or easier
> than the work being done on vector/matrix libraries?
> I don't know. Maybe some new things will have to be invented.
> Certainly we get new abstractions to play with,
> along with their associated implementation problems.
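The idea of storing the state of a recurrence in an object designed for it
can be sketched roughly as follows.  This is only an illustration, not GSL
code and not anybody's proposed API: the class name LegendreSequence and its
member functions are hypothetical, and Legendre polynomials are swapped in
for the Bessel functions because their upward three-term recurrence is
simple and well behaved.  One object then covers the "f(x)", "f(x) and
f'(x)" and "f_n(x) ... f_m(x)" cases without a separate procedure for each.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch: an object that owns the state of the upward
// recurrence for the Legendre polynomials P_n(x) at a fixed x,
//   (n+1) P_{n+1}(x) = (2n+1) x P_n(x) - n P_{n-1}(x).
class LegendreSequence {
public:
    // Start at order 0, where P_0(x) = 1.
    explicit LegendreSequence(double x)
        : x_(x), n_(0), prev_(0.0), curr_(1.0) {}

    unsigned order() const { return n_; }   // current n
    double value() const { return curr_; }  // P_n(x)

    // P'_n(x) from the two stored values, using
    //   (x^2 - 1) P'_n(x) = n (x P_n(x) - P_{n-1}(x)).
    // This form is not usable at the endpoints x = +1 or x = -1.
    double derivative() const {
        assert(std::fabs(x_) != 1.0);
        if (n_ == 0) return 0.0;
        return n_ * (x_ * curr_ - prev_) / (x_ * x_ - 1.0);
    }

    // Advance from order n to order n+1, reusing the stored values
    // instead of recomputing them from scratch.
    void advance() {
        const double next =
            ((2.0 * n_ + 1.0) * x_ * curr_ - n_ * prev_) / (n_ + 1.0);
        prev_ = curr_;
        curr_ = next;
        ++n_;
    }

private:
    double x_;     // evaluation point
    unsigned n_;   // current order
    double prev_;  // P_{n-1}(x); ignored while n == 0
    double curr_;  // P_n(x)
};
```

A caller who wants a run of orders constructs one LegendreSequence and
calls advance() in a loop, reading value() and derivative() as needed.
A caller who only wants a single value never has to see the recurrence.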

The scalar, vector, matrix and tensor arithmetic class libraries
are the basic foundation for virtually all of the more specialized
numerical class libraries.  You need to settle on a SVMT API standard
before you can make much progress with higher level class libraries.
Beyond SVMT arithmetic are linear algebra, signal processing,
image processing, numerical integration, ODE and PDE solvers, etc.
The Table of Contents from Numerical Recipes is probably as good
a place as any to outline the contents of a numerical class library.

The FFTW plan object is a DFT object.
Take a look at
The C++ Scalar, Vector, Matrix and Tensor class library

	http://www.netwood.net/~edwin/svmt/

Download, decompress and unarchive svmt.tgz
and read "The C++ Digital Signal Processing classes"
in the PDF file .../svmt/doc/signal.pdf.
Take a look at Robert Davies' Newmat library

	http://webnz.com/robert/

He defines matrix decomposition objects.
Take a look at
the Vector, Signal and Image Processing Library (VSIPL)

	http://www.vsipl.org/

The VSIPL defines objects of all sorts --
not just vector, matrix and tensor objects.
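What these libraries have in common is an object that holds precomputed
state and is then applied many times.  A minimal sketch of that pattern
follows.  It is an assumption-laden illustration only: DftPlan is a
hypothetical class, not the FFTW, SVMT, Newmat or VSIPL interface, and it
uses a naive O(n^2) transform rather than a fast algorithm.  The point is
just that the twiddle factors are computed once, in the constructor.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical sketch of a "plan" object: the constructor precomputes
// the twiddle factors exp(-2 pi i k / n), and execute() reuses them
// for every transform of length n.
class DftPlan {
public:
    typedef std::complex<double> Complex;

    explicit DftPlan(std::size_t n) : twiddle_(n) {
        const double two_pi = 2.0 * std::acos(-1.0);
        for (std::size_t k = 0; k < n; ++k) {
            const double angle =
                -two_pi * static_cast<double>(k) / static_cast<double>(n);
            twiddle_[k] = std::polar(1.0, angle);
        }
    }

    std::size_t size() const { return twiddle_.size(); }

    // Forward DFT: out[k] = sum over j of in[j] * exp(-2 pi i j k / n).
    // Naive O(n^2) evaluation, chosen for brevity, not speed.
    std::vector<Complex> execute(const std::vector<Complex>& in) const {
        const std::size_t n = twiddle_.size();
        if (in.size() != n) {
            throw std::invalid_argument("input length does not match plan");
        }
        std::vector<Complex> out(n);
        for (std::size_t k = 0; k < n; ++k) {
            Complex sum(0.0, 0.0);
            for (std::size_t j = 0; j < n; ++j) {
                sum += in[j] * twiddle_[(j * k) % n];
            }
            out[k] = sum;
        }
        return out;
    }

private:
    std::vector<Complex> twiddle_;  // exp(-2 pi i k / n), k = 0 .. n-1
};
```

The application constructs one DftPlan per transform length and calls
execute() as often as it likes.  A better algorithm could later be put
behind the same two member functions without touching the callers.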


Numerical programmers usually aren't
software engineers or even professional programmers.
They are usually amateur programmers who were hired
and are paid to do some other kind of professional work
but are expected to write the application programs
that they need to complete their work.
They try to learn as little about programming
and/or software engineering as is required
to complete the application programs that they need to write.
They seldom recognize the advantage
in writing programs that are reusable
and easier for other application programmers
to read, understand and maintain.
They don't have time to learn and evaluate
all of the existing numerical libraries
before deciding which one is best suited
to their purposes so they are inclined
to simply implement their own version from scratch.
They may realize that their version
is not the best possible implementation
but they feel that, at least, they understand it
and have complete control over the source code
so that they can modify it if necessary.

The advantage of a standard API is that
application programmers can begin program development
before they have finished evaluating
all of the different implementations
and deciding which implementation is most suitable.
They don't need to learn a new API
for each implementation and comparisons
between implementations are relatively straightforward.
ANSI C application programmers don't really need
to know anything about object-oriented programming
or even C++ to use C++ class libraries
because the class libraries appear to be simple extensions
to the ANSI C programming language.
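As a rough illustration of what "simple extension" means here, consider
the sketch below.  The class Vec and the function axpy are hypothetical
names made up for this example, not the SVMT library's API, and the
implementation is deliberately naive (it creates temporaries).  The
application programmer only ever writes the last function, which reads
like ordinary C arithmetic on a new built-in type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal vector class; the storage is hidden from callers.
class Vec {
public:
    explicit Vec(std::size_t n, double fill = 0.0) : data_(n, fill) {}
    std::size_t size() const { return data_.size(); }
    double& operator[](std::size_t i) { return data_[i]; }
    double operator[](std::size_t i) const { return data_[i]; }

private:
    std::vector<double> data_;
};

// Scalar times vector, element by element.
inline Vec operator*(double a, const Vec& x) {
    Vec r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) r[i] = a * x[i];
    return r;
}

// Vector plus vector; both operands must have the same length.
inline Vec operator+(const Vec& x, const Vec& y) {
    assert(x.size() == y.size());
    Vec r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) r[i] = x[i] + y[i];
    return r;
}

// What the application programmer writes: no loops, no indices,
// and no knowledge of how Vec stores its elements.
inline Vec axpy(double a, const Vec& x, const Vec& y) {
    return a * x + y;
}
```

The ANSI C version of axpy needs an explicit loop, a length argument and
a caller-supplied output array, and it exposes the data representation
to every caller.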
