This is the mail archive of the gsl-discuss@sourceware.org mailing list for the GSL project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Sparse matrix extension

From: Patrick Alken <alken at colorado dot edu>
To: "gsl-discuss at sourceware dot org" <gsl-discuss at sourceware dot org>
Date: Sat, 13 Feb 2016 12:41:48 -0700
Subject: Re: Sparse matrix extension
Authentication-results: sourceware.org; auth=none
Authentication-results: sourceware.org; dkim=none (message not signed) header.d=none;sourceware.org; dmarc=none action=none header.from=colorado.edu;
References: <CAMWWPT3uJj4Vrn7ut6+F18gY===zd6+1r1UJhz0hcCj--zwtdg at mail dot gmail dot com> <CAMWWPT3Y=x-vYphaV+2gHPb9WZqEfYdDxs4T0KqBq987VjvDQA at mail dot gmail dot com> <569E6C33 dot 1090505 at colorado dot edu> <CAMWWPT2kMe=R0qUz4P6n_xzG+oW-8pOFwCtBuaritZ5MF6jzAg at mail dot gmail dot com> <569EA1A9 dot 2080101 at colorado dot edu> <CAMWWPT2d=PnK1cexcZ+OLQAYJudRk9QLGac-Wcn33OaES5UjJA at mail dot gmail dot com> <56B689B1 dot 5090005 at colorado dot edu> <CAMWWPT1mog0HrviL1tMo=f-rrSc2PhamWDGg7ZYLrVnnqkY3ng at mail dot gmail dot com> <56B77E13 dot 1000306 at colorado dot edu> <CAMWWPT2KTLjufNdwW=y-xaJ1dKkxDNQwgat4DB4ZwKDrK_fZSA at mail dot gmail dot com> <56B7A59D dot 5040707 at colorado dot edu> <CAMWWPT0CX+ti2j6QM8YmHqv+n8yg7sK69dtSdzKOQMFx3pXeBA at mail dot gmail dot com> <56B7B85C dot 10508 at colorado dot edu> <CAMWWPT1ELN9_BFLG=KUQ=Z6rC6ytYGTm0_K-=4-urJ6HGxfFJQ at mail dot gmail dot com> <CAMWWPT0qUccM6cNkX6fWezmj09SdcgYqYzGE3eoqetpuyZW6Vg at mail dot gmail dot com> <56BB5D84 dot 20804 at colorado dot edu> <CAMWWPT0a77RXNkg=a=Y8fJoxbYKpz-7tgvsXGPQNd4X_e=pNag at mail dot gmail dot com>
Spamdiagnosticmetadata: NSPM
Spamdiagnosticoutput: 1:23

Ok I've added the MatrixMarket stuff to fprintf/fscanf. I also merged
all the new sparse matrix stuff into the master branch on git.

I've added almost your whole patch, except for all the stuff in
spmanip.c and also the gt_elements() routines in spprop.c. I'm not sure
these routines belong in a general purpose matrix library....although I
can possibly see a need for the row/col multiplication stuff, since we
can't easily make vector views of rows/columns of sparse matrices..

I'll need to think some more about this.

Also I haven't yet modified dgemm, because I'm hesitant to use the
dynamic allocation method in your patch. Also I found a sparse blas
document on netlib which tries to outline a standard and I realize our
sparse blas routines don't adhere to this standard...so I may try to
redesign everything to follow that document.

Anyway thanks a lot for all your work on this.

Patrick

On 02/12/2016 03:42 AM, Alexis Tantet wrote:
> I corrected _fscanf. There was an error when reading the comment header.
>
> On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote:
>> I'm in favor of simplicity and easy-parsing, so matrix market sounds good to
>> me. I'll take a look at your latest code in the next few days.
>>
>> Patrick
>>
>>
>> On 02/10/2016 06:16 AM, Alexis Tantet wrote:
>>> Hi Patrick,
>>>
>>> Regarding the file format for sparse matrices, the one I have coded
>>> actually happen to be the coordinate format implemented by Matrix
>>> Market (the platform to share test data such as sparse matrices), with
>>> the addition of a matrix type header:
>>> http://math.nist.gov/MatrixMarket/formats.html
>>>
>>> It is also written that "Harwell-Boeing" is the most commonly used
>>> sparse matrix format, but that:
>>> "Unfortunately the HB specification is somewhat complex difficult to
>>> parse from languages other than Fortran, biased in favor of compressed
>>> column representation and not easily extensible. Some of these factors
>>> may have impeded the more widespread use of the HB format"
>>>
>>> It seems to me that complying to the Matrix Market coordinate format
>>> would be the right choice, in terms of ease of implementation,
>>> compliance to other packages and user-friendliness. I could update the
>>> print/scan functions accordingly (mostly handling the header). What do
>>> you think?
>>>
>>> Best,
>>> Alexis
>>>
>>>
>>>
>>>
>>> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com>
>>> wrote:
>>>> Ok, my mistake, now I see where I got confused.
>>>> I had in mind to add all the elements first to the triplets and only
>>>> while converting to compressed sum up the duplicates.
>>>> While, indeed, if there's a way you can sum up the duplicates directly
>>>> while adding them to the triplet matrix (thanks to _ptr), this is more
>>>> handy and efficient.
>>>>
>>>> Thanks for the clarification,
>>>> Alexis
>>>>
>>>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu>
>>>> wrote:
>>>>> By design, gsl_spmatrix_set won't allow you to do this.
>>>>>
>>>>> If you add element (i, j, x) and then later to try add element (i, j,
>>>>> y), gsl_spmatrix_set will detect that there exists an element in the (i,
>>>>> j) spot and it will simply change x to y - the value of x will be
>>>>> overwritten by y. This is the same behavior as gsl_matrix_set.
>>>>>
>>>>> So no duplicates are allowed by design. If you have such an application
>>>>> where you want to keep track of duplicates, you could do the following:
>>>>>
>>>>> double *ptr = gsl_spmatrix_ptr(m, i, j);
>>>>> if (ptr)
>>>>>    *ptr += x; /* sum duplicate values */
>>>>> else
>>>>>     gsl_spmatrix_set(m, i, j, x);   /* initalize to x */
>>>>>
>>>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote:
>>>>>> I'm not sure I got your last point. I have the following situation in
>>>>>> mind:
>>>>>>
>>>>>> Start to construct a transition matrix in triplet format, adding one
>>>>>> element after another.
>>>>>> In this particular example, each element is one count of a transition
>>>>>> from (state, box, etc.) i to j,
>>>>>> so I add elements  (i, j, 1) to the triplet object, with possibly
>>>>>> duplicates.
>>>>>> What happen to these duplicates in the binary tree?
>>>>>>
>>>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates
>>>>>> to be summed up, so that element (i, j) counts transitions from i to j
>>>>>> (and no duplicates exist after compression).
>>>>>>
>>>>>> Is this more clear?
>>>>>>
>>>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu>
>>>>>> wrote:
>>>>>>> Hi Alexis,
>>>>>>>
>>>>>>>>> I'm not sure what you mean. I've added a new function
>>>>>>>>> gsl_spmatrix_ptr
>>>>>>>>> to the git, which as far as I can tell does exactly what your
>>>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j)
>>>>>>>>> element,
>>>>>>>>> and if found returns a pointer. If not found a null pointer is
>>>>>>>>> returned.
>>>>>>>>> This makes it easy for the user to modify A(i,j) after it has been
>>>>>>>>> added
>>>>>>>>> to the matrix. Are you thinking of something else? Can you point me
>>>>>>>>> to
>>>>>>>>> the Eigen routine?
>>>>>>>>>
>>>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress,
>>>>>>>> with the difference that gsl_spmatrix_ptr is used instead of
>>>>>>>> gsl_spmatrix_set,
>>>>>>>> so has to build the compressed matrix from triplets, summing the
>>>>>>>> duplicates, instead of replacing them.
>>>>>>>> This is what is done here :
>>>>>>>> The
>>>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Alexis
>>>>>>> I'm not sure why a user would ever need to do this. The whole point of
>>>>>>> the binary tree structure in the triplet storage is to efficiently
>>>>>>> find
>>>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on
>>>>>>> an element which is already been previously set, it can find that
>>>>>>> element with a binary search (rather than linearly searching the
>>>>>>> arrays)
>>>>>>> and change the value of that element.
>>>>>>>
>>>>>>> Therefore, the way the triplet storage is designed, there is will
>>>>>>> never
>>>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n])
>>>>>>> will be unique for each n <= nz.
>>>>>>>
>>>>>>> Am I missing something?
>>>>>>>
>>>>>>> Patrick
>>>>>>
>>>>
>>>> --
>>>> Alexis Tantet
>>>
>>>
>
>

Follow-Ups:
- Re: Sparse matrix extension
  - From: Alexis Tantet

References:
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Alexis Tantet
- Re: Sparse matrix extension
  - From: Patrick Alken
- Re: Sparse matrix extension
  - From: Alexis Tantet

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]