This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



RE: String Functions for x86-64 (memcpy)


In each table below, the left-hand columns show the results for the proposed strncpy and the right-hand columns show the results for the current one.  A "!" is appended to every line where the proposed strncpy is slower.  Duplicate lines were filtered out with sort.
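For anyone who wants to reproduce the marking, here is a minimal sketch of how the "!" flag and the duplicate filtering could be done. It is not the script used for this message: the helper names mark_slower and annotate are hypothetical, the comparison assumes "slower" means a strictly larger number in the first (strncpy) column, and sorted(set(...)) only approximates what sort -u would do.

```python
import re

# Hypothetical helpers for illustration; not part of glibc or of the original post.
# Each half of a side-by-side row has exactly one ':' (after the alignment pair),
# and the first number after it is the strncpy timing for that half.
TIMING = re.compile(r":\s+(\d+)")


def mark_slower(line: str) -> str:
    """Append '!' when the left (proposed) timing is larger than the right (current)."""
    times = TIMING.findall(line)
    if len(times) != 2:
        return line  # header or otherwise unparsable line: leave untouched
    proposed, current = (int(t) for t in times)
    if proposed > current:  # assumption: strictly larger counts as slower
        return line.rstrip() + "      !"
    return line.rstrip()


def annotate(lines):
    """Drop duplicate rows (roughly what sort -u does), then flag the slower ones."""
    unique = sorted({line.rstrip("\n") for line in lines})
    return [mark_slower(line) for line in unique]
```

Feeding the unmarked side-by-side rows to annotate() yields sorted, de-duplicated rows, with the flag on those where the left-hand strncpy column is larger.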

First, on an Athlon 64:

                                strncpy simple_strncpy  stupi                                   strncpy simple_strncpy  stupi
Length    2, n    4, alignment  7  2:   29      19      43    | Length    2, n    4, alignment  7  2:   16      19      43      !
Length    4, n    2, alignment  2  7:   14      12      44      Length    4, n    2, alignment  2  7:   14      12      44      !
Length    4, n    8, alignment  6  4:   46      44      58    | Length    4, n    8, alignment  6  4:   29      44      58      !
Length    8, n    4, alignment  4  6:   24      18      52    | Length    8, n    4, alignment  4  6:   16      18      52      !
Length    8, n   16, alignment  0  0:   41      66      74    | Length    8, n   16, alignment  0  0:   50      66      74
Length    8, n   16, alignment  5  6:   88      71      89    | Length    8, n   16, alignment  5  6:   50      71      89      !
Length    8, n   16, alignment  7  2:   85      64      82    | Length    8, n   16, alignment  7  2:   47      64      82      !
Length   16, n    8, alignment  6  5:   59      46      60    | Length   16, n    8, alignment  6  5:   25      46      60      !
Length   16, n   16, alignment  0  4:   64      74      124   | Length   16, n   16, alignment  0  4:   66      74      124
Length   16, n   16, alignment  1  1:   40      74      124   | Length   16, n   16, alignment  1  1:   44      74      124
Length   16, n   16, alignment  1  2:   47      80      133   | Length   16, n   16, alignment  1  2:   52      80      133
Length   16, n   16, alignment  2  1:   41      74      127   | Length   16, n   16, alignment  2  1:   45      74      127
Length   16, n   16, alignment  2  2:   64      68      133   | Length   16, n   16, alignment  2  2:   45      76      133     !
Length   16, n   16, alignment  2  4:   66      76      140   | Length   16, n   16, alignment  2  4:   52      76      140     !
Length   16, n   16, alignment  2  5:   66      76      145   | Length   16, n   16, alignment  2  5:   68      76      145
Length   16, n   16, alignment  3  3:   64      73      119   | Length   16, n   16, alignment  3  3:   65      73      119
Length   16, n   16, alignment  3  6:   69      76      124   | Length   16, n   16, alignment  3  6:   64      76      124     !
Length   16, n   16, alignment  4  0:   74      68      116   | Length   16, n   16, alignment  4  0:   58      68      116     !
Length   16, n   16, alignment  4  2:   77      71      122   | Length   16, n   16, alignment  4  2:   42      71      122     !
Length   16, n   16, alignment  4  4:   74      72      117   | Length   16, n   16, alignment  4  4:   65      72      117     !
Length   16, n   16, alignment  4  6:   51      80      127   | Length   16, n   16, alignment  4  6:   73      80      127
Length   16, n   16, alignment  5  2:   50      68      113   | Length   16, n   16, alignment  5  2:   61      68      113
Length   16, n   16, alignment  5  5:   50      74      114   | Length   16, n   16, alignment  5  5:   66      74      114
Length   16, n   16, alignment  6  3:   63      68      115   | Length   16, n   16, alignment  6  3:   58      68      115     !
Length   16, n   16, alignment  6  4:   63      71      110     Length   16, n   16, alignment  6  4:   63      71      110
Length   16, n   16, alignment  6  6:   63      75      110   | Length   16, n   16, alignment  6  6:   64      75      110
Length   16, n   16, alignment  6  7:   63      68      119   | Length   16, n   16, alignment  6  7:   69      82      127
Length   16, n   16, alignment  7  6:   64      68      107   | Length   16, n   16, alignment  7  6:   61      74      107     !
Length   16, n   16, alignment  7  7:   64      73      108   | Length   16, n   16, alignment  7  7:   65      73      108
Length   16, n   32, alignment  0  0:   47      126     145   | Length   16, n   32, alignment  0  0:   125     123     145
Length   16, n   32, alignment  4  0:   86      119     160   | Length   16, n   32, alignment  4  0:   119     116     160
Length   16, n   32, alignment  6  4:   88      121     156   | Length   16, n   32, alignment  6  4:   120     119     156
Length   32, n   16, alignment  0  0:   64      75      93    | Length   32, n   16, alignment  0  0:   65      72      93
Length   32, n   16, alignment  0  4:   64      76      93    | Length   32, n   16, alignment  0  4:   69      74      93
Length   32, n   16, alignment  7  2:   65      76      104   | Length   32, n   16, alignment  7  2:   61      68      97      !
Length   32, n   64, alignment  0  0:   56      211     235   | Length   32, n   64, alignment  0  0:   217     209     235
Length   32, n   64, alignment  3  2:   96      211     252   | Length   32, n   64, alignment  3  2:   207     209     252
Length   32, n   64, alignment  5  6:   103     229     265   | Length   32, n   64, alignment  5  6:   227     229     265
Length   64, n   32, alignment  0  0:   72      128     151   | Length   64, n   32, alignment  0  0:   110     128     151
Length   64, n   32, alignment  2  3:   75      135     218   | Length   64, n   32, alignment  2  3:   121     135     218
Length   64, n   32, alignment  6  4:   70      123     171   | Length   64, n   32, alignment  6  4:   100     123     171
Length   64, n  128, alignment  0  0:   76      371     431   | Length   64, n  128, alignment  0  0:   399     369     431
Length   64, n  128, alignment  2  4:   151     402     494   | Length   64, n  128, alignment  2  4:   426     402     494
Length   64, n  128, alignment  4  0:   104     356     447   | Length   64, n  128, alignment  4  0:   369     356     447
Length  128, n   64, alignment  0  0:   88      225     282   | Length  128, n   64, alignment  0  0:   195     225     282
Length  128, n   64, alignment  4  2:   84      225     298   | Length  128, n   64, alignment  4  2:   169     225     298
Length  128, n   64, alignment  5  6:   77      260     326   | Length  128, n   64, alignment  5  6:   221     260     326
Length  128, n  256, alignment  0  0:   126     691     791   | Length  128, n  256, alignment  0  0:   765     689     791
Length  128, n  256, alignment  1  6:   203     676     826   | Length  128, n  256, alignment  1  6:   778     676     826
Length  128, n  256, alignment  3  2:   152     691     810   | Length  128, n  256, alignment  3  2:   723     689     810
Length  256, n  128, alignment  0  0:   121     417     514   | Length  256, n  128, alignment  0  0:   370     417     514
Length  256, n  128, alignment  4  0:   106     404     530   | Length  256, n  128, alignment  4  0:   309     404     530
Length  256, n  128, alignment  6  1:   119     404     522   | Length  256, n  128, alignment  6  1:   309     404     522
Length  256, n  512, alignment  0  0:   203     1331    1511  | Length  256, n  512, alignment  0  0:   1495    1329    1511
Length  256, n  512, alignment  2  4:   315     1493    1672  | Length  256, n  512, alignment  2  4:   1610    1494    1672
Length  512, n  256, alignment  0  0:   202     801     978   | Length  512, n  256, alignment  0  0:   715     801     978
Length  512, n  256, alignment  3  2:   188     801     998   | Length  512, n  256, alignment  3  2:   630     801     998
Length  512, n 1024, alignment  0  0:   355     2611    2951  | Length  512, n 1024, alignment  0  0:   2945    2609    2951
Length  512, n 1024, alignment  1  6:   522     2596    2986  | Length  512, n 1024, alignment  1  6:   3018    2596    2986
Length 1024, n  512, alignment  0  0:   338     1569    1906  | Length 1024, n  512, alignment  0  0:   1398    1569    1906
Length 1024, n  512, alignment  2  4:   395     1886    2197  | Length 1024, n  512, alignment  2  4:   1638    1905    2197
Length 2048, n 1024, alignment  0  0:   610     3105    3762  | Length 2048, n 1024, alignment  0  0:   2749    3105    3762
Length 2048, n 1024, alignment  1  6:   741     3092    3802  | Length 2048, n 1024, alignment  1  6:   2915    3092    3802

Now, on a P4:

                                strncpy simple_strncpy  stupi                                   strncpy simple_strncpy  stupi
Length    2, n    4, alignment  7  2:   8       0       40    | Length    2, n    4, alignment  7  2:   8       8       40
Length    4, n    2, alignment  2  7:   0       0       32    | Length    4, n    2, alignment  2  7:   8       8       40
Length    4, n    8, alignment  6  4:   24      32      56    | Length    4, n    8, alignment  6  4:   16      40      56      !
Length    8, n    4, alignment  4  6:   0       0       40    | Length    8, n    4, alignment  4  6:   8       8       48
Length    8, n   16, alignment  0  0:   64      64      72    | Length    8, n   16, alignment  0  0:   48      72      80      !
Length    8, n   16, alignment  5  6:   104     64      104   | Length    8, n   16, alignment  5  6:   48      72      112     !
Length    8, n   16, alignment  7  2:   104     56      80    | Length    8, n   16, alignment  7  2:   48      72      104     !
Length   16, n    8, alignment  6  5:   32      40      40    | Length   16, n    8, alignment  6  5:   8       48      48      !
Length   16, n   16, alignment  0  4:   32      112     88    | Length   16, n   16, alignment  0  4:   48      120     256
Length   16, n   16, alignment  1  1:   40      112     88    | Length   16, n   16, alignment  1  1:   48      120     96
Length   16, n   16, alignment  1  2:   40      112     104   | Length   16, n   16, alignment  1  2:   48      120     104
Length   16, n   16, alignment  2  1:   40      112     104   | Length   16, n   16, alignment  2  1:   40      120     112
Length   16, n   16, alignment  2  2:   48      112     88    | Length   16, n   16, alignment  2  2:   48      120     96
Length   16, n   16, alignment  2  4:   40      112     88    | Length   16, n   16, alignment  2  4:   48      120     96
Length   16, n   16, alignment  2  5:   96      200     216   | Length   16, n   16, alignment  2  5:   48      120     112     !
Length   16, n   16, alignment  3  3:   40      112     88    | Length   16, n   16, alignment  3  3:   48      120     88
Length   16, n   16, alignment  3  6:   40      112     88    | Length   16, n   16, alignment  3  6:   40      120     88
Length   16, n   16, alignment  4  0:   40      112     88    | Length   16, n   16, alignment  4  0:   40      120     96
Length   16, n   16, alignment  4  2:   48      112     104   | Length   16, n   16, alignment  4  2:   48      120     112
Length   16, n   16, alignment  4  4:   40      112     88    | Length   16, n   16, alignment  4  4:   48      120     96
Length   16, n   16, alignment  4  6:   40      112     112   | Length   16, n   16, alignment  4  6:   48      120     112
Length   16, n   16, alignment  5  2:   88      184     168   | Length   16, n   16, alignment  5  2:   48      120     88      !
Length   16, n   16, alignment  5  5:   96      192     168   | Length   16, n   16, alignment  5  5:   40      120     88      !
Length   16, n   16, alignment  6  3:   40      112     96    | Length   16, n   16, alignment  6  3:   40      120     104
Length   16, n   16, alignment  6  4:   48      112     88    | Length   16, n   16, alignment  6  4:   48      120     88
Length   16, n   16, alignment  6  6:   96      200     184   | Length   16, n   16, alignment  6  6:   48      120     88      !
Length   16, n   16, alignment  6  7:   40      112     112   | Length   16, n   16, alignment  6  7:   40      120     120
Length   16, n   16, alignment  7  6:   40      112     80    | Length   16, n   16, alignment  7  6:   48      120     88
Length   16, n   16, alignment  7  7:   40      112     88    | Length   16, n   16, alignment  7  7:   48      120     88
Length   16, n   32, alignment  0  0:   64      184     136   | Length   16, n   32, alignment  0  0:   104     184     136
Length   16, n   32, alignment  4  0:   112     176     136   | Length   16, n   32, alignment  4  0:   104     184     160     !
Length   16, n   32, alignment  6  4:   112     184     160   | Length   16, n   32, alignment  6  4:   96      184     152     !
Length   32, n   16, alignment  0  0:   40      112     56    | Length   32, n   16, alignment  0  0:   48      120     64
Length   32, n   16, alignment  0  4:   32      112     96    | Length   32, n   16, alignment  0  4:   48      120     64
Length   32, n   16, alignment  7  2:   40      112     64    | Length   32, n   16, alignment  7  2:   48      120     72
Length   32, n   64, alignment  0  0:   80      392     312   | Length   32, n   64, alignment  0  0:   296     368     320
Length   32, n   64, alignment  3  2:   280     576     592   | Length   32, n   64, alignment  3  2:   288     368     352
Length   32, n   64, alignment  5  6:   128     392     336   | Length   32, n   64, alignment  5  6:   312     368     352
Length   64, n   32, alignment  0  0:   56      208     144   | Length   64, n   32, alignment  0  0:   128     208     144
Length   64, n   32, alignment  2  3:   120     360     296   | Length   64, n   32, alignment  2  3:   160     208     192
Length   64, n   32, alignment  6  4:   128     200     152   | Length   64, n   32, alignment  6  4:   136     216     160
Length   64, n  128, alignment  0  0:   112     624     552   | Length   64, n  128, alignment  0  0:   520     696     552
Length   64, n  128, alignment  2  4:   144     624     568   | Length   64, n  128, alignment  2  4:   520     688     592
Length   64, n  128, alignment  4  0:   216     624     568   | Length   64, n  128, alignment  4  0:   520     688     584
Length  128, n   64, alignment  0  0:   72      392     256   | Length  128, n   64, alignment  0  0:   224     408     264
Length  128, n   64, alignment  4  2:   160     392     272   | Length  128, n   64, alignment  4  2:   224     400     272
Length  128, n   64, alignment  5  6:   80      392     280   | Length  128, n   64, alignment  5  6:   280     400     288
Length  128, n  256, alignment  0  0:   200     1200    1080  | Length  128, n  256, alignment  0  0:   968     1336    1088
Length  128, n  256, alignment  1  6:   336     1200    1096  | Length  128, n  256, alignment  1  6:   968     1328    1120
Length  128, n  256, alignment  3  2:   408     1200    1104  | Length  128, n  256, alignment  3  2:   968     1328    1112
Length  256, n  128, alignment  0  0:   136     776     520   | Length  256, n  128, alignment  0  0:   424     784     528
Length  256, n  128, alignment  4  0:   280     776     536   | Length  256, n  128, alignment  4  0:   416     792     544
Length  256, n  128, alignment  6  1:   280     776     536   | Length  256, n  128, alignment  6  1:   424     784     536
Length  256, n  512, alignment  0  0:   400     2352    2040  | Length  256, n  512, alignment  0  0:   1864    2616    2040
Length  256, n  512, alignment  2  4:   872     2352    2056  | Length  256, n  512, alignment  2  4:   1864    2360    2056
Length  512, n  256, alignment  0  0:   352     1552    968   | Length  512, n  256, alignment  0  0:   808     1560    968
Length  512, n  256, alignment  3  2:   536     1544    984   | Length  512, n  256, alignment  3  2:   808     1560    992
Length  512, n 1024, alignment  0  0:   968     4656    3960  | Length  512, n 1024, alignment  0  0:   3136    4664    3936
Length  512, n 1024, alignment  1  6:   1856    4656    3976  | Length  512, n 1024, alignment  1  6:   3144    4664    3800
Length 1024, n  512, alignment  0  0:   560     3088    1864  | Length 1024, n  512, alignment  0  0:   1568    3096    1864
Length 1024, n  512, alignment  2  4:   976     3080    1888  | Length 1024, n  512, alignment  2  4:   1576    3096    1872
Length 2048, n 1024, alignment  0  0:   1064    6160    3656  | Length 2048, n 1024, alignment  0  0:   3112    6168    3616
Length 2048, n 1024, alignment  1  6:   2000    6160    3680  | Length 2048, n 1024, alignment  1  6:   3112    6160    3304


Thanks,

-- 
_______________________________________________________
Evandro Menezes               AMD            Austin, TX


