This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v4] [BZ #16141] strptime %z: fix rounding, extend range to +/-9959


On 08/11/2015 08:27 PM, James Perkins wrote:
> Fixes [BZ #16141] strptime %z offset restriction.
> 
> Topic: strptime supports a %z input field descriptor, which parses a
> time zone offset from UTC time into the broken-out time field tm_gmtoff.

Overall the patch looks great. The high-level idea is correct.

The architecture of the changes is almost right, I think though that we
need a more thorough test case that covers every value in +HHMM to -HHMM
(skipping the invalid ones) to make sure we never ever break this again
when someone tries to optimize the integer math.

After we fix the testing we can do a thorough polish of any final details.

> Problems:
> 
> 1) In the current implementation, the minutes portion calculation is
> correct only for minutes evenly divisible by 3. This is because the
> minutes value is converted to decimal time, but inadequate precision
> leads to rounding which calculates results that are too low for
> some values.

Agreed, the 50/30 value is not sufficiently accurate.

> For example, due to rounding, a +1159 offset string results in an
> incorrect tm_gmtoff of 43128 (== 11 * 3600 + 58.8 * 60) seconds,
> instead of 43140 (== 11 * 3600 + 59 * 60) seconds. In contrast,
> a +1157 offset (minutes divisible by 3) does not cause the bug,
> and results in a correct tm_gmtoff of 43020.

Agreed.

> 2) strptime's %z specifier will not parse time offsets less than
> -1200 or greater than +1200, or if only hour digits are present, less
> than -12 or greater than +12. It will return NULL for offsets outside
> that range. These limits do not meet historical and modern use cases:
> 
>   * Present day exceeds the +1200 limit:
>     - Pacific/Auckland (New Zealand) summer time is +1300.
>     - Pacific/Kiritimati (Christmas Island) is +1400.
>     - Pacific/Apia (Samoa) summer time is +1400.

Agreed.

>   * Historical offsets exceeded +1500/-1500.

Agreed.

>   * POSIX supports -2459 to +2559.

Agreed, but note that POSIX has no %z support in strptime, and this
is a glibc extension to support %z in strptime (like it does in strftime).

>   * Offsets up to +/-9959 may occasionally be useful.

Agreed.

>   * Paul Eggert's notes provide additional detail:
>     - https://sourceware.org/ml/libc-alpha/2014-12/msg00068.html
>     - https://sourceware.org/ml/libc-alpha/2014-12/msg00072.html

OK.

> 
> 3) tst-strptime2, part of the 'make check' test suite, does not test
> for the above problems.

As mentioned above, this needs to be more thorough.

> Corrective actions:
> 
> 1) In time/strptime_l.c, calculate the offset from the hour and
> minute portions directly, without the rounding errors introduced by
> decimal time.

OK.

> 2) Remove the +/-1200 range limit, permitting strptime to parse offsets
> from -9959 through +9959.

OK.

> 3) Add zone offset values to time/tst-strptime2.c.
> 
>   * Test minutes evenly divisible by three (+1157) and not evenly
>     divisible by three (+1158 and +1159).
>   * Test offsets near the old and new range limits (-1201, -1330, -2459,
>     -2500, -99, -9959, +1201, +1330, +1400, +1401, +2559, +2600, +99,
>     and +9959)

Not enough IMO. We might break this again in the future. I'd like to see
a thorough iteration of the values.

> The revised strptime passes all old and new tst-strptime2 tests.

Good thanks for clarifying that.

> 2015-08-11  James Perkins  <james@loowit.net>
> 
> 	[BZ #16141]
> 	* time/strptime_l.c (__strptime_internal): Fix %z minutes
> 	calculation, removing incorrect decimal time rounding, so that
> 	all minute values result in a valid seconds value.
> 	* time/strptime_l.c (__strptime_internal): Extend %z time zone
> 	offset range limits to UTC-99:59 through UTC+99:59 to parse
> 	current and historical use cases.
> 	* time/strptime2.c (tests): Modify and add tests for the
> 	strptime %z input field descriptor, specifically conversion of
> 	minutes to seconds and validating an offset range of -9959 to
> 	+9959.
> 
> Signed-off-by: James Perkins <james@loowit.net>
> ---
>  time/strptime_l.c    | 12 +++---------
>  time/tst-strptime2.c | 21 +++++++++++++++++++--
>  2 files changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/time/strptime_l.c b/time/strptime_l.c
> index 5640cce..4203ad8 100644
> --- a/time/strptime_l.c
> +++ b/time/strptime_l.c
> @@ -770,16 +770,10 @@ __strptime_internal (rp, fmt, tmp, statep LOCALE_PARAM)
>  	    else if (n != 4)
>  	      /* Only two or four digits recognized.  */
>  	      return NULL;
> -	    else
> -	      {
> -		/* We have to convert the minutes into decimal.  */
> -		if (val % 100 >= 60)
> -		  return NULL;
> -		val = (val / 100) * 100 + ((val % 100) * 50) / 30;

OK, I agree that 50/30 does result in a rounding error.

> -	      }
> -	    if (val > 1200)
> +	    else if (val % 100 >= 60)
> +	      /* Minutes valid range is 0 through 59.  */
>  	      return NULL;

OK.

> -	    tm->tm_gmtoff = (val * 3600) / 100;
> +	    tm->tm_gmtoff = (val / 100) * 3600 + (val % 100) * 60;

OK.

I created a small test case to isolate this code and validated it for
all values that were valid and your change matches the expected values
computed using double precision.

>  	    if (neg)
>  	      tm->tm_gmtoff = -tm->tm_gmtoff;
>  	  }
> diff --git a/time/tst-strptime2.c b/time/tst-strptime2.c
> index a08e6d7..4db8321 100644
> --- a/time/tst-strptime2.c
> +++ b/time/tst-strptime2.c
> @@ -17,8 +17,25 @@ static const struct
>      { "1113472456 -1030", -37800 },
>      { "1113472456 +0030", 1800 },
>      { "1113472456 -0030", -1800 },
> -    { "1113472456 -1330", LONG_MAX },
> -    { "1113472456 +1330", LONG_MAX },
> +    { "1113472456 +1157", 43020 },
> +    { "1113472456 +1158", 43080 },
> +    { "1113472456 +1159", 43140 },
> +    { "1113472456 +1200", 43200 },
> +    { "1113472456 -1200", -43200 },
> +    { "1113472456 +1201", 43260 },
> +    { "1113472456 -1201", -43260 },
> +    { "1113472456 +1330", 48600 },
> +    { "1113472456 -1330", -48600 },
> +    { "1113472456 +1400", 50400 },
> +    { "1113472456 +1401", 50460 },
> +    { "1113472456 -2459", -89940 },
> +    { "1113472456 -2500", -90000 },
> +    { "1113472456 +2559", 93540 },
> +    { "1113472456 +2600", 93600 },
> +    { "1113472456 -99", -356400 },
> +    { "1113472456 +99", 356400 },
> +    { "1113472456 -9959", -359940 },
> +    { "1113472456 +9959", 359940 },
>      { "1113472456 -1060", LONG_MAX },
>      { "1113472456 +1060", LONG_MAX },
>      { "1113472456  1030", LONG_MAX },

This test needs to be extended to cover all somehow. I don't want to
come back to this in 5 years when it breaks again.

Is that asking too much?

Cheers,
Carlos.
 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]