This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug libc/5225] New: The .*wscanf family of functions segfault
- From: "emil at wojak dot eu" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sources dot redhat dot com
- Date: 27 Oct 2007 17:27:48 -0000
- Subject: [Bug libc/5225] New: The .*wscanf family of functions segfault
- Reply-to: sourceware-bugzilla at sourceware dot org
This bug reveals itself in every glibc I could test, i.e. 2.4 - 2.7 inclusive,
and it doesn't appear to be fixed in the repo HEAD.
This code below breaks miserably:
#include <stdio.h>
#include <wchar.h>

int main(void) {
    wchar_t in[]=L"123,abc,321";
    wchar_t format[50]=L"%d,%[^,],%d";
    int out_d1, out_d2;
    char out_s[50];

    printf("in='%ls' format='%ls'\n", in, format);
    swscanf(in, L"%d,%[^,],%d", &out_d1, out_s, &out_d2);
    printf("in='%ls' format='%ls'\n", in, format);
    printf("out_d1=%d out_s='%s' out_d2=%d\n", out_d1, out_s, out_d2);
    return 0;
}
If you change the swscanf() call so that the format parameter is passed from
writable memory, like this:
swscanf(in, format, &out_d1, out_s, &out_d2);
then it works and prints the following:
in='123,abc,321' format='%d,%[^,],%d'
in='123,abc,321' format='%d,%[^,321'
out_d1=123 out_s='abc' out_d2=321
As you can see, format gets overwritten by swscanf, that's why the version with
format passed as a literal segfaults.
I tracked this bug down to function _IO_vfwscanf in stdio-common/vfscanf.c.
There is a macro there called ADDW(Ch), that is used to append a character to a
temporary buffer. The buffer, pointed to by CHAR_T *wp, is used as a workplace
where digits from the input are copied to, and it is passed to a strtoll
function derivative afterwards for the actual conversion to a number.
The ADDW macro also handles dynamic allocation and resizing of the buffer when
needed, using alloca for on-the-stack allocation.
The macro itself seems to work as designed, however it assumes that the wp
pointer is never modified outside of it. Unfortunately the code handling
character ranges (the %[] specifier) blithely points the wp pointer to the end
of the character range, i.e. the closing bracket in the format string.
So here's what happens with the above PoC code:
Preconditions:
wp=NULL; // pointer to the workplace buffer
wpsize=0; // how much of the buffer is in use
wpmax=0; // the actual size of the buffer
Parsing of '123' by the '%d' code:
ADDW() is invoked with each input digit.
The first invocation allocates 256 bytes on the stack, points wp to the newly
allocated memory and stores 256 in wpmax.
wpsize is incremented with each call.
Parsing of 'abc' by the '%[]' code:
wp is set to point to the closing bracket in format, which happens to be in
non-writable memory.
Parsing of '321' by the '%d' code:
ADDW() is called again, it checks if there is enough space in the workplace by
comparing wpsize with wpmax, and of course sees no reason to resize the buffer,
leaving the wp pointer alone. After that it tries to write to wp[wpsize++] -
BANG!
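
The sequence above can be replayed with a small model. This is only a sketch,
not glibc code: struct workspace, ws_addw() and replay_poc() are hypothetical
names, the buffer comes from malloc instead of alloca, and wpsize is assumed to
be reset to 0 at the start of each conversion (inferred from the overwritten
format printed earlier). The format lives in a writable array so the stray
writes are observable instead of fatal.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical model of the scanner's workspace state; not glibc code. */
struct workspace {
    wchar_t *wp;     /* where appended characters are written */
    size_t wpsize;   /* how much of the buffer is in use */
    size_t wpmax;    /* the actual size of the buffer */
    wchar_t *block;  /* heap block owned by the model, so it can be freed */
};

/* Simplified ADDW: grow only when the buffer is full, then append. */
static void ws_addw(struct workspace *ws, wchar_t ch)
{
    if (ws->wpsize == ws->wpmax) {
        size_t newmax = ws->wpmax ? 2 * ws->wpmax : 256;
        wchar_t *nb = malloc(newmax * sizeof *nb);
        assert(nb != NULL);
        if (ws->wpsize > 0)
            wmemcpy(nb, ws->wp, ws->wpsize);
        free(ws->block);
        ws->block = nb;
        ws->wp = nb;
        ws->wpmax = newmax;
    }
    ws->wp[ws->wpsize++] = ch;
}

/* Replays the three conversions of the PoC against a WRITABLE format
   buffer that contains a ']' (as "%d,%[^,],%d" does). */
static void replay_poc(wchar_t *format)
{
    struct workspace ws = { NULL, 0, 0, NULL };

    /* '%d' parsing "123": the first ws_addw() allocates the buffer. */
    ws.wpsize = 0;
    for (const wchar_t *p = L"123"; *p != L'\0'; ++p)
        ws_addw(&ws, *p);
    ws_addw(&ws, L'\0');

    /* '%[]': the buggy step, wp now points at the closing bracket. */
    ws.wp = wcschr(format, L']');

    /* '%d' parsing "321": wpsize < wpmax, so nothing is reallocated and
       the digits land inside the format string. */
    ws.wpsize = 0;
    for (const wchar_t *p = L"321"; *p != L'\0'; ++p)
        ws_addw(&ws, *p);
    ws_addw(&ws, L'\0');

    free(ws.block);
}
```

Calling replay_poc() on a writable copy of the PoC format leaves it as
'%d,%[^,321', which matches the second line of the output shown earlier. With
a format in read-only memory, the same store is what faults.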
My proposed solution - don't touch wp outside of the ADDW() macro; use another
temporary to point to the end of the character range in the '%[]' code.
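
A rough illustration of that idea, again as a model rather than a patch
against vfscanf.c: struct scan_ws, scan_addw(), replay_fixed() and the
range_end variable are hypothetical names, and realloc stands in for the
alloca-based growth. The only point is that the end of the range is kept in
its own variable, so wp always refers to the workspace.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical workspace model; names mirror the report, not glibc source. */
struct scan_ws {
    wchar_t *wp;     /* always points at the workspace buffer */
    size_t wpsize;   /* how much of the buffer is in use */
    size_t wpmax;    /* the actual size of the buffer */
};

/* Simplified ADDW; realloc is safe here because wp is never redirected. */
static void scan_addw(struct scan_ws *w, wchar_t ch)
{
    if (w->wpsize == w->wpmax) {
        size_t newmax = w->wpmax ? 2 * w->wpmax : 256;
        wchar_t *nb = realloc(w->wp, newmax * sizeof *nb);
        assert(nb != NULL);
        w->wp = nb;
        w->wpmax = newmax;
    }
    w->wp[w->wpsize++] = ch;
}

/* Replays the PoC conversions with the range end held in a separate
   temporary. The format is only read, so a string literal is fine.
   Returns the value converted by the second '%d'. */
static long replay_fixed(const wchar_t *format)
{
    struct scan_ws w = { NULL, 0, 0 };

    /* '%d' parsing "123" */
    w.wpsize = 0;
    for (const wchar_t *p = L"123"; *p != L'\0'; ++p)
        scan_addw(&w, *p);
    scan_addw(&w, L'\0');

    /* '%[]': remember the closing bracket without touching wp. */
    const wchar_t *range_end = wcschr(format, L']');
    (void) range_end;  /* the real code would resume parsing from here */

    /* '%d' parsing "321": the digits go into the workspace as intended. */
    w.wpsize = 0;
    for (const wchar_t *p = L"321"; *p != L'\0'; ++p)
        scan_addw(&w, *p);
    scan_addw(&w, L'\0');

    long value = wcstol(w.wp, NULL, 10);
    free(w.wp);
    return value;
}
```

In this model the format string is never written to, and the second number is
converted from the workspace buffer.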
Workaround for developers - split your format after each character range and
call .*wscanf for each slice.
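
One way the workaround might look for the PoC input. This is a sketch only:
parse_sliced() is a hypothetical helper, the field width of 49 assumes
single-byte text and a caller buffer of at least 50 bytes, and it relies on the
assumption that a trailing %n (which only stores the number of characters
consumed) does not touch the workspace buffer on affected versions.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: parses "<int>,<text>,<int>" with two swscanf()
   calls so that no conversion needing the workspace follows the
   character range within a single format.
   Returns the number of fields assigned (3 on success). */
static int parse_sliced(const wchar_t *in, int *d1, char *s, int *d2)
{
    int consumed = 0;

    assert(in != NULL && d1 != NULL && s != NULL && d2 != NULL);

    /* Slice 1 ends at the character range; %n records how far we got.
       s must have room for at least 50 bytes. */
    if (swscanf(in, L"%d,%49[^,]%n", d1, s, &consumed) != 2)
        return 0;

    /* Slice 2 resumes right after the range, in a separate call. */
    if (swscanf(in + consumed, L",%d", d2) != 1)
        return 2;

    return 3;
}
```

For the input L"123,abc,321" this should yield 123, "abc" and 321, the same
fields as the single-format call, at the cost of one extra swscanf() call per
character range.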
--
Summary: The .*wscanf family of functions segfault
Product: glibc
Version: unspecified
Status: NEW
Severity: critical
Priority: P2
Component: libc
AssignedTo: drepper at redhat dot com
ReportedBy: emil at wojak dot eu
CC: glibc-bugs at sources dot redhat dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=5225