This is the mail archive of the gdb@sourceware.cygnus.com mailing list for the GDB project. See the GDB home page for more information.
OK, who wants to speak to save bcache?

-s

------- Start of forwarded message -------
From: Srikanth Adayapalam <srikanth@cup.hp.com>
Subject: GDB questions
To: shebs@cygnus.com
Date: Mon, 15 Mar 1999 11:22:31 PST

Hi Stan,

My name is Srikanth and I work for the Wildebeest project at HP. I have been working on profiling the heap usage of GDB in response to several complaints from our customers about GDB's voracious appetite for memory. In the first phase I am focussing on fixing allocation bugs and leaks, eliminating redundancies, and tuning high-overhead data structures. The second phase will focus on architectural improvements to speed up startup times and improve memory usage.

One of the things that I ran into in the high-overhead items list is the byte cache (bcache). This is the hash table used by the symbol reader when it is building psymtabs. Here are my observations on this hash table scheme:

o it is used only during the psymtab building stage, and the only things we store in it are vanilla symbol names, their demangled equivalents, and psymbols themselves. (It would be a digression here to mention that when we attempt to stick symbol names into the bcache, we are already duplicating strings, for the symbol name strings come from VT or its equivalent and are around for the duration of the object file.)
o one thing unique about this hash table is that its objective seems not to be O(1) access to the objects stored in it. To appreciate this, note that the lookup_cache() function is not called by any part of GDB other than the bcache module itself (and cannot be, as it is a file-static routine). Rather, the objective seems to be to minimize storage requirements by maintaining unique copies of objects.
o thus the only client of this module, i.e., the symbol reader, requests this module to store certain kinds of objects (char *, psymbols) and is provided in return with a pointer to the location where the bcache module actually stored them according to its internal algorithms. The client never looks up the hash table, since it has no need to, for it already has a pointer to the whole object.

Ironically, this module does not minimize the memory requirements of GDB but rather increases them tremendously. These are some of the numbers that illustrate this point. The memory requirements are as reported by gdb (when run with the -statistics command line option) to bring up the application and break on main.

                          Without bcache    With bcache    Bloat (%)
HP C compiler             151+ MB           291+ MB        48
A Customer Application     82+ MB           107+ MB        23
GDB                        20+ MB            24+ MB        16
HP C++ Compiler            43+ MB            51+ MB        15

The case marked "without bcache" is actually the bcache module itself, but one that does not bother to eliminate duplicates.

A further observation of interest is that this module scales very poorly: the overhead (which is really every byte that is not used to store GDB's data, i.e., cells used for housekeeping info like pointers, hash chain heads, etc.) is of the order of O(m * n * 64k), where m is the number of load modules compiled with -g and n is the number of strings that have unique length. This spells doom for applications with a large number of shared libraries (m increases) and for C++ (n increases, since we also stick demangled names into the cache). This explains the lower overhead in the case of GDB and the C++ compiler, as they have only the main a.out compiled with -g. I think the first two benchmarks are more typical of HP's customers' code.

That is the story. Before we go ahead and unplug the bcache, we thought it would be prudent to check with you guys to make sure we are not overlooking anything here. Is there any compelling reason we should avoid duplicates in psymbols?

While we are on the topic, would it be possible to provide any details you have on PR 2207? I saw some reference to this in GDB sources and would like to find out more.

Thanks for your time.

Srikanth
------- End of forwarded message -------
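
For readers unfamiliar with the scheme the message describes (a client hands the cache some bytes and gets back a pointer to a single stored copy, and never performs a lookup of its own), here is a minimal sketch in C. It is not GDB's bcache code: the names intern_table, intern_bytes and hash_bytes, the bucket count, and the hash function are all hypothetical choices made for illustration. The two counters are only there to separate payload bytes from housekeeping bytes, the distinction the message uses when it talks about overhead.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024           /* arbitrary size chosen for this sketch */

/* One stored object: chain pointer and length are housekeeping,
   data[] holds the caller's bytes. */
struct entry {
    struct entry *next;
    size_t len;
    unsigned char data[];       /* C99 flexible array member */
};

/* Hypothetical interning table; zero-initialize before first use. */
struct intern_table {
    struct entry *buckets[NBUCKETS];
    size_t bytes_stored;        /* payload bytes copied into the table */
    size_t bytes_overhead;      /* per-entry housekeeping bytes */
};

/* Simple multiplicative hash over raw bytes (an assumption of this
   sketch, not the hash GDB uses). */
static unsigned long hash_bytes(const void *p, size_t len)
{
    const unsigned char *s = p;
    unsigned long h = 5381;
    while (len--)
        h = h * 33 + *s++;
    return h;
}

/* Return a pointer to the table's unique copy of bytes[0..len).
   Equal inputs always yield the same pointer. Returns NULL if
   allocation fails. */
const void *intern_bytes(struct intern_table *t, const void *bytes, size_t len)
{
    unsigned long b = hash_bytes(bytes, len) % NBUCKETS;
    struct entry *e;

    /* The only lookup happens here, inside the module itself. */
    for (e = t->buckets[b]; e != NULL; e = e->next)
        if (e->len == len && memcmp(e->data, bytes, len) == 0)
            return e->data;

    /* Not seen before: store a new copy at the head of the chain. */
    e = malloc(sizeof *e + len);
    if (e == NULL)
        return NULL;
    e->next = t->buckets[b];
    e->len = len;
    memcpy(e->data, bytes, len);
    t->buckets[b] = e;

    t->bytes_stored += len;
    t->bytes_overhead += sizeof *e;
    return e->data;
}
```

A caller would declare `struct intern_table t = {0};` and keep the pointer returned by `intern_bytes(&t, name, strlen(name) + 1)` in place of its own copy. Whether such a table saves memory depends on how the housekeeping bytes compare with the bytes saved by sharing duplicates, which is the trade-off the message questions.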