This is the mail archive of the crossgcc@sources.redhat.com mailing list for the crossgcc project.
See the CrossGCC FAQ for lots more information.
Hi there,

In light of the question of merging the src and gcc repos, recently raised again in a thread on the GCC list about adding zlib to GCC, I've written an article about the Cygnus tree. It appears below. I hope it will start a better discussion about merging the repos and breaking the code of silence around the Cygnus tree.

This article also appears on my FTP site in /pub/embedded/cygnus-tree-intro on ivan.Harhan.ORG. Whenever I release a toolchain based on the Cygnus tree for one of the embedded systems I'm working with, I have to explain to my users what the tree is, given its obscurity. Now I can just refer them to my article.

Enjoy!

--
Michael Sokolov
Harhan Engineering Laboratory
Public Service Agent
International Free Computing Task Force
International Engineering and Science Task Force
615 N GOOD LATIMER EXPY STE #4
DALLAS TX 75204-5852 USA

Phone: +1-214-824-7693 (Harhan Eng Lab office)
E-mail: msokolov@ivan.Harhan.ORG (ARPA TCP/SMTP) (UUCP coming soon)

Here is the article:

An Introduction to the Cygnus Tree

By Michael Sokolov
International Free Computing Task Force

@(#)cygnus-tree-intro 1.1 00/09/04

1. What is the Cygnus tree?

"The GNU configure and build system" by Ian Lance Taylor gives the following answer:

    The Cygnus tree is used for various packages including gdb, the GNU binutils, and egcs. It is also, of course, used for Cygnus releases. It is the build system which was developed at Cygnus, using the Cygnus configure script. It permits building many different packages with a single configure and make. The configure scripts in the tree are being converted to autoconf, but the general build structure remains intact.

During the 1990s Cygnus Solutions (now part of Red Hat, Inc.) created a remarkable system that is usually called the Cygnus tree.
They took many GNU programs (which were all designed as completely self-contained stand-alone packages), most importantly the ones used for software development (gcc, gas, binutils, and gdb), and unified them into one source tree with a single top-level configure script and a single top-level Makefile. This allows all these packages to be configured, built, and installed together in one fell swoop.

Initially, the Cygnus tree existed inside Cygnus only and was distributed only to their customers. At that point, most of the GNU programs were simply grafted into the tree with no significant changes. These programs existed and were completely public in their original GNU form, and the Cygnus tree was just a fancy packaging option of not much interest to people outside of Cygnus. Therefore, there was no pressure for a public Cygnus tree.

However, as time went on, it turned out that Cygnus was the best and most active contributor to some of the GNU programs involved. As a result, they took over the maintenance of those programs and changed and enhanced them significantly. These GNU programs were gcc, gas, binutils, and gdb.

Cygnus took over the maintenance of gas, binutils, and gdb back in the early to mid 1990s, doing development behind closed doors and making occasional public releases. They were finally opened to the public in May 1999.

gcc is bigger and has many more people interested in its development, so Cygnus didn't take over its development and do it behind closed doors. First they synchronised their work on it (in their internal Cygnus tree) with the FSF maintainers. Then in late 1997 they created an open development project for it which they named EGCS. It was run by Cygnus and competed with FSF's gcc project. Finally, in spring 1999 FSF closed their gcc project and EGCS was renamed GCC.

All these programs have been integrated into the Cygnus tree so completely that they no longer exist separately from it.
Moreover, some of these programs are not even single modules any more. The Cygnus tree consists of many subdirectories called modules and some top-level glue. Initially, there was one module for each GNU program grafted into the tree. Then, however, Cygnus added some new modules, split some existing ones, and made some existing modules dependent on some new ones. As a result, some of the modules that initially (long ago) were grafted into the tree from stand-alone GNU packages can no longer be pulled back out of the tree and used separately.

As Cygnus took over the maintenance of several GNU programs, they started making new releases of them. However, these programs had already been converted to work in the Cygnus tree only. How could they release them separately then? The answer is that these releases are NOT like typical GNU packages that have the program in the top-level directory of the distribution tarball. Instead, these releases are actually pruned checkouts of the Cygnus tree, made to look indistinguishable to the untrained eye from typical GNU packages.

An important property of the Cygnus tree is that it doesn't have to be complete. The top-level configure script and Makefile check whether each directory they are about to descend into is actually present, and if it isn't, it's silently skipped. As a result, you can prune the Cygnus tree down to contain only the modules you need, and then just those modules will be configured and built. Cygnus-made GNU releases are all Cygnus tree checkouts pruned down to contain only the modules needed by the GNU release in question.

GCC is still a single module in the Cygnus tree, but it now depends on the top level and on libiberty, the miscellaneous support code module that virtually every other module depends on. Current GCC releases consist of the top level, libiberty, the actual gcc module, and several modules with target libraries for different high-level languages that GCC now supports.
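
To make the "silently skipped" property concrete, here is a minimal shell sketch of the idea. It is not taken from the real top-level configure script or Makefile: the function name present_modules and the short module list are assumptions chosen for illustration, using only module names mentioned in this article.

```shell
#!/bin/sh
# Hypothetical sketch: visit only the module directories that exist.
# The module list and the function name are illustrative assumptions,
# not code from the real top-level configure script or Makefile.

MODULES="libiberty bfd opcodes binutils gas ld gcc gdb newlib"

# Print, one per line, the modules from $MODULES present under tree root $1.
present_modules() {
    topdir=$1
    for module in $MODULES; do
        # A missing directory is skipped without any warning, which is
        # what lets a pruned tree configure and build without complaint.
        if [ -d "$topdir/$module" ]; then
            printf '%s\n' "$module"
        fi
    done
}
```

Running present_modules on a tree pruned down to a GCC release would list only libiberty and gcc from the names above; a real top-level script would then descend into each listed directory instead of printing it.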
Current Binutils releases consist of a lot of modules. There is still a binutils module that loosely corresponds to the old GNU binutils package, but GNU ld is now a separate module. (gas has always been a separate module.) The binutils, gas, and ld modules depend extensively on bfd and opcodes, two major host library modules invented by Cygnus.

Current GDB releases also consist of a lot of modules. gdb itself is still a single module in the Cygnus tree, but it now depends extensively on bfd and opcodes, same as Binutils, as well as some other Cygnus-added modules that Binutils doesn't use. And of course on libiberty, which is included in all current GCC, Binutils, and GDB releases.

In addition to the above GNU programs, the Cygnus tree contains many interesting non-GNU modules developed by Cygnus, most of which have been opened to the public. These include Newlib, Cygwin, a number of Tcl tools and GUI libraries, an extensive testsuite framework, and CGEN.

2. So where is this thing in terms of public CVS and mailing lists?

This is where we currently have a problem. When Cygnus tree-based EGCS/GCC, Binutils, and GDB were first opened to the public, they were in the form of pruned Cygnus tree checkouts. We ended up with three GNU projects each having its own copy of the top-level files, libiberty, and most of the headers, each copy stale and corresponding to a snapshot of the Cygnus tree at some point. With Binutils and GDB it was even worse, as they had their own copies of bfd and opcodes, both of which are actively maintained and rapidly changing modules.

This is OK for releases, but it's a problem for development. After all, the whole point of release branches and release engineering is to produce stability in a single software component, regardless of the staleness and deviation from the mainline this almost always causes.
Before Cygnus opened the development to the public, they internally had a very sensible model: one master source tree with one master copy of each module where all development is done, so all developers are always on the same page, and specific bits of the tree are sent off on release branches as releases are made. Each release will inevitably have some oddities introduced into it by the release engineering process, and some things may become less generic than they could be (for example, the top-level configure and Makefile will still remind curious code readers of the other modules in the tree, but because of the potentially incompatible changes on the release branch, there may not be perfect interoperability with them). However, developers always have one single master tree to work with, to fix major bugs in, and to make major improvements to. It is perfectly synergistic and self-consistent, at the price of less stability because it isn't release-engineered.

This means that releases have the property that some bits in them may be stale or duplicated elsewhere, both of which are highly undesirable for developers. The way releases should really work is by letting end users who are not expected to do their own development and bugfixing have a stable release that doesn't need any bugfixing. But as soon as a user does find a bug he/she wants to fix him/herself, he/she should really put the release aside, get the master copies of all components involved, fix any bugs there, and submit the fixes to the respective maintainers. This is the Free Software way, and this is what makes Free Software thrive.

However, this arrangement was hampered by EGCS/GCC, Binutils, and GDB starting life as public GNU projects in the same branched form in which they exist in releases. What they should have done was to create a public Cygnus tree, fully explain to everyone what the Cygnus tree is, and have all development proceed in it like it did inside Cygnus before.
But instead each of EGCS, Binutils, and GDB began life in its own CVS repository containing the same thing the FSF release tarballs have in them. The renaming of EGCS back to GCC didn't change anything.

Unfortunately, the GCC (former EGCS) maintainers seem to not have grasped yet that this arrangement is troublesome and just plain wrong. (And they have been living with it since the start of the EGCS project!)

Fortunately, the Binutils and GDB maintainers were much quicker to realise this. Almost immediately after the opening of these projects it became clear that a single master copy of each component is needed, instead of two teams working divergently on two branches made off of a once unified Cygnus tree. Also at the same time parts of the Cygnus tree other than GCC, Binutils, and GDB (i.e., the Cygnus-developed non-GNU modules) were being opened to the public. Those aren't GNU projects and virtually everyone doing significant work on them is from Cygnus. Those folks are used to the Cygnus tree and know that doing it any other way is just plain wrong. Therefore, it became clear that the Cygnus tree needed to be brought back.

In February 2000, less than a year after the opening of Binutils and GDB, their separate CVS repositories were liquidated. Instead, a new CVS repository was created which was to be the public Cygnus tree. It is /cvs/src on sources.redhat.com (formerly sourceware.cygnus.com), and from the start it was designed as a real full Cygnus tree repository, rather than a repository for just one project, which is what they did with all their public CVS repositories before that. (You can check the CVS log on its modules file to convince yourself.) The modules that make up Binutils and GDB were moved into it, eliminating the separate Binutils and GDB repositories that existed before.
Immediately after that Newlib and Cygwin were imported from Cygnus' internal tree (these are Cygnus-developed non-GNU tree modules), confirming without any doubt that finally the public Cygnus tree was born.

After being born in February 2000, the public Cygnus tree in /cvs/src on the sourceware machine matured quickly. It is now almost complete. Unfortunately, there is still one omission. This omission, and the need for users and developers to compensate for it manually, is the reason why I'm boring you here with history lessons instead of just telling you where to get the public Cygnus tree and what to do with it.

This omission is GCC. Ever since the start of EGCS in late 1997 it has lived in its own CVS repository. Currently it is /cvs/gcc on the sourceware machine (sources.redhat.com). Officially it's on gcc.gnu.org, but the dirty little secret is that gcc.gnu.org is just a DNS record that points to the very same sourceware machine, same as sources.redhat.com.

Everything else has now been integrated into the /cvs/src repository, bringing back the Cygnus tree in all its glory. However, there is also the /cvs/gcc repository duplicating a lot of it. In effect, instead of one unified public Cygnus tree we, the non-Cygnus folks who don't have access to their original internal tree, have two trees to deal with: /cvs/gcc and /cvs/src. The former contains the gcc module and all language library modules, the latter contains everything else. The two duplicate all the top-level files and the libiberty module.

Most people working on this code have by now realised that it is really designed to be in one Cygnus tree, and that's how they work on it. Our unspoken convention is now to locally construct this tree from the two repositories, work on it, and check the changes into the right repo(s). The parts that are in only one of the repositories are simply taken from it and combined into one tree. The trickier parts are the ones that are duplicated in the two repos.
These are the top-level files, the headers, and libiberty.

/cvs/gcc's libiberty is considered the master one, so that one is usually taken. However, most commits to it are also simultaneously made to /cvs/src's copy, and the latter is also periodically replaced with the former, so it usually works just as well.

The include directory in the Cygnus tree contains the public headers for all modules. It currently exists in both repos. /cvs/src's copy contains the headers for all modules and /cvs/gcc's copy contains only the headers for libiberty. The latter follow the same rules as libiberty itself.

Finally, there are the top-level files. These must know about all the modules in the tree. Most people changing these files now keep checking each change into both repos.

The /cvs/src repo has one top-level directory in it also named src, and that directory has the Cygnus tree in it (sans GCC). You can check it out in its entirety with:

    cvs -d :pserver:anoncvs@sources.redhat.com:/cvs/src co src

This will take a lot of time and disk space.

The /cvs/src repo also has a modules file. Remember, the Cygnus tree has lots of modules in it, and most people work only on those modules that interest them. The modules file in the /cvs/src repo allows checking out partial Cygnus trees, and the comments in its CVS log indicate that it is modeled after the modules file of Cygnus' internal repo, meaning that this is what the modules file of a real Cygnus tree should look like. It has CVS checkout modules defined for the most common combinations of Cygnus tree modules that are normally checked out together. Since CVS checkout module names exist in the same namespace as the top-level directories in the repo, of which there is only one (src), there are no conflicts. (In particular, there is no conflict between CVS checkout modules and the Cygnus tree modules, the latter being one level below in the src directory.)
The /cvs/gcc repo has one top-level directory in it named egcs, and it has this repo's version of the Cygnus tree in it. You can check it out in its entirety with:

    cvs -d :pserver:anoncvs@sources.redhat.com:/cvs/gcc co egcs

The /cvs/gcc repo's modules file doesn't do much. It has an egcs-core CVS checkout module defined that checks out the tree without the language front ends and target libraries, but other than that, this repo is normally checked out in its entirety.

As for mailing lists, currently most of the public projects in the Cygnus tree have their own project-specific mailing lists, but there are no mailing lists for the Cygnus tree overall, leaving the top-level files and many less popular modules homeless.

3. Our Current Solution

So what do we do about it? As I've just explained, we really want and need the Cygnus tree, but there currently isn't a single public CVS repository for it. The current solution is for people to construct full Cygnus trees on their local machines from the two CVS repos and to keep track of these two repos in development.

Here is the procedure I use for constructing the full Cygnus tree from the two repos:

1. Check out both repos (either in their entirety or only the parts you want). You'll have two partial Cygnus trees in different directories.
2. Create a directory for the combined Cygnus tree.
3. Populate it with everything from the src repo except the include and libiberty subdirectories.
4. Add gcc, libiberty, and the language target libraries from the gcc repo.
5. Create the include subdirectory and populate it by merging the include subdirectories of the two repos. For files that are present in both, use the gcc repo's version.

Explanation. This procedure is designed with the following two points in mind:

1. libiberty and its headers are taken from the gcc repo, which is considered the master copy.
2. The procedure I just outlined uses the top-level files from the src repo. The ones from the gcc repo could have been used instead.
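
For readers who would rather script steps 2 through 5, the following is a rough shell sketch of them. The function name combine_trees and its argument order are made up for this illustration. Because the names of the language target libraries are not listed here, the sketch assumes that every subdirectory of the gcc checkout which the src checkout did not already supply should be copied; that covers gcc, libiberty, and the target libraries, and may pick up a few extra directories.

```shell
#!/bin/sh
# Hypothetical helper following steps 2-5 of the procedure. The function
# name and argument order are assumptions made for this sketch.
#   $1 = checkout of the src repo (the "src" directory)
#   $2 = checkout of the gcc repo (the "egcs" directory)
#   $3 = directory to create for the combined tree
combine_trees() {
    src_tree=$1
    gcc_tree=$2
    combined=$3

    # Step 2: create the directory for the combined tree.
    mkdir -p "$combined" || return 1

    # Step 3: everything from src except include and libiberty.
    # The top-level CVS administrative directory is left out as well.
    for entry in "$src_tree"/*; do
        [ -e "$entry" ] || continue
        case $(basename "$entry") in
            include|libiberty|CVS) continue ;;
        esac
        cp -R "$entry" "$combined/" || return 1
    done

    # Step 4 (simplified, see the assumption stated above): copy each
    # directory of the gcc tree that is not in the combined tree yet.
    # This brings in gcc, libiberty, and the language target libraries.
    for entry in "$gcc_tree"/*; do
        [ -d "$entry" ] || continue
        name=$(basename "$entry")
        case $name in
            include|CVS) continue ;;
        esac
        [ -e "$combined/$name" ] && continue
        cp -R "$entry" "$combined/" || return 1
    done

    # Step 5: merge the include directories. The gcc copy goes in last,
    # so its version wins for files present in both.
    mkdir -p "$combined/include" || return 1
    cp -R "$src_tree/include/." "$combined/include/" || return 1
    cp -R "$gcc_tree/include/." "$combined/include/" || return 1
}
```

A call such as combine_trees src egcs cygnus-tree would build the combined tree from the two checkouts of step 1. The sketch makes no attempt to keep the CVS administrative directories inside the copied modules coherent, so changes still have to be checked in from the original checkouts. It takes the top-level files from the src checkout, as in the procedure; taking them from the gcc checkout instead would be a small change.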
Most of the time both will work equally well. I personally prefer to take them from the src repo because it was specifically designed as the real full Cygnus tree repo.

4. The Real Solution

The above procedure, with a few variations, is generally followed by most developers working in the Cygnus tree. However, this doesn't make it any less painful. It is just a nuisance for everyone to keep piecing the tree together every time, then working out where to check in patches, and remembering to keep the two repos in sync.

There is absolutely no benefit to gain from the current arrangement. It doesn't give GCC any more independence. GCC is now critically dependent on the Cygnus tree (and has been so ever since the start of EGCS), and by keeping their own copy of it the GCC maintainers are simply closing their eyes to this. But the reality is that everyone still builds and tests it together with the rest of the Cygnus tree (using the procedure from the previous section), and the top-level files in the gcc repo are still kept in sync with the ones in the src repo, manually and painfully. There is nothing to lose by doing away with this and merging the repos, only a lot of convenience and sanity to gain.

This is only one of the problems with the current arrangement. The other problem is that there is no "home" for the Cygnus tree. There is no place where people can learn what the Cygnus tree is and read all about it. There is no mailing list to discuss it overall (as opposed to some particular module in it). There is no clearly designated group responsible for the maintenance of the top-level files.

In fact, it's even more than just not having a mailing list for the Cygnus tree overall. There appears to be some sort of a code of silence around it. Many people know what it is and do their development with it in mind, but there is virtually no public mention of it, as if everyone is pretending that it doesn't exist.
This is actually why I had to write this article: to tell people what the Cygnus tree is. I have a number of projects that involve the Cygnus tree, and when presenting them to the public, I found myself facing a strange problem. I need to refer people to the Cygnus tree, but there is nothing to refer them to! There is no home (WWW page, FTP site, mailing list, or anything at all) for the Cygnus tree that I can point people to. In fact, there is nothing to even tell people what it is, aside from a very terse remark in Ian Lance Taylor's "The GNU configure and build system", which, while describing how its configure scripts and Makefiles work, really fails to answer the question of *what it is*.

As a developer highly interested in the Cygnus tree, I'm trying to do whatever I can to help solve these problems. This article explaining what the Cygnus tree is and what its current situation is, is my first step. I will end it with my proposal for what I believe is the right solution.

For the problem of two repos, the solution is to merge them. Given how the src repo was intended from the start as the repo for the full Cygnus tree and how it successfully does this now for everything except GCC, there is no need to do anything with it. It is already exactly the way it should be. All that needs to be done is to move the gcc and language library modules from the gcc repo to the src one, replace libiberty and its headers in the src repo with the ones in the gcc one, and put the gcc repo to rest.

Given how new modules have recently been added to the src repo with no fuss and no problems, I don't think that the current inhabitants of the src repo would object to welcoming a new member, or even that their consent would be required. After all, they don't have to check out any modules other than the ones they need, and the top-level configure script and Makefile already list everything anyway.
In fact, many src repo inhabitants will certainly like having the master copy of libiberty in their repo, rather than a mirror that sometimes gets stale.

Thus the only ones who will have to be persuaded are the GCC maintainers. The question of merging the repos has come up more than once on the GCC mailing lists. Some of the GCC maintainers have said that they liked the idea. However, there seems to be some politics playing against it, apparently coming from FSF. Thus it appears that the next battle is going to be between us, developers, and the politicians. In order to fight and win this battle, we must actively push our cause.

This brings us to the second problem of not having a real home for the Cygnus tree. The solution for this problem is obvious: create one. This article is a start, explaining publicly, apparently for the first time, what the Cygnus tree is and trying to break the code of silence around it. Now we just need to make all this better known to more people so that we can start a mailing list for the Cygnus tree and decide what else we need for a "home" for it.

This awareness-raising cause will probably have to be pursued on the GCC, Binutils, and GDB mailing lists. This is because these are the parts of the Cygnus tree where some people still live in the sandbox of separate GNU projects. Everyone else, i.e., people working on the Cygnus-developed non-GNU modules like Newlib, already comes from the Cygnus tree background and would certainly be all for bringing it back.

So, let's all try to do our best to enlighten the public about the Cygnus tree, create a real home for it, and persuade the GCC maintainers to move to the src repository!

Appendix. References

"The GNU configure and build system" by Ian Lance Taylor is file configure.texi in the etc subdirectory of the Cygnus tree.
For those of you who prefer WWW, the author has a WWW version on his page:

    http://www.airs.com/ian/configure/

Once you know what the Cygnus tree is thanks to this article, Ian's superb tutorial will tell you everything you need to know about its configure scripts and Makefiles to master development in this tree.

libgloss/doc/porting.texi in the Cygnus tree gives a very good overview of how the different pieces of the tree come together to support embedded systems and how to port them to a new one.

------
Want more information? See the CrossGCC FAQ, http://www.objsw.com/CrossGCC/
Want to unsubscribe? Send a note to crossgcc-unsubscribe@sourceware.cygnus.com