Copyright © 2001-2004 nALFS Development Team
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions in any form must retain the above copyright notice, this list of conditions and the following disclaimer.
The names "Linux From Scratch", "Automated Linux From Scratch", or the names of its contributors may not be used to endorse or promote products derived from this material without specific prior written permission.
Any material derived from "Linux From Scratch" must contain a reference to the "Linux From Scratch" project.
Any material derived from "Automated Linux From Scratch" must contain a reference to the "Automated Linux From Scratch" project.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Abstract
This book explains in detail all the things a nALFS hacker needs to know to contribute to the project.
Table of Contents
From the Editor
Being a Systems Engineer by trade and having used Linux From Scratch (LFS) for about a year, I was looking for a way to automate tasks on my Linux servers at work. One day, while surfing the LFS website, I found the ALFS project and then the nALFS implementation. After trying it out, I fell in love with the tool. I have been able to completely automate server builds, software package installation and administrative tasks across my data center. Since I like the product so much, I wanted to give back to the project and decided to take on the task of documentation. I hope you like the product as much as I do and it provides the same utility to your environment as it has mine.
--
James Robertson
jwrober@linuxfromscratch.org
This book is divided into the following parts.
This part contains information that is essential to the rest of the book, available at Introduction.
This part contains information just for nALFS's code hackers available at nALFS Hackers.
This part contains information just for nALFS's documentation editors.
If you are planning on writing code, updating profiles or the documentation for the nALFS implementation of the ALFS DTD, then this book is for you. If this is not what you want to do, then this book is not for you. If all you want to do is to understand how to use nALFS, then please direct your attention to the README file in the root of the source tarball or the Users Guide in the doc directory.
Welcome to the nALFS Hackers Guide. This small book is designed to aid any person wishing to contribute to nALFS. If you are not sure where to begin, take a look at the Contact Information page. This is the best place to see who is actively involved with the project and provides places to ask questions. If you know what you are after, then take a look at the section you are most interested in.
For an in-depth look at "who did what", you can grep through the source code files containing the program's changes. Below is a list of people, in alphabetical order, who had some involvement in the nALFS code and/or documentation.
Joachim Beckers <jbeckers@linuxfromscratch.org> -- Profile editor.
Jamie Bennett <jamie@linuxfromscratch.org> -- Profile editor.
Marcus R. Brown <mrbrown@0xd6.org> -- Project developer.
Christophe Devine <devine@cr0.net> -- Project developer.
Vassili Dzuba <vassili@linuxfromscratch.org> -- Project developer.
Kevin P. Fleming <kpfleming@linuxfromscratch.org> -- Project developer.
Charless Fowlkes <fowlkes@cs.berkeley.edu> -- Project developer.
Neven Has <neven@linuxfromscratch.org> -- nALFS Creator, Project developer.
Peter van Kampen <pterk@datatailors.com> -- Project developer.
Thomas Pegg <thomasp@linuxfromscratch.org> -- Profile editor.
James Robertson <jwrober@linuxfromscratch.org> -- Documentation editor.
Maik Schreiber <bZ@iq-computing.de> -- Project developer.
Fabien Steinmetz <fabien.st@netcourrier.com> -- Project developer.
Jeremy Utley <jeremy@linuxfromscratch.org> -- Profile editor.
Countless other people on the ALFS mailing lists who are making this project happen by giving their suggestions, testing the tool and submitting bug reports.
To make things easy to follow, there are a number of conventions used throughout the book. Following are some examples:
./configure --prefix=/usr
This form of text is designed to be typed in exactly as seen unless otherwise noted in the surrounding text.
install-info: unknown option `--dir-file=/mnt/lfs/usr/info/dir'
This form of text (fixed width text) shows screen output, usually the result of commands issued, and is also used to show filenames such as ~/.nALFSrc.
Emphasis
Bold Emphasis
These forms of text are used for several purposes in the book but mainly to emphasize important points or to give examples as to what to type.
http://www.linuxfromscratch.org/alfs
This form of text is used for hyperlinks, both within the book and to external pages such as HowTo's, download locations, websites, etc.
cat > $LFS/etc/group << "EOF"
root:x:0:
bin:x:1:
......
EOF
This type of section is used mainly when creating configuration files. The first command (in bold) tells the system to create the file $LFS/etc/group from whatever is typed on the following lines until the sequence EOF is encountered. Therefore, this whole section is generally typed as seen.
nALFS uses two mailing lists hosted on the Linux From Scratch servers.
Please direct the majority of your emails to the ALFS mailing list at alfs-discuss@linuxfromscratch.org. This is an excellent place to post questions and bug reports. For complete mailing list information, refer to http://www.linuxfromscratch.org/mailman/listinfo/alfs-discuss.
The second list is really for the development team's use and is available at alfs-log@linuxfromscratch.org. This is an excellent place to see the daily activity of the project. For complete mailing list information, refer to http://www.linuxfromscratch.org/mailman/listinfo/alfs-log.
All the mailing lists hosted at linuxfromscratch.org are also accessible via the NNTP server. All messages posted to a mailing list will be copied to its corresponding newsgroup, and vice versa.
The news server can be reached at news.linuxfromscratch.org.
Some other links that might interest you:
Linux From Scratch:
Automated Linux From Scratch:
The current nALFS source code owner is Kevin Fleming. If you need to reach Kevin, send an email to kpfleming@linuxfromscratch.org.
The current nALFS documentation maintainer is James Robertson. If you need to reach James, send an email to jwrober@linuxfromscratch.org.
The current nALFS profile lead maintainer is Thomas Pegg. If you need to reach Thomas, send an email to thomasp@linuxfromscratch.org.
2004-06-05 -- June 5th, 2004
June 5th, 2004 [jwrober]: Filled in any "PAGE TO BE WRITTEN" areas.
June 4th, 2004 [jwrober]: Added Part 2.
May 31st, 2004 [jwrober]: Removed Unused XML Tags from the source.
May 29th, 2004 [jwrober]: With some help from Manuel Canales Esparcia, modified the no-chunks xsl file to support inline css styling.
May 29th, 2004 [jwrober]: Updated to DocBook v4.3 and applied formatting similar to the ALFS Syntax Doc.
November 10th, 2003 [kpfleming]: Updated distribution instructions.
November 9th, 2003 [kpfleming]: Updated distribution instructions.
November 5th, 2003 [jwrober]: Updated most pages to help with presentation in the txt file version of the doc.
November 4th, 2003 [kpfleming]: Distro Tarballs page was missing bootstrap command.
November 4th, 2003 [kpfleming]: Updated the CVS and Distro Tarballs pages with new updated information.
October 30th, 2003 [jwrober]: Updated the Coding Style page and added a first stab at CVS.
October 29th, 2003 [jwrober]: Added a new chapter for Bugzilla, CVS, Coding Style and Distribution Tarballs. Added an initial page for each and then a first stab at the Coding Style page.
October 8th, 2003 [jwrober]: Added an Introduction section.
October 7th, 2003 [jwrober]: Basic book skeleton created
This chapter contains common information you may need regardless of what you are doing for the project.
Bleeding-edge development of nALFS occurs on HEAD in the LFS CVS repository. You can do an anonymous HEAD checkout by issuing the following command:
cvs -z9 -d :pserver:anonymous@cvs.linuxfromscratch.org:/home/cvsroot checkout ALFS/nALFS
The current stable version of nALFS is developed on the 1.2 branch (branch-1_2) in the LFS CVS repository. You can do an anonymous checkout of this branch by issuing the following command:
cvs -z9 -d :pserver:anonymous@cvs.linuxfromscratch.org:/home/cvsroot checkout -r branch-1_2 ALFS/nALFS
The CVS repository copy of nALFS includes neither an executable configure script nor a Makefile. This is by design: a script creates them, which ensures that their contents are always up to date with the source tree and never require manual editing.
nALFS has been configured to use the GNU autotools suite for its build process. The build system was created using autoconf-2.57, automake-1.7.7 and libtool-1.5. If you use older versions than those, you may experience warnings and/or outright failures (automake-1.6 is known to be unable to handle the nALFS Makefile.am file).
After you check out the nALFS source tree, you will need to execute:
sh ./bootstrap
in the nALFS root directory itself. This should result in output similar to the following:
$ sh ./bootstrap
You should update your 'aclocal.m4' by running aclocal.
Putting files in AC_CONFIG_AUX_DIR, `gnubuild'.
configure.ac: installing 'gnubuild/install-sh'
configure.ac: installing 'gnubuild/mkinstalldirs'
configure.ac: installing 'gnubuild/missing'
Makefile.am: installing 'gnubuild/compile'
Makefile.am: installing 'gnubuild/depcomp'
patching file ltmain.sh
Hunk #1 succeeded at 160 with fuzz 2 (offset -78 lines).
Hunk #2 succeeded at 269 with fuzz 1.
Hunk #3 succeeded at 4483 (offset -1172 lines).
This output comes from the autoreconf tool, which is part of the autoconf package and automatically runs libtoolize, autoheader, aclocal and other parts of the autotools suite. The warning about "update your 'aclocal.m4'" can be ignored, as aclocal is already executed later by autoreconf. The patch offset and/or fuzz messages about ltmain.sh are caused by the bootstrap script applying a patch created against libtool-1.5; if your libtool version is different, the patch should still work correctly, but you may see messages like this from the patch command. In the worst case, where the patch fails to apply, the nALFS build will still be functional, but libtool will generate more output messages than necessary.
Once bootstrap has been run, you can execute ./configure like any other GNU autoconf-based package to configure nALFS for your system.
The nALFS build system, because it uses automake, has full dependency tracking on all files used to build the binaries. This will reduce your rebuilding time as you edit header files and source files, as the make system will know exactly what must be rebuilt.
In addition, if you are going to make changes to bootstrap, you should add the "--enable-maintainer-mode" parameter to your configure command. With this done, each time you edit and re-run bootstrap, the Makefile will automatically re-run configure to make your changes take effect. Be warned though, that if you manually edit Makefile.am or configure.ac, you must modify the bootstrap script to incorporate your changes, because these files are not stored in the CVS repository.
If you add any C source files to the tree, you will need to rerun the bootstrap script to get them included in your build. If you add header files, it is not necessary to rerun the bootstrap script unless you plan on using "make dist" from that same tree, in which case those added headers would not get included into the tarball.
If you add or remove "syntax versions" in any of the handlers, you must re-run bootstrap to get the version lists in Makefile.am and configure.ac to incorporate your changes, or you will experience unusual build behavior.
If you add or remove any program options in src/options.h, you must re-run bootstrap to regenerate src/option-list.h. Without that file being regenerated, you may experience compile errors or find that your newly added option does not work properly.
All other types of files fall into two categories:
those that should be present only in the CVS repository
Nothing needs to be done for these files, other than the relevant "cvs add" commands.
those that should be added to the distribution tarball
The bootstrap.Makefile script will need to be edited, specifically the line that sets EXTRA_DIST near the beginning of the script. Add the path(s) to the new files to this line, or add an additional line starting with "EXTRA_DIST +=" (standard GNU makefile syntax). If the files are not listed on this line, they will not be included in the tarball created by make dist.
The nALFS package tracks all "bugs" in the LFS Bugzilla database. A "bug" is defined as an issue, todo or enhancement request tied to one of the ALFS products. The Bugzilla database can be accessed via the web at http://bugs.linuxfromscratch.org. The package is split across two Bugzilla products: "ALFS Profiles" and "Automated Linux From Scratch". The distinction between the two should be pretty clear.
The "Automated Linux From Scratch" product is then split up into five components: "Back End (XML)", "Docs", "DTD", "Extras", and "Front End (GUI)". The "Back End (XML)" component is for all of the back end handlers that parse the XML and perform the work they ask for. The "Docs" component is to track all issues related to the three works (this book, the ALFS DTD Syntax Doc and the Users Guide). The "DTD" component is designed to take care of all issues around the main ALFS DTD and any other DTDs that the project uses. A good example would be the logging DTD. The "Extras" component is for any add-ons or extras to the nALFS package. The "Front End (GUI)" component is designed to track any bugs for the ncurses GUI.
If you find a bug in one of the products, first look in the Bugzilla database to see if your issue is already there. If not, then please read the Contact Information page on how to contact the development team at the appropriate list. The team will decide if your issue is really a bug or just user error. If it is a bug, then either you or a member of the development team will add it to the database to be worked on later.
This chapter contains all the information you need to write code for nALFS.
To write code for nALFS you should be familiar with the C programming language and the XML specification.
Once you have those mastered, you will need to have the following on your Linux computer:
GNU autoconf-2.57 or newer.
GNU automake-1.7.7 or newer.
GNU libtool-1.5 or newer.
GCC-3.3 or newer.
CVS-1.11 or newer.
OpenSSL-0.9.7c or newer.
A good editor like Vim or Emacs.
The development team would like to keep a clean and standardized coding style in the product's code. This will make it easier for everyone to read the code.
Linux's creator Linus Torvalds has provided an excellent coding style for us to use. He even gives it to us in the Linux Kernel source distribution tarball. We have provided the version from Linux v2.4.22 in its entirety for your reference (and to comply with the GPL).
It should be noted, however, that not all of the coding style written by Linus is applicable to nALFS. The sections (a.k.a. chapters) that can be ignored are:
Chapter 7: Configuration-files
Chapter 8: Data structures
--------------------------------------------------
This is a short document describing the preferred coding style for the linux kernel. Coding style is very personal, and I won't force my views on anybody, but this is what goes for anything that I have to be able to maintain, and I'd prefer it for most other things too. Please at least consider the points made here.
First off, I'd suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it's a great symbolic gesture.
Anyway, here goes:
Chapter 1: Indentation
Tabs are 8 characters, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3.
Rationale: The whole idea behind indentation is to clearly define where a block of control starts and ends. Especially when you've been looking at your screen for 20 straight hours, you'll find it a lot easier to see how the indentation works if you have large indentations.
Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program.
In short, 8-char indents make things easier to read, and have the added benefit of warning you when you're nesting your functions too deep. Heed that warning.
Chapter 2: Placing Braces
The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over the other, but the preferred way, as shown to us by the prophets Kernighan and Ritchie, is to put the opening brace last on the line, and put the closing brace first, thusly:
if (x is true) {
        we do y
}
However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, thus:
int function(int x)
{
        body of function
}
Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are right and (b) K&R are right. Besides, functions are special anyway (you can't nest them in C).
Note that the closing brace is empty on a line of its own, except in the cases where it is followed by a continuation of the same statement, ie a while in a do-statement or an else in an if-statement, like this:
do {
        body of do-loop
} while (condition);
and
if (x == y) {
        ..
} else if (x > y) {
        ...
} else {
        ....
}
Rationale: K&R.
Also, note that this brace-placement also minimizes the number of empty (or almost empty) lines, without any loss of readability. Thus, as the supply of new-lines on your screen is not a renewable resource (think 25-line terminal screens here), you have more empty lines to put comments on.
Chapter 3: Naming
C is a Spartan language, and so should your naming be. Unlike Modula-2 and Pascal programmers, C programmers do not use cute names like ThisVariableIsATemporaryCounter. A C programmer would call that variable tmp, which is much easier to write, and not the least more difficult to understand.
HOWEVER, while mixed-case names are frowned upon, descriptive names for global variables are a must. To call a global function foo is a shooting offense.
GLOBAL variables (to be used only if you really need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call that count_active_users() or similar, you should not call it cntusr().
Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged -- the compiler knows the types anyway and can check those, and it only confuses the programmer. No wonder MicroSoft makes buggy programs.
LOCAL variable names should be short, and to the point. If you have some random integer loop counter, it should probably be called i. Calling it loop_counter is non-productive, if there is no chance of it being mis-understood. Similarly, tmp can be just about any type of variable that is used to hold a temporary value.
If you are afraid to mix up your local variable names, you have another problem, which is called the function-growth-hormone-imbalance syndrome. See next chapter.
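Illustration (not part of the original kernel document): a minimal sketch of the naming advice above, with a descriptive name for the global function and short names for its locals. The struct user type and its active field are invented for this example and do not come from nALFS or the kernel.

```c
#include <stddef.h>

/* Hypothetical record type, invented for this illustration. */
struct user {
        const char *name;
        int active;             /* non-zero while the user is logged in */
};

/*
 * Global function: the name says what it does, so callers need not
 * read the body. A cryptic name such as cntusr() would hide that.
 */
size_t count_active_users(const struct user *users, size_t n)
{
        size_t i;               /* plain loop counter: a short name is enough */
        size_t count = 0;

        for (i = 0; i < n; i++) {
                if (users[i].active)
                        count++;
        }
        return count;
}
```

A caller would pass an array and its length, for example count_active_users(users, 3).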
Chapter 4: Functions
Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.
The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) case-statement, where you have to do lots of small things for a lot of different cases, it's OK to have a longer function.
However, if you have a complex function, and you suspect that a less-than-gifted first-year high-school student might not even understand what the function is all about, you should adhere to the maximum limits all the more closely. Use helper functions with descriptive names (you can ask the compiler to in-line them if you think it's performance-critical, and it will probably do a better job of it than you would have done).
Another measure of the function is the number of local variables. They shouldn't exceed 5-10, or you're doing something wrong. Re-think the function, and split it into smaller pieces. A human brain can generally easily keep track of about 7 different things, anything more and it gets confused. You know you're brilliant, but maybe you'd like to understand what you did 2 weeks from now.
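Illustration (not part of the original kernel document): one possible way to keep a function short is to move a self-contained test into a helper with a descriptive name and let the compiler decide about inlining. Both function names below are hypothetical and were made up for this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Helper with a descriptive name: it answers exactly one question. */
static inline int is_blank_line(const char *line)
{
        while (*line) {
                if (!isspace((unsigned char)*line))
                        return 0;
                line++;
        }
        return 1;
}

/* Stays short, with two locals, because the per-line test lives above. */
size_t count_nonblank_lines(const char *const *lines, size_t n)
{
        size_t i;
        size_t count = 0;

        for (i = 0; i < n; i++) {
                if (!is_blank_line(lines[i]))
                        count++;
        }
        return count;
}
```

Neither function needs more than three levels of indentation, which is the warning sign the indentation rules are meant to expose.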
Chapter 5: Commenting
Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment: it's much better to write the code so that the working is obvious, and it's a waste of time to explain badly written code.
Generally, you want your comments to tell WHAT your code does, not HOW. Also, try to avoid putting comments inside a function body: if the function is so complex that you need to separately comment parts of it, you should probably go back to chapter 4 for a while. You can make small comments to note or warn about something particularly clever (or ugly), but try to avoid excess. Instead, put the comments at the head of the function, telling people what it does, and possibly WHY it does it.
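Illustration (not part of the original kernel document): a sketch of a comment placed at the head of a function, saying WHAT it does and WHY it exists, with no step-by-step narration in the body. The copy_string() function is hypothetical and was written only for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, always NUL-terminating the result.
 *
 * WHY: strncpy() leaves dst unterminated when src is too long, and
 * callers tend to forget that. Returns 0 on success, -1 if src was
 * truncated or dst has no room at all.
 */
int copy_string(char *dst, size_t size, const char *src)
{
        size_t len = strlen(src);

        if (size == 0)
                return -1;
        if (len >= size) {
                memcpy(dst, src, size - 1);
                dst[size - 1] = '\0';
                return -1;
        }
        memcpy(dst, src, len + 1);
        return 0;
}
```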
Chapter 6: You've made a mess of it
That's OK, we all do. You've probably been told by your long-time Unix user helper that GNU emacs automatically formats the C sources for you, and you've noticed that yes, it does do that, but the defaults it uses are less than desirable (in fact, they are worse than random typing -- a infinite number of monkeys typing into GNU emacs would never make a good program).
So, you can either get rid of GNU emacs, or change it to use saner values. To do the latter, you can stick the following in your .emacs file:
(defun linux-c-mode ()
  "C mode with adjusted defaults for use with the Linux kernel."
  (interactive)
  (c-mode)
  (c-set-style "K&R")
  (setq c-basic-offset 8))
This will define the M-x linux-c-mode command. When hacking on a module, if you put the string -*- linux-c -*- somewhere on the first two lines, this mode will be automatically invoked. Also, you may want to add
(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
                       auto-mode-alist))
to your .emacs file if you want to have linux-c-mode switched on automagically when you edit source files under /usr/src/linux.
But even if you fail in getting emacs to do sane formatting, not everything is lost: use indent.
Now, again, GNU indent has the same brain dead settings that GNU emacs has, which is why you need to give it a few command line options. However, that's not too bad, because even the makers of GNU indent recognize the authority of K&R (the GNU people aren't evil, they are just severely misguided in this matter), so you just give indent the options -kr -i8 (stands for "K&R, 8 character indents").
indent has a lot of options, and especially when it comes to comment re-formatting you may want to take a look at the manual page. But remember: indent is not a fix for bad programming.
Chapter 7: Configuration-files
For configuration options (arch/xxx/config.in, and all the Config.in files), somewhat different indentation is used.
An indention level of 3 is used in the code, while the text in the config-options should have an indention-level of 2 to indicate dependencies. The latter only applies to bool/tristate options. For other options, just use common sense. An example:
if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
   tristate 'Apply nitroglycerine inside the keyboard (DANGEROUS)' CONFIG_BOOM
   if [ "$CONFIG_BOOM" != "n" ]; then
      bool '  Output nice messages when you explode' CONFIG_CHEER
   fi
fi
Generally, CONFIG_EXPERIMENTAL should surround all options not considered stable. All options that are known to trash data (experimental write-support for file-systems, for instance) should be denoted (DANGEROUS), other Experimental options should be denoted (EXPERIMENTAL).
Chapter 8: Data structures
Data structures that have visibility outside the single-threaded environment they are created and destroyed in should always have reference counts. In the kernel, garbage collection doesn't exist (and outside the kernel garbage collection is slow and inefficient), which means that you absolutely have to reference count all your uses.
Reference counting means that you can avoid locking, and allows multiple users to have access to the data structure in parallel -- and not having to worry about the structure suddenly going away from under them just because they slept or did something else for a while.
Note that locking is not a replacement for reference counting. Locking is used to keep data structures coherent, while reference counting is a memory management technique. Usually both are needed, and they are not to be confused with each other.
Many data structures can indeed have two levels of reference counting, when there are users of different "classes". The subclass count counts the number of subclass users, and decrements the global count just once when the subclass count goes to zero.
Examples of this kind of "multi-reference-counting" can be found in memory management (struct mm_struct: mm_users and mm_count), and in filesystem code (struct super_block: s_count and s_active).
Remember: if another thread can find your data structure, and you don't have a reference count on it, you almost certainly have a bug.
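Illustration (not part of the original kernel document): a minimal user-space sketch of reference counting, assuming a C11 compiler with <stdatomic.h>. The kernel uses its own atomic types rather than these; struct session and the session_new(), session_get() and session_put() names are invented for this example. The sketch covers only the lifetime of the object, so any locking needed to keep its contents coherent would still have to be added separately.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared object, invented for this illustration. */
struct session {
        atomic_int refcount;
        int id;
};

/* Create a session holding one reference, owned by the caller. */
struct session *session_new(int id)
{
        struct session *s = malloc(sizeof(*s));

        if (!s)
                return NULL;
        atomic_init(&s->refcount, 1);
        s->id = id;
        return s;
}

/* Take an extra reference before handing the pointer to another user. */
struct session *session_get(struct session *s)
{
        atomic_fetch_add(&s->refcount, 1);
        return s;
}

/*
 * Drop one reference. The object is freed only when the last user
 * lets go. Returns 1 if the object was freed, 0 otherwise.
 */
int session_put(struct session *s)
{
        /* atomic_fetch_sub() returns the value held before the decrement */
        if (atomic_fetch_sub(&s->refcount, 1) == 1) {
                free(s);
                return 1;
        }
        return 0;
}
```

Each user pairs one session_get() with one session_put(), so the structure cannot disappear while someone still holds a pointer to it.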
When you are ready to produce a tarball for distribution, there are a few steps to follow:
Check out a clean copy of the nALFS repository. Using an existing copy, especially one that has been configured and used as a build tree, can potentially cause errors in the make dist process.
Edit bootstrap.configure, and modify the line starting with AC_INIT to reflect the version number that you want the distribution to be given (replace "CVS" with your desired version number). Do not commit this change to the CVS repository, as the CVS version should always report its version number as "CVS".
Run sh ./bootstrap -d CVS -g CVS -p 5.0. This command will create the configure script and Makefile (as is normally done with a CVS checkout), but will also download, extract and rename the documentation files that will be included in the tarball. These files include the DTD (the current CVS version), the syntax document describing the DTD, the user's guide and the hacker's guide. The latter three documents are currently downloaded from James Robertson's home directory on the LFS server, but see the URL_BASE variable in the bootstrap script to change the download source. In addition, the LFS-5.0 profile will be downloaded so it can be included in the tarball.
Run ./configure specifying any parameters your system needs to complete the configuration process.
Run make dist. This will produce both a .tar.gz and a .tar.bz2 file in the current directory containing everything that should be needed to build nALFS on an end-user's system. Note that this tarball will be created using the autoconf, automake and libtool versions on your development system, so please make sure they are current releases before creating a tarball for public consumption.
Run make distcheck. If your system requires any special parameters to be given to configure for it to complete (like --with-libxml2, for example), then you can use make DISTCHECK_CONFIGURE_FLAGS="..." distcheck to supply those parameters. The distcheck process will actually unpack the tarball into a temporary directory, and run a complete configure/make/install/uninstall process on it to ensure that no build errors occur. This step should be considered to be mandatory before releasing your tarball to the general public.