                                                             -*-text-*-

If you are contributing code to the Orca project, please read this first.

                      ======================
                      HACKER'S GUIDE TO ORCA
                      ======================

$LastChangedDate: 2002-11-07 09:30:37 -0800 (Thu, 07 Nov 2002) $


TABLE OF CONTENTS

   * Participating in the community
   * Getting the source
   * What to read
   * Directory layout
   * Coding style
   * Document everything
   * Using page breaks
   * Other conventions
   * Writing log messages
   * Patch submission guidelines
   * Commit access


Participating in the community
==============================

The community exists mainly through mailing lists and a Subversion
source code repository:

Go to http://www.orcaware.com/mailman/listinfo and

   * Join the "Orca-dev", "Orca-checkins", and "Orca-announce" mailing
     lists.  The dev list, orca-dev@orcaware.com, is where almost all
     discussion takes place.  All questions should go there, though
     you might want to check the list archives first.  The
     "orca-checkins" list receives automated commit emails.

There are many ways to join the project, either by writing code, or by
testing.

To submit code, simply send your patches to orca-dev@orcaware.com.
No, wait, first read the rest of this file, _then_ start sending
patches to orca-dev@orcaware.com. :-)


Getting the source
==================

Orca uses the Subversion source control system to manage the source
code.  Subversion is a CVS replacement that offers many features over
and above CVS.  To get an overview of Subversion, check out

   http://www.orcaware.com/svn/Subversion-Blair_Zajac.ppt

The Orca Subversion repository is located at

   http://svn.orcaware.com/repos/trunk/orca/

with tagged releases located at

   http://svn.orcaware.com/repos/tags/orca/

The Subversion home page is at

   http://subversion.tigris.org

If you are using Windows, then you can use the Windows binaries
available at

   http://subversion.tigris.org/servlets/ProjectDocumentList?folderID=91

Make sure to use the latest svn-*-setup.exe file.

If you are running Unix, then you'll need to compile Subversion for
yourself.  To get the source code for Subversion, go to

   http://subversion.tigris.org/servlets/ProjectDocumentList?folderID=260

and download the latest tar.gz.  To build Subversion, read these pages

   http://svn.collab.net/repos/svn/trunk/INSTALL
   http://subversion.tigris.org/project_source.html


What to read
============

Before you can contribute code, you'll need to familiarize yourself
with the existing code base and interfaces.

Check out a copy of Orca (anonymously, if you don't yet have an
account with commit-access) -- so you can look at the code base.


Directory layout
================

A rough guide to the source tree:

   config/
      Files for the configure auto-configuration system.
   contrib/
      Contributed tools.
   docs/
      User documentation.
   lib/
      Perl modules and image files for Orca.
   lib/Orca/
      Perl modules that Orca uses.
   orcallator/
      All programs and scripts for the Solaris orcallator.se program.
   packages/
      Important Perl modules used by Orca that don't come with Perl.
   patches/
      Patches for the SE toolkit.
   src/
      The main Orca script and key utility scripts.


Coding style
============

We're using Perl, and following the standard Perl style.

In general, be generous with parentheses even when you're sure about
the operator precedence, and be willing to add spaces and newlines to
avoid "code crunch".  Don't worry too much about vertical density;
it's more important to make code readable than to fit that extra line
on the screen.


Document everything
===================

Every function, whether public or internal, must start out with a
documentation comment that describes what the function does.  The
documentation should mention every parameter received by the function,
every possible return value, and (if not obvious) the conditions under
which the function could return an error.
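
As a rough illustration of such a documentation comment, here is a
minimal Perl sketch.  The function `mean_of_values' and its parameter
are hypothetical and are not taken from the Orca sources; the comment
names the parameter in upper case (VALUES) and lists every return
value, including the failure case.

```perl
use strict;
use warnings;

# Return the arithmetic mean of the numbers in the array referenced by
# VALUES.  Return undef if VALUES is not an array reference or if the
# referenced array is empty.
#
# NOTE: hypothetical example function, not part of Orca.
sub mean_of_values {
  my ($values) = @_;

  # Reject anything that is not a non-empty array reference.
  return undef unless (ref($values) eq 'ARRAY');
  return undef unless (@$values > 0);

  my $sum = 0;
  foreach my $value (@$values) {
    $sum += $value;
  }

  return ($sum / scalar(@$values));
}

# Usage: prints "2.50" followed by a newline.
printf("%.2f\n", mean_of_values([1, 2, 3, 4]));
```

A caller can tell from the comment alone that an undef result has to
be checked for before the value is used.
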
Put the parameter names in upper case in the doc string, even when
they are not upper case in the actual declaration, so that they stand
out to human readers.


Using page breaks
=================

We're using page breaks (the Ctrl-L character, ASCII 12) for section
boundaries in both code and plaintext prose files.  This file is a
good example of how it's done: each section starts with a page break,
and immediately after the page break comes the title of the section.

This helps out people who use the Emacs page commands, such as
`pages-directory' and `narrow-to-page'.  Such people are not as scarce
as you might think, and if you'd like to become one of them, then type
C-x C-p C-h in Emacs sometime.


Other conventions
=================

In addition to the above standards, Orca uses these conventions:

   * Use only spaces for indenting code, never tabs.  Tab display
     width is not standardized enough, and anyway it's easier to
     manually adjust indentation that uses spaces.

   * Stay within 80 columns, the width of a minimal standard display
     window.

   * We have a tradition of not marking files with the names of
     individual authors (i.e., we don't put lines like "Author: foo"
     or "@author foo" in a special position at the top of a source
     file).  This is to discourage territoriality -- even when a file
     has only one author, we want to make sure others feel free to
     make changes.  People might be unnecessarily hesitant if someone
     appears to have staked ownership on the file.

   * There are many other unspoken conventions maintained throughout
     the code, that are only noticed when someone unintentionally
     fails to follow them.  Just try to have a sensitive eye for the
     way things are done, and when in doubt, ask.


Writing log messages
====================

Certain guidelines should be adhered to when writing log messages:

Make a log message for every change.  The log becomes much less
valuable if developers cannot rely on its completeness.
Even if you've only changed comments, write a log that says "Doc fix."
or something.

Use full sentences, not sentence fragments.  Fragments are more often
ambiguous, and it takes only a few more seconds to write out what you
mean.  The exceptions are short fragments like "Doc fix", "New file",
or "New function", which are acceptable because they are standard
idioms, and all further details should appear in the source code.

The log message should name every affected function, variable, macro,
makefile target, grammar rule, etc., including the names of symbols
that are being removed in this commit.  This helps people searching
through the logs later.  Don't hide names in wildcards, because the
globbed portion may be what someone searches for later.  For example,
this is bad:

   * twirl.c (twirling_baton_*): Removed these obsolete structures.
     (handle_parser_warning): Pass data directly to callees, instead
     of storing in twirling_baton_*.

   * twirl.h: Fix indentation.

Later on, when someone is trying to figure out what happened to
`twirling_baton_fast', they may not find it if they just search for
"_fast".  A better entry would be:

   * twirl.c (twirling_baton_fast, twirling_baton_slow): Removed
     these obsolete structures.
     (handle_parser_warning): Pass data directly to callees, instead
     of storing in twirling_baton_*.

   * twirl.h: Fix indentation.

The wildcard is okay in the description for `handle_parser_warning',
but only because the two structures were mentioned by full name
elsewhere in the log entry.

Note how each file gets its own entry, and the changes within a file
are grouped by symbol, with the symbols listed in parentheses followed
by a colon, followed by text describing the change.  Please adhere to
this format -- not only does consistency aid readability, it also
allows software to colorize log entries automatically.

If your change is related to a specific issue in the issue tracker,
then include a string like "issue #N" in the log message.
For example, if a patch resolves issue 1729, then the log message
might be:

   Fix issue #1729:

   * get_editor.c (frobnicate_file): Check that file exists first.

For large changes or change groups, group the log entry into
paragraphs separated by blank lines.  Each paragraph should be a set
of changes that accomplishes a single goal, and each group should
start with a sentence or two summarizing the change.  Truly
independent changes should be made in separate commits, of course.

One should never need the log entries to understand the current code.
If you find yourself writing a significant explanation in the log, you
should consider carefully whether your text doesn't actually belong in
a comment, alongside the code it explains.  Here's an example of doing
it right:

   (consume_count): If `count' is unreasonable, return 0 and don't
   advance input pointer.

And then, in `consume_count' in `cplus-dem.c':

   while (isdigit ((unsigned char)**type))
     {
       count *= 10;
       count += **type - '0';
       /* A sanity check.  Otherwise a symbol like
         `_Utf390_1__1_9223372036854775807__9223372036854775'
         can cause this function to return a negative value.
         In this case we just consume until the end of the string.  */
       if (count > strlen (*type))
         {
           *type = save;
           return 0;
         }

This is why a new function, for example, needs only a log entry saying
"New Function" --- all the details should be in the source.

There are some common-sense exceptions to the need to name everything
that was changed:

   * If you have made a change which requires trivial changes
     throughout the rest of the program (e.g., renaming a variable),
     you needn't name all the functions affected; you can just say
     "All callers changed".

   * If you have rewritten a file completely, the reader understands
     that everything in it has changed, so your log entry may simply
     give the file name, and say "Rewritten".

   * If your change was only to one file, or was the same change to
     multiple files, then there's no need to list their paths in the
     log message (because "svn log" can show the changed paths for
     that revision anyway).  Only when you need to describe how the
     change affected different areas in different ways is it
     necessary to organize the log message by paths and symbols, as
     in the examples above.

In general, there is a tension between making entries easy to find by
searching for identifiers, and wasting time or producing unreadable
entries by being exhaustive.  Use your best judgment --- and be
considerate of your fellow developers.  (Also, run "svn log" to see
how others have been writing their log entries.)


Patch submission guidelines
===========================

Mail patches to `orca-dev@orcaware.com', with a subject line that
contains the word "PATCH" in all uppercase, for example

   Subject: [PATCH] fix for Orca images

A patch submission should contain one logical change; please don't mix
N unrelated changes in one submission -- send N separate emails
instead.

The email message should start off with a log message, as described in
"Writing log messages" above.  The patch itself should be in unified
diff format, preferably inserted directly into the body of your
message (rather than MIME-attached, uuencoded, or otherwise
opaqified).  If your mailer wraps long lines, then you will need to
attach your patch.  Please ensure the MIME type of the attachment is
text/plain (some mailers allow you to set the MIME type; for some
others, you might have to use a .txt extension on your patch file).
Do not compress or otherwise encode the attached patch.

If the patch implements a new feature, make sure to describe the
feature completely in your mail; if the patch fixes a bug, describe
the bug in detail and give a reproduction recipe.
An exception to these guidelines is when the patch addresses a
specific issue in the issues database -- in that case, just make sure
to refer to the issue number in your log message, as described in
"Writing log messages".

It is normal for patches to undergo several rounds of feedback and
change before being applied.  Don't be discouraged if your patch is
not accepted immediately -- it doesn't mean you goofed, it just means
that there are a *lot* of eyes looking at every code submission, and
it's a rare patch that doesn't have at least a little room for
improvement.  After reading people's responses to your patch, make the
appropriate changes and resubmit, wait for the next round of feedback,
and lather, rinse, repeat, until some committer applies it.

If you don't get a response for a while, and don't see the patch
applied, it may just mean that people are really busy.  Go ahead and
repost, and don't hesitate to point out that you're still waiting for
a response.  One way to think of it is that patch management is highly
parallelizable, and we need you to shoulder your share of the
management as well as the coding.  Every patch needs someone to
shepherd it through the process, and the person best qualified to do
that is the original submitter.


Commit access
=============

After someone has successfully contributed a few non-trivial patches,
some committer, usually whoever has reviewed and applied the most
patches from that contributor, proposes them for commit access.  This
proposal is sent only to the other full committers -- the ensuing
discussion is private, so that everyone can feel comfortable speaking
their minds.  Assuming there are no objections, the contributor is
granted commit access.  The decision is made by consensus; there are
no formal rules governing the procedure, though generally if someone
strongly objects the access is not offered, or is offered on a
provisional basis.

The criteria for commit access are that the person's patches adhere to
the guidelines in this file, adhere to all the usual unquantifiable
rules of coding (code should be readable, robust, maintainable, etc.),
and that the person respects the "Hippocratic Principle": first, do no
harm.  In other words, what is significant is not the size or quantity
of patches submitted, but the degree of care shown in avoiding bugs
and minimizing unnecessary impact on the rest of the code.  Many
committers are people who have not made major code contributions, but
rather lots of small, clean fixes, each of which was an unambiguous
improvement to the code.

See the COMMITTERS file for a complete list of committers.