/////////////////////////////////////////////////////////////////////////////
/*
  Copyright 2001 Ronald S. Burkey

  This file is part of GutenMark.

  GutenMark is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License, or
  (at your option) any later version.

  GutenMark is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with GutenMark; if not, write to the Free Software
  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

  Filename:     PrefatoryAnalysisPassNeural.c
  Purpose:      Identifies the document's "prefatory area" using a
                neural-net method.  An alternate drop-in replacement
                using a "heuristic" method may be selected instead
                at runtime.
  Mods:         01/01/02 RSB    Moved and renamed from MarkBody.c.
*/

#include
#include
#include
#include
#include "AutoMark.h"

//---------------------------------------------------------------------------
// Attempt to find the "prefatory" area, which is what I call the area between
// the PG header and the actual text.  The net result of the pass is simply to
// set Dataset->NumPrefatoryLines.  The basic technique is to find, within the
// first MAX_PREFATORY_LINES, the first candidate heading followed by at least
// MIN_TEXT_LINES of non-headings.  Obviously, some additional flourishes may
// be required.  For example, if we find a non-blank line that's a duplicate
// of a prior line, it is probably the first section.
//
// The reason I'm doing this at all is that if the Gutenberg text contains a
// "table of contents", it really messes things up in terms of extraneous
// headings.  So if I can find something labeled "contents" or "table of
// contents", I'll include it even if it's outside of these parameters.

void
PrefatoryAnalysisPassNeural (AnalysisDataset * Dataset)
{
}
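
//---------------------------------------------------------------------------
// A minimal sketch of the "basic technique" described in the comment above,
// offered only as an illustration and not as the actual GutenMark
// implementation.  Everything here is hypothetical: the function name, the
// IsHeading flag array (1 = candidate heading, 0 = anything else), and the
// MaxPrefatoryLines/MinTextLines parameters, which merely stand in for
// MAX_PREFATORY_LINES and MIN_TEXT_LINES.  It also assumes that "followed by"
// means the MinTextLines lines immediately after the heading, and it ignores
// the duplicate-line and "table of contents" flourishes entirely.
//
// Returns the number of lines preceding the first qualifying heading, or 0
// if no heading within the first MaxPrefatoryLines qualifies.

static int
SketchFindPrefatoryEnd (const int *IsHeading, int NumLines,
                        int MaxPrefatoryLines, int MinTextLines)
{
  int i, j, Limit;

  // Only headings within the first MaxPrefatoryLines lines are candidates.
  Limit = (NumLines < MaxPrefatoryLines) ? NumLines : MaxPrefatoryLines;
  for (i = 0; i < Limit; i++)
    {
      if (!IsHeading[i])
        continue;
      // Too few lines remain after this heading (and after any later one).
      if (i + MinTextLines >= NumLines)
        break;
      // Check that the next MinTextLines lines are all non-headings.
      for (j = 1; j <= MinTextLines; j++)
        if (IsHeading[i + j])
          break;
      if (j > MinTextLines)
        return i;
    }
  return 0;
}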