<!--  vim: set sw=2 sts=2 et ft=docbk:

  Part of the A-A-P recipe executive: Fetching

  Copyright (C) 2002-2003 Stichting NLnet Labs
  Permission to copy and use this file is specified in the file COPYING.
  If this file is missing you can find it here: http://www.a-a-p.org/COPYING

-->

<bridgehead>Fetching And Updating</bridgehead>

<para>
Following a convention for the "update" and "fetch" targets makes it easy for
users to know how to use a recipe.  The main recipe for a project should be
usable in three ways:
<orderedlist>
<listitem><para>Without specifying a target.</para>
<para>
        This should build the program in the usual way.  Files with a
        "fetch" attribute are obtained when they are missing.
</para></listitem>
<listitem><para>With the "fetch" target.</para>
<para>
        This should obtain the latest version of all the files for the
        program, without building the program.
</para></listitem>
<listitem><para>With the "update" target.</para>
<para>
        This should fetch all the files for the program and then build it.
        It's like the previous two ways combined.
</para></listitem>
</orderedlist>
</para>

<para>
Here is an example of a recipe that works this way:
</para>
<programlisting>
        Status = status.txt
        Source = main.c version.c
        Header = common.h
        Target = myprog

        $Target : $Source $Status
            :cat $Status
            :do build $Source

        # specify where to fetch the files from
        :attr {fetch = cvs://:pserver:anonymous@myproject.cvs.sourceforge.net:/cvsroot/myproject} $Source $Header
        :attr {fetch = ftp://ftp.myproject.org/pub/%file%} $Status
</programlisting>

<para>
Note that the header file "common.h" is given a "fetch" attribute, but it is
not specified in the dependency.  The automatic dependency checking will
notice the file is used and fetch it when it's missing.
</para>

<para>
When using files that include a version number in the file name, fetching
isn't needed, since these files will never change.  To reduce the overhead
caused by checking for changes, give these files a "constant" attribute (with
a non-empty non-zero value).  Example:
</para>

<programlisting>
        PATCH = patches/fix-1.034.diff {fetch = $FTPDIR} {constant}
</programlisting>

<para>
To fetch all files that have a "fetch" attribute, start &Aap; with this command:
<literallayout>        <userinput>aap fetch</userinput>
</literallayout>

When the "fetch" target is not specified in the recipe or its children, it
is automatically generated.  Its build commands will fetch all nodes with
the "fetch" attribute, except ones with a "constant" attribute set
(non-empty non-zero).  To do the same manually:
</para>
<programlisting>
        fetch:
                :fetch $Source $Header $Status
</programlisting>
<para>
Or use the <link linkend='cmd-fetchall'>
<literal>:fetchall</literal></link> command.
</para>
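<para>
The selection rule used by the generated "fetch" target can be modelled in a
few lines of Python.  This is only a sketch: the dictionary-based data model,
the function names and the file "notes.txt" are assumptions made for
illustration, not the way &Aap; stores nodes internally.
</para>
<programlisting>
def attribute_is_set(value):
    """Return True for a non-empty, non-zero attribute value."""
    return value not in (None, "", "0", 0)

def nodes_to_fetch(nodes):
    """Select names that have "fetch" but no set "constant" attribute.

    `nodes` maps a file name to a dict of its attributes (hypothetical
    data model, used only for this illustration).
    """
    selected = []
    for name, attrs in nodes.items():
        if attrs.get("fetch") and not attribute_is_set(attrs.get("constant")):
            selected.append(name)
    return sorted(selected)

# Example data; the attribute values are placeholders.
nodes = {
    "main.c": {"fetch": "cvs://"},
    "status.txt": {"fetch": "ftp://ftp.myproject.org/pub/%file%"},
    "patches/fix-1.034.diff": {"fetch": "ftp://ftp.myproject.org/pub/%file%",
                               "constant": "1"},
    "notes.txt": {},
}
print(nodes_to_fetch(nodes))
</programlisting>
<para>
With this data only "main.c" and "status.txt" are selected: the patch is
skipped because of its "constant" attribute and "notes.txt" because it has no
"fetch" attribute.
</para>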

<para>
NOTE: When any child recipe defines a "fetch" target, no automatic fetching
is done for any of the recipes.  This may not be what you expect.
</para>
<para>
When there is no "update" target, it is automatically generated.  It will
invoke the "fetch" target and the default target(s) of the recipe.  To do
something similar manually:
</para>
<programlisting>
        update: fetch $Target
</programlisting>


<bridgehead>The Fetch Attribute</bridgehead>

<para>
The "fetch" attribute is used to specify a list of locations where the file
can be fetched from.  The word at the start defines the method used to
fetch the file:

<informaltable frame="none">
  <tgroup cols="2">
    <colspec colwidth="130"/>
    <tbody>
      <row>
        <entry>ftp</entry>
        <entry>from ftp server</entry>
      </row>
      <row>
        <entry>http</entry>
        <entry>from http (www) server</entry>
      </row>
      <row>
        <entry>scp</entry>
        <entry>secure copy</entry>
      </row>
      <row>
        <entry>rcp</entry>
        <entry>remote copy (aka insecure copy)</entry>
      </row>
      <row>
        <entry>rsync</entry>
        <entry>remote sync</entry>
      </row>
      <row>
        <entry>file</entry>
        <entry>local file system</entry>
      </row>
      <row>
        <entry>cvs</entry>
        <entry>from CVS repository.
                        For a module that was already checked out, the part
                        after "cvs://" may be empty; CVS will then use the
                        same server (CVSROOT) as when the checkout was
                        done.</entry>
      </row>
      <row>
        <entry>other</entry>
        <entry>user defined</entry>
      </row>
    </tbody>
   </tgroup>
</informaltable>


These kinds of locations can be used:

<literallayout>        ftp://ftp.server.name//full/path/file
        ftp://ftp.server.name/relative/path/file
        http://www.server.name/path/file
        scp://host.name/path:path/file
        rcp://host.name/path:path/file
        rsync://host.name/path:path/file
        cvs://:METHOD:[[USER][:PASSWORD]@]HOSTNAME[:[PORT]]/path/to/repository
        file:~user/dir/file
        file:///etc/fstab
</literallayout>

For a local file there are two possibilities: using "file://" or "file:".
They both have the same meaning.  "file:" is preferred, because the double
slash is usually used before a machine name: "method://machine/path".  A file
is always local, thus leaving out "//machine" is the logical thing to do.
</para>
<para>
Note that for an absolute path, relative to
the root of the file system, you use either one or three slashes, but not two.
Thus "file:/etc/fstab" and "file:///etc/fstab" are the file "/etc/fstab".  A
relative path has two or no slashes, but keep in mind that moving the recipe
will make it invalid.  You can also use "file:~/file" or "file://~/file" for a
file in your own home directory, and "file:~jan/file" or "file://~jan/file"
for a file in the home directory of user "jan".
</para>
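<para>
The slash rules for local files can be summarised in a small Python function.
It is a sketch of the rules described above, not the code &Aap; uses; the name
"file_url_to_path" is made up for this example.
</para>
<programlisting>
import os.path

def file_url_to_path(url):
    """Turn a "file:" location into a local path."""
    if not url.startswith("file:"):
        raise ValueError("not a file: location: %s" % url)
    rest = url[len("file:"):]
    # "file://" means the same as "file:", so drop one leading "//".
    if rest.startswith("//"):
        rest = rest[2:]
    # What remains starts with "/" (absolute path), "~" (a home
    # directory) or anything else (relative path).
    return os.path.expanduser(rest)

print(file_url_to_path("file:///etc/fstab"))
print(file_url_to_path("file://dir/file"))
</programlisting>
<para>
One or three slashes leave a leading "/" and thus give an absolute path, while
two or no slashes give a relative path or one starting with "~".
</para>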

<para>
In the "fetch" attribute the string "%file%" can be used where the path of
the local target is to be inserted.  This is useful when several files have a
common directory.  Similarly, "%basename%" can be used where only the last item
in the path is to be inserted.  This drops the directory part of the local file
name, so it can be used when the remote directory has a different name and only
the file name is the same.  Examples:
</para>
<programlisting>
        :attr {fetch = ftp://ftp.foo.org/pub/foo/%file%} src/include/bar.h
</programlisting>
<para>
Gets the file "src/include/bar.h" from
"ftp://ftp.foo.org/pub/foo/src/include/bar.h".
</para>
<programlisting>
        :attr {fetch = ftp://ftp.foo.org/pub/foo/src-2.0/include/%basename%}
                          src/include/bar.h
</programlisting>
<para>
Gets the file "src/include/bar.h" from
"ftp://ftp.foo.org/pub/foo/src-2.0/include/bar.h".
</para>
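<para>
The two substitutions can be reproduced with plain string replacement.  The
following Python sketch assumes that this is all the expansion does; the
function name "expand_fetch" is hypothetical.
</para>
<programlisting>
import posixpath

def expand_fetch(location, target):
    """Fill in "%file%" and "%basename%" for one local target path."""
    location = location.replace("%file%", target)
    return location.replace("%basename%", posixpath.basename(target))

print(expand_fetch("ftp://ftp.foo.org/pub/foo/%file%",
                   "src/include/bar.h"))
print(expand_fetch("ftp://ftp.foo.org/pub/foo/src-2.0/include/%basename%",
                   "src/include/bar.h"))
</programlisting>
<para>
The two calls produce the same locations as the two examples above.
</para>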


<bridgehead>Defining Your Own Method</bridgehead>

<para>
To add a new fetch method, define a Python function with the name
"fetch_method", where "method" is the word at the start.  The function will be
called with four arguments:
<informaltable frame="none">
  <tgroup cols="2">
    <colspec colwidth="100"/>
    <tbody>
      <row>
        <entry>dict</entry>
        <entry>a dictionary with references to all variable scopes (for expert
        users only)</entry>
      </row>
      <row>
        <entry>machine</entry>
        <entry>the machine name from the url: what comes after the "scheme://"
          up to the first slash</entry>
      </row>
      <row>
        <entry>path</entry>
        <entry>the path from the url: what comes after the slash after
          "machine"</entry>
      </row>
      <row>
        <entry>fname</entry>
        <entry>the name of the file to write the result to</entry>
      </row>
    </tbody>
   </tgroup>
</informaltable>

The function should return a non-zero number for success and zero for failure,
or raise an IOError exception with a meaningful error message.
Here is an example:
</para>
<programlisting>
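    :python
        def fetch_foo(dict, machine, path, fname):
            # "foolib" stands for whatever library does the transfer.
            from foolib import foo_the_file, FooError
            try:
                foo_the_file(machine, path, fname)
            except FooError, e:
                # Report the failure as an IOError with a clear message.
                raise IOError('fetch_foo() failed: %s' % str(e))
            # A non-zero value signals success.
            return 1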
</programlisting>
<para>
  Note that a version control function overrides a fetch function.  Thus if
  "foo_command()" is defined, "fetch_foo()" will not be called.
</para>
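<para>
To see how the "machine" and "path" arguments relate to a location, the
following Python sketch splits a URL as described in the table and calls the
matching "fetch_method" function.  The helpers "split_location" and
"call_fetch_method", the host name and the recording "fetch_foo" are
assumptions for illustration; &Aap; itself may do the lookup differently.
</para>
<programlisting>
def split_location(url):
    """Split "scheme://machine/path" into (scheme, machine, path)."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("no '://' in %s" % url)
    machine, _, path = rest.partition("/")
    return scheme, machine, path

def call_fetch_method(functions, url, fname):
    """Look up "fetch_" + scheme in `functions` and call it.

    Returns True on success, False on failure.
    """
    scheme, machine, path = split_location(url)
    func = functions.get("fetch_" + scheme)
    if func is None:
        return False
    try:
        # The empty dict stands in for the variable scopes dictionary.
        return bool(func({}, machine, path, fname))
    except IOError:
        # A real tool would also report the error message.
        return False

def fetch_foo(dict, machine, path, fname):
    # Hypothetical method that only records what it was asked to do.
    calls.append((machine, path, fname))
    return 1

calls = []
ok = call_fetch_method({"fetch_foo": fetch_foo},
                       "foo://host.example.org/dir/file.txt", "file.txt")
print(ok, calls)
</programlisting>
<para>
Here "fetch_foo" receives "host.example.org" as the machine and "dir/file.txt"
as the path.
</para>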


<bridgehead id="cache-update">Caching</bridgehead>

<para>
Remote files are downloaded when used.  This can take quite a bit of time.
Therefore downloaded files are cached and only downloaded again when outdated.
</para>
<para>
The cache can be spread over several directories.  The list is specified
with the $CACHE variable.
</para>
<para>
NOTE: Using a global, writable directory makes it possible to share the cache
with other users, but only do this when you trust everybody who can log in to
the system!  Someone who wants to do harm or play a practical joke could put a
bogus file in the cache.
</para>
<para>
A cached file becomes outdated as specified with the "cache_update" attribute
or the $CACHEUPDATE variable.  The value is a number followed by a name.
Possible values for the name:
<informaltable frame="none">
  <tgroup cols="2">
    <colspec colwidth="100"/>
    <tbody>
      <row>
      <entry>day</entry>
      <entry>number specifies days</entry>
      </row>
      <row>
        <entry>hour</entry>
        <entry>number specifies hours</entry>
      </row>
      <row>
        <entry>min</entry>
        <entry>number specifies minutes</entry>
      </row>
      <row>
        <entry>sec</entry>
        <entry>number specifies seconds</entry>
      </row>
    </tbody>
   </tgroup>
</informaltable>
The default is "12 hour".
</para>
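<para>
As an illustration of how such a value maps to a time span, this Python sketch
converts it to seconds.  It assumes a whole number and a name separated by
white space, as in the default "12 hour"; the function name is hypothetical.
</para>
<programlisting>
SECONDS_PER_UNIT = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}

def cache_update_seconds(value="12 hour"):
    """Convert a value such as "12 hour" into a number of seconds."""
    number, unit = value.split()
    if unit not in SECONDS_PER_UNIT:
        raise ValueError("unknown unit: %s" % unit)
    return int(number) * SECONDS_PER_UNIT[unit]

print(cache_update_seconds())
print(cache_update_seconds("30 min"))
</programlisting>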
<para>
When a cached file becomes outdated, the timestamp of the remote file is
obtained.  When it differs from the timestamp recorded when the file was last
downloaded, the file is downloaded again.  When the remote file changes but
doesn't get a new timestamp, this will not be noticed.
</para>
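<para>
The decision described above can be written as a small Python function.  This
is a sketch under the assumption that "outdated" means the cached copy is
older than the "cache_update" interval; the function and argument names are
made up for this example.
</para>
<programlisting>
def must_download(age, max_age, remote_time, recorded_time):
    """Decide whether a cached file is downloaded again.

    age            seconds since the cached copy was stored (assumption)
    max_age        the "cache_update" interval in seconds
    remote_time    timestamp of the remote file now
    recorded_time  timestamp of the remote file at the last download
    """
    outdated = age > max_age
    if not outdated:
        # Still fresh: the cached copy is used without checking.
        return False
    # Outdated: download only when the remote timestamp changed.
    return remote_time != recorded_time

print(must_download(50000, 43200, 1060000000, 1050000000))
</programlisting>
<para>
A remote file that changed without getting a new timestamp makes the last
comparison fail, which is why such a change goes unnoticed.
</para>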
<para>
When fetching files, the cached copies are not used (but may be updated).
</para>
