Design Decisions
1. Using an existing format for recipes
2. Base for recipe format
3. Script language
4. Base for recipe handling
5. Base for the IDE
6. What to do for version control
7. Base for browsing
8. Format to use for dependency files
9. Format to use for configuration scripts
10. How the cross referencer uses recipes
11. How to use the script language in recipes
12. How to use system commands in recipes
13. What signatures to store for a recipe
14. How to specify variants in a recipe
15. Interface between Recipe Executive and Dependency Checker
16. Where to store signatures for a target
17. Automatic fetching of files
18. How to specify the origin of a file
19. Specify actions with Python or not
20. Strip comments from C files for signatures
21. Give command options as "-o" or as "{option}"
22. Distributing dependency files
23. Distributing generated files
24. Scope of variables
25. Build options independent of variants
26. Scope of rules
27. How to define building at a high level
28. Expanding wildcards
29. Design for configuration checks
30. How to cleanup cached files
1. Using an existing format for recipes
The recipe format is very important; it is the base of A-A-P. Designing this
from scratch will be a lot of work and makes it less predictable how long it
will take to become stable. Using an existing format would save a lot of
work, but the question is whether it will be possible to meet all the demands.
Recipes will be used for many purposes and have these demands:
- Execute commands based on dependencies, like in a Makefile
- Automatically find dependencies between program files (possibly with
an extra tool) and use them
- Handle downloading of files that are needed
- Also be able to use signatures for recognizing changed files
(timestamps may change when downloading or unpacking an archive)
- Easy to understand and editable for a human, does not require a
specific tool to edit or generate it
- Be usable as a project file in the IDE (machine readable)
- Support splitting up a big task into several recipe files (downloading
only those that are needed)
- Can execute portable script snippets and arbitrary programs
Alternatives:
- Makefile
  + recognizable for people who know makefiles
  + mature, less work
  - there are actually several, incompatible Makefile formats (e.g.,
    GNU and BSD); adding A-A-P features will add more incompatibilities
  - no automatic dependency handling; requires using a tool and
    modifying the Makefile (e.g., "make depend")
  - format is a bit complex (esp. the GNU variant) and verbose; this
    is a reason that makefiles are often generated from a template
  - not easily usable as a project file (would require strict rules,
    which means it's not just a Makefile)
  - unclear how downloading and remote dependencies can be added (the
    FreeBSD ports system does this with shell scripts, but can't use
    dependencies inside downloaded files)
  - script snippets can only be used in the build commands, not in
    if-then rules
- Makefile template
  Since Makefiles are verbose, they are often generated from a template.
  The best known are Makefile.in (autoconf) and Imakefile (imake).
  - Imakefile uses the C preprocessor; this adds the complexity of cpp
  - Makefile.in is basically a Makefile where some items need to be
    replaced by the configure script; it doesn't meet any more of the
    demands for A-A-P
  - all template formats add new syntax elements, while still having
    most of the disadvantages of a Makefile
- Makefile with extensions
  This means using the Makefile format, but changing it to fit the needs
  of A-A-P while remaining backwards compatible (not being compatible is
  regarded as a new format).
  +- people who know makefiles will recognize the format, but still
     need to learn the extensions
  - very difficult to make the format easier to use than Makefiles;
    the additions probably make it more cryptic than it already was
  - recipes that use the extensions are not backwards compatible, and
    recipes that are compatible can't do the necessary things: being
    backwards compatible doesn't appear to have an advantage
  - still has most of the disadvantages of Makefiles
- New format
  + will be the best format for new users
  + can use all the good ideas from Makefiles and templates
  + can still start "make" when this has an advantage
  + can use a script language (Python or Perl) in many places
  - quite a bit of work, it needs to be designed and implemented
  - developers need to learn the format before they can use it
- Script language
  There are a few tools that use a script file (e.g., Perl or Python) as a
  replacement for a makefile. The cleverness is done by predefined script
  functions.
  + no need to design the syntax
  + should be very flexible and reasonably readable
  - still need to define how building works in the script language;
    these parts need to be learned even by someone who knows the
    script language
  - the format is not free, it quickly becomes cryptic
None of the existing formats sufficiently fulfills the demands.
The extra work for a new format can be reduced by re-using ideas from existing
formats (see next design decision).
Since the recipes will be the backbone of A-A-P it is worth investing time on
designing the format.
Choice: Use a new syntax
2. Base for recipe format
This assumes that the first design decision is to use a new format.
The demands for the recipe format are also listed there.
Designing the format from scratch will be a lot of work and makes it less
predictable how long it will take to be stable. Therefore using an existing
build-script format to start with will be good. Ideas from several other
formats can be merged in.
This partly depends on the choice for the script language.
Alternatives:
- Makefile
-
Provides the basic idea for what a recipe does: a dependency and associated
actions. This can be used as a base.
The syntax is a bit cryptic though, especially when using the more advanced
features (BSD and GNU extensions). Using the script language in those
places makes it both less cryptic and more flexible.
Makefiles are widely used, if A-A-P recipes resemble them people will find
recipes easier to learn.
- BSD ports file
-
This is actually a BSD Makefile, but with extra rules (targets and variable
names) to use them for downloading and installing software. A-A-P scripts
should be able to do all that a ports file can do.
Shell scripts are used for the more complicated things. Still requires
three separate files for info on the package.
Ideas from this could be used for A-A-P, but not necessarily with the same
syntax.
- Makepp
-
Compatible with Makefile, with several useful extensions:
- automatically handle dependencies on include files
- handle dependencies on the result of other makefiles,
"load_makefile"
- reliable use of wildcards
- specifying type of signature used
- filenames with special characters
- using a Perl function in the place of a variable
Could very well be used as a base, only downloading and using a script
language still need to be added.
- MakeMake
-
A Makefile template format, used by MakeMake. Uses Perl.
Does not appear to have more interesting ideas than Makepp.
- Imakefile
-
A Makefile template format, used by imake. Uses the C preprocessor.
Does not appear to have more interesting ideas than Makepp.
- SCons
-
Basically a Python script.
This has the disadvantage that it looks more like a program than a list of
dependencies and actions.
It's more verbose than a Makefile.
To use it not only requires knowing Python, but also the SCons specific
additions, which are not obvious.
This doesn't look like a good base for the recipe format.
Some of the scripting parts might still be used.
- Cons
-
Like SCons, only in Perl.
Can make the same remarks as for SCons, except that SCons uses Python,
thus the Cons scripting parts are not useful.
- Tmk
-
Combination of Tcl and Make.
Mostly looks like a Tcl program with specific functions.
Similar comments as for SCons.
Can perhaps use ideas of using a script language in a makefile.
- Sire
-
Replacement for makefiles.
Uses a free-format syntax; replaces the need for line continuation with the
need for terminating a sentence.
Includes support for Python snippets.
Has quite a few nice ideas.
Can be used as a base or for ideas.
- Jam
-
Well thought out, but cryptic language. Still similar to a makefile in
many ways.
Handles dependencies automatically.
Uses a free-format syntax, like Sire.
Can be used as a base or for ideas.
- Ant
-
Uses XML, which is cryptic when using a normal text editor.
Perhaps some ideas can be used (e.g., for CVS access), but not for a base.
There does not appear to be a format that has clear advantages over using
Makefile as a base. The main advantage of using Makefiles is that many people
know it already. The Makepp variant comes closest to the goals
for A-A-P.
Choice:
Use makepp Makefiles for the recipe base, mix in ideas from Sire, Jam and
SCons.
3. Script language
In recipes, plugins and other places some kind of script language is
needed. It will be used for those parts of A-A-P which need to be
portable. It also avoids requiring many tools to be installed before
A-A-P can do its work, and avoids incompatible (or buggy) versions of
tools (e.g., "sed").
Once the language has been chosen and is being used, changes in the language
will cause trouble. They either cause incompatibilities, or may result in
awkward solutions to avoid incompatibilities. Therefore it is important to
use a script language that has proven itself and is stable. Using a
mainstream language also has the advantage that many people will already know it
and don't need to learn yet another language. And implementing a new language
would be an awful lot of work. Therefore only well-known languages are
considered here.
Alternatives:
- Perl
  + A very powerful language, widely used.
  - Perl scripts tend to be cryptic. It's not easy to write nice Perl.
  - O-O stuff was added later, the syntax is not very nice.
- Python
  + A very powerful language, widely used.
  + Can be used in many ways, from single-line scripts to big programs.
  - The dependency on indents may cause mistakes.
- Tcl
  + Widely used language, especially in combination with Tk.
  - An older language, less powerful than Perl or Python.
  - Only supports the string data type.
- Ruby
  + An improved version of Perl.
  - Not as widely used as the others. Still has to prove its usability.
  - Concentrates more on theoretical O-O paradigms than practical use.
There is an article that compares these alternatives in a nice way, and
another article that compares Perl and Python specifically.
Although Ruby might be better as a language, there is hardly any code in Ruby
that could be used for A-A-P, while there is plenty in Python.
Additionally, the recipe is based on Makefile syntax, which uses indents and
"#" comments more or less like Python.
Choice: Python
4. Base for recipe handling
Recipe handling is the core of A-A-P.
It reads recipes, downloads files, executes build commands and installs
packages.
Being able to use an existing program as a start will save a lot of work.
Obviously, this choice heavily depends on the choice for the recipe format.
This implicitly also chooses the programming language to be used for (most of)
the recipe handling.
This is important, because it can't be changed later without great cost.
Demands:
- Must be able to run on many Operating Systems.
- Preferably be small and easy to install (everybody needs it).
- Support downloading files over the internet
- Can be used from the IDE.
- Integrate nicely with the chosen script language.
The base does not need to support all the demands, but it must be possible to
add them.
Ideas from all programs can be used, thus this isn't mentioned below.
Alternatives:
- SCons
-
Written in Python, thus easily handles Python script snippets.
Well documented design.
Build engine separate from script parsing.
- Makepp
-
Much of this can be used, since it supports many features that recipes use.
However, Makepp is implemented in Perl, while A-A-P uses Python.
- GNU make or BSD make
-
Written in C. Does not provide more than Makepp.
- Myke
-
Written in C. Does not provide more than Makepp.
- Sire
-
In C++. Supports Python scripts.
Could be used as a base.
- Jam
-
In C.
Can parse a makefile-like file and build the targets.
Looks nice and small.
Could be used as a base.
Doesn't embed Python though.
- Cons
-
In Perl.
Like SCons, but using Perl instead of Python.
Does not appear to have any advantages over SCons.
- Tmake
-
In Perl.
Only produces a makefile from tmake project file, leaves the work of
building to "make". Thus not useful as a base.
Does include a library of compiler settings that could be useful.
- MakeMake
-
Only a small Perl script. Not useful.
- Ant
-
In Java. This parses XML and does not embed a script language. Does not
appear to be useful as a base.
- tmk
-
Written in Tcl. Would only be useful when using Tcl, which we don't.
- SDS
-
Uses XML, Lisp, C++ and a few other languages.
Looks unnecessarily complicated for A-A-P and is still in a beta version.
- acdk_make
-
In Lisp. Cryptic, with no easy-to-read documentation. Not useful.
The choice apparently is between using C with Python embedded, or just Python.
Since no specific C features are required, and Python should be able to do
everything, using Python saves work and fulfills all demands.
Choice: SCons
5. Base for the IDE
What existing program to build the A-A-P IDE upon?
There are an awful lot of alternatives.
The list below is not complete, it only mentions the ones that come close to
being useful for A-A-P.
Other IDEs could still be used for ideas, see the
tools page.
Demands (either already present or can be added):
- Must be portable over many Operating Systems
- Preferably small and fast, "lean and mean"
- Supports many languages
- Supports plugins for editors, version management, debuggers, etc. It
should be a framework instead of a monolithic program
- Integration with recipes for projects, building and packaging
- Can handle downloading parts that are needed
- Clever browsing of files and items in a project and outside of it
A main issue here is that the GUI used must be portable. This rules out IDEs
based on Win32 and X-Window toolkits. This is also important when someone
writes a plugin that uses the GUI.
Alternatives:
- IDLE
-
Nice small IDE written in Python.
Fits quite well with using Python as a script language;
portable to most systems that Python runs on (limited by Tk being
available).
Not many features, but should be easy to extend.
Easy to add plugins written in Python.
- Eclipse and Netbeans
-
These are both "big", Java based IDEs. They require a running Java
runtime and occupy many Megabytes on disk. The main disadvantage is
that they are slow, startup takes half a minute. And huge amounts of
memory are used (256 Mbyte minimum). The advantage is that they offer
many features and allow plugging in many tools (through a Java interface
though).
- Kdevelop
-
Written in C++, requires the Qt library. This makes it a bit more
difficult to write plugins.
The advantage is that it has quite a few features, the disadvantage is that
it uses its own solution for everything (e.g., project files).
Probably not easy to port to MS-Windows or other OS.
Would require a lot of work to be used for A-A-P.
- anjuta
-
"The Gnome variant of Kdevelop".
Written in C with the GTK GUI.
Not really available yet.
Is being merged with gIDE. Has a relation with Glade (GUI builder).
Less portable than Kdevelop, because it uses GTK.
Therefore not useful as a base for A-A-P.
- BOA constructor
-
IDE for Python written in Python.
Currently in an alpha stage, but still interesting because of its use of
Python. Not useful as a base though.
- RHIDE, motor and xwpe
-
Borland-like curses-based IDEs.
Not useful as a base, but the parts that work in a console are interesting.
- IDEntify
-
GTK based.
Looks less interesting than anjuta.
- Moonshine
-
Qt library based.
Looks less interesting than Kdevelop.
It's not necessary to make a final choice yet, so long as it's clear that
there is a base that can be used, since this significantly reduces the amount
of work.
Preliminary choice: IDLE
UPDATE January 2003
An important issue in choosing the base used for the GUI IDE is the GUI toolkit
used. Once this toolkit has been chosen it will be an awful lot of work to
switch to another one.
Since we don't want to implement the code again for each system, we need a
portable GUI toolkit. We can rule out toolkits that are OS-specific.
We prefer a toolkit that offers high-level widgets, such as a tree viewer (to
display the tree of recipes and the tree of files) and a terminal emulator (to
run commands in). Implementing these ourselves would take an awful lot of
time.
Alternatives:
- tkinter
-
This is the standard GUI, available in almost every Python installation.
For more advanced widgets (e.g., a tree view) the Tix library is
additionally required.
Advantage is its availability. Disadvantage is that it's old-fashioned and
doesn't offer more advanced features. Many people don't like it much.
- anygui
-
This is a meta-GUI. It switches to various available GUIs: tkinter,
wxWindows, etc. Even curses and plain text can be used.
Advantage is the possibility to use all kinds of GUIs. Disadvantage is that
it appears that anygui is not complete and not actively being developed.
Also, everything would still need to be tested on all the different GUIs to
find out if it actually works properly. Not all features are available on
all GUIs.
- wxPython
-
Often mentioned as the best GUI currently available for Python.
Advantage is that it's a good GUI. The main disadvantage mentioned is that
it needs a bit of effort to install.
Since it is popular many tools are available.
For example, the GUI designer wxGlade (written in Python).
- pyGTK
-
Python bindings for the GTK toolkit.
GTK is a well known toolkit for Unix. It has been ported to MS-Windows.
I have not heard good or bad experiences with pyGTK.
I don't know if it is easier to use than wxPython or not.
- pyQt
-
Python bindings for the Qt toolkit.
Qt is a well known portable toolkit. It is commercial by nature, but some
versions are available for free. The same is true for pyQt.
The documentation is limited; it is a list of differences from the C++
version.
The commercial nature of Qt and pyQt make it less attractive.
Although Qt is known to be very good, I have not heard good or bad
experiences with pyQt.
The choice for tkinter would only be made if we wanted the GUI to run without
installing an extra library. But since A-A-P is all about automatically
downloading and installing, this can't be an important issue. And for more
advanced widgets an extra library (e.g., Tix) is needed anyway.
Although pyGTK seems to be a good system, it is unclear how reliable it is.
Choosing it means taking a risk. The same holds for pyQt.
Using a good GUI that makes it easy to program the GUI IDE and results in a
good looking application is the most important. This means wxPython is the
best choice. The next question is: Is an application in wxPython available
that can be used as a base for the GUI IDE? Possibilities:
- wxGlade
  Actually a GUI designer, not an IDE, written in Python.
- PythonCard
  Builds on the HyperCard concept, implemented with wxPython.
  Still in an early phase.
- wxWorkshop
  IDE for wxWindows. Does not appear to be available yet.
- Boa Constructor
  wxPython GUI builder and IDE written in Python. Recommended by many.
  It is actively being developed.
Updated choice: Use the wxPython toolkit. Use Boa Constructor as a
base for the GUI IDE.
6. What to do for version control
A-A-P will work together with many existing version control systems (VCS).
This is required for existing projects that can't switch to another VCS
without great effort.
Additionally, a person who is working on a project needs to manage his local
versions. He may want to try out changes, include patches from others or just
keep a recipe with a few different settings.
The question is how much of this personal VCS is going to be included in
A-A-P.
What is not involved in this decision is using the VCS for obtaining various
versions of project files, making diffs, merging and checking in new versions.
That will be done anyway.
Alternatives:
- Create a new VCS
-
This would be a perfect system for use with A-A-P, but it is going to be
too much work, this falls outside of the scope of A-A-P.
- Make nothing new
-
This would limit the work to providing interfaces to existing systems.
These interfaces have to be done anyway, because of the requirement for
existing projects. The user can then pick a VCS he likes, or not use a VCS
for his local changes.
The main disadvantage is that there currently is no "perfect" VCS
for use with A-A-P. New users will run into the problem of
having to choose a non-optimal VCS for their work.
- Make a simplistic VCS
-
Most VCS systems are set up to be used by a group of people.
What we need for A-A-P will only be used by one person.
That creates the option for a simple VCS that is powerful for this limited
task.
This VCS would remember every version of a file. No need to check-in or
check-out, it is sufficient to label selected versions of a file and define
which versions make a snapshot of a project version.
It's more like an unlimited undo system.
It's unclear why this would be better than a system like RCS or SCCS.
It might still be quite a bit of work if done properly.
- Modify an existing VCS
-
This means doing work for the VCS project of choice. This is unlikely to be
closely related to A-A-P and can't be considered part of the A-A-P scope.
A-A-P can probably work quite well without making something new.
This is not an easy decision. This might have to be reconsidered when A-A-P
has progressed further and needs of A-A-P users are clearer.
Then some better VCS may be added. This is possible, because there is not
much interference with the rest of A-A-P and existing VCS have to be supported
anyway.
Choice: Make nothing new
7. Base for browsing
Browsing a project requires creating databases of identifiers. This can be
split into three parts:
- parsing a language to extract identifiers and their context
- storing the information in a database file
- looking up identifiers in database files
Note that we assume that several databases are used, since one project may
include libraries and parts of other projects, and you don't want to create a
huge database which partly contains the same information as a database for
another project.
See the tools overview for the demands.
Alternatives:
- GNU ID-utils
-
This doesn't support keeping the context of an identifier. It also tends
to create huge database (ID) files. Perhaps a few ideas can be used, but
not useful as a base.
- Exuberant Ctags
-
This contains parsers for multiple languages, and extracts context
information. That part can be used as a base.
The created tags files are verbose, since they are backwards compatible
with old ctags programs.
Does not store where an identifier is used, only where it's defined.
This format can't be used.
- Cscope
-
The parser only supports C, but can also be used for C++ and Java.
Database functionality could be used.
The code is not well documented though.
- Freescope
-
Similar to Cscope, but written in C++. Better documented.
Unclear whether its database code would be better than Cscope's.
- Hdrtag
-
Like Exuberant ctags, but parses a few different languages.
The parsers that Exuberant ctags does not have might be useful.
- Glimpse
-
Does not understand languages, but its way to store info in the database
could be interesting.
There is no program that can be used as a base for all three parts.
The base will have to be a combination of two programs.
Choice: Base parsing languages on Ctags, base database handling on
Cscope or Freescope
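The three parts listed above (parsing, storing, looking up) can be sketched
in Python. This is an illustrative sketch, not code from any of the tools
mentioned: a real implementation would use language-aware parsers (as in
Exuberant Ctags) and an on-disk database format instead of a plain regex and
an in-memory dict. All names here are hypothetical.

```python
import re
from collections import defaultdict

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_database(files):
    """Map identifier -> list of (filename, line number) occurrences.
    `files` maps a file name to its text."""
    db = defaultdict(list)
    for fname, text in files.items():
        for lnum, line in enumerate(text.splitlines(), 1):
            for m in IDENT.finditer(line):
                db[m.group()].append((fname, lnum))
    return db

def lookup(databases, symbol):
    """Search several per-project databases in order, collecting all
    hits.  Keeping one database per project avoids one huge database
    that duplicates information from libraries and other projects."""
    hits = []
    for db in databases:
        hits.extend(db.get(symbol, []))
    return hits
```

For example, build_database({"main.c": "int main(void)"}) records "main" at
line 1 of main.c, and lookup() finds it across any number of such databases.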
8. Format to use for dependency files
Dependencies between files can be figured out automatically.
For example, a C program can be inspected for included files.
To avoid having to check dependencies very often (which can take a lot of time
and requires all files to be present) the result of the dependency check is
written in a file. The question is what format this file will have.
There can be either one file with many dependencies, or one file per target.
For simplicity of updating the file, and being able to use external tools, it
is probably easiest to have one file per target. This does result in a large
number of files, this must be dealth with somehow.
Alternatives:
- A-A-P recipe
-
This means generating the dependencies just like a user would type them in.
Since the format of the dependencies is similar to what a makefile uses,
tools that generate makefile compatible dependencies can still be used.
Example:
source.o : header.h proto/source.h
Advantages: smooth integration in recipes. No need to write another parser
to read the dependencies. Can use tools that generate makefile style
dependencies.
Disadvantages: None?
- Makefile
-
This is an existing file format, which many tools can already handle.
Mostly this means putting the dependencies at the end of the file,
overwriting them when updating the dependencies.
Example:
source.o : header.h proto/source.h
Advantages: Existing format.
Disadvantages: Not possible to use attributes when needed.
- Custom format
-
This would mean designing a new (very simple) file format, in
which the dependencies are written. The format would be designed to be
easy to read and write with Python.
Advantages: The reading and writing is simple to implement.
Disadvantages: Requires specific tools to update the file. Possibly the
output of existing tools can be converted with a small filter program.
But that is still extra work.
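One reason no extra parser is needed for the first alternative is that a
makefile-style dependency line is trivial to read. A minimal sketch, assuming
a simplified line with a single target and no attributes (the function name is
illustrative, not part of A-A-P):

```python
def parse_dependency(line):
    """Parse a makefile-style dependency line like
         source.o : header.h proto/source.h
    into a (target, dependencies) pair.  Simplified sketch: ignores
    attributes, multiple targets and escaped characters."""
    target, sep, deps = line.partition(":")
    if not sep:
        raise ValueError("not a dependency line: %r" % line)
    return target.strip(), deps.split()
```

For the example line used above, this yields the target "source.o" and the
dependency list ["header.h", "proto/source.h"].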
Choice: A-A-P recipe
9. Format to use for configuration scripts
Each system is different. To be able to write a portable program it is often
required to check out what the system is and adjust the program to it.
On Unix the autoconf system is often used.
A-A-P needs something similar, but more portable.
The question is what file format to use to specify the configuration checks,
and the file format for the resulting configure script. These two are closely
related, because parts of the configure script may have to be written
literally in the input file.
Avoiding the need to produce a separate configure script would be even better.
This means there is no tool required to translate a specification of the
configure checks to an executable script. The specification of configure
checks is executed directly.
This is possible by using Aap recipe commands and Python.
An important aspect is that the building itself may depend on configure
checks. Being able to mix configure checks with build commands makes it very
flexible. However, the user must take care to maintain the overview. Putting
a sequence of configure checks in a separate recipe and including this recipe
takes care of this.
Recipe commands can be added to produce a C header file that passes the
results of the configure checks to a C program. This can be done in
various ways, thus this will not be a restriction for the choice of the file
format of the configure checks.
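Producing such a C header from check results could look like the sketch
below. This is a hypothetical illustration of the idea, not the actual A-A-P
recipe command; the macro names and layout mimic autoconf's config.h
convention.

```python
def write_config_header(path, results):
    """Write a C header that passes configure-check results to a C
    program.  `results` maps macro names to True (defined as 1),
    False (left undefined) or a literal value."""
    lines = ["/* Generated from configure checks -- do not edit. */"]
    for name, value in sorted(results.items()):
        if value is True:
            lines.append("#define %s 1" % name)
        elif value is False:
            lines.append("/* #undef %s */" % name)
        else:
            lines.append("#define %s %s" % (name, value))
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

A check for e.g. unistd.h would then put {"HAVE_UNISTD_H": True} into the
results, and the C program simply includes the generated header.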
Choice: Put configure checks in the recipe
10. How the cross referencer uses recipes
The cross referencer needs to know which files are part of a project, so that
it can read those files to locate symbols.
It also needs to know which of these files have been changed, to be able to
decide to read them again.
The question is how the cross referencer does this.
Alternatives:
- The cross referencer does it by itself
-
This has the advantage that the cross referencer can be used as a
standalone program. And when looking up a symbol it can detect that a
database must be updated for a changed file.
However, the cross referencer has no means to find out which files belong
to a project and need to be scanned for symbols. The user would have to
specify the list of files manually.
- The cross referencer shares code with the recipe executive
-
This means the cross referencer knows itself which files require updating,
but uses code from the recipe executive to do the actual work. This
creates extra dependencies between the cross referencer and the recipe
executive. The advantages over storing the info for detecting outdated
files in the database appear to be minimal.
The cross referencer can use the code of the recipe executive to find out
which files belong to a project. Running the cross referencer to do this
would be nearly equal to having the recipe executive start the cross
referencer on the project files, but without the extra interface.
- The cross referencer lets the recipe executive do this
-
The recipe is already designed with the goal to handle dependencies and
update files that need updating. It is easy to execute the cross
referencer to update the database for outdated files.
However, the knowledge about which files are outdated would be stored in a
format specifically for the Recipe Executive. When using the cross
referencer as a standalone program this information would not be available.
Choice: Update the cross references for files in a project with a
recipe: "aap reference".
Store info to detect outdated files in the cross referencer database.
11. How to use the script language in recipes
In recipes the script language is used for the more complicated things, while
the most often used rules and commands should be simple, like in a Makefile.
The question is how to mix the script language with Makefile lines, while not
confusing the meaning of each.
This is based on the choice for using makepp Makefiles as a base
(decision 2)
and Python for the script language (decision 3).
The lines from a Makefile that would be needed are:
- Variable assignments: "SOURCE = main.c version.c".
- Dependency rules: "main.o : main.c".
- Shell commands: "$(CC) -c main.c".
- Other non-script commands.
Comment lines are not relevant, since both Makefile and Python use "#".
Alternatives:
- No specific marker, recognize each line
-
This is very difficult. For example, an assignment in Python:
SOURCE = "main.c"
This looks nearly identical to a Makefile assignment:
SOURCE = main.c
To allow white space in file names, the quotes will also be used in a
Makefile assignment (using a backslash is an alternative, but this gets
very complicated when a backslash is a path separator). The lines are
identical then. An alternative is to use ":=" instead of "=".
Recognizing shell commands could be done by using the rule that when it's
not a legal Python command and it comes after a dependency line, it must be
a shell command.
    SOURCE := main.c version.c
    if expression:
        main.o : main.c
            for i in \
                    list:
                r = func(i)
            $(CC) -c main.c
Main problem here is that it's not easy to find mistakes, since a wrong
script command will be recognized as a shell command.
Also, the script and Makefile assignments are not easy to recognize.
This increases the chance for making errors.
- Mark script lines
-
This uses a marker character in front of each line that is a Python
command. To keep this simple, it is also used in continuation lines.
Using the '@' character as an example:
    SOURCE = main.c version.c
    @if expression:
        main.o : main.c
            @ for i in \
            @         list:
            @     r = func(i)
            $(CC) -c main.c
This requires a lot of markers when using a larger script (e.g., when
defining a function). To work around this, a block of lines could be
marked as a script, for example between "[" and "]":
    SOURCE = main.c version.c
    @if expression:
        main.o : main.c
            [
            for i in \
                    list:
                r = func(i)
            ]
            $(CC) -c main.c
The big advantage of this alternative is that all Makefile lines can remain
unmodified, and these are the most often used in recipes that a user
writes.
A disadvantage is that a Python command has two appearances, depending on
where it is used. Misplacing the marker can have strange results.
- Mark non-script lines
-
This uses an easily recognizable string for lines that are Makefile lines.
Script lines can be mixed in freely.
Like above, the "@" character could be used:
@ SOURCE = main.c version.c
if expression:
@ main.o : main.c
for i in \
list:
r = func(i)
@ $(CC) -c main.c
|
However, since these three lines do something completely different, this is
confusing. An alternative is to use textual markers, starting with a
special character to avoid confusion with a Python command.
For example, ":let" could be used for an assignment, ":rule" for a
dependency rule and ":shell" for a shell command.
:let SOURCE = main.c version.c
if expression:
:rule main.o : main.c
for i in \
list:
r = func(i)
:shell $(CC) -c main.c
|
The lines are easy to recognize and future expansion is not compromised.
However, it looks less like a Makefile, so people will need to get used to it.
Since Makefile lines are most often used in recipes, the extra text is a
disadvantage.
An advantage is that it's very clear what the Makefile lines do (esp. for
someone who doesn't know Makefiles).
Choice: Mark script lines
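As a sketch, the chosen marking scheme can be recognized with a few lines of
Python (the function name and return values here are invented for
illustration; the real parser is more involved):

```python
def classify(line):
    """Classify one recipe line under the "mark script lines" scheme:
    a leading '@' marks a Python line, '[' and ']' alone on a line
    delimit a script block, anything else is a Makefile-style line."""
    stripped = line.strip()
    if stripped.startswith('@'):
        return 'script'
    if stripped in ('[', ']'):
        return 'block-marker'
    return 'makefile'

for text in ['SOURCE = main.c version.c', '@if expression:',
             '  $(CC) -c main.c', '[']:
    print(text.strip(), '->', classify(text))
```

A continuation line keeps its own '@', so this per-line check is enough and
the classifier needs no state except inside a '[' ... ']' block.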
index
12. how to use system commands in recipes
In a Makefile the commands at the toplevel are assignments and dependencies,
while commands in build rules are system (aka shell) commands.
The question is whether we should do the same in A-A-P recipes, or make the
commands in both places work the same way.
Alternatives:
- Like a Makefile
-
At the toplevel shell commands can still be useful.
The ":system" command can be used for this.
In build commands it can still be necessary to assign a value to a variable.
The ":let" command can be used for this.
An example of what the recipe looks like then:
VAR1 = value1
:system cmd and arguments
foo : foo.o common.h
:let VAR2 = value2
$CC -o $target $source
|
- Commands work the same at the toplevel as in build commands.
-
In build commands shell commands need to be executed with the ":system"
command.
An example of what the recipe looks like then:
VAR1 = value1
:system cmd and arguments
foo : foo.o common.h
VAR2 = value2
:system $CC -o $target $source
|
The Makefile way is what most people are used to, but it is inconsistent.
Since A-A-P adds the commands that start with a colon, the commands often look
different from a Makefile anyway.
Using system commands should be discouraged, because they are not portable.
Use of the built-in ":" commands and Python commands is to be encouraged.
Also, using ":system" offers the possibility to execute system commands in
several ways later (e.g., silently or using a persistent shell).
Choice: Commands work the same at the toplevel as in build commands.
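A minimal sketch of why this choice simplifies parsing: one dispatcher
(invented here, not A-A-P's actual parser) can treat lines the same way
wherever they appear, because shell commands always go through ":system":

```python
def dispatch(line):
    """Classify a recipe line uniformly, whether it occurs at the
    toplevel or inside build commands."""
    stripped = line.strip()
    if stripped.startswith(':'):
        # A ':' command such as ':system' or ':let'.
        name, _, args = stripped[1:].partition(' ')
        return ('command', name, args)
    if '=' in stripped:
        var, _, value = stripped.partition('=')
        return ('assignment', var.strip(), value.strip())
    return ('dependency-or-other', stripped, '')

print(dispatch(':system cmd and arguments'))
print(dispatch('VAR2 = value2'))
```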
index
13. What signatures to store for a recipe
When executing a recipe, signatures are used to check whether a file was
changed since the last time building was done.
This means the signature of a file from the previous build needs to be
remembered.
Since a file can be used in building several targets (e.g., a C header file)
and not all targets are always built in one execution, remembering the
signature of the file is not sufficient.
The signature needs to be remembered for each target.
Alternatives:
- Store one signature per target
-
This means that the signature of all files the target depends on are
combined and this combined signature is remembered.
Disadvantages:
- The user cannot be told which of the files caused the building to be
required.
- When the recipe is edited and dependencies added or removed,
building will always take place, even when the file didn't actually
change.
- Store a signature per target-source pair
-
This means that for each target a signature is remembered for each file it
depends on.
Disadvantages:
- The number of signatures to be remembered is the number of
targets multiplied by the number of files each target depends on.
This can be quite a lot.
There does not seem to be a reason to think that the size of the file to
remember the signatures becomes too large to handle or cause a significant
slowdown. Being able to tell which dependency caused the build rules to be
executed can be quite useful.
Choice: Store a signature per target-source pair
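A sketch of what storing a signature per target-source pair could look like
(class and method names are invented; A-A-P's actual storage format differs,
and contents are passed in directly instead of read from files to keep the
sketch self-contained):

```python
import hashlib

def signature(contents):
    # A real tool would read the file here.
    return hashlib.md5(contents.encode()).hexdigest()

class SignatureStore:
    """One signature per (target, source) pair, so the tool can also
    report which source caused a rebuild."""

    def __init__(self):
        self.sigs = {}                  # (target, source) -> hex digest

    def outdated_sources(self, target, sources):
        """Return the sources whose stored signature does not match."""
        return [src for src, contents in sources.items()
                if self.sigs.get((target, src)) != signature(contents)]

    def record(self, target, sources):
        """Remember the current signatures after a successful build."""
        for src, contents in sources.items():
            self.sigs[(target, src)] = signature(contents)
```

For example, after editing only common.h, outdated_sources("main.o", ...)
would report just "common.h", which a one-signature-per-target scheme could
not do.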
index
14. How to specify variants in a recipe
A variant specifies a way to build a target with different arguments or
options.
For example a "release" variant with optimized code, a "debug" variant with
debugging info and a "profiling" variant with timing hooks.
There are quite a few issues involved here:
- One list of variants or several lists that are combined?
A good example is combining the alternatives "debug" / "release" /
"profiling" with "Athena" / "Motif" / "GTK" / "none" (the kind of GUI
used). These could be specified as two lists of variants with three and
four alternatives, or one list with twelve alternatives.
Using one long list could mean that the same setting would be repeated, for
example setting CFLAGS to include either "-O2" for release or "-g" for
debug would have to be repeated for each GUI variant. This repetition is to
be avoided: there should be only one place where something is specified.
- Presenting the variants in the IDE.
The alternatives must be available in the IDE, where the user can make a
choice which variant to build. Variants which are disabled by a
configuration check should not be available though. Additionally, the
variants must be editable, for when no IDE is used.
This example shows how this can be achieved:
:variant GUI
none
motif check_library("Xm")
SOURCE += motif.c
CFLAGS += -DMOTIF
gtk check_library("gtk")
SOURCE += gtk.c
CFLAGS += -DGTK
|
The ":variant" command specifies the variable used to choose a variant.
Below it follows a list of the possible values for the variable.
Optionally a Python expression follows that specifies whether this value is
a possible choice. This allows the IDE to offer only the choices that will
actually work.
The commands below each possible value are commands specifically for this
variant (usually assignments, but it can be any command valid at the
toplevel). Since the variable is available throughout the recipe, it can
be used to make other things depend on the variant.
There can be several ":variant" commands, to get the effect mentioned in
the previous item.
- Storing build results that depend on variants.
When switching from "release" to "debug", all object files are normally
affected and need to be rebuilt. To avoid doing this each time the switch
is made, object files for "debug" and "release" are to be stored
separately. The obvious solution is to use a different directory for the
results of each variant (or combination of variants if there is more than
one choice). $BDIR can be used for this. A question is how to get the
results into that directory. See the alternatives below.
- Storing build results that don't depend on variants.
Some build commands produce results that don't depend on a variant. For
example when generating a tags file. Some build results do depend on one
variant selection (e.g., "release" / "debug") but not on another ("Motif" /
"GTK"). Rebuilding should be avoided when it is not needed. Somehow the
dependency on the variants needs to be specified, and the results stored in
such a way that they can be shared between variants. This would require a
directory that doesn't include all variant choices.
- Default rules should obey variants.
The default rules specify how to compile a C source file into an object
file, for example. When specifying variants the default rules should still
work. Setting a different value for a variable like $CFLAGS should make
this work. The use of $BDIR must also be supported (the object file
may be in a different directory than the source file).
- Implicit variants.
The same sources can be compiled for different systems. This is often done
by putting the sources on a networked file system and doing the compilation
while it is mounted on the current system. The system-dependent build
should then be executed, and the results stored separately for each system.
This is very similar to an explicit variant, but in this case it's the
person installing the system who wants to specify the variants, instead of
the developer.
Alternatives:
- Let the recipe writer explicitly handle it
-
The ":variant" command is used to specify what the variants are. Otherwise
no special support is provided. The recipe writer has to take care of
putting results of different variants in different places when needed.
- |
This shifts the burden of doing all the work to the recipe writer.
|
- |
It is easy to make a mistake and get inconsistent builds when
switching variants.
|
+ |
It is very flexible.
|
- Use repositories
-
This is how Make++ does it. The sources and "normal" results are stored in
one directory. To build a variant a new directory is created, and A-A-P is
started there, specifying the directory where the sources are and the
different arguments. The results of building are written in the local
directory. Sharing files that don't depend on the different arguments is
done by the usual "buildcheck".
- |
This doesn't offer the possibility to share files between two
groups of variants (e.g., debug and release that don't depend on
the GUI used).
|
- |
It is the responsibility of the user to specify the same arguments
each time while standing in the right directory. It's easy to
make mistakes, thus this quickly leads to using a shell script or
alias to do the building.
|
- |
This interferes with using the directory layout to organise a
project.
|
- Split up the recipe in a common part and a part for each variant
-
This is like using a repository, but including the different arguments for
the variant build in a recipe. Thus this requires writing a recipe
specific for a variant, which refers to the common recipe. Selecting a
variant is then done by executing in a specific directory.
- |
This doesn't offer the possibility to share files between two
groups of variants (e.g., debug and release that don't depend on
the GUI used).
|
- |
This quickly leads to duplication of settings that are common to
some of the variants but not all.
|
- |
This interferes with using the directory layout to organise a
project.
|
- Use a build directory per variant
-
This is the reverse of using a repository: Building is done where the
source files are, but a build directory is specified where all the
resulting files are placed. The main difference is that selecting the
build directory is specified in the recipe, thus making a mistake of being
in the wrong directory for a variant is avoided.
The implementation can be done in the same way: Change directory to the
build directory and use files from the original directory when possible.
However, this depends on the build commands and whether files are shared
between variants.
- |
Requires figuring out how to get the resulting files in the right
directory.
|
+ |
Very flexible, while still being able to do a lot of things
automatically.
|
Using a separate build directory for variants appears to be the best solution.
Since this is already used by many projects it should not be a surprise for
the user.
Choice: Use a build directory per variant
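A sketch of how a per-variant build directory could be derived from the
selected variant values (the naming scheme here is invented; A-A-P's actual
$BDIR layout differs):

```python
def build_dir(choices):
    """Compute a per-variant build directory from the selected variant
    values, e.g. {'MODE': 'debug', 'GUI': 'gtk'} -> 'build-gtk-debug'
    (values joined in sorted key order for a stable name)."""
    return 'build-' + '-'.join(choices[key] for key in sorted(choices))

print(build_dir({'MODE': 'debug', 'GUI': 'gtk'}))   # build-gtk-debug
```

Results that do not depend on some variant would use a directory derived
from only the variants they do depend on, so they can be shared.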
index
15. Interface between Recipe Executive and Dependency Checker
The Recipe Executive invokes the dependency checker to automatically detect
the dependencies of a target file on other files. Mostly these are included
header files, but it could be any file that the target file depends on.
One of the goals is that dependency checkers will exist for many languages.
To be able to re-use tools that exist for makefiles, it must be possible to
invoke external tools. For C code, for example, "gcc -MM" can be used.
For some languages a Python function can be written (which in turn may invoke
an external tool).
[old decision: use a Python function, specified with a variable. This was
changed 2002 Aug 13, because the action mechanism is more generic.]
To keep this flexible and allow the user to define what dependency checker to
use, a "depend" action is to be defined for each type of file. This action
gets the name of the file to check as source and the name of the file to
produce the results in as target. The resulting file must be a recipe (see decision 8).
The Recipe Executive will have a number of depend actions defined by default.
The user can define new ones with the ":action" command. This can be done in
any of the recipes that are read on startup.
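As an illustration, a "depend" action for C could run "gcc -MM" and convert
the captured output into a dependency line for the result recipe. A
simplified converter (invented helper; real "gcc -MM" output may contain
several rules):

```python
def mm_to_recipe(mm_output):
    """Turn "gcc -MM" style output into a single dependency line for
    the result recipe.  Backslash-newline continuations are joined
    first, then the target and sources are split on the first ':'."""
    joined = mm_output.replace('\\\n', ' ')
    target, _, sources = joined.partition(':')
    return '%s : %s' % (target.strip(), ' '.join(sources.split()))

print(mm_to_recipe('main.o: main.c \\\n common.h version.h'))
# -> main.o : main.c common.h version.h
```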
index
16. Where to store signatures for a target
The Recipe Executive checks if a target is outdated by comparing the
signatures of the sources and the build command with the signatures used the
last time the target was built. This means the signatures are to be
remembered somewhere. The question is where.
What needs to be stored is a list of signatures for each source used to build
the target. The method used to compute the signature also needs to be
remembered (it may change). Additionally the signature for the build command
itself needs to be remembered. There must be only one place where the
signatures are stored, to avoid ambiguity when deciding whether the
target is outdated.
Alternatives:
- Use one central file
-
The question then becomes: Per user, per project or per system? This is
impossible to answer, since a target may be generated by several users
working in a group, be a part of several projects and used over a network
from different systems. This is not a solution.
- Relative to the target
-
This should work very well for targets that are stored in the directories
for the project. But when targets are installed in various directories on
the system or even on other systems we don't want signature files to be
created there as a side effect. Also, a virtual target does not have a
directory and access to remote files can be very slow.
- Relative to the main recipe
-
A project may consist of several modules, where building can be done for
each module separately or for all modules at once, using a recipe that
invokes the recipes for each module. This means the main recipe is
different, even though the building is done in the same way. This does not
give a consistent result.
- Relative to the recipe where the target is first mentioned
-
This has a similar problem as the previous solution: For a project split in
modules the target might be mentioned in a project recipe and a module
recipe, which can be invoked separately.
- Relative to the toplevel recipe for the project
-
When a project consists of several modules (possibly in a directory tree),
find the recipe at the toplevel. This should avoid the problems of the
previous two alternatives. There still is a problem when two projects
share the same module: it's ambiguous what the toplevel recipe is then.
- Relative to the last recipe where the target is mentioned
-
In other words: At the deepest level of nesting recipes that uses the item
as a target. This has a problem when the item is only defined as a source
and a generic rule is used to build it. The item may be used as a source
in several recipes, making it unpredictable which one was the last recipe
to use it.
- Relative to the recipe with build commands for the target
-
This should work well for targets that have a dependency with build
commands. However, when a generic rule applies, there is no direct
relation between the recipe that defines the rule and the recipe that
invokes it. Especially when an ":include" command was used.
The alternatives all have some drawbacks. A combination of them is more
complicated, but is the only way to reduce the drawbacks.
Using the directory relative to the target has the lowest number of
problems. It works reliably for targets that are produced within the
project, no matter what combination of module and project recipes is used.
Thus let's use this when it is possible.
An exception is made for these targets:
- Virtual targets.
- Targets that are installed somewhere on the system. These can be
recognized by an absolute path being used. An additional check is whether
the target directory is not below the main recipe.
- Remote targets.
For these targets another location must be chosen. Considering the above
alternatives, the directory of the most relevant recipe for the target is to
be used. This algorithm can be used to decide which recipe that is:
- If there are explicit build commands for the target, use the recipe
with these build commands.
- If there is a dependency for this target, use the recipe that defines
this dependency. If there are several dependencies, use the first
one.
- Use the recipe in which the target is first used as a source.
This is not a perfect solution. To allow the user to get correct results the
"signdirectory" attribute can be used to force using a specified directory.
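The combined rule can be summarized in a few lines (a simplified sketch with
invented parameter names; the real rule also walks the recipe-selection
algorithm above and honours the "signdirectory" attribute):

```python
import os.path

def signature_dir(target, recipe_dir, is_virtual=False):
    """Keep signatures next to the target when it is a normal file
    inside the project; for virtual, absolute (installed) or remote
    targets fall back to the directory of the most relevant recipe."""
    remote = '://' in target
    if is_virtual or remote or os.path.isabs(target):
        return recipe_dir
    return os.path.dirname(target) or '.'

print(signature_dir('src/main.o', '/proj'))       # src
print(signature_dir('/usr/bin/foo', '/proj'))     # /proj
```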
index
17. Automatic fetching of files
Files that are downloaded or obtained from a version control system become
outdated when the original file is changed. The question is when A-A-P should
check for an updated version to be available and whether to obtain it
automatically. This applies mostly to the Recipe Executive. The IDE may also
run into this issue, although it probably uses the Recipe Executive to take
care of fetching files.
Note: before October 9 2002 "refresh" was used instead of "fetch".
There are basically two methods to handle changes in a file:
- Use the same file name for all versions. A version number is
remembered elsewhere (or not at all).
- Include a version number in the file name. Once a file has been
released with this name, the contents will never change.
Files falling under the first method may need to be fetched. That is what
this decision is about. Files of the second type only need to be obtained
once. Although this means we don't have to worry about fetching them, we
need to know that these files don't need to be fetched. A "fetch"
attribute can be used for this.
Alternatives:
- Automatic fetching
-
Each time commands for a dependency are executed, the sources may be
obtained when they are outdated.
There needs to be a way to specify the time after which a once obtained
file becomes outdated (e.g., once a day; extreme values are always and
never).
The main disadvantage of this method is that it is unpredictable what
happens. E.g., when trying to reproduce a problem or debugging an
application it is not clear if something happened because of something that
was done explicitly or because of a fetched file.
This could be avoided by specifying an option to avoid fetching for a
moment, but this requires the user being fully aware of how this works.
Another disadvantage is that a change in one file may depend on a change in
another file. Obtaining only one of the two files will cause problems.
It's difficult to specify the dependency between the two versions, so that
both are fetched at the same time.
- Manual fetching
-
This requires the user to tell A-A-P to fetch the files now.
A command such as "aap fetch" could be used.
A disadvantage is that the user has to separately invoke the fetch
action before building. This can be avoided by specifying an action that
includes both fetching and building.
Since predictable and reproducible builds are most important, only manual
fetching will be supported.
A recipe may still include commands to do its own automatic updates.
index
18. How to specify the origin of a file
Files may be obtained from a remote machine by a URL or from a Version
Control System (VCS), either locally or remote.
There are several kinds of files for which the origin needs to be specified:
- The recipe itself. This makes it possible to update the recipe before
the rest of it is executed.
- Child recipes, used in the ":child" command.
- Sources of a dependency. These do not require specifying a local name,
it will be automatically chosen (possibly the name under which the file is
cached).
- Other files.
A simple and consistent method for specifying the origin is desired.
In most cases the local file name must also be specified, because it may
differ from the remote file name. An exception is when using a file as the
source in a dependency that is specified with a URL, the local file name can
be automatically generated (could be the name of the file in the cache).
However, this requires a few tricks to make the generated name available to
the build commands.
Alternatives:
- Use a specific command
-
The arguments for the command would be the method to obtain the file,
arguments needed for that method and a list of files that can be obtained
with this method.
For this alternative every file needs a local name, also ones specified by
a simple URL.
The name of the command isn't important, let's use ":getmethod" for now.
Examples for the four kinds of files:
- :getmethod url {url = ftp://ftp.foo.org/pub/myproject.aap} self
- :getmethod url {url = http://foo.org/myproject/myproject.aap} foo.aap
- foo.o : common.h
:getmethod url {urlbase = ftp://foo.com/pub/files/myproject/} common.h
- :getmethod cvs {server = :pserver:user@cvs.foo.org:/cvsroot/myproject} foo.c
To avoid having to use one ":getmethod" command for each file obtained by URL,
"urlbase" specifies the part of the URL that is to be prepended to the file
name.
The main advantage is that the method and its attributes only need
to be specified once for a list of files.
Another advantage is that the information about the origin is
separated from the dependencies.
It would be possible to allow a file to appear in more than one command.
This would mean there are several ways to obtain the file.
A disadvantage is the clumsy way the origin for the recipe itself is
specified. In the example above the "self" name was used to work around
this. This could be solved by using a separate command to specify the
origin of the recipe itself.
A complication is that a ":child" command may be used before the ":getmethod"
command that specifies the origin of the recipe used. This leads to
confusion, since toplevel commands of a recipe are normally executed from
start to end. This needs to be explained in the documentation.
- Use the "origin" attribute
-
A normal URL can be used for files obtained with methods like ftp and http.
For a file in a VCS a special kind of URL is to be used.
Examples for the four kinds of files:
- :recipe {origin = ftp://ftp.foo.org/pub/myproject.aap}
- :child foo.aap {origin = http://foo.org/myproject/myproject.aap}
- foo.o : ftp://foo.com/pub/files/myproject/common.h
- SOURCE = foo.c {origin = cvs://:pserver:user@cvs.foo.org:/cvsroot/myproject}
For a source file in a dependency a local file name is not required, but it
is possible.
To specify alternative locations, a list of items can be used. When one
item fails, the next one is tried.
The text for the "origin" attribute can become quite long, but this can be
solved by using a variable.
When obtaining several files from one directory, we don't want to specify
the origin of each file. This can be solved by using the special
string "%file%" instead of the actual file name.
An advantage is that no new mechanism or command is needed.
A disadvantage is that the information about the origin is mixed in
with the dependencies. This could be avoided by using a separate command
to add attributes (e.g., ":attr {origin = ...} file"), and putting these
commands in a separate section in the recipe.
Another disadvantage is that all the information is packed into one
attribute. It will be complicated to change.
- Use several attributes
-
Instead of using one "origin" attribute that contains everything, use
different attributes for each method. For example, "cvsserver" can be used
to obtain files from a CVS repository. "origin" can still be used for
ordinary URLs.
Just like with the "origin" attribute, the ":attr" command can be used to
specify the attributes in a separate section of the recipe.
The main advantage over using just "origin" is that it is more
flexible. Future extensions are easier to do. Extra parameters for a
specific method can be added.
A disadvantage is that the order in which alternatives are tried can
not be specified.
For making the choice here simplicity is the crucial issue. Adding special
commands for specifying the origin is more complicated and does not really
make the recipe simpler. The mechanism for using attributes is present
anyway. This does require ensuring that the attributes themselves do not
become too complicated.
Choice: Use the "origin" attribute for specifying a list of alternate
origins and allow adding extra attributes to specify extra arguments for
specific methods. Add the ":attr" command to be able to set the attributes
separately from specifying the dependencies. Allow the user to specify the
commands for fetching explicitly and avoid an automatic download.
Note: Later the "origin" attribute has been renamed to "fetch".
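The "%file%" mechanism described above amounts to simple substitution; a
sketch (the later name "fetch" would work the same way):

```python
def expand_origin(origin, filename):
    """Expand the "%file%" placeholder in an origin attribute, so one
    attribute value can serve a whole list of files."""
    return origin.replace('%file%', filename)

print(expand_origin('ftp://foo.com/pub/files/myproject/%file%', 'common.h'))
# -> ftp://foo.com/pub/files/myproject/common.h
```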
index
19. Specify actions with Python or not
Actions are triggered with the ":do" command, based on filetypes. They make
it possible for the user to configure how an action is executed. For example,
how to build a program from a fortran file or how to edit an HTML file.
Many actions will be defined by default. The files for this will be part of
the A-A-P distribution. A user can override the defaults and define
additional actions. The question is which format is to be used for defining
these default and user specified actions.
Alternatives:
- Python
-
When defining the action with Python script, there will still be a need to
execute recipe commands. For example, executing system commands with
os.system() does not take care of logging the output, ":sys" does do this
and also takes care of expanding variables in a way required for the shell
used. The "aap_cmd" function can be used for this.
Example:
:action build fortran
if globals().has_key("target"):
aap_cmd(":sys f77 -o $target $source")
else:
aap_cmd(":sys f77 $source")
|
The disadvantage is that a strange Python function is used, making this
difficult to understand both for experienced Python users and for recipe
writers.
- recipe commands
-
Example:
:action build fortran
@if globals().has_key("target"):
:sys f77 -o $target $source
@else:
:sys f77 $source
|
The main advantage here is that it looks the same as build commands in a
rule or dependency. A disadvantage is that quite a few Python lines are
used, which makes many lines start with '@'. However, this can be
solved by putting ":python" before all the commands.
- specific file format
-
This means actions cannot be defined inside a recipe, a separate file is to
be used. Although this is fine for the default actions, it would cause
trouble for user specified actions that are used in a single recipe. A
recipe command would need to be used to load the action file, for example
":actionfile filename", even when the action can be specified in a
couple of lines. Another disadvantage is that yet another file
format is to be learned by the user. There do not appear to be advantages.
One other issue is potentially involved here: For the IDE another type of
action is to be defined, which allows communication with an application after
it has been started. This probably requires using Python to define the
interface, since it can be quite complex to define the glue between A-A-P and
the application. Now suppose a user wants to tell A-A-P he wants to use
editor xyz for all kinds of edit actions. It would be nice if this can be
done by dropping one plugin file in his A-A-P configuration directory.
Assuming the actions are defined with a recipe, this would require a recipe
command to define the interface. It should be possible to do this, for
example with an ":interface" command that accepts an argument with a Python
class name. The class can be defined elsewhere in the recipe below a
":python" command. Thus the desire to use a single plugin file does not
influence the choice how to define actions.
Choice: Use recipe commands to define actions.
index
20. Strip comments from C files for signatures
When a comment in a C file is changed, it doesn't need to be compiled again.
To accomplish this the signature for the C file can be computed after
stripping comments and spans of white space.
Advantages and disadvantages:
+ |
Compilation can be skipped when only comments are changed. If the
resulting object file would change (e.g., because of a timestamp)
unnecessary linking is also skipped.
|
- |
Computing the signature becomes slower. Experiments show that doing the
stripping with Python causes this to be what aap spends most of its time
on. A C program is much faster, but is not portable.
|
- |
When publishing files, the changes in comments are relevant.
This would require the user to overrule the default signature check, or
some complicated automatic mechanism that makes a guess whether comments
are relevant or not.
|
Since compiling C files mostly isn't that slow, and quite often files have not
changed at all, the slowdown for the comment stripping is the most important
argument.
An alternative would be to store the MD5 signature as well, and only strip
comments when it has changed. This would mean storing two signatures per file
and the disadvantage for publishing still exists.
Choice: Don't strip comments by default, allow the user to do it when
he wants to.
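For reference, the stripping this section discusses could be sketched like
this (simplified: comment markers inside string literals are not handled,
which is one reason the real check is harder than it looks):

```python
import hashlib
import re

def stripped_signature(c_source):
    """Signature of C source with comments and white-space runs
    removed, so comment-only edits leave it unchanged."""
    text = re.sub(r'/\*.*?\*/', ' ', c_source, flags=re.S)   # /* ... */
    text = re.sub(r'//[^\n]*', ' ', text)                    # // ...
    text = re.sub(r'\s+', ' ', text).strip()                 # collapse blanks
    return hashlib.md5(text.encode()).hexdigest()

same = (stripped_signature('int x; /* old */') ==
        stripped_signature('int x;   /* new */'))
print(same)   # True
```

The cost of running such a pass over every source file on every check is
exactly the slowdown the decision above weighs against the benefit.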
index
21. Give command options as "-o" or as "{option}"
Many commands like ":copy" and ":delete" have options to modify how the
command works. There are two ways to specify these options.
- The traditional way:
-o |
--option |
-o value |
--option=value |
- Like attributes:
{o} |
{option} |
{o = value} |
{option = value} |
The traditional way looks more like shell commands. For example
":copy -f foo bar" forcefully overwrites a file.
With attributes it looks more like dependencies and rules:
":copy {force} foo bar".
In the future, a command might use both options and attributes. Mixing two
styles would look messy then. Example: ":command -f file {force}".
Consistency between commands is more important in a recipe than similarity with
shell commands. Trying to use the same options as shell commands actually
creates expectations with the user that the same options are supported and
that they work in the same way. This can be confusing.
Choice: Specify options like attributes: {o}, {option} and {option = val}
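A sketch of parsing the chosen attribute-style options (a regex-based,
invented helper; the real recipe parser also handles nesting and
$-expansion):

```python
import re

def parse_options(cmdline):
    """Split "{option}" and "{option = value}" style options from a
    command line; returns the options and the remaining words."""
    options = {}

    def grab(match):
        name, sep, value = match.group(1).strip().partition('=')
        # A bare "{option}" becomes True, "{option = value}" keeps value.
        options[name.strip()] = value.strip() if sep else True
        return ''

    rest = re.sub(r'\{([^}]*)\}', grab, cmdline)
    return options, rest.split()

print(parse_options(':copy {force} foo bar'))
# -> ({'force': True}, [':copy', 'foo', 'bar'])
```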
index
22. Distributing dependency files
A C source file is scanned for included files to find the files it depends on.
The result of the dependency scan is stored in a file, to avoid having to do
it each time.
The dependency scan can be done in two ways. One way is to ignore #ifdef
statements, thus finding all files that are possibly included, also ones that
are not used by the current configuration. The other way is to handle #ifdef
statements like the compiler, the resulting dependency file is only valid for
the specific configuration. When anything changes (e.g., compilation options,
selected variant) a new dependency file has to be generated.
When distributing the C source files it might be desirable to include the
generated dependency files. This is especially true when generating the
dependency files requires a tool that is not generally available. This
requires that the file name of the generated file is known to the user and must
not depend on the configuration (e.g., the name of the platform). This can
only be true when ignoring #ifdef statements. Also, the buildcheck needs to
be ignored, since arguments like $CPPFLAGS often change and would require
generating the dependency files again anyway. However, when changing an
argument like "-Ipath" the actually used include files might change, and the
dependency file must be regenerated.
Since it's unpredictable when generated dependency files become outdated,
distributing them is not a very good idea. The alternative is to make sure
they can always be generated. Using the C compiler to do the dependency check
is the best way, because it knows the details about how include paths are
handled and knows about predefined preprocessor symbols. When the C compiler
cannot do this, a Python function can be used instead.
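The Python fallback mentioned here can be illustrated with a minimal sketch: a
scanner that finds the local #include lines in a C source while ignoring #ifdef
statements, so the result is configuration-independent. The function name and
behavior are illustrative, not Aap's actual aap_depend_c implementation.

```python
import re

# Matches #include "file.h" lines.  System includes in <...> are left to
# the compiler; #ifdef blocks are deliberately not interpreted, so every
# possibly-included file is reported.
_INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)

def scan_includes(source_text):
    """Return the local include files mentioned in a C source."""
    return _INCLUDE_RE.findall(source_text)

example = '''
#include <stdio.h>
#include "config.h"
#ifdef HAVE_GUI
#include "gui.h"
#endif
'''
print(scan_includes(example))  # ['config.h', 'gui.h']
```

Note that, as discussed above, this reports "gui.h" even when HAVE_GUI is not
defined; that is exactly what makes the result independent of the
configuration.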
Choice: Do not distribute dependency files, make sure they can be
generated.
23. Distributing generated files
There are many tools that generate a long file from a short description.
A good example is "autoconf", which produces a "configure" shell script from
"configure.in".
Even though the generated file is not directly edited by the programmer, it
should be distributed. Either because the user might not have the tool to
generate the file again or because it takes too much effort.
Since A-A-P stores signatures to keep track of changed files, simply
distributing a generated file will not work. There is no signature for the
file, thus when running A-A-P it will attempt to generate it. A-A-P does not
know that the up-to-date file has replaced the old version. The question is how
A-A-P can be told that this generated file is up-to-date.
- Using "newer" checks instead of signatures. This would work, just like it
does for "make". However, care should be taken that the timestamp for the
generated file is actually newer than the file(s) it depends on. This is not
always easy when packing and unpacking distributed files. It often requires
the script that is used for packing a distribution to "touch" the generated
file.
- Add an attribute "distributed" to the generated file. This has the meaning
that A-A-P knows it does not need to be generated. However, when the user
makes a change that does require generating the file, he would have to
explicitly force it. Since the user might not know about the "distributed"
attribute, this quickly leads to unexpected problems.
- Add the signature file to the distribution. This file contains the updated
signatures for the build commands that produce the generated file. The build
commands will only be executed again when appropriate, e.g., when one of the
input files changes.
However, the signature file also contains updated signatures for other
targets. When updating from a previous version, the signature file may
indicate that a target is up-to-date, while the updated file is not
distributed and thus actually needs to be built.
- Include the relevant signatures with the generated files. This can work by
specifying a file name to store the signatures in with the "signfile"
attribute. It is up to the recipe writer to make sure the relevant files are
included with the distribution with the right version. There is a small
chance for a problem when a user updates only some of the files (e.g.
using CVS over a bad connection). But this doesn't appear to be a larger
problem than other situations where fetching only some of the files can lead
to an inconsistent situation.
Choice: Use the "signfile" attribute to specify a distributed file that
contains signatures. The developer may also use "newer" checks if he prefers.
24. Scope of variables
The normal way variables are used is that they are passed down from a recipe
to its children and to invoked build commands. They are normally not passed
back to the parent or invoker, thereby creating different scopes. This is
good for variables like CFLAGS, which can be changed in a child recipe for
dependencies there, while the parent recipe is unaware of this and continues
to use the old value.
Sometimes a variable must be passed back to the invoker or parent. The
":export" command is used for this. However, this doesn't always achieve the
desired effect, as the following example will show.
Assume the default.aap recipe contains this:
HASGCC = no
:action depend c
:update gcccheck
@if HASGCC == "yes":
:sys $CC $CPPFLAGS $CFLAGS -E -MM $source > $target
@else:
@ aap_depend_c(globals(), source, target)
gcccheck {virtual} :
@if some_condition:
HASGCC = yes
:export HASGCC
|
When Aap is executed with a main.aap recipe, which invokes a child.aap recipe
that invokes the action, there are these scopes:
- toplevel: used by default.aap and main.aap
- child: used by child.aap
- action: used by the ":action depend c" build commands
- gcccheck: used by the "gcccheck" target build commands
Since the variables in a scope are passed on to a deeper level, the
gcccheck scope will have all the variables used in toplevel.
But when in gcccheck the HASGCC variable is given a new value, this
would be lost as soon as the "gcccheck" build commands have finished and this
scope is left. The ":export HASGCC" command is used to pass the value on to
higher scopes. The question is which ones.
It is clear that the value of HASGCC should at least be passed to the
action, since it uses the ":update gcccheck" command to obtain the
value of HASGCC. The generic rule is that an exported variable is passed to
the scope from where the build commands were invoked.
Since the "gcccheck" target is built only once, the resulting value of HASGCC
must be remembered for the next time. The scope of the recipe where the
action and "gcccheck" target are defined is the logical place to do this. The
rule to achieve this is that an exported variable is also passed to the scope
of the recipe in which the build commands were defined.
Now the problem: When the "gcccheck" target is built, the value of HASGCC is
exported to the action and toplevel scope, but not the
child scope. In the child scope the value of HASGCC is "no",
since that was the value it obtained from the toplevel scope when the
child recipe was read. Each time the child invokes the action, it will pass
the value "no" on to the action. How do we make the action use the
HASGCC value from the toplevel scope that was set by the "gcccheck"
target?
Before considering alternatives, it is required to mention another situation
which must also work properly. Assume child.aap contains this:
CFLAGS += -Iproto
foo.o : foo.c
:sys ccc $CFLAGS -o $target $source
|
The CFLAGS variable was given a default value in default.aap and is given a
new value in child.aap. Child.aap also defines a dependency, which uses
CFLAGS in the build commands. When executing these build commands the value
of CFLAGS in the child scope needs to be used. But the dependencies
are executed from main.aap, which uses the toplevel scope. How do we
make the build commands of the dependency use the CFLAGS value from the recipe
it was defined in?
The CFLAGS issue can be solved by making the variable values from the recipe
(child) overrule the variable values from where it was invoked
(toplevel). For the action this causes problems, because it was
defined at the toplevel but should use the CFLAGS value from where it
was invoked, which is child. To avoid this it would be possible to
make it work differently for dependencies and actions. But that is confusing,
the build commands should work in the same way. When also considering
rules it becomes unclear how it should work for these. It's better to make
it work in the same way for all kinds of build commands.
Alternatives:
- recipe scope overrules invoker
The variable value from the recipe where the build commands were specified
overrules the value from where the commands were invoked. When the value
from invoking commands is needed, this must be specified somehow. The problem
with this is that it is not always known what variables the build commands
will use, especially when using ":do" to start actions. This is not a good
solution.
- invoker overrules recipe scope
The variable value from the invoking commands overrules the value of the
recipe where the build commands were defined. When the value of the recipe is
to be used this must be specified in some way. The problem with this is that
for dependencies there can be a long list of variables that are used. Having
to repeat this list for every dependency is not good. It is also very
unexpected for people who are used to a Makefile.
- assignment changes overruling
Only when a variable is set in a recipe does the value from the recipe scope
overrule the value from where the commands were invoked. This mostly
works well: HASGCC was set in default.aap, thus the action defined there uses
the value from the toplevel scope instead of the value passed on from
the child scope. CFLAGS wasn't set here, thus the action uses the
CFLAGS value from the child scope.
A flaw is that when CFLAGS is given a default value in default.aap, this
suddenly changes what happens. This requires specifying that CFLAGS is a
global variable. The ":global CFLAGS" command can be used for this. This is
similar to Python, where a variable is explicitly declared global instead of local.
As an extra convenience an assignment with "?=" should have the same effect,
since it may use the global value. For completeness the ":local" command can
be used to explicitly specify a variable uses the recipe scope value.
- specify overruling
Like the previous alternative, but instead of implicitly changing the use of a
variable when it has been assigned a value, specify this with a command.
A disadvantage is that this requires the recipe writer to specify it. The
list of variables used in dependencies can become quite long. It is
unexpected for people who are used to a Makefile.
- export to all scopes
Make the ":export" command pass the variable value to all higher scopes. Then
the value will be used again when the action or dependency is invoked.
However, this makes it nearly impossible to use a value for CFLAGS
in a child recipe different from the parent. This is not a good solution.
Simplicity is the most important argument here. It is preferred to make it
work as most people would expect, thus that assigning a value to a variable
has the effect of making that value overrule the value passed from an invoking
command.
Choice: assignment changes overruling, add ":global" and ":local"
commands for when different behavior is required.
UPDATE March 2003
After implementing the new way ":export" works, several implementation
problems popped up. Also, it is quite difficult to explain how it works.
A discussion on the a-a-p-develop mailing list resulted in the conclusion that an
explicit scope mechanism is much better. The main advantages are:
- An assignment can directly specify in what scope the variable is set.
In the example for HASGCC above, the "_recipe" scope is used to make clear
this variable is local to the recipe.
- When entering a new scope (a child recipe or executed build commands) it is
not required to copy all the existing variables into a new dictionary. The
scope mechanism can be implemented with a custom dictionary that looks up a
variable in the specified scope.
- The same scope mechanism can be used for other items, such as rules.
- Explicitly using scope names makes it possible for the user to define his own
scopes.
- When using a variable without a scope specified, it is still possible to
lookup the value in other scopes. When assigning to a variable without a
specified scope it can be made a local variable. This avoids mistakes.
New Choice: Use explicit scope names
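The custom-dictionary idea mentioned above can be sketched as a chain of
scopes: a plain lookup walks outward through the chain, while an assignment
with an explicit scope name (like "_recipe" in the HASGCC example) targets one
scope directly. The class and names below are illustrative, not Aap's actual
implementation.

```python
class Scope:
    """One level in a scope chain.  Lookups fall through to the parent;
    assignments without an explicit scope name stay local, which avoids
    mistakes."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.vars = {}

    def get(self, name):
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise KeyError(name)

    def set(self, name, value, scope_name=None):
        # "HASGCC = yes" -> local; "_recipe.HASGCC = yes" -> named scope.
        scope = self
        if scope_name is not None:
            while scope is not None and scope.name != scope_name:
                scope = scope.parent
            if scope is None:
                raise KeyError(scope_name)
        scope.vars[name] = value

toplevel = Scope("_recipe")
child = Scope("child", toplevel)
toplevel.vars["HASGCC"] = "no"
child.set("HASGCC", "yes", scope_name="_recipe")
print(toplevel.get("HASGCC"))  # "yes": visible next time the action runs
```

No copying of variables into a new dictionary is needed when a child scope is
entered, which was one of the main advantages listed above.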
25. Build options independent of variants
When variants are specified, some of the source files may not depend on which
variant is selected. For example, a "main.c" file does not depend on the GUI
variant and a "gui.c" file does. CFLAGS is changed for "gui.c" to include
"-DGUI=$GUI". This must not be done for "main.c", because the buildcheck
signature would be different for each variant and "main.c" would be compiled
again for different GUI variants, while this is not necessary. Using a
specific dependency for "main.c" would not be sufficient, because the
automatic dependency checking would still use the wrong value for CFLAGS.
The more general problem is: How to use different options for groups of source
files? We would like a solution that is simple to use and requires only a
small number of lines in a recipe, while it is powerful enough to handle many
different situations. Possible solutions are:
- Define an environment in which variables can be given a value, and specify the
environment to use for a source file. This is similar to what SCons uses.
When no environment is explicitly specified, the global variables are used.
In the example "main.c" would use the environment "nogui", in which CFLAGS has
a value that does not include "-DGUI=$GUI".
A possible way to implement this is using a dot between the environment name
and the variable name:
# Get the value of $BDIR and $CFLAGS before the GUI variant changes them.
nogui.BDIR = $BDIR
nogui.CFLAGS = $CFLAGS
:attr {env = nogui} main.c
:variant GUI
...
|
- Add the different variable values as attributes to the file. These attributes
can then overrule the global value of the variable with the same name.
Example:
# Get the value of $BDIR and $CFLAGS before the GUI variant changes them.
:attr {BDIR = $BDIR} {CFLAGS = $CFLAGS} main.c
:variant GUI
...
|
The advantage over using an environment is that it is more direct, there is no
extra name for the environment to be used. It also gives more freedom, every
file can have different variable values. A disadvantage is that there might
be a name clash between attribute names and variable names. This can be
solved by prepending something to the name, e.g., "var_":
:attr {var_BDIR = $BDIR} {var_CFLAGS = $CFLAGS} main.c
|
- Use a different variable name for storing the value and use attributes to
specify the different variable name to use. Example:
# Get the value of $BDIR and $CFLAGS before the GUI variant changes them.
NOGUI_BDIR = $BDIR
NOGUI_CFLAGS = $CFLAGS
:attr {BDIR_var = NOGUI_BDIR} {CFLAGS_var = NOGUI_CFLAGS} main.c
:variant GUI
...
|
This has similar advantages and disadvantages as the previous alternative.
A disadvantage is the extra indirection, which requires more variable names
and a new mechanism.
- Use a different filetype for a file and define actions to build the files with
different variables. For the example the files which require an extra
compilation option would be given another filetype, like this:
:attr {filetype = guifile} gui.c
:action depend guifile
CFLAGS += -DGUI=$GUI
:do depend $source {filetype = c}
:action compile object guifile
CFLAGS += -DGUI=$GUI
:do compile $source {filetype = c}
|
An advantage is that this does not introduce a new mechanism. But it is quite
verbose and requires knowing which actions are going to be used.
Choice: Use the second solution: using "var_" attributes to directly
specify a value for a variable. It is the most generic solution and does not
appear to have relevant drawbacks.
The first three solutions require specifying whether a variable value
can be obtained from the attributes of a source file (either through the
environment or directly). Since a block of build commands may be used for
several sources, the question arises how the variable from the environment or
attribute is used in the build commands. Three alternatives:
- Explicitly obtain a variable from a specific source, for example with a
Python function. Instead of $CFLAGS use `get_var("main.c", "CFLAGS")`.
- Use a command to obtain all variables from source file attributes. This
avoids having to know which variables can be overruled.
- Automatically always obtain all variables from source file attributes. Avoids
the extra command and makes it work even for rules and actions where
overruling was not explicitly enabled. When needed an attribute on the rule or
action can be used to disable the mechanism and a Python function to obtain
the global value of a variable.
Choice: Use the third solution: In most situations the overruling is
needed.
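The combined choice (a "var_" attribute on the file overrules the global
variable) can be sketched as a small lookup helper. The get_var name and the
dictionaries below are illustrative assumptions, not Aap's actual code.

```python
def get_var(name, globals_, attrs):
    """Resolve $name for one source file: a {var_NAME = value} attribute
    on the file overrules the global variable with the same name.  The
    "var_" prefix avoids clashes with other attribute names."""
    return attrs.get("var_" + name, globals_.get(name))

# "main.c" is variant-independent; "gui.c" uses the global, variant-
# specific value, as in the GUI example above.
globals_ = {"CFLAGS": "-O2 -DGUI=gtk", "BDIR": "build-gui"}
main_c_attrs = {"var_CFLAGS": "-O2", "var_BDIR": "build"}
gui_c_attrs = {}

print(get_var("CFLAGS", globals_, main_c_attrs))  # -O2
print(get_var("CFLAGS", globals_, gui_c_attrs))   # -O2 -DGUI=gtk
```

Because the overruling is automatic (the third alternative above), the build
commands do not need to mention which variables may be overruled.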
26. Scope of rules
When a rule is specified in a child, it is currently also used for nodes in
the parent recipe and other children. This makes it impossible to define
rules locally to a recipe and may lead to conflicting rules.
A complication is that all recipes are read before the first target is built.
Thus when following dependencies the rules in all recipes have been processed
and are available. Information about in which recipe a rule was defined is
currently not remembered.
What matters here is the ":child" command, not ":include" or ":execute".
Using an ":include" command works as if the rules defined in the included
recipe are defined in the including recipe. ":execute" doesn't carry over
rules at all.
A complication is that it depends on where the target was caused to be built.
There are several possibilities:
- From the command line. Usually the toplevel recipe specifies what to do for
these targets, but it's also possible to leave this to a child recipe,
especially if a specific file is mentioned, e.g.: "aap lib/foo.a". Not using
rules from a child recipe could prevent this from working.
- From a source used in a dependency. Then it is clear where the target was
triggered from: the recipe that contains the dependency. Rules defined in
this and parent recipes can be used. That rules in child recipes are ignored
is logical.
- From a source used in another rule. This rule could be defined in an
unexpected place, this location should not be used for deciding what rule to
use. The target that invoked the other rule can be used, repeating this until
a dependency or a command line target is found.
- An automatically generated dependency. Like with a source of a rule this can
be skipped and followed back to a command line target or dependency.
It appears the targets from the command line are difficult to work with. It
is not clear which rules should apply to them.
Alternatives:
- Use a rule only in the recipe where it was defined. For the command line
targets only the rules in the toplevel recipe are used.
- Use a rule in the recipe where it was defined and in its children. Do not use
a rule in a parent recipe. Thus using ":child" carries over the currently
defined rules from the parent to the children, but does not carry back rules
defined in the child recipe to the parent. For the command line targets only
the rules in the toplevel recipe are used.
- Use a rule in all recipes (as it is now).
- Use a rule in all recipes. When there are multiple rules that match, use the
one that was defined in the current recipe or a parent recipe, not one defined
in a child recipe. Thus a local rule overrules a global rule.
The last alternative is the most versatile. It is possible to define a rule
in a child recipe and use it for targets in a parent recipe or the command
line. At the same time it is possible to have a local rule in a child
recipe without disturbing rules defined in a parent recipe. A disadvantage is
that a rule is not really local by the normal meaning of scope rules in a
programming language.
The first alternative has the problem that a rule defined in the toplevel
recipe will not be used for targets in children. This could be useful in some
cases though.
The second alternative is mostly useful and works like most people expect, but
has the problem of not being able to handle a command line target with a rule
in a child recipe.
By using an option on the ":rule" command a rule can be made globally
available, thus this covers the third alternative as well. By using another
option, to keep the rule local, the first alternative can be covered as well.
Choice: Make the ":rule" command define a rule that is used for targets
in the current recipe and children. Add a "{global}" option to make the rule
used in all recipes. Add a "{local}" option to make the rule used only in the
current recipe, not in children.
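The chosen visibility rules can be sketched as a small predicate: by default a
rule applies in the recipe that defined it and in that recipe's children;
"{global}" widens this to all recipes and "{local}" narrows it to the defining
recipe only. The class and helper names are illustrative, not Aap's code.

```python
class Rule:
    def __init__(self, recipe, scope="default"):
        self.recipe = recipe   # recipe the rule was defined in
        self.scope = scope     # "default", "global" or "local"

def ancestors(recipe, parent_of):
    """The recipe itself plus all its parents, up to the toplevel."""
    chain = []
    while recipe is not None:
        chain.append(recipe)
        recipe = parent_of.get(recipe)
    return chain

def visible(rule, target_recipe, parent_of):
    """Default: the rule applies in its own recipe and in children.
    {global}: applies everywhere.  {local}: only in its own recipe."""
    if rule.scope == "global":
        return True
    if rule.scope == "local":
        return target_recipe == rule.recipe
    return rule.recipe in ancestors(target_recipe, parent_of)

parent_of = {"child.aap": "main.aap"}
print(visible(Rule("main.aap"), "child.aap", parent_of))  # True
print(visible(Rule("child.aap"), "main.aap", parent_of))  # False
```

A rule defined in a child is thus not carried back to the parent unless it is
marked "{global}", matching the choice above.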
27. How to define building at a high level
Building can be specified at two levels:
- With each build step specified separately. A list of C files would first
be compiled into object files, one dependency per C file. Another dependency
specifies how the object files are linked into the program.
- By specifying the edited sources and the final target. How the object
files are compiled and linked together is left to the rules built into
Aap.
Aap will support both levels. For the second or high level a choice must be
made how this is specified in a recipe. There are several alternatives:
- Use specific variable names.
So far the variables $SOURCE and $TARGET have been used. This allows only
one target per recipe, which is often not sufficient. To allow for more
targets the mechanism from automake can be used:
bin_TARGETS = prog1 prog2
prog1_SOURCES = source1.c source2.c
prog2_SOURCES = source3.c source4.c
|
Here "bin_" indicates that the target is an executable binary program.
Possibly "lib_TARGETS" could be used for static libraries and "dll_TARGETS"
for dynamic (shared) libraries.
Main disadvantage is that many variable names need to be used.
It is easy to make a typing mistake.
Also, the program names will have the restriction that they can only use
characters that are valid in a variable.
Another thing that could confuse a user is that the variables will be
available in child recipes, but should not be used there.
- Use a command.
Specify the kind of target to be built, the name of the target and the name of
the source files. This method is used by Boost.Build and SCons.
Example:
:program prog1 : source1.c source2.c
:program prog2 : source3.c source4.c
|
The ":program" command is used for an executable binary program.
There could be ":lib" (to build a static library) and ":dll"
(to build a dynamic or shared library).
The syntax is very similar to the ":rule" command, except that the names are
not patterns.
To be able to define settings for the build command, attributes can be used.
Since the syntax is similar to dependencies and rules, it can work the same
way and no new mechanism is introduced.
Obviously the second choice is simpler.
Choice: Use a command to specify high level building
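A ":program" line can be thought of as shorthand for the low-level
dependencies: one compile step per source file, then a link step. The sketch
below illustrates that expansion; the function name and the tuple format are
hypothetical, not the actual implementation.

```python
def expand_program(target, sources, objsuf=".o"):
    """Expand a high-level ":program target : sources" line into the
    low-level steps: compile each C file to an object, then link the
    objects into the program."""
    objects = [src.rsplit(".", 1)[0] + objsuf for src in sources]
    steps = [("compile", src, obj) for src, obj in zip(sources, objects)]
    steps.append(("link", objects, target))
    return steps

for step in expand_program("prog1", ["source1.c", "source2.c"]):
    print(step)
```

Since the expansion produces ordinary dependencies, no new mechanism is needed,
which is the main point of this choice.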
28. Expanding wildcards
So far wildcard expansion was specifically done with the Python glob()
function. This appeared in recipe examples too often. Many people
expect wildcards to be expanded automatically, without requiring a Python
function.
An alternative is to expand wildcards everywhere. Then the problem is
reversed: anywhere a '*', '?' or '[' is used expansion has to be avoided by
escaping the character. This is a problem in commands like ":print" where
wildcard expansion is not expected. Example:
:print Is [this] a *very* important message?
Similarly, doing expansion in an assignment has the same problem. At the
moment the assignment is done it is not clear if the argument is a list of
file names or not:
Headers = *.h
Message = Is [this] a *very* important message?
:attr {filetype = c} $Headers
:print $Message
The best solution is to expand the wildcards at the position where it is clear
that the argument is a file name or list of file names.
This system for wildcard expansion should work fine in most recipes.
In a few situations precautions have to be taken.
- When a variable like $Headers is used both in a place where it is expanded
(e.g., with ":attr") and a place where it isn't (e.g., with ":print") the user
might be confused. The only solution to this appears to be to clearly
document this.
- If a variable, like $Headers above, is used several times, the expansion is
done each time. This may be inefficient and even lead to different results if
files have been created or deleted. In that case the Python glob() function
can still be used.
- When files exist with wildcard characters in them (this is possible on Unix)
there must be some way to avoid the expansion. This can be done by putting
the wildcard character in square brackets: [*], [?] and [[].
- When a list of already expanded file names is used, and some of these names
may contain wildcard characters, the wildescape() function must be used to
take the names literally and avoid expanding the wildcards.
At the moment of writing the disadvantages are considered less important than
the advantage of being able to use wildcards directly. However, it does make
file name handling more complicated, thus it's not such a clear choice.
Choice: Expand wildcards directly
29. Design for configuration checks
Adding configuration checks requires knowledge about various peculiarities of
systems. Implementing and maintaining this is a lot of work. A choice has to
be made about how to design the checks in such a way that the amount of work
involved is not too high.
Many of the basic checks can be obtained from autoconf. This is implemented
in M4 macros and uses shell script, thus needs to be converted to Python to
make it work on non-Unix systems. The things that are in autoconf for
backwards compatibility can be left out.
Many checks require compiling and linking a C or C++ program. This should be
done in the same way as the actual building is done, so that all relevant
flags are used, especially the locations where header files and libraries can
be found.
It would be an advantage if other projects can use the same checks. We can
then work on the checks as a group, thus spread the work. But since the
building is done differently in each project (e.g., SCons) there is extra work
to implement a callback interface, so that the generic configure module
doesn't include the code for building the test programs.
A rough estimate is that the specific checks will grow rapidly and need a lot
of tuning for various systems. The callback interface should be quite stable,
since only a few methods for building are needed (preprocess, compile, link,
run). Thus the gain from sharing the checks themselves will be much bigger
than the extra work for implementing the callback interface.
Choice: Put the configure checks in a generic module, so that they can be
shared with other projects.
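The callback interface can be sketched as follows: the generic module defines
each check in terms of a small builder object that supplies the few building
methods, so a project (Aap, SCons, ...) only has to implement that object. The
names and the stand-in builder below are hypothetical illustrations.

```python
def check_header(builder, header):
    """Generic check: does a trivial program that includes the header
    compile?  All building goes through the project's builder callback,
    so project-specific flags and include paths are honoured."""
    program = '#include <%s>\nint main(void) { return 0; }\n' % header
    return builder.compile(program)

class FakeBuilder:
    """Stand-in builder for illustration only; a real one would invoke
    $CC with $CPPFLAGS and $CFLAGS, the same way the actual build does."""
    def __init__(self, known_headers):
        self.known = known_headers
    def compile(self, program):
        # Pretend compilation succeeds when every included header exists.
        headers = [line.split('<')[1].rstrip('>')
                   for line in program.splitlines()
                   if line.startswith('#include <')]
        return all(h in self.known for h in headers)

b = FakeBuilder({"stdio.h", "unistd.h"})
print(check_header(b, "unistd.h"))  # True
print(check_header(b, "gui.h"))     # False
```

The check itself contains no project-specific building code, which is what
makes it shareable.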
30. How to cleanup cached files
When files are downloaded they are cached. After using Aap for a while the
cache is filled with files, but they are not cleaned up without manual
intervention. Many people do not even know about the cache and never clean it
up.
Alternatives:
- Add a command to cleanup the cache
The main drawback is that many people do not know this is to be done, and
people who do know about it might want to run it often, up to after each
download (if they know the cached files won't be used again). That would
be annoying.
- Automatically cleanup the cache
This could be done when Aap is run for downloading and detects that some
files in the cache are older than a certain age (e.g. a month). Since
deleting a file doesn't take much time the user will hardly notice the
extra cleanup action.
The main drawback is that it is quite unpredictable how long a cached file
is valid. Some files may never be used a second time, some remain valid
for years. Since the cached files might not appear in the recipe currently
being executed, information about the usefulness of the file (e.g., the
"constant" attribute) is currently missing. If that information is
available it should be added in the cache index file.
Another disadvantage is that the cleanup is a side effect of running Aap.
When Aap is not run the cleanup doesn't happen. The user would have to run
Aap with a dummy recipe to cleanup the cache.
If this is implemented the user could set a limit for the size of the cache.
- Never use the cache
This is a simple solution, quite a bit of code can be removed. The
disadvantage would be that the same file may need to be downloaded several
times. How often does this occur?
- Most often an explicit ":fetch" command or using the "fetch" target
invokes a download anyway, bypassing the cache.
- When downloading through a method where the timestamp can't be
obtained the cache will not be used, because the remote file may already
have changed. Unless the "constant" attribute is used.
- A situation where the cache would actually be used is when a file
doesn't exist for one user and has already been downloaded by another
user. But this requires a directory writable for a group of users,
which has security problems (someone could put a trojan horse in the
cache), thus can only be used on a trusted system. And it never happens on
a system with one Aap user.
- Another situation is when a user cleans up everything and starts
building a program all over again.
Summary: Removing the use of the cache would cause extra downloads for some
kinds of files. This may be a problem for someone who uses many of these
files; he would have to do some caching manually.
- Only store a file in the cache when useful
This is a more complicated solution. The principle would be that when the
downloading is done the file is stored in the cache if the cache would be
used when the current operation is repeated. When a forced fetch is done,
the cached file won't be used next time, thus don't write it in the cache.
When fetching outdated files (using the ":mkdownload" command) cached files
will also not be used, thus don't cache them. What remains are files that
are downloaded when missing - if the download method allows checking the
timestamp - and for files with a "constant" attribute.
Choice: Use a mix: Only store a file in the cache when useful, add a
way to manually cleanup the cache when desired and automatically delete files
when the cache uses more than a certain amount of space.