File: [local] / src / usr.bin / make / make.h

Revision 1.24, Fri Jun 23 16:27:29 2000 UTC by espie
Branch: MAIN
Changes since 1.23: +10 -2 lines

This is the speed-up patch, which almost doubles the speed of make.

Use the open hashing functions for global contexts instead of List in
var.c.

All the preliminary work to trim down local contexts means that we don't
suffer from the heavy initialization work that a hash table entails.

There is some make kludgery to:
- build the hashing functions as a library,
- recreate hashconsts.h, even if make depend was not invoked.

One point of the hashing scheme is to separate the computation of the
hash function from the hash lookup itself. This is very convenient
for make, because of those pesky special variables. hashconsts.h is there
to pre-hash the correct values, which replaces a few expensive string
comparisons with quick hash value comparisons, followed by one expensive
string comparison. The modulus MAGICSLOTS chosen in the Makefile is
ad-hoc: it is small enough to write a small switch without collision,
and will need changing if the hash function changes...

The function quick_lookup is the most important:
it either returns an index, for a local variable, or it computes a
hash value and returns -1.
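
As a rough illustration of that contract, the sketch below returns a local
index or -1 and hands the computed hash back to the caller. It is not the
code from var.c: hash_interval is a hypothetical hash function (not the
one ohash uses), and the constants that hashconsts.h would provide at
build time are computed once at run time instead. Only the short local
names from make.h are handled, and the hash is always computed, which may
differ from the real function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Indices of local variables, in the same order as make.h. */
enum { TARGET_INDEX, OODATE_INDEX, ALLSRC_INDEX, IMPSRC_INDEX,
       PREFIX_INDEX, ARCHIVE_INDEX, MEMBER_INDEX, LOCAL_SIZE };

static const char *const local_names[LOCAL_SIZE] = {
    "@", "?", ">", "<", "*", "!", "%"
};
static uint32_t local_hashes[LOCAL_SIZE];
static int hashes_ready;

/* Hypothetical string hash over the bytes in [s, e). */
uint32_t
hash_interval(const char *s, const char *e)
{
    uint32_t h = 5381;

    while (s != e)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Stand-in for the generated hashconsts.h: compute the constants once. */
static void
init_local_hashes(void)
{
    int i;

    for (i = 0; i < LOCAL_SIZE; i++)
        local_hashes[i] = hash_interval(local_names[i],
            local_names[i] + strlen(local_names[i]));
    hashes_ready = 1;
}

/*
 * Return the local index for the name in [s, e), or -1. In both cases
 * the hash is stored in *pk so a global table lookup can reuse it.
 */
int
quick_lookup(const char *s, const char *e, uint32_t *pk)
{
    size_t len = (size_t)(e - s);
    int i;

    assert(pk != NULL);
    if (!hashes_ready)
        init_local_hashes();
    *pk = hash_interval(s, e);
    for (i = 0; i < LOCAL_SIZE; i++) {
        /* Cheap integer test first, one string comparison to confirm. */
        if (*pk == local_hashes[i] &&
            strlen(local_names[i]) == len &&
            memcmp(local_names[i], s, len) == 0)
            return i;
    }
    return -1;
}
```

A caller that gets -1 back would pass the stored hash straight to the
global table lookup instead of hashing the name a second time.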

Another somewhat controversial decision is the use of string intervals.
This avoids either copying a string, or twiddling with a byte for cases
such as ${VAR}.
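
A minimal sketch of the string interval idea follows. The struct and
function names are hypothetical, and the parser ignores modifiers and
nesting; it only shows how a name can be picked out of ${VAR} as a pair
of pointers, without a copy and without writing a NUL into the input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interval type: the name is the bytes in [s, e). */
struct interval {
    const char *s;
    const char *e;
};

/*
 * Locate the variable name in a reference such as "${VAR}" or "$(VAR)".
 * The input is left untouched. Returns 1 on success, 0 if str does not
 * start with a well-formed reference.
 */
int
name_interval(const char *str, struct interval *out)
{
    char close;
    const char *end;

    assert(str != NULL && out != NULL);
    if (str[0] != '$')
        return 0;
    if (str[1] == '{')
        close = '}';
    else if (str[1] == '(')
        close = ')';
    else
        return 0;
    end = strchr(str + 2, close);
    if (end == NULL)
        return 0;
    out->s = str + 2;       /* first byte of the name */
    out->e = end;           /* one past the last byte of the name */
    return 1;
}

/* Compare an interval against a NUL-terminated name. */
int
interval_equals(const struct interval *iv, const char *name)
{
    size_t len = (size_t)(iv->e - iv->s);

    return strlen(name) == len && memcmp(iv->s, name, len) == 0;
}
```

Because the interval carries its own end pointer, hashing and comparison
can work directly on the makefile line being parsed.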

Finally, the variable name is stored within the variable itself. Since
a given variable name never changes, this makes sense. All that was needed
was a hash library with support for this.  Note that the hashing table
holds only a variable pointer AND the corresponding hashing value, WITHOUT
a modulo hashtablesize. Two reasons:
- hash resizes can be done faster, without having to recompute hashing values.
- locality of access. The hash table fits into memory without problem. Once
a candidate slot is found, we check the complete hashing value. Probability
of a collision is very small (32 bits...). So bringing up the whole
variable in memory at once is good: the name will almost always match, in
which case we want the variable value as well, so it makes sense to put
them together.
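
The layout described above might look roughly like the following sketch.
All names are hypothetical, and linear probing is used only to keep the
example short; it is not claimed to be the probing scheme of the ohash
functions. Each slot stores the record pointer plus the full 32-bit hash,
so lookups can reject most candidates without touching the record, and a
resize can re-place entries without rehashing any name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The name lives in the same allocation as the value. */
struct var {
    const char *value;
    char name[];            /* flexible array member, NUL-terminated */
};

/* A slot keeps the record pointer and the full, unreduced hash. */
struct slot {
    struct var *v;
    uint32_t hv;
};

struct table {
    struct slot *slots;
    size_t size;            /* always a power of two */
};

struct var *
var_new(const char *name, const char *value)
{
    size_t n = strlen(name) + 1;
    struct var *v = malloc(sizeof(*v) + n);

    if (v != NULL) {
        v->value = value;
        memcpy(v->name, name, n);
    }
    return v;
}

/* Return the slot holding name, or the empty slot where it belongs.
 * Assumes the table always has at least one empty slot. */
size_t
find_slot(const struct table *t, const char *name, uint32_t hv)
{
    size_t i;

    assert((t->size & (t->size - 1)) == 0);
    i = hv & (t->size - 1);
    while (t->slots[i].v != NULL) {
        /* Compare hashes first; read the record only on a match. */
        if (t->slots[i].hv == hv && strcmp(t->slots[i].v->name, name) == 0)
            break;
        i = (i + 1) & (t->size - 1);
    }
    return i;
}

/* Double the table. Stored hashes are reused, so no name is rehashed. */
int
table_grow(struct table *t)
{
    size_t nsize = t->size * 2, i, j;
    struct slot *n = calloc(nsize, sizeof(*n));

    if (n == NULL)
        return 0;
    for (i = 0; i < t->size; i++) {
        if (t->slots[i].v == NULL)
            continue;
        j = t->slots[i].hv & (nsize - 1);
        while (n[j].v != NULL)
            j = (j + 1) & (nsize - 1);
        n[j] = t->slots[i];
    }
    free(t->slots);
    t->slots = n;
    t->size = nsize;
    return 1;
}
```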

The ohash functions implement open hashing, as described in Knuth, but with
a variable table size.  Choosing powers of 2 sizes does not yield more
collisions, but it makes the hashing scheme much simpler. The thresholds at
which to expand/shrink the tables seem to work well in practice. The
default sizes were chosen such that the tables hardly ever shrink or expand
anyway (though I've tried smaller/larger sizes to verify that the
shrinking/expanding worked correctly): larger Makefiles hold roughly
500/600 variables, which fits without trouble into a 1024-slot table.
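
One way to see why power-of-2 sizes simplify things: reducing a hash to a
slot becomes a bit mask, and any odd probe step reaches every slot. The
sketch below checks both properties. The function names and the fixed
step are assumptions made for illustration, not the probing rule used by
the ohash functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduce a full hash to a slot index; size must be a power of two. */
size_t
slot_index(uint32_t hv, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return hv & (size - 1);
}

/*
 * Count the distinct slots visited when probing from hv with a fixed
 * step. An odd step is coprime with a power-of-two size, so the probe
 * sequence reaches every slot before it repeats.
 */
size_t
slots_reached(uint32_t hv, size_t step, size_t size)
{
    size_t start = slot_index(hv, size);
    size_t i = start, count = 0;

    do {
        count++;
        i = (i + step) & (size - 1);
    } while (i != start);
    return count;
}
```

With a table of 16 slots, a step of 5 visits all 16 slots, while an even
step of 4 visits only 4 of them.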

Disregard #ifdef STATS_HASH: this is some internal scaffolding I'm using
to measure make performance.

The only known issue with open-hashing is that deletions cannot create
empty slots, but do leave slots marked as `occupied once' so that lookup
works.  We use a well-known optimization which records those pseudo-empty
slots while looking up values. If the value is not found, the pseudo-empty
slot is returned to be filled. If the value is found, it is swapped with
the pseudo-empty slot. This is an improvement in both cases, since this
shortens the length of lookup chains, eventually pushing the pseudo-empty
slots to the end.
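
The optimization can be sketched as follows, on a tiny fixed-size table of
string keys. Everything here is hypothetical (names, the table size, and
the use of linear probing); it only illustrates how a lookup remembers the
first pseudo-empty slot and either returns it for insertion or moves the
found entry into it. The loop assumes at least one truly empty slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIZE 8                          /* power of two, hypothetical */

static const char deleted_mark = 0;     /* its address marks "occupied once" */
#define DELETED (&deleted_mark)

/* A slot is NULL (empty), DELETED, or a pointer to a live key. */
const char *slots[SIZE];

/*
 * Look up key, probing from slot start. Returns the slot holding key,
 * or the slot where key should be inserted.
 */
size_t
lookup(const char *key, size_t start)
{
    size_t i = start & (SIZE - 1);
    size_t hole = SIZE;                 /* SIZE: no pseudo-empty slot seen */

    while (slots[i] != NULL) {
        if (slots[i] == DELETED) {
            if (hole == SIZE)
                hole = i;               /* remember the first one only */
        } else if (strcmp(slots[i], key) == 0) {
            if (hole != SIZE) {
                /* Found: move the entry up, leave the mark behind. */
                slots[hole] = slots[i];
                slots[i] = DELETED;
                return hole;
            }
            return i;
        }
        i = (i + 1) & (SIZE - 1);
    }
    /* Not found: reuse the pseudo-empty slot if there was one. */
    return hole != SIZE ? hole : i;
}

/* Deleting cannot empty the slot, or later keys in the chain get lost. */
void
delete_at(size_t i)
{
    assert(i < SIZE);
    slots[i] = DELETED;
}
```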

Reviewed by millert@ and miod@

/*	$OpenBSD: make.h,v 1.24 2000/06/23 16:27:29 espie Exp $	*/
/*	$NetBSD: make.h,v 1.15 1997/03/10 21:20:00 christos Exp $	*/

/*
 * Copyright (c) 1988, 1989, 1990, 1993
 *	The Regents of the University of California.  All rights reserved.
 * Copyright (c) 1989 by Berkeley Softworks
 * All rights reserved.
 *
 * This code is derived from software contributed to Berkeley by
 * Adam de Boor.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. All advertising materials mentioning features or use of this software
 *    must display the following acknowledgement:
 *	This product includes software developed by the University of
 *	California, Berkeley and its contributors.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 *	from: @(#)make.h	8.3 (Berkeley) 6/13/95
 */

/*-
 * make.h --
 *	The global definitions for pmake
 */

#ifndef _MAKE_H_
#define _MAKE_H_

#include <sys/types.h>
#include <sys/param.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <limits.h>	/* INT_MIN, used by OUT_OF_DATE */

#if !defined(MAKE_BOOTSTRAP) && defined(BSD4_4)
# include <sys/cdefs.h>
#else
# ifndef __P
#  if defined(__STDC__) || defined(__cplusplus)
#   define	__P(protos)	protos		/* full-blown ANSI C */
#  else
#   define	__P(protos)	()		/* traditional C preprocessor */
#  endif
# endif
# ifndef __STDC__
#  ifndef const
#   define const
#  endif
#  ifndef volatile
#   define volatile
#  endif
# endif
#endif

#ifdef __STDC__
#include <stdlib.h>
#include <unistd.h>
#endif
#include "sprite.h"
#include "lst.h"
#include "config.h"
#include "buf.h"

#define OUT_OF_DATE INT_MIN

/* Variables that are kept in local GNodes.  */
#define TARGET_INDEX	0
#define OODATE_INDEX	1
#define ALLSRC_INDEX	2
#define IMPSRC_INDEX	3
#define PREFIX_INDEX	4
#define ARCHIVE_INDEX   5
#define MEMBER_INDEX    6

#define LOCAL_SIZE	7

/* SymTable is private to var.c, but is declared here to allow for
   local declaration of context tables
 */
typedef struct {
	struct Var_ *locals[LOCAL_SIZE];
} SymTable;

typedef struct hash GSymT;
/*-
 * The structure for an individual graph node. Each node has several
 * pieces of data associated with it.
 *	1) the name of the target it describes
 *	2) the location of the target file in the file system.
 *	3) the type of operator used to define its sources (qv. parse.c)
 *	4) whether it is involved in this invocation of make
 *	5) whether the target has been remade
 *	6) whether any of its children has been remade
 *	7) the number of its children that are, as yet, unmade
 *	8) its modification time
 *	9) the modification time of its youngest child (qv. make.c)
 *	10) a list of nodes for which this is a source
 *	11) a list of nodes on which this depends
 *	12) a list of nodes that depend on this, as gleaned from the
 *	    transformation rules.
 *	13) a list of nodes of the same name created by the :: operator
 *	14) a list of nodes that must be made (if they're made) before
 *	    this node can be, but that do not enter into the datedness of
 *	    this node.
 *	15) a list of nodes that must be made (if they're made) after
 *	    this node is, but that do not depend on this node, in the
 *	    normal sense.
 *	16) a table of ``local'' variables that are specific to this target
 *	   and this target only (qv. var.c [$@ $< $?, etc.])
 *	17) a Lst of strings that are commands to be given to a shell
 *	   to create this target.
 */
typedef struct GNode {
    char            *name;     	/* The target's name */
    char    	    *path;     	/* The full pathname of the file */
    int             type;      	/* Its type (see the OP flags, below) */
    int		    order;	/* Its wait weight */

    Boolean         make;      	/* TRUE if this target needs to be remade */
    enum {
	UNMADE, BEINGMADE, MADE, UPTODATE, ERROR, ABORTED,
	CYCLE, ENDCYCLE
    }	    	    made;    	/* Set to reflect the state of processing
				 * on this node:
				 *  UNMADE - Not examined yet
				 *  BEINGMADE - Target is already being made.
				 *  	Indicates a cycle in the graph. (compat
				 *  	mode only)
				 *  MADE - Was out-of-date and has been made
				 *  UPTODATE - Was already up-to-date
				 *  ERROR - An error occurred while it was being
				 *  	made (used only in compat mode)
				 *  ABORTED - The target was aborted due to
				 *  	an error making an inferior (compat).
				 *  CYCLE - Marked as potentially being part of
				 *  	a graph cycle. If we come back to a
				 *  	node marked this way, it is printed
				 *  	and 'made' is changed to ENDCYCLE.
				 *  ENDCYCLE - the cycle has been completely
				 *  	printed. Go back and unmark all its
				 *  	members.
				 */
    Boolean 	    childMade; 	/* TRUE if one of this target's children was
				 * made */
    int             unmade;    	/* The number of unmade children */

    time_t          mtime;     	/* Its modification time */
    time_t     	    cmtime;    	/* The modification time of its youngest
				 * child */

    LIST     	    iParents;  	/* Links to parents for which this is an
				 * implied source, if any */
    LIST    	    cohorts;  	/* Other nodes for the :: operator */
    LIST            parents;   	/* Nodes that depend on this one */
    LIST            children;  	/* Nodes on which this one depends */
    LIST    	    successors;	/* Nodes that must be made after this one */
    LIST    	    preds;  	/* Nodes that must be made before this one */

    SymTable        context;   	/* The local variables */
    unsigned long   lineno;	/* First line number of commands.  */
    const char *    fname;	/* File name of commands.  */
    LIST            commands;  	/* Creation commands */

    struct _Suff    *suffix;	/* Suffix for the node (determined by
				 * Suff_FindDeps and opaque to everyone
				 * but the Suff module) */
} GNode;

/*
 * Manifest constants
 */

/*
 * The OP_ constants are used when parsing a dependency line as a way of
 * communicating to other parts of the program the way in which a target
 * should be made. These constants are bitwise-OR'ed together and
 * placed in the 'type' field of each node. Any node that has
 * a 'type' field which satisfies the OP_NOP function was never on
 * the lefthand side of an operator, though it may have been on the
 * righthand side...
 */
#define OP_DEPENDS	0x00000001  /* Execution of commands depends on
				     * kids (:) */
#define OP_FORCE	0x00000002  /* Always execute commands (!) */
#define OP_DOUBLEDEP	0x00000004  /* Execution of commands depends on kids
				     * per line (::) */
#define OP_OPMASK	(OP_DEPENDS|OP_FORCE|OP_DOUBLEDEP)

#define OP_OPTIONAL	0x00000008  /* Don't care if the target doesn't
				     * exist and can't be created */
#define OP_USE		0x00000010  /* Use associated commands for parents */
#define OP_EXEC	  	0x00000020  /* Target is never out of date, but always
				     * execute commands anyway. Its time
				     * doesn't matter, so it has none...sort
				     * of */
#define OP_IGNORE	0x00000040  /* Ignore errors when creating the node */
#define OP_PRECIOUS	0x00000080  /* Don't remove the target when
				     * interrupted */
#define OP_SILENT	0x00000100  /* Don't echo commands when executed */
#define OP_MAKE		0x00000200  /* Target is a recursive make so its
				     * commands should always be executed when
				     * it is out of date, regardless of the
				     * state of the -n or -t flags */
#define OP_JOIN 	0x00000400  /* Target is out-of-date only if any of its
				     * children was out-of-date */
#define	OP_MADE		0x00000800  /* Assume the node is already made; even if
				     * it really is out of date */
#define OP_INVISIBLE	0x00004000  /* The node is invisible to its parents.
				     * I.e. it doesn't show up in the parents'
				     * local variables. */
#define OP_NOTMAIN	0x00008000  /* The node is exempt from normal 'main
				     * target' processing in parse.c */
#define OP_PHONY	0x00010000  /* Not a file target; run always */
#define OP_NOPATH	0x00020000  /* Don't search for file in the path */
/* Attributes applied by PMake */
#define OP_TRANSFORM	0x80000000  /* The node is a transformation rule */
#define OP_MEMBER 	0x40000000  /* Target is a member of an archive */
#define OP_LIB	  	0x20000000  /* Target is a library */
#define OP_ARCHV  	0x10000000  /* Target is an archive construct */
#define OP_HAS_COMMANDS	0x08000000  /* Target has all the commands it should.
				     * Used when parsing to catch multiple
				     * commands for a target */
#define OP_SAVE_CMDS	0x04000000  /* Saving commands on .END (Compat) */
#define OP_DEPS_FOUND	0x02000000  /* Already processed by Suff_FindDeps */

/*
 * OP_NOP will return TRUE if the node with the given type was not the
 * object of a dependency operator
 */
#define OP_NOP(t)	(((t) & OP_OPMASK) == 0x00000000)

#define OP_NOTARGET (OP_NOTMAIN|OP_USE|OP_EXEC|OP_TRANSFORM)

/*
 * The TARG_ constants are used when calling the Targ_FindNode function in 
 * targ.c. They simply tell the function what to do if the desired node(s) 
 * is (are) not found. 
 * If the TARG_CREATE constant is given, a new, empty node will be created 
 * for the target, placed in the table of all targets and its address returned. 
 * If TARG_NOCREATE is given, a NULL pointer will be returned.
 */
#define TARG_CREATE	0x01	  /* create node if not found */
#define TARG_NOCREATE	0x00	  /* don't create it */

/*
 * There are several places where expandable buffers are used (parse.c and
 * var.c). This constant is merely the starting point for those buffers. If
 * lines tend to be much shorter than this, it would be best to reduce BSIZE.
 * If longer, it should be increased. Reducing it will cause more copying to
 * be done for longer lines, but will save space for shorter ones. In any
 * case, it ought to be a power of two simply because most storage allocation
 * schemes allocate in powers of two.
 */
#define MAKE_BSIZE		256	/* starting size for expandable buffers */

/*
 * These constants are all used by the Str_Concat function to decide how the
 * final string should look. If STR_ADDSPACE is given, a space will be
 * placed between the two strings. If STR_ADDSLASH is given, a '/' will
 * be used instead of a space. If neither is given, no intervening characters
 * will be placed between the two strings in the final output. If the
 * STR_DOFREE bit is set, the two input strings will be freed before
 * Str_Concat returns.
 */
#define STR_ADDSPACE	0x01	/* add a space when Str_Concat'ing */
#define STR_DOFREE	0x02	/* free source strings after concatenation */
#define STR_ADDSLASH	0x04	/* add a slash when Str_Concat'ing */

/*
 * Error levels for parsing. PARSE_FATAL means the process cannot continue
 * once the makefile has been parsed. PARSE_WARNING means it can. Passed
 * as the first argument to Parse_Error.
 */
#define PARSE_WARNING	2
#define PARSE_FATAL	1

/*
 * Values returned by Cond_Eval.
 */
#define COND_PARSE	0   	/* Parse the next lines */
#define COND_SKIP 	1   	/* Skip the next lines */
#define COND_INVALID	2   	/* Not a conditional statement */

/*
 * Definitions for the "local" variables. Used only for clarity.
 */
#define TARGET	  	  "@" 	/* Target of dependency */
#define OODATE	  	  "?" 	/* All out-of-date sources */
#define ALLSRC	  	  ">" 	/* All sources */
#define IMPSRC	  	  "<" 	/* Source implied by transformation */
#define PREFIX	  	  "*" 	/* Common prefix */
#define ARCHIVE	  	  "!" 	/* Archive in "archive(member)" syntax */
#define MEMBER	  	  "%" 	/* Member in "archive(member)" syntax */
#define LONGTARGET	".TARGET"
#define LONGOODATE	".OODATE"
#define LONGALLSRC	".ALLSRC"
#define LONGIMPSRC	".IMPSRC"
#define LONGPREFIX	".PREFIX"
#define LONGARCHIVE	".ARCHIVE"
#define LONGMEMBER	".MEMBER"


#define FTARGET           "@F"  /* file part of TARGET */
#define DTARGET           "@D"  /* directory part of TARGET */
#define FIMPSRC           "<F"  /* file part of IMPSRC */
#define DIMPSRC           "<D"  /* directory part of IMPSRC */
#define FPREFIX           "*F"  /* file part of PREFIX */
#define DPREFIX           "*D"  /* directory part of PREFIX */

/*
 * Global Variables
 */
extern LIST  	create;	    	/* The list of target names specified on the
				 * command line. Used to resolve #if
				 * make(...) statements */
extern LIST    	dirSearchPath; 	/* The list of directories to search when
				 * looking for targets */

extern Boolean	compatMake;	/* True if we are make compatible */
extern Boolean	ignoreErrors;  	/* True if should ignore all errors */
extern Boolean  beSilent;    	/* True if should print no commands */
extern Boolean  noExecute;    	/* True if should execute nothing */
extern Boolean  allPrecious;   	/* True if every target is precious */
extern Boolean  keepgoing;    	/* True if should continue on unaffected
				 * portions of the graph when have an error
				 * in one portion */
extern Boolean 	touchFlag;    	/* TRUE if targets should just be 'touched'
				 * if out of date. Set by the -t flag */
extern Boolean  usePipes;    	/* TRUE if should capture the output of
				 * subshells by means of pipes. Otherwise it
				 * is routed to temporary files from which it
				 * is retrieved when the shell exits */
extern Boolean 	queryFlag;    	/* TRUE if we aren't supposed to really make
				 * anything, just see if the targets are out-
				 * of-date */

extern Boolean	checkEnvFirst;	/* TRUE if environment should be searched for
				 * variables before the global context */

extern GNode    *DEFAULT;    	/* .DEFAULT rule */

extern GSymT	*VAR_GLOBAL;   	/* Variables defined in a global context, e.g.
				 * in the Makefile itself */
extern GSymT	*VAR_CMD;    	/* Variables defined on the command line */
extern char    	var_Error[];   	/* Value returned by Var_Parse when an error
				 * is encountered. It actually points to
				 * an empty string, so naive callers needn't
				 * worry about it. */

extern time_t 	now;	    	/* The time at the start of this whole
				 * process */

extern Boolean	oldVars;    	/* Do old-style variable substitution */

extern LIST	sysIncPath;	/* The system include path. */

/*
 * debug control:
 *	There is one bit per module.  It is up to the module what debug
 *	information to print.
 */
extern int debug;
#define	DEBUG_ARCH	0x0001
#define	DEBUG_COND	0x0002
#define	DEBUG_DIR	0x0004
#define	DEBUG_GRAPH1	0x0008
#define	DEBUG_GRAPH2	0x0010
#define	DEBUG_JOB	0x0020
#define	DEBUG_MAKE	0x0040
#define	DEBUG_SUFF	0x0080
#define	DEBUG_TARG	0x0100
#define	DEBUG_VAR	0x0200
#define DEBUG_FOR	0x0400

#ifdef __STDC__
#define CONCAT(a,b)	a##b
#else
#define I(a)	  	a
#define CONCAT(a,b)	I(a)b
#endif /* __STDC__ */

#define	DEBUG(module)	(debug & CONCAT(DEBUG_,module))

#include "extern.h"

#endif /* _MAKE_H_ */