AD/IT (Abstract Decision/Interactive Trees)

Introduction
Approach
Example
Tool Support
Future Plans
Publications
Licence
History

Introduction

This project is being undertaken by Ken Turner. AD/IT (Abstract Decision/Interactive Trees) is a notation and a set of tools for defining, translating and formally analysing decision trees. (This is entirely unconnected with the ADIT trademark held by Adit Ltd. and with any products or services of this company.)

AD/IT is based on the notation and tools developed by the CGT project ('The Development and Evaluation of A Computerised Clinical Guidance Tree for Benign Prostatic Hyperplasia and Hypertension'). The CGT project was funded by the Chief Scientist's Office in Scotland from March 2000 to June 2003. Richard Bland was the chief designer of the CGT system, Claire Beechey was the main implementer, and Dawn Dowding (now at the University of Leeds) was the driving force behind the project. The clinical guidance trees were mainly designed by Pat Thomson and Claire Beechey (University of Stirling), Chris Mair (Forth Valley Primary Care Research Group), and Joanne Protheroe (University of Manchester).

Approach

AD/IT supports conventional decision trees that have decision nodes, chance nodes and terminal nodes. However, AD/IT inherits from CGT a number of useful extensions such as:

expressions for calculating node visibility, validating question answers, computing payoffs or probabilities, etc.
macros for sharing explanatory text, computing expressions inside text, etc.
question nodes for collecting user input
conditional visibility for hiding sub-trees if circumstances make them irrelevant
node composition for automatically dealing with independent combinations of choices.

The AD/IT design philosophy is to focus first on the structure and flow in a decision tree. Once this is correct, the content of the tree can be added. The notation encourages separate definition of structure and content. The AD/IT design approach is shown in the following figure.

AD/IT makes use of a simple applicative syntax (like function calls). AD/IT decision trees are defined by the following directives:

Directive	Meaning
// text	an explanatory comment about the tree that is removed in the translated output
chance(id,label,attributes,node1,..)	a probabilistic (system) choice
comment(text)	an explanatory comment about the tree that is transcribed to the translated output
decision(id,label,attributes,node1,..)	a deterministic (user) choice
question(id,label,attributes,node1,..)	a request for user input
terminal(id,label,attributes)	a leaf node
tree(id,label,attributes,node)	the whole tree with a single root node
value(name,value)	a textual, numeric or code definition

These directives take attributes as follows:

Attribute	Meaning	chance	decision	question	terminal	tree
composed	expression for using composed node headings					*
conjunction	text to join composed nodes					*
dictionary	name of a glossary file					*
display	text for user display (assumed by default)	*	*	*	*
error	text for reporting a validation error			*
format	format for user input			*
label	long label for a node	*	*	*	*
macros	global macros					*
neutral	neutral value for positive/negative outcome					*
payoff	expression for a payoff (utility, default 100)				*
perform	user instruction text	*	*	*	*
print	expression for summarising a node	*	*	*	*
probability	expression for probability of a choice	*	*	*	*
query	text for question to user			*
reason	text for explaining a choice	*	*	*	*
scale	expression for scaling payoff	*	*	*	*
valid	expression to validate question input			*
variable	question variable			*
variables	tree variables					*
version	tree notation version					*
visible	expression to check node visibility	*	*	*	*
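
As a rough illustration of how the attribute table might be used, the following Python sketch transcribes it into a lookup and reports attributes that the table does not list for a given directive. This is not part of the AD/IT toolset: the names `ALLOWED` and `misplaced_attributes` are hypothetical, and the check reflects only the table above, not any rules the translators may apply.

```python
# Which directives accept which attributes, transcribed from the table above.
ANY_NODE = {"chance", "decision", "question", "terminal"}
QUESTION = {"question"}
TERMINAL = {"terminal"}
TREE = {"tree"}

ALLOWED = {
    "composed": TREE, "conjunction": TREE, "dictionary": TREE,
    "display": ANY_NODE, "error": QUESTION, "format": QUESTION,
    "label": ANY_NODE, "macros": TREE, "neutral": TREE,
    "payoff": TERMINAL, "perform": ANY_NODE, "print": ANY_NODE,
    "probability": ANY_NODE, "query": QUESTION, "reason": ANY_NODE,
    "scale": ANY_NODE, "valid": QUESTION, "variable": QUESTION,
    "variables": TREE, "version": TREE, "visible": ANY_NODE,
}


def misplaced_attributes(directive, attributes):
    """Return the attributes that the table does not list for this directive.

    Hypothetical helper for illustration; unknown attribute names are
    reported as misplaced too.
    """
    return sorted(a for a in attributes
                  if directive not in ALLOWED.get(a, set()))


# A question may carry query, variable, format, reason and valid.
print(misplaced_attributes(
    "question", ["query", "variable", "format", "reason", "valid"]))
# A terminal may carry payoff and probability, but not query.
print(misplaced_attributes("terminal", ["payoff", "probability", "query"]))
```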

Example

As a simple example, the following tree explores the consequences of deciding whether to spend leisure time on going for a walk or not.

    comment(walk.adit: K. J. Turner and R. Bland, 7th October 2007)

    value(Leisure_Display,
      <h2>Introduction</h2>

      This example discusses the choice of going for a walk or not.)

    value(Leisure_Variables,
      tuesday = 0;
      probBored = 0.55; probNap = 0.4; probPorlock = 0;
      payBored = 40; payNap = 70; payNapBored = 10; payPorlock = 15; payWalk = 80)

    value(Root_Display,
      Should you go for a walk or not? You'll be asked a question if not.)

    value(Outcome_Display, What happens now depends on the day of the week.)

    value(Outcome_Query, Is it Tuesday?)

    value(Outcome_Valid,
      if tuesday == 0
	then probPorlock = 0; probNap = 0.4; probBored = 0.55; true
	else
	  if tuesday == 1
	    then probPorlock = 0.4; probNap = 0.2; probBored = 0.35; true
	    else false
	  fi
      fi)

    value(Outcome_Reason, Staying at home means a lack of desirable exercise.)

    value(Stay_Display, If you stay at home it may turn out well or badly.)

    value(Porlock_Visible, tuesday)

    value(Porlock_Display,
      A person from Porlock calls with probability McEval(probPorlock).)

    value(Nap_Display,
      You have a nice nap by the fire with probability McEval(probNap).)

    value(Bored_Display,
     You will feel bored and dull with probability McEval(probBored).)

    value(Walk_Display,
      If you go for a walk the payoff will be McEval(payWalk).)

    tree(Leisure, Leisure, neutral="70" variables,
      decision(Root, At leisure, ,
	question(Outcome, Stay at home, query variable="tuesday"
	 format="Edit(1)" reason valid,
	  chance(Stay, Result of staying at home, ,
	    terminal(Porlock, A person from Porlock calls, visible
	     payoff="payPorlock" probability="probPorlock"),
	    terminal(Nap, Have a nap, payoff="payNap" probability="probNap"),
	    terminal(Bored, Become bored, payoff="payBored"
	     probability="probBored"),
	    terminal(NapAndBored, Have a nap & Become bored,
	     payoff="payNapBored" probability="#"))),
	terminal(Walk, Have a walk, payoff="payWalk")))

The Root node has short label 'At leisure' and no attributes. A choice of two child nodes is then made by the user:

Outcome: This is a question node that asks whether it is Tuesday or not, storing the result in variable tuesday. This expects an Edit (i.e. free-form) response of one character: 0 (no) or 1 (yes). The question has separately defined attributes for query (the question to be asked), reason (a justification for asking the question), and valid (a validation rule).
Walk: This is a terminal node corresponding to going for a walk. It has payoff payWalk (defined by a tree variable).

If the user chooses to stay at home, the question about Tuesday is asked. There are now four possible outcomes determined by chance (i.e. the user does not make a choice of these branches). Each outcome has an associated payoff and probability. If it is a Tuesday, a person from Porlock may call. (A 'person from Porlock' is said to have disturbed Coleridge while he was trying to recall the 'Kubla Khan' poem he had composed in a dream.) Alternatively the user may have a nap or may become bored. Both of these may be combined in a composed node with probability '#', meaning the probability remaining after consideration of the other terminal nodes.
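
The arithmetic behind the '#' probability and the resulting expected payoff of staying at home can be sketched in Python. This is the conventional probability-weighted calculation for a chance node, shown under assumptions: the text does not say how the AD/IT tools compute it, the function names are hypothetical, and the `scale` and `neutral` attributes are ignored. The numbers are taken from the walk example.

```python
def resolve_remainder(probabilities):
    """Replace a single '#' entry with the probability left over by the others."""
    known = sum(p for p in probabilities.values() if p != "#")
    return {name: (round(1.0 - known, 10) if p == "#" else p)
            for name, p in probabilities.items()}


def expected_payoff(probabilities, payoffs):
    """Probability-weighted average payoff over the outcomes of a chance node."""
    return sum(probabilities[name] * payoffs[name] for name in probabilities)


# Payoffs from Leisure_Variables.
payoffs = {"Porlock": 15, "Nap": 70, "Bored": 40, "NapAndBored": 10}

# Initial probabilities from Leisure_Variables (not a Tuesday).
not_tuesday = resolve_remainder(
    {"Porlock": 0, "Nap": 0.4, "Bored": 0.55, "NapAndBored": "#"})
# Probabilities set by Outcome_Valid when tuesday is 1.
tuesday = resolve_remainder(
    {"Porlock": 0.4, "Nap": 0.2, "Bored": 0.35, "NapAndBored": "#"})

print(not_tuesday["NapAndBored"], expected_payoff(not_tuesday, payoffs))
print(tuesday["NapAndBored"], expected_payoff(tuesday, payoffs))
```

In both cases the '#' branch is left with probability 0.05. On this reading, staying at home is worth about 50.5 when it is not Tuesday and about 34.5 when it is, compared with a payoff of 80 for the walk.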

The attributes of nodes are mostly defined separately through value definitions. Apart from probability and payoff variables for the tree, there is a tuesday variable that records whether it is Tuesday. All these variables are given initial values since questions may be skipped or nodes may be rendered invisible.

The display, query and reason attributes are defined as text. Apart from allowing HTML markup, text fields may also use macros. Here, the built-in McEval macro is used to return the text resulting from evaluating an expression. The valid and visible attributes are defined as expressions.
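
The effect of McEval on display text can be approximated with a short Python sketch. It is only an illustration under assumptions: the real macro evaluates an expression, whereas this hypothetical `expand_mceval` function handles just a bare variable name, and the way numbers are formatted here is Python's, not necessarily that of the AD/IT tools.

```python
import re


def expand_mceval(text, variables):
    """Replace each McEval(name) in the text with the value of that variable."""
    def substitute(match):
        # Look up the variable named inside the parentheses.
        return str(variables[match.group(1)])
    return re.sub(r"McEval\((\w+)\)", substitute, text)


variables = {"probNap": 0.4, "payWalk": 80}
print(expand_mceval(
    "You have a nice nap by the fire with probability McEval(probNap).",
    variables))
# prints: You have a nice nap by the fire with probability 0.4.
```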

The definition of Outcome_Valid illustrates that complex expressions are allowed, here with conditions and sequences of assignments. As with CGT, expressions are patterned after the C programming language and its derivatives. In fact, it would be better to call these statements rather than expressions: they are statement sequences that yield a value. A simple expression may be used on its own. An assignment expression yields the value that is assigned. If there is a sequence of statements or a conditional statement, its value is that of the last calculated expression.

The definition of Porlock_Visible is much simpler: the Porlock node is visible only if tuesday has value 1. Variables always hold numeric values. A zero value means false in a boolean context, while non-zero means true.

For Outcome_Valid, the validation expression yields true if the answer for tuesday is 0 or 1, otherwise false. As a side-effect, the probabilities of various terminal nodes are set depending on the value of tuesday.
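
The behaviour of Outcome_Valid and Porlock_Visible described above can be mirrored in Python as follows. This is a sketch for illustration, not generated by the toolset: the function names are hypothetical, and representing true and false as 1 and 0 is an assumption that is merely consistent with variables holding numeric values.

```python
def outcome_valid(variables):
    """Mirror Outcome_Valid: set probabilities as a side effect, yield validity."""
    if variables["tuesday"] == 0:
        variables.update(probPorlock=0, probNap=0.4, probBored=0.55)
        return 1   # "true" (assumed to be 1): the answer is accepted
    if variables["tuesday"] == 1:
        variables.update(probPorlock=0.4, probNap=0.2, probBored=0.35)
        return 1
    return 0       # "false": the answer is rejected, nothing is changed


def porlock_visible(variables):
    """Mirror Porlock_Visible: a non-zero tuesday counts as true."""
    return variables["tuesday"] != 0


# Start from the initial values, then answer "1" to the Tuesday question.
state = {"tuesday": 1, "probPorlock": 0, "probNap": 0.4, "probBored": 0.55}
print(outcome_valid(state), state["probPorlock"], porlock_visible(state))
# prints: 1 0.4 True
print(outcome_valid({"tuesday": 7}))
# prints: 0
```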

Tool Support

AD/IT is accompanied by a toolset that translates between different decision tree representations. The translations supported include:

AD/IT to CGT, for use with the CGT tree viewer; this uses the original tree notation developed by the CGT project
AD/IT to XML (via CGT, using an adaptation of code by Richard Bland, University of Stirling)
AD/IT to Lotos, for formal analysis; Lotos (Language Of Temporal Ordering Specification, ISO 8807) is a standardised language for formal specification.

See the author's Lotos software page for download information.

The CGT Viewer allows trees generated from AD/IT to be interactively explored. The viewer was not developed by the author, and is available only in binary form for Microsoft Windows. See the author's graphical software page for more information.

Future Plans

At present, trees are defined manually. It is planned to investigate links to other decision tree tools with a view to converting AD/IT notation to/from the notations they handle (at least, as far as common tree features are concerned).

Publications

The technical basis of AD/IT is contained in the following published papers:

Richard Bland, Claire E. Beechey and Dawn Dowding. Extending the Model of The Decision Tree, Technical Report CSM-162, Computing Science and Mathematics, University of Stirling, August 2002.
Richard Bland, Claire E. Beechey and Dawn Dowding. Writing Decision Trees for The CGT Viewer Program, Technical Report CSM-163, Computing Science and Mathematics, University of Stirling, October 2003.
Kenneth J. Turner. Abstraction and Analysis of Clinical Guidance Trees, Biomedical Informatics, Copyright Elsevier, October 2008.

Licence

This program is free software. You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation - either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful but without any warranty, without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You may re-distribute this software provided you preserve this README file. Bug reports should be sent to Ken Turner, who would also appreciate receiving any corrections and comments.

History

Version 1.0: Ken Turner, 11th June 2007

internal version

Version 1.1: Ken Turner, 18th October 2007

first public version

Version 1.2: Ken Turner, 23rd October 2008

Lotos template changed to match new version of walk example
Back not translated in a question unless -g requires it


Last Update: 24th July 2018
URL: https://www.cs.stir.ac.uk/~kjt/research/adit/adit.html