SPOON: A library for implementing analyses and transformations of Java source code

This paper presents SPOON, a library for the analysis and transformation of Java source code. SPOON enables Java developers to write a large range of domain‐specific analyses and transformations in an easy and concise manner. SPOON analyses and transformations are written in plain Java. With SPOON, developers do not need to dive into parsing, to hack a compiler infrastructure, or to master a new formalism. Copyright © 2015 John Wiley & Sons, Ltd.


INTRODUCTION
Compilers and interpreters analyze source code. But source code analysis is used in many more places [1]: It is used to compute metrics [2], to detect bad smells [3], and to detect code clones [4]. Companies and open-source projects set up their own metrics and coding conventions [5]. This motivates a library for source code analysis that is usable by the masses of developers and not dedicated to compiler hackers.
Beyond source code analysis, there is source code transformation. Source code transformation is program transformation at the source code level, as opposed to program transformation performed on binary code [6]. There are many usages of program transformation: profiling [7], security [8], optimization [9], and refactoring [10]. Like source code analyses, some source code transformations are written by normal Java developers. For instance, this happens when the transformation uses domain-specific knowledge [11]. This paper presents SPOON, a library for the analysis and transformation of Java source code. SPOON enables Java developers to write a large range of domain-specific analyses and transformations in an easy and concise manner. SPOON analyses and transformations are written in plain Java. With SPOON, developers do not need to dive into parsing, to hack a compiler infrastructure, or to master a new formalism.
The main features of SPOON are as follows:

SOURCE CODE ANALYSIS WITH SPOON
The first goal of SPOON is to enable standard developers to write their own domain-specific analyses on source code. This requires, first, an intuitive metamodel understandable by the mass of Java developers (presented in Section 2.2) and, second, mechanisms to analyze source code elements. The latter is embodied by queries (Section 2.3) and processors for traversing the program under analysis (Section 2.4). But let us first give an overview of the library before going into the details of the Java metamodel of SPOON. SPOON is a meta-analysis tool: it provides software engineers with the primitives to write their own analyses. As such, SPOON does not provide any specific analysis such as dataflow analysis. In contrast, the goal of a development environment such as Eclipse's Java development tools is to support different tasks of software development in an integrated manner (code completion, quick fix of compilation errors, debug, etc.). Unlike a compiler-based AST (e.g., from javac), the SPOON metamodel of Java is designed to be easily understandable by normal Java developers, so that they can write their own program analyses and transformations. The SPOON metamodel is complete in the sense that it contains all the required information to derive compilable and executable Java programs (hence it contains annotations, generics, and method bodies).
The SPOON metamodel can be split into three parts. The structural part (Figure 2) contains the declarations of the program elements, such as interface, class, variable, method, annotation, and enum declarations. The code part (Figure 3) contains the executable Java code, such as the one found in method bodies. The reference part models the references to program elements (for instance, a reference to a type).
As shown in Figure 2, all elements inherit from CtElement, which declares a parent element denoting the containment relation in the source file. For instance, the parent of a method node is a class node. All names are prefixed by 'Ct', which stands for 'compile-time'. Figure 3 shows the metamodel for Java executable code. Because of the complexity of the Java language, the code metamodel figure contains only an excerpt of all classes. There are two main kinds of code elements. First, the statements (CtStatement) are untyped top-level instructions that can be used directly in a block of code. Second, the expressions (CtExpression) are used inside the statements (for the sake of readability, this cannot be seen in the figure). For instance, a CtLoop (which is a statement) points to a CtExpression that expresses its boolean condition. Some code elements, such as invocations and assignments, are both statements and expressions (multiple inheritance links). Concretely, this is translated as an interface CtInvocation inheriting from both interfaces CtStatement and CtExpression. The generic type of CtExpression is used to add static type-checking when transforming programs. This will be explained in detail in Section 3.2.
The reference part of the metamodel expresses the fact that a program references elements that are not necessarily reified into the metamodel (they may belong to third-party libraries). For instance, an expression node returning a String is bound to a type reference to String and not to the compile-time model of String.java, because the source code of String is (usually) not part of the application code under analysis. In other words, references are used by metamodel elements to refer to elements in a weak way. Weak references make it more flexible to construct and modify a program model without having to obtain strong references to all referred elements.
References are resolved when the model is built; the resolved references are those that point to classes for which the source code is available in the SPOON input path. Because the references are weak, the targets of references do not have to exist before one references them. The price to pay for this low coupling is that, to navigate from one code element to another, one has to chain a navigation to the reference and then to the target. For instance, to navigate from a field to the type of the field, one writes field.getType().getDeclaration().
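The reference-based navigation above can be sketched in plain Java as follows. This is a toy model with hypothetical class names (a TypeReference resolving against a registry of declarations), not the actual SPOON implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of weak, name-based references (hypothetical classes,
// not the real SPOON API). A TypeReference holds only a qualified
// name; getDeclaration() resolves it on demand and may return null
// when the referred type is not part of the analyzed sources.
public class WeakRefDemo {
    static class TypeDeclaration {
        final String qualifiedName;
        TypeDeclaration(String qualifiedName) { this.qualifiedName = qualifiedName; }
    }

    static class TypeReference {
        final String qualifiedName;
        final Map<String, TypeDeclaration> model;
        TypeReference(String qualifiedName, Map<String, TypeDeclaration> model) {
            this.qualifiedName = qualifiedName;
            this.model = model;
        }
        // Resolution happens lazily: the target need not exist.
        TypeDeclaration getDeclaration() { return model.get(qualifiedName); }
    }

    static class Field {
        final TypeReference type;
        Field(TypeReference type) { this.type = type; }
        TypeReference getType() { return type; }
    }

    public static boolean resolves(String name, boolean inModel) {
        Map<String, TypeDeclaration> model = new HashMap<>();
        if (inModel) model.put(name, new TypeDeclaration(name));
        Field field = new Field(new TypeReference(name, model));
        // Navigation chains through the reference, mirroring
        // field.getType().getDeclaration() in SPOON.
        return field.getType().getDeclaration() != null;
    }

    public static void main(String[] args) {
        System.out.println(resolves("my.app.Person", true));    // source available
        System.out.println(resolves("java.lang.String", false)); // third-party: unresolved
    }
}
```

The null result for third-party types is exactly the "weak" behavior described above: the model can be built and edited without every target being present.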

Querying source code elements
SPOON aims at giving developers a way to query code elements in one single line of code in the normal cases. Classical research about code querying uses specific ad hoc languages [14]. On the contrary, code query in SPOON is performed in plain Java, in the spirit of an embedded Domain-specific language. The information that can be queried is that of a well-formed typed AST. For this, we provide the query API, based on the notion of 'Filter'. A Filter defines a predicate of the form of a matches method that returns true if an element is part of the filter. A Filter is given as parameter to a depth-first search algorithm. During AST traversal, the elements satisfying the matching predicate are given to the developer for subsequent treatment. Table I gives an excerpt of built-in filters.
Listing 1 gives an example of client code. Three filters are used. The first returns all AST nodes of type 'Assignment'. The second one selects all deprecated classes. The last one is a user-defined filter that only matches public fields across all classes.
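Filter-based querying can be sketched in plain Java along these lines; the node and filter classes below are simplified stand-ins, not SPOON's actual CtElement and Filter types:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of SPOON's Filter-based querying (hypothetical
// node classes; the real API matches over CtElement subtypes).
public class FilterDemo {
    static class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();
        Node(String kind) { this.kind = kind; }
        Node add(Node child) { children.add(child); return this; }
    }

    // A Filter is a predicate over elements.
    interface Filter { boolean matches(Node element); }

    // Depth-first traversal handing matching elements to the caller.
    static List<Node> query(Node root, Filter filter) {
        List<Node> result = new ArrayList<>();
        if (filter.matches(root)) result.add(root);
        for (Node child : root.children) result.addAll(query(child, filter));
        return result;
    }

    public static int countAssignments() {
        Node method = new Node("method")
            .add(new Node("assignment"))
            .add(new Node("if").add(new Node("assignment")));
        // Analogous in spirit to a TypeFilter on assignment nodes.
        return query(method, e -> e.kind.equals("assignment")).size();
    }

    public static void main(String[] args) {
        System.out.println(countAssignments()); // 2
    }
}
```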
To guide the user in learning the metamodel, two pieces of information are available. First, the user is given a graphical user interface that provides a navigable view of the SPOON AST of the program under analysis. A screenshot of this graphical user interface is given in Figure 4. Second, the metamodel is carefully designed so that the user can discover the metamodel through method calls. The object-orientation of the SPOON metamodel is appropriate for this. For instance, the statement ((CtIf)method.getBody().getStatement(0)).getThenStatement() navigates from a method object to the then branch of its first if statement.

Figure 4. A graphical user interface to understand and learn the SPOON metamodel.

Table I. Excerpt of built-in filters for querying source code elements.
TypeFilter: Returns all metamodel elements of a certain type (e.g., all assignment statements)
FieldAccessFilter: Returns all accesses to a given field
AnnotationFilter: Returns all elements annotated with a given annotation type
ReturnOrThrowFilter: Returns all elements that end the execution flow of a method

Processing code elements
A program analysis is a combination of query and analysis code. In SPOON, this conceptual pair is reified in a 'processor'. A SPOON program processor is a class that focuses on the analysis of one kind of program elements. For instance, Listing 2 presents a processor that analyzes a program to find empty catch blocks.
The elements to be analyzed (here, catch blocks) are given by generic typing: The programmer declares the AST node type under analysis as a class generic. The processed element type is automatically inferred through runtime introspection of the processor class. There is also an optional overridable method for querying elements at a finer grain. The process method takes the requested element as input and performs the analysis (here, detecting empty catch blocks). Because a real-world analysis combines multiple queries, multiple processors can be used at the same time. The launcher applies them in the order in which they have been declared.
Processors are implemented with a visitor design pattern [15] applied to the SPOON Java model. Each node of the metamodel implements an accept method so that it can be visited by a visitor object, which can perform any kind of action, including modification.
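The accept/visit protocol can be sketched as follows; the node and visitor types are illustrative stand-ins for SPOON's visitor infrastructure, here applied to the empty-catch-block analysis mentioned above:

```java
import java.util.List;

// Minimal visitor pattern sketch in the spirit of the SPOON metamodel
// (hypothetical node types; SPOON defines one visit method per
// metamodel class).
public class VisitorDemo {
    interface Visitor {
        void visitCatchBlock(CatchBlock c);
        void visitStatement(Statement s);
    }

    interface Element { void accept(Visitor v); }

    static class Statement implements Element {
        public void accept(Visitor v) { v.visitStatement(this); }
    }

    static class CatchBlock implements Element {
        final List<Statement> body;
        CatchBlock(List<Statement> body) { this.body = body; }
        public void accept(Visitor v) {
            v.visitCatchBlock(this);
            for (Statement s : body) s.accept(v); // visit children
        }
    }

    // A processor-like visitor that flags empty catch blocks.
    static class EmptyCatchDetector implements Visitor {
        int emptyCatches = 0;
        public void visitCatchBlock(CatchBlock c) {
            if (c.body.isEmpty()) emptyCatches++;
        }
        public void visitStatement(Statement s) { /* nothing to do */ }
    }

    public static int detect() {
        EmptyCatchDetector d = new EmptyCatchDetector();
        new CatchBlock(List.of()).accept(d);                  // empty: flagged
        new CatchBlock(List.of(new Statement())).accept(d);   // non-empty: ignored
        return d.emptyCatches;
    }

    public static void main(String[] args) {
        System.out.println(detect()); // 1
    }
}
```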
SPOON provides developers with an intuitive Java metamodel and concise abstractions to query and process AST elements.

SOURCE CODE TRANSFORMATION WITH SPOON
SPOON has been designed to facilitate source code transformation. For this, four mechanisms are provided: first-class intercession mechanisms (Section 3.1), the use of generics (Section 3.2), the notion of statically checked templates (Section 3.3), and the use of annotations (Section 3.4).

First-class intercession mechanisms
For transforming programs, SPOON provides first-class intercession mechanisms at different levels. At the structural level, SPOON enables one to add and remove types in packages, as well as fields and methods in types. At the behavioral level, SPOON enables one to modify any part of the code. For instance, one can add preconditions to methods, or add logging to catch blocks in an automated manner. Table II provides an overview of the main intercession methods of SPOON. All intercession methods take as parameter one or several AST nodes. However, for adding code only, for the sake of pragmatism, SPOON enables one to manipulate code strings encapsulated in a special object: a 'code snippet'. In order for code snippets to integrate seamlessly with the existing AST-level intercession methods, a code snippet is modeled as a subclass of an AST node.
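The design choice of modeling a snippet as a subclass of an AST node can be sketched like this; all class names below are toy stand-ins for illustration, not SPOON's real types:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of why a code snippet is a subclass of an AST statement
// node (hypothetical classes). Because the snippet IS-A Statement,
// generic intercession methods such as insertBegin() accept it
// without any special case.
public class SnippetDemo {
    static abstract class Statement { abstract String print(); }

    static class Return extends Statement {
        String print() { return "return;"; }
    }

    // The snippet wraps a raw code string but participates in the
    // AST like any other statement.
    static class CodeSnippetStatement extends Statement {
        final String code;
        CodeSnippetStatement(String code) { this.code = code; }
        String print() { return code; }
    }

    static class Block {
        final List<Statement> statements = new ArrayList<>();
        void insertBegin(Statement s) { statements.add(0, s); }
        String print() {
            StringBuilder sb = new StringBuilder();
            for (Statement s : statements) sb.append(s.print()).append('\n');
            return sb.toString();
        }
    }

    public static String instrument() {
        Block body = new Block();
        body.insertBegin(new Return());
        body.insertBegin(new CodeSnippetStatement(
            "if (x == null) throw new IllegalArgumentException();"));
        return body.print();
    }

    public static void main(String[] args) {
        System.out.print(instrument());
    }
}
```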
Let us now discuss the concrete example of Listing 3. It shows the code of a processor that adds logging in order to detect null method parameters. The processor processes CtParameter (the metamodel type representing method parameters). It creates a snippet representing an 'if/then' statement that checks the value of the parameter. It then adds this piece of code at the beginning of the method body corresponding to this parameter. Note that when several parameters are declared, all are processed, and the checks are inserted in the same body in the order in which they are declared in the method signature (one can also write a sophisticated transformation to change this order). This is the only code to write; the command-line-based launcher takes as input the processor name.

Use of generic typing for static checking of transformations
When manipulating an AST with intercession methods, well-formedness rules must be enforced so as to produce compilable code. For instance, a throw statement contains an expression that necessarily returns an exception object (in Java, an instance of Throwable). There are three ways of detecting violations of those well-formedness rules: statically, when writing the transformation code; dynamically, when applying the transformation; or by verifying that the generated code actually compiles. We believe that static checking is the best solution, because it gives instant feedback to the developer.

Static checks of intercession methods.
In SPOON, we use Java generics to statically enforce the AST well-formedness rules. For instance, Listing 4 shows how we statically enforce that the expression of a throw statement is a valid exception object: The parameter of method setThrownExpression is typed by CtExpression<? extends Throwable>. More generally, as shown in Figure 3, an expression (CtExpression) has a type parameter T. This makes it possible to statically enforce well-formedness rules in many places: The condition expression of an if statement or a loop must return a boolean; in an assignment, the generic type of the expression to be assigned must be compatible with the type of the assigned variable; and so on. The same technique is applied by Java's class Class for reflection. For instance, the generic type of a local variable declaration can be used to enforce that the expression initializing that variable indeed returns an instance of the declared type. This is illustrated in Listing 5, where the last line triggers a compilation error because one tries to set an expression that returns a book in a variable that expects an author.
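The generics-based static checking can be sketched with simplified interfaces. The types below only mirror the idea of CtExpression<T> and setThrownExpression; they are not the real SPOON signatures:

```java
// Sketch of how generics enforce AST well-formedness at compile time
// (simplified interfaces in the spirit of CtExpression<T> and CtThrow;
// not the actual SPOON API).
public class GenericsDemo {
    // The type parameter T stands for the static type of the expression.
    interface CtExpression<T> { T getValueForTypeChecking(); }

    static class CtThrow {
        CtExpression<? extends Throwable> thrown;
        // Only expressions whose static type is a Throwable are accepted.
        void setThrownExpression(CtExpression<? extends Throwable> e) { thrown = e; }
    }

    public static boolean accepts() {
        CtThrow t = new CtThrow();
        CtExpression<RuntimeException> ok = () -> new RuntimeException("boom");
        t.setThrownExpression(ok); // compiles: RuntimeException extends Throwable

        // The following would be rejected by javac, because String is
        // not a subtype of Throwable:
        // CtExpression<String> bad = () -> "boom";
        // t.setThrownExpression(bad); // compile-time error

        return t.thrown != null;
    }

    public static void main(String[] args) {
        System.out.println(accepts()); // true
    }
}
```

The commented-out lines are the point of the technique: the ill-formed throw never reaches the transformed program because the transformation code itself does not compile.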

Statically type-checked code templating for Java
We have seen so far two ways of writing code transformations. First, one can use the intercession API to manipulate AST objects. Second, one can generate code snippets using text fragments. Both have the same limitation: There is no way to be sure, before the actual transformation, that the transformed code will be compilable. SPOON provides developers with a third way of writing code transformations: code templates. Those templates are statically type checked, in order to ensure statically that the generated code will be correct. Our key idea behind SPOON templates is that they are regular Java code. Hence, the type checking is that of the Java compiler itself.
A SPOON template is a Java class that is type checked by the Java compiler, then taken as input by the SPOON templating engine to perform a transformation. This is summarized in Figure 5. A SPOON template can be seen as a higher-order program, which takes program elements as arguments, and returns a transformed program. Like any function, a template can be used in different contexts and give different results, depending on its parameters.

Template definition.
Listing 6 defines a SPOON template. This template specifies a statement (in its statement method) that is a precondition checking that a list is smaller than a certain size. This piece of code will be injected at the beginning of all methods dealing with size-bounded lists. This template has one single template parameter called _col_, typed by TemplateParameter.
In this case, the template parameter is meant to be an expression (CtExpression) that returns a collection (constructor, line 3). All metamodel classes, including CtExpression, implement the interface TemplateParameter. A template parameter has a special method (named S, for substitution) that is used as a marker to indicate the places where a template parameter substitution should occur. For a CtExpression, method S() returns the return type of the expression. A method S() is never executed; its only goal is to get the template statically checked. Instead of being executed, the template source code is taken as input by the templating engine (Figure 5) described previously. Consequently, the template source is well typed and compiles, but the binary code of the template is thrown away.
There are three kinds of templates: block templates, statement templates, and expression templates. Their names denote the code grain they respectively address.
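The role of the S() marker can be sketched with a toy substitution engine. Note that the real SPOON engine substitutes AST nodes, whereas this illustration works on source text, and all names are hypothetical:

```java
import java.util.Collection;

// Toy sketch of template parameters and the S() marker (hypothetical
// names; SPOON's engine substitutes AST nodes, not text).
public class TemplateDemo {
    // In SPOON, TemplateParameter<T>.S() is a typed placeholder that is
    // never executed; here we model the parameter as the source text it
    // stands for.
    static class TemplateParameter<T> {
        final String sourceText;
        TemplateParameter(String sourceText) { this.sourceText = sourceText; }
        // Marker method: in a real template, its call sites are located
        // and replaced by the engine; calling it is an error.
        T S() { throw new UnsupportedOperationException("never executed"); }
    }

    // The template body "if (_col_.S().size() > 10) throw ..." is
    // modeled as a string containing the marker.
    static String substitute(String templateBody, TemplateParameter<?> col) {
        return templateBody.replace("_col_.S()", col.sourceText);
    }

    public static String instantiate(String actualExpression) {
        TemplateParameter<Collection<?>> col =
            new TemplateParameter<>(actualExpression);
        return substitute("if (_col_.S().size() > 10) throw new RuntimeException();", col);
    }

    public static void main(String[] args) {
        System.out.println(instantiate("x"));
        // if (x.size() > 10) throw new RuntimeException();
    }
}
```

Binding _col_ to the method parameter x yields the bound check described in the text; binding it to getCollection() yields the other variant.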

Template instantiation.
In order to be correctly substituted, the template parameters need to be bound to actual values. This is performed during template instantiation.
Listing 7 shows how to use the bound-checking template of Listing 6. One first instantiates the template, then one sets the template parameters, and finally, one calls the template engine. In Listing 7, last line, the bound check is injected at the beginning of a method body.
Because the template is given the first method parameter, which is in the scope of the insertion location, the generated code is guaranteed to compile. The Java compiler ensures that the template compiles with a given scope; the developer is responsible for checking that the scope where he or she uses template-generated code is consistent with the template scope.

Template substitution.
The substitution engine uses the metamodel and querying API presented in Section 2.3 to look up all invocations of method S(). It then replaces them with the template parameter instances. Here, if _col_ represents the method parameter x, the substitution modifies the model to return the expression 'x.size()>10'. If _col_ stands for an expression that returns a collection (say getCollection()), the substitution result is 'getCollection().size()>10'. The substitution engine can also be used, for instance, to wrap a method body into a try/catch block. It contains various methods that implement different substitution scenarios. For instance, method insertAllMethods inserts all the methods of a template into an existing class. It can be used, for instance, to inject getters and setters.

Literal template parameters.
We have already seen one kind of template parameter (TemplateParameter<T>). Sometimes, templates are parameterized by literal values. This can be performed with a template parameter set to a CtLiteral. For convenience, SPOON provides developers with another kind of template parameter called literal template parameters. When the parameter is known to be a literal (primitive types, String, Class, or a one-dimensional array of these types), a literal template parameter enables one to simplify the template code. To indicate to the substitution engine that a given field is a template parameter, it has to be annotated with a @Parameter annotation. Figure 6 illustrates this feature with two equivalent templates. By using a literal template parameter, it is not necessary to call the S() method for substitution: The templating engine looks up all usages of the field annotated with @Parameter.
3.3.6. Summary. To sum up, SPOON templates are regular Java code. Because they type check successfully, they ensure that the transformed code actually compiles. They are complementary to the intercession API and the code snippets. Depending on the transformation and the amount of time given to write it, they all have pros and cons, as summed up in Table III. The intercession API is well suited for fine-grained, surgical transformations but cumbersome for inserting lots of new code. Code snippets are very handy for code injection; however, in complex transformations with many conditionals, they may result in incorrect code. Templates provide strong static checks on the transformed code; the price to pay is a longer learning curve. Again, they are complementary; one can use them in conjunction in the same transformation.

Annotation-driven program processors
Figure 6. Two equivalent excerpts of templates. The left-hand side one uses TemplateParameter; the right-hand side one is more concise thanks to @Parameter.

We now discuss how SPOON deals with the processing of annotations. Java annotations enable developers to embed metadata in their programs. Although annotations have no explicit semantics by themselves, they can be used by frameworks as markers for altering the behavior of the programs that they annotate. This interpretation of annotations can result, for example, in the configuration of services provided by a middleware platform or in the alteration of the program source code. Annotation processing is the process by which a pre-processor modifies an annotated program as directed by its annotations during a pre-compilation phase. The Java compiler offers the possibility of compile-time processing of annotations via the API provided under the javax.annotation.processing package. Classes implementing the javax.annotation.processing.Processor interface are used by the Java compiler to process annotations present in a client program. The client code is modeled by the classes of the javax.lang.model package (although Java 8 has introduced finer-grained annotations, they cannot be placed on arbitrary code elements). The client code is only partially modeled: Only types, methods, fields, and parameter declarations can carry annotations. Furthermore, the model does not allow the developer to modify the client code; it only allows adding new classes.
The SPOON annotation processor overcomes those two limitations: It can handle annotations on any arbitrary code elements (including within method bodies), and it supports the modification of the existing code.
3.4.1. Annotation processing with SPOON. SPOON provides developers with a way to specify the analyses and transformations associated with annotations. Annotations are metadata on code that start with @ in Java. For example, let us consider the example of a design-by-contract annotation. The annotation @NotNull, when placed on arguments of a method, will ensure that the argument is not null when the method is executed. Listing 9 shows both the definition of the @NotNull annotation type and an example of its use.
The @NotNull annotation type definition carries two meta-annotations (annotations on annotation definitions) stating which source code elements can be annotated (line 1) and that the annotation is intended for compile-time processing (line 2). The @NotNull annotation is used on the argument of the marry method of the class Person. Without annotation processing, if the method marry is invoked with a null reference, a NullPointerException is thrown by the Java virtual machine when invoking the method isMarried in line 7.
The implementation of such an annotation would not be straightforward using Java's annotation processing API, because that API does not allow us to simply insert the null check in the body of the annotated method.
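To make the scenario concrete, here is a hedged reconstruction of the example; the annotation definition and the Person class below approximate Listing 9 (which is not reproduced in this section) and demonstrate the late NullPointerException that the processor is meant to replace with an early check:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Sketch approximating the @NotNull example (names assumed from the
// text; the original Listing 9 is not reproduced here).
public class NotNullDemo {
    @Target(ElementType.PARAMETER)           // which elements can be annotated
    @Retention(RetentionPolicy.SOURCE)       // intended for compile-time processing
    public @interface NotNull {}

    static class Person {
        Person spouse;
        boolean isMarried() { return spouse != null; }

        void marry(@NotNull Person other) {
            // Without annotation processing, a null argument only fails
            // here, deep inside the method body:
            if (other.isMarried()) throw new IllegalStateException("already married");
            this.spouse = other;
            other.spouse = this;
        }
    }

    // Demonstrates the NullPointerException that the SPOON annotation
    // processor is meant to replace with an explicit, early assertion.
    public static boolean throwsWithoutProcessing() {
        try {
            new Person().marry(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsWithoutProcessing()); // true
    }
}
```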

The annotation processor interface.
In SPOON, the full code model (cf. Section 2.2) can be used for compile-time annotation processing [16]. To this end, SPOON provides a special kind of processor called AnnotationProcessor.
Annotation processors extend normal processors by stating the annotation type the processed elements must carry (type parameter A), in addition to stating the kind of source code element they process (type parameter E). The process method (line 4) receives as arguments both the CtElement and the annotation it carries. The remaining three methods (getProcessedAnnotationTypes, getConsumedAnnotationTypes, and inferConsumedAnnotationType) configure the visiting of the AST during annotation processing. The SPOON annotation processing runtime is able to infer the type of annotation a processor handles from its type parameter A. This restricts each processor to handling a single annotation§. To avoid this restriction, a developer can override the inferConsumedAnnotationType() method to return false. When doing this, SPOON queries the getProcessedAnnotationTypes() method to find out which annotations are handled by the processor. Finally, getConsumedAnnotationTypes() returns the set of processed annotations that are to be consumed by the annotation processor. Consumed annotations are not available in later processing rounds. Similarly to standard processors, SPOON provides a default abstract implementation for annotation processors: AbstractAnnotationProcessor. It provides facilities for maintaining the lists of consumed and processed annotation types, allowing the developer to concentrate on the implementation of the annotation processing logic.
Going back to our @NotNull example, we implement a SPOON annotation processor that processes and consumes @NotNull annotated method parameters and modifies the source code of the method by inserting an assert statement that checks that the argument is not null.
The NotNullProcessor (Listing 10) leverages the default implementation provided by AbstractAnnotationProcessor and binds the type variables representing the annotation to be processed and the annotated code elements to NotNull and CtParameter, respectively. Then, in the class constructor, it configures the annotation types it is interested in by adding them to the lists provided by its superclass (lines 5 and 6). The actual processing of the annotation is implemented in the process(NotNull,CtParameter) method (lines 10-13). Annotated code is transformed by navigating the AST up from the annotated parameter to the owner method and then down to the method's body code block (lines 10 and 12). The construction of the assert statement is delegated to a helper method constructAssertion(String), taking as argument the name of the parameter to check. This helper method constructs an instance of CtAssert (by either programmatically constructing the desired boolean expression or employing the templating facilities explained in Section 3.3). Having obtained the desired assert statement, it is injected at the beginning of the body of the method.
More complex annotation processing scenarios can be tackled with SPOON. For example, when using the @NotNull annotation, the developer is still responsible for manually inspecting which method parameters to place the annotation on. A common processing pattern is then to use regular SPOON processors to auto-annotate the application's source code. Such a processor, in our running example, would automatically add @NotNull annotations to the relevant method parameters. With this processing pattern, the programmer can use an annotation processor in two ways: either by explicitly and manually annotating the base program or by using a processor that analyzes and annotates the program in order to trigger annotation processors in an automatic and implicit way. This design decouples the program analysis from the program transformation logic and leaves room for manual configuration.

§ Java does not allow annotations to extend other annotations.
A second complex use of annotation processing in SPOON is presented in Section 4 (Section 4.3.3).

EVALUATION
We now present an evaluation of SPOON. The evaluation is composed of three parts: correctness (4.1), performance (4.2), and case studies (4.3).

Correctness
The core of an AST-based source code analysis tool such as SPOON is made of two components: a model of the program and a pretty-printer. For real-world, rich programming languages, it is hard to achieve a model that provably covers all parts of the language. Similarly, it is very difficult to prove that all pretty-printed versions of all valid models would be valid programs that (1) correspond to the original one and (2) can be compiled and executed.
To validate the correctness of those two components, we set up the following experiment: First, we collect a dataset of open-source programs; second, for each subject program, we build the model from the original source code and then pretty-print it back as source code; third, we compile the pretty-printed source code; and fourth, we run the test suite of the pretty-printed program to check whether the program still has the same observable semantics.
This process hence has two stacked correctness oracles: the compiler and the test suite. If all programs can be represented as a model whose pretty-printed version can be compiled and executed, this gives strong confidence in both the correctness of the model and of the pretty-printer. The test suite ensures that the model indeed corresponds to the original program.
We searched software forges for packages that meet the following inclusion criteria: The program is open-source; the program is written in Java; and the program has a good test suite. The better the program's test suite is, the better the correctness oracle is. Hence, we took special care to select programs that are well tested. Note that the test suite is not pretty-printed, in order to keep the exact same specification and avoid biases.
This process results in 13 software applications: Bukkit, Commons-codec, Commons-collections, Commons-imaging, Commons-lang, DiskLruCache, Javawriter, Joda time, Jsoup, JUnit, Mimecraft, Scribe Java, and Spark. Table IV gives the key descriptive statistics for this dataset: the commit identification (the digest of the commit in the Git version control system, necessary for future replication), the size in number of statements (a semantic line-of-code measure that we also call LOC in this paper), and the number of tests (as defined by the JUnit test framework: the number of test methods). The smallest programs have fewer than 1000 LOC, and the biggest program has 31k LOC. The total is 166,079 statements. The number of test cases ranges from 14 to 15,067.

R. PAWLAK ET AL.
In addition, over the years, SPOON has been used in many different projects, both within academia and within industry, which further validates this formal experiment.

Performance
We now discuss the performance of SPOON. How long does it take to create a model of a program and to write it back to disk? For each program of Table IV, we measure the time required to parse the original program, build the model of the program as an instance of the SPOON metamodel, and pretty-print it to disk. All experiments are performed on a MacBook (Apple Inc., Cupertino, CA, USA) with an Intel Core i7 CPU (2 GHz, Intel, Santa Clara, CA, USA), 8 GB of random access memory (1600 MHz DDR3), and a solid-state disk with a hierarchical file system. The version of SPOON is 2.3.1, and the version of the Java virtual machine is OpenJDK 7. The results are shown in Table V.
For five out of 13 subjects, the time to build a model of a program and write its pretty-printed version is less than 1 s. For the remainder, it takes up to 6.8 s (for Commons-collections) to perform the same task. We see an expected correlation between application size and processing time. The correlation is not strict, because not all applications use exactly the same features of the Java language and thus do not exercise the exact same parts of SPOON.
If we compare those durations with the time required to compile the whole application (using javac), model construction is clearly cheaper (between one-third and one-tenth of the compilation time). Consequently, this experiment shows that compilation time is a good upper bound on the cost of source code analysis. Note that the SPOON

Case studies
We now present four case studies to give concrete insights on how SPOON is relevant to analyze and transform source code.

Java 1.4 to Java 5 translator.
In the early years of Java 5, there was an important need for porting Java 1.4 code. In particular, the Java 5 compiler generates a great deal of type-safety warnings (in particular when using the collection framework). In order to avoid these warnings, one needs to add type parameters to the legacy code, which is a time-consuming, error-prone, and repetitive task. Automated refactoring approaches are implemented in some integrated development environments (e.g., Eclipse and IDEA). As an illustration of SPOON's power, let us now present how to implement this refactoring with SPOON. The refactoring handles two cases: the automatic addition of generic types and the transformation of for loops into foreach loops.
Generics. To introduce generic type parameters, one needs to statically determine the missing types. This analysis consists of checking variable usages to infer their type parameter(s). To do so, we use the algorithm described by Fuhrer et al. [17], which uses a constraint-based approach.
Fuhrer's type-inference algorithm is composed of two main stages. In the first stage, a pass over the statements of the program is made. During this pass, type constraints on the expressions are derived and added to an equation system. In the second stage, once all constraints are derived, the equation system is solved, and a single type solution is associated with each expression. For example, in Table VI, a fragment of source code and the relevant-derived constraints are shown. After having found a set of types that satisfies all the derived constraints, the program is then transformed into a generic-compliant Java 5 program.
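The two stages can be illustrated with a deliberately simplified model. The classes below are our own toy illustration (not Fuhrer's implementation or SPOON's metamodel): the first pass collects constraints into an equation system, which a solver would then resolve to one type per expression.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the first stage: deriving type constraints
// from statements. These classes are our own simplification, not
// SPOON's metamodel or Fuhrer's implementation.
public class ConstraintDerivation {
    // A constraint relates the inferred types of two expressions,
    // e.g. "type(x) = elementType(list)".
    record Constraint(String left, String right) {
        @Override public String toString() { return left + " = " + right; }
    }

    public static void main(String[] args) {
        List<Constraint> system = new ArrayList<>();
        // For an assignment "x = list.get(0)", derive that the type
        // of x must equal the element type of list.
        system.add(new Constraint("type(x)", "elementType(list)"));
        // For "list.add(s)" with s of type String, derive that the
        // element type of list must accommodate String.
        system.add(new Constraint("elementType(list)", "String"));
        // Stage two (not shown) solves the system; here the unique
        // solution is elementType(list) = String, hence type(x) = String.
        system.forEach(System.out::println);
    }
}
```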
Implementation The refactoring is implemented by two sets of SPOON processors. The first set derives the type constraints, while the second eliminates cast operations made redundant by the introduction of type parameters. The solution of the constraint system is found by means of a book-keeping algorithm, as described in [17].
First, constraints are derived using five analysis processors; each of them takes care of a single kind of statement: assignments, class declarations, method invocations, method returns, and field and local variable definitions. Each processor implements an inference rule that produces a given set of constraints for the statement or expression it processes.
Second, the constraint system is solved, and the solution is translated back into code by a transformation processor that manipulates the type declarations of field declarations, method parameters, and local variable declarations. Finally, redundant casts are eliminated by a last processor that, during a second processing round over the program model, visits all invocations and checks whether they are cast. If the cast is to a type that is assignable from the return type of the method, the cast is removed.
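The assignability test used to decide whether a cast is redundant can be sketched with plain reflection. This is a stand-alone analogue of the check only; the actual processor operates on SPOON's type references rather than on Class objects.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone analogue of the redundant-cast check: a cast
// (T) expr is redundant when T is assignable from the static
// type of expr. The real processor performs this test on
// SPOON's type references; here we use java.lang.Class.
public class RedundantCastCheck {
    static boolean isRedundant(Class<?> castType, Class<?> exprType) {
        return castType.isAssignableFrom(exprType);
    }

    public static void main(String[] args) {
        // (List) new ArrayList() : redundant, List is a supertype.
        System.out.println(isRedundant(List.class, ArrayList.class));
        // (ArrayList) someList : a downcast, not redundant.
        System.out.println(isRedundant(ArrayList.class, List.class));
    }
}
```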
Loop transformations Java 5 introduced a new language construct called 'enhanced for loop' (also known as foreach). This foreach construct is syntactic sugar that avoids the explicit use of iterators. Let us now describe a refactoring that uses SPOON to translate regular for iterator-based loops into their foreach equivalent.
To translate a traditional for loop into a foreach loop, it is necessary to identify the source code pattern that denotes a translatable loop and to replace the for loop with its equivalent. We have implemented a SPOON processor that performs this task for this particular kind of for loop.
The ForeachProcessor is implemented as follows. When a for loop is found (CtFor elements), it is tested whether it matches the desired structure (i.e., is translatable). For instance, we check that the loop body does not contain further calls to next(). We use the reflection API to check that the initialization, looping expression, and the first statement of the block are of the expected form. To replace the initial for loop statement, we use the template LoopTemplate shown in Listing 11, where _collection_ represents the iterable collection, _body_ represents the loop body without the first statement of the initial loop, _I_ represents the type of the collection's contents (see Section 3.3.5 for type substitution), and _loopingVariable_ represents the name of the identifier used to denote the currently iterated element. ForeachProcessor then instantiates this template by feeding the parameters with the appropriate values in order to get the piece of code used to replace the original loop.
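The shape of the rewrite can be illustrated on plain Java. The names below are ours; in the refactoring itself, the replacement loop is produced by instantiating LoopTemplate.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Before/after shape of the loop refactoring. Both loops are
// semantically equivalent; the foreach form corresponds to what
// instantiating the template produces.
public class LoopShapes {
    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");

        // Translatable pattern: iterator created in the init clause,
        // hasNext() as the condition, next() as the first statement
        // of the body.
        StringBuilder before = new StringBuilder();
        for (Iterator<String> it = items.iterator(); it.hasNext();) {
            String item = it.next();
            before.append(item);
        }

        // Equivalent foreach produced by the transformation.
        StringBuilder after = new StringBuilder();
        for (String item : items) {
            after.append(item);
        }

        System.out.println(before + " " + after); // both build "abc"
    }
}
```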

Meta-object protocol for method calls.
A meta-object protocol [18] enables developers to access the internals of a programming language's semantics. There are different meta-object protocols and different ways for implementing them. For instance, a meta-object protocol can modify the object instantiation or the read and write accesses of object fields. It can be implemented within the runtime environment (such as in Smalltalk) or with code transformation. In this section, we present an implementation of a meta-object protocol for method calls (called meta-call protocol or MCP) implemented as a code transformation with SPOON.
A meta-object protocol for method calls enables one to intercept all method calls of a software application [19]. The interception gives access to many pieces of information, starting with the receiver object, the method to be called, and the parameter values. AspectJ [20] provides one kind of MCP using aspects. The interception can have several different goals: logging some information, replacing the method by a more efficient alternative, checking preconditions, changing the arguments, and so on.
Our MCP intercepts the following information: the receiver, the actual parameters, the statically declared parameter types, the called method, the class containing the called method, the location of the call, and the code being replaced. To illustrate this, Listing 12 shows a method call before and after transformation, where all the required information is passed to the method call protocol handler (class MCP). MCP takes as input one object that specifies the method call (an instance of class MethodCall). This object is set once by chaining setters and is immutable.
Within class MCP, several strategies are implemented. The default strategy simply calls the method using reflection; it is semantically equivalent to the original code.
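The reflective dispatch of the default strategy can be sketched in a self-contained form. MethodCallInfo below is our own illustrative stand-in for the MethodCall class of the case study; the actual MCP class is part of that library.

```java
import java.lang.reflect.Method;

// Sketch of the MCP default strategy: invoke the intercepted
// method reflectively on the recorded receiver and arguments.
// MethodCallInfo is an illustrative stand-in for MethodCall.
public class McpSketch {
    record MethodCallInfo(Object receiver, String methodName,
                          Class<?>[] paramTypes, Object[] args) {}

    static Object proceed(MethodCallInfo call) throws Exception {
        Method m = call.receiver().getClass()
                       .getMethod(call.methodName(), call.paramTypes());
        return m.invoke(call.receiver(), call.args());
    }

    public static void main(String[] args) throws Exception {
        // Intercepted call: "hello".substring(1)
        MethodCallInfo call = new MethodCallInfo(
            "hello", "substring", new Class<?>[]{int.class}, new Object[]{1});
        System.out.println(proceed(call)); // prints "ello"
    }
}
```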
Many Java frameworks rely on annotations: users of these frameworks annotate their code in order to access services or to declare special semantics of their program. The manner in which annotations are placed on program elements must respect the rules specified by the framework being used, and violations of these rules result in compilation or runtime errors.
In this section, we describe a SPOON-based framework for the specification and checking of the usage rules of annotations. Our annotation validation framework is called AVal. The key idea is that when annotations are defined, they also specify the way in which they should be validated. The validation rules are specified by meta-annotations -that is, annotations on the annotations.
The meta-annotations declare constraints with respect to the program element that it annotates (its target) as well as with respect to other annotations in the system (interplay between annotations).
Constraints in AVal belong to one of four kinds: local, scoped, attribute, or target. Local and scoped constraints deal with the presence of annotations on the AST with respect to each other. Attribute constraints define valid values for the attributes of the annotations used in the system, while target constraints deal with the characteristics of the AST nodes that can carry each annotation. Examples of AVal annotations for each kind are as follows. Local constraints restrict the annotations allowed on a particular AST node: for example, @Requires takes another annotation as parameter and states that the constrained annotation is only valid if the target AST node already carries the annotation given as parameter to @Requires. Scoped constraints dictate the annotations allowed or forbidden on sibling nodes of the AST: for example, to constrain the presence of an @Id annotation to only one of the fields of a class, the @Id annotation definition must be constrained with @UniqueInside(CtClass.class). Attribute constraints restrict the values allowed for the parameters of annotations: when placed on an attribute of an annotation type, the @URLValue annotation checks that the values of the attribute are valid uniform resource locator strings. Target constraints describe the characteristics of the AST nodes allowed to carry an annotation.
For example, the annotation @Type(Customer.class) would restrict an annotation to be placed only on classes or fields that extend the Customer type.
An in-depth description of the semantics of these constraints can be found in [23]. AVal has a four-layer architecture: at the bottom lies the client code to be validated. The client code carries domain annotations defined by a domain annotation library. The annotation types defined in the library are themselves annotated with constraining annotations defined by AVal (e.g., @Requires). On the top layer, the validation of each constraint is implemented by a SPOON annotation processor.
Annotation validation with AVal: web services with JSR 181. We now use the three annotations defined in JSR 181 for web services (@WebService, @WebMethod, and @OneWay) to show how their usage can be validated with AVal by special SPOON processors.
The JSR 181 [24] is a specification for the description of web services using pure Java objects. The JSR defines a set of annotations and their mapping to the XML-based web service description language. In Section 2.5.1 of the specification, it is stated that implementations of the JSR must provide a validation mechanism that performs the semantic checks on the Java Bean web service definition. In what follows, we describe three of these annotations and how AVal is used to provide the semantic checks required by the specification.
@WebService marks a class as representing a web service. It specifies the following constraints: The class must be public and must not be final nor abstract; it must also define a default public constructor; the wsdlLocation parameter must be a valid uniform resource locator pointing to the definition of the web services description language file for this web service. The meta-annotated code for the @WebService annotation is shown in Listing 14. The @AValTarget annotation is used to constrain @WebService to classes. @Modifier checks that the class carrying the annotation is public and neither final nor abstract.
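The checks behind these constraints can be expressed with plain reflection. This is a stand-alone analogue only; AVal performs them on SPOON's program model via annotation processors.

```java
import java.lang.reflect.Modifier;

// Stand-alone analogue of the @WebService structural checks:
// the class must be public, neither final nor abstract, and
// must declare a public no-argument constructor.
public class WebServiceChecks {
    static boolean isValidWebServiceClass(Class<?> c) {
        int m = c.getModifiers();
        if (!Modifier.isPublic(m) || Modifier.isFinal(m) || Modifier.isAbstract(m))
            return false;
        try {
            // getConstructor() only returns public constructors.
            c.getConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false; // no public no-arg constructor
        }
    }

    public static void main(String[] args) {
        // public, non-final, non-abstract, public no-arg constructor
        System.out.println(isValidWebServiceClass(java.util.ArrayList.class));
        // rejected: java.lang.Math is final
        System.out.println(isValidWebServiceClass(java.lang.Math.class));
    }
}
```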
@WebMethod marks a method as being a web operation for the web service. Web methods can only be declared on web service classes. This is expressed by the AVal constraint @Inside in line 2 of Listing 15. Also, the method must be public (line 3).

Asynchronous web methods are declared with the @OneWay annotation. This means that it is an error to annotate a method as being @OneWay without it being a @WebMethod (line 1, Listing 16). Also, since it is a one-way method, no return value is allowed; this is specified in line 2 by the @Type constraint, which restricts the return type to void.
Domain-specific meta-annotations When AVal annotations are not enough, the SPOON API enables developers to write domain-specific AVal meta-annotations and their corresponding custom checkers. In the presentation of @WebService, a domain-specific meta-annotation @HasDefaultConstructor is used. It checks for the presence of a public, no-argument constructor in the annotated class. This is achieved by annotating the annotation type definition of @HasDefaultConstructor with an AVal-defined annotation @Implementation, which makes the link between the AVal annotation and the SPOON processor that validates the rule it defines.
The source code for the SPOON processor that checks this constraint is shown in Listing 18. This processor validates annotations that carry the @HasDefaultConstructor meta-annotation, as specified by its implementation of the Validator<HasDefaultConstructor> interface. For each occurrence of such an annotation, the check method is called with, as parameter, an object that contains contextual information (such as the AST node of the program element on which the annotation was found). The validator reports violations of the @HasDefaultConstructor constraint (see line 16 in Listing 18).
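The rule itself can be reproduced in a self-contained form. The annotation and the reflective check below are our own illustration; the real checker is a SPOON processor operating on the program model.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Self-contained illustration of the rule @HasDefaultConstructor
// enforces: every class carrying the annotation must declare a
// public no-argument constructor.
public class DefaultCtorRule {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface HasDefaultConstructor {}

    @HasDefaultConstructor
    public static class Ok { public Ok() {} }

    @HasDefaultConstructor
    public static class Bad { public Bad(int x) {} }

    static boolean check(Class<?> c) {
        // Classes without the annotation are not constrained.
        if (!c.isAnnotationPresent(HasDefaultConstructor.class)) return true;
        try {
            c.getConstructor(); // public no-arg constructor?
            return true;
        } catch (NoSuchMethodException e) {
            return false; // violation
        }
    }

    public static void main(String[] args) {
        System.out.println(check(Ok.class));
        System.out.println(check(Bad.class));
    }
}
```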
We have used AVal as a means to describe and enforce the usage rules of annotations defined by three large frameworks [23]: Hibernate's implementation of the Java persistence API [21], Fraclet, an annotation-based implementation of the Fractal component model for Java [25], and the JSR 181 for web services. We were able to use AVal to specify 23 annotations from these frameworks.
Implementing AVal on top of SPOON provides benefits at two levels: On the implementation side, SPOON's abstractions allow AVal to rely on meta-annotations to specify annotation constraints in a modular manner.

Discussion
To sum up, we have shown that SPOON can be used in a number of different analysis and transformation scenarios. As the examples show, the SPOON API enables the developer to write transformations that are intuitive to express and easy to maintain.
SPOON is deeply founded on AST analysis; consequently, all comments and layout are discarded when transforming code. In large-scale refactoring or mass-maintenance scenarios, this is an important limitation.
In this paper, we have not discussed the usage of SPOON for standard generic static analyses such as building a control-flow graph, a program dependence graph, or a static single assignment form. This is on purpose: as the title suggests, SPOON is meant to be used 'by the masses' for writing their own domain-specific analyses (such as the one related to web services). However, based on our experience, SPOON can also be used for such tasks, for instance, to perform dead code elimination in a Java project.

RELATED WORK
For early papers on source code analysis and transformation, we refer the reader to the classical surveys of Partsch et al. [30] and Feather [31]. With respect to analysis only (and not transformation), there is the survey by Binkley [1]. An analysis of the different characteristics of program transformation systems has been performed by Nadera et al. [32]. We now present the most closely related pieces of research. Note that, in the literature, different terms relate to transformation: instrumentation refers to adding monitoring points; meta-programming refers to creating pieces of programs from more abstract specifications; and reflection and intercession refer to changing a program's behavior at runtime. They all somehow refer to the same concept and are discussed in what follows.
Ichisugi and Roudier [33] devised a preprocessor for Java. A preprocessor only enables one to write transformations specified with intrusive, preprocessor-specific directives. On the contrary, SPOON allows one to transform any Java program in a non-intrusive manner.
Chiba [34] introduced the notion of compile-time meta-architecture. SPOON provides the same kind of architecture for the Java language; in addition, it uses Java generics for well-typed program analyses and transformations. Tatsubori et al. [35] described a macro system for Java called OpenJava. OpenJava contains introspection and intercession methods that are similar to SPOON's. The Javadoc system [36] enables one to write 'doclets', pieces of code that analyze the structure of Java programs (classes, fields, and methods). Beyond the structural part, SPOON enables the analysis of every single program element: SPOON allows full intercession down to the statements and expressions of the language.
Compile-time meta-architecture and aspects à la AspectJ [20] are closely related concepts. SPOON can be seen as a compile-time aspect weaver, where the AST node type and the isToBeProcessed method act as the pointcut and the process method acts as the advice. The main difference between AspectJ and SPOON is that SPOON enables any kind of pointcut and advice to be written: any arbitrary code pattern can be encoded in isToBeProcessed, as well as any transformation on code elements. AspectJ is simpler to use, but at the price of a more limited scope of aspects.
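The analogy can be made concrete with a minimal stand-in for the processor contract. The base class below is our own simplification, kept self-contained for illustration; SPOON's AbstractProcessor additionally gives access to the program model and its factory.

```java
import java.util.List;

// Minimal stand-in for the processor contract: isToBeProcessed
// plays the role of the pointcut, process the role of the advice.
public class ProcessorAnalogy {
    static abstract class Processor<T> {
        boolean isToBeProcessed(T candidate) { return true; } // "pointcut"
        abstract void process(T element);                     // "advice"
        void scan(Iterable<T> elements) {
            for (T e : elements)
                if (isToBeProcessed(e)) process(e);
        }
    }

    public static void main(String[] args) {
        // A "pointcut" matching strings starting with 'a', with an
        // "advice" that prints the match upper-cased.
        Processor<String> p = new Processor<>() {
            @Override boolean isToBeProcessed(String s) { return s.startsWith("a"); }
            @Override void process(String s) { System.out.println(s.toUpperCase()); }
        };
        p.scan(List.of("alpha", "beta", "answer"));
    }
}
```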
Van Deursen and Visser [37] introduced the idea of combining AST visitors to perform a wide range of analysis tasks. The main difference with SPOON is that SPOON provides not only visiting mechanisms but also a powerful transformation API. However, Van Deursen and Visser's visitor combinators could be plugged on top of SPOON transformations to create higher-order transformations.
There are many bytecode analysis and transformation libraries (e.g., [6,38]). SPOON does not work on bytecode but on source code. This enables developers to reason about and transform language structures and constructs that disappear during the compilation process (e.g., try/catch blocks, anonymous classes, etc.). Also, the transformations are more intuitive to write because they refer to the syntactic programming concepts.
Klint et al. [39] presented Rascal. Rascal is a domain-specific language for source code analysis and manipulation. It enables one to implement refactoring algorithms that are independent of one particular programming language. While a domain-specific language allows conciseness and expressiveness, it has a steep learning curve. On the contrary, any good Java developer is able to write a SPOON transformation of Java programs within a relatively short time.
Visser [40] invented Stratego, a language for program transformations based on rewriting rules. Schordan and Quinlan [41] described a source-to-source transformation framework dedicated to optimization. MetaJ is a metaprogramming environment for Java [42]. It provides a template mechanism that is similar to ours, described in Section 3.3. Cordy [43] presented TXL, a source transformation language originally designed as a rapid prototyping system for programming language dialects. The TXL language is based on rules to specify the transformations. Balland et al. [44] summarized their research efforts on Tom, a term-rewriting platform that is able to deal with Java code. All those approaches require the analysis and transformation developers to master a new abstract formalism. On the contrary, using SPOON, only the concept of AST is required to start writing productive transformations in plain Java.
Ludwig and Heuzeroth [45] designed the Compost system and some program transformations. Our goal and architecture are similar, but the SPOON mechanisms for transforming programs are more powerful. For instance, Compost has no way for specifying templates.
Strein et al. [46] proposed an extensible metamodel for program analysis. They focus only on program analysis, while SPOON enables users to specify analyses and transformations within the same framework. Kuipers and Visser [47] presented a meta-approach to analysis: given a grammar definition, JJForester generates the classes implementing the metamodel and the traversal helper classes. Contrary to SPOON, this is one meta-level further.
Cetus [48] is a compiler extension that provides an intermediate representation and an API to manipulate C/C++ programs. In SPOON, what corresponds to Cetus' intermediate representation is the SPOON metamodel. While their approach is for low-level languages with a focus on automatic parallelization, ours is in Java and aims at being generic, as shown by our different case studies.

CONCLUSION
We have presented SPOON, a library for analyzing and transforming Java source code. SPOON provides a unique blend of features: a Java metamodel dedicated to analysis and transformation; a powerful intercession API; static checking of transformations using templates; and fine-grained annotation processing.
SPOON has been developed since 2006 and has been used in a number of research and industrial projects. We have shown with four case studies how it is a key component in different scenarios. Future work will consist in porting the concepts of SPOON to the analysis and transformation of other popular programming languages, in particular JavaScript. Also, our current research heavily uses SPOON to insert runtime hooks, used for runtime verification and runtime failure recovery.