Neural style transfer: a paradigm shift for image-based artistic rendering?

In this meta paper we discuss image-based artistic rendering (IB-AR) based on neural style transfer (NST) and argue that, while NST may represent a paradigm shift for IB-AR, it also has to evolve into an interactive tool that considers the design aspects and mechanisms of artwork production. IB-AR has received significant attention in the past decades for visual communication, covering a plethora of techniques to mimic the appeal of artistic media. Example-based rendering represents one of the most promising paradigms in IB-AR to (semi-)automatically simulate artistic media with high fidelity, but so far has been limited because it relies on pre-defined image pairs for training or informs only low-level image features for texture transfers. Advances in deep learning have been shown to alleviate these limitations by matching content and style statistics via activations of neural network layers, thus making a generalized style transfer practicable. We categorize style transfers within the taxonomy of IB-AR, then propose a semiotic structure to derive a technical research agenda for NSTs with respect to the grand challenges of NPAR. We finally discuss the potentials of NSTs, thereby identifying applications such as casual creativity and art production.


INTRODUCTION
Non-photorealistic rendering (NPR) constitutes a highly active research domain of computer graphics that deals with the expression, recognition, and communication of complex image contents by means of information abstraction and highlighting [DeCarlo and Santella 2002; Hertzmann 2010; Lansdown and Schofield 1995]. In particular, image-based artistic rendering (IB-AR) enjoys a growing popularity in mobile expressive rendering [Dev 2013; Winnemöller 2013] to simulate the appeal of traditional artistic styles and media for visual communication [Kyprianidis et al. 2013; Rosin and Collomosse 2013] such as pencil, pen-and-ink, oil paint, and watercolor. Classical IB-AR techniques typically model the design aspects that are involved with these artistic styles, i. e., to direct the smoothing and contour highlighting of image filtering, the approximation of image contents via rendering primitives (e. g., brush strokes, stipples), or an image segmentation. A more generalized approach has been introduced by example-based rendering (EBR), which employs machine learning or statistical models to emulate characteristics of artistic styles from visual examples [Kyprianidis et al. 2013]. Previous techniques in EBR, however, typically require analogous style and content pairs for training [Hertzmann et al. 2001] or only inform low-level image features for texture transfers, thus limiting their application and creative control over the design aspects. Advancements in deep learning and convolutional neural networks (CNNs) demonstrated that these technical limitations can be alleviated as follows:
(1) Deep CNNs are able to accurately classify high-level image contents across generalized data sets [Simonyan and Zisserman 2015].
(2) Layers of pre-trained deep CNNs can be activated to match content and style statistics, and thus perform a neural style transfer (NST) between arbitrary images [Gatys et al. 2016b] (Figure 1).
Figure 2: Overview of NST techniques and applications. Previous works used implementations of NSTs to perform generalized color and texture transfers, stylize videos, and provide means for casual creativity in mobile expressive rendering. Images © Risser et al. [2017], from Gatys et al. [2016b] © IEEE, from Selim et al. [2016] © ACM, all used with permission.
To this end, we argue that deep learning denotes a key technique in the chronology of IB-AR [Kyprianidis et al. 2013], as it makes, for the first time, a generalized style transfer practicable. First applications demonstrate this process using the example of color and texture transfers as well as casual creativity systems and services (Figure 2). To provide a sophisticated paradigm shift for IB-AR, however, we believe that NSTs need to mature from color and texture transfers to interactive tools that consider the design aspects and mechanisms involved in artwork production, i. e., to ease the visual expression of artists, non-artists (i. e., general public), and scientists [Isenberg 2016; Salesin 2002].
In this paper we discuss the potentials and challenges of NST for IB-AR. In the following section, we first provide a conceptual overview of (neural) style transfer and show how the design process differs from classical IB-AR paradigms (Section 2). Next, we provide a semiotic structure for IB-AR that combines design aspects and mechanisms of artwork production with well-established design principles of NPAR (Section 3). We then use this structure to categorize current (neural) style-transfer techniques (Section 4) and derive a technical research agenda for NST (Section 5), including potential mutual inclusions with other IB-AR paradigms such as image filtering (Figure 1). With this research agenda we shed light on how NSTs may contribute to dealing with the grand challenges of NPAR put forth by Salesin [2002] and subsequently revisited, and how they can be evolved into interactive tools that consider mechanisms of artwork production. Finally, we identify potential future applications such as casual creativity (Section 6).

ARTISTIC STYLE TRANSFER WITHIN THE TAXONOMY OF IB-AR TECHNIQUES
IB-AR is related to the processes of visual abstraction that are involved in the creation of general artworks [Hertzmann 2010; Ma 2002] and used to express uncertainty, communicate abstract ideas, and evoke the imagination by addressing the rational, emotional, and cognitive qualities of the human mind [Halper et al. 2003; Hertzmann 2010]. For an effective visual abstraction, the separation of content from style is thus considered to be a key factor to allow us to distinguish between the mechanisms used for capturing the essence of an image, on the one side, and the design aspects that drive the aesthetic appeal to stimulate human senses [Salesin 2002]. To this end, research in IB-AR has been devoted to deduce the design aspects of an artistic style that are involved in artwork production:
Definition "Artistic Style": The constant form, and sometimes the constant elements, qualities, and expression, in the art of an individual or a group.
- Meyer Schapiro [Schapiro 1994]
Figure 3: Overview of style transfer concepts, which differ in the way artistic styles are modeled or transferred: heuristics-based algorithms (left) and style transfers based on image statistics or analogies (middle) require explicit modeling or training phases prior to application, whereas NSTs (right) combine both aspects in a single phase.
IB-AR implementations typically require programmers to model the design space as well as the defining and distinguishing characteristics of an artistic style. Here, we see two general approaches which align with Kyprianidis et al.'s [2013] taxonomy as follows:
(1) Heuristics-based Algorithms: Paradigms that are based on rendering functions, which are implemented by a domain expert who explicitly models individual artistic styles and their corresponding design aspects or mechanisms. This group basically comprises stroke-based rendering, region-based techniques, image processing and filtering, and may also account for physically-based simulations.
(2) Style Transfer Algorithms: Example-based rendering which is directed to learn or reproduce artistic styles from visual examples (ground-truth data sets). This type often comprises statistical models and optimization schemes to balance aspects of content and style in the stylized output.
Prominent examples of heuristics-based algorithms are the stroke-based rendering approach of Hertzmann [1998], the cartoon pipeline of Winnemöller et al. [2006], and the watercolor system of Bousseau et al. [2006]. For style transfer algorithms, by contrast, the literature primarily distinguishes between EBR techniques that transfer color and those that transfer texture [Kyprianidis et al. 2013]. However, with the maturation of machine learning, we believe that this strict separation is no longer practicable because color and texture represent only two out of many variables that define the composition of artistic styles, and deep learning enables NSTs to abstract from applications (e. g., color/texture transfer). To this end, we conjecture that it is worthwhile to provide a process-oriented taxonomy for EBR that reflects how artistic style transfers are modeled or technically implemented. Given artistic works as ground-truth data, we argue that three concepts may distinguish current and future EBR techniques (Figure 3):
I. Style Transfer using Image Statistics: Techniques that balance content and style of two separate inputs using statistical models. Prominent examples are histogram-based color transfers that equalize the mean and variance between content and style images [Neumann and Neumann 2005; Reinhard et al. 2001].
II. Style Transfer using Image Analogies: Techniques that use image pairs for training (a source image and an artistic depiction of this image), i. e., to learn an analogous transformation such that content images can be transformed into an artistic rendering of similar visual style [Hertzmann et al. 2001].
III. Style Transfer using Neural Networks: Techniques that employ neural networks to separate and recombine the content and style of arbitrary inputs. Typically, loss functions are minimized iteratively to balance the components of style and content in the output [Gatys et al. 2016b], or feed-forward neural networks are trained for linear image transformation [Johnson et al. 2016a,b].
We believe this classification helps to organize EBR techniques by their technical foundation and underpins the maturation from application-specific (e. g., color transfers) towards generalized style transfers. In the following section, we define design aspects and mechanisms important for implementing these three concepts.
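As an illustration of concept I, the following minimal sketch equalizes the mean and standard deviation of each color channel between a content and a style image. It is a simplification under stated assumptions, not the method of the cited works: the function names are hypothetical, pixels are plain tuples of channel values, and the conversion to a decorrelated color space that such transfers usually apply beforehand is omitted.

```python
from statistics import mean, pstdev


def transfer_channel_statistics(content, style):
    """Shift and scale content values so that their mean and standard
    deviation match those of the style values (one channel only)."""
    c_mean, c_std = mean(content), pstdev(content)
    s_mean, s_std = mean(style), pstdev(style)
    # A constant content channel has no variation to rescale (assumption:
    # it is simply mapped to the style mean).
    scale = s_std / c_std if c_std > 0 else 0.0
    return [(v - c_mean) * scale + s_mean for v in content]


def transfer_color_statistics(content_pixels, style_pixels):
    """Apply the per-channel transfer to lists of (c1, c2, c3) pixel tuples.
    The two images may contain different numbers of pixels."""
    content_channels = zip(*content_pixels)
    style_channels = zip(*style_pixels)
    out_channels = [
        transfer_channel_statistics(list(c), list(s))
        for c, s in zip(content_channels, style_channels)
    ]
    return list(zip(*out_channels))


if __name__ == "__main__":
    content = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
    style = [(10.0, 1.0, 5.0), (30.0, 3.0, 5.0)]
    print(transfer_color_statistics(content, style))
```

Because only two global statistics per channel are matched, the sketch cannot reproduce texture or stroke structure, which is the limitation of purely statistical transfers that motivates concepts II and III.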

A SEMIOTIC STRUCTURE FOR ARTISTIC STYLE TRANSFER
Semiotics deals with the study of symbols and how they communicate image contents or information in a meaningful way [Bertin 2010]. In artwork production, elements of design are considered to be fundamental aspects of pictorial semiotics [Rudner 1951], whose mutual impact defines the "composition" of an artwork, and thus its artistic style. Therefore, we believe that the transfer of proven design aspects and mechanisms of artwork production to modern media and imaging technologies, and the development of new artistic styles are key challenges for current and future research. In IB-AR theory [Hertzmann 2010], a semiotic structure that considers these design aspects and the mechanisms of interactive NPAR has not been formulated yet. We believe, however, that such a structure is essential to provide developers of NPAR techniques with the conceptual means to help them compose and extend artistic styles as well as evolve (neural) style transfers into interactive tools that ease the visual expression of artists, non-artists and scientists for illustrative visualization [Isenberg 2016; Salesin 2002]. We thus formulate a semiotic structure that is based on the graphic semiology principles of Bertin [2010] and MacEachren et al. [2012] that provide a theoretical foundation to visualization (Figure 4). The visual variables described by Bertin [2010] and MacEachren et al. [2012], however, cannot fully express the unique requirements of interactive media and systems (e. g., animation, video, interactive parameterizations). We thus extend the classification by the concepts of filtering and perception to consider interactivity, level of abstraction, and coherence/continuity issues of NPAR as well. This way, user involvement can be considered as a key mechanism for maintaining an iterative feedback loop between a system (as design instance implementing NPAR techniques) and the user's requirements (as consumer/artist). In particular, it is directed to interactively adjust the semiotic structure that defines aspects of modeling, filtering, composition and perception (Figure 4):
1. Modeling Aspects: They deal with encoding real-world phenomena as color maps, and complementary information as feature maps (e. g., results of an image segmentation, saliency analysis, optical flow estimation) and geometry maps (e. g., depth).
2. Filtering Aspects: They are used to select and apply different configurations of composition variables according to image location, color, or feature. Filtering aspects should provide effective control to globally and locally adjust the level of abstraction. Examples are the luminance-based placement of stipples [Martín et al. 2015], the location-dependent placement of contour lines [Cole et al. 2008], and feature-guided image filtering using orientation information [Kyprianidis and Döllner 2008].
3. Graphical Elements: These elements comprise rendering primitives such as points, lines, areas, and generalized 2D elements. They may also define rendering paths or locations for texturing, e. g., stippling, contour-lining, and the decoration of image segments.
4. Graphical Variables: They refer to the illusion of physical mass and density (form), image regions with well-defined boundaries (shape), the size of graphical elements, and color including brightness as phenomena of light and human visual perception. Prominent examples refer to rendering with reduced color palettes and at multiple scales [Kyprianidis et al. 2013].
5. Design Mechanisms: They deal with the surface character and relationships among image features with respect to position and direction (space/texture), transparency to infer color blending via overdraw or layering, the orientation of graphical elements, the shading and lighting conditions, and the crispness/resolution of image features. Previous works deal with mechanisms for stylized shadows [DeCoro et al. 2007], the orientation and layering of curved brush strokes [Hertzmann 1998], and low-pass image filters.
6. Perceptional Aspects: IB-AR typically aims to reproduce a hand-drawn look, where "distracting flickering and sliding artifacts" for animated scenes (e. g., virtual environments, video) should be minimized [Bénard et al. 2011]. Bénard et al. [2011] propose this challenge to be a concurrent fulfillment of three goals: flatness, motion coherence, and temporal continuity. In addition, we conjecture that pictorial cues are important perceptional aspects because artists often carefully consider linear perspective, occlusion, and texture gradients to infer depth in their artworks.
Figure 4: Semiotic structure comprising graphical core variables and mechanisms that may be considered for style transfers.
The mutual impact of these aspects defines the individual artistic style and composition, and thus should be considered when designing and implementing style transfers. In particular, we argue that color and texture are merely the two semiotic aspects that most techniques currently serve. By contrast, a "successful" modeling approach should consider the distinctive design aspects and mechanisms involved in a particular artistic style, i. e., with respect to the rendering functions, optimization functions for image statistics and analogies, or loss functions for neural networks (Section 2).

[...] [Yang et al. 2017] to colorize grayscale images. With interactive methods it is also possible to maintain control over colors that are involved in palette-based color transfers [Chang et al. 2015; Pouli and Reinhard 2011]. Another classical application for image statistics can be found in image stippling [Martín et al. 2017]. Here, patterns are learned and applied through example using statistical texture measures [Kim et al. 2009; Maciejewski et al. 2008], modeling aspects such as the location of points (stipples), texture, shading, and resolution, which should depend on the spatial size of the output image. Martín et al. [2011] evolve these methods towards a "scale-dependent, example-based stippling technique that supports both low-level stipple placement and high-level interaction with the stipple illustration." These methods are prime examples of how style transfers can be implemented on a primitive level, considering graphical elements explicitly rather than texture patches.

Style Transfer using Image Analogies
Most style transfer techniques defined by image analogies are based on texture transfers. Their basic idea is to copy image patches from a style image to a content image in a way that locally shares and minimizes pixel differences in the content image, thereby using a smoothness constraint to provide similarity with adjacent textures [Efros and Freeman 2001]. Hertzmann et al. [2001] define this as an optimization problem by learning the analogous transformation of a style/ground-truth image pair (A, A′) and applying it to a content image B to obtain a stylized output B′ such that A : A′ :: B : B′.
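The analogy A : A′ :: B : B′ can be illustrated with a deliberately reduced sketch. It operates on 1D signals instead of images and uses a brute-force nearest-neighbor search over small neighborhoods. The function name and the `radius` parameter are hypothetical, and the multi-scale search and the coherence (smoothness) term that published image-analogy techniques rely on are left out.

```python
def image_analogy_1d(a, a_prime, b, radius=1):
    """For every position in b, find the position in a whose neighborhood
    is most similar and copy the corresponding value of a_prime."""

    def patch(signal, i):
        # Neighborhood around i, with indices clamped at the borders.
        last = len(signal) - 1
        return [signal[min(max(i + d, 0), last)]
                for d in range(-radius, radius + 1)]

    b_prime = []
    for i in range(len(b)):
        target = patch(b, i)
        # Brute-force search for the best matching source position
        # (sum of squared differences between neighborhoods).
        best = min(
            range(len(a)),
            key=lambda j: sum((p - q) ** 2
                              for p, q in zip(patch(a, j), target)),
        )
        b_prime.append(a_prime[best])
    return b_prime


if __name__ == "__main__":
    a = [0, 1, 2, 3]            # source signal A
    a_prime = [0, 10, 20, 30]   # its "stylized" depiction A'
    b = [3, 0, 2]               # new content B
    print(image_analogy_1d(a, a_prime, b, radius=0))
```

With `radius=0` the sketch degenerates into a per-value lookup; larger radii match local structure, which is where the dependence on analogous training pairs becomes apparent.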
Ashikhmin [2003] provides conditions for how to integrate user-defined feature maps to adjust parameter values of the texture transfer. The approach can also be used to learn stroke placements for contour-lining in domains such as portrait sketches using templates [Zhao and Zhu 2011] and modeling image features at multiple scales for level of abstraction rendering [Berger et al. 2013]. Further extensions use edge and orientation information encoded in feature maps to control the placement of texture patches and individual brush strokes, learn multiple styles and stroke patterns for portrait sketching and painting [Berger et al. 2013; Zhao and Zhu 2011], and estimate motion using flow fields to stabilize temporal coherence [Hashimoto et al. 2003]. Bénard et al. [2013] propose a sophisticated system for artists that performs style transfers for animations using orientation, velocity, and geometry information of 3D models to direct the transfer with shading and lighting conditions, and to ensure temporal and style continuity. In addition, they support overdraw and partial transparency using a layering approach explicitly defined by the artist. Most of these works, however, typically consider only luminance- or color-guided texture transfers, yet other information may be considered as well, such as illumination as shown by Fišer et al. [2016] for stylized 3D models.

Style Transfer using Neural Networks
To discuss this subfield, we draw on Gatys et al.'s [2016b] definition of NSTs. Given a style image, a content image and a loss network, e. g., VGG-16 [Simonyan and Zisserman 2015], that is used to define several loss functions to measure the difference between the output image and a target image, one can compute an output image by minimizing a weighted combination of the loss functions. Gatys et al. [2016b] initially define perceptual loss functions that control feature and style reconstructions to balance the components of content and style, and control spatial smoothness by regularizing the total variation, then solve the optimization problem using L-BFGS (Figure 5). Besides texture transfers, this approach can be employed to perform sophisticated color transfers as well, e. g., to colorize grayscale images [Iizuka et al. 2016]. Because this generalized style transfer employs back-propagation and combines learning and application in a single phase, we denote it as an iterative approach and distinguish it from the approach that separates learning from application to train a feed-forward neural network.
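The weighted combination of loss functions described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: feature maps are passed in as plain lists (one flat list of activations per channel) instead of being computed by a loss network, a single layer is used for both content and style, style is compared via Gram matrices of the feature channels, and the function names, normalization, and default weights are assumptions. The iterative minimization (e. g., with L-BFGS) is omitted.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.
    features: list of C channels, each a flat list of N activations."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def content_loss(out_feats, content_feats):
    """Squared difference between activations (feature reconstruction)."""
    return sum((o - c) ** 2
               for fo, fc in zip(out_feats, content_feats)
               for o, c in zip(fo, fc))


def style_loss(out_feats, style_feats):
    """Squared difference between Gram matrices (style reconstruction)."""
    g_out, g_style = gram_matrix(out_feats), gram_matrix(style_feats)
    return sum((a - b) ** 2
               for row_o, row_s in zip(g_out, g_style)
               for a, b in zip(row_o, row_s))


def total_variation(image):
    """Sum of squared differences between neighboring pixels of a 2D
    image (list of rows); penalizes spatial noise."""
    h, w = len(image), len(image[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                tv += (image[y][x + 1] - image[y][x]) ** 2
            if y + 1 < h:
                tv += (image[y + 1][x] - image[y][x]) ** 2
    return tv


def nst_objective(out_feats, content_feats, style_feats, image,
                  w_content=1.0, w_style=10.0, w_tv=0.1):
    """Weighted combination that an iterative NST would minimize with
    respect to the output image (weights are illustrative only)."""
    return (w_content * content_loss(out_feats, content_feats)
            + w_style * style_loss(out_feats, style_feats)
            + w_tv * total_variation(image))
```

Note that the Gram matrix discards where in the image an activation occurs: swapping spatial positions consistently in all channels leaves the style term unchanged while the content term grows, which is what allows the two components to be balanced against each other.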
Iterative Approaches. Extensions of Gatys et al.'s [2016b] work primarily define additional loss functions to control the output's composition. MRF loss functions, for instance, can be used as a local constraint to provide a more accurate texture patch matching and blending [Li and Wand 2016], histogram losses may produce outputs that statistically match style images more accurately [Risser et al. 2017], and a depth loss function can be used to consider the spatial distribution of image features [Liu et al. 2017]. Further, a temporal loss function based on optical flow can be used to stabilize temporal coherence when applied on a per-frame basis to video [Anderson et al. 2016; Gupta et al. 2017; Ruder et al. 2016; Selim et al. 2016]. A few works controlled perceptual factors locally by considering feature maps using semantics-based image segmentation, such as to subdivide the optimization problem of NST to local image regions [Champandard 2016] or facial regions of portrait images [Selim et al. 2016], to provide semantically more accurate transfers. Some enhancements also considered composition variables of the semiotic structure such as color, size, and location-based filtering by introducing control measures [Gatys et al. 2016a,c, 2017] (Figure 6). We see these works as a starting point to evolve NSTs into interactive tools for IB-AR that facilitate creative expression, which we discuss below.
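To make the idea of a flow-based temporal loss concrete, the following sketch compares each pixel of the current stylized frame with the pixel of the previous stylized frame it corresponds to under a given optical flow. It is a reduced illustration under assumptions that are not taken from the cited works: frames are single-channel 2D lists, the flow is given as integer offsets per pixel (no sub-pixel warping), and an explicit occlusion mask marks pixels without a reliable correspondence. All names are hypothetical.

```python
def temporal_loss(curr, prev, flow, occluded):
    """Mean squared difference between the current stylized frame and the
    previous stylized frame warped by the flow.

    curr, prev : 2D lists (rows of pixel values) of equal size
    flow       : flow[y][x] = (dx, dy), integer offset from a pixel in
                 curr to its corresponding position in prev
    occluded   : occluded[y][x] is True where no correspondence exists
    """
    h, w = len(curr), len(curr[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            px, py = x + dx, y + dy
            # Skip occluded pixels and correspondences outside the frame.
            if occluded[y][x] or not (0 <= px < w and 0 <= py < h):
                continue
            total += (curr[y][x] - prev[py][px]) ** 2
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    prev = [[1, 2, 3]]
    curr = [[9, 1, 2]]                       # content moved one pixel right
    flow = [[(-1, 0), (-1, 0), (-1, 0)]]     # each pixel came from its left
    mask = [[False, False, False]]
    print(temporal_loss(curr, prev, flow, mask))
```

Added to the per-frame objective with its own weight, such a term penalizes stylization changes that are not explained by motion, while masked pixels remain free to change.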
Feed-forward Approaches. Solving the optimization problem of NST is computationally expensive. Some approaches thus provide approximations by computing the weights of a feed-forward neural network. Here, image sets, e. g., ImageNet [Krizhevsky et al. 2012] or MS-COCO [Lin et al. 2014], are often used in a training phase performed once per artistic style, after which the obtained generative convolutional networks are used for linear image transformation [Johnson et al. 2016a,b; Ulyanov et al. 2016a, 2017a]. Johnson et al. [2016a] and Ulyanov et al. [2016a] showed that these networks can be three orders of magnitude faster than the iterative approach. The output quality of these approaches can be further improved by employing network layers for (adaptive) instance normalization [Huang and Belongie 2017; Ulyanov et al. 2016b] that align the mean and variance of features of the content and style images. Conceptual limitations of these approaches, however, lie in the limited level of detail: style characteristics are generalized and not balanced for a unique style/content image pair (Figure 7). Alternative approaches either employ simpler loss functions with only local matching constraints, e. g., using a single layer of a pre-trained loss network [Chen and Schmidt 2016], or learn multiple styles or generative networks at once [Dumoulin et al. 2017; Zhang and Dana 2017] to improve versatility.

A TECHNICAL RESEARCH AGENDA FOR NEURAL STYLE TRANSFER
NST is a relatively new field of research but has already shown promising results for generalized style transfers. We believe its future directions can be defined in the context of some of the grand challenges of NPAR, i. e., its combination with other IB-AR paradigms for providing algorithmic aesthetics, improving the fidelity in reproducing and extending artistic styles towards new forms of art, and its parameterization to evolve into interactive tools that "support full design cycle" [Salesin 2002] and ease visualization tasks. With these challenges and the semiotics-oriented overview of Section 4 in mind, we thus propose the following technical research agenda.

Proposal 1: Semiotics-based Loss Functions
Current NST techniques primarily depend on color statistics for style transfer, but model color as a mutual inclusion and effect of multiple composition variables. However, we believe that loss functions need to be defined for individual composition variables and controlled filtering-wise by providing modeling information that, e. g., encodes how the size, shape, orientation, transparency, shading, and shadows are aligned with the contents of a style image. For instance, stroke-based rendering models the image composition by placing, orienting, and layering individual brush strokes as graphical elements [Kyprianidis et al. 2013]. Typically, techniques estimate image flow [Wang et al. 2004; Yan et al. 2008; Zeng et al. 2009] or derive local surface properties [Sloan et al. 2001] to guide brush strokes with the orientation of image features or the shading and lighting conditions [Fišer et al. 2016]. Together with texture layering, e. g., of painterly art maps or dictionaries [Yan et al. 2008; Zeng et al. 2009], they provide better quality in preserving fine texture details and modeling style characteristics induced by form, shape, and orientation. For the latter, we believe a loss similar to the one used for temporal consistency [Gupta et al. 2017; Ruder et al. 2016], but based on image orientation information, could help guide the texture transfer. There is also demand to explicitly model semiotic aspects that consider feature semantics. Here, Figure 8 exemplifies some limitations that NSTs currently face for three artistic styles:
• Divisionism represents images by regularly aligned rendering primitives, e. g., brush strokes that optically compose image features when viewed from a distance. Because of its analogy to patch-based texturing, divisionism can be modeled quite accurately by current loss functions.
• Cubism depicts subjects using simplified shapes and forms for composition, which are often portrayed using multiple perspectives. Here, NST techniques would need to infer geometric transformations and match geometric representations, e. g., as practiced by Mital et al. [2013], in correspondence with the color similarity.
• Pop Art typically composes images by thick outlines, bold solid colors and Ben-Day dots. Here, current NST techniques face multiple limitations in reproducing shape, preserving the semantic composition, and style characteristics such as the regularity and color inversion of halftoning.
The examples of cubism and pop art demonstrate that the coupling of individual semiotic aspects with the semantics of content and style images requires sophisticated rule-based algorithms. Eventually, this would lead to coupling feature-level engineering with the architecture engineering approach of deep learning.

Proposal 2: Combining IB-AR Paradigms
Local effects and phenomena of traditional artistic media such as oil paint, pencil, or watercolor are still hard to reproduce by NSTs at high fidelity and resolution. Here, we believe that NSTs may be used as one of multiple processing stages in IB-AR, and combined with the knowledge and algorithms of other paradigms. NSTs would thus not operate at the lowest level of detail, but as a first stage that introduces higher-level abstractions, to be followed by a low-level, established technique to simulate drawing media and their interplay with substrates. For instance, specialized line drawing algorithms can be used to detect and stylize (salient) edges, e. g., via difference-of-Gaussians [Winnemöller et al. 2012], edge-preserving filtering for noise reduction [Kyprianidis et al. 2013], and the constraints of stroke-based rendering to control the placement of graphical elements, e. g., based on luminance to direct (tonal) art maps for pencil rendering [Lee et al. 2006; Praun et al. 2001] or structure grids for feature-guided stippling [Son et al. 2011], to avoid the artifacts from pure NSTs shown in Figure 9. In Figure 10 we show results of a case study, where image filtering is employed in a post-processing stage to NST to simulate local effects such as edge darkening, pigment density variation, and wet-in-wet of watercolors quite accurately [Bousseau et al. 2006; Wang et al. 2014], whereas flow-based Gaussian filtering with Phong shading is used to filter low-level noise and create smooth continuous oilpaint-like texture effects [Hertzmann 2002; Semmo et al. 2016b]. In both cases we used the abstract style of Pablo Picasso's "La Muse" to generate an effect of higher-level abstraction, before adding the mentioned filters to simulate the respective low-level, local paint characteristics.

Proposal 3: New Forms of Styles
Gooch et al. provided an overview of NPAR research through Heinlein's maturation model, and argue that NPAR has left the first stage (emulating and imitating artistic styles), evolved towards the second stage by optimizing the performance of the (used) technology, and is about to move towards the last stage, where the technology becomes seamless and almost transparent. In this respect, we believe that NST provides new opportunities for the first two stages, but needs to "incorporate elements such as interaction, collaboration, human perception and cognition" to approach the third stage. In particular, we see two potential use cases for NST here. First, modifying learned artistic styles by providing mechanisms to specify transfer or loss functions that change particular design aspects or variables. Second, performing a style transfer by taking rule-based algorithms into account, i. e., to learn styles not only from style images but also from a set of descriptions of how an artistic style should look, which makes new forms of styles that have never been seen before practicable.

Proposal 4: Providing Interactivity
Recently, Isenberg [2016] argued that EBR approaches have the potential to enable users to provide "both higher-level interaction and low-level control", suggesting that this allows us to create interaction environments both for artists who need a wide range of low-level to high-level control and for non-artists whose interaction needs are likely more easily satisfied with high-level interactions such as the application of filters. Many traditional EBR approaches, however, have relied on a close relationship between input style and input context, e. g., for hatching [Gerl and Isenberg 2013]. NSTs have the potential to address this very problem: styles are easier to capture and thus the interactive application of style becomes easier. So far, however, NSTs are typically treated like a "black box", supporting only the high-level application of a captured style. To enable the interaction spectrum that Isenberg [2016] calls for, it would be necessary to integrate more local control. Artists need to be able to affect the result on a semantic level: controlling how larger regions are treated, changing groups of marks, and even adjusting a single mark. One approach could be to provide loss functions that operate on the primitive level and on single design aspects as well, e. g., graphical elements such as brush strokes in a style image. For example, Figure 9 demonstrates how a purely global NST approach fails in several regions, and local control such as the change of an underlying directional field, e. g., as practiced by Salisbury et al. [1994], seems to be missing.
Moreover, it is important to consider the input from several style images, which is technically demonstrated by Johnson et al. [2016a] for blending multiple styles. This could be extended to either learn a particular technique/style or even an artist's design principles more reliably, or it could be used to combine two different styles in the same target image. For the latter, illustrations that combine different depiction styles to steer attention and create focus and context views would be an important application domain. Such an approach, however, would need local control or a semantic/semiotic processing of the content image by the NST algorithm, e. g., as is partially practiced by Gatys et al. [2016c] using feature maps, and interactive performance for immediate visual feedback, which is currently a strong limitation of iterative NST techniques.

Figure 10: Post-process image filtering to reduce low-level noise and inject paint characteristics. NST results are combined with watercolor rendering [Bousseau et al. 2006; Wang et al. 2014] and oilpaint filtering [Semmo et al. 2016b]. Panels: Content Image; Neural Style Transfer; Neural Style Transfer with Post-process Watercolor Rendering; Neural Style Transfer with Post-process Oilpaint Filtering. Content image by Frank Köhntopp is in the public domain.

Proposal 5: Supporting Visualization Tasks
Semiotics are inherently linked with the theory of (information) visualization [Bertin 2010]. In particular, style transfers have been commonly used in illustrative visualization [Rautek et al. 2008], e. g., for the stylization of lines to depict flow [Everts et al. 2015], to make phenomena hidden in complex data sets visible to the human mind. However, effective visualization must also "enable analysis of the supplied information, while easing the cognitive burden of a user". NSTs based on deep CNNs emulate functionalities of the visual cortex by solving tasks through hierarchical processing [DiCarlo et al. 2012], but need to be performed in a context-dependent manner, e. g., with respect to a user's task and data domain, for effective visualization. Here, we imagine the development of toolboxes or palettes of illustration styles that can be interactively applied by professional illustrators, in a way that considers an interaction spectrum from low-level to high-level controls [Isenberg 2016]. For example, a palette for computer-supported hatching and stippling could be provided that alleviates some of the tediousness of manual processes, but that includes support for higher-level illustration processes, e. g., [Martín et al. 2011], where NSTs could suggest regions to be filtered or regions to be contrast-adjusted. The layers of deep CNNs that capture multiple levels of abstraction could be interactively used for this purpose to direct the interactive visualization/illustration process. Finally, we believe that, with the generalized application of NSTs, more complex artistic styles of several visualization domains could be served, such as medical imaging or cartography. This, however, requires NSTs to consider the semantics of style and content images (e. g., as shown for portrait images [Selim et al. 2016]), and data-domain-specific design mechanisms such as generalization [MacEachren 1995].

Proposal 6: Evaluation
The evaluation of aesthetics and practical benefits for illustration or visualization tasks remains an important issue in IB-AR [Hall and Lehmann 2013; Hertzmann 2010; Isenberg 2013]. For effective comparison of NST techniques, we believe there is demand for a standardized benchmark image set such as the general NPAR set provided by Mould and Rosin [2016].
With respect to aesthetic evaluation, Salesin [2002] and others raised the issue of a "Turing Test" that determines whether CG imagery can be indistinguishable from imagery produced by humans. While the utility of such a test is being debated [Hall and Lehmann 2013], some authors have included respective questions in their evaluations [Gatys et al. 2016b; Isenberg et al. 2006]. Gatys et al., for instance, evaluated their NST technique [2016b] in a preliminary choice experiment, asking participants to find the hand-painted images in a set of 10 hand-painted/NST image pairs. On average, their 45,000 participants answered 6.1 image pairs correctly. 3 With the further consideration of semiotic aspects, in particular filtering that includes semantics to resolve incoherences in color transfers, it would be valuable to gather more information such as response time and eye fixations to determine apparent locations or aspects of style incoherence. This information may then be injected into the learning phase to improve a style transfer.
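To put the reported average into perspective, it can be compared against pure guessing. The following is a minimal sketch, not part of the cited study: it assumes that every participant judged all 10 pairs, that each judgment is an independent two-alternative choice, and that guessing succeeds with probability 0.5. The names `chance_baseline`, `P_CHANCE`, and the derived quantities are introduced here for illustration only.

```python
import math

N_PAIRS = 10             # image pairs per participant (from the text)
MEAN_CORRECT = 6.1       # reported average number of correct pairs
N_PARTICIPANTS = 45_000  # reported number of participants
P_CHANCE = 0.5           # assumption: two-alternative choice, pure guessing


def chance_baseline(n_pairs, p):
    """Mean and standard deviation of a binomial score under guessing."""
    mean = n_pairs * p
    std = math.sqrt(n_pairs * p * (1.0 - p))
    return mean, std


chance_mean, chance_std = chance_baseline(N_PAIRS, P_CHANCE)

# Per-pair accuracy implied by the reported average.
accuracy = MEAN_CORRECT / N_PAIRS

# Standard error of the mean score under guessing, and the distance of the
# reported mean from the chance level in units of that standard error.
sem = chance_std / math.sqrt(N_PARTICIPANTS)
z_score = (MEAN_CORRECT - chance_mean) / sem

print(f"chance level: {chance_mean:.1f} +/- {chance_std:.2f} correct pairs")
print(f"reported per-pair accuracy: {accuracy:.2f}")
print(f"z-score of the reported mean against chance: {z_score:.1f}")
```

Under these assumptions, the reported mean lies only about one pair above the chance level of five, yet the large number of participants makes this small difference statistically clear. The sketch says nothing about which image regions gave the NST results away, which is what response times and eye fixations could add.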
With respect to task efficiency, studies are required to determine if NSTs only copy low-level style aspects or if they also maintain higher-level semantics of image contents. These studies could also be used to determine to what degree NSTs introduce abstraction, whether the degree of abstraction can be intentionally controlled, and how it can be seamlessly interpolated for an interactive application as discussed above. In particular, the meaningful interaction with NSTs as tools for artists or scientists (e. g., with respect to illustrative visualization) requires investigation.

APPLICATIONS
The shift from feature engineering towards architecture engineering 4 of deep learning enables IB-AR to abstract from input data, and thus increase the general applicability in highly dynamic environments. Here, we see the following potentials for using NSTs.

Casual Creativity
NSTs have particularly enriched casual creativity applications [Winnemöller 2013] in ubiquitous environments such as mobile computing. This domain has largely been devoted to image filtering and processing to date, providing only constrained effects [Dev 2013]. Prominent examples are the web service deepart.io and the iOS app Prisma (attracting 60 million users in three weeks), which also started to establish their own social media communities for sharing and commenting on stylized outputs. We believe, however, that these apps have to evolve from "black box" solutions towards user-centric tools [Winnemöller 2013] to further promote visual expression. Here, a metaphor for on-screen parameter painting [Semmo et al. 2016a] may be used to tune hyperparameters of neural networks, while hiding the computational complexity.
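The idea of painting parameters on screen can be illustrated with a small sketch. It is not the method of Semmo et al. [2016a] and does not tune any network hyperparameter: as a stated simplification, it assumes that a stylized result already exists and lets a painted mask control, per pixel, how strongly that result replaces the content image. The functions `paint_stroke` and `blend_with_mask` are hypothetical, and images are plain nested lists of gray values in [0, 1] to keep the example dependency-free.

```python
def paint_stroke(mask, row, col, radius, strength):
    """Raise mask values inside a circular brush footprint, clamped to 1.0."""
    for r in range(len(mask)):
        for c in range(len(mask[r])):
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2:
                mask[r][c] = min(1.0, mask[r][c] + strength)
    return mask


def blend_with_mask(content, stylized, mask):
    """Per-pixel linear blend: mask 0 keeps content, mask 1 keeps stylized."""
    if not (len(content) == len(stylized) == len(mask)):
        raise ValueError("inputs must have the same number of rows")
    result = []
    for c_row, s_row, m_row in zip(content, stylized, mask):
        if not (len(c_row) == len(s_row) == len(m_row)):
            raise ValueError("inputs must have the same number of columns")
        result.append(
            [(1.0 - m) * c + m * s for c, s, m in zip(c_row, s_row, m_row)]
        )
    return result


# Toy 4x4 images: a flat content image and a flat "stylized" image.
content = [[0.2] * 4 for _ in range(4)]
stylized = [[0.8] * 4 for _ in range(4)]
mask = [[0.0] * 4 for _ in range(4)]

# One brush stroke in the top-left corner at half strength.
paint_stroke(mask, row=0, col=0, radius=1, strength=0.5)
result = blend_with_mask(content, stylized, mask)
```

In an actual tool, the painted mask would more plausibly drive a parameter of the transfer itself, such as a local style weight, rather than a blend in pixel space. The interaction pattern stays the same: the user paints, and the computational complexity remains hidden.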

Art Production
Salesin [2002] had envisioned the support of artists to be a major goal of NPAR, i. e., developing tools that make their lives easier but that do not constrain their capabilities in visual expression [Isenberg 2016]. We discussed in Section 5 that this requires NSTs to evolve as interactive tools. One example is the system by Fišer et al. [2016] in which artists are able to draw over a printed stencil, while their individual style is transferred in real-time onto 3D models, dealing with proper light propagation and auto-completion. Another example is the system for watercolor rendering with art-directed control of Montesdeoca et al. [2016], where the effects shown in Figure 10 (among others) can be controlled via on-screen painting. Here, a long-term goal would be to integrate NSTs in the production pipeline of feature films, e. g., as evaluated by Joshi et al. [2017] for Come Swim, reaching a quality level to assist the laborious production of fully painted animated films such as Loving Vincent [Mackiewicz and Melendez 2016] (Figure 11), e. g., with respect to temporal coherence and the placement of graphical elements such as brush strokes.

Teaching Art Classes
We see potentials to use NSTs for teaching purposes, i. e., to help study and explore artistic styles of famous artists or epochs. In particular, we consider semiotics-oriented loss functions (Section 5) as a key goal for providing algorithmic support at a high level (e. g., texture transfer) and a low level (e. g., primitive-level transfer). This way, interactive art explorations could be feasible for children using (semi-)automatic transfers, e. g., using the semantics of two-bit doodles [Champandard 2016]. A similar scenario can also be created for adults who could explore, e. g., the modeling, painting, and mixing of style invariances (e. g., brush size, pattern, etc.).
Figure 11: Comparison between emulating an artistic style via painting (oil on canvas) and a NST. Panels: content, style, painting from "Loving Vincent", and style transfer from deepart.io. Results from "Loving Vincent" © BreakThru Films, used with permission. Style image by Vincent van Gogh is in the public domain.

Exhibitions and Art Installations
Machine learning has gathered particular interest as an interactive component of exhibitions and art installations, e. g., Tate Modern's IK Prize 2016 winner Recognition 5 uses pattern recognition to compare art to photojournalism. With respect to NSTs, Adobe's Artistic Eye 6 enables children to transform their self-portraits into artistic renditions in the style of a museum's exhibits, while Becattini et al. [2016] combined NSTs with art explorations, allowing users to scan exhibits and transfer their style to user-defined images.

CONCLUSION
Deep learning has opened new possibilities for IB-AR to make a generalized style transfer practicable. On the one hand, NSTs provide new potentials for using IB-AR in context-sensitive and creative application domains, such as casual creativity apps for mobile expressive rendering and production tools for feature films. On the other hand, NSTs currently provide only "black box" solutions from an HCI point of view: research (so far) has mainly focused on tuning hyperparameters of deep neural networks. To this end, we propose a semiotic structure to provide developers of NST techniques with the conceptual means of artwork production to help them compose and extend artistic styles, as well as consider design aspects and mechanisms for evolving NSTs as interactive tools. In particular, we hope that this structure helps researchers to identify requirements for semiotics-based loss functions, combine NSTs with the knowledge of other IB-AR paradigms, promote completely new artistic styles, and assist applications in illustrative visualization.
Finally, we argue that semiotics can be considered for defining artistic style and used to systematically evaluate NST techniques. Eventually, this evaluation should also account for the application space, level of interactivity, and audience including the user's context and environment, skills and competence, and the purpose of artistic rendering, e. g., the user's task. These are conditions that affect the "success" of a NST. For example, while a real-time feed-forward NST of texture and color as semiotic aspects may provide hallucination results of sufficient quality in mobile expressive rendering, artists typically wish to have full control over each individual semiotic aspect involved in the composition and transfer of artistic styles.

5 Tate IK Prize 2016. http://www.tate.org.uk/about/projects/ik-prize-2016. Last followed: 04/09/2017.
6 Adobe Artistic Eye. http://blogs.adobe.com/conversations/2017/03/de-youngsters-photos-get-the-look-of-masterpieces.html. Last followed: 04/09/2017.