Deep Visual Instruments: Realtime Continuous, Meaningful Human Control over Deep Neural Networks for Creative Expression

Akten, Memo. 2021. Deep Visual Instruments: Realtime Continuous, Meaningful Human Control over Deep Neural Networks for Creative Expression. Doctoral thesis, Goldsmiths, University of London [Thesis]

COM_thesis_AktenM_2021.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Abstract or Description

In this thesis, we investigate Deep Learning models as an artistic medium for new modes of performative, creative expression. We call these Deep Visual Instruments: realtime interactive generative systems that exploit and leverage the capabilities of state-of-the-art Deep Neural Networks (DNNs) while allowing Meaningful Human Control in a Realtime Continuous manner. We characterise Meaningful Human Control in terms of intent, predictability, and accountability; and Realtime Continuous Control with regard to its capacity for performative interaction with immediate feedback, enhancing goal-less exploration. The capability of DNNs that we look to exploit in this manner is their ability to learn hierarchical representations modelling highly complex, real-world data such as images. Thinking of DNNs as tools that extract useful information from massive amounts of Big Data, we investigate ways in which we can navigate and explore what useful information a DNN has learnt, and how we can meaningfully use such a model in the production of artistic and creative works in a performative, expressive manner. We present five studies that approach this from different but complementary angles: a collaborative, generative sketching application using MCTS and discriminative CNNs; a system to gesturally conduct the realtime generation of text in different styles using an ensemble of LSTM RNNs; a performative tool that allows for the manipulation of hyperparameters in realtime while a Convolutional VAE trains on a live camera feed; a live video feed processing software that allows for digital puppetry and augmented drawing; and a method that allows for long-form storytelling within a generative model's latent space with meaningful control over the narrative.
We frame our research with the realtime, performative expression provided by musical instruments as a metaphor, in which we think of these systems as not used by a user, but played by a performer.

Item Type:

Thesis (Doctoral)

Keywords:

Machine Learning, Deep Learning, Artificial Intelligence, Creative AI, Computational Creativity, Expressive Human Machine Interaction, Visual Instruments, Computer Vision


Date:

31 May 2021


Date Deposited:

17 Jun 2021 14:57

Last Modified:

08 Sep 2022 13:08

