|MIT Department: Electrical Engineering and Computer Science
Faculty Mentor: Prof. Jacob Andreas
Undergraduate Institution: Allegheny College
My name is Teona Bagashvili. I am originally from Tbilisi, Georgia (country). I am a junior at Allegheny College, majoring in Computer Science. During my undergraduate years, I have worked as a software engineering intern, technical leader, and research and teaching assistant. My current research interests include Software Engineering, Natural Language Processing (NLP), and Data Science. Some of my interests outside of academia are watching movies, spending time with my friends, and traveling.
Language Based Image Editing with Neuron Captions
Teona Bagashvili1, Evan Hernandez2, Jacob Andreas2
1Department of Computer Science, Allegheny College
2Department of Electrical Engineering and Computer Science,
Massachusetts Institute of Technology
How can we automatically edit an image given a natural-language description of the desired edit? One approach is to look at generated images and manipulate the models that generated them. We focus on Generative Adversarial Networks (GANs), networks that can generate photo-realistic images from scratch. We show that it is possible to perform fine-grained, localized edits of GAN outputs by selectively activating neurons based on their descriptions. Hernandez et al. recently developed a method for generating natural-language descriptions of individual neurons in deep networks. So far this method has been used to analyze network behavior; we demonstrate that it can also be used to change network behavior. We select neurons based on the textual similarity between the language description of the edit and each neuron's description. To activate the selected neurons, we compare three ways of modifying their activation values: (1) find the maximum activation value of each neuron across a large collection of generated images and set the relevant neurons to those values; (2) find optimal activation values using gradient descent; (3) amplify the activations by multiplying them by a constant. Preliminary results show that the first two methods are more effective at making photo-realistic edits. However, compared to the second method, the first approach ensures that only the user-selected region of the image is affected.
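The select-then-edit pipeline described above can be sketched in a few lines of code. The following is a minimal illustration under loose assumptions, not the actual implementation: the bag-of-words cosine similarity stands in for whatever text encoder is used to compare edit descriptions with neuron captions, the captions and activation arrays are hypothetical, and the gradient-based variant is reduced to a toy objective (the real method would optimize through the GAN itself).

```python
import numpy as np

def bow_vector(text, vocab):
    # Bag-of-words count vector over a fixed vocabulary (toy stand-in
    # for a real sentence encoder).
    tokens = text.lower().split()
    return np.array([tokens.count(w) for w in vocab], dtype=float)

def select_neurons(edit_desc, neuron_captions, top_k=2):
    # Rank neurons by cosine similarity between the edit description
    # and each neuron's caption; return the indices of the top matches.
    vocab = sorted(set((edit_desc + " " + " ".join(neuron_captions)).lower().split()))
    q = bow_vector(edit_desc, vocab)
    sims = []
    for cap in neuron_captions:
        v = bow_vector(cap, vocab)
        denom = np.linalg.norm(q) * np.linalg.norm(v)
        sims.append(q @ v / denom if denom else 0.0)
    return np.argsort(sims)[::-1][:top_k]

def edit_set_to_max(acts, idx, max_acts):
    # Method (1): set the selected neurons to their maximum activation
    # observed across a large collection of generated images.
    out = acts.copy()
    out[idx] = max_acts[idx]
    return out

def edit_gradient(acts, idx, target, lr=0.5, steps=10):
    # Method (2), toy version: gradient descent on the squared distance
    # to target activations. The real objective would be defined on the
    # GAN's output image, not on the activations directly.
    out = acts.copy()
    for _ in range(steps):
        out[idx] -= lr * 2 * (out[idx] - target[idx])
    return out

def edit_scale(acts, idx, c=5.0):
    # Method (3): amplify the selected neurons by a constant factor.
    out = acts.copy()
    out[idx] *= c
    return out
```

For example, with hypothetical captions `["trees and leaves", "dome roof", "window glass"]` and the edit description `"add trees"`, `select_neurons` picks the first neuron, after which any of the three edit functions can be applied to that neuron's activations before the rest of the generator runs forward.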