Science has always been about improving human understanding of our universe…but scientists have not always prioritized accessibility of their hard-won results. The deeper research digs into specialized sub-fields and daunting data sets, the greater the divide a team must cross to help communicate their findings not just to the public, but to other scientists.
It is cliché: “A picture’s worth a thousand words.” But it’s the truth: strong visual communication helps readers make the choice to dig into dense manuscripts, and helps journal editors decide whose work gets published in the first place. Good dataviz can get complexity across in less time and with less effort, help public audiences grasp science better and appreciate the beauty that inspired the research to start with.
Deciding how to represent research in graphic form is both a little science and a little art: it takes developing an understanding of what information matters and what doesn’t, and how other people will absorb it. Thus it should come as no surprise that in our noisy era, the data artist rises as a hero of both fields: empowered by technology to bridge dissociated disciplines and help us all learn more and better.
This week’s episode is with Kirell Benzi, a data artist and data visualization lecturer who holds a PhD in Data Science from EPFL (Ecole Polytechnique Fédérale de Lausanne). Kirell’s work has been shown in outlets as diverse as the Swiss National Museum, Gizmodo, VICE, and Phys.org. In this recording, we discuss his projects mapping the Montreux Jazz Festival and the Star Wars Expanded Universe, the future of neural-network-assisted data visualization, and how data art helps with the technical and ethical challenges facing science communication in the 21st Century.
If you enjoy this podcast, please help us reach a wider audience by leaving a review at Apple Podcasts, or sharing the show on social media. Thank you for listening!
Kirell Benzi’s Website
Kirell’s SFI seminar on Data Art (video)
“Useful Junk? The Effects of Visual Embellishment on Comprehension and Memorability of Charts” by Scott Bateman, Regan L. Mandryk, Carl Gutwin, Aaron Genest, & David McDine, University of Saskatchewan
Visit our website for more information or to support our science and communication efforts.
Join our Facebook discussion group to meet like minds and talk about each episode.
Podcast Theme Music by Mitch Mignano.
Michael: Kirell Benzi, it's a pleasure to have you on Complexity Podcast.
Kirell: Thank you very much for hosting me.
Michael: Yeah. So, this is a little bit more off-the-cuff than I'm used to with this show. You just came and gave a very interesting presentation on your work here at the Santa Fe Institute, but I haven't had weeks to steep myself in your thesis and publications and prior work like I normally do. So, I hope that our listeners enjoy something a little bit more informal and free-wheeling.
But let's start with your history, your background. You talk a little bit about this on the site, but I'd be curious to know what inspired you to get into data art in the first place, in your childhood and in your PhD program, and so on. What brought you to the point that you are today?
Kirell: Okay. So, I started doing digital art when I was really young, I would say, but I was always fascinated by computers. I quickly oriented myself towards a career in software engineering first, and then I decided to do a PhD in Data Science.
But I was missing this part of myself that I used to like, which was about passion and art. What I said in the talk is that people have been asking me, "So, what are you doing?" And I was not really sure the best way of explaining them what I was actually doing. They were asking me this just out of courtesy, they were not really interested in what I was doing.
So, I realized, "Okay, maybe I should start to visualize stuff and get interested in the field of data visualization." Then I realized, "Okay, but what I do could maybe be turned into something more aesthetic, artistic."
And in fact, it all started with the publication I did on Star Wars, my favorite universe. And at the time, there was a lot of coverage, actually, all over the world. People were saying in the comments, "Oh, look at this amazing, beautiful data visualization."
To me, when I look at it, it's like, "It's not beautiful. It's something that we do here every day in network visualization." And so, "Okay, maybe if there's an interest here, I should investigate that and see if I can reconcile both my early passion and what I do right now."
And this is how I really started to think of new ways to basically do science communication, using art. And this is where I am today, trying to blend both worlds, basically.
Michael: So, actually, I would love for you to get into a little more detail about the Star Wars project, because just the methodology of it. This is dipping into a broader question, and I think this would be a good example of this broader question, which is: as you mentioned in your talk today, often, data visualization and data art are about selecting the relevant data.
I used to work as a scientific illustrator, and the reason that I was still getting jobs in an era of photography and high-resolution mechanical imaging was because these photographs have too much information. So, it seems like a huge piece of what you're doing is actually coarse-graining and down-sampling and deciding what information counts and what doesn't, and how that works in telling a story.
So, let's dive into that through this particular project, and why you selected the data points you did, and so on.
Kirell: Okay. So, the Star Wars project, just to recap, the idea was to create a network of characters, of all the Star Wars characters. So, just a quick precedent here. You know that Star Wars, now, has been bought by Disney, right? And what I did was in the Star Wars Expanded Universe, which is not canon anymore.
So, just for the fans here, I'm just [laughs] … Okay, so the expanded universe includes the movies, of course, but the books, the video games, anything, everything related to Star Wars.
So, the idea was to go on this website, Wookieepedia. It's a Wikipedia, but only for Star Wars. If you go on there now, you would find that it's more than 150,000 pages being created by people, referencing the characters, of course, but the planets, the factions, everything about Star Wars.
And being a huge fan, to me, was the perfect opportunity to get some of the data, because it's free, basically, and create this network of characters. The idea is that first, you have to index all the characters in this universe. It was not that easy to do. So, I had to write a small Python program, a robot to basically crawl and get all the pages.
Then, on each biography, the idea is to look for other names. Because now you have this index of all the characters, so you can look whether or not one of the characters is cited in the biography. So, it means that probably they interacted in some way. But we have to be careful in the Star Wars universe that, you know, there's ghosts, so phantoms can appear 3000 years later.
So, it's not the same as with our own real history, just for the fact that Star Wars spans over 37,000 years of history. So, it's very big.
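The indexing-and-matching step described above can be sketched roughly as follows. This is a minimal illustration, not the crawler from the project: the `biographies` dictionary is a hand-written stand-in for crawled pages, and `build_character_network` is a hypothetical helper name. It links two characters whenever one's name appears in the other's biography.

```python
import re
from collections import defaultdict

# Toy stand-in for crawled biography pages (hand-written text, not Wookieepedia data).
biographies = {
    "Luke Skywalker": "Trained by Obi-Wan Kenobi and Yoda, he confronted Darth Vader.",
    "Darth Vader": "Once a pupil of Obi-Wan Kenobi, he served the Emperor and fought Luke Skywalker.",
    "Obi-Wan Kenobi": "He mentored Luke Skywalker and dueled Darth Vader.",
    "Yoda": "He trained Luke Skywalker.",
    "Emperor": "He ruled with Darth Vader at his side.",
}


def build_character_network(pages):
    """Link two characters when one's name appears in the other's biography."""
    names = sorted(pages)  # the index of all known characters
    patterns = {n: re.compile(r"\b" + re.escape(n) + r"\b") for n in names}
    edges = defaultdict(set)
    for character, text in pages.items():
        for other in names:
            if other != character and patterns[other].search(text):
                # A mention is treated as an undirected link between the two.
                edges[character].add(other)
                edges[other].add(character)
    return edges


network = build_character_network(biographies)
# List characters from most to least connected.
for name in sorted(network, key=lambda n: -len(network[n])):
    print(name, len(network[name]))
```

A mention only suggests an interaction, which is the caveat raised above: a character cited thousands of years later would still produce a link here.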
The idea here was that as a scientist, this was really all so interesting, because now, if you start to develop new tools and algorithms to explore this fictional universe, you could apply exactly the same tools for our own real universe.
But because it's very difficult to get sometimes ... like, imagine in the Middle Ages, if you’re in Europe, and you have to get all the different languages, they are basically gone. No one speaks them anymore.
So, just to get the data and digitize them, and be able to explain stuff, it's very difficult, and it's not exactly my core study. So, I said, "Okay, let's focus on developing new tools and methods to get on this easy, I would say fictional universe, because it's in English. It's accessible, you get it on the web."
And then, try to find ... I don't know, interesting clusters, points, something that would actually make sense. When I started doing this project, it was really about giving an idea of how big the universe was. And it's actually very big. It's more than 20,000 characters. This is why it got a lot of coverage at that time: it was the first time we could have an idea of how big the Star Wars Expanded Universe was. So, this is how it started.
And then, people have been making comments that, "Oh, this is beautiful." It was not artistic at that time, and then I started to improve on the visualization techniques to make it more emotional. And this is basically how I started to work and to do data art.
Michael: So, what were those improvements? What did you tweak in order to make the work more emotionally compelling?
Kirell: Okay, so first … okay, this is my personal preferences, right? But first, use a black background to make it more dramatic, I would say. But it's not only that.
So, here, if you look it up, the shape of the network is really ... It's difficult to describe, but all the principal characters at the center of the piece, and as you would go around, so the less and less popular characters are spread out in a way that it reminds you of, like in a galaxy, I would say, but not as a starry night. It's something more ... I don't know, that draws the audience into the piece.
Then, to create this effect, I basically had to tweak the network visualization techniques, because if you use them in a standard scientific way, you will always get more or less the same shapes. Part of my job now is to take some algorithms that we use in science, but basically tweak them, break them, and change them to create emotional pieces, which cannot really be created just out of the box with the standard techniques, because in these techniques, you want to highlight particular clusters or artifacts, or things that are interesting for scientists to see in the data.
But here, because it goes to create more emotions, you want to maybe even focus more on the shapes overall. What is the effect the piece has when you look at it from afar, from up close? And it's not necessarily the best way to represent it as scientific data, but maybe it's another way, an interesting way, to get the audience attracted to the whole idea of data art itself, and at the end, on the scientific part, like techniques that we use to create the piece.
Michael: So, one of the things that I'm curious about, but I don't remember you mentioning in the talk, was the edge lengths in this. Is that based on the number of co-occurrences, or was that an artistic decision?
Kirell: That's a very good question. In network vis, the problem is that the network lives in an n-dimensional space, where n is the number of nodes, so you have to make an embedding into 2D or 3D. So, you have to make compromises.
If the graph is not really planar, there's no way you can embed it and keep all the distances perfectly correct. So, you have to make compromises, and what the algorithm will try to do is minimize the edge length to have something centered. Even most pieces will actually be centered around that circle shape.
So, you have to break that, but in this particular case, no. The edge length doesn't mean ... it's a way to embed it into the plane. And the cool thing about this is, because it doesn't really represent the similarity, or just distance, is that you can put the nodes a bit where you want, so you can create more dramatic effect by using this particular property of the graph.
So, no. In this particular case, no. Some other cases, because they are planar, you can do this.
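To make the point about edge lengths concrete, here is a minimal force-directed layout sketch in plain Python. It is a simplified spring-style scheme chosen for illustration, not the algorithm used for the Star Wars piece; the function name `force_layout`, the toy graph, and all constants are assumptions.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, seed=0):
    """Place nodes in 2D: every pair repels, connected pairs attract."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred spacing between nodes
    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Attraction along each edge.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node, capped by a temperature that cools over time.
        temp = 0.1 * (1 - step / iterations)
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            move = min(length, temp)
            pos[n][0] += dx / length * move
            pos[n][1] += dy / length * move
    return pos


nodes = ["hub", "a", "b", "c", "d"]
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
layout = force_layout(nodes, edges)
for a, b in edges:
    print(a, b, round(math.dist(layout[a], layout[b]), 3))
```

The edges in this toy graph carry no weights, so the lengths printed at the end are by-products of the force balance rather than measurements of anything in the data. That is why nodes can be nudged for effect without misrepresenting the graph.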
Michael: So, two things came up for me, watching you speak about this particular project, one of which was, recently we had Matthew Jackson of Stanford on. You know he does network economic research. In his book, The Human Network, there's a chapter where he's explaining the different kinds of network centrality, and how those different kinds of centrality quantify different kinds of social influence that a person has, and how those different kinds of influence relate to one another.
He gives a really profound example of the de Medici family, and he shows how de Medici was not powerful on certain measures, didn't have the most wealth, wasn't, in one sense, the most important character to everyone else in Florence at the time. But if you look at the network map, then you actually see that he and his family, through intermarriages, stood at the very center of that map.
It's akin to that, especially the way you said you wrote this bot that scraped Wookieepedia by taking hot links between the names of the different characters. This reminded me of the text modeling work done by another one of our external professors, Simon DeDeo, and how he's used text modeling in this way, through historical documents, like the parliamentary papers during the French Revolution, to model who held the most influence at any given time.
So, I'm just curious how much that ... other than simply revealing a beautiful image, how much the narrative component of it and the revelations of creating a map like this figure into the way that you think about this? And if you've been working with researchers like Simon and Matt, what kind of stuff has come out of that? What kind of insights?
Kirell: It's interesting because you were talking about centrality, and it's something that I use all the time, in all of my networks, basically as a prior to lay out the piece. So, that's very important information to be able to decipher, because it's very big, right? Who are the main protagonists, modern characters?
So, in the Star Wars piece, of course you would expect this to be the Emperor, and Darth Vader, because they are the most connected. They have actually the highest score. You have different measure of centrality, but in this particular case, they were the highest, for both of these characters.
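As a rough illustration of how different centrality measures are computed, the sketch below calculates degree and closeness centrality on a small made-up graph with abstract node labels. The graph and the helper names are assumptions for illustration; the conversation does not say which measures or libraries were used.

```python
from collections import deque

# Made-up undirected graph; the labels are placeholders, not project data.
edges = [("A", "B"), ("A", "C"), ("A", "F"), ("B", "C"), ("B", "F"),
         ("B", "D"), ("C", "D"), ("C", "E"), ("D", "E")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def degree_centrality(g):
    """Share of the other nodes that each node is directly linked to."""
    n = len(g)
    return {v: len(nbrs) / (n - 1) for v, nbrs in g.items()}


def closeness_centrality(g):
    """Inverse of the average shortest-path distance (assumes a connected graph)."""
    n = len(g)
    scores = {}
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            v = queue.popleft()
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (n - 1) / total if total else 0.0
    return scores


degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
for v in sorted(graph, key=lambda v: -degree[v]):
    print(v, round(degree[v], 2), round(closeness[v], 2))
```

The two scores can rank nodes differently on larger graphs, which is one reason a layout prior may combine or choose between them.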
So, this is a very important measure, and we use it all the time. I have another example of this, a funny one.
When I was doing my PhD, you know, every four years there's a new president at EPFL. I had to make a piece, actually, on EPFL staff. So, I used the same techniques to try to understand who was the most central character, and it turns out that this character became president of EPFL.
Kirell: Afterward, and I didn't know about this. But it's very interesting, because you say, that's surreal. So, he had the most connections with everyone in the school, and at the same time, he had one of the highest numbers of citations. You could say, "For once, we can use these network measures to understand and to vote for a new president." Of course, it was not the case, but I think it was an interesting fact that, basically, it was pure merit in this case, according to this network centrality measure.
To create this network, what we did was to extract all the papers from all the scientists over the ... EPFL is actually a pretty young university. Last year was its 50th anniversary. But we took all the papers, and found out ... so, connected them by co-citations on all the actors at the EPFL, and we found out more of the core professors were very connected and they were competing for the presidency at some point!
So, it was funny. And also, this now-president-of-EPFL, has also the highest number of patents. So I made another piece, which he has in his office, and the funny thing is, it's a nice way to highlight, "You're here, and you are the most connected, because you have one of the top three number of patents."
And so, this kind of measure there, they really mean something in the real world. It's not just something that we use as a scientific tool. But as you can see here, it represents a real reality that we can chase, basically, what's happening.
So, I'm not sure if that answers your question, but it's something that I use all the time, because it's something that, when we interact as humans, basically, if you have connections, if you have influence, if you have followers, anything, it means something. It is a way of characterizing this influence.
Michael: It does seem like the question of whether it is or is not an ethical tool, to select people who might be eligible for a promotion, I think, there is always that…
Kirell: It is catchy, okay. But it's interesting. It's a measure. Then we decide collectively whether or not we want to use it, whether or not it's ethical or not, but it exists, and it represents a part of reality.
Michael: Right next to this is the work that you did for the Montreux Jazz Festival, which I thought was just extraordinary, because it's taking a similar approach and then using it to draw this map of influence and collaboration of all of these amazing musicians that people will recognize. I would love to hear you talk a little bit more about your relationship with Montreux and with how you built this thing, and then the dome installation. This is a very, very cool project.
Kirell: Okay, so the collaboration is between the Montreux Jazz and the EPFL, which is in charge of digitizing all the archives. So basically the story of the Montreux Jazz is, it's a very famous music festival. It's been on for 43 years now, I think.
The creator of the festival had all the archives in his chalet in Montreux, which is next to Lausanne in Switzerland. The thing is, what happens if there's a fire? We lose something that is invaluable, right? So, the idea was, okay, maybe we should involve a university, because it's nonprofit, right, to be able to digitize and let people create new research projects out of this data, and also let people explore and be able to listen to the concerts, watch them, and things like this.
So, this collaboration, we started a long time ago. Basically, seven years ago. I did part of my PhD, also, on this data set that was really interesting. I created some databases about the location of all the artists and the connection, in terms of musical similarity. So, here for this, we had to extract the notes, the rhythm, using the machine learning to match artists if they sound the same.
Michael: And this is sort of similar to Pandora's recommendation algorithm?
Kirell: Yes, or Spotify, exactly. And actually, back in the day, we made a startup about doing music recommendations, which actually was really cool. Didn't really succeed in the end, but we had some cool ideas. And if I might say, they are not still there yet.
So, I'm just going to give you, because I'm teasing you, I'm going to give you one idea that we had, and maybe if some guy from Spotify listens to this, okay, man, you can take it. But the idea was this: you could select any artist or any song that you liked, and it would give you the best transition between all of them.
So, it means you could start with metal and jazz, and say, "Give me a playlist that is smooth to the ear," and it would gradually go from one song to the other. I don't think this exists here yet.
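One simple way to picture the transition idea is to interpolate between two songs in a feature space and pick the nearest available song at each step. The sketch below does that with invented two-dimensional features. It is an assumption about how such a playlist could be built, not a description of the startup's actual method, and the song names, feature values, and `smooth_playlist` helper are all hypothetical.

```python
import math

# Invented features per song: (energy, acousticness), both in [0, 1].
songs = {
    "metal_anthem": (0.95, 0.05),
    "hard_rock": (0.80, 0.20),
    "blues_rock": (0.60, 0.45),
    "funk_jam": (0.55, 0.50),
    "soul_ballad": (0.35, 0.70),
    "smooth_jazz": (0.15, 0.90),
}


def smooth_playlist(features, start, end, steps):
    """Pick the unused song closest to each evenly spaced point between start and end."""
    a, b = features[start], features[end]
    playlist = [start]
    unused = set(features) - {start, end}
    for i in range(1, steps + 1):
        if not unused:
            break
        t = i / (steps + 1)
        target = tuple(x + t * (y - x) for x, y in zip(a, b))
        # sorted() keeps the choice deterministic if two songs are equally close.
        pick = min(sorted(unused), key=lambda s: math.dist(features[s], target))
        playlist.append(pick)
        unused.remove(pick)
    playlist.append(end)
    return playlist


print(smooth_playlist(songs, "metal_anthem", "smooth_jazz", steps=3))
```

Extra waypoints, such as "smooth jazz at 8:00, dance music at 10:00", could be handled by chaining several such segments.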
Michael: I think the world has already been super-saturated with human DJs, so we might as well put at least a few of them out of work, yeah. [laughs]
Kirell: Yeah. It wasn't the idea, but … and the thing is, you could add three, four key points. Now, I want to start the party at 8:00, when guests come in, there's some smooth jazz music, but at 10:00, I want to dance, and I don't want to be around my iPhone trying to play and switch. So, let it do automatically all the work.
So, it was a cool idea, and it was a research project. This started from the Montreux Jazz because it's a live concert. It's not exactly the same version that you would have found in studio albums, so we needed a way to be able to explore this data set, and for that, we needed to be able to recommend it to people.
So, what we did is we created this app, and at the Montreux Jazz Festival, people could listen and create these kinds of playlists between all the music from the concert. Just to give you an idea, there's more than 40,000 songs already in the Montreux Jazz, and exclusive. You cannot listen to them anywhere else. It's really cool, you know, like a Sting version of something, then you would go to Queen.
So, I had access to all these songs, but of course, I could not keep them. That would've been great.
So, this was the idea. And so, for the dome installation, it's a collaboration, I would have to say, with Professor Sarah Kenderdine, who is actually in charge of the ArtLab Museum at EPFL. It's a museum dedicated to collaborative projects between art and science, and we did this with the company that I'm currently working with, ekino.
We created this installation with them, basically. We have this big network of all the artists that played at the Montreux Jazz, and you connect them if they played together on stage in one particular concert. But the artists come back year after year. B.B. King, for instance, came 13 times to the Montreux Jazz, and each time he came, he brought a new set, maybe a different drummer, a different bassist, a different guitarist.
So, when you connect them all together, you would find that they create this very high concentrated community around very key, central artists, and would connect every artist in the Montreux Jazz, because they all know each other at some point. If you go look at the core of the network, like the most connected person, if you look at the music genre, you would find that it's mostly jazz or blues, because the music itself calls for collaboration. You want to do a jam, you want to invite people going on stage and playing. "Let's do this song, this classic."
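The construction described above, linking artists who shared a stage, can be sketched as follows. The line-ups are invented placeholders rather than festival data, and the helper names are hypothetical. Each concert contributes one link per pair of performers, and repeat collaborations increase a link's weight.

```python
from collections import Counter
from itertools import combinations

# Invented concert line-ups (placeholders, not Montreux Jazz archive data).
concerts = [
    ["Headliner A", "Drummer 1", "Bassist 1"],
    ["Headliner A", "Drummer 2", "Bassist 1"],
    ["Headliner B", "Drummer 2", "Guitarist 1"],
    ["Headliner A", "Guitarist 1", "Bassist 2"],
]


def co_performance_edges(lineups):
    """Count how many concerts each pair of artists played together."""
    weights = Counter()
    for lineup in lineups:
        for a, b in combinations(sorted(set(lineup)), 2):
            weights[(a, b)] += 1
    return weights


def connected_component(weights, start):
    """Collect every artist reachable from `start` through shared concerts."""
    neighbours = {}
    for a, b in weights:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        for other in neighbours.get(stack.pop(), ()):
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen


weights = co_performance_edges(concerts)
component = connected_component(weights, "Headliner A")
print(weights[("Bassist 1", "Headliner A")], len(component))
```

Even in this toy example, a returning headliner with a rotating band ends up linked to most of the other artists, which is the effect described for the real network.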
This is why, over time, over 50 years of data, you connect and you have a very large connected network of artists. So, this installation is interactive. It means that you can play it. It's like a small planetarium, I would say. It's a full dome installation, and you basically look up. We have this spherical controller that we built, and then when you move it around, you have this fish-eye effect. So you can zoom on part of the network, and when you click, of course you would go to the artist, and then all these concerts. Then you can basically play each concert, or each song of each concert.
Michael: One of the things that I like about this is, like you said, a lot of people were swapping band members and bringing in new people. We all know BB King, Miles Davis, Herbie Hancock, Chick Corea, Santana, but I love that this allows you to be like, "Oh my God, that bassist is really good. I'm going to follow that bassist over a decade and see all of the gigs that they played with everyone else."
Michael: That is something that the typical ways that we tell history tend to ... you know, the human compression algorithm, or whatever, the way that we have to reduce things into narrative makes it so that history is told by the winners. But then here's an opportunity to tell the story in a hundred different ways that are not about the star players, that are about these people that are largely in the background.
And so, in a way, this kind of research seems really hopeful to me for not just identifying who has the greatest influence, but identifying who has been standing behind those people.
Michael: And then, of course, dome installations are just awesome.
Kirell: Yes. They're actually very hard to create. It's also expensive to move them around, of course, as you would expect. It's difficult because you have to be able to look at the piece in every angle. You can have a 360 view of the whole piece, and so, in terms of interaction, for instance, like when you play a video, how do you make sure that people can see it in the correct angle all over the dome?
So, what we did was to put them into circles and have different circles rotated so that wherever you stand in the dome, you can still see correctly, the whole concert.
So, it was a bit challenging in terms of the organization, instead of just having a screen that everyone knows about.
Michael: So, I want to jump here, because another part of your presentation was about faces, and using GANs to create artwork. This was really different than the network architecture diagrams. I'd love to hear you talk more about using artificial intelligence to make data art, and how you see this as an extension, because it seems to me like it involves a good deal more artistic discretion, and it's a lot less of a strict adherence to data visualization.
And so, I'd love to hear how you got into this, and how you've been working in that space, as well.
Kirell: Okay. You are absolutely right. It's different, because if you look at the result of a GAN, it doesn't tell you anything about the data underlying. It might be …
Michael: Maybe the best thing to do is assume no knowledge on the part of the listener, and talk a little bit about this particular technology.
Kirell: Okay. The idea here is you have two networks, so it's artificial intelligence, this machine learning algorithm that allows you to basically synthesize new faces, images, sounds, anything, basically.
First you have to train it. You give it a lot of images. In this particular example, there's actually a thousand different objects that you give it as input. The idea is that you train it to be able to understand what an image is, as you would do with your own eyes, right? So, I don't know, decipher the color, the shape, the pattern, the textures.
And once the network knows about this, there's another one just behind it, that actually tries to mimic and to generate fake data. This is the first one, sorry. And the second one tries to identify whether or not the image that you generate is true or fake.
So, at the beginning, it's generating noise. So, of course the other network that's a discriminator would say, "No, it's fake. It's not a real dog," for instance.
But as the algorithm improves, because both are training next to each other, they converge to something. At the end, when you do the training right, the images that you generate are so close to the real images that the discriminator cannot say whether or not this is fake or real. And this is how you can synthesize deep fakes, for instance.
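For readers who want the two-network game in symbols, the standard GAN formulation (as introduced by Goodfellow and colleagues) is a minimax objective. Here $G$ is the generator, $D$ the discriminator, $x$ a real sample, and $z$ random noise; whether the pieces discussed here used this exact objective or a later variant is not stated in the conversation.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

When training converges in the idealized case, the generated distribution matches the data distribution and the discriminator can do no better than $D(x) = \tfrac{1}{2}$, which corresponds to "cannot say whether or not this is fake or real."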
Now, with the technology that we have right now, the images that we create are so realistic that most people will not ... you have to really look very closely to see if something is wrong, but otherwise, if you just look at it like this, you would say, "Yeah, it's a picture."
So, this is the idea of GANs, but you can use GANs to synthesize new sound, for instance. People have been doing this with classical music, and being able to generate Bach-like ... it's like, and it's not the real deal. But it's, I would say, for people who are not really trained into classical music, it would sound like what Bach would do, right?
So, it's an interesting technology. And here, the idea is, "How can we leverage this technology to talk about artificial intelligence in general, and get people interested into learning more about it, and not being scared by the headlines that we hear all the time?"
Right? "I'm going to be replaced by a robot," and then you have The Terminator as an image. It's always like this. [laughs] No?
And then also, you have people like our digital influencers that talk about it, but they are not really qualified to talk about it. And then, because they have 300,000 followers, what they say carries some weight.
So, the idea here is, "Okay, maybe it's better for you to get your opinion from yourself, so get interested into knowing it, because we're going to use AI a lot more in the future. It's going to be everywhere. So, better know it now and not be scared, because we're always scared of something we don't understand."
And the idea to get people interested into learning more about it is to, "Okay, maybe can we create artistic shapes, based on the same technology, and see that it's not that scary after all?" So, this is how I started to ...
It's based on data art, because you need data to train the whole neural network, and you also need data to generate new images, faces or abstract images. But as you said, it's not actual depictions of the underlying data that you have. It's more creative, I would say, representation.
Michael: I'm reminded, I forget who the team was that actually did this, but a couple years ago, somebody fed hundreds of thousands or millions of training images into an algorithm that was supposed to spit out what the internet looks like, and it was a cat. It had made its own cat image. [laughs]
And that's a little bit of a ridiculous example, but it got a lot of traction in the press. In that sense, there does seem to be a really obvious link. Like your biography says, that the purpose of data art is to render a hidden mass visible. And so, in that sense, again, it's about reducing the number of dimensions so that it's actually something that a human can comprehend, that there's something about trying to imagine the internet as like what Rice University philosopher Timothy Morton would call a hyperobject, something that is so vast and so extensive in time, and so many-dimensional that it's impossible for us to conceive.
And so, how you do that in an honest way is a really interesting question. You've done some really interesting work on GANs, like the GAN that you trained on plastic trash, and then gave an image of, was it Dubai?
Kirell: Yes, yes. But you cannot say it. I'm not sure if we can. I don't want to offend the people over there, but … [laughs]
Michael: Well, we're all full of microplastics at this point, but Dubai is not unique in that regard.
But at any rate, I'm curious how else you've applied this particular technique in order to make a statement that is still beholden to scientific standards? This is an area where, like I said, I think you have a lot more room to play artistically. But I'm curious what kind of scientific messages you're trying to communicate with this specifically? Other than just, "You should be more interested in AI."
Where else do you think this is viable?
Kirell: Do you mean for a specific piece, for instance, like this Dubai and trash thing? This technique was not actually a GAN; it was a style transfer. So, the idea is you ... well, it's still artificial neural networks, but the idea is you have this original image that you like. You also have an image by an artist, or a texture, basically any other image, and the network will try to combine both so that it keeps the original, say, topology, the structure of the photo that you input.
But at the same time, replacing the texture and the pattern with the other artistic images. And so, what I did for this Dubai thing was to take a picture of basically plastic bottle and trash, and it would stylize the buildings in Dubai, in having this very disgusting plastic effect. But which was the idea, right?
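One common way to formalize "keep the structure, replace the texture" is the neural style transfer loss of Gatys and colleagues, sketched below; the conversation does not say which variant was used for the Dubai piece, so treat this as an assumed reference formulation. Here $F^{l}$ are the feature maps of the generated image at layer $l$ of a pretrained network, $P^{l}$ those of the content photo, $A^{l}$ the Gram matrix of the style image, $N_l$ the number of feature maps, $M_l$ their size, and $\alpha$, $\beta$, $w_l$ are weights chosen by the artist.

```latex
\mathcal{L}_{\text{total}} = \alpha \, \mathcal{L}_{\text{content}} + \beta \, \mathcal{L}_{\text{style}},
\qquad
\mathcal{L}_{\text{content}} = \tfrac{1}{2} \sum_{i,j} \bigl(F^{l}_{ij} - P^{l}_{ij}\bigr)^{2},

G^{l}_{ij} = \sum_{k} F^{l}_{ik} F^{l}_{jk},
\qquad
\mathcal{L}_{\text{style}} = \sum_{l} \frac{w_l}{4 N_l^{2} M_l^{2}} \sum_{i,j} \bigl(G^{l}_{ij} - A^{l}_{ij}\bigr)^{2}
```

The content term preserves the layout of the buildings, while the style term matches feature correlations from the plastic-and-trash image, which is what produces the texture swap.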
And the idea was, of course, to talk about global warming, what we do with plastic, pollution, ocean pollution, anything like this. And I would say in this regard, the message here is more artistic in a way, that it's a way to communicate a message about, "Okay, maybe I should not trash, put my plastic bottle in the sea. Maybe I should put it in the garbage bin."
But I'm not sure if that answers your question, or ...
Michael: Well, I'm reminded of a friend of mine, who came up to SFI from San Francisco last spring, and took the massive NVIDIA face set of all the fake faces that had been generated, and then was able to use, let's say, someone and their romantic partner, and then was able to search that database and pull the faces that were most like their intersection, so you actually get some clue as to what your child might look like as an adult.
Kirell: Yes, we did those kind of experiments, yes. It's very interesting, especially with the beautiful celebrities, which, "Oh, yeah, I should definitely marry, alright, because our children would look so beautiful."
Michael: Right, better than consulting the astrologer.
Kirell: Yes, if we meet, we have to go in L.A. in secret. No.
But yeah. So, we did those kinds of experiments, and what we can do now is you can even upload a picture of yourself and see how the network will actually re-synthesize it. So, it's not perfect, but it's also interesting, because you can age yourself, or get younger, and all kinds of experiments.
And I think there was this FaceApp now, that you can download. It was a bit of a controversy, because you're not really sure ... basically, you have to put your phone number, I think, or your email, and you're not really sure where the images of your face go. So, there was the big backlash over …
Michael: Yeah, that one and Zao, right? There's the question of whether the Russians and Chinese are using this to train their own networks to accomplish some other unrelated …
Kirell: Probably, right?
Michael: Yeah. But I guess what I'm getting at is a little bit more of a speculative question, which is, how do you imagine that this kind of thing, like virtual breeding, not just of people, but of ideas, of data sets, and so on, might be used to help people make better predictions in the future, or to achieve other … Like, it is a little bit more artsy, but it does seem like it could be wielded in a really interesting way toward pointing people toward a productive avenue of research. I'm just curious. Yeah, just off the cuff.
Kirell: Yeah, it's a difficult question. I think part of data art is about this, about getting people interested, and giving a message. But at the same time, it's very difficult to speculate about what the future may be, right? And at the same time, because you introduce bias, because if you put in a certain direction, people will ask you ... and actually, I've had this question last week. People have been asking me, “But are you manipulating people by selecting just part of the data, some of the subset, and showing …?”
And I was like, "Yeah, but it's in a good way, because I'm legit, right? I'm a scientist, so it's ... " And of course, it's not a very good answer. But the thing is, there's always a bias. If you do any kind of visualization, you actually take a stand. There's no true objectivity anywhere.
When you read a newspaper, it's far from being objective. So, there's always our own bias that comes into play. So, I think it could be used, but I think the main thing here is that we could appreciate anything, but you should be wary of, "Okay, there have been some trade-offs here." And this is why it's better to get interested in the field, so you can make sure that you make the right assumptions, and you're not being manipulated too much, because otherwise you will always have this question of, "Who is this guy? Is it really legit, or is it trying to manipulate me into doing something?"
Maybe it's stealing my data, right? I have this question all the time. "Oh, but you're stealing data!" No, I'm not. But it's difficult, especially for the general public and audience, that they don't really understand necessarily all these questions. If you have a headline, "Facebook lost 150 million passwords," then it's very hard to come in and say, "Yeah, but I do art with big data." You see the point, right?
So, I don't know, we need more of this, but it's not really objective.
Michael: Yeah. So, this seems to touch in on a question I asked the SFI Facebook group, if they had any questions for you, right before we got onto this. And one of them came from one of our volunteer moderators, Tim Clancy.
So, he says, "We often over-focus on techniques in the sciences, but sometimes the best advice comes from approaches that are technique-generic, how to define a problem, how to understand audiences, identify boundaries, et cetera." And he wanted to know, do you have advice on improving communication with data art, things to consider and noodle on before diving in with a specific technique or presentation style?
I mean, obviously, the question of, "Am I creating an emotionally manipulative movie trailer that's going to make you cry no matter who you are?" And, "Who am I serving with this?" You just addressed that, but zooming out a little further, I think what he might be asking is in part, are there consistencies in the heuristics that you use to determine what data you leave in, what you exclude, how you decide to represent these things?
Kirell: It's very difficult. I found ... how would I say that? So, obviously, there are some dimensions that are interesting to explore because they have more variability, for instance. This is good. You know that if there's a big change, it's going to be interesting visually. If it's average, and everything is very similar to each other, then it's going to be very cluttered, and it's not going to be very interesting.
So, I would say that you try to look for dimensions that are interesting to visualize, and have more variability.
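As a small illustration of "look for dimensions that have more variability", here is one possible way to rank the columns of a data set by how spread out their values are. The rescaling to a 0-1 range (so that units do not dominate) and the use of variance as the score are assumptions, not a method described in the conversation; the column names and values are made up.

```python
from statistics import pvariance

def spread(values):
    """Variance after rescaling to [0, 1]; a constant column scores 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    return pvariance([(v - lo) / (hi - lo) for v in values])

def rank_dimensions(table):
    """Column names ordered from most to least variable."""
    return sorted(table, key=lambda name: spread(table[name]), reverse=True)

# Hypothetical columns, purely for illustration.
table = {
    "flat": [5, 5, 5, 5],         # everything identical: nothing to see
    "clustered": [0, 10, 10, 10], # one outlier, the rest the same
    "bimodal": [0, 0, 10, 10],    # two clear groups: visually interesting
}
ranking = rank_dimensions(table)
```

Here `ranking` puts `bimodal` first and `flat` last, matching the intuition that a dimension where everything looks alike makes for a cluttered, uninteresting picture.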
But other than that, yeah, as you said, it's about who is the target audience, and what are their backgrounds? So, it's a bit difficult if you make an art piece, because you cannot really choose the audience. But when you give presentations, this is why I give presentations, because I think it's easier to understand the whole data art thing if you have the guy explaining to you, not necessarily what it means, but how it was made, actually. So, to get interested in the science. We go back to this.
I'm not sure if I have any more techniques in sort of regular datavis principles. So, the audience, the storytelling part, who is it for? Then, "Okay, what dimensions could be interesting to visualize?" And how they relate to the original storytelling and message that I wanted to make.
This is what we do all the time in data vis. If you do data journalism, for instance, at The New York Times, before trying to do anything, they have an idea, and they have to validate whether or not it works in the data. But I would say it's a bit different than what we do in scientific visualization. Here we explore data, and then we find new patterns, and then we investigate. But in most articles that you see online, they already have a headline, so they know what they are looking for.
It doesn't mean that you cheat. It doesn't mean that you select part of the data. It just means that you always have an opinion. You cannot make something out of being completely objective.
So, I don't necessarily have a technique. I would say it's tentative, I don't know, techniques that we have, but not something specific. I don't know.
Michael: Leading with the opinion, versus finding results in the data, is that the only difference, or are there other significant differences between communicating to a scientific audience, versus communicating to the general public?
Kirell: If you talk to a scientific audience, it's going to be tougher, as we saw today in the talk. [laughs] Because they know me. And it's a legitimate question they want to ask. "Okay, but is it really useful? Do we have a market for these kinds of work?" And of course, the answer is yes.
But as you can see, for instance, a lot of exhibitions, even trade shows, add some artistic data visualizations. So, there's a market, of course.
But when you talk to the general public, think about it. If you go to a museum, how often do you read the description? Right? Not that much.
So, how can you make sure that your piece stands out? You need to have something, a good impact, a strong impact visually, to make sure that people, at least they go towards the room because they see it from afar.
Then, you have to make sure that they read the description, which is very difficult to do. Sometimes I go undercover, and I look at people, and I see that they don't read the description, and I'm pissed off, because I'm like, "Read it. It's better." Half of the work is in the description, because you understand that it's real.
This is why I switched to a different approach, and try to talk and to explain the pieces, because then you cannot skip me, basically. You have to listen to what I say.
It's not the same approach. You have one message. If you want to make it in front of a general audience, you have to be very clear about the message that you want to ... and people, basically they take what they want from the piece. Sometimes they only like the aesthetics. I've been running a study and asking people whether or not the message was important, and people said yes. But of course, some of them said, "No. Just if it's beautiful, I like it." It's cool, but it's not exactly the point.
But for a scientific audience here, they have this analytic mind, and it's good, right? So, they have to know whether or not it's efficient, whether or not people will like it, and this is why I started this data art study. I have more or less a scientific study, on whether or not people find value in data art, and what is this value?
For instance, people have been saying, "Okay," the question was, "Is it different from other visual arts? And does it add something from other visual arts?" And people have been saying yes at 75%. So, it's a very small number of participants, 200, I'm sorry. But it's growing.
I find it interesting. This is why I try to approach, also, data art as a scientific practice, and people have been actually publishing papers in the datavis community about this. So, people are interested in the effect, the perception that we have, and I'm trying to be in both worlds.
So, when I talk to scientists, I have to show data, facts. When I talk to the general audience, I just have to show emotions and have something compelling, so people will actually go and read the whole thing.
Michael: So, it seems like in both cases, and I think we've been kind of dancing around this the whole time: in both cases, it has to do with how much information a person can take in at any given time, and how the way that you represent that information is able to exploit people's attentional biases.
It's common. In journalism, they say, "You don't ask more than three questions in a row." I forget who it was, but it was an interesting project that I circulated in-house, here at SFI, where this team had redesigned the poster presentation at scientific conferences so that it was just very bold and colorful, and actually very low information content, and then it pointed people to all of this stuff on some website that you could go to if you really wanted to know more. It became more of an advertisement for the research than an attempt to cram everything onto a poster, at which point you're just lost in the noise at these conferences.
All of that information just becomes like television static.
Kirell: Exactly. And you know when you go to conferences, there's a best poster award. And usually, the winners are the ones who have less text, more visuals, and link to the actual papers, and a QR code, or something like this.
So, I think there's a nice blend between ... it would be good to have designers also help scientists to make more impactful communication, because otherwise it's a lot of text, as you said. Do you want to go close, read? It's font 8, and you have to read all the ... it's not very fun, until you talk to the guy. But it's very noisy.
So, it's not very efficient. No, it's true, right? So, here we tried the opposite approach. We give something very visual, and we let people, if they're really interested in it, look it up online.
Michael: So, this seems to be how you would articulate this argument, that you pointed to at the end of your talk, where you said, "It's becoming more and more important for people to take seriously the science that suggests this as an improvement on communication techniques."
And specifically, when it comes to the people that are reviewing hundreds and hundreds of research grant applications, for example, how do you get your project to the top of that stack? How do you get people to actually pay attention to you?
To me, as someone with a background in both the arts and sciences, it seems obvious. But I just want to hear more from you about how this seems like you're on the winning side of history here, insofar as, the design is now a part of the way that we have to even request for the kind of support that it requires to do this research, and that there's an arms race going on here in an attention-starved global system that, in a weird way, is bringing art and science back under the same roof as necessary facets of each other's work.
Because frankly, as an artist and designer, if you don't include the scientific, it also is not as strong. I know a lot of fine artists whose work I love, but at the same time, I don't think their work has the broad appeal, the stickiness, the depth that it could if it were touching into this other stuff as well.
So, you said you've got friends that are researching this right now? What are their findings?
Kirell: Yeah. So, two things. Just in Europe, for instance, if you want to ask for a grant to conduct research, they ask you specifically, "What are you going to do to make sure that you communicate about this research?" And you have to justify it, otherwise you don't get the funding.
So, this is something that, it's not made up. It's actually part of the papers ... things that you have to write to make sure that people will ... because it's taxpayers' money, right? So you have to make sure that they get something back.
So, being able to do data art is actually most of my work, and clients are actually scientific institutions, and they use it to promote their papers.
I had this professor asking me to create a piece, because he wanted to publish in Nature. And so, he'd say, "If I have a good, strong visual that makes sense because it's based on real data, I have a stronger chance of getting accepted, because it's always cool to have a nice visual to illustrate."
So, data art, first of all, there's a market, but there's also a real appeal, and people understand this. Another facet is also social media, which is weird, if you think about it. But we are humans, right? We are scientists, but we still have Twitter.
And something that I wanted to connect to is research, and it's something that is just starting, on how much social media changes the way you are cited, basically. So, people will scream, I think, when hearing this. It's like, "Yeah, but the real research is about the peer review, the real science." Yeah, of course, but I'm sure that if you have a guy making sure that you have good social communication, you would get cited more, just because you have more exposure. So, maybe the work is a bit less good than the others, but if they are better at communicating, you lose.
Michael: Yeah, actually, I think there was ... James Evans, another one of our external professors, was on a team that published kind of related findings, looking at citation networks. You're looking at how people who publish at a prestigious institution might be publishing the same findings, more or less, as someone who's not. But then, again, it is an unpleasant fact of the world that these things matter.
Kirell: Exactly. And also, not only in your name, because most things are double blind, but it's not really true. You know who the guy is. And so, depending on the institution, whether or not you're friends with the guy, it also has a big impact on the publication.
Every scientist will know that sometimes you publish exactly the same paper, and you can get rejected for nothing. There was a study on NeurIPS acceptance rate, and they did a study where basically, they put the paper in two different sets of reviewers to see whether or not the paper was accepted. I think the result was actually weird. Half of the time, it would get rejected, and half of the time, it would get accepted. But for exactly the same paper, for the same conference. So, showing the randomness of acceptance.
So you need other factors to also have your research shown to others. Now, it's cool. You have arXiv. You can put preprints online and let people decide whether or not it's worth it.
But basically, the whole thing ties to how we now do science, because before, you had prestigious journals. It was very expensive. It's still very expensive to publish there.
But now people are asking for more open access, open science, open data. So, I think this is shifting. In this world of, “anyone can publish,” you need also ways to stand out and strong visuals to make sure that people will actually read your paper. Because if you have 100,000 papers to read, you're human. You will find one that has a good title and nice images, and then you will start to read and see whether or not it's a good paper.
So, I think as you said, now, we put art and science under the same roof, just to have our research discovered, I would say.
Michael: So, Karen Rhodes was at your talk today, and when I polled for follow-up questions, she wanted me to ask you if you are aware of publications or data of some form showing that visualization actually helps people grasp and remember the information they're being presented with?
Kirell: Yes. There is a nice paper, and I forget the name. It's called “Chart Junk, The Effect of Visual Embellishment,” so you can look it up online. I'm not sure. The idea was exactly this. It was not about data art, it was about datavis itself.
Michael: "Useful Junk." Yeah, we'll link to this in the show notes.
Kirell: And the idea was, so they took some illustration from journals, and they asked people, "So, there was a study," and whether or not the fact that it was embellished.
So, there was like, if you look at the paper, you would see that one is a monster, because it's about monstrous costs, and the other one is just the same information as a simple, plain bar chart.
And they asked people whether or not they remember, five minutes after, the information. People would say, "Okay, it's the same." But then they asked two or three weeks later whether or not they would remember what this was about, and as you would expect, the charts with the visual, the illustration or the funny things, it would help people remember the data and the subject much better.
And this was also one paper that I used to motivate the data art study, basically. But there's a downside.
So, it's good for impact and for engagement, but of course, if you now publish a new study on cancer research, and you put some meme with the data, it's not going to look serious. So, this is where you have to draw the line between being very serious and being able to communicate to a general audience, and I think we can find a common ground.
So, basically helping scientists to learn more about datavis so they have better, effective charts, and keep the more ... I would say, funky, more beautiful visualizations for when you do mass communication. I'm not sure if that makes sense.
Michael: Yeah. So, I think about this, to turn it inward on the scientific process, also, and not just about communicating results. There are Chernoff faces, which is this whole thing of, the fact that the human attentional system is tuned to face data.
In particular, I gave David Krakauer here a copy of one of my favorite science fiction novels, Blindsight, by Peter Watts, in which one of the characters is a vampire, and his data visualization cockpit is tormented human faces. So, rather than graphs or various network diagrams, the best way for him ... and it's a dark joke on what is actually, probably, I would imagine, going to be a growing trend.
And I'm curious what you think, because some of your stuff, your GANs, involved simulating human faces that, in theory, you could then use as templates for a much richer and higher-dimensional set of data than you might with some other forms of data visualization.
And so, I'm curious, there's the sort of dark side of thispersondoesnotexist.com, is that you're going to start getting phone calls from people that don't exist, and so on. But the bright side is that we can really hijack the human bias for faces in a way that helps us learn, by representing data in the form of a face. And I'm just curious what you …
Kirell: It's very meta, yeah, but it's cool. So, first, yeah, the deepfakes. It's going to be a real issue, we agree. The people, of course, are finding this. At the same time, people are generating faces, most are scientists, and at the same time, developing new techniques to check whether or not the image is fake or not.
But yes, the biggest downside of this is that maybe video evidence would be gone, basically, because you can also, "No, it's fake. It's not me. It wasn't me." The song, right?
Michael: Right, yeah.
Kirell: So, this is a big issue. But of course, it's going to be cat and mouse. You play, and you have techniques that check whether or not this is fake or true.
But it's always going to be like cyber security. You have a new uncrackable software. A week later, it's cracked. It’s going to be like this.
And going back to your question, yes, we're very good at discerning if something is wrong with the face. And this is why, even with the most top movies in the Star Wars, for instance, there's some weird artifacts that we see, for instance, when they tried to make Carrie Fisher younger. It looks real, but there's something wrong. And I think it's the micro-expressions that are wrong. But as we develop the technology, we're going to be able to do something completely perfect.
But I'm not sure if it's really ethical to play with faces. The cool thing is, because we're going to be able to create new faces, yeah, it will mean less bias for diversity. It's going to be helpful to be able to play with that. But I'm not really sure if we should.
Michael: Yeah. You're right that it does get back to this issue of how much are we exploiting the viewer? But at the same time, there are so many issues now, involving radiation spills, or other forms of ecological catastrophe, or natural disasters, or whatever, that, because of the scale, or because of the abstraction of the issue, they don't register with people.
So, I just think about what it's going to take to motivate people to act when ... I mean, I'm Luke Skywalker-ing here, on the edge of the dark side. But if you could make a face that is the Gulf of Mexico, and then …
Kirell: No, I’m going to tell you, because what you're saying is exactly what the product that we had in mind for …
So, we work with the committee that creates the paper for the COP25, that recommendation that we should keep under 1.5 degrees, and things like this. So, it's a committee, and they go to Europe, and they ask them, basically, to write the report. It's impossible, it's like 400 pages, and no one reads it, of course. There's no images, and there's very scientific plots that no one understands.
And they say, "Okay." I've been contacted by one of those guys to create, "Let's see if we can make data art, to make the message simpler."
I had this idea of actually taking the face that you are looking at right now on Instagram, and having half of it completely ... because we have GAN, so we can get people older, younger, and transform them into monsters, and have somebody have a monstrous face that will grow according to the temperature rising, because we don't actually do much to reduce the pollution.
So, I wanted to use that. I don't know. I didn't do it, but maybe you're right. It depends on what you want to do, because I want to get close to the scientific world, but in this case, maybe we just need to strike a big hit on the whole thing, and say, "We should use faces. People recognize faces. We should deform them and look ugly.” Like special effect makeup, when they create these horrible faces. And then, you have this tension, you don't like it, but at the same time, it's fascinating. And maybe this will help people act. But I don't know, maybe I should try it and see the reaction, right?
Michael: Yeah. Well, I'm thinking of ... was it Japan? I forget who did it, where they took the speed limit signs and then, rather than it just flashing you and warning you when you're going over the speed limit, it had a smiley face that turned into a frowning face. It was just like a little cartoon, but they found that the social reinforcement of having even a cartoon smile at you when you're doing the responsible thing, or frowning at you when you're putting other people at risk, dramatic improvement over just threatening people with punishment.
Kirell: The social pressure, because you recognize the face, right?
Michael: Right. So, the thought of like, your virtual assistant, that most of us already have.
Kirell: It's judging you.
Michael: Yeah, but your virtual assistant is going to judge you. People opt in to this, that it becomes a gamified thing, like your diet. You would FaceApp, you would see yourself with acne, or with another 50 pounds, or whatever.
I don't know. We're completely off-center from where this podcast normally goes, but I think that you work at such a juicy, interesting intersection of techniques and possibilities. So, thanks for indulging this.
Kirell: Well, thank you very much.
Michael: Yeah. What are you working on now? What's on the horizon for you?
Kirell: So, two things. I'm still working with GANs. The thing is, for now, the GANs that we have are very low-res. The maximum resolution that we get is basically 1024 pixels square. So, it's not much. So, I've been investigating using other AI techniques to do super resolution, basically, so we have very high resolution onto a smaller model.
And of course, creating more and more, I would say, artistic results from GANs, so it's something that you wouldn't necessarily recognize when you look at it. It could look beautiful, but you have no idea the different ingredients that composed this image, so you have to, once again, click and look it up. And it's, "Ah, it's funny. There's like 14% of jellyfish in this image.” And we see it, but, "Okay, maybe now that you tell me that there are some jellyfish in here, okay, I see it."
So this is one part, and the other part that I'm really focusing on is going back to fractals, or math, because, as I said in the talk, it was popular in the late '80s, beginning of the '90s, and nothing happened afterwards. And now, the software that we have to create fractals is awesome. It's really, the possibilities are endless.
What I try to do, because, okay, fractals are mathematical functions, but I work with data. So, I try to blend both fractals and data, so fractal data art. Wow, it's a fancy keyword. This is what I did with one piece: the image was generated by just fractals, but the animation itself was parameterized with data. So, if you look it up, the animation, one piece is about sea level increase, because the glaciers are melting, because it's global warming, as you know.
So, the idea here is that the animation, you would see that something looks like a coral shape, more or less, and it would grow according to the increase of the sea level, basically. So, I tried to combine both data, and what I like about fractals is that the image, it's infinite resolution, basically. It's just, you need to have a very big computer.
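A toy sketch of "a fractal whose animation is parameterized with data": each frame is a simple branching structure, and the recursion depth of that structure is driven by one value of a data series. The branching rule, the function names, and the placeholder series are all assumptions for illustration; they are not the fractal or the sea-level data behind Kirell's piece.

```python
def depth_for(value, lo, hi, max_depth=6):
    """Map a data value onto a recursion depth between 1 and max_depth."""
    if hi == lo:
        return 1
    t = (value - lo) / (hi - lo)
    return 1 + round(t * (max_depth - 1))

def branch_lengths(depth, length=1.0, ratio=0.6):
    """Segment lengths of a binary branching shape: a trunk plus two smaller copies."""
    if depth <= 0:
        return []
    children = branch_lengths(depth - 1, length * ratio, ratio)
    return [length] + children + children

def frames(series, max_depth=6):
    """One branching shape per data point; higher values grow a deeper shape."""
    lo, hi = min(series), max(series)
    return [branch_lengths(depth_for(v, lo, hi, max_depth)) for v in series]

# Placeholder numbers, not real sea-level measurements.
series = [0.0, 1.5, 3.0]
animation = frames(series, max_depth=5)
sizes = [len(frame) for frame in animation]
```

With this placeholder series the shape grows from 1 segment to 7 to 31 as the value rises, which is the "coral that grows with the sea level" idea in miniature; a renderer would then draw each frame's segments.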
So, it's very aesthetic. People love fractals, but not the usual Mandelbrot things. There's something more ... you have to look it up, to see what people are doing. It's beautiful. So, people like it because it's beautiful, but at the same time, we give back meaning to this. You talk about math, you talk about data, and you have this animation. You have everything that you need. Something that moves, something that is beautiful, and that has meaning.
So, this is something that I really want to investigate some more, because I find it really interesting.
Michael: Yeah, that piece in particular actually reminded me of, if you know Scott Draves, Electric Sheep? So, back in 2005, he started doing this screensaver that … everyone running this screensaver was contributing to the computation of the next few seconds of the animation.
And so, then people would vote up and down, and it evolves over time. It's gone through this extraordinary nonlinear trajectory, where it would gain complexity, and then it would collapse and be very simple, over the last 15 years.
But you've got the sort of high resolution and narrative version, rather than just this sort of random walk through design space that is represented in his stuff.
Anyway, just in case people know Electric Sheep, then that's kind of what this looked like. But it was also sort of ominous, as the thing grows, and you realize it's rising sea levels, and it's like an oncoming train. [laughs]
Well, gosh, anyway, Kirell, it's been such a pleasure to talk to you. Where do you want to send people, and do you want to leave them with any parting thoughts?
Kirell: First, thank you for having me. It was really a great ... I mean, even the campus here, it's awesome. It's the best research center I've ever seen. So, people, if you don't know here, it's beautiful. The scenery, everything is great. The people here are great. Everyone is great.
And so, if you guys are interested in learning more about what I do, I strongly suggest that you visit my website. It's KirellBenzi.com.
And also, if you have ideas and collaborations, this is something that I really look for. Like scientists here, anyone that is interested in having this research, and investigate together, if we can make some new art pieces with this, because I'm a scientist, but I don't do much science now. What I do is, I'm the bridge between scientists and the general public here. So, I need also scientists to collaborate with me, to give me part of their research, so we can make new pieces. So, if you guys are interested, I'm all in.
Michael: Awesome, thank you so much.
Kirell: Thank you very much. Bye bye.