Wes Souza

161 Final Project

A Directed Graph Visualization of Memes


Introduction

I used to map internet phrases as a function of time to frequency, then I took an arrow to the meme [1][2]. Memes are colloquially understood as ideas that propagate across the internet, usually because of some esoteric humor. However, the term was coined by the evolutionary biologist Richard Dawkins in his book The Selfish Gene, in which a meme is defined as any unit of cultural influence, analogous in function to human genes. MemeTracker is a utility that records popular phrases written on news websites ranging from major news bulletins to smaller blogs [1]. Researchers used the recorded data to analyze temporal patterns between the original posting of a phrase and subsequent postings of similar phrases [3]. I aim to highlight alternative aspects of the data, such as the topology of the evolved phrases, by visualizing the data as a directed graph.

Description

MemeTracker data is organized as a list of phrase clusters. Each cluster is a set containing a unique root phrase and subsequent variations of this root phrase. I will impose an approximate substring relationship on all phrases in a phrase cluster, as described in Leskovec and Backstrom's paper [3]. A phrase p is an approximate substring of a phrase q if either 10 sequential words from p form an exact substring of q, or the directed edit distance from p to q is less than some constant k. An appropriate value of k will be determined through experimentation. I will use Levenshtein distance to compute the directed edit distance [4]. This relationship yields a directed graph structure: each node is a phrase and each directed edge is a pair (p,q) such that p is an approximate substring of q.
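
The relationship above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a final implementation: phrases are compared word by word, the function names (levenshtein, shares_window, is_approx_substring, build_edges) are hypothetical, and k is left as a parameter to be tuned. Because plain Levenshtein distance is symmetric, the sketch also assumes that edges run from the shorter phrase to the longer one; that orientation rule is an assumption added here to keep the graph directed.

```python
def levenshtein(a, b):
    """Levenshtein distance between two sequences (here: lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute x -> y
        prev = cur
    return prev[-1]


def shares_window(p_words, q_words, window=10):
    """True if some `window` consecutive words of p occur consecutively in q."""
    q_windows = {tuple(q_words[i:i + window])
                 for i in range(len(q_words) - window + 1)}
    return any(tuple(p_words[i:i + window]) in q_windows
               for i in range(len(p_words) - window + 1))


def is_approx_substring(p, q, k, window=10):
    """Approximate substring test: shared word window OR edit distance < k."""
    p_words, q_words = p.split(), q.split()
    return (shares_window(p_words, q_words, window)
            or levenshtein(p_words, q_words) < k)


def build_edges(phrases, k, window=10):
    """Directed edges (p, q) within one phrase cluster.

    Assumption: only pairs where p has fewer words than q are considered.
    """
    return [(p, q)
            for p in phrases
            for q in phrases
            if len(p.split()) < len(q.split())
            and is_approx_substring(p, q, k, window)]
```

For example, build_edges(["yes we can", "yes we can change"], k=2) yields the single edge ("yes we can", "yes we can change"), since the two phrases differ by one word insertion.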

The visualization will portray the graph along a time axis. Each node will be displayed as an orb with relative size proportional to the frequency of the phrase. To focus on the topological aspect of the graph, the orbs won't display text until clicked. Edges will be displayed as arrows.
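
One possible way to compute the layout described above is sketched below. It is only an illustration: the function name layout_nodes, the canvas width, and the maximum radius are hypothetical, each node is assumed to carry a first-seen timestamp and a positive frequency, and "size" is assumed to mean the orb's radius.

```python
def layout_nodes(nodes, width=800.0, max_radius=30.0):
    """Place nodes along a horizontal time axis.

    nodes: non-empty list of (phrase, timestamp, frequency) tuples.
    Returns one dict per node with an x position scaled to the time range,
    a radius proportional to frequency, and the label hidden by default.
    """
    times = [t for _, t, _ in nodes]
    t_min, t_max = min(times), max(times)
    span = (t_max - t_min) or 1  # avoid division by zero for a single timestamp
    f_max = max(f for _, _, f in nodes)
    return [
        {
            "phrase": phrase,
            "x": width * (t - t_min) / span,
            "radius": max_radius * f / f_max,
            "label_visible": False,  # text is shown only after a click
        }
        for phrase, t, f in nodes
    ]
```

A click handler in the rendering layer would then set label_visible to True for the selected orb.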

Timeline

This timeline will achieve the required checkpoints:

References