Research Experience

Research Projects

Generating Variations in a Virtual Storyteller
[LREC 16]
[IVA 15]
[ICIDS 13]
Dissertation, University of California Santa Cruz, Santa Cruz, CA
January 2013 - December 2016
Advisor: Marilyn Walker
This dissertation introduces the Expressive-Story Translator (EST) content planner and Fabula Tales sentence planner in a storytelling natural language generation framework. Both planners operate in a domain-independent manner, abstractly modeling a variety of stories regardless of story vocabulary. The EST captures story semantics from a narrative representation and constructs text plans to maintain semantic content through rhetorical relations. Content planning is performed using these relations to enhance narrative effects, such as modeling emotions and temporal reordering. The EST transforms the story into semantic-syntactic structures interpreted by the parameterizable sentence planner, Fabula Tales. The semantic-syntactic integration allows Fabula Tales to employ narrative sentence planning devices to change narrator point of view, insert direct speech acts, and supplement character voice using operations for lexical selection, aggregation, and pragmatic marker insertion. The frameworks are evaluated using traditional machine translation metrics, narrative metrics, and overgenerate-and-rank to holistically test the effectiveness of each generated retelling. This work shows how different framings affect reader perception of stories and their characters, and uses statistical analysis of reader feedback to build story models tailored for specific narration preferences.
Automatically Identifying Sarcasm in Online Debate Forums through Bootstrapping
[NW-NLP 14]
[LASM 13]
Master's Project, University of California Santa Cruz, Santa Cruz, CA
March 2012 - March 2014
Advisor: Marilyn Walker
More and more of the information on the web is dialogic, from Facebook newsfeeds, to forum conversations, to comment threads on news articles. In contrast to traditional, monologic Natural Language Processing resources such as news, highly social dialogue is frequent in social media, making it a challenging context for NLP. This paper tests a bootstrapping method, originally proposed in a monologic domain, to train classifiers to identify two different types of subjective language in dialogue: sarcasm and nastiness. We explore two methods of developing linguistic indicators to be used in a first level classifier aimed at maximizing precision at the expense of recall. The best performing classifier for the first phase achieves 54% precision and 38% recall for sarcastic utterances. We then use general syntactic patterns from previous work to create more general sarcasm indicators, improving precision to 62% and recall to 52%. To further test the generality of the method, we then apply it to bootstrapping a classifier for nastiness dialogic acts. Our first phase, using crowdsourced nasty indicators, achieves 58% precision and 49% recall, which increases to 75% precision and 62% recall when we bootstrap over the first level with generalized syntactic patterns.
Mining Online Discussions to Inform and Engage
[EACL 17]
University of California Santa Cruz, Santa Cruz, CA
September 2011 - March 2012, January 2017
Advisors: Marilyn Walker, Steve Whittaker and Pranav Anand
Americans spend about a third of their time online, with many participating in online conversations on social and political issues. We hypothesize that social media arguments on such issues may be more engaging and persuasive than traditional media summaries, and that particular types of people may be more or less convinced by particular styles of argument, e.g. emotional arguments may resonate with some personalities while factual arguments resonate with others. We report a set of experiments testing at large scale how audience variables interact with argument style to affect the persuasiveness of an argument, an under-researched topic within natural language processing. We show that belief change is affected by personality factors, with conscientious, open and agreeable people being more convinced by emotional arguments.
Dialogue Management
Google, Summer Software Engineer Intern, New York, NY
June - September 2016
Advisor: David Elson
Created a domain-independent dialog manager. Collaborated with researchers to design a domain-independent grammar. Built a sustainable and reusable pipeline for scraping training data.
Identifying Emotions in Online Conversations
Microsoft Research, Summer Research Intern, Redmond, WA
June - September 2015
Advisors: Kristin Tolle, Evelyne Viegas, Chris Brockett
Collected a corpus of 12K conversations annotated with sentiment and emotion. Modified a recurrent neural network architecture to generate personalized responses.
Building Community & Commitment with a Virtual Coach in Mobile Wellness Programs
[IVA 15]
PARC, A Xerox Company, Summer Research Intern, Palo Alto, CA
June - September 2013
Advisors: Michael Youngblood and Ashwin Ram
FittleBot is a virtual coach provided as part of a mobile application named Fittle that aims to provide users with social support and motivation for achieving their health and wellness goals. Fittle's wellness challenges are based around teams, where each team has its own FittleBot to provide personalized recommendations, support team building, and provide information or tips. Here we present a quantitative analysis from a 2-week field study where we test new FittleBot strategies to increase FittleBot's effectiveness in building team community. Participants using the enhanced FittleBot improved compliance over the two weeks by 8.8% and increased their sense of community by 4%.
Improving Internet Speed: A Comparison of Round-Trip Time Algorithms
[ICCCN 11]
Summer Undergraduate Research Fellowship in Information Technology, University of California Santa Cruz, Santa Cruz, CA
June 2010 - September 2010
Advisor: Katia Obraczka
In this paper, we explore a novel approach to end-to-end round-trip time (RTT) estimation using a machine-learning technique known as the Experts Framework. In our proposal, each of several "experts" guesses a fixed value. The weighted average of these guesses estimates the RTT, with the weights updated after every RTT measurement based on the difference between the estimated and actual RTT. Through extensive simulations we show that the proposed machine-learning algorithm adapts very quickly to changes in the RTT. Our results show a considerable reduction in the number of retransmitted packets and an increase in goodput, in particular in more heavily congested scenarios. We corroborate our results through "live" experiments using an implementation of the proposed algorithm in the Linux kernel. These experiments confirm the higher accuracy of the machine-learning approach with more than 40% improvement, not only over the standard TCP, but also over the well-known Eifel RTT estimator.
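The weighting scheme described above can be illustrated with a minimal sketch. It is not the paper's implementation: the exponential weight update, the absolute-error loss, the learning rate `eta`, the evenly spaced expert values, and the function names `make_experts` and `run_estimator` are all assumptions chosen for illustration.

```python
import math


def make_experts(low_ms, high_ms, count):
    """Return `count` fixed guesses spread evenly over [low_ms, high_ms].

    Even spacing is an assumption made for this sketch.
    """
    step = (high_ms - low_ms) / (count - 1)
    return [low_ms + i * step for i in range(count)]


def run_estimator(measurements_ms, experts, eta=0.1):
    """Estimate each RTT before it is observed, then reweight the experts.

    The estimate is the weighted average of the experts' fixed guesses.
    After every measurement, each expert's weight is scaled down according
    to how far its guess was from the measured RTT (an assumed
    exponential-loss update), and the weights are renormalized.
    """
    weights = [1.0 / len(experts)] * len(experts)
    estimates = []
    for measured in measurements_ms:
        # Predict first: weighted average of the fixed guesses.
        total = sum(weights)
        estimates.append(sum(w * x for w, x in zip(weights, experts)) / total)
        # Then penalize each expert by its distance from the measurement.
        weights = [
            w * math.exp(-eta * abs(x - measured))
            for w, x in zip(weights, experts)
        ]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return estimates, weights


if __name__ == "__main__":
    experts = make_experts(0.0, 200.0, 21)
    estimates, _ = run_estimator([150.0] * 20, experts)
    print([round(e, 1) for e in estimates])
```

With a steady measured RTT, the weight mass shifts toward the expert whose guess is closest to the measurements, so the estimate moves from the initial average of all guesses toward the observed value.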

I was selected to participate in graduate research with the Computer Engineering faculty and graduate students at UC Santa Cruz. I implemented the Eifel round-trip time algorithm in their test bed. Using a live network, I ran tests to compare Eifel's efficiency against the algorithm the research group had previously created.
Detection of Malware through Statistical Analysis of Source Code
Hauber Science Research Fellowship, Loyola University Maryland, Baltimore, MD
June 2009 - May 2010
Advisors: Dawn Lawrie and Dave Binkley
In my first research experience, I evaluated a novel process devised by my advisors to detect malicious software in programs. I created a program to detect outliers by performing statistical tests, such as Kullback-Leibler divergence. By the time I had finished, we could correctly identify an unclassified file from a collection of source code written by distinct authors 90% of the time.
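As a rough illustration of outlier detection with Kullback-Leibler divergence, the sketch below scores each file by how far its token distribution diverges from the pooled distribution of the other files. It is not the project's actual method: the whitespace tokenization, add-one smoothing, leave-one-out pooling, and the names `distribution`, `kl_divergence`, and `rank_by_divergence` are assumptions made for this example.

```python
import math
from collections import Counter


def distribution(tokens, vocab, alpha=1.0):
    """Smoothed token probabilities over a shared vocabulary.

    Additive smoothing (alpha) keeps every probability nonzero so the
    divergence is always defined; the smoothing choice is an assumption.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {tok: (counts[tok] + alpha) / total for tok in vocab}


def kl_divergence(p, q):
    """D_KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[tok] * math.log(p[tok] / q[tok]) for tok in p)


def rank_by_divergence(files):
    """Rank files from most to least divergent.

    `files` maps a file name to its list of tokens. Each file is compared
    against the pooled tokens of all the other files.
    """
    vocab = sorted({tok for toks in files.values() for tok in toks})
    scores = {}
    for name, toks in files.items():
        rest = [t for other, ts in files.items() if other != name for t in ts]
        scores[name] = kl_divergence(
            distribution(toks, vocab), distribution(rest, vocab)
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = {
        "a": "int x = 0 ; x = x + 1 ;".split(),
        "b": "int y = 1 ; y = y + 2 ;".split(),
        "c": "xor eax eax jmp jmp jmp".split(),
    }
    for name, score in rank_by_divergence(sample):
        print(name, round(score, 3))
```

The file whose vocabulary overlaps least with the rest of the collection receives the highest score and would be the first candidate to inspect.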

Interactive Video Art Exhibit
Senior Project, Loyola University Maryland, Baltimore, MD
August - December 2010
Faculty: Roger Eastman, Daniel Schlapbach
My senior project at Loyola University Maryland was a collaboration with video art students and professors to create a week-long public exhibit featuring the intersection between computer science, technology, and art. I managed the exhibits, coordinated efforts between disciplines, and built software that used Arduino boards to create an interactive experience. Below are pictures from the exhibition.