By Joseph C. Osborn, advised by Prof. Michael Mateas.
Documentation for CMPS 261, Fall 2013.
This page describes the Gamalyzer, a tool for the exploratory visualization of arbitrary games which clusters together similar game play traces (sequences of user decisions). Our two main contributions are a game-independent play trace similarity metric based on edit distance (the Gamalyzer metric) and a visualization technique (the Gamalyzer Visualization) which uses this metric to meaningfully show similarities and differences among large numbers of play traces.
Unlike previous play trace visualization tools, Gamalyzer is more concerned with trajectories than with states. Gamalyzer shows play traces directly and leaves spatiotemporal and other displays of game state up to game- or genre-specific auxiliary visualizations. The Gamalyzer metric can be applied to player and opponent modeling, goal recognition, adaptive tutorials, or even domains outside of games: anywhere one might want to compare sequences of (goal-directed) actions. The Gamalyzer visualization can quickly and inexpensively show designers the strategic landscape of their (arbitrary) game, without requiring specialist statistical or machine learning knowledge on the part of the designer.
We validate the Gamalyzer metric against one synthetic and two real-world data sets, finding that our general-purpose metric is nearly as good as hand-tuned feature selection for discerning designer-relevant differences between play traces.
Gamalyzer is based on the Gamalyzer Metric, an application of Chhieng and Wong's Constraint Continuous Edit Distance (CCED) to play traces. The key differences between CCED and regular edit distance are that edit paths are constrained to prevent matches of early parts of one sequence against later parts of the other, and that change operations replace match operations. These changes may be of variable cost; Gamalyzer's are defined based on an input-versus-input dissimilarity function grounded in a lexicographically weighted product order.
That's a mouthful. Basically, the input to Gamalyzer is a set of sequences of game events. Each game event (also called an input) has four properties: a time point, a player identifier, a determinant, and a value. The determinant is a token which identifies an input type—a regular move versus castling in chess—and the value is a compound term comprised of round lists (terminated with parentheses), square lists (terminated with square brackets), keywords, and numbers. The difference between two keywords is 1.0, the difference between two numbers depends on their distance (normalized with respect to their observed range), the difference between two square lists is the average distance of the corresponding members, and the difference between two round lists is a weighted sum with larger weights near the front of the list. The difference between two events with different determinants is also 1.0.
As is traditional, we implement CCED with a bottom-up dynamic programming algorithm. Because we adopt the Sakoe-Chiba Band (the constraint mentioned earlier), we have an algorithm which is basically linear with respect to the length of the first sequence of the pair being compared.
The Gamalyzer metric emits a time-series of dissimilarity matrices for every pair of play traces up to a given edit operation, and the Gamalyzer visualization makes use of these data.
Fig. 2 shows a typical Gamalyzer display for data from the Mario AI Benchmark. The most prominent features are the colored circles, one for each input. Each polyline connecting those inputs represents a play trace, and time proceeds from the bottom of the image towards the top. The Mario AI Benchmark receives 24 inputs per second and games last up to one minute; this has been downsampled to one aggregate action per second to simplify the visual display. Similar traces are nearer to each other than they are to dissimilar traces, and as traces become more and less similar over time they converge and diverge. The degree to which the incremental similarities influence the layout can be controlled by the user; Fig. 3 shows the case where only overall similarity is used. To faciliate quick visual comparison, different determinants are arbitrarily assigned different glyphs (see Fig. 1) and different value combinations are arbitrarily assigned different colors. When the user mouses over one or more inputs, their contents are shown in a hovering information box.
This is academic prototype software at the moment, so configuration unfortunately requires some coding. Also, the data sets cannot be packaged with Gamalyzer; please email me and I can share them on a limited instructional basis.
mode
definition in cli/ui/core.cljs
to one of :mario
, :refraction
, or :synthetic
, then run lein cljsbuild auto
.lein ring server
and you're off to the races! Your server is at localhost:3000
.resources/traces/refraction
for examples) or implement a new reader (see the modules in srv/gamalyzer/read/
). Update srv/gamalyzer/rsrc/data.clj
to handle your data set and pass along your path or call your reader, then modify core.cljs
to make a request against your data set.Terms in ALL-CAPITALS
are standins.
[:properties ([:game :GAME-NAME] [:root :MAIN-GAME-PROC] [:events (:i)])] [:log_start "UUID"] [:i :PLAYER-ID :GAME-PROC DETERMINANT-BASE DETERMINANT-FINAL VALUE] ... :log_end [:log_start "UUID"] [:i :PLAYER-ID :GAME-PROC DETERMINANT-BASE DETERMINANT-FINAL VALUE] ... :log_end ...
The time point of each input is inferred by the event ordering. "Procs" are not necessary for Gamalyzer, but they can be a mnemonic for the designer; Chess might only have two procs ("move" and "promote"), and a game like Mario only one ("game"). The "main proc" is just a hook for establishing a little hierarchy; this trace format is a specialized version of a more generic play trace format, and it drops out all game events except for (:i
) inputs.
This format also splits the determinant into a base and final part. The base part should be a round list, the final part should be a :keyword
, and the value should be either a round or a square list. Apologies for the awkwardness of the format.