Building Blocks for Exploratory Data Analysis Tools

Data exploration is largely manual and labor-intensive. Although various tools and statistical techniques can be applied to data sets, there is little help in identifying which questions to ask of a data set, let alone which domain knowledge is useful in answering them. In this paper, we study user queries against production data sets in Splunk. Specifically, we use latent semantic analysis to characterize the interplay between data sets and the operations used to analyze them, and we discuss how this characterization serves as a building block for a data analysis recommendation system. This is a work-in-progress paper.
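
As a rough illustration of the latent semantic analysis idea mentioned in the abstract, the sketch below applies a truncated singular value decomposition to a small data-set-by-operation count matrix. It is a generic LSA sketch, not the paper's actual pipeline: the data set names, the operation names, the counts, and the helper functions lsa_embed and cosine are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data (not from the paper): rows are data sets,
# columns are operations, and each cell counts how often an operation
# was applied to a data set.
datasets = ["web_access_logs", "app_error_logs",
            "sales_records", "inventory_records"]
operations = ["search", "rex", "stats", "timechart", "top"]
counts = np.array([
    [9, 6, 1, 1, 0],
    [8, 7, 0, 2, 1],
    [1, 0, 8, 6, 5],
    [0, 1, 7, 5, 6],
], dtype=float)


def lsa_embed(matrix, k):
    """Return k-dimensional latent vectors for the rows and columns."""
    # Thin SVD: matrix = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    row_vecs = u[:, :k] * s[:k]      # one latent vector per data set
    col_vecs = vt[:k, :].T * s[:k]   # one latent vector per operation
    return row_vecs, col_vecs


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


dataset_vecs, operation_vecs = lsa_embed(counts, k=2)

# Data sets analyzed with similar operations end up close together
# in the latent space; data sets with different usage end up far apart.
sim_logs = cosine(dataset_vecs[0], dataset_vecs[1])
sim_cross = cosine(dataset_vecs[0], dataset_vecs[2])
print(f"{datasets[0]} vs {datasets[1]}: {sim_logs:.2f}")
print(f"{datasets[0]} vs {datasets[2]}: {sim_cross:.2f}")
```

In a recommendation setting, one plausible use of such vectors would be to suggest operations that were common on data sets whose latent vectors lie close to a new data set's vector. That step is an assumption about how the characterization could be used and is not described in the abstract.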

Authors: Sara Alspaugh, Archana Ganapathi, Marti Hearst, Randy Katz
Publication Date: August 2013
Conference: Workshop on Interactive Data Exploration and Analytics (at KDD)
Download PDF: alspaugh_idea2013