Visual Comparison of Datasets Using Mixture Decompositions

Loading...
Thumbnail Image

Related Collections

Degree type

Discipline

Subject

classification
data visualization
density estimation
exploratory data analysis
mixture decomposition
Applied Statistics
Other Statistics and Probability
Statistics and Probability

Funder

Grant number

License

Copyright date

Distributor

Related resources

Contributor

Abstract

This article describes how a mixture of two densities, f0 and f1, may be decomposed into a different mixture consisting of three densities. These new densities, f+, f−, and f=, summarize differences between f0 and f1: f+ is high in areas of excess of f1 compared to f0; f− represents deficiency of f1 compared to f0 in the same way; f= represents commonality between f1 and f0. The supports of f+ and f− are disjoint. This decomposition of the mixture of f0 and f1 is similar to the set-theoretic decomposition of the union of two sets A and B into the disjoint sets A\B, B\A, and A ∩ B. Sample points from f0 and f1 can be assigned to one of these three densities, allowing the differences between f0 and f1 to be visualized in a single plot, a visual hypothesis test of whether f0 is equal to f1. We describe two similar such decompositions and contrast their behavior under the null hypothesis f0 = f1, giving some insight into how such plots may be interpreted. We present two examples of uses of these methods: visualization of departures from independence, and of a two-class classification problem. Other potential applications are discussed.

Advisor

Date Range for Data Collection (Start Date)

Date Range for Data Collection (End Date)

Digital Object Identifier

Series name and number

Publication date

2004-01-01

Journal title

Journal of Computational and Graphical Statistics

Volume number

Issue number

Publisher

Publisher DOI

relationships.isJournalIssueOf

Comments

Recommended citation

Collection