See: http://www.infoq.com/cn/news/2017/07/goole-sight-facets-ai
https://github.com/PAIR-code/facets/blob/master/facets_dive/README.md
Introduction
The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive.
The visualizations are implemented as Polymer web components backed by TypeScript code, and can be easily embedded into Jupyter notebooks or webpages.
Live demos of the visualizations can be found on the Facets project description page.
Facets Overview
Overview gives a high-level view of one or more datasets. It produces a visual feature-by-feature statistical analysis, and can also be used to compare statistics across two or more datasets. The tool can process both numeric and string features, including multiple instances of a number or string per feature.
Overview can help uncover issues with datasets, including the following:
- Unexpected feature values
- Missing feature values for a large number of examples
- Training/serving skew
- Training/test/validation set skew
Key aspects of the visualization are outlier detection and distribution comparison across multiple datasets. Interesting values (such as a high proportion of missing data, or very different distributions of a feature across multiple datasets) are highlighted in red. Features can be sorted by values of interest such as the number of missing values or the skew between the different datasets.
Details about Overview usage can be found in its README.
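To make the kind of per-feature comparison described above more concrete, here is a minimal sketch in plain Python. It is not the Facets API and does not reproduce how Overview computes its statistics. The helper names (`missing_fraction`, `value_distribution`, `distribution_gap`), the toy `train` and `test` datasets, and the choice of "largest per-value frequency difference" as the gap measure are all assumptions made for illustration.

```python
from collections import Counter


def missing_fraction(rows, feature):
    """Fraction of examples where `feature` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def value_distribution(rows, feature):
    """Normalized frequency of each observed (non-missing) value."""
    counts = Counter(
        row[feature] for row in rows if row.get(feature) is not None
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}


def distribution_gap(dist_a, dist_b):
    """Largest absolute frequency difference for any single value.

    This is one simple way to quantify how differently a feature is
    distributed in two datasets; it is an illustrative choice only.
    """
    values = set(dist_a) | set(dist_b)
    return max(
        (abs(dist_a.get(v, 0.0) - dist_b.get(v, 0.0)) for v in values),
        default=0.0,
    )


# Hypothetical toy datasets: each example is a dict of feature -> value.
train = [
    {"color": "red", "size": 1},
    {"color": "red", "size": 2},
    {"color": "blue", "size": None},
    {"color": "blue", "size": 4},
]
test = [
    {"color": "red", "size": 1},
    {"color": "red", "size": None},
    {"color": "red", "size": None},
    {"color": "green", "size": 3},
]

# Build one summary row per feature, comparing the two datasets.
report = []
for feature in ["color", "size"]:
    report.append({
        "feature": feature,
        "missing_train": missing_fraction(train, feature),
        "missing_test": missing_fraction(test, feature),
        "gap": distribution_gap(
            value_distribution(train, feature),
            value_distribution(test, feature),
        ),
    })

# Sort by a value of interest, here the missing fraction in the test set.
report.sort(key=lambda row: row["missing_test"], reverse=True)

for row in report:
    print(row)
```

In this toy data, `size` sorts first because half of its test values are missing, and both features show a gap between train and test. A tool like Overview surfaces the same kind of signal visually, so that features worth a closer look stand out without manual scripting.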
Facets Dive
This article is reposted from 张昺华-sky's cnblogs blog. Original link: http://www.cnblogs.com/bonelee/p/7227788.html. To republish it, please contact the original author.