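Python seaborn: The Ultimate Guide to the seaborn Data Visualization Library, Volume 1: Introduction to and Usage of Bar Plots, Box Plots (Confidence Interval Plots), Scatter/Line Plots, Kernel Density/Contour Plots, and Box/Violin/Letter-Value (LV) Plots (Worth Bookmarking) (Part 2)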


3. The barplot function: bar plot visualization


seaborn.barplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=numpy.mean, ci=95, n_boot=1000, units=None, seed=None, orient=None, color=None, palette=None, saturation=0.75, errcolor='.26', errwidth=None, capsize=None, dodge=True, ax=None, **kwargs)

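Only the second variable needs to be numeric.

A bar plot uses the height of each rectangle to show an estimate of central tendency (the mean by default) for a numeric variable, and uses error bars to indicate the uncertainty around that estimate. A longer error bar means a less certain estimate, which usually reflects more dispersed, less stable data in that group.


Official documentation: http://seaborn.pydata.org/generated/seaborn.barplot.html?highlight=barplot#seaborn.barplot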


Show point estimates and confidence intervals as rectangular bars.

A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle and provides some indication of the uncertainty around that estimate using error bars. Bar plots include 0 in the quantitative axis range, and they are a good choice when 0 is a meaningful value for the quantitative variable, and you want to make comparisons against it.

For datasets where 0 is not a meaningful value, a point plot will allow you to focus on differences between levels of one or more categorical variables.

It is also important to keep in mind that a bar plot shows only the mean (or other estimator) value, but in many cases it may be more informative to show the distribution of values at each level of the categorical variables. In that case, other approaches such as a box or violin plot may be more appropriate.
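As a minimal sketch of the point about the estimator, the snippet below draws the same data twice, once with the default mean and once with the median. The DataFrame, its column names (`group`, `score`) and the values are hypothetical toy data. The `ci` argument from the signature is left at its default, since newer seaborn releases configure error bars differently.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical toy data: three groups with five scores each
df = pd.DataFrame({
    "group": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "score": [1, 2, 3, 4, 5, 2, 4, 6, 8, 10, 3, 3, 3, 3, 13],
})

fig, (ax_mean, ax_median) = plt.subplots(1, 2, figsize=(8, 3))

# Default estimator: each bar height is the group mean, with a bootstrapped error bar
sns.barplot(data=df, x="group", y="score", seed=0, ax=ax_mean)

# Any function that reduces a vector to a scalar can be used as the estimator
sns.barplot(data=df, x="group", y="score", estimator=np.median, seed=0, ax=ax_median)

# Read the drawn bar heights back from the axes
mean_heights = [p.get_height() for p in ax_mean.patches]
median_heights = [p.get_height() for p in ax_median.patches]
print("mean:  ", mean_heights)
print("median:", median_heights)
```

Group C contains one large value, so its mean bar and median bar differ noticeably. This is the kind of detail a single bar hides and a box or violin plot would reveal.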

Input data can be passed in a variety of formats, including:

Vectors of data represented as lists, numpy arrays, or pandas Series objects passed directly to the x, y, and/or hue parameters.

A “long-form” DataFrame, in which case the x, y, and hue variables will determine how the data are plotted.

A “wide-form” DataFrame, such that each numeric column will be plotted.

An array or list of vectors.

In most cases, it is possible to use numpy or Python objects, but pandas objects are preferable because the associated names will be used to annotate the axes. Additionally, you can use Categorical types for the grouping variables to control the order of plot elements.

This function always treats one of the variables as categorical and draws data at ordinal positions (0, 1, … n) on the relevant axis, even when the data has a numeric or date type.
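A small sketch may make the "ordinal positions" remark more concrete. With hypothetical numeric doses of 10, 200 and 3000, the bars are still centered at 0, 1 and 2 rather than at the dose values. The second panel uses a pandas `Categorical` to control the order of the bars. All column names and values are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

fig, (ax_num, ax_cat) = plt.subplots(1, 2, figsize=(8, 3))

# Numeric x values are still treated as categories
dose_df = pd.DataFrame({
    "dose": [3000, 3000, 10, 10, 200, 200],
    "response": [5, 7, 1, 3, 2, 4],
})
sns.barplot(data=dose_df, x="dose", y="response", ax=ax_num)

# Bar centers sit at ordinal positions, not at 10 / 200 / 3000
centers = [round(p.get_x() + p.get_width() / 2, 6) for p in ax_num.patches]
print(centers)

# A Categorical with explicit categories fixes the order of the bars
size_class = pd.Categorical(
    ["small", "large", "medium", "small", "large", "medium"],
    categories=["small", "medium", "large"],
    ordered=True,
)
price_df = pd.DataFrame({"size_class": size_class, "price": [1, 9, 5, 3, 11, 7]})
sns.barplot(data=price_df, x="size_class", y="price", ax=ax_cat)

ordered_heights = [p.get_height() for p in ax_cat.patches]
print(ordered_heights)
```

Without the `Categorical`, the second panel would follow the order in which the labels first appear in the data (small, large, medium).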

(1) BarPlot

(2) BarPlotByV

(3) BarPlotBy2V
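Assuming the heading BarPlotBy2V refers to a bar plot grouped by two categorical variables, a minimal sketch could look like the following. The second variable is mapped to `hue`, and `dodge=True` (the default in the signature above) places the bars side by side. The data and the column names (`day`, `period`, `sales`) are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical toy data: two days, two periods per day, two observations per cell
df = pd.DataFrame({
    "day": ["Mon"] * 4 + ["Tue"] * 4,
    "period": ["morning", "morning", "evening", "evening"] * 2,
    "sales": [10, 12, 20, 22, 30, 32, 40, 42],
})

fig, ax = plt.subplots(figsize=(5, 3))

# x gives the groups on the axis, hue splits each group into colored bars
sns.barplot(data=df, x="day", y="sales", hue="period", dodge=True, seed=0, ax=ax)

# Keep only patches with a positive width, in case the axes also hold
# zero-size helper patches (for example legend handles)
heights = sorted(p.get_height() for p in ax.patches if p.get_width() > 0)
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(heights)
print(legend_labels)
```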

4. The pointplot function: visualizing point estimates and confidence intervals (error bars)


seaborn.pointplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=numpy.mean, ci=95, n_boot=1000, units=None, seed=None, markers='o', linestyles='-', dodge=False, join=True, scale=1, orient=None, color=None, palette=None, errwidth=None, capsize=None, ax=None, **kwargs)

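Only the second variable needs to be numeric.

Confidence interval estimation: each point in the plot is the mean of its group, the vertical line through it is the error bar, and by default the mean points are joined by a line.


Official documentation: http://seaborn.pydata.org/generated/seaborn.pointplot.html?highlight=pointplot#seaborn.pointplot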


Show point estimates and confidence intervals using scatter plot glyphs.

A point plot represents an estimate of central tendency for a numeric variable by the position of scatter plot points and provides some indication of the uncertainty around that estimate using error bars.

Point plots can be more useful than bar plots for focusing comparisons between different levels of one or more categorical variables. They are particularly adept at showing interactions: how the relationship between levels of one categorical variable changes across levels of a second categorical variable. The lines that join each point from the same hue level allow interactions to be judged by differences in slope, which is easier for the eyes than comparing the heights of several groups of points or bars.

It is important to keep in mind that a point plot shows only the mean (or other estimator) value, but in many cases it may be more informative to show the distribution of values at each level of the categorical variables. In that case, other approaches such as a box or violin plot may be more appropriate.
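To illustrate how a point plot exposes an interaction through slopes, here is a small sketch. The data set is invented: the hypothetical drug B responds much more strongly to a high dose than drug A, so the two joined lines have clearly different slopes. The column names (`dose`, `drug`, `effect`) are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical toy data: 2 dose levels x 2 drugs, two observations per cell
df = pd.DataFrame({
    "dose": ["low"] * 4 + ["high"] * 4,
    "drug": ["A", "A", "B", "B"] * 2,
    "effect": [1, 3, 2, 4, 2, 4, 8, 10],
})

fig, ax = plt.subplots(figsize=(4, 3))

# One line per hue level; dodge=True shifts the lines slightly so error bars do not overlap
sns.pointplot(data=df, x="dose", y="effect", hue="drug", dodge=True, seed=0, ax=ax)

# The same comparison in numbers: change in mean effect from low to high dose, per drug
cell_means = df.groupby(["drug", "dose"])["effect"].mean().unstack()
slopes = cell_means["high"] - cell_means["low"]
print(slopes)
```

If the two slopes were equal, the lines would be parallel and there would be no interaction between drug and dose in this toy data.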

Input data can be passed in a variety of formats, including:

Vectors of data represented as lists, numpy arrays, or pandas Series objects passed directly to the x, y, and/or hue parameters.

A “long-form” DataFrame, in which case the x, y, and hue variables will determine how the data are plotted.

A “wide-form” DataFrame, such that each numeric column will be plotted.

An array or list of vectors.

In most cases, it is possible to use numpy or Python objects, but pandas objects are preferable because the associated names will be used to annotate the axes. Additionally, you can use Categorical types for the grouping variables to control the order of plot elements.

This function always treats one of the variables as categorical and draws data at ordinal positions (0, 1, … n) on the relevant axis, even when the data has a numeric or date type.

5. The stripplot function: scatter plot visualization


seaborn.stripplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, jitter=True, dodge=False, orient=None, color=None, palette=None, size=5, edgecolor='gray', linewidth=0, ax=None, **kwargs)

Official documentation: http://seaborn.pydata.org/generated/seaborn.stripplot.html?highlight=stripplot#seaborn.stripplot


Draw a scatterplot where one variable is categorical.

A strip plot can be drawn on its own, but it is also a good complement to a box or violin plot in cases where you want to show all observations along with some representation of the underlying distribution.

Input data can be passed in a variety of formats, including:

Vectors of data represented as lists, numpy arrays, or pandas Series objects passed directly to the x, y, and/or hue parameters.

A “long-form” DataFrame, in which case the x, y, and hue variables will determine how the data are plotted.

A “wide-form” DataFrame, such that each numeric column will be plotted.

An array or list of vectors.

In most cases, it is possible to use numpy or Python objects, but pandas objects are preferable because the associated names will be used to annotate the axes. Additionally, you can use Categorical types for the grouping variables to control the order of plot elements.

This function always treats one of the variables as categorical and draws data at ordinal positions (0, 1, … n) on the relevant axis, even when the data has a numeric or date type.
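The idea of using a strip plot as a complement to a box plot can be sketched as below. The snippet draws a box plot first and then overlays every observation with `stripplot` on the same axes. The random data and the column names (`species`, `length`) are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical toy data: 20 random measurements for each of three species
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "species": np.repeat(["a", "b", "c"], 20),
    "length": np.concatenate([rng.normal(m, 1.0, 20) for m in (5, 7, 9)]),
})

fig, ax = plt.subplots(figsize=(5, 3))

# Summary of the distribution in the background
sns.boxplot(data=df, x="species", y="length", color="white", ax=ax)

# Every single observation on top; jitter spreads points sideways to reduce overlap
sns.stripplot(data=df, x="species", y="length", jitter=True, size=4,
              color="black", ax=ax)

# Count the scatter points that were drawn on the axes
n_points = sum(len(c.get_offsets()) for c in ax.collections)
print(n_points)
```

Setting `jitter=False` would place all points of a category on one vertical line, which makes dense regions harder to read.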

6. The relplot function: scatter plot / line plot visualization


seaborn.relplot(*, x=None, y=None, hue=None, size=None, style=None, data=None, row=None, col=None, col_wrap=None, row_order=None, col_order=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, markers=None, dashes=None, style_order=None, legend='auto', kind='scatter', height=5, aspect=1, facet_kws=None, units=None, **kwargs)

Official documentation: http://seaborn.pydata.org/generated/seaborn.relplot.html?highlight=relplot#seaborn.relplot


Figure-level interface for drawing relational plots onto a FacetGrid.

This function provides access to several different axes-level functions that show the relationship between two variables with semantic mappings of subsets. The kind parameter selects the underlying axes-level function to use:

scatterplot() (with kind="scatter"; the default)

lineplot() (with kind="line")

Extra keyword arguments are passed to the underlying function, so you should refer to the documentation for each to see kind-specific options.

The relationship between x and y can be shown for different subsets of the data using the hue, size, and style parameters. These parameters control what visual semantics are used to identify the different subsets. It is possible to show up to three dimensions independently by using all three semantic types, but this style of plot can be hard to interpret and is often ineffective. Using redundant semantics (i.e. both hue and style for the same variable) can be helpful for making graphics more accessible.

See the tutorial for more information.

The default treatment of the hue (and to a lesser extent, size) semantic, if present, depends on whether the variable is inferred to represent “numeric” or “categorical” data. In particular, numeric variables are represented with a sequential colormap by default, and the legend entries show regular “ticks” with values that may or may not exist in the data. This behavior can be controlled through various parameters, as described and illustrated below.

After plotting, the FacetGrid with the plot is returned and can be used directly to tweak supporting plot details or add other layers.
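The two points above, a numeric `hue` and the returned FacetGrid, can be sketched together. In this hypothetical example `temperature` is a numeric column, so by default it is drawn with a sequential colormap. The returned grid is then used to relabel the axes and to add a reference line as an extra layer. All names and values are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data with a numeric variable used for color
df = pd.DataFrame({
    "x": np.arange(10),
    "y": np.arange(10) ** 2,
    "temperature": np.linspace(0, 45, 10),
})

# Numeric hue: sequential colormap by default; legend="full" would list every value instead
g = sns.relplot(data=df, x="x", y="y", hue="temperature", height=3)

# The returned FacetGrid can be used to tweak labels ...
g.set_axis_labels("step", "squared value")

# ... or to add further layers to each underlying matplotlib Axes
for ax in g.axes.flat:
    ax.axhline(40, color="gray", linestyle="--")

first_ax = g.axes.flat[0]
print(first_ax.get_xlabel(), "|", first_ax.get_ylabel())
```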
