Python pandas Library | Though the Waters Run Deep, I Take Only One Ladleful (5)

S~W:  Function46~56

Types['Function'][45:]
['set_eng_float_format', 'show_versions', 'test', 'timedelta_range', 'to_datetime', 'to_numeric', 'to_pickle', 'to_timedelta', 'unique', 'value_counts', 'wide_to_long']


Function46

set_eng_float_format(accuracy: 'int' = 3, use_eng_prefix: 'bool' = False) -> 'None'

Help on function set_eng_float_format in module pandas.io.formats.format:
set_eng_float_format(accuracy: 'int' = 3, use_eng_prefix: 'bool' = False) -> 'None'
    Alter default behavior on how float is formatted in DataFrame.
    Format float in engineering format. By accuracy, we mean the number of
    decimal digits after the floating point.
    See also EngFormatter.
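
The help text above is brief, so here is a minimal sketch of what the setting does. The DataFrame and its `value` column are made up for illustration. The call changes a global display option, which is why the sketch resets it at the end.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({"value": [1500.0, 0.00042, 3.2e7]})

# Switch float display to engineering notation with SI prefixes (k, M, u, ...).
pd.set_eng_float_format(accuracy=1, use_eng_prefix=True)
rendered = df.to_string()
print(rendered)

# The setting is global: it is stored in the "display.float_format" option,
# so reset it once it is no longer wanted.
formatter = pd.get_option("display.float_format")
pd.reset_option("display.float_format")
```

With `use_eng_prefix=False` (the default) the exponent is printed as `E+03`, `E-06` and so on instead of a prefix letter.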




Function47

show_versions(as_json: 'str | bool' = False) -> 'None'

Help on function show_versions in module pandas.util._print_versions:
show_versions(as_json: 'str | bool' = False) -> 'None'
    Provide useful information, important for bug reports.
    It comprises info about the host operating system, pandas version,
    and versions of other installed related packages.
    Parameters
    ----------
    as_json : str or bool, default False
        * If False, outputs info in a human readable form to the console.
        * If str, it will be considered as a path to a file.
          Info will be written to that file in JSON format.
        * If True, outputs info in JSON format to the console.
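
As a rough illustration of the `as_json` string form described above, the sketch below writes the report to a temporary file and loads it back. The file name `versions.json` is an arbitrary choice for this example.

```python
import json
import os
import tempfile

import pandas as pd

# Write the report to a temporary JSON file instead of the console.
with tempfile.TemporaryDirectory() as tmp_dir:
    report_path = os.path.join(tmp_dir, "versions.json")  # hypothetical file name
    pd.show_versions(as_json=report_path)
    with open(report_path, encoding="utf-8") as fh:
        report = json.load(fh)

# The top-level keys group the information by category.
print(sorted(report))
```

Attaching such a file to a bug report saves copying console output by hand.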



Function48

test(extra_args=None)

Help on function test in module pandas.util._tester:

test(extra_args=None)



Function49

timedelta_range(start=None, end=None, periods: 'Optional[int]' = None, freq=None, name=None, closed=None) -> 'TimedeltaIndex'

Help on function timedelta_range in module pandas.core.indexes.timedeltas:
timedelta_range(start=None, end=None, periods: 'Optional[int]' = None, freq=None, name=None, closed=None) -> 'TimedeltaIndex'
    Return a fixed frequency TimedeltaIndex, with day as the default
    frequency.
    Parameters
    ----------
    start : str or timedelta-like, default None
        Left bound for generating timedeltas.
    end : str or timedelta-like, default None
        Right bound for generating timedeltas.
    periods : int, default None
        Number of periods to generate.
    freq : str or DateOffset, default 'D'
        Frequency strings can have multiples, e.g. '5H'.
    name : str, default None
        Name of the resulting TimedeltaIndex.
    closed : str, default None
        Make the interval closed with respect to the given frequency to
        the 'left', 'right', or both sides (None).
    Returns
    -------
    TimedeltaIndex
    Notes
    -----
    Of the four parameters ``start``, ``end``, ``periods``, and ``freq``,
    exactly three must be specified. If ``freq`` is omitted, the resulting
    ``TimedeltaIndex`` will have ``periods`` linearly spaced elements between
    ``start`` and ``end`` (closed on both sides).
    To learn more about the frequency strings, please see `this link
    <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>`__.
    Examples
    --------
    >>> pd.timedelta_range(start='1 day', periods=4)
    TimedeltaIndex(['1 days', '2 days', '3 days', '4 days'],
                   dtype='timedelta64[ns]', freq='D')
    The ``closed`` parameter specifies which endpoint is included.  The default
    behavior is to include both endpoints.
    >>> pd.timedelta_range(start='1 day', periods=4, closed='right')
    TimedeltaIndex(['2 days', '3 days', '4 days'],
                   dtype='timedelta64[ns]', freq='D')
    The ``freq`` parameter specifies the frequency of the TimedeltaIndex.
    Only fixed frequencies can be passed, non-fixed frequencies such as
    'M' (month end) will raise.
    >>> pd.timedelta_range(start='1 day', end='2 days', freq='6H')
    TimedeltaIndex(['1 days 00:00:00', '1 days 06:00:00', '1 days 12:00:00',
                    '1 days 18:00:00', '2 days 00:00:00'],
                   dtype='timedelta64[ns]', freq='6H')
    Specify ``start``, ``end``, and ``periods``; the frequency is generated
    automatically (linearly spaced).
    >>> pd.timedelta_range(start='1 day', end='5 days', periods=4)
    TimedeltaIndex(['1 days 00:00:00', '2 days 08:00:00', '3 days 16:00:00',
                    '5 days 00:00:00'],
                   dtype='timedelta64[ns]', freq=None)
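
The rule in the Notes about `start`, `end`, `periods` and `freq` can be checked with a short sketch. The variable names are illustrative only, and the exact error message may differ between pandas versions.

```python
import pandas as pd

# start + periods: the default frequency of one day fills in the rest.
daily = pd.timedelta_range(start="1 day", periods=4)

# start + end + periods: no freq, so the points are linearly spaced.
spaced = pd.timedelta_range(start="1 day", end="5 days", periods=4)
print(spaced)

# Passing all four of start, end, periods and freq is rejected.
try:
    pd.timedelta_range(start="1 day", end="5 days", periods=4, freq="D")
    over_specified_error = None
except ValueError as exc:
    over_specified_error = exc
print(over_specified_error)
```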


Function50

to_datetime(arg: 'DatetimeScalarOrArrayConvertible', errors: 'str' = 'raise', dayfirst: 'bool' = False, yearfirst: 'bool' = False, utc: 'bool | None' = None, format: 'str | None' = None, exact: 'bool' = True, unit: 'str | None' = None, infer_datetime_format: 'bool' = False, origin='unix', cache: 'bool' = True) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'



Help on function to_datetime in module pandas.core.tools.datetimes:
to_datetime(arg: 'DatetimeScalarOrArrayConvertible', errors: 'str' = 'raise', dayfirst: 'bool' = False, yearfirst: 'bool' = False, utc: 'bool | None' = None, format: 'str | None' = None, exact: 'bool' = True, unit: 'str | None' = None, infer_datetime_format: 'bool' = False, origin='unix', cache: 'bool' = True) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'
    Convert argument to datetime.
    Parameters
    ----------
    arg : int, float, str, datetime, list, tuple, 1-d array, Series, DataFrame/dict-like
        The object to convert to a datetime.
    errors : {'ignore', 'raise', 'coerce'}, default 'raise'
        - If 'raise', then invalid parsing will raise an exception.
        - If 'coerce', then invalid parsing will be set as NaT.
        - If 'ignore', then invalid parsing will return the input.
    dayfirst : bool, default False
        Specify a date parse order if `arg` is str or its list-likes.
        If True, parses dates with the day first, eg 10/11/12 is parsed as
        2012-11-10.
        Warning: dayfirst=True is not strict, but will prefer to parse
        with day first (this is a known bug, based on dateutil behavior).
    yearfirst : bool, default False
        Specify a date parse order if `arg` is str or its list-likes.
        - If True parses dates with the year first, eg 10/11/12 is parsed as
          2010-11-12.
        - If both dayfirst and yearfirst are True, yearfirst takes precedence
          (same as dateutil).
        Warning: yearfirst=True is not strict, but will prefer to parse
        with year first (this is a known bug, based on dateutil behavior).
    utc : bool, default None
        Return UTC DatetimeIndex if True (converting any tz-aware
        datetime.datetime objects as well).
    format : str, default None
        The strftime to parse time, eg "%d/%m/%Y", note that "%f" will parse
        all the way up to nanoseconds.
        See strftime documentation for more information on choices:
        https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior.
    exact : bool, True by default
        Behaves as:
        - If True, require an exact format match.
        - If False, allow the format to match anywhere in the target string.
    unit : str, default 'ns'
        The unit of the arg (D,s,ms,us,ns) denote the unit, which is an
        integer or float number. This will be based off the origin.
        Example, with unit='ms' and origin='unix' (the default), this
        would calculate the number of milliseconds to the unix epoch start.
    infer_datetime_format : bool, default False
        If True and no `format` is given, attempt to infer the format of the
        datetime strings based on the first non-NaN element,
        and if it can be inferred, switch to a faster method of parsing them.
        In some cases this can increase the parsing speed by ~5-10x.
    origin : scalar, default 'unix'
        Define the reference date. The numeric values would be parsed as number
        of units (defined by `unit`) since this reference date.
        - If 'unix' (or POSIX) time; origin is set to 1970-01-01.
        - If 'julian', unit must be 'D', and origin is set to beginning of
          Julian Calendar. Julian day number 0 is assigned to the day starting
          at noon on January 1, 4713 BC.
        - If Timestamp convertible, origin is set to Timestamp identified by
          origin.
    cache : bool, default True
        If True, use a cache of unique, converted dates to apply the datetime
        conversion. May produce significant speed-up when parsing duplicate
        date strings, especially ones with timezone offsets. The cache is only
        used when there are at least 50 values. The presence of out-of-bounds
        values will render the cache unusable and may slow down parsing.
        .. versionchanged:: 0.25.0
            - changed default value from False to True.
    Returns
    -------
    datetime
        If parsing succeeded.
        Return type depends on input:
        - list-like: DatetimeIndex
        - Series: Series of datetime64 dtype
        - scalar: Timestamp
        In case when it is not possible to return designated types (e.g. when
        any element of input is before Timestamp.min or after Timestamp.max)
        return will have datetime.datetime type (or corresponding
        array/Series).
    See Also
    --------
    DataFrame.astype : Cast argument to a specified dtype.
    to_timedelta : Convert argument to timedelta.
    convert_dtypes : Convert dtypes.
    Examples
    --------
    Assembling a datetime from multiple columns of a DataFrame. The keys can be
    common abbreviations like ['year', 'month', 'day', 'minute', 'second',
    'ms', 'us', 'ns'] or plurals of the same.
    >>> df = pd.DataFrame({'year': [2015, 2016],
    ...                    'month': [2, 3],
    ...                    'day': [4, 5]})
    >>> pd.to_datetime(df)
    0   2015-02-04
    1   2016-03-05
    dtype: datetime64[ns]
    If a date does not meet the `timestamp limitations
    <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
    #timeseries-timestamp-limits>`_, passing errors='ignore'
    will return the original input instead of raising any exception.
    Passing errors='coerce' will force an out-of-bounds date to NaT,
    in addition to forcing non-dates (or non-parseable dates) to NaT.
    >>> pd.to_datetime('13000101', format='%Y%m%d', errors='ignore')
    datetime.datetime(1300, 1, 1, 0, 0)
    >>> pd.to_datetime('13000101', format='%Y%m%d', errors='coerce')
    NaT
    Passing infer_datetime_format=True can often speed up parsing
    if the format is not exactly ISO8601 but is regular.
    >>> s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000'] * 1000)
    >>> s.head()
    0    3/11/2000
    1    3/12/2000
    2    3/13/2000
    3    3/11/2000
    4    3/12/2000
    dtype: object
    >>> %timeit pd.to_datetime(s, infer_datetime_format=True)  # doctest: +SKIP
    100 loops, best of 3: 10.4 ms per loop
    >>> %timeit pd.to_datetime(s, infer_datetime_format=False)  # doctest: +SKIP
    1 loop, best of 3: 471 ms per loop
    Using a unix epoch time
    >>> pd.to_datetime(1490195805, unit='s')
    Timestamp('2017-03-22 15:16:45')
    >>> pd.to_datetime(1490195805433502912, unit='ns')
    Timestamp('2017-03-22 15:16:45.433502912')
    .. warning:: For float arg, precision rounding might happen. To prevent
        unexpected behavior use a fixed-width exact type.
    Using a non-unix epoch origin
    >>> pd.to_datetime([1, 2, 3], unit='D',
    ...                origin=pd.Timestamp('1960-01-01'))
    DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'],
                  dtype='datetime64[ns]', freq=None)
    In case input is list-like and the elements of input are of mixed
    timezones, return will have object type Index if utc=False.
    >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'])
    Index([2018-10-26 12:00:00-05:30, 2018-10-26 12:00:00-05:00], dtype='object')
    >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'],
    ...                utc=True)
    DatetimeIndex(['2018-10-26 17:30:00+00:00', '2018-10-26 17:00:00+00:00'],
                  dtype='datetime64[ns, UTC]', freq=None)
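
A compact sketch of the parameters used most often in practice: `format`, `errors`, `dayfirst`, `unit` and `origin`. The sample strings are invented. The sketch leaves out `errors='ignore'` and `infer_datetime_format`, which newer pandas releases deprecate.

```python
import pandas as pd

# An explicit format plus errors="coerce" turns unparseable entries into NaT.
raw = pd.Series(["2021-03-01", "not a date", "2021-03-03"])
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")
print(parsed)

# dayfirst=True reads 10/11/12 as 10 November 2012.
day_first = pd.to_datetime("10/11/12", dayfirst=True)

# Integers are interpreted as counts of `unit` since `origin`.
from_epoch = pd.to_datetime(1490195805, unit="s")
from_1960 = pd.to_datetime([1, 2, 3], unit="D", origin=pd.Timestamp("1960-01-01"))
print(day_first, from_epoch, from_1960, sep="\n")
```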



Function51

to_numeric(arg, errors='raise', downcast=None)

Help on function to_numeric in module pandas.core.tools.numeric:
to_numeric(arg, errors='raise', downcast=None)
    Convert argument to a numeric type.
    The default return dtype is `float64` or `int64`
    depending on the data supplied. Use the `downcast` parameter
    to obtain other dtypes.
    Please note that precision loss may occur if really large numbers
    are passed in. Due to the internal limitations of `ndarray`, if
    numbers smaller than `-9223372036854775808` (np.iinfo(np.int64).min)
    or larger than `18446744073709551615` (np.iinfo(np.uint64).max) are
    passed in, it is very likely they will be converted to float so that
    they can be stored in an `ndarray`. These warnings apply similarly to
    `Series` since it internally leverages `ndarray`.
    Parameters
    ----------
    arg : scalar, list, tuple, 1-d array, or Series
        Argument to be converted.
    errors : {'ignore', 'raise', 'coerce'}, default 'raise'
        - If 'raise', then invalid parsing will raise an exception.
        - If 'coerce', then invalid parsing will be set as NaN.
        - If 'ignore', then invalid parsing will return the input.
    downcast : {'integer', 'signed', 'unsigned', 'float'}, default None
        If not None, and if the data has been successfully cast to a
        numerical dtype (or if the data was numeric to begin with),
        downcast that resulting data to the smallest numerical dtype
        possible according to the following rules:
        - 'integer' or 'signed': smallest signed int dtype (min.: np.int8)
        - 'unsigned': smallest unsigned int dtype (min.: np.uint8)
        - 'float': smallest float dtype (min.: np.float32)
        As this behaviour is separate from the core conversion to
        numeric values, any errors raised during the downcasting
        will be surfaced regardless of the value of the 'errors' input.
        In addition, downcasting will only occur if the size
        of the resulting data's dtype is strictly larger than
        the dtype it is to be cast to, so if none of the dtypes
        checked satisfy that specification, no downcasting will be
        performed on the data.
    Returns
    -------
    ret
        Numeric if parsing succeeded.
        Return type depends on input.  Series if Series, otherwise ndarray.
    See Also
    --------
    DataFrame.astype : Cast argument to a specified dtype.
    to_datetime : Convert argument to datetime.
    to_timedelta : Convert argument to timedelta.
    numpy.ndarray.astype : Cast a numpy array to a specified type.
    DataFrame.convert_dtypes : Convert dtypes.
    Examples
    --------
    Take separate series and convert to numeric, coercing when told to
    >>> s = pd.Series(['1.0', '2', -3])
    >>> pd.to_numeric(s)
    0    1.0
    1    2.0
    2   -3.0
    dtype: float64
    >>> pd.to_numeric(s, downcast='float')
    0    1.0
    1    2.0
    2   -3.0
    dtype: float32
    >>> pd.to_numeric(s, downcast='signed')
    0    1
    1    2
    2   -3
    dtype: int8
    >>> s = pd.Series(['apple', '1.0', '2', -3])
    >>> pd.to_numeric(s, errors='ignore')
    0    apple
    1      1.0
    2        2
    3       -3
    dtype: object
    >>> pd.to_numeric(s, errors='coerce')
    0    NaN
    1    1.0
    2    2.0
    3   -3.0
    dtype: float64
    Downcasting of nullable integer and floating dtypes is supported:
    >>> s = pd.Series([1, 2, 3], dtype="Int64")
    >>> pd.to_numeric(s, downcast="integer")
    0    1
    1    2
    2    3
    dtype: Int8
    >>> s = pd.Series([1.0, 2.1, 3.0], dtype="Float64")
    >>> pd.to_numeric(s, downcast="float")
    0    1.0
    1    2.1
    2    3.0
    dtype: Float32
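
The interaction between `errors` and `downcast` is easier to see on small inputs. The following is only a sketch with invented values. It shows that `downcast` picks the smallest dtype that can hold every value, so a single large number changes the result.

```python
import pandas as pd

# Strings that cannot be parsed become NaN, which forces a float dtype.
mixed = pd.Series(["1", "2.5", "oops"])
coerced = pd.to_numeric(mixed, errors="coerce")
print(coerced)

# Downcasting picks the smallest dtype that still holds every value.
small_ints = pd.to_numeric(pd.Series([1, 2, 3]), downcast="integer")
small_uints = pd.to_numeric(pd.Series([1, 2, 300]), downcast="unsigned")
print(small_ints.dtype, small_uints.dtype)
```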



Function52

to_pickle(obj: Any, filepath_or_buffer: Union[ForwardRef('PathLike[str]'), str, IO[~AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], compression: Union[str, Dict[str, Any], NoneType] = 'infer', protocol: int = 5, storage_options: Union[Dict[str, Any], NoneType] = None)


Help on function to_pickle in module pandas.io.pickle:
to_pickle(obj: Any, filepath_or_buffer: Union[ForwardRef('PathLike[str]'), str, IO[~AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], compression: Union[str, Dict[str, Any], NoneType] = 'infer', protocol: int = 5, storage_options: Union[Dict[str, Any], NoneType] = None)
    Pickle (serialize) object to file.
    Parameters
    ----------
    obj : any object
        Any python object.
    filepath_or_buffer : str, path object or file-like object
        File path, URL, or buffer where the pickled object will be stored.
        .. versionchanged:: 1.0.0
           Accept URL. URL has to be of S3 or GCS.
    compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
        If 'infer' and 'filepath_or_buffer' is path-like, then detect compression
        from the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise
        no compression). If 'infer' and 'filepath_or_buffer' is not path-like,
        then use None (= no compression).
    protocol : int
        Int which indicates which protocol should be used by the pickler,
        default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible
        values for this parameter depend on the version of Python. For Python
        2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a valid value.
        For Python >= 3.4, 4 is a valid value. A negative value for the
        protocol parameter is equivalent to setting its value to
        HIGHEST_PROTOCOL.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
        .. [1] https://docs.python.org/3/library/pickle.html
    See Also
    --------
    read_pickle : Load pickled pandas object (or any object) from file.
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_sql : Write DataFrame to a SQL database.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    Examples
    --------
    >>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
    >>> original_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> pd.to_pickle(original_df, "./dummy.pkl")
    >>> unpickled_df = pd.read_pickle("./dummy.pkl")
    >>> unpickled_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> import os
    >>> os.remove("./dummy.pkl")
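
To illustrate `compression='infer'`, the sketch below writes to a path ending in `.gz` inside a temporary directory and checks that the file is a gzip stream. The file name is an arbitrary choice for this example.

```python
import os
import tempfile

import pandas as pd

frame = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})

with tempfile.TemporaryDirectory() as tmp_dir:
    # The ".gz" extension makes compression="infer" choose gzip.
    path = os.path.join(tmp_dir, "frame.pkl.gz")  # hypothetical file name
    pd.to_pickle(frame, path)
    # gzip streams start with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as fh:
        magic = fh.read(2)
    restored = pd.read_pickle(path)

print(restored.equals(frame))
```

Unpickling can execute arbitrary code, so `read_pickle` should only be used on files from a trusted source.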


Function53

to_timedelta(arg, unit=None, errors='raise')

Help on function to_timedelta in module pandas.core.tools.timedeltas:
to_timedelta(arg, unit=None, errors='raise')
    Convert argument to timedelta.
    Timedeltas are absolute differences in times, expressed in different
    units (e.g. days, hours, minutes, seconds). This method converts
    an argument from a recognized timedelta format / value into
    a Timedelta type.
    Parameters
    ----------
    arg : str, timedelta, list-like or Series
        The data to be converted to timedelta.
        .. deprecated:: 1.2
            Strings with units 'M', 'Y' and 'y' do not represent
            unambiguous timedelta values and will be removed in a future version
    unit : str, optional
        Denotes the unit of the arg for numeric `arg`. Defaults to ``"ns"``.
        Possible values:
        * 'W'
        * 'D' / 'days' / 'day'
        * 'hours' / 'hour' / 'hr' / 'h'
        * 'm' / 'minute' / 'min' / 'minutes' / 'T'
        * 'S' / 'seconds' / 'sec' / 'second'
        * 'ms' / 'milliseconds' / 'millisecond' / 'milli' / 'millis' / 'L'
        * 'us' / 'microseconds' / 'microsecond' / 'micro' / 'micros' / 'U'
        * 'ns' / 'nanoseconds' / 'nano' / 'nanos' / 'nanosecond' / 'N'
        .. versionchanged:: 1.1.0
           Must not be specified when `arg` contains strings and
           ``errors="raise"``.
    errors : {'ignore', 'raise', 'coerce'}, default 'raise'
        - If 'raise', then invalid parsing will raise an exception.
        - If 'coerce', then invalid parsing will be set as NaT.
        - If 'ignore', then invalid parsing will return the input.
    Returns
    -------
    timedelta64 or numpy.array of timedelta64
        Output type returned if parsing succeeded.
    See Also
    --------
    DataFrame.astype : Cast argument to a specified dtype.
    to_datetime : Convert argument to datetime.
    convert_dtypes : Convert dtypes.
    Notes
    -----
    If the precision is higher than nanoseconds, the precision of the duration is
    truncated to nanoseconds for string inputs.
    Examples
    --------
    Parsing a single string to a Timedelta:
    >>> pd.to_timedelta('1 days 06:05:01.00003')
    Timedelta('1 days 06:05:01.000030')
    >>> pd.to_timedelta('15.5us')
    Timedelta('0 days 00:00:00.000015500')
    Parsing a list or array of strings:
    >>> pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan'])
    TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT],
                   dtype='timedelta64[ns]', freq=None)
    Converting numbers by specifying the `unit` keyword argument:
    >>> pd.to_timedelta(np.arange(5), unit='s')
    TimedeltaIndex(['0 days 00:00:00', '0 days 00:00:01', '0 days 00:00:02',
                    '0 days 00:00:03', '0 days 00:00:04'],
                   dtype='timedelta64[ns]', freq=None)
    >>> pd.to_timedelta(np.arange(5), unit='d')
    TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'],
                   dtype='timedelta64[ns]', freq=None)
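
A short sketch of the three input styles described above: a string, numbers with a `unit`, and bad input handled with `errors='coerce'`. The values are invented. Lowercase `"s"` is used for seconds because newer pandas releases deprecate some of the single-letter uppercase aliases.

```python
import pandas as pd

# A single string gives a Timedelta scalar.
single = pd.to_timedelta("1 days 06:05:01")

# Numbers need a unit; here each value is a count of seconds.
seconds = pd.to_timedelta([30, 60, 90], unit="s")

# errors="coerce" replaces unparseable entries with NaT.
coerced = pd.to_timedelta(["1 day", "bogus"], errors="coerce")
print(single, seconds, coerced, sep="\n")
```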



Function54

unique(values)

Help on function unique in module pandas.core.algorithms:
unique(values)
    Hash table-based unique. Uniques are returned in order
    of appearance. This does NOT sort.
    Significantly faster than numpy.unique for long enough sequences.
    Includes NA values.
    Parameters
    ----------
    values : 1d array-like
    Returns
    -------
    numpy.ndarray or ExtensionArray
        The return can be:
        * Index : when the input is an Index
        * Categorical : when the input is a Categorical dtype
        * ndarray : when the input is a Series/ndarray
        Return numpy.ndarray or ExtensionArray.
    See Also
    --------
    Index.unique : Return unique values from an Index.
    Series.unique : Return unique values of Series object.
    Examples
    --------
    >>> pd.unique(pd.Series([2, 1, 3, 3]))
    array([2, 1, 3])
    >>> pd.unique(pd.Series([2] + [1] * 5))
    array([2, 1])
    >>> pd.unique(pd.Series([pd.Timestamp("20160101"), pd.Timestamp("20160101")]))
    array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
    >>> pd.unique(
    ...     pd.Series(
    ...         [
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...         ]
    ...     )
    ... )
    <DatetimeArray>
    ['2016-01-01 00:00:00-05:00']
    Length: 1, dtype: datetime64[ns, US/Eastern]
    >>> pd.unique(
    ...     pd.Index(
    ...         [
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...         ]
    ...     )
    ... )
    DatetimeIndex(['2016-01-01 00:00:00-05:00'],
            dtype='datetime64[ns, US/Eastern]',
            freq=None)
    >>> pd.unique(list("baabc"))
    array(['b', 'a', 'c'], dtype=object)
    An unordered Categorical will return categories in the
    order of appearance.
    >>> pd.unique(pd.Series(pd.Categorical(list("baabc"))))
    ['b', 'a', 'c']
    Categories (3, object): ['a', 'b', 'c']
    >>> pd.unique(pd.Series(pd.Categorical(list("baabc"), categories=list("abc"))))
    ['b', 'a', 'c']
    Categories (3, object): ['a', 'b', 'c']
    An ordered Categorical preserves the category ordering.
    >>> pd.unique(
    ...     pd.Series(
    ...         pd.Categorical(list("baabc"), categories=list("abc"), ordered=True)
    ...     )
    ... )
    ['b', 'a', 'c']
    Categories (3, object): ['a' < 'b' < 'c']
    An array of tuples
    >>> pd.unique([("a", "b"), ("b", "a"), ("a", "c"), ("b", "a")])
    array([('a', 'b'), ('b', 'a'), ('a', 'c')], dtype=object)
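
The "does NOT sort" remark is the main practical difference from `numpy.unique`, and the sketch below puts the two side by side. The input is wrapped in a `Series` because newer pandas releases warn about plain lists being passed to `pd.unique`.

```python
import numpy as np
import pandas as pd

values = pd.Series([3, 1, 3, 2, 1])

# pd.unique keeps first-appearance order; np.unique returns sorted values.
by_appearance = pd.unique(values)
sorted_values = np.unique(values.to_numpy())
print(by_appearance, sorted_values)

# Missing values are kept as a distinct entry.
with_missing = pd.unique(pd.Series([1.0, np.nan, 1.0, np.nan]))
print(with_missing)
```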




Function55

value_counts(values, sort: 'bool' = True, ascending: 'bool' = False, normalize: 'bool' = False, bins=None, dropna: 'bool' = True) -> 'Series'



Help on function value_counts in module pandas.core.algorithms:
value_counts(values, sort: 'bool' = True, ascending: 'bool' = False, normalize: 'bool' = False, bins=None, dropna: 'bool' = True) -> 'Series'
    Compute a histogram of the counts of non-null values.
    Parameters
    ----------
    values : ndarray (1-d)
    sort : bool, default True
        Sort by values
    ascending : bool, default False
        Sort in ascending order
    normalize: bool, default False
        If True then compute a relative histogram
    bins : integer, optional
        Rather than count values, group them into half-open bins,
        convenience for pd.cut, only works with numeric data
    dropna : bool, default True
        Don't include counts of NaN
    Returns
    -------
    Series
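
The help text has no examples, so here is a small sketch of the options. It calls the `Series.value_counts` method rather than the top-level function, because recent pandas releases deprecate `pd.value_counts`. The method accepts the same `sort`, `ascending`, `normalize`, `bins` and `dropna` options. The sample data is invented.

```python
import pandas as pd

letters = pd.Series(["a", "b", "a", None, "a"])

counts = letters.value_counts()                 # missing values dropped
shares = letters.value_counts(normalize=True)   # relative frequencies
with_na = letters.value_counts(dropna=False)    # missing values counted too

# bins= groups numeric data into equal-width intervals before counting.
binned = pd.Series([1, 2, 3, 4, 5, 6]).value_counts(bins=2)
print(counts, shares, with_na, binned, sep="\n\n")
```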


Function56

wide_to_long(df: 'DataFrame', stubnames, i, j, sep: 'str' = '', suffix: 'str' = '\\d+') -> 'DataFrame'

Help on function wide_to_long in module pandas.core.reshape.melt:
wide_to_long(df: 'DataFrame', stubnames, i, j, sep: 'str' = '', suffix: 'str' = '\\d+') -> 'DataFrame'
    Wide panel to long format. Less flexible but more user-friendly than melt.
    With stubnames ['A', 'B'], this function expects to find one or more
    group of columns with format
    A-suffix1, A-suffix2,..., B-suffix1, B-suffix2,...
    You specify what you want to call this suffix in the resulting long format
    with `j` (for example `j='year'`)
    Each row of these wide variables are assumed to be uniquely identified by
    `i` (can be a single column name or a list of column names)
    All remaining variables in the data frame are left intact.
    Parameters
    ----------
    df : DataFrame
        The wide-format DataFrame.
    stubnames : str or list-like
        The stub name(s). The wide format variables are assumed to
        start with the stub names.
    i : str or list-like
        Column(s) to use as id variable(s).
    j : str
        The name of the sub-observation variable. What you wish to name your
        suffix in the long format.
    sep : str, default ""
        A character indicating the separation of the variable names
        in the wide format, to be stripped from the names in the long format.
        For example, if your column names are A-suffix1, A-suffix2, you
        can strip the hyphen by specifying `sep='-'`.
    suffix : str, default '\\d+'
        A regular expression capturing the wanted suffixes. '\\d+' captures
        numeric suffixes. Suffixes with no numbers could be specified with the
        negated character class '\\D+'. You can also further disambiguate
        suffixes, for example, if your wide variables are of the form A-one,
        B-two,.., and you have an unrelated column A-rating, you can ignore the
        last one by specifying `suffix='(!?one|two)'`. When all suffixes are
        numeric, they are cast to int64/float64.
    Returns
    -------
    DataFrame
        A DataFrame that contains each stub name as a variable, with new index
        (i, j).
    See Also
    --------
    melt : Unpivot a DataFrame from wide to long format, optionally leaving
        identifiers set.
    pivot : Create a spreadsheet-style pivot table as a DataFrame.
    DataFrame.pivot : Pivot without aggregation that can handle
        non-numeric data.
    DataFrame.pivot_table : Generalization of pivot that can handle
        duplicate values for one index/column pair.
    DataFrame.unstack : Pivot based on the index values instead of a
        column.
    Notes
    -----
    All extra variables are left untouched. This simply uses
    `pandas.melt` under the hood, but is hard-coded to "do the right thing"
    in a typical case.
    Examples
    --------
    >>> np.random.seed(123)
    >>> df = pd.DataFrame({"A1970" : {0 : "a", 1 : "b", 2 : "c"},
    ...                    "A1980" : {0 : "d", 1 : "e", 2 : "f"},
    ...                    "B1970" : {0 : 2.5, 1 : 1.2, 2 : .7},
    ...                    "B1980" : {0 : 3.2, 1 : 1.3, 2 : .1},
    ...                    "X"     : dict(zip(range(3), np.random.randn(3)))
    ...                   })
    >>> df["id"] = df.index
    >>> df
      A1970 A1980  B1970  B1980         X  id
    0     a     d    2.5    3.2 -1.085631   0
    1     b     e    1.2    1.3  0.997345   1
    2     c     f    0.7    0.1  0.282978   2
    >>> pd.wide_to_long(df, ["A", "B"], i="id", j="year")
    ... # doctest: +NORMALIZE_WHITESPACE
                    X  A    B
    id year
    0  1970 -1.085631  a  2.5
    1  1970  0.997345  b  1.2
    2  1970  0.282978  c  0.7
    0  1980 -1.085631  d  3.2
    1  1980  0.997345  e  1.3
    2  1980  0.282978  f  0.1
    With multiple id columns
    >>> df = pd.DataFrame({
    ...     'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    ...     'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    ...     'ht1': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    ...     'ht2': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
    ... })
    >>> df
       famid  birth  ht1  ht2
    0      1      1  2.8  3.4
    1      1      2  2.9  3.8
    2      1      3  2.2  2.9
    3      2      1  2.0  3.2
    4      2      2  1.8  2.8
    5      2      3  1.9  2.4
    6      3      1  2.2  3.3
    7      3      2  2.3  3.4
    8      3      3  2.1  2.9
    >>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age')
    >>> l
    ... # doctest: +NORMALIZE_WHITESPACE
                      ht
    famid birth age
    1     1     1    2.8
                2    3.4
          2     1    2.9
                2    3.8
          3     1    2.2
                2    2.9
    2     1     1    2.0
                2    3.2
          2     1    1.8
                2    2.8
          3     1    1.9
                2    2.4
    3     1     1    2.2
                2    3.3
          2     1    2.3
                2    3.4
          3     1    2.1
                2    2.9
    Going from long back to wide just takes some creative use of `unstack`
    >>> w = l.unstack()
    >>> w.columns = w.columns.map('{0[0]}{0[1]}'.format)
    >>> w.reset_index()
       famid  birth  ht1  ht2
    0      1      1  2.8  3.4
    1      1      2  2.9  3.8
    2      1      3  2.2  2.9
    3      2      1  2.0  3.2
    4      2      2  1.8  2.8
    5      2      3  1.9  2.4
    6      3      1  2.2  3.3
    7      3      2  2.3  3.4
    8      3      3  2.1  2.9
    Less wieldy column names are also handled
    >>> np.random.seed(0)
    >>> df = pd.DataFrame({'A(weekly)-2010': np.random.rand(3),
    ...                    'A(weekly)-2011': np.random.rand(3),
    ...                    'B(weekly)-2010': np.random.rand(3),
    ...                    'B(weekly)-2011': np.random.rand(3),
    ...                    'X' : np.random.randint(3, size=3)})
    >>> df['id'] = df.index
    >>> df # doctest: +NORMALIZE_WHITESPACE, +ELLIPSIS
       A(weekly)-2010  A(weekly)-2011  B(weekly)-2010  B(weekly)-2011  X  id
    0        0.548814        0.544883        0.437587        0.383442  0   0
    1        0.715189        0.423655        0.891773        0.791725  1   1
    2        0.602763        0.645894        0.963663        0.528895  1   2
    >>> pd.wide_to_long(df, ['A(weekly)', 'B(weekly)'], i='id',
    ...                 j='year', sep='-')
    ... # doctest: +NORMALIZE_WHITESPACE
             X  A(weekly)  B(weekly)
    id year
    0  2010  0   0.548814   0.437587
    1  2010  1   0.715189   0.891773
    2  2010  1   0.602763   0.963663
    0  2011  0   0.544883   0.383442
    1  2011  1   0.423655   0.791725
    2  2011  1   0.645894   0.528895
    If we have many columns, we could also use a regex to find our
    stubnames and pass that list on to wide_to_long
    >>> stubnames = sorted(
    ...     set([match[0] for match in df.columns.str.findall(
    ...         r'[A-B]\(.*\)').values if match != []])
    ... )
    >>> list(stubnames)
    ['A(weekly)', 'B(weekly)']
    All of the above examples have integers as suffixes. It is possible to
    have non-integers as suffixes.
    >>> df = pd.DataFrame({
    ...     'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    ...     'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    ...     'ht_one': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    ...     'ht_two': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
    ... })
    >>> df
       famid  birth  ht_one  ht_two
    0      1      1     2.8     3.4
    1      1      2     2.9     3.8
    2      1      3     2.2     2.9
    3      2      1     2.0     3.2
    4      2      2     1.8     2.8
    5      2      3     1.9     2.4
    6      3      1     2.2     3.3
    7      3      2     2.3     3.4
    8      3      3     2.1     2.9
    >>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age',
    ...                     sep='_', suffix=r'\w+')
    >>> l
    ... # doctest: +NORMALIZE_WHITESPACE
                      ht
    famid birth age
    1     1     one  2.8
                two  3.4
          2     one  2.9
                two  3.8
          3     one  2.2
                two  2.9
    2     1     one  2.0
                two  3.2
          2     one  1.8
                two  2.8
          3     one  1.9
                two  2.4
    3     1     one  2.2
                two  3.3
          2     one  2.3
                two  3.4
          3     one  2.1
                two  2.9
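The docstring examples above can be condensed into one short round trip. The sketch below uses a hypothetical three-row frame (not the data from the docstring). It reshapes the frame with `wide_to_long` using a non-integer suffix, then goes back to wide format with `unstack`. The variable names `long_df` and `wide_df` are illustrative only.

```python
import pandas as pd

# Hypothetical toy data: one row per (famid, birth) pair,
# with heights stored in the wide columns ht_one and ht_two.
df = pd.DataFrame({
    "famid": [1, 1, 2],
    "birth": [1, 2, 1],
    "ht_one": [2.8, 2.9, 2.0],
    "ht_two": [3.4, 3.8, 3.2],
})

# Wide -> long: the text after "ht_" becomes the new index level "age".
# suffix=r"\w+" is needed because the default suffix only matches digits.
long_df = pd.wide_to_long(
    df, stubnames="ht", i=["famid", "birth"], j="age", sep="_", suffix=r"\w+"
)

# Long -> wide: move "age" back into the columns and rebuild the names.
wide_df = long_df["ht"].unstack("age")
wide_df.columns = [f"ht_{age}" for age in wide_df.columns]
wide_df = wide_df.reset_index()

print(long_df)
print(wide_df)
```

Selecting the single `ht` column before calling `unstack` keeps the resulting columns flat, so no format-string mapping over a column MultiIndex is needed.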



The 12 pandas submodules in turn contain 310 names (including classes, methods, and submodules)

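import pandas as pd

# Public (non-underscore) names exported by the top-level pandas namespace.
funcs = [_ for _ in dir(pd) if not _.startswith('_')]
# Reference types used for classification: class, function, module.
types = type(pd.DataFrame), type(pd.array), type(pd)
Names = 'Type','Function','Module','Other'
Types = {}
count = 0
for f in funcs:
    # Classify each name by the type of the object it refers to.
    t = type(getattr(pd, f))
    t = Names[-1 if t not in types else types.index(t)]
    Types[t] = Types.get(t,[])+[f]
for j,n in enumerate(Types['Module'],1):
    print(f"\n{j}:【{n}】")
    # Public names inside the submodule.
    fun = [_ for _ in dir(getattr(pd, n)) if not _.startswith('_')]
    count += len(fun)
    for i,f in enumerate(fun,1):
        # Five names per row, each padded to 18 characters.
        print(f'{f:18} ',end='' if i%5 or i==len(fun) else '\n')
    print("\n小计:",len(fun))  # 小计 = subtotal
print("合计:",count)  # 合计 = grand total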


1:【api】

extensions         indexers           types              

小计: 3

2:【arrays】

ArrowStringArray   BooleanArray       Categorical        DatetimeArray      FloatingArray      

IntegerArray       IntervalArray      PandasArray        PeriodArray        SparseArray        

StringArray        TimedeltaArray      

小计: 12

3:【compat】

F                  IS64               PY310              PY38               PY39                

PYPY               chainmap           get_lzma_file      import_lzma        is_numpy_dev        

is_platform_arm    is_platform_linux  is_platform_little_endian is_platform_mac    is_platform_windows  

np_array_datetime64_compat np_datetime64_compat np_version_under1p18 np_version_under1p19 np_version_under1p20  

numpy           pa_version_under1p0 pa_version_under2p0 pa_version_under3p0 pa_version_under4p0  

pickle_compat      platform           pyarrow            set_function_name  sys                

warnings            

小计: 31

4:【core】

accessor           aggregation        algorithms         api                apply              

array_algos        arraylike          arrays             base               common              

computation        config_init        construction       describe           dtypes              

flags              frame              generic            groupby            indexers            

indexes            indexing           internals          missing            nanops              

ops                reshape            roperator          series             shared_docs        

sorting            strings            tools              util               window              

小计: 35

5:【errors】

AbstractMethodError AccessorRegistrationWarning DtypeWarning       DuplicateLabelError EmptyDataError    

IntCastingNaNError InvalidIndexError  MergeError         NullFrequencyError NumbaUtilError  

OptionError        OutOfBoundsDatetime OutOfBoundsTimedelta ParserError        ParserWarning    

PerformanceWarning UnsortedIndexError UnsupportedFunctionCall  

小计: 18

6:【io】

api                clipboards         common             date_converters    excel              

feather_format     formats            gbq                html               json                

orc                parquet            parsers            pickle             pytables            

sas                spss               sql                stata              xml                

小计: 20

7:【offsets】

BDay               BMonthBegin        BMonthEnd          BQuarterBegin      BQuarterEnd        

BYearBegin         BYearEnd           BaseOffset         BusinessDay        BusinessHour        

BusinessMonthBegin BusinessMonthEnd   CBMonthBegin       CBMonthEnd        CDay

CustomBusinessDay  CustomBusinessHour CustomBusinessMonthBegin CustomBusinessMonthEnd DateOffset          

Day                Easter             FY5253             FY5253Quarter      Hour                

LastWeekOfMonth    Micro              Milli              Minute             MonthBegin          

MonthEnd           Nano               QuarterBegin       QuarterEnd         Second              

SemiMonthBegin     SemiMonthEnd       Tick               Week               WeekOfMonth        

YearBegin          YearEnd            

小计: 42

8:【pandas】

BooleanDtype       Categorical        CategoricalDtype   CategoricalIndex   DataFrame          

DateOffset         DatetimeIndex      DatetimeTZDtype    ExcelFile          ExcelWriter        

Flags              Float32Dtype       Float64Dtype       Float64Index       Grouper            

HDFStore           Index              IndexSlice         Int16Dtype         Int32Dtype          

Int64Dtype         Int64Index         Int8Dtype          Interval           IntervalDtype      

IntervalIndex      MultiIndex         NA                 NaT                NamedAgg            

Period             PeriodDtype        PeriodIndex        RangeIndex         Series              

SparseDtype        StringDtype        Timedelta          TimedeltaIndex     Timestamp          

UInt16Dtype        UInt32Dtype        UInt64Dtype        UInt64Index        UInt8Dtype          

api                array              arrays             bdate_range        compat              

concat             core               crosstab           cut                date_range          

describe_option    errors             eval               factorize          get_dummies        

get_option         infer_freq         interval_range     io                 isna                

isnull             json_normalize     lreshape           melt               merge              

merge_asof         merge_ordered      notna              notnull            offsets            

option_context     options            pandas             period_range       pivot              

pivot_table        plotting           qcut               read_clipboard     read_csv            

read_excel         read_feather       read_fwf           read_gbq           read_hdf            

read_html          read_json          read_orc           read_parquet       read_pickle        

read_sas           read_spss          read_sql           read_sql_query     read_sql_table      

read_stata         read_table         read_xml           reset_option       set_eng_float_format  

set_option         show_versions      test               testing            timedelta_range    

to_datetime        to_numeric         to_pickle          to_timedelta       tseries            

unique             util               value_counts       wide_to_long        

小计: 119

9:【plotting】

PlotAccessor       andrews_curves     autocorrelation_plot bootstrap_plot     boxplot

boxplot_frame      boxplot_frame_groupby deregister_matplotlib_converters hist_frame         hist_series

lag_plot           parallel_coordinates plot_params        radviz             register_matplotlib_converters  

scatter_matrix     table              

小计: 17

10:【testing】

assert_extension_array_equal assert_frame_equal assert_index_equal assert_series_equal  

小计: 4

11:【tseries】

api                frequencies        offsets            

小计: 3

12:【util】

Appender           Substitution       cache_readonly     hash_array         hash_pandas_object  

version            

小计: 6

合计: 310



Entry 8 in this list, pandas, is the main module itself:

>>> dir(pd)==dir(pd.pandas)
True
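The survey and the identity check can also be written without `eval`. The following is a minimal sketch, not the author's script. The helpers `public_names` and `submodule_counts` and the variable `alias` are hypothetical names introduced here. The `pandas` attribute is presumably bound because the package imports its own submodules by their full dotted name, so the code treats it as optional rather than guaranteed.

```python
import inspect
import warnings

import pandas as pd


def public_names(obj):
    """Return the names in dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


def submodule_counts(module):
    """Map each public submodule of `module` to its number of public names."""
    counts = {}
    with warnings.catch_warnings():
        # Attribute access on some names may emit deprecation warnings.
        warnings.simplefilter("ignore")
        for name in public_names(module):
            member = getattr(module, name, None)
            if inspect.ismodule(member):
                counts[name] = len(public_names(member))
    return counts


# Assumption: when the attribute exists, pd.pandas is the same module object
# as pd, which is why dir(pd) == dir(pd.pandas) holds.
alias = getattr(pd, "pandas", None)
print(alias is None or alias is pd)

counts = submodule_counts(pd)
print(len(counts), sum(counts.values()))
```

The number of submodules and the totals depend on the installed pandas version, so they need not match the 12 and 310 reported above.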


Expanding the 4th submodule, core, one level further:

import pandas as pd

# Same survey as before, applied to the public names of pd.core.
funcs = [_ for _ in dir(pd.core) if not _.startswith('_')]
types = type(pd.DataFrame), type(pd.array), type(pd)
Names = 'Type','Function','Module','Other'
Types = {}
count = 0
for f in funcs:
    # Classify each name by the type of the object it refers to.
    t = type(getattr(pd.core, f))
    t = Names[-1 if t not in types else types.index(t)]
    Types[t] = Types.get(t,[])+[f]
for j,n in enumerate(Types['Module'],1):
    print(f"\n{j}:【{n}】")
    fun = [_ for _ in dir(getattr(pd.core, n)) if not _.startswith('_')]
    count += len(fun)
    for i,f in enumerate(fun,1):
        print(f'{f:18} ',end='' if i%5 or i==len(fun) else '\n')
    print("\n小计:",len(fun))  # 小计 = subtotal
print("合计:",count)  # 合计 = grand total (the accumulated count)



This turns up another 1299 names:

   1:【accessor】

 

CachedAccessor     DirNamesMixin      PandasDelegate     annotations        delegate_names      
    doc                register_dataframe_accessor register_index_accessor register_series_accessor warnings

   小计: 10

   2:【aggregation】

 

ABCSeries          AggFuncType        Any                Callable           DefaultDict        
    FrameOrSeries      Hashable           Index              Iterable           Sequence            
    SpecificationError TYPE_CHECKING      annotations        com                defaultdict        
    is_dict_like       is_list_like       is_multi_agg_with_relabel maybe_mangle_lambdas normalize_keyword_aggregation  
    partial            reconstruct_func   relabel_result     validate_func_kwargs

   小计: 24

   3:【algorithms】

 

ABCDatetimeArray   ABCExtensionArray  ABCIndex           ABCMultiIndex      ABCRangeIndex      
    ABCSeries          ABCTimedeltaArray  AnyArrayLike       ArrayLike          DtypeObj            
    FrameOrSeriesUnion PandasDtype        Scalar             SelectN            SelectNFrame        
    SelectNSeries      TYPE_CHECKING      Union              algos              annotations        
    cast        checked_add_with_arr construct_1d_object_array_from_listlike dedent        diff  
    doc                duplicated         ensure_float64     ensure_object      ensure_platform_int  
    ensure_wrapped_if_datetimelike extract_array      factorize          factorize_array    final
    get_data_algo      htable             iNaT               infer_dtype_from_array is_array_like      
    is_bool_dtype      is_categorical_dtype is_complex_dtype   is_datetime64_dtype is_extension_array_dtype  
    is_float_dtype     is_integer         is_integer_dtype   is_list_like       is_numeric_dtype    
    is_object_dtype    is_scalar          is_timedelta64_dtype isin               isna                
    lib                mode               na_value_for_dtype needs_i8_conversion np                  
    operator           pandas_dtype       pd_array           quantile           rank                
    safe_sort          sanitize_to_nanoseconds searchsorted       take               take_nd            
    union_with_duplicates unique             unique1d           validate_indices   value_counts        
    value_counts_arraylike warn

     

   小计: 77

   4:【api】

 

BooleanDtype       Categorical        CategoricalDtype   CategoricalIndex   DataFrame          
    DateOffset         DatetimeIndex      DatetimeTZDtype    Flags              Float32Dtype        
    Float64Dtype       Float64Index       Grouper            Index              IndexSlice          
    Int16Dtype         Int32Dtype         Int64Dtype         Int64Index         Int8Dtype          
    Interval           IntervalDtype      IntervalIndex      MultiIndex         NA                  
    NaT                NamedAgg           Period             PeriodDtype        PeriodIndex        
    RangeIndex         Series             StringDtype        Timedelta          TimedeltaIndex      
    Timestamp          UInt16Dtype        UInt32Dtype        UInt64Dtype        UInt64Index        
    UInt8Dtype         array              bdate_range        date_range         factorize          
    interval_range     isna               isnull             notna              notnull            
    period_range       set_eng_float_format timedelta_range    to_datetime        to_numeric  
    to_timedelta       unique             value_counts

   小计: 58

   5:【apply】

ABCDataFrame       ABCNDFrame         ABCSeries          AggFuncType        AggFuncTypeBase    
    AggFuncTypeDict    AggObjType         Any                Apply              Axis                
    DataError          Dict               FrameApply         FrameColumnApply   FrameOrSeries      
    FrameOrSeriesUnion FrameRowApply      GroupByApply       Hashable           Iterator  
    List           NDFrameApply       ResType            ResamplerWindowApply SelectionMixin
    SeriesApply        SpecificationError TYPE_CHECKING      abc                annotations        
    cache_readonly     cast               com                create_series_with_explicit_dtype ensure_wrapped_if_datetimelike  
    frame_apply        inspect            is_dict_like       is_extension_array_dtype is_list_like        
    is_nested_object   is_sequence        lib                np                 option_context      
    pd_array           safe_sort          warnings

   小计: 48

   6:【array_algos】

   masked_reductions  putmask            quantile           replace            take                

   transforms          

   小计: 6

   7:【arraylike】

   Any                OpsMixin           array_ufunc        extract_array      lib                

   maybe_dispatch_ufunc_to_dunder_op np                 operator           roperator          unpack_zerodim_and_defer  

   warnings            

   小计: 11

   8:【arrays】

ArrowStringArray   BaseMaskedArray    BooleanArray       Categorical        DatetimeArray
    ExtensionArray     ExtensionOpsMixin  ExtensionScalarOpsMixin FloatingArray      IntegerArray        
    IntervalArray      PandasArray        PeriodArray        SparseArray        StringArray        
    TimedeltaArray     base               boolean            categorical        datetimelike        
    datetimes          floating           integer            interval           masked              
    numeric            numpy_             period             period_array       sparse              
    string_            string_arrow       timedeltas

 

   小计: 33

   9:【base】

ABCDataFrame       ABCIndex           ABCSeries          AbstractMethodError Any                
    ArrayLike          DataError          DirNamesMixin      Dtype              DtypeObj            
    ExtensionArray     FrameOrSeries      Generic            Hashable           IndexLabel          
    IndexOpsMixin      NoNewAttributesMixin OpsMixin           PYPY           PandasObject  
    SelectionMixin     Shape              SpecificationError TYPE_CHECKING      TypeVar            
    algorithms         annotations        cache_readonly     cast               create_series_with_explicit_dtype  
    doc                duplicated         final              is_categorical_dtype is_dict_like        
    is_extension_array_dtype is_object_dtype    is_scalar          isna               lib                
    nanops             np                 nv                 remove_na_arraylike textwrap            
    unique1d           value_counts

   小计: 47

   10:【common】

 

ABCExtensionArray  ABCIndex           ABCSeries          Any                AnyArrayLike        
    Callable           Collection         Iterable           Iterator           NpDtype            
    Scalar             SettingWithCopyError SettingWithCopyWarning T                  TYPE_CHECKING      
    abc                all_none           all_not_none       annotations        any_none            
    any_not_none       apply_if_callable  asarray_tuplesafe  builtins           cast                
    cast_scalar_indexer consensus_name_attr construct_1d_object_array_from_listlike contextlib         convert_to_list_like  
    count_not_none     defaultdict        flatten            get_callable_name  get_cython_func    
    get_rename_function index_labels_to_array inspect            is_array_like      is_bool_dtype
    is_bool_indexer    is_builtin_func    is_extension_array_dtype is_full_slice      is_integer
    is_null_slice      is_true_slices     isna               iterable_not_string lib                
    maybe_iterable_to_list maybe_make_list    not_none         np         np_version_under1p18  
    partial            pipe               random_state       require_length_match standardize_mapping  
    temp_setattr       warnings

   小计: 62

   11:【computation】

   align              api                check              common             engines            

   eval               expr               expressions        ops                parsing            

   pytables           scope              

   小计: 12

   12:【config_init】

 

cf                 chained_assignment colheader_justify_doc data_manager_doc   float_format_doc    
    is_bool            is_callable        is_instance_factory is_int             is_nonnegative_int  
    is_one_of_factory  is_terminal        is_text            max_cols           max_colwidth_doc    
    os      parquet_engine_doc pc_ambiguous_as_wide_doc pc_chop_threshold_doc pc_colspace_doc
    pc_east_asian_width_doc pc_expand_repr_doc pc_html_border_doc pc_html_use_mathjax_doc pc_large_repr_doc  
    pc_latex_escape    pc_latex_longtable pc_latex_multicolumn pc_latex_multicolumn_format pc_latex_multirow  
    pc_latex_repr_doc  pc_max_categories_doc pc_max_cols_doc    pc_max_info_cols_doc pc_max_info_rows_doc  
    pc_max_rows_doc    pc_max_seq_items   pc_memory_usage_doc pc_min_rows_doc    pc_multi_sparse_doc  
    pc_nb_repr_h_doc   pc_pprint_nest_depth pc_precision_doc   pc_show_dimensions_doc pc_table_schema_doc  
    pc_width_doc       plotting_backend_doc reader_engine_doc  register_converter_cb register_converter_doc  
    register_plotting_backend_cb sql_engine_doc     string_storage_doc styler_max_elements styler_sparse_columns_doc  
    styler_sparse_index_doc table_schema_cb    tc_sim_interactive_doc use_bottleneck_cb  use_bottleneck_doc  
    use_inf_as_na_cb   use_inf_as_na_doc  use_inf_as_null_doc use_numba_cb       use_numba_doc      
    use_numexpr_cb     use_numexpr_doc    warnings           writer_engine_doc

   小计: 69

   13:【construction】

ABCExtensionArray  ABCIndex    ABCPandasArray     ABCRangeIndex      ABCSeries
    Any            AnyArrayLike       ArrayLike          DatetimeTZDtype    Dtype              
    DtypeObj           ExtensionDtype     IntCastingNaNError Sequence       TYPE_CHECKING
    annotations        array              cast               com                construct_1d_arraylike_from_scalar  
    construct_1d_object_array_from_listlike create_series_with_explicit_dtype ensure_wrapped_if_datetimelike extract_array      is_datetime64_ns_dtype  
    is_empty_data      is_extension_array_dtype is_float_dtype     is_integer_dtype   is_list_like
    is_object_dtype    is_timedelta64_ns_dtype isna               lib                ma                  
    maybe_cast_to_datetime maybe_cast_to_integer_array maybe_convert_platform maybe_infer_to_datetimelike maybe_upcast        
    np                 range_to_ndarray   registry           sanitize_array     sanitize_masked_array  
    sanitize_to_nanoseconds warnings

 

   小计: 47

   14:【describe】

 

ABC                Callable           DataFrameDescriber FrameOrSeries      FrameOrSeriesUnion  
    Hashable           NDFrameDescriberAbstract Sequence           SeriesDescriber    TYPE_CHECKING      
    Timestamp          abstractmethod     annotations        cast               concat              
    describe_categorical_1d describe_ndframe   describe_numeric_1d describe_timestamp_1d describe_timestamp_as_categorical_1d  
    format_percentiles is_bool_dtype      is_datetime64_any_dtype is_numeric_dtype   is_timedelta64_dtype  
    np                 refine_percentiles reorder_columns    select_describe_func validate_percentile  
    warnings

   小计: 31

   15:【dtypes】

   api                base               cast               common             concat              

   dtypes             generic            inference          missing            

   小计: 9

   16:【flags】

   Flags              weakref            

   小计: 2

   17:【frame】

AggFuncType        Any                AnyArrayLike       AnyStr             Appender            
    ArrayLike          ArrayManager       Axes               Axis               BaseInfo            
    BlockManager       CachedAccessor     Callable           CategoricalIndex   ColspaceArgType
    CompressionOptions DataFrame          DataFrameInfo      DatetimeArray      DatetimeIndex
    Dtype              ExtensionArray     ExtensionDtype     FilePathOrBuffer   FillnaOptions      
    FloatFormatType    FormattersType     FrameOrSeriesUnion Frequency         Hashable  
    IO                 Index              IndexKeyFunc       IndexLabel         Iterable            
    Iterator           Level              MultiIndex         NDFrame            NpDtype            
    OpsMixin           PeriodIndex        PythonFuncType     Renamer            Scalar              
    Sequence           Series             SparseFrameAccessor StorageOptions     StringIO            
    Substitution       Suffixes           TYPE_CHECKING      TimedeltaArray     ValueKeyFunc        
    abc                algorithms         annotations        arrays_to_mgr      cast                
    check_bool_indexer check_key_length   collections        com                console            
    construct_1d_arraylike_from_scalar construct_2d_arraylike_from_scalar convert_to_index_sliceable dataclasses_to_dicts datetime            
    dedent        deprecate_kwarg    deprecate_nonkeyword_arguments dict_to_mgr      doc
    duplicated      ensure_index       ensure_index_from_sequences ensure_platform_int extract_array
    find_common_type   fmt                functools          generic            get_group_index    
    get_handle         get_option         ibase              import_optional_dependency infer_dtype_from_object  
    infer_dtype_from_scalar invalidate_string_dtypes is_1d_only_ea_dtype is_1d_only_ea_obj  is_bool_dtype      
    is_dataclass       is_datetime64_any_dtype is_dict_like       is_dtype_equal     is_extension_array_dtype  
    is_float           is_float_dtype     is_hashable        is_integer         is_integer_dtype    
    is_iterator        is_list_like       is_object_dtype    is_scalar          is_sequence        
    isna               itertools          lexsort_indexer    lib                libalgos            
    ma                 maybe_box_native   maybe_downcast_to_dtype maybe_droplevels   melt
    mgr_to_mgr         mmap               nanops             nargsort           ndarray_to_mgr      
    nested_data_to_arrays no_default         notna              np                 nv                  
    ops                overload           pandas             pandas_dtype       properties          
    rec_array_to_mgr   reconstruct_func   relabel_result     reorder_arrays     rewrite_axis_style_signature  
    sanitize_array     sanitize_masked_array take_2d_multi      to_arrays      treat_as_nested
    validate_axis_style_args validate_bool_kwarg validate_numeric_casting validate_percentile warnings

   小计: 150

   18:【generic】

ABCDataFrame       ABCSeries          AbstractMethodError Any                AnyStr              
    ArrayManager       Axis               BlockManager       Callable           CompressionOptions  
    DataFrameFormatter DataFrameRenderer  DatetimeIndex      Dtype              DtypeArg  
    DtypeObj           Expanding          ExponentialMovingWindow ExtensionArray     FilePathOrBuffer    
    Flags              FrameOrSeries      Hashable           Index              IndexKeyFunc        
    IndexLabel         InvalidIndexError  JSONSerializable   Level              Manager            
    Mapping            MultiIndex         NDFrame            NpDtype            PandasObject        
    Period             PeriodIndex        RangeIndex         Renamer            Rolling            
    Sequence        SingleArrayManager StorageOptions     T       TYPE_CHECKING
    Tick           TimedeltaConvertibleTypes Timestamp       TimestampConvertibleTypes ValueKeyFunc
    Window             algos              align_method_FRAME annotations        arraylike          
    bool_t             cast               collections        com                concat              
    config             create_series_with_explicit_dtype describe_ndframe   doc                ensure_index        
    ensure_object      ensure_platform_int ensure_str         extract_array      final              
    find_valid_index   fmt                functools          gc                 get_indexer_indexer  
    ibase              import_optional_dependency indexing           is_bool            is_bool_dtype      
    is_datetime64_any_dtype is_datetime64tz_dtype is_dict_like       is_dtype_equal     is_extension_array_dtype  
    is_float           is_hashable        is_list_like       is_nested_list_like is_number          
    is_numeric_dtype   is_object_dtype    is_re_compilable   is_scalar   is_timedelta64_dtype
    isna               json               lib                mgr_to_mgr         missing            
    nanops             notna              np                 nv                 operator            
    overload           pandas_dtype       pickle             pprint_thing       re                  
    rewrite_axis_style_signature timedelta          to_offset          validate_ascending validate_bool_kwarg  
    validate_fillna_kwargs warnings           weakref

 

   小计: 118

   19:【groupby】

   DataFrameGroupBy   GroupBy            Grouper            NamedAgg           SeriesGroupBy      

   base               categorical        generic            groupby            grouper            

   numba_             ops                

   小计: 12

   20:【indexers】

 

ABCIndex           ABCSeries          Any                AnyArrayLike       ArrayLike          
    TYPE_CHECKING      annotations        check_array_indexer check_key_length   check_setitem_lengths  
    deprecate_ndim_indexing is_array_like      is_bool_dtype      is_empty_indexer   is_exact_shape_match  
    is_extension_array_dtype is_integer         is_integer_dtype   is_list_like       is_list_like_indexer  
    is_scalar_indexer  is_valid_positional_slice length_of_indexer  maybe_convert_indices np
    unpack_1tuple      validate_indices   warnings

   小计: 28

   21:【indexes】

 

accessors          api                base               category           datetimelike        
    datetimes          extension          frozen             interval           multi              
    numeric            period             range              timedeltas

 

   小计: 14

   22:【indexing】

 

ABCDataFrame       ABCSeries          AbstractMethodError Any                CategoricalIndex    
    Hashable           Index              IndexSlice         IndexingError      IndexingMixin      
    IntervalIndex      InvalidIndexError  MultiIndex         NDFrameIndexerBase Sequence            
    TYPE_CHECKING      algos              annotations        check_array_indexer check_bool_indexer  
    com                concat_compat      convert_from_missing_indexer_tuple convert_missing_indexer convert_to_index_sliceable  
    doc                ensure_index       extract_array      infer_fill_value   is_array_like      
    is_bool_dtype      is_empty_indexer   is_exact_shape_match is_hashable        is_integer
    is_iterator        is_label_like      is_list_like       is_list_like_indexer is_nested_tuple    
    is_numeric_dtype   is_object_dtype    is_scalar          is_sequence        isna                
    item_from_zerodim  length_of_indexer  maybe_convert_ix   need_slice         needs_i8_conversion  
    np                 pd_array           suppress           warnings

   小计: 54

   23:【internals】

 

ArrayManager       Block              BlockManager       DataManager        DatetimeTZBlock    
    ExtensionBlock     NumericBlock       ObjectBlock        SingleArrayManager SingleBlockManager  
    SingleDataManager  api                array_manager      base               blocks              
    concat             concatenate_managers construction       create_block_manager_from_arrays create_block_manager_from_blocks  
    make_block         managers           ops

     

   小计: 23

   24:【missing】

Any                ArrayLike          Axis               F                  NP_METHODS          
    SP_METHODS         TYPE_CHECKING      algos              annotations        cast                
    check_value_size   clean_fill_method  clean_interp_method clean_reindex_fill_method find_valid_index    
    get_fill_func      import_optional_dependency infer_dtype_from   interpolate_1d     interpolate_2d      
    interpolate_2d_with_fill interpolate_array_2d is_array_like      is_numeric_v_string_like is_valid_na_for_dtype  
    isna               lib                mask_missing       na_value_for_dtype needs_i8_conversion  
    np                 partial            wraps

 

   小计: 33

   25:【nanops】

 

    Any                ArrayLike          Dtype              DtypeObj           F
    NaT                NaTType            PeriodDtype        Scalar             Shape
    Timedelta          annotations        bn                 bottleneck_switch  cast
    check_below_min_count disallow           extract_array      functools          get_corr_func
    get_dtype          get_empty_reduction_result get_option         iNaT               import_optional_dependency
    is_any_int_dtype   is_bool_dtype      is_complex         is_datetime64_any_dtype is_float
    is_float_dtype     is_integer         is_integer_dtype   is_numeric_dtype   is_object_dtype
    is_scalar          is_timedelta64_dtype isna               itertools          lib
    make_nancomp       na_accum_func      na_value_for_dtype nanall             nanany
    nanargmax          nanargmin          nancorr            nancov             naneq
    nange              nangt              nankurt            nanle              nanlt
    nanmax             nanmean            nanmedian          nanmin             nanne
    nanpercentile      nanprod            nansem             nanskew            nanstd
    nansum             nanvar             needs_i8_conversion notna              np
    np_percentile_argname operator           pandas_dtype       set_use_bottleneck warnings

   Subtotal: 75

   26:【ops】

 

    ABCDataFrame       ABCSeries          ARITHMETIC_BINOPS  Appender           COMPARISON_BINOPS
    Level              TYPE_CHECKING      add_flex_arithmetic_methods algorithms         align_method_FRAME
    align_method_SERIES annotations        arithmetic_op      array_ops          common
    comp_method_OBJECT_ARRAY comparison_op      dispatch           docstrings         fill_binop
    flex_arith_method_FRAME flex_comp_method_FRAME flex_method_SERIES frame_arith_method_with_reindex get_array_op
    get_op_result_name invalid            invalid_comparison is_array_like      is_list_like
    isna               kleene_and         kleene_or          kleene_xor         logical_op
    make_flex_doc      mask_ops           maybe_dispatch_ufunc_to_dunder_op maybe_prepare_scalar_for_op methods
    missing            np                 operator           radd               rand_
    rdiv               rdivmod            rfloordiv          rmod               rmul
    roperator          ror_               rpow               rsub               rtruediv
    rxor               should_reindex_frame_op unpack_zerodim_and_defer warnings

   Subtotal: 59

   27:【reshape】

    api                concat             melt               merge              pivot
    reshape            tile               util

   Subtotal: 8

   28:【roperator】

    operator           radd               rand_              rdiv               rdivmod
    rfloordiv          rmod               rmul               ror_               rpow
    rsub               rtruediv           rxor

   Subtotal: 13

   29:【series】

 

    ABCDataFrame       AggFuncType        Any                Appender           ArrayLike
    Axis               CachedAccessor     Callable           CategoricalAccessor CategoricalIndex
    CombinedDatetimelikeProperties DatetimeIndex      Dtype              DtypeObj           ExtensionArray
    FillnaOptions      Float64Index       FrameOrSeriesUnion Hashable           IO
    Index              IndexKeyFunc       InvalidIndexError  Iterable           MultiIndex
    NDFrame            NpDtype            PeriodIndex        Sequence           Series
    SeriesApply        SingleArrayManager SingleBlockManager SingleManager      SparseAccessor
    StorageOptions     StringIO           StringMethods      Substitution       TYPE_CHECKING
    TimedeltaIndex     Union              ValueKeyFunc       algorithms         annotations
    base               cast               check_bool_indexer com                convert_dtypes
    create_series_with_explicit_dtype dedent             deprecate_ndim_indexing deprecate_nonkeyword_arguments doc
    ensure_index       ensure_key_mapped  ensure_platform_int ensure_wrapped_if_datetimelike extract_array
    fmt                generic            get_option         get_terminal_size  ibase
    is_bool            is_dict_like       is_empty_data      is_hashable        is_integer
    is_iterator        is_list_like       is_object_dtype    is_scalar          isna
    lib                maybe_box_native   maybe_cast_pointwise_result missing            na_value_for_dtype
    nanops             nargsort           no_default         notna              np
    nv                 ops                overload           pandas             pandas_dtype
    properties         remove_na_arraylike reshape            sanitize_array     to_datetime
    tslibs             unpack_1tuple      validate_all_hashable validate_bool_kwarg validate_numeric_casting
    validate_percentile warnings           weakref

   Subtotal: 103

   30:【shared_docs】

    annotations

   Subtotal: 1

   31:【sorting】

 

    ABCMultiIndex      ABCRangeIndex      Callable           DefaultDict        IndexKeyFunc
    Iterable           Sequence           Shape              TYPE_CHECKING      algos
    annotations        compress_group_index decons_group_index decons_obs_group_ids defaultdict
    ensure_int64       ensure_key_mapped  ensure_platform_int extract_array      get_compressed_ids
    get_flattened_list get_group_index    get_group_index_sorter get_indexer_dict   get_indexer_indexer
    hashtable          indexer_from_factorized is_extension_array_dtype is_int64_overflow_possible isna
    lexsort_indexer    lib                nargminmax         nargsort           np
    unique_label_indices

   Subtotal: 36

   32:【strings】

    BaseStringArrayMethods StringMethods      accessor           base               object_array

   Subtotal: 5

   33:【tools】

    datetimes          numeric            timedeltas         times

   Subtotal: 4

   34:【util】

    hashing            numba_

   Subtotal: 2

   35:【window】

    Expanding          ExpandingGroupby   ExponentialMovingWindow ExponentialMovingWindowGroupby Rolling
    RollingGroupby     Window             common             doc                ewm
    expanding          indexers           numba_             online             rolling

   Subtotal: 15

   Total: 1299
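
The article does not show the script that produced these listings. As a rough sketch, assuming each subtotal counts the names returned by `dir()` that do not start with an underscore, a listing in a similar layout could be generated as follows. The helper names `public_names`, `format_columns`, and `report` are made up for illustration, and the names and counts will differ between pandas versions.

```python
import importlib


def public_names(module_name):
    """Return the sorted names in a module that do not start with an underscore."""
    module = importlib.import_module(module_name)
    return sorted(name for name in dir(module) if not name.startswith("_"))


def format_columns(names, per_row=5, width=18):
    """Lay names out per_row to a line, each padded to `width` plus one space."""
    rows = []
    for start in range(0, len(names), per_row):
        chunk = names[start:start + per_row]
        # Longer names simply push the following columns to the right.
        rows.append("".join(f"{name:<{width}} " for name in chunk).rstrip())
    return "\n".join(rows)


def report(module_name):
    """Build the column listing for one module, followed by its subtotal."""
    names = public_names(module_name)
    return f"{format_columns(names)}\n\nSubtotal: {len(names)}"


if __name__ == "__main__":
    try:
        # Assumes pandas is installed; any importable module name works here.
        print(report("pandas.core.roperator"))
    except ImportError:
        print("pandas is not installed; try report('json') instead")
```

Summing `len(public_names(...))` over every submodule of interest would give a grand total comparable to the one above, under the same counting assumption.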

To be continued...


