S~W: Function46~56
Types['Function'][45:]
['set_eng_float_format', 'show_versions', 'test', 'timedelta_range', 'to_datetime', 'to_numeric', 'to_pickle', 'to_timedelta', 'unique', 'value_counts', 'wide_to_long']
Function46
set_eng_float_format(accuracy: 'int' = 3, use_eng_prefix: 'bool' = False) -> 'None'
Help on function set_eng_float_format in module pandas.io.formats.format:

set_eng_float_format(accuracy: 'int' = 3, use_eng_prefix: 'bool' = False) -> 'None'
    Alter default behavior on how float is formatted in DataFrame.
    Format float in engineering format. By accuracy, we mean the number of
    decimal digits after the floating point.

    See also EngFormatter.
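A minimal sketch of the effect, assuming the setting is applied through the global `display.float_format` option (so it is reset at the end). The sample values and variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([12345.0, 0.00012])

# Switch the global float display to engineering notation, one decimal digit.
pd.set_eng_float_format(accuracy=1, use_eng_prefix=False)
eng_text = s.to_string()

# The setting is global, so restore the default display option afterwards.
pd.reset_option("display.float_format")
plain_text = s.to_string()

print(eng_text)    # exponents are multiples of three (E+03, E-06, ...)
print(plain_text)  # default float formatting again
```

Passing `use_eng_prefix=True` would show SI prefixes such as `k` or `u` in place of the `E±nn` exponent.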
Function47
show_versions(as_json: 'str | bool' = False) -> 'None'
Help on function show_versions in module pandas.util._print_versions:

show_versions(as_json: 'str | bool' = False) -> 'None'
    Provide useful information, important for bug reports.

    It comprises info about hosting operation system, pandas version,
    and versions of other installed relative packages.

    Parameters
    ----------
    as_json : str or bool, default False
        * If False, outputs info in a human readable form to the console.
        * If str, it will be considered as a path to a file.
          Info will be written to that file in JSON format.
        * If True, outputs info in JSON format to the console.
Function48
test(extra_args=None)
Help on function test in module pandas.util._tester:
test(extra_args=None)
Function49
timedelta_range(start=None, end=None, periods: 'Optional[int]' = None, freq=None, name=None, closed=None) -> 'TimedeltaIndex'
Help on function timedelta_range in module pandas.core.indexes.timedeltas:

timedelta_range(start=None, end=None, periods: 'Optional[int]' = None, freq=None, name=None, closed=None) -> 'TimedeltaIndex'
    Return a fixed frequency TimedeltaIndex, with day as the default
    frequency.

    Parameters
    ----------
    start : str or timedelta-like, default None
        Left bound for generating timedeltas.
    end : str or timedelta-like, default None
        Right bound for generating timedeltas.
    periods : int, default None
        Number of periods to generate.
    freq : str or DateOffset, default 'D'
        Frequency strings can have multiples, e.g. '5H'.
    name : str, default None
        Name of the resulting TimedeltaIndex.
    closed : str, default None
        Make the interval closed with respect to the given frequency to
        the 'left', 'right', or both sides (None).

    Returns
    -------
    TimedeltaIndex

    Notes
    -----
    Of the four parameters ``start``, ``end``, ``periods``, and ``freq``,
    exactly three must be specified. If ``freq`` is omitted, the resulting
    ``TimedeltaIndex`` will have ``periods`` linearly spaced elements between
    ``start`` and ``end`` (closed on both sides).

    To learn more about the frequency strings, please see `this link
    <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>`__.

    Examples
    --------
    >>> pd.timedelta_range(start='1 day', periods=4)
    TimedeltaIndex(['1 days', '2 days', '3 days', '4 days'],
                   dtype='timedelta64[ns]', freq='D')

    The ``closed`` parameter specifies which endpoint is included. The default
    behavior is to include both endpoints.

    >>> pd.timedelta_range(start='1 day', periods=4, closed='right')
    TimedeltaIndex(['2 days', '3 days', '4 days'],
                   dtype='timedelta64[ns]', freq='D')

    The ``freq`` parameter specifies the frequency of the TimedeltaIndex.
    Only fixed frequencies can be passed, non-fixed frequencies such as
    'M' (month end) will raise.

    >>> pd.timedelta_range(start='1 day', end='2 days', freq='6H')
    TimedeltaIndex(['1 days 00:00:00', '1 days 06:00:00', '1 days 12:00:00',
                    '1 days 18:00:00', '2 days 00:00:00'],
                   dtype='timedelta64[ns]', freq='6H')

    Specify ``start``, ``end``, and ``periods``; the frequency is generated
    automatically (linearly spaced).

    >>> pd.timedelta_range(start='1 day', end='5 days', periods=4)
    TimedeltaIndex(['1 days 00:00:00', '2 days 08:00:00', '3 days 16:00:00',
                    '5 days 00:00:00'],
                   dtype='timedelta64[ns]', freq=None)
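The two most common ways of combining these parameters can be sketched as follows. The variable names are illustrative; the inputs mirror the docstring examples above.

```python
import pandas as pd

# start + end + periods: freq is left out, so the points are linearly spaced.
spaced = pd.timedelta_range(start="1 day", end="5 days", periods=4)

# start + periods: freq falls back to its default of one day.
daily = pd.timedelta_range(start="1 day", periods=4)

print(spaced)
print(daily)
```

Leaving `freq` out when `start`, `end`, and `periods` are all given avoids having to work out a step that divides the interval evenly.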
Function50
to_datetime(arg: 'DatetimeScalarOrArrayConvertible', errors: 'str' = 'raise', dayfirst: 'bool' = False, yearfirst: 'bool' = False, utc: 'bool | None' = None, format: 'str | None' = None, exact: 'bool' = True, unit: 'str | None' = None, infer_datetime_format: 'bool' = False, origin='unix', cache: 'bool' = True) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'
Help on function to_datetime in module pandas.core.tools.datetimes:

to_datetime(arg: 'DatetimeScalarOrArrayConvertible', errors: 'str' = 'raise', dayfirst: 'bool' = False, yearfirst: 'bool' = False, utc: 'bool | None' = None, format: 'str | None' = None, exact: 'bool' = True, unit: 'str | None' = None, infer_datetime_format: 'bool' = False, origin='unix', cache: 'bool' = True) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'
    Convert argument to datetime.

    Parameters
    ----------
    arg : int, float, str, datetime, list, tuple, 1-d array, Series, DataFrame/dict-like
        The object to convert to a datetime.
    errors : {'ignore', 'raise', 'coerce'}, default 'raise'
        - If 'raise', then invalid parsing will raise an exception.
        - If 'coerce', then invalid parsing will be set as NaT.
        - If 'ignore', then invalid parsing will return the input.
    dayfirst : bool, default False
        Specify a date parse order if `arg` is str or its list-likes.
        If True, parses dates with the day first, eg 10/11/12 is parsed as
        2012-11-10.
        Warning: dayfirst=True is not strict, but will prefer to parse
        with day first (this is a known bug, based on dateutil behavior).
    yearfirst : bool, default False
        Specify a date parse order if `arg` is str or its list-likes.

        - If True parses dates with the year first, eg 10/11/12 is parsed as
          2010-11-12.
        - If both dayfirst and yearfirst are True, yearfirst is preceded (same
          as dateutil).

        Warning: yearfirst=True is not strict, but will prefer to parse
        with year first (this is a known bug, based on dateutil behavior).
    utc : bool, default None
        Return UTC DatetimeIndex if True (converting any tz-aware
        datetime.datetime objects as well).
    format : str, default None
        The strftime to parse time, eg "%d/%m/%Y", note that "%f" will parse
        all the way up to nanoseconds.
        See strftime documentation for more information on choices:
        https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior.
    exact : bool, True by default
        Behaves as:
        - If True, require an exact format match.
        - If False, allow the format to match anywhere in the target string.
    unit : str, default 'ns'
        The unit of the arg (D,s,ms,us,ns) denote the unit, which is an
        integer or float number. This will be based off the origin.
        Example, with unit='ms' and origin='unix' (the default), this
        would calculate the number of milliseconds to the unix epoch start.
    infer_datetime_format : bool, default False
        If True and no `format` is given, attempt to infer the format of the
        datetime strings based on the first non-NaN element,
        and if it can be inferred, switch to a faster method of parsing them.
        In some cases this can increase the parsing speed by ~5-10x.
    origin : scalar, default 'unix'
        Define the reference date. The numeric values would be parsed as number
        of units (defined by `unit`) since this reference date.

        - If 'unix' (or POSIX) time; origin is set to 1970-01-01.
        - If 'julian', unit must be 'D', and origin is set to beginning of
          Julian Calendar. Julian day number 0 is assigned to the day starting
          at noon on January 1, 4713 BC.
        - If Timestamp convertible, origin is set to Timestamp identified by
          origin.
    cache : bool, default True
        If True, use a cache of unique, converted dates to apply the datetime
        conversion. May produce significant speed-up when parsing duplicate
        date strings, especially ones with timezone offsets. The cache is only
        used when there are at least 50 values. The presence of out-of-bounds
        values will render the cache unusable and may slow down parsing.

        .. versionchanged:: 0.25.0
            - changed default value from False to True.

    Returns
    -------
    datetime
        If parsing succeeded.
        Return type depends on input:

        - list-like: DatetimeIndex
        - Series: Series of datetime64 dtype
        - scalar: Timestamp

        In case when it is not possible to return designated types (e.g. when
        any element of input is before Timestamp.min or after Timestamp.max)
        return will have datetime.datetime type (or corresponding
        array/Series).

    See Also
    --------
    DataFrame.astype : Cast argument to a specified dtype.
    to_timedelta : Convert argument to timedelta.
    convert_dtypes : Convert dtypes.

    Examples
    --------
    Assembling a datetime from multiple columns of a DataFrame. The keys can be
    common abbreviations like ['year', 'month', 'day', 'minute', 'second',
    'ms', 'us', 'ns']) or plurals of the same

    >>> df = pd.DataFrame({'year': [2015, 2016],
    ...                    'month': [2, 3],
    ...                    'day': [4, 5]})
    >>> pd.to_datetime(df)
    0   2015-02-04
    1   2016-03-05
    dtype: datetime64[ns]

    If a date does not meet the `timestamp limitations
    <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
    #timeseries-timestamp-limits>`_, passing errors='ignore'
    will return the original input instead of raising any exception.

    Passing errors='coerce' will force an out-of-bounds date to NaT,
    in addition to forcing non-dates (or non-parseable dates) to NaT.

    >>> pd.to_datetime('13000101', format='%Y%m%d', errors='ignore')
    datetime.datetime(1300, 1, 1, 0, 0)
    >>> pd.to_datetime('13000101', format='%Y%m%d', errors='coerce')
    NaT

    Passing infer_datetime_format=True can often-times speedup a parsing
    if its not an ISO8601 format exactly, but in a regular format.

    >>> s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000'] * 1000)
    >>> s.head()
    0    3/11/2000
    1    3/12/2000
    2    3/13/2000
    3    3/11/2000
    4    3/12/2000
    dtype: object

    >>> %timeit pd.to_datetime(s, infer_datetime_format=True)  # doctest: +SKIP
    100 loops, best of 3: 10.4 ms per loop

    >>> %timeit pd.to_datetime(s, infer_datetime_format=False)  # doctest: +SKIP
    1 loop, best of 3: 471 ms per loop

    Using a unix epoch time

    >>> pd.to_datetime(1490195805, unit='s')
    Timestamp('2017-03-22 15:16:45')
    >>> pd.to_datetime(1490195805433502912, unit='ns')
    Timestamp('2017-03-22 15:16:45.433502912')

    .. warning:: For float arg, precision rounding might happen. To prevent
        unexpected behavior use a fixed-width exact type.

    Using a non-unix epoch origin

    >>> pd.to_datetime([1, 2, 3], unit='D',
    ...                origin=pd.Timestamp('1960-01-01'))
    DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'],
                  dtype='datetime64[ns]', freq=None)

    In case input is list-like and the elements of input are of mixed
    timezones, return will have object type Index if utc=False.

    >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'])
    Index([2018-10-26 12:00:00-05:30, 2018-10-26 12:00:00-05:00], dtype='object')

    >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'],
    ...                utc=True)
    DatetimeIndex(['2018-10-26 17:30:00+00:00', '2018-10-26 17:00:00+00:00'],
                  dtype='datetime64[ns, UTC]', freq=None)
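A short sketch of the three parameters that come up most often in practice: `errors`, `format`, and `unit`. The sample strings are made up for illustration, and the epoch value is the one used in the docstring above.

```python
import pandas as pd

raw = ["2021-01-05", "not a date"]

# errors="coerce": entries that cannot be parsed become NaT instead of raising.
coerced = pd.to_datetime(raw, errors="coerce")

# An explicit strftime pattern removes any day/month ambiguity.
explicit = pd.to_datetime("04/02/2015", format="%d/%m/%Y")

# Numeric input is read as a count of `unit` since the origin (unix epoch here).
epoch = pd.to_datetime(1490195805, unit="s")

print(coerced)
print(explicit, epoch)
```

With the default `errors='raise'`, the same `raw` list would raise an exception on its second element.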
Function51
to_numeric(arg, errors='raise', downcast=None)
Help on function to_numeric in module pandas.core.tools.numeric:

to_numeric(arg, errors='raise', downcast=None)
    Convert argument to a numeric type.

    The default return dtype is `float64` or `int64`
    depending on the data supplied. Use the `downcast` parameter
    to obtain other dtypes.

    Please note that precision loss may occur if really large numbers
    are passed in. Due to the internal limitations of `ndarray`, if
    numbers smaller than `-9223372036854775808` (np.iinfo(np.int64).min)
    or larger than `18446744073709551615` (np.iinfo(np.uint64).max) are
    passed in, it is very likely they will be converted to float so that
    they can stored in an `ndarray`. These warnings apply similarly to
    `Series` since it internally leverages `ndarray`.

    Parameters
    ----------
    arg : scalar, list, tuple, 1-d array, or Series
        Argument to be converted.
    errors : {'ignore', 'raise', 'coerce'}, default 'raise'
        - If 'raise', then invalid parsing will raise an exception.
        - If 'coerce', then invalid parsing will be set as NaN.
        - If 'ignore', then invalid parsing will return the input.
    downcast : {'integer', 'signed', 'unsigned', 'float'}, default None
        If not None, and if the data has been successfully cast to a
        numerical dtype (or if the data was numeric to begin with),
        downcast that resulting data to the smallest numerical dtype
        possible according to the following rules:

        - 'integer' or 'signed': smallest signed int dtype (min.: np.int8)
        - 'unsigned': smallest unsigned int dtype (min.: np.uint8)
        - 'float': smallest float dtype (min.: np.float32)

        As this behaviour is separate from the core conversion to
        numeric values, any errors raised during the downcasting
        will be surfaced regardless of the value of the 'errors' input.

        In addition, downcasting will only occur if the size
        of the resulting data's dtype is strictly larger than
        the dtype it is to be cast to, so if none of the dtypes
        checked satisfy that specification, no downcasting will be
        performed on the data.

    Returns
    -------
    ret
        Numeric if parsing succeeded.
        Return type depends on input. Series if Series, otherwise ndarray.

    See Also
    --------
    DataFrame.astype : Cast argument to a specified dtype.
    to_datetime : Convert argument to datetime.
    to_timedelta : Convert argument to timedelta.
    numpy.ndarray.astype : Cast a numpy array to a specified type.
    DataFrame.convert_dtypes : Convert dtypes.

    Examples
    --------
    Take separate series and convert to numeric, coercing when told to

    >>> s = pd.Series(['1.0', '2', -3])
    >>> pd.to_numeric(s)
    0    1.0
    1    2.0
    2   -3.0
    dtype: float64
    >>> pd.to_numeric(s, downcast='float')
    0    1.0
    1    2.0
    2   -3.0
    dtype: float32
    >>> pd.to_numeric(s, downcast='signed')
    0    1
    1    2
    2   -3
    dtype: int8
    >>> s = pd.Series(['apple', '1.0', '2', -3])
    >>> pd.to_numeric(s, errors='ignore')
    0    apple
    1      1.0
    2        2
    3       -3
    dtype: object
    >>> pd.to_numeric(s, errors='coerce')
    0    NaN
    1    1.0
    2    2.0
    3   -3.0
    dtype: float64

    Downcasting of nullable integer and floating dtypes is supported:

    >>> s = pd.Series([1, 2, 3], dtype="Int64")
    >>> pd.to_numeric(s, downcast="integer")
    0    1
    1    2
    2    3
    dtype: Int8
    >>> s = pd.Series([1.0, 2.1, 3.0], dtype="Float64")
    >>> pd.to_numeric(s, downcast="float")
    0    1.0
    1    2.1
    2    3.0
    dtype: Float32
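As a compact illustration, the following sketch cleans a mixed column with `errors='coerce'` and then shrinks a clean one with `downcast`. The input values are taken from the docstring examples, and the variable names are illustrative.

```python
import pandas as pd

mixed = pd.Series(["apple", "1.0", "2", -3])

# errors="coerce": anything that is not a number becomes NaN, so dtype is float64.
coerced = pd.to_numeric(mixed, errors="coerce")

# downcast="signed": pick the smallest signed integer dtype that fits the data.
small = pd.to_numeric(pd.Series(["1.0", "2", -3]), downcast="signed")

print(coerced)
print(small.dtype)
```

Coercing first and downcasting afterwards keeps the two concerns separate, which matches the docstring's note that downcasting errors surface regardless of `errors`.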
Function52
to_pickle(obj: Any, filepath_or_buffer: Union[ForwardRef('PathLike[str]'), str, IO[~AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], compression: Union[str, Dict[str, Any], NoneType] = 'infer', protocol: int = 5, storage_options: Union[Dict[str, Any], NoneType] = None)
Help on function to_pickle in module pandas.io.pickle:

to_pickle(obj: Any, filepath_or_buffer: Union[ForwardRef('PathLike[str]'), str, IO[~AnyStr], io.RawIOBase, io.BufferedIOBase, io.TextIOBase, _io.TextIOWrapper, mmap.mmap], compression: Union[str, Dict[str, Any], NoneType] = 'infer', protocol: int = 5, storage_options: Union[Dict[str, Any], NoneType] = None)
    Pickle (serialize) object to file.

    Parameters
    ----------
    obj : any object
        Any python object.
    filepath_or_buffer : str, path object or file-like object
        File path, URL, or buffer where the pickled object will be stored.

        .. versionchanged:: 1.0.0
           Accept URL. URL has to be of S3 or GCS.

    compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
        If 'infer' and 'path_or_url' is path-like, then detect compression from
        the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no
        compression) If 'infer' and 'path_or_url' is not path-like, then use
        None (= no decompression).
    protocol : int
        Int which indicates which protocol should be used by the pickler,
        default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible
        values for this parameter depend on the version of Python. For Python
        2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a valid value.
        For Python >= 3.4, 4 is a valid value. A negative value for the
        protocol parameter is equivalent to setting its value to
        HIGHEST_PROTOCOL.

    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.

        .. versionadded:: 1.2.0

        .. [1] https://docs.python.org/3/library/pickle.html

    See Also
    --------
    read_pickle : Load pickled pandas object (or any object) from file.
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_sql : Write DataFrame to a SQL database.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.

    Examples
    --------
    >>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
    >>> original_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> pd.to_pickle(original_df, "./dummy.pkl")

    >>> unpickled_df = pd.read_pickle("./dummy.pkl")
    >>> unpickled_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9

    >>> import os
    >>> os.remove("./dummy.pkl")
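The `compression='infer'` default can be seen in a small round trip. This is a sketch that writes to a temporary directory; the file name `dummy.pkl.gz` is an arbitrary choice for illustration.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})

with tempfile.TemporaryDirectory() as tmp:
    # The ".gz" extension makes compression="infer" select gzip.
    path = os.path.join(tmp, "dummy.pkl.gz")
    pd.to_pickle(df, path)
    restored = pd.read_pickle(path)

    # Peek at the first two bytes to confirm the file really is gzip data.
    with open(path, "rb") as fh:
        magic = fh.read(2)

print(restored.equals(df), magic)
```

As with any pickle, `read_pickle` should only be used on files from a trusted source.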
Function53
to_timedelta(arg, unit=None, errors='raise')
Help on function to_timedelta in module pandas.core.tools.timedeltas:

to_timedelta(arg, unit=None, errors='raise')
    Convert argument to timedelta.

    Timedeltas are absolute differences in times, expressed in difference
    units (e.g. days, hours, minutes, seconds). This method converts
    an argument from a recognized timedelta format / value into
    a Timedelta type.

    Parameters
    ----------
    arg : str, timedelta, list-like or Series
        The data to be converted to timedelta.

        .. deprecated:: 1.2
            Strings with units 'M', 'Y' and 'y' do not represent
            unambiguous timedelta values and will be removed in a future version

    unit : str, optional
        Denotes the unit of the arg for numeric `arg`. Defaults to ``"ns"``.

        Possible values:

        * 'W'
        * 'D' / 'days' / 'day'
        * 'hours' / 'hour' / 'hr' / 'h'
        * 'm' / 'minute' / 'min' / 'minutes' / 'T'
        * 'S' / 'seconds' / 'sec' / 'second'
        * 'ms' / 'milliseconds' / 'millisecond' / 'milli' / 'millis' / 'L'
        * 'us' / 'microseconds' / 'microsecond' / 'micro' / 'micros' / 'U'
        * 'ns' / 'nanoseconds' / 'nano' / 'nanos' / 'nanosecond' / 'N'

        .. versionchanged:: 1.1.0

           Must not be specified when `arg` context strings and
           ``errors="raise"``.

    errors : {'ignore', 'raise', 'coerce'}, default 'raise'
        - If 'raise', then invalid parsing will raise an exception.
        - If 'coerce', then invalid parsing will be set as NaT.
        - If 'ignore', then invalid parsing will return the input.

    Returns
    -------
    timedelta64 or numpy.array of timedelta64
        Output type returned if parsing succeeded.

    See Also
    --------
    DataFrame.astype : Cast argument to a specified dtype.
    to_datetime : Convert argument to datetime.
    convert_dtypes : Convert dtypes.

    Notes
    -----
    If the precision is higher than nanoseconds, the precision of the duration is
    truncated to nanoseconds for string inputs.

    Examples
    --------
    Parsing a single string to a Timedelta:

    >>> pd.to_timedelta('1 days 06:05:01.00003')
    Timedelta('1 days 06:05:01.000030')
    >>> pd.to_timedelta('15.5us')
    Timedelta('0 days 00:00:00.000015500')

    Parsing a list or array of strings:

    >>> pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan'])
    TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015500', NaT],
                   dtype='timedelta64[ns]', freq=None)

    Converting numbers by specifying the `unit` keyword argument:

    >>> pd.to_timedelta(np.arange(5), unit='s')
    TimedeltaIndex(['0 days 00:00:00', '0 days 00:00:01', '0 days 00:00:02',
                    '0 days 00:00:03', '0 days 00:00:04'],
                   dtype='timedelta64[ns]', freq=None)
    >>> pd.to_timedelta(np.arange(5), unit='d')
    TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'],
                   dtype='timedelta64[ns]', freq=None)
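A brief sketch of the split between numeric input (where `unit` applies) and string input (where the unit is part of the text). The string `'bogus'` is a made-up invalid value for illustration.

```python
import pandas as pd

# Numeric input: `unit` says what the number counts.
ninety = pd.to_timedelta(90, unit="s")

# String input: units are read from the text; bad entries become NaT with coerce.
parsed = pd.to_timedelta(["1 days 06:05:01", "bogus"], errors="coerce")

print(ninety)
print(parsed)
```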
Function54
unique(values)
Help on function unique in module pandas.core.algorithms:

unique(values)
    Hash table-based unique. Uniques are returned in order
    of appearance. This does NOT sort.

    Significantly faster than numpy.unique for long enough sequences.
    Includes NA values.

    Parameters
    ----------
    values : 1d array-like

    Returns
    -------
    numpy.ndarray or ExtensionArray

        The return can be:

        * Index : when the input is an Index
        * Categorical : when the input is a Categorical dtype
        * ndarray : when the input is a Series/ndarray

        Return numpy.ndarray or ExtensionArray.

    See Also
    --------
    Index.unique : Return unique values from an Index.
    Series.unique : Return unique values of Series object.

    Examples
    --------
    >>> pd.unique(pd.Series([2, 1, 3, 3]))
    array([2, 1, 3])

    >>> pd.unique(pd.Series([2] + [1] * 5))
    array([2, 1])

    >>> pd.unique(pd.Series([pd.Timestamp("20160101"), pd.Timestamp("20160101")]))
    array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')

    >>> pd.unique(
    ...     pd.Series(
    ...         [
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...         ]
    ...     )
    ... )
    <DatetimeArray>
    ['2016-01-01 00:00:00-05:00']
    Length: 1, dtype: datetime64[ns, US/Eastern]

    >>> pd.unique(
    ...     pd.Index(
    ...         [
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...             pd.Timestamp("20160101", tz="US/Eastern"),
    ...         ]
    ...     )
    ... )
    DatetimeIndex(['2016-01-01 00:00:00-05:00'],
            dtype='datetime64[ns, US/Eastern]',
            freq=None)

    >>> pd.unique(list("baabc"))
    array(['b', 'a', 'c'], dtype=object)

    An unordered Categorical will return categories in the
    order of appearance.

    >>> pd.unique(pd.Series(pd.Categorical(list("baabc"))))
    ['b', 'a', 'c']
    Categories (3, object): ['a', 'b', 'c']

    >>> pd.unique(pd.Series(pd.Categorical(list("baabc"), categories=list("abc"))))
    ['b', 'a', 'c']
    Categories (3, object): ['a', 'b', 'c']

    An ordered Categorical preserves the category ordering.

    >>> pd.unique(
    ...     pd.Series(
    ...         pd.Categorical(list("baabc"), categories=list("abc"), ordered=True)
    ...     )
    ... )
    ['b', 'a', 'c']
    Categories (3, object): ['a' < 'b' < 'c']

    An array of tuples

    >>> pd.unique([("a", "b"), ("b", "a"), ("a", "c"), ("b", "a")])
    array([('a', 'b'), ('b', 'a'), ('a', 'c')], dtype=object)
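The "does NOT sort" point is easiest to see next to `numpy.unique`, which does sort. A minimal comparison with illustrative values:

```python
import numpy as np
import pandas as pd

values = pd.Series([2, 1, 3, 3, 1])

in_order = pd.unique(values)                 # order of first appearance
sorted_vals = np.unique(values.to_numpy())   # numpy sorts its result

print(in_order)
print(sorted_vals)
```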
Function55
value_counts(values, sort: 'bool' = True, ascending: 'bool' = False, normalize: 'bool' = False, bins=None, dropna: 'bool' = True) -> 'Series'
Help on function value_counts in module pandas.core.algorithms:

value_counts(values, sort: 'bool' = True, ascending: 'bool' = False, normalize: 'bool' = False, bins=None, dropna: 'bool' = True) -> 'Series'
    Compute a histogram of the counts of non-null values.

    Parameters
    ----------
    values : ndarray (1-d)
    sort : bool, default True
        Sort by values
    ascending : bool, default False
        Sort in ascending order
    normalize: bool, default False
        If True then compute a relative histogram
    bins : integer, optional
        Rather than count values, group them into half-open bins,
        convenience for pd.cut, only works with numeric data
    dropna : bool, default True
        Don't include counts of NaN

    Returns
    -------
    Series
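The docstring has no example, so here is a small sketch of `dropna` and `normalize`. It deliberately uses the `Series.value_counts` method instead of the top-level function, on the assumption that the method form is the more portable choice across pandas versions; it accepts the same `sort`, `ascending`, `normalize`, `bins`, and `dropna` arguments.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", None])

counts = s.value_counts()                 # missing values are dropped by default
with_na = s.value_counts(dropna=False)    # keep a row for the missing value
shares = s.value_counts(normalize=True)   # relative frequencies instead of counts

print(counts)
print(with_na)
print(shares)
```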
Function56
wide_to_long(df: 'DataFrame', stubnames, i, j, sep: 'str' = '', suffix: 'str' = '\\d+') -> 'DataFrame'
Help on function wide_to_long in module pandas.core.reshape.melt:

wide_to_long(df: 'DataFrame', stubnames, i, j, sep: 'str' = '', suffix: 'str' = '\\d+') -> 'DataFrame'
    Wide panel to long format. Less flexible but more user-friendly than melt.

    With stubnames ['A', 'B'], this function expects to find one or more
    group of columns with format
    A-suffix1, A-suffix2,..., B-suffix1, B-suffix2,...
    You specify what you want to call this suffix in the resulting long format
    with `j` (for example `j='year'`)

    Each row of these wide variables are assumed to be uniquely identified by
    `i` (can be a single column name or a list of column names)

    All remaining variables in the data frame are left intact.

    Parameters
    ----------
    df : DataFrame
        The wide-format DataFrame.
    stubnames : str or list-like
        The stub name(s). The wide format variables are assumed to
        start with the stub names.
    i : str or list-like
        Column(s) to use as id variable(s).
    j : str
        The name of the sub-observation variable. What you wish to name your
        suffix in the long format.
    sep : str, default ""
        A character indicating the separation of the variable names
        in the wide format, to be stripped from the names in the long format.
        For example, if your column names are A-suffix1, A-suffix2, you
        can strip the hyphen by specifying `sep='-'`.
    suffix : str, default '\\d+'
        A regular expression capturing the wanted suffixes. '\\d+' captures
        numeric suffixes. Suffixes with no numbers could be specified with the
        negated character class '\\D+'. You can also further disambiguate
        suffixes, for example, if your wide variables are of the form A-one,
        B-two,.., and you have an unrelated column A-rating, you can ignore the
        last one by specifying `suffix='(!?one|two)'`. When all suffixes are
        numeric, they are cast to int64/float64.

    Returns
    -------
    DataFrame
        A DataFrame that contains each stub name as a variable, with new index
        (i, j).

    See Also
    --------
    melt : Unpivot a DataFrame from wide to long format, optionally leaving
        identifiers set.
    pivot : Create a spreadsheet-style pivot table as a DataFrame.
    DataFrame.pivot : Pivot without aggregation that can handle
        non-numeric data.
    DataFrame.pivot_table : Generalization of pivot that can handle
        duplicate values for one index/column pair.
    DataFrame.unstack : Pivot based on the index values instead of a
        column.

    Notes
    -----
    All extra variables are left untouched. This simply uses
    `pandas.melt` under the hood, but is hard-coded to "do the right thing"
    in a typical case.

    Examples
    --------
    >>> np.random.seed(123)
    >>> df = pd.DataFrame({"A1970" : {0 : "a", 1 : "b", 2 : "c"},
    ...                    "A1980" : {0 : "d", 1 : "e", 2 : "f"},
    ...                    "B1970" : {0 : 2.5, 1 : 1.2, 2 : .7},
    ...                    "B1980" : {0 : 3.2, 1 : 1.3, 2 : .1},
    ...                    "X"     : dict(zip(range(3), np.random.randn(3)))
    ...                   })
    >>> df["id"] = df.index
    >>> df
      A1970 A1980  B1970  B1980         X  id
    0     a     d    2.5    3.2 -1.085631   0
    1     b     e    1.2    1.3  0.997345   1
    2     c     f    0.7    0.1  0.282978   2
    >>> pd.wide_to_long(df, ["A", "B"], i="id", j="year")
    ... # doctest: +NORMALIZE_WHITESPACE
                    X  A    B
    id year
    0  1970 -1.085631  a  2.5
    1  1970  0.997345  b  1.2
    2  1970  0.282978  c  0.7
    0  1980 -1.085631  d  3.2
    1  1980  0.997345  e  1.3
    2  1980  0.282978  f  0.1

    With multiple id columns

    >>> df = pd.DataFrame({
    ...     'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    ...     'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    ...     'ht1': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    ...     'ht2': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
    ... })
    >>> df
       famid  birth  ht1  ht2
    0      1      1  2.8  3.4
    1      1      2  2.9  3.8
    2      1      3  2.2  2.9
    3      2      1  2.0  3.2
    4      2      2  1.8  2.8
    5      2      3  1.9  2.4
    6      3      1  2.2  3.3
    7      3      2  2.3  3.4
    8      3      3  2.1  2.9
    >>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age')
    >>> l
    ... # doctest: +NORMALIZE_WHITESPACE
                      ht
    famid birth age
    1     1     1    2.8
                2    3.4
          2     1    2.9
                2    3.8
          3     1    2.2
                2    2.9
    2     1     1    2.0
                2    3.2
          2     1    1.8
                2    2.8
          3     1    1.9
                2    2.4
    3     1     1    2.2
                2    3.3
          2     1    2.3
                2    3.4
          3     1    2.1
                2    2.9

    Going from long back to wide just takes some creative use of `unstack`

    >>> w = l.unstack()
    >>> w.columns = w.columns.map('{0[0]}{0[1]}'.format)
    >>> w.reset_index()
       famid  birth  ht1  ht2
    0      1      1  2.8  3.4
    1      1      2  2.9  3.8
    2      1      3  2.2  2.9
    3      2      1  2.0  3.2
    4      2      2  1.8  2.8
    5      2      3  1.9  2.4
    6      3      1  2.2  3.3
    7      3      2  2.3  3.4
    8      3      3  2.1  2.9

    Less wieldy column names are also handled

    >>> np.random.seed(0)
    >>> df = pd.DataFrame({'A(weekly)-2010': np.random.rand(3),
    ...                    'A(weekly)-2011': np.random.rand(3),
    ...                    'B(weekly)-2010': np.random.rand(3),
    ...                    'B(weekly)-2011': np.random.rand(3),
    ...                    'X' : np.random.randint(3, size=3)})
    >>> df['id'] = df.index
    >>> df # doctest: +NORMALIZE_WHITESPACE, +ELLIPSIS
       A(weekly)-2010  A(weekly)-2011  B(weekly)-2010  B(weekly)-2011  X  id
    0        0.548814        0.544883        0.437587        0.383442  0   0
    1        0.715189        0.423655        0.891773        0.791725  1   1
    2        0.602763        0.645894        0.963663        0.528895  1   2

    >>> pd.wide_to_long(df, ['A(weekly)', 'B(weekly)'], i='id',
    ...                 j='year', sep='-')
    ... # doctest: +NORMALIZE_WHITESPACE
             X  A(weekly)  B(weekly)
    id year
    0  2010  0   0.548814   0.437587
    1  2010  1   0.715189   0.891773
    2  2010  1   0.602763   0.963663
    0  2011  0   0.544883   0.383442
    1  2011  1   0.423655   0.791725
    2  2011  1   0.645894   0.528895

    If we have many columns, we could also use a regex to find our
    stubnames and pass that list on to wide_to_long

    >>> stubnames = sorted(
    ...     set([match[0] for match in df.columns.str.findall(
    ...         r'[A-B]\(.*\)').values if match != []])
    ... )
    >>> list(stubnames)
    ['A(weekly)', 'B(weekly)']

    All of the above examples have integers as suffixes. It is possible to
    have non-integers as suffixes.

    >>> df = pd.DataFrame({
    ...     'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    ...     'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    ...     'ht_one': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    ...     'ht_two': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
    ... })
    >>> df
       famid  birth  ht_one  ht_two
    0      1      1     2.8     3.4
    1      1      2     2.9     3.8
    2      1      3     2.2     2.9
    3      2      1     2.0     3.2
    4      2      2     1.8     2.8
    5      2      3     1.9     2.4
    6      3      1     2.2     3.3
    7      3      2     2.3     3.4
    8      3      3     2.1     2.9

    >>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age',
    ...                     sep='_', suffix=r'\w+')
    >>> l
    ... # doctest: +NORMALIZE_WHITESPACE
                      ht
    famid birth age
    1     1     one  2.8
                two  3.4
          2     one  2.9
                two  3.8
          3     one  2.2
                two  2.9
    2     1     one  2.0
                two  3.2
          2     one  1.8
                two  2.8
          3     one  1.9
                two  2.4
    3     1     one  2.2
                two  3.3
          2     one  2.3
                two  3.4
          3     one  2.1
                two  2.9
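Stripped to its essentials, the reshaping looks like this. It is a sketch on a hypothetical two-row table; the column names `id`, `ht1`, and `ht2` are chosen for illustration only.

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "ht1": [2.8, 2.0], "ht2": [3.4, 3.2]})

# Columns ht1/ht2 share the stub "ht"; their numeric suffixes become the
# new index level named by `j`.
long = pd.wide_to_long(wide, stubnames="ht", i="id", j="age")

print(long)
```

Each `(id, age)` pair now indexes one height value, so `long.loc[(1, 2), "ht"]` reads what used to be column `ht2` of the row with `id` 1.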
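The 12 pandas submodules in turn contain 310 library names (including classes, methods, and submodules):
import pandas as pd

# Public names exported by the top-level pandas namespace.
funcs = [_ for _ in dir(pd) if not _.startswith('_')]
# Reference types: class (type), plain function, module.
types = type(pd.DataFrame), type(pd.array), type(pd)
Names = 'Type','Function','Module','Other'
Types = {}
count = 0
# Bucket every name by the type of the object it refers to.
for f in funcs:
    t = type(getattr(pd, f))
    t = Names[-1 if t not in types else types.index(t)]
    Types[t] = Types.get(t,[])+[f]
# List the public names of each submodule, five per row.
for j,n in enumerate(Types['Module'],1):
    print(f"\n{j}:【{n}】")
    fun = [_ for _ in dir(getattr(pd, n)) if not _.startswith('_')]
    count += len(fun)
    for i,f in enumerate(fun,1):
        print(f'{f:18} ',end='' if i%5 or i==len(fun) else '\n')
    print("\n小计:",len(fun))   # 小计 = subtotal
print("合计:",count)            # 合计 = grand total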
1:【api】
extensions indexers types
小计: 3
2:【arrays】
ArrowStringArray BooleanArray Categorical DatetimeArray FloatingArray
IntegerArray IntervalArray PandasArray PeriodArray SparseArray
StringArray TimedeltaArray
小计: 12
3:【compat】
F IS64 PY310 PY38 PY39
PYPY chainmap get_lzma_file import_lzma is_numpy_dev
is_platform_arm is_platform_linux is_platform_little_endian is_platform_mac is_platform_windows
np_array_datetime64_compat np_datetime64_compat np_version_under1p18 np_version_under1p19 np_version_under1p20
numpy pa_version_under1p0 pa_version_under2p0 pa_version_under3p0 pa_version_under4p0
pickle_compat platform pyarrow set_function_name sys
warnings
小计: 31
4:【core】
accessor aggregation algorithms api apply
array_algos arraylike arrays base common
computation config_init construction describe dtypes
flags frame generic groupby indexers
indexes indexing internals missing nanops
ops reshape roperator series shared_docs
sorting strings tools util window
小计: 35
5:【errors】
AbstractMethodError AccessorRegistrationWarning DtypeWarning DuplicateLabelError EmptyDataError
IntCastingNaNError InvalidIndexError MergeError NullFrequencyError NumbaUtilError
OptionError OutOfBoundsDatetime OutOfBoundsTimedelta ParserError ParserWarning
PerformanceWarning UnsortedIndexError UnsupportedFunctionCall
小计: 18
6:【io】
api clipboards common date_converters excel
feather_format formats gbq html json
orc parquet parsers pickle pytables
sas spss sql stata xml
小计: 20
7:【offsets】
BDay BMonthBegin BMonthEnd BQuarterBegin BQuarterEnd
BYearBegin BYearEnd BaseOffset BusinessDay BusinessHour
BusinessMonthBegin BusinessMonthEnd CBMonthBegin CBMonthEnd CDay
CustomBusinessDay CustomBusinessHour CustomBusinessMonthBegin CustomBusinessMonthEnd DateOffset
Day Easter FY5253 FY5253Quarter Hour
LastWeekOfMonth Micro Milli Minute MonthBegin
MonthEnd Nano QuarterBegin QuarterEnd Second
SemiMonthBegin SemiMonthEnd Tick Week WeekOfMonth
YearBegin YearEnd
小计: 42
8:【pandas】
BooleanDtype Categorical CategoricalDtype CategoricalIndex DataFrame
DateOffset DatetimeIndex DatetimeTZDtype ExcelFile ExcelWriter
Flags Float32Dtype Float64Dtype Float64Index Grouper
HDFStore Index IndexSlice Int16Dtype Int32Dtype
Int64Dtype Int64Index Int8Dtype Interval IntervalDtype
IntervalIndex MultiIndex NA NaT NamedAgg
Period PeriodDtype PeriodIndex RangeIndex Series
SparseDtype StringDtype Timedelta TimedeltaIndex Timestamp
UInt16Dtype UInt32Dtype UInt64Dtype UInt64Index UInt8Dtype
api array arrays bdate_range compat
concat core crosstab cut date_range
describe_option errors eval factorize get_dummies
get_option infer_freq interval_range io isna
isnull json_normalize lreshape melt merge
merge_asof merge_ordered notna notnull offsets
option_context options pandas period_range pivot
pivot_table plotting qcut read_clipboard read_csv
read_excel read_feather read_fwf read_gbq read_hdf
read_html read_json read_orc read_parquet read_pickle
read_sas read_spss read_sql read_sql_query read_sql_table
read_stata read_table read_xml reset_option set_eng_float_format
set_option show_versions test testing timedelta_range
to_datetime to_numeric to_pickle to_timedelta tseries
unique util value_counts wide_to_long
Subtotal: 119
9:【plotting】
PlotAccessor andrews_curves autocorrelation_plot bootstrap_plot boxplot
boxplot_frame boxplot_frame_groupby deregister_matplotlib_converters hist_frame hist_series
lag_plot parallel_coordinates plot_params radviz register_matplotlib_converters
scatter_matrix table
Subtotal: 17
10:【testing】
assert_extension_array_equal assert_frame_equal assert_index_equal assert_series_equal
Subtotal: 4
11:【tseries】
api frequencies offsets
Subtotal: 3
12:【util】
Appender Substitution cache_readonly hash_array hash_pandas_object
version
Subtotal: 6
Total: 310
The 8th entry, pandas, is the main module itself:
>>> dir(pd)==dir(pd.pandas)
True
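One likely reason a package ends up with an attribute named after itself: when a package's `__init__.py` runs `import pkg.sub`, Python binds the top-level name `pkg` in the importing namespace, which here is the package's own namespace. The sketch below reproduces that effect with a throwaway package. The name `selfref_demo` is hypothetical and exists only for this demonstration, and whether pandas acquires `pd.pandas` in exactly this way is an assumption, not something the listing above confirms.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, used only for this demonstration.
PKG = "selfref_demo"

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / PKG
    pkg_dir.mkdir()
    # The package's __init__ imports one of its own submodules by dotted name.
    (pkg_dir / "__init__.py").write_text(f"import {PKG}.sub\n")
    (pkg_dir / "sub.py").write_text("VALUE = 1\n")

    sys.path.insert(0, tmp)
    try:
        demo = importlib.import_module(PKG)
    finally:
        sys.path.remove(tmp)

# The dotted import bound the top-level name inside the package namespace,
# so the package now carries itself as an attribute.
is_self = getattr(demo, PKG) is demo
same_dir = dir(demo) == dir(getattr(demo, PKG))
print(is_self, same_dir)  # True True
```

Because the attribute is the very same module object, comparing `dir()` of the two is guaranteed to give `True`, which matches the comparison shown above.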
Expanding the 4th submodule, core, one level further:
import pandas as pd

# Public (non-underscore) names in pd.core
funcs = [_ for _ in dir(pd.core) if not _.startswith('_')]
# Reference types: class, plain Python function, module
types = type(pd.DataFrame), type(pd.array), type(pd)
Names = 'Type', 'Function', 'Module', 'Other'
Types = {}
count = 0
for f in funcs:
    t = type(getattr(pd.core, f))
    t = Names[-1 if t not in types else types.index(t)]
    Types[t] = Types.get(t, []) + [f]
for j, n in enumerate(Types['Module'], 1):
    print(f"\n{j}:【{n}】")
    fun = [_ for _ in dir(getattr(pd.core, n)) if not _.startswith('_')]
    count += len(fun)
    # Five names per row, each padded to 18 characters
    for i, f in enumerate(fun, 1):
        print(f'{f:18} ', end='' if i % 5 or i == len(fun) else '\n')
    print("\nSubtotal:", len(fun))
print("Total:", count)
This turns up another 1299 names:
1:【accessor】
CachedAccessor DirNamesMixin PandasDelegate annotations delegate_names doc register_dataframe_accessor register_index_accessor register_series_accessor warnings
Subtotal: 10
2:【aggregation】
ABCSeries AggFuncType Any Callable DefaultDict FrameOrSeries Hashable Index Iterable Sequence SpecificationError TYPE_CHECKING annotations com defaultdict is_dict_like is_list_like is_multi_agg_with_relabel maybe_mangle_lambdas normalize_keyword_aggregation partial reconstruct_func relabel_result validate_func_kwargs
Subtotal: 24
3:【algorithms】
ABCDatetimeArray ABCExtensionArray ABCIndex ABCMultiIndex ABCRangeIndex ABCSeries ABCTimedeltaArray AnyArrayLike ArrayLike DtypeObj FrameOrSeriesUnion PandasDtype Scalar SelectN SelectNFrame SelectNSeries TYPE_CHECKING Union algos annotations cast checked_add_with_arr construct_1d_object_array_from_listlike dedent diff doc duplicated ensure_float64 ensure_object ensure_platform_int ensure_wrapped_if_datetimelike extract_array factorize factorize_array final get_data_algo htable iNaT infer_dtype_from_array is_array_like is_bool_dtype is_categorical_dtype is_complex_dtype is_datetime64_dtype is_extension_array_dtype is_float_dtype is_integer is_integer_dtype is_list_like is_numeric_dtype is_object_dtype is_scalar is_timedelta64_dtype isin isna lib mode na_value_for_dtype needs_i8_conversion np operator pandas_dtype pd_array quantile rank safe_sort sanitize_to_nanoseconds searchsorted take take_nd union_with_duplicates unique unique1d validate_indices value_counts value_counts_arraylike warn
Subtotal: 77
4:【api】
BooleanDtype Categorical CategoricalDtype CategoricalIndex DataFrame DateOffset DatetimeIndex DatetimeTZDtype Flags Float32Dtype Float64Dtype Float64Index Grouper Index IndexSlice Int16Dtype Int32Dtype Int64Dtype Int64Index Int8Dtype Interval IntervalDtype IntervalIndex MultiIndex NA NaT NamedAgg Period PeriodDtype PeriodIndex RangeIndex Series StringDtype Timedelta TimedeltaIndex Timestamp UInt16Dtype UInt32Dtype UInt64Dtype UInt64Index UInt8Dtype array bdate_range date_range factorize interval_range isna isnull notna notnull period_range set_eng_float_format timedelta_range to_datetime to_numeric to_timedelta unique value_counts
Subtotal: 58
5:【apply】
ABCDataFrame ABCNDFrame ABCSeries AggFuncType AggFuncTypeBase AggFuncTypeDict AggObjType Any Apply Axis DataError Dict FrameApply FrameColumnApply FrameOrSeries FrameOrSeriesUnion FrameRowApply GroupByApply Hashable Iterator List NDFrameApply ResType ResamplerWindowApply SelectionMixin SeriesApply SpecificationError TYPE_CHECKING abc annotations cache_readonly cast com create_series_with_explicit_dtype ensure_wrapped_if_datetimelike frame_apply inspect is_dict_like is_extension_array_dtype is_list_like is_nested_object is_sequence lib np option_context pd_array safe_sort warnings
Subtotal: 48
6:【array_algos】
masked_reductions putmask quantile replace take
transforms
Subtotal: 6
7:【arraylike】
Any OpsMixin array_ufunc extract_array lib
maybe_dispatch_ufunc_to_dunder_op np operator roperator unpack_zerodim_and_defer
warnings
Subtotal: 11
8:【arrays】
ArrowStringArray BaseMaskedArray BooleanArray Categorical DatetimeArray ExtensionArray ExtensionOpsMixin ExtensionScalarOpsMixin FloatingArray IntegerArray IntervalArray PandasArray PeriodArray SparseArray StringArray TimedeltaArray base boolean categorical datetimelike datetimes floating integer interval masked numeric numpy_ period period_array sparse string_ string_arrow timedeltas
Subtotal: 33
9:【base】
ABCDataFrame ABCIndex ABCSeries AbstractMethodError Any ArrayLike DataError DirNamesMixin Dtype DtypeObj ExtensionArray FrameOrSeries Generic Hashable IndexLabel IndexOpsMixin NoNewAttributesMixin OpsMixin PYPY PandasObject SelectionMixin Shape SpecificationError TYPE_CHECKING TypeVar algorithms annotations cache_readonly cast create_series_with_explicit_dtype doc duplicated final is_categorical_dtype is_dict_like is_extension_array_dtype is_object_dtype is_scalar isna lib nanops np nv remove_na_arraylike textwrap unique1d value_counts
Subtotal: 47
10:【common】
ABCExtensionArray ABCIndex ABCSeries Any AnyArrayLike Callable Collection Iterable Iterator NpDtype Scalar SettingWithCopyError SettingWithCopyWarning T TYPE_CHECKING abc all_none all_not_none annotations any_none any_not_none apply_if_callable asarray_tuplesafe builtins cast cast_scalar_indexer consensus_name_attr construct_1d_object_array_from_listlike contextlib convert_to_list_like count_not_none defaultdict flatten get_callable_name get_cython_func get_rename_function index_labels_to_array inspect is_array_like is_bool_dtype is_bool_indexer is_builtin_func is_extension_array_dtype is_full_slice is_integer is_null_slice is_true_slices isna iterable_not_string lib maybe_iterable_to_list maybe_make_list not_none np np_version_under1p18 partial pipe random_state require_length_match standardize_mapping temp_setattr warnings
Subtotal: 62
11:【computation】
align api check common engines
eval expr expressions ops parsing
pytables scope
Subtotal: 12
12:【config_init】
cf chained_assignment colheader_justify_doc data_manager_doc float_format_doc is_bool is_callable is_instance_factory is_int is_nonnegative_int is_one_of_factory is_terminal is_text max_cols max_colwidth_doc os parquet_engine_doc pc_ambiguous_as_wide_doc pc_chop_threshold_doc pc_colspace_doc pc_east_asian_width_doc pc_expand_repr_doc pc_html_border_doc pc_html_use_mathjax_doc pc_large_repr_doc pc_latex_escape pc_latex_longtable pc_latex_multicolumn pc_latex_multicolumn_format pc_latex_multirow pc_latex_repr_doc pc_max_categories_doc pc_max_cols_doc pc_max_info_cols_doc pc_max_info_rows_doc pc_max_rows_doc pc_max_seq_items pc_memory_usage_doc pc_min_rows_doc pc_multi_sparse_doc pc_nb_repr_h_doc pc_pprint_nest_depth pc_precision_doc pc_show_dimensions_doc pc_table_schema_doc pc_width_doc plotting_backend_doc reader_engine_doc register_converter_cb register_converter_doc register_plotting_backend_cb sql_engine_doc string_storage_doc styler_max_elements styler_sparse_columns_doc styler_sparse_index_doc table_schema_cb tc_sim_interactive_doc use_bottleneck_cb use_bottleneck_doc use_inf_as_na_cb use_inf_as_na_doc use_inf_as_null_doc use_numba_cb use_numba_doc use_numexpr_cb use_numexpr_doc warnings writer_engine_doc
Subtotal: 69
13:【construction】
ABCExtensionArray ABCIndex ABCPandasArray ABCRangeIndex ABCSeries Any AnyArrayLike ArrayLike DatetimeTZDtype Dtype DtypeObj ExtensionDtype IntCastingNaNError Sequence TYPE_CHECKING annotations array cast com construct_1d_arraylike_from_scalar construct_1d_object_array_from_listlike create_series_with_explicit_dtype ensure_wrapped_if_datetimelike extract_array is_datetime64_ns_dtype is_empty_data is_extension_array_dtype is_float_dtype is_integer_dtype is_list_like is_object_dtype is_timedelta64_ns_dtype isna lib ma maybe_cast_to_datetime maybe_cast_to_integer_array maybe_convert_platform maybe_infer_to_datetimelike maybe_upcast np range_to_ndarray registry sanitize_array sanitize_masked_array sanitize_to_nanoseconds warnings
Subtotal: 47
14:【describe】
ABC Callable DataFrameDescriber FrameOrSeries FrameOrSeriesUnion Hashable NDFrameDescriberAbstract Sequence SeriesDescriber TYPE_CHECKING Timestamp abstractmethod annotations cast concat describe_categorical_1d describe_ndframe describe_numeric_1d describe_timestamp_1d describe_timestamp_as_categorical_1d format_percentiles is_bool_dtype is_datetime64_any_dtype is_numeric_dtype is_timedelta64_dtype np refine_percentiles reorder_columns select_describe_func validate_percentile warnings
Subtotal: 31
15:【dtypes】
api base cast common concat
dtypes generic inference missing
Subtotal: 9
16:【flags】
Flags weakref
Subtotal: 2
17:【frame】
AggFuncType Any AnyArrayLike AnyStr Appender ArrayLike ArrayManager Axes Axis BaseInfo BlockManager CachedAccessor Callable CategoricalIndex ColspaceArgType CompressionOptions DataFrame DataFrameInfo DatetimeArray DatetimeIndex Dtype ExtensionArray ExtensionDtype FilePathOrBuffer FillnaOptions FloatFormatType FormattersType FrameOrSeriesUnion Frequency Hashable IO Index IndexKeyFunc IndexLabel Iterable Iterator Level MultiIndex NDFrame NpDtype OpsMixin PeriodIndex PythonFuncType Renamer Scalar Sequence Series SparseFrameAccessor StorageOptions StringIO Substitution Suffixes TYPE_CHECKING TimedeltaArray ValueKeyFunc abc algorithms annotations arrays_to_mgr cast check_bool_indexer check_key_length collections com console construct_1d_arraylike_from_scalar construct_2d_arraylike_from_scalar convert_to_index_sliceable dataclasses_to_dicts datetime dedent deprecate_kwarg deprecate_nonkeyword_arguments dict_to_mgr doc duplicated ensure_index ensure_index_from_sequences ensure_platform_int extract_array find_common_type fmt functools generic get_group_index get_handle get_option ibase import_optional_dependency infer_dtype_from_object infer_dtype_from_scalar invalidate_string_dtypes is_1d_only_ea_dtype is_1d_only_ea_obj is_bool_dtype is_dataclass is_datetime64_any_dtype is_dict_like is_dtype_equal is_extension_array_dtype is_float is_float_dtype is_hashable is_integer is_integer_dtype is_iterator is_list_like is_object_dtype is_scalar is_sequence isna itertools lexsort_indexer lib libalgos ma maybe_box_native maybe_downcast_to_dtype maybe_droplevels melt mgr_to_mgr mmap nanops nargsort ndarray_to_mgr nested_data_to_arrays no_default notna np nv ops overload pandas pandas_dtype properties rec_array_to_mgr reconstruct_func relabel_result reorder_arrays rewrite_axis_style_signature sanitize_array sanitize_masked_array take_2d_multi to_arrays treat_as_nested validate_axis_style_args validate_bool_kwarg validate_numeric_casting validate_percentile warnings
Subtotal: 150
18:【generic】
ABCDataFrame ABCSeries AbstractMethodError Any AnyStr ArrayManager Axis BlockManager Callable CompressionOptions DataFrameFormatter DataFrameRenderer DatetimeIndex Dtype DtypeArg DtypeObj Expanding ExponentialMovingWindow ExtensionArray FilePathOrBuffer Flags FrameOrSeries Hashable Index IndexKeyFunc IndexLabel InvalidIndexError JSONSerializable Level Manager Mapping MultiIndex NDFrame NpDtype PandasObject Period PeriodIndex RangeIndex Renamer Rolling Sequence SingleArrayManager StorageOptions T TYPE_CHECKING Tick TimedeltaConvertibleTypes Timestamp TimestampConvertibleTypes ValueKeyFunc Window algos align_method_FRAME annotations arraylike bool_t cast collections com concat config create_series_with_explicit_dtype describe_ndframe doc ensure_index ensure_object ensure_platform_int ensure_str extract_array final find_valid_index fmt functools gc get_indexer_indexer ibase import_optional_dependency indexing is_bool is_bool_dtype is_datetime64_any_dtype is_datetime64tz_dtype is_dict_like is_dtype_equal is_extension_array_dtype is_float is_hashable is_list_like is_nested_list_like is_number is_numeric_dtype is_object_dtype is_re_compilable is_scalar is_timedelta64_dtype isna json lib mgr_to_mgr missing nanops notna np nv operator overload pandas_dtype pickle pprint_thing re rewrite_axis_style_signature timedelta to_offset validate_ascending validate_bool_kwarg validate_fillna_kwargs warnings weakref
Subtotal: 118
19:【groupby】
DataFrameGroupBy GroupBy Grouper NamedAgg SeriesGroupBy
base categorical generic groupby grouper
numba_ ops
Subtotal: 12
20:【indexers】
ABCIndex ABCSeries Any AnyArrayLike ArrayLike TYPE_CHECKING annotations check_array_indexer check_key_length check_setitem_lengths deprecate_ndim_indexing is_array_like is_bool_dtype is_empty_indexer is_exact_shape_match is_extension_array_dtype is_integer is_integer_dtype is_list_like is_list_like_indexer is_scalar_indexer is_valid_positional_slice length_of_indexer maybe_convert_indices np unpack_1tuple validate_indices warnings
Subtotal: 28
21:【indexes】
accessors api base category datetimelike datetimes extension frozen interval multi numeric period range timedeltas
Subtotal: 14
22:【indexing】
ABCDataFrame ABCSeries AbstractMethodError Any CategoricalIndex Hashable Index IndexSlice IndexingError IndexingMixin IntervalIndex InvalidIndexError MultiIndex NDFrameIndexerBase Sequence TYPE_CHECKING algos annotations check_array_indexer check_bool_indexer com concat_compat convert_from_missing_indexer_tuple convert_missing_indexer convert_to_index_sliceable doc ensure_index extract_array infer_fill_value is_array_like is_bool_dtype is_empty_indexer is_exact_shape_match is_hashable is_integer is_iterator is_label_like is_list_like is_list_like_indexer is_nested_tuple is_numeric_dtype is_object_dtype is_scalar is_sequence isna item_from_zerodim length_of_indexer maybe_convert_ix need_slice needs_i8_conversion np pd_array suppress warnings
Subtotal: 54
23:【internals】
ArrayManager Block BlockManager DataManager DatetimeTZBlock ExtensionBlock NumericBlock ObjectBlock SingleArrayManager SingleBlockManager SingleDataManager api array_manager base blocks concat concatenate_managers construction create_block_manager_from_arrays create_block_manager_from_blocks make_block managers ops
Subtotal: 23
24:【missing】
Any ArrayLike Axis F NP_METHODS SP_METHODS TYPE_CHECKING algos annotations cast check_value_size clean_fill_method clean_interp_method clean_reindex_fill_method find_valid_index get_fill_func import_optional_dependency infer_dtype_from interpolate_1d interpolate_2d interpolate_2d_with_fill interpolate_array_2d is_array_like is_numeric_v_string_like is_valid_na_for_dtype isna lib mask_missing na_value_for_dtype needs_i8_conversion np partial wraps
Subtotal: 33
25:【nanops】
Any ArrayLike Dtype DtypeObj F NaT NaTType PeriodDtype Scalar Shape Timedelta annotations bn bottleneck_switch cast check_below_min_count disallow extract_array functools get_corr_func get_dtype get_empty_reduction_result get_option iNaT import_optional_dependency is_any_int_dtype is_bool_dtype is_complex is_datetime64_any_dtype is_float is_float_dtype is_integer is_integer_dtype is_numeric_dtype is_object_dtype is_scalar is_timedelta64_dtype isna itertools lib make_nancomp na_accum_func na_value_for_dtype nanall nanany nanargmax nanargmin nancorr nancov naneq nange nangt nankurt nanle nanlt nanmax nanmean nanmedian nanmin nanne nanpercentile nanprod nansem nanskew nanstd nansum nanvar needs_i8_conversion notna np np_percentile_argname operator pandas_dtype set_use_bottleneck warnings
Subtotal: 75
26:【ops】
ABCDataFrame ABCSeries ARITHMETIC_BINOPS Appender COMPARISON_BINOPS Level TYPE_CHECKING add_flex_arithmetic_methods algorithms align_method_FRAME align_method_SERIES annotations arithmetic_op array_ops common comp_method_OBJECT_ARRAY comparison_op dispatch docstrings fill_binop flex_arith_method_FRAME flex_comp_method_FRAME flex_method_SERIES frame_arith_method_with_reindex get_array_op get_op_result_name invalid invalid_comparison is_array_like is_list_like isna kleene_and kleene_or kleene_xor logical_op make_flex_doc mask_ops maybe_dispatch_ufunc_to_dunder_op maybe_prepare_scalar_for_op methods missing np operator radd rand_ rdiv rdivmod rfloordiv rmod rmul roperator ror_ rpow rsub rtruediv rxor should_reindex_frame_op unpack_zerodim_and_defer warnings
Subtotal: 59
27:【reshape】
api concat melt merge pivot
reshape tile util
Subtotal: 8
28:【roperator】
operator radd rand_ rdiv rdivmod
rfloordiv rmod rmul ror_ rpow
rsub rtruediv rxor
Subtotal: 13
29:【series】
ABCDataFrame AggFuncType Any Appender ArrayLike Axis CachedAccessor Callable CategoricalAccessor CategoricalIndex CombinedDatetimelikeProperties DatetimeIndex Dtype DtypeObj ExtensionArray FillnaOptions Float64Index FrameOrSeriesUnion Hashable IO Index IndexKeyFunc InvalidIndexError Iterable MultiIndex NDFrame NpDtype PeriodIndex Sequence Series SeriesApply SingleArrayManager SingleBlockManager SingleManager SparseAccessor StorageOptions StringIO StringMethods Substitution TYPE_CHECKING TimedeltaIndex Union ValueKeyFunc algorithms annotations base cast check_bool_indexer com convert_dtypes create_series_with_explicit_dtype dedent deprecate_ndim_indexing deprecate_nonkeyword_arguments doc ensure_index ensure_key_mapped ensure_platform_int ensure_wrapped_if_datetimelike extract_array fmt generic get_option get_terminal_size ibase is_bool is_dict_like is_empty_data is_hashable is_integer is_iterator is_list_like is_object_dtype is_scalar isna lib maybe_box_native maybe_cast_pointwise_result missing na_value_for_dtype nanops nargsort no_default notna np nv ops overload pandas pandas_dtype properties remove_na_arraylike reshape sanitize_array to_datetime tslibs unpack_1tuple validate_all_hashable validate_bool_kwarg validate_numeric_casting validate_percentile warnings weakref
Subtotal: 103
30:【shared_docs】
annotations
Subtotal: 1
31:【sorting】
ABCMultiIndex ABCRangeIndex Callable DefaultDict IndexKeyFunc Iterable Sequence Shape TYPE_CHECKING algos annotations compress_group_index decons_group_index decons_obs_group_ids defaultdict ensure_int64 ensure_key_mapped ensure_platform_int extract_array get_compressed_ids get_flattened_list get_group_index get_group_index_sorter get_indexer_dict get_indexer_indexer hashtable indexer_from_factorized is_extension_array_dtype is_int64_overflow_possible isna lexsort_indexer lib nargminmax nargsort np unique_label_indices
Subtotal: 36
32:【strings】
BaseStringArrayMethods StringMethods accessor base object_array
Subtotal: 5
33:【tools】
datetimes numeric timedeltas times
Subtotal: 4
34:【util】
hashing numba_
Subtotal: 2
35:【window】
Expanding ExpandingGroupby ExponentialMovingWindow ExponentialMovingWindowGroupby Rolling RollingGroupby Window common doc ewm expanding indexers numba_ online rolling
Subtotal: 15
Total: 1299
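The grouping into Type / Function / Module / Other used for the listing above can also be written as a small reusable helper built on the standard `inspect` module. This is only a sketch: the function name `classify_public_names` is made up for illustration, and the standard-library `json` module stands in for `pd.core` so the example runs without pandas installed. One difference from the script above is that `inspect.isclass` also catches classes with a custom metaclass, which an exact `type(...)` comparison would file under "Other", so counts may differ slightly.

```python
import inspect
import json  # stand-in module for the demo; pd.core could be passed instead


def classify_public_names(module):
    """Group a module's public attribute names by kind (hypothetical helper)."""
    groups = {"Type": [], "Function": [], "Module": [], "Other": []}
    for name in dir(module):
        if name.startswith("_"):
            continue  # skip private and dunder names
        obj = getattr(module, name)
        if inspect.isclass(obj):
            kind = "Type"
        elif inspect.isfunction(obj):
            kind = "Function"
        elif inspect.ismodule(obj):
            kind = "Module"
        else:
            kind = "Other"
        groups[kind].append(name)
    return groups


groups = classify_public_names(json)
for kind, names in groups.items():
    print(f"{kind}: {len(names)}")
```

With pandas available, `classify_public_names(pd.core)["Module"]` should give the submodule names that the loop above iterates over.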
To be continued...