Python pandas库|任凭弱水三千,我只取一瓢饮(7)

简介: Python pandas库|任凭弱水三千,我只取一瓢饮(7)

to_系列函数:22个 (12~22)

Function12

to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'

Help on function to_numpy in module pandas.core.frame:
to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'
    Convert the DataFrame to a NumPy array.
    By default, the dtype of the returned array will be the common NumPy
    dtype of all types in the DataFrame. For example, if the dtypes are
    ``float16`` and ``float32``, the results dtype will be ``float32``.
    This may require copying data and coercing values, which may be
    expensive.
    Parameters
    ----------
    dtype : str or numpy.dtype, optional
        The dtype to pass to :meth:`numpy.asarray`.
    copy : bool, default False
        Whether to ensure that the returned value is not a view on
        another array. Note that ``copy=False`` does not *ensure* that
        ``to_numpy()`` is no-copy. Rather, ``copy=True`` ensure that
        a copy is made, even if not strictly necessary.
    na_value : Any, optional
        The value to use for missing values. The default value depends
        on `dtype` and the dtypes of the DataFrame columns.
        .. versionadded:: 1.1.0
    Returns
    -------
    numpy.ndarray
    See Also
    --------
    Series.to_numpy : Similar method for Series.
    Examples
    --------
    >>> pd.DataFrame({"A": [1, 2], "B": [3, 4]}).to_numpy()
    array([[1, 3],
           [2, 4]])
    With heterogeneous data, the lowest common type will have to
    be used.
    >>> df = pd.DataFrame({"A": [1, 2], "B": [3.0, 4.5]})
    >>> df.to_numpy()
    array([[1. , 3. ],
           [2. , 4.5]])
    For a mix of numeric and non-numeric types, the output array will
    have object dtype.
    >>> df['C'] = pd.date_range('2000', periods=2)
    >>> df.to_numpy()
    array([[1, 3.0, Timestamp('2000-01-01 00:00:00')],
           [2, 4.5, Timestamp('2000-01-02 00:00:00')]], dtype=object)




Function13

to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'



Help on function to_parquet in module pandas.core.frame:
to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'
    Write a DataFrame to the binary parquet format.
    This function writes the dataframe as a `parquet file
    <https://parquet.apache.org/>`_. You can choose different parquet
    backends, and have the option of compression. See
    :ref:`the user guide <io.parquet>` for more details.
    Parameters
    ----------
    path : str or file-like object, default None
        If a string, it will be used as Root Directory path
        when writing a partitioned dataset. By file-like object,
        we refer to objects with a write() method, such as a file handle
        (e.g. via builtin open function) or io.BytesIO. The engine
        fastparquet does not accept file-like objects. If path is None,
        a bytes object is returned.
        .. versionchanged:: 1.2.0
        Previously this was "fname"
    engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
        Parquet library to use. If 'auto', then the option
        ``io.parquet.engine`` is used. The default ``io.parquet.engine``
        behavior is to try 'pyarrow', falling back to 'fastparquet' if
        'pyarrow' is unavailable.
    compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
        Name of the compression to use. Use ``None`` for no compression.
    index : bool, default None
        If ``True``, include the dataframe's index(es) in the file output.
        If ``False``, they will not be written to the file.
        If ``None``, similar to ``True`` the dataframe's index(es)
        will be saved. However, instead of being saved as values,
        the RangeIndex will be stored as a range in the metadata so it
        doesn't require much space and is faster. Other indexes will
        be included as columns in the file output.
    partition_cols : list, optional, default None
        Column names by which to partition the dataset.
        Columns are partitioned in the order they are given.
        Must be None if path is not a string.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
    **kwargs
        Additional arguments passed to the parquet library. See
        :ref:`pandas io <io.parquet>` for more details.
    Returns
    -------
    bytes if no path argument is provided else None
    See Also
    --------
    read_parquet : Read a parquet file.
    DataFrame.to_csv : Write a csv file.
    DataFrame.to_sql : Write to a sql table.
    DataFrame.to_hdf : Write to hdf.
    Notes
    -----
    This function requires either the `fastparquet
    <https://pypi.org/project/fastparquet>`_ or `pyarrow
    <https://arrow.apache.org/docs/python/>`_ library.
    Examples
    --------
    >>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    >>> df.to_parquet('df.parquet.gzip',
    ...               compression='gzip')  # doctest: +SKIP
    >>> pd.read_parquet('df.parquet.gzip')  # doctest: +SKIP
       col1  col2
    0     1     3
    1     2     4
    If you want to get a buffer to the parquet content you can use a io.BytesIO
    object, as long as you don't use partition_cols, which creates multiple files.
    >>> import io
    >>> f = io.BytesIO()
    >>> df.to_parquet(f)
    >>> f.seek(0)
    0
    >>> content = f.read()



Function14

to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'

Help on function to_period in module pandas.core.frame:
to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
    Convert DataFrame from DatetimeIndex to PeriodIndex.
    Convert DataFrame from DatetimeIndex to PeriodIndex with desired
    frequency (inferred from index if not passed).
    Parameters
    ----------
    freq : str, default
        Frequency of the PeriodIndex.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        The axis to convert (the index by default).
    copy : bool, default True
        If False then underlying input data is not copied.
    Returns
    -------
    DataFrame with PeriodIndex



Function15

to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'



Help on function to_pickle in module pandas.core.generic:
to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'
    Pickle (serialize) object to file.
    Parameters
    ----------
    path : str
        File path where the pickled object will be stored.
    compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None},         default 'infer'
        A string representing the compression to use in the output file. By
        default, infers from the file extension in specified path.
        Compression mode may be any of the following possible
        values: {¡®infer¡¯, ¡®gzip¡¯, ¡®bz2¡¯, ¡®zip¡¯, ¡®xz¡¯, None}. If compression
        mode is ¡®infer¡¯ and path_or_buf is path-like, then detect
        compression mode from the following extensions:
        ¡®.gz¡¯, ¡®.bz2¡¯, ¡®.zip¡¯ or ¡®.xz¡¯. (otherwise no compression).
        If dict given and mode is ¡®zip¡¯ or inferred as ¡®zip¡¯, other entries
        passed as additional compression options.
    protocol : int
        Int which indicates which protocol should be used by the pickler,
        default HIGHEST_PROTOCOL (see [1]_ paragraph 12.1.2). The possible
        values are 0, 1, 2, 3, 4, 5. A negative value for the protocol
        parameter is equivalent to setting its value to HIGHEST_PROTOCOL.
        .. [1] https://docs.python.org/3/library/pickle.html.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
    See Also
    --------
    read_pickle : Load pickled pandas object (or any object) from file.
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_sql : Write DataFrame to a SQL database.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    Examples
    --------
    >>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
    >>> original_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> original_df.to_pickle("./dummy.pkl")
    >>> unpickled_df = pd.read_pickle("./dummy.pkl")
    >>> unpickled_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> import os
    >>> os.remove("./dummy.pkl")




Function16

to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'

Help on function to_records in module pandas.core.frame:
to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'
    Convert DataFrame to a NumPy record array.
    Index will be included as the first field of the record array if
    requested.
    Parameters
    ----------
    index : bool, default True
        Include index in resulting record array, stored in 'index'
        field or using the index label, if set.
    column_dtypes : str, type, dict, default None
        If a string or type, the data type to store all columns. If
        a dictionary, a mapping of column names and indices (zero-indexed)
        to specific data types.
    index_dtypes : str, type, dict, default None
        If a string or type, the data type to store all index levels. If
        a dictionary, a mapping of index level names and indices
        (zero-indexed) to specific data types.
        This mapping is applied only if `index=True`.
    Returns
    -------
    numpy.recarray
        NumPy ndarray with the DataFrame labels as fields and each row
        of the DataFrame as entries.
    See Also
    --------
    DataFrame.from_records: Convert structured or record ndarray
        to DataFrame.
    numpy.recarray: An ndarray that allows field access using
        attributes, analogous to typed columns in a
        spreadsheet.
    Examples
    --------
    >>> df = pd.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]},
    ...                   index=['a', 'b'])
    >>> df
       A     B
    a  1  0.50
    b  2  0.75
    >>> df.to_records()
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])
    If the DataFrame index has no label then the recarray field name
    is set to 'index'. If the index has a label then this is used as the
    field name:
    >>> df.index = df.index.rename("I")
    >>> df.to_records()
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('I', 'O'), ('A', '<i8'), ('B', '<f8')])
    The index can be excluded from the record array:
    >>> df.to_records(index=False)
    rec.array([(1, 0.5 ), (2, 0.75)],
              dtype=[('A', '<i8'), ('B', '<f8')])
    Data types can be specified for the columns:
    >>> df.to_records(column_dtypes={"A": "int32"})
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('I', 'O'), ('A', '<i4'), ('B', '<f8')])
    As well as for the index:
    >>> df.to_records(index_dtypes="<S2")
    rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
              dtype=[('I', 'S2'), ('A', '<i8'), ('B', '<f8')])
    >>> index_dtypes = f"<S{df.index.str.len().max()}"
    >>> df.to_records(index_dtypes=index_dtypes)
    rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
              dtype=[('I', 'S1'), ('A', '<i8'), ('B', '<f8')])



Function17

to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'



Help on function to_sql in module pandas.core.generic:
to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'
    Write records stored in a DataFrame to a SQL database.
    Databases supported by SQLAlchemy [1]_ are supported. Tables can be
    newly created, appended to, or overwritten.
    Parameters
    ----------
    name : str
        Name of SQL table.
    con : sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection
        Using SQLAlchemy makes it possible to use any DB supported by that
        library. Legacy support is provided for sqlite3.Connection objects. The user
        is responsible for engine disposal and connection closure for the SQLAlchemy
        connectable See `here                 <https://docs.sqlalchemy.org/en/13/core/connections.html>`_.
    schema : str, optional
        Specify the schema (if database flavor supports this). If None, use
        default schema.
    if_exists : {'fail', 'replace', 'append'}, default 'fail'
        How to behave if the table already exists.
        * fail: Raise a ValueError.
        * replace: Drop the table before inserting new values.
        * append: Insert new values to the existing table.
    index : bool, default True
        Write DataFrame index as a column. Uses `index_label` as the column
        name in the table.
    index_label : str or sequence, default None
        Column label for index column(s). If None is given (default) and
        `index` is True, then the index names are used.
        A sequence should be given if the DataFrame uses MultiIndex.
    chunksize : int, optional
        Specify the number of rows in each batch to be written at a time.
        By default, all rows will be written at once.
    dtype : dict or scalar, optional
        Specifying the datatype for columns. If a dictionary is used, the
        keys should be the column names and the values should be the
        SQLAlchemy types or strings for the sqlite3 legacy mode. If a
        scalar is provided, it will be applied to all columns.
    method : {None, 'multi', callable}, optional
        Controls the SQL insertion clause used:
        * None : Uses standard SQL ``INSERT`` clause (one per row).
        * 'multi': Pass multiple values in a single ``INSERT`` clause.
        * callable with signature ``(pd_table, conn, keys, data_iter)``.
        Details and a sample callable implementation can be found in the
        section :ref:`insert method <io.sql.method>`.
    Raises
    ------
    ValueError
        When the table already exists and `if_exists` is 'fail' (the
        default).
    See Also
    --------
    read_sql : Read a DataFrame from a table.
    Notes
    -----
    Timezone aware datetime columns will be written as
    ``Timestamp with timezone`` type with SQLAlchemy if supported by the
    database. Otherwise, the datetimes will be stored as timezone unaware
    timestamps local to the original timezone.
    References
    ----------
    .. [1] https://docs.sqlalchemy.org
    .. [2] https://www.python.org/dev/peps/pep-0249/
    Examples
    --------
    Create an in-memory SQLite database.
    >>> from sqlalchemy import create_engine
    >>> engine = create_engine('sqlite://', echo=False)
    Create a table from scratch with 3 rows.
    >>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})
    >>> df
         name
    0  User 1
    1  User 2
    2  User 3
    >>> df.to_sql('users', con=engine)
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 1'), (1, 'User 2'), (2, 'User 3')]
    An `sqlalchemy.engine.Connection` can also be passed to `con`:
    >>> with engine.begin() as connection:
    ...     df1 = pd.DataFrame({'name' : ['User 4', 'User 5']})
    ...     df1.to_sql('users', con=connection, if_exists='append')
    This is allowed to support operations that require that the same
    DBAPI connection is used for the entire operation.
    >>> df2 = pd.DataFrame({'name' : ['User 6', 'User 7']})
    >>> df2.to_sql('users', con=engine, if_exists='append')
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 1'), (1, 'User 2'), (2, 'User 3'),
     (0, 'User 4'), (1, 'User 5'), (0, 'User 6'),
     (1, 'User 7')]
    Overwrite the table with just ``df2``.
    >>> df2.to_sql('users', con=engine, if_exists='replace',
    ...            index_label='id')
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 6'), (1, 'User 7')]
    Specify the dtype (especially useful for integers with missing values).
    Notice that while pandas is forced to store the data as floating point,
    the database supports nullable integers. When fetching the data with
    Python, we get back integer scalars.
    >>> df = pd.DataFrame({"A": [1, None, 2]})
    >>> df
         A
    0  1.0
    1  NaN
    2  2.0
    >>> from sqlalchemy.types import Integer
    >>> df.to_sql('integers', con=engine, index=False,
    ...           dtype={"A": Integer()})
    >>> engine.execute("SELECT * FROM integers").fetchall()
    [(1,), (None,), (2,)]



Function18

to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'



Help on function to_stata in module pandas.core.frame:
to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'
    Export DataFrame object to Stata dta format.
    Writes the DataFrame to a Stata dataset file.
    "dta" files contain a Stata dataset.
    Parameters
    ----------
    path : str, buffer or path object
        String, path object (pathlib.Path or py._path.local.LocalPath) or
        object implementing a binary write() function. If using a buffer
        then the buffer will not be automatically closed after the file
        data has been written.
        .. versionchanged:: 1.0.0
        Previously this was "fname"
    convert_dates : dict
        Dictionary mapping columns containing datetime types to stata
        internal format to use when writing the dates. Options are 'tc',
        'td', 'tm', 'tw', 'th', 'tq', 'ty'. Column can be either an integer
        or a name. Datetime columns that do not have a conversion type
        specified will be converted to 'tc'. Raises NotImplementedError if
        a datetime column has timezone information.
    write_index : bool
        Write the index to Stata dataset.
    byteorder : str
        Can be ">", "<", "little", or "big". default is `sys.byteorder`.
    time_stamp : datetime
        A datetime to use as file creation date.  Default is the current
        time.
    data_label : str, optional
        A label for the data set.  Must be 80 characters or smaller.
    variable_labels : dict
        Dictionary containing columns as keys and variable labels as
        values. Each label must be 80 characters or smaller.
    version : {114, 117, 118, 119, None}, default 114
        Version to use in the output dta file. Set to None to let pandas
        decide between 118 or 119 formats depending on the number of
        columns in the frame. Version 114 can be read by Stata 10 and
        later. Version 117 can be read by Stata 13 or later. Version 118
        is supported in Stata 14 and later. Version 119 is supported in
        Stata 15 and later. Version 114 limits string variables to 244
        characters or fewer while versions 117 and later allow strings
        with lengths up to 2,000,000 characters. Versions 118 and 119
        support Unicode characters, and version 119 supports more than
        32,767 variables.
        Version 119 should usually only be used when the number of
        variables exceeds the capacity of dta format 118. Exporting
        smaller datasets in format 119 may have unintended consequences,
        and, as of November 2020, Stata SE cannot read version 119 files.
        .. versionchanged:: 1.0.0
            Added support for formats 118 and 119.
    convert_strl : list, optional
        List of column names to convert to string columns to Stata StrL
        format. Only available if version is 117.  Storing strings in the
        StrL format can produce smaller dta files if strings have more than
        8 characters and values are repeated.
    compression : str or dict, default 'infer'
        For on-the-fly compression of the output dta. If string, specifies
        compression mode. If dict, value at key 'method' specifies
        compression mode. Compression mode must be one of {'infer', 'gzip',
        'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and
        `fname` is path-like, then detect compression from the following
        extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no
        compression). If dict and compression mode is one of {'zip',
        'gzip', 'bz2'}, or inferred as one of the above, other entries
        passed as additional compression options.
        .. versionadded:: 1.1.0
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
    Raises
    ------
    NotImplementedError
        * If datetimes contain timezone information
        * Column dtype is not representable in Stata
    ValueError
        * Columns listed in convert_dates are neither datetime64[ns]
          or datetime.datetime
        * Column listed in convert_dates is not in DataFrame
        * Categorical label contains more than 32,000 characters
    See Also
    --------
    read_stata : Import Stata data files.
    io.stata.StataWriter : Low-level writer for Stata data files.
    io.stata.StataWriter117 : Low-level writer for version 117 files.
    Examples
    --------
    >>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon',
    ...                               'parrot'],
    ...                    'speed': [350, 18, 361, 15]})
    >>> df.to_stata('animals.dta')  # doctest: +SKIP



Function19

to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'



Help on function to_string in module pandas.core.frame:
to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'
    Render a DataFrame to a console-friendly tabular output.
    Parameters
    ----------
    buf : str, Path or StringIO-like, optional, default None
        Buffer to write to. If None, the output is returned as a string.
    columns : sequence, optional, default None
        The subset of columns to write. Writes all columns by default.
    col_space : int, list or dict of int, optional
        The minimum width of each column.
    header : bool or sequence, optional
        Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
    index : bool, optional, default True
        Whether to print index (row) labels.
    na_rep : str, optional, default 'NaN'
        String representation of ``NaN`` to use.
    formatters : list, tuple or dict of one-param. functions, optional
        Formatter functions to apply to columns' elements by position or
        name.
        The result of each function must be a unicode string.
        List/tuple must be of length equal to the number of columns.
    float_format : one-parameter function, optional, default None
        Formatter function to apply to columns' elements if they are
        floats. This function must return a unicode string and will be
        applied only to the non-``NaN`` elements, with ``NaN`` being
        handled by ``na_rep``.
        .. versionchanged:: 1.2.0
    sparsify : bool, optional, default True
        Set to False for a DataFrame with a hierarchical index to print
        every multiindex key at each row.
    index_names : bool, optional, default True
        Prints the names of the indexes.
    justify : str, default None
        How to justify the column labels. If None uses the option from
        the print configuration (controlled by set_option), 'right' out
        of the box. Valid values are
        * left
        * right
        * center
        * justify
        * justify-all
        * start
        * end
        * inherit
        * match-parent
        * initial
        * unset.
    max_rows : int, optional
        Maximum number of rows to display in the console.
    min_rows : int, optional
        The number of rows to display in the console in a truncated repr
        (when number of rows is above `max_rows`).
    max_cols : int, optional
        Maximum number of columns to display in the console.
    show_dimensions : bool, default False
        Display DataFrame dimensions (number of rows by number of columns).
    decimal : str, default '.'
        Character recognized as decimal separator, e.g. ',' in Europe.
    line_width : int, optional
        Width to wrap a line in characters.
    max_colwidth : int, optional
        Max width to truncate each column in characters. By default, no limit.
        .. versionadded:: 1.0.0
    encoding : str, default "utf-8"
        Set character encoding.
        .. versionadded:: 1.0
    Returns
    -------
    str or None
        If buf is None, returns the result as a string. Otherwise returns
        None.
    See Also
    --------
    to_html : Convert DataFrame to HTML.
    Examples
    --------
    >>> d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
    >>> df = pd.DataFrame(d)
    >>> print(df.to_string())
       col1  col2
    0     1     4
    1     2     5
    2     3     6



Function20

to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'

Help on function to_timestamp in module pandas.core.frame:
to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
    Cast to DatetimeIndex of timestamps, at *beginning* of period.
    Parameters
    ----------
    freq : str, default frequency of PeriodIndex
        Desired frequency.
    how : {'s', 'e', 'start', 'end'}
        Convention for converting period to timestamp; start of period
        vs. end.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        The axis to convert (the index by default).
    copy : bool, default True
        If False then underlying input data is not copied.
    Returns
    -------
    DataFrame with DatetimeIndex



Function21

to_xarray(self)

Help on function to_xarray in module pandas.core.generic:
to_xarray(self)
    Return an xarray object from the pandas object.
    Returns
    -------
    xarray.DataArray or xarray.Dataset
        Data in the pandas structure converted to Dataset if the object is
        a DataFrame, or a DataArray if the object is a Series.
    See Also
    --------
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    Notes
    -----
    See the `xarray docs <https://xarray.pydata.org/en/stable/>`__
    Examples
    --------
    >>> df = pd.DataFrame([('falcon', 'bird', 389.0, 2),
    ...                    ('parrot', 'bird', 24.0, 2),
    ...                    ('lion', 'mammal', 80.5, 4),
    ...                    ('monkey', 'mammal', np.nan, 4)],
    ...                   columns=['name', 'class', 'max_speed',
    ...                            'num_legs'])
    >>> df
         name   class  max_speed  num_legs
    0  falcon    bird      389.0         2
    1  parrot    bird       24.0         2
    2    lion  mammal       80.5         4
    3  monkey  mammal        NaN         4
    >>> df.to_xarray()
    <xarray.Dataset>
    Dimensions:    (index: 4)
    Coordinates:
      * index      (index) int64 0 1 2 3
    Data variables:
        name       (index) object 'falcon' 'parrot' 'lion' 'monkey'
        class      (index) object 'bird' 'bird' 'mammal' 'mammal'
        max_speed  (index) float64 389.0 24.0 80.5 nan
        num_legs   (index) int64 2 2 4 4
    >>> df['max_speed'].to_xarray()
    <xarray.DataArray 'max_speed' (index: 4)>
    array([389. ,  24. ,  80.5,   nan])
    Coordinates:
      * index    (index) int64 0 1 2 3
    >>> dates = pd.to_datetime(['2018-01-01', '2018-01-01',
    ...                         '2018-01-02', '2018-01-02'])
    >>> df_multiindex = pd.DataFrame({'date': dates,
    ...                               'animal': ['falcon', 'parrot',
    ...                                          'falcon', 'parrot'],
    ...                               'speed': [350, 18, 361, 15]})
    >>> df_multiindex = df_multiindex.set_index(['date', 'animal'])
    >>> df_multiindex
                       speed
    date       animal
    2018-01-01 falcon    350
               parrot     18
    2018-01-02 falcon    361
               parrot     15
    >>> df_multiindex.to_xarray()
    <xarray.Dataset>
    Dimensions:  (animal: 2, date: 2)
    Coordinates:
      * date     (date) datetime64[ns] 2018-01-01 2018-01-02
      * animal   (animal) object 'falcon' 'parrot'
    Data variables:
        speed    (date, animal) int64 350 18 361 15




Function22

to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None'
    Help on function to_xml in module pandas.core.frame:
    to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None'
        Render a DataFrame to an XML document.
        .. versionadded:: 1.3.0
        Parameters
        ----------
        path_or_buffer : str, path object or file-like object, optional
            File to write output to. If None, the output is returned as a
            string.
        index : bool, default True
            Whether to include index in XML document.
        root_name : str, default 'data'
            The name of root element in XML document.
        row_name : str, default 'row'
            The name of row element in XML document.
        na_rep : str, optional
            Missing data representation.
        attr_cols : list-like, optional
            List of columns to write as attributes in row element.
            Hierarchical columns will be flattened with underscore
            delimiting the different levels.
        elem_cols : list-like, optional
            List of columns to write as children in row element. By default,
            all columns output as children of row element. Hierarchical
            columns will be flattened with underscore delimiting the
            different levels.
        namespaces : dict, optional
            All namespaces to be defined in root element. Keys of dict
            should be prefix names and values of dict corresponding URIs.
            Default namespaces should be given empty string key. For
            example, ::
                namespaces = {"": "https://example.com"}
        prefix : str, optional
            Namespace prefix to be used for every element and/or attribute
            in document. This should be one of the keys in ``namespaces``
            dict.
        encoding : str, default 'utf-8'
            Encoding of the resulting document.
        xml_declaration : bool, default True
            Whether to include the XML declaration at start of document.
        pretty_print : bool, default True
            Whether output should be pretty printed with indentation and
            line breaks.
        parser : {'lxml','etree'}, default 'lxml'
            Parser module to use for building of tree. Only 'lxml' and
            'etree' are supported. With 'lxml', the ability to use XSLT
            stylesheet is supported.
        stylesheet : str, path object or file-like object, optional
            A URL, file-like object, or a raw string containing an XSLT
            script used to transform the raw XML output. Script should use
            layout of elements and attributes from original output. This
            argument requires ``lxml`` to be installed. Only XSLT 1.0
            scripts and not later versions is currently supported.
        compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
            For on-the-fly decompression of on-disk data. If 'infer', then use
            gzip, bz2, zip or xz if path_or_buffer is a string ending in
            '.gz', '.bz2', '.zip', or 'xz', respectively, and no decompression
            otherwise. If using 'zip', the ZIP file must contain only one data
            file to be read in. Set to None for no decompression.
        storage_options : dict, optional
            Extra options that make sense for a particular storage connection, e.g.
            host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
            are forwarded to ``urllib`` as header options. For other URLs (e.g.
            starting with "s3://", and "gcs://") the key-value pairs are forwarded to
            ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        Returns
        -------
        None or str
            If ``io`` is None, returns the resulting XML format as a
            string. Otherwise returns None.
        See Also
        --------
        to_json : Convert the pandas object to a JSON string.
        to_html : Convert DataFrame to a html.
        Examples
        --------
        >>> df = pd.DataFrame({'shape': ['square', 'circle', 'triangle'],
        ...                    'degrees': [360, 360, 180],
        ...                    'sides': [4, np.nan, 3]})
        >>> df.to_xml()  # doctest: +SKIP
        <?xml version='1.0' encoding='utf-8'?>
        <data>
          <row>
            <index>0</index>
            <shape>square</shape>
            <degrees>360</degrees>
            <sides>4.0</sides>
          </row>
          <row>
            <index>1</index>
            <shape>circle</shape>
            <degrees>360</degrees>
            <sides/>
          </row>
          <row>
            <index>2</index>
            <shape>triangle</shape>
            <degrees>180</degrees>
            <sides>3.0</sides>
          </row>
        </data>
        >>> df.to_xml(attr_cols=[
        ...           'index', 'shape', 'degrees', 'sides'
        ...           ])  # doctest: +SKIP
        <?xml version='1.0' encoding='utf-8'?>
        <data>
          <row index="0" shape="square" degrees="360" sides="4.0"/>
          <row index="1" shape="circle" degrees="360"/>
          <row index="2" shape="triangle" degrees="180" sides="3.0"/>
        </data>
        >>> df.to_xml(namespaces={"doc": "https://example.com"},
        ...           prefix="doc")  # doctest: +SKIP
        <?xml version='1.0' encoding='utf-8'?>
        <doc:data xmlns:doc="https://example.com">
          <doc:row>
            <doc:index>0</doc:index>
            <doc:shape>square</doc:shape>
            <doc:degrees>360</doc:degrees>
            <doc:sides>4.0</doc:sides>
          </doc:row>
          <doc:row>
            <doc:index>1</doc:index>
            <doc:shape>circle</doc:shape>
            <doc:degrees>360</doc:degrees>
            <doc:sides/>
          </doc:row>
          <doc:row>
            <doc:index>2</doc:index>
            <doc:shape>triangle</doc:shape>
            <doc:degrees>180</doc:degrees>
            <doc:sides>3.0</doc:sides>
          </doc:row>
        </doc:data>


df.to_excel

与pd.read_excel()对应,找这个pd.DataFrame.to_excel()扩展学习:

参数表及对应默认值:

to_excel(
  excel_writer,
  sheet_name: 'str' = 'Sheet1',
  na_rep: 'str' = '',
  float_format: 'Optional[str]' = None,
  columns=None,
  header=True,
  index=True,
  index_label=None,
  startrow=0,
  startcol=0,
  engine=None,
  merge_cells=True,
  encoding=None,
  inf_rep='inf',
  verbose=True,
  freeze_panes=None,
  storage_options: 'StorageOptions' = None)


注意点:

要将单个对象写入Excel的.xlsx文件,只需指定目标文件名。但要写入多个工作表,需要创建一个具有目标文件名的“ExcelWriter”对象,并在文件中指定要写入的工作表。


通过指定唯一的“sheet_name”,可以写入多个工作表。


将所有数据写入文件后,需要保存更改。

请注意,使用已存在的文件名创建“ExcelWriter”对象将导致删除现有文件的内容。



参数详解:



01. excel_writer

excel_writer:path like、file like或ExcelWriter对象文件路径或现有ExcelWriter。


02. sheet_name: 'str' = 'Sheet1'

sheet_name:str,默认为“Sheet1”

将包含DataFrame的工作表的名称。


03. na_rep: 'str' = ''

na_rep:str,默认“”

缺少数据表示。


04. float_format: 'Optional[str]' = None

float_format:str,可选

设置浮点数的字符串格式。例如

``float_format=“%.2f”``将设置0.1234到0.12的格式。


05. columns=None

str的序列或列表,可选

要写入的列。

06. header=True

header:bool或str列表,默认为True

写出列名。如果给定了字符串列表,则假定为列名的别名。


07. index=True

index:bool,默认为True

写入行名称(索引)。


08. index_label=None

index_label:str或sequence,可选

索引列的列标签(如果需要)。如果未指定,并且“header”和“index”为True,则使用索引名称。如果DataFrame使用MultiIndex,则应给出序列。


09. startrow=0

startrow:int,默认值0

要转储数据帧的左上单元格行。


10. startcol=0

startcol:int,默认值0

要转储数据帧的左上角单元格列。


11. engine=None

引擎:str,可选

要使用的写入引擎“openpyxl”或“xlsxwriter”。您也可以通过选项`io.exel.xlsx.writer``、`io.excel.xls.writer``和`io.exex.xlsm.writer`设置此选项。


..已弃用::1.2.0

作为`xlwt<https://pypi.org/project/xlwt/>`__包不再维护,“xlwt”引擎将在未来版本的panda中删除。


merge_cells=True

merge_cells:bool,默认为True

将多索引和分层行写入合并单元格。


12. encoding=None

编码:str,可选

生成的excel文件的编码。只有xlwt才需要,其他编写器本机支持unicode。


13. inf_rep='inf'

inf_rep:str,默认为“inf”

无限表示(Excel中没有无限的原生表示)。


14. verbose=True

verbose:bool,默认为True

在错误日志中显示更多信息。


15. freeze_panes=None

冷冻路径:整数元组(长度2),可选

指定要冻结的最底行和最右列。


16. storage_options: 'StorageOptions' = None

storage_options:dict,可选

对特定存储连接有意义的额外选项,例如。


主机、端口、用户名、密码等,如果使用将由`fsspec``解析的URL,例如,开始“s3://”、“gcs://”。如果使用非fsspec URL提供此参数,将引发错误。

请参阅fsspec和后端存储实现文档,以获取一组允许的键和值。


  待续......



目录
相关文章
|
2月前
|
Java 数据处理 索引
(Pandas)Python做数据处理必选框架之一!(二):附带案例分析;刨析DataFrame结构和其属性;学会访问具体元素;判断元素是否存在;元素求和、求标准值、方差、去重、删除、排序...
DataFrame结构 每一列都属于Series类型,不同列之间数据类型可以不一样,但同一列的值类型必须一致。 DataFrame拥有一个总的 idx记录列,该列记录了每一行的索引 在DataFrame中,若列之间的元素个数不匹配,且使用Series填充时,在DataFrame里空值会显示为NaN;当列之间元素个数不匹配,并且不使用Series填充,会报错。在指定了index 属性显示情况下,会按照index的位置进行排序,默认是 [0,1,2,3,...] 从0索引开始正序排序行。
294 0
|
2月前
|
Java 数据挖掘 数据处理
(Pandas)Python做数据处理必选框架之一!(一):介绍Pandas中的两个数据结构;刨析Series:如何访问数据;数据去重、取众数、总和、标准差、方差、平均值等;判断缺失值、获取索引...
Pandas 是一个开源的数据分析和数据处理库,它是基于 Python 编程语言的。 Pandas 提供了易于使用的数据结构和数据分析工具,特别适用于处理结构化数据,如表格型数据(类似于Excel表格)。 Pandas 是数据科学和分析领域中常用的工具之一,它使得用户能够轻松地从各种数据源中导入数据,并对数据进行高效的操作和分析。 Pandas 主要引入了两种新的数据结构:Series 和 DataFrame。
473 0
|
3月前
|
存储 人工智能 测试技术
如何使用LangChain的Python库结合DeepSeek进行多轮次对话?
本文介绍如何使用LangChain结合DeepSeek实现多轮对话,测开人员可借此自动生成测试用例,提升自动化测试效率。
610 125
如何使用LangChain的Python库结合DeepSeek进行多轮次对话?
|
3月前
|
监控 数据可视化 数据挖掘
Python Rich库使用指南:打造更美观的命令行应用
Rich库是Python的终端美化利器,支持彩色文本、智能表格、动态进度条和语法高亮,大幅提升命令行应用的可视化效果与用户体验。
301 0
|
5月前
|
存储 Web App开发 前端开发
Python + Requests库爬取动态Ajax分页数据
Python + Requests库爬取动态Ajax分页数据
|
2月前
|
数据可视化 关系型数据库 MySQL
【可视化大屏】全流程讲解用python的pyecharts库实现拖拽可视化大屏的背后原理,简单粗暴!
本文详解基于Python的电影TOP250数据可视化大屏开发全流程,涵盖爬虫、数据存储、分析及可视化。使用requests+BeautifulSoup爬取数据,pandas存入MySQL,pyecharts实现柱状图、饼图、词云图、散点图等多种图表,并通过Page组件拖拽布局组合成大屏,支持多种主题切换,附完整源码与视频讲解。
294 4
【可视化大屏】全流程讲解用python的pyecharts库实现拖拽可视化大屏的背后原理,简单粗暴!
|
2月前
|
传感器 运维 前端开发
Python离群值检测实战:使用distfit库实现基于分布拟合的异常检测
本文解析异常(anomaly)与新颖性(novelty)检测的本质差异,结合distfit库演示基于概率密度拟合的单变量无监督异常检测方法,涵盖全局、上下文与集体离群值识别,助力构建高可解释性模型。
353 10
Python离群值检测实战:使用distfit库实现基于分布拟合的异常检测
|
4月前
|
运维 Linux 开发者
Linux系统中使用Python的ping3库进行网络连通性测试
以上步骤展示了如何利用 Python 的 `ping3` 库来检测网络连通性,并且提供了基本错误处理方法以确保程序能够优雅地处理各种意外情形。通过简洁明快、易读易懂、实操性强等特点使得该方法非常适合开发者或系统管理员快速集成至自动化工具链之内进行日常运维任务之需求满足。
300 18
|
4月前
|
机器学习/深度学习 API 异构计算
JAX快速上手:从NumPy到GPU加速的Python高性能计算库入门教程
JAX是Google开发的高性能数值计算库,旨在解决NumPy在现代计算需求下的局限性。它不仅兼容NumPy的API,还引入了自动微分、GPU/TPU加速和即时编译(JIT)等关键功能,显著提升了计算效率。JAX适用于机器学习、科学模拟等需要大规模计算和梯度优化的场景,为Python在高性能计算领域开辟了新路径。
447 0
JAX快速上手:从NumPy到GPU加速的Python高性能计算库入门教程
|
4月前
|
数据采集 存储 Web App开发
Python爬虫库性能与选型实战指南:从需求到落地的全链路解析
本文深入解析Python爬虫库的性能与选型策略,涵盖需求分析、技术评估与实战案例,助你构建高效稳定的数据采集系统。
435 0

推荐镜像

更多