Python pandas library | Of all the water in the river, I take only one ladleful (7)


The to_ family of functions: 22 in total (this part covers 12 to 22)

Function12

to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'

Help on function to_numpy in module pandas.core.frame:
to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'
    Convert the DataFrame to a NumPy array.
    By default, the dtype of the returned array will be the common NumPy
    dtype of all types in the DataFrame. For example, if the dtypes are
    ``float16`` and ``float32``, the results dtype will be ``float32``.
    This may require copying data and coercing values, which may be
    expensive.
    Parameters
    ----------
    dtype : str or numpy.dtype, optional
        The dtype to pass to :meth:`numpy.asarray`.
    copy : bool, default False
        Whether to ensure that the returned value is not a view on
        another array. Note that ``copy=False`` does not *ensure* that
        ``to_numpy()`` is no-copy. Rather, ``copy=True`` ensures that
        a copy is made, even if not strictly necessary.
    na_value : Any, optional
        The value to use for missing values. The default value depends
        on `dtype` and the dtypes of the DataFrame columns.
        .. versionadded:: 1.1.0
    Returns
    -------
    numpy.ndarray
    See Also
    --------
    Series.to_numpy : Similar method for Series.
    Examples
    --------
    >>> pd.DataFrame({"A": [1, 2], "B": [3, 4]}).to_numpy()
    array([[1, 3],
           [2, 4]])
    With heterogeneous data, the lowest common type will have to
    be used.
    >>> df = pd.DataFrame({"A": [1, 2], "B": [3.0, 4.5]})
    >>> df.to_numpy()
    array([[1. , 3. ],
           [2. , 4.5]])
    For a mix of numeric and non-numeric types, the output array will
    have object dtype.
    >>> df['C'] = pd.date_range('2000', periods=2)
    >>> df.to_numpy()
    array([[1, 3.0, Timestamp('2000-01-01 00:00:00')],
           [2, 4.5, Timestamp('2000-01-02 00:00:00')]], dtype=object)
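
The `na_value` and `dtype` parameters are described above but not demonstrated. The following is a minimal sketch with made-up data, assuming a frame with one integer column and one float column that contains a missing value:

```python
import numpy as np
import pandas as pd

# Illustrative data: an int column and a float column with one NaN.
df = pd.DataFrame({"A": [1, 2], "B": [3.0, np.nan]})

# The common dtype of int64 and float64 is float64.
# na_value replaces the missing entry in the returned array only.
arr = df.to_numpy(na_value=-1.0)

# An explicit dtype applies to the whole returned array.
arr32 = df.to_numpy(dtype="float32", na_value=0.0)

print(arr)
print(arr32.dtype)
```

The original DataFrame still holds its NaN afterwards; only the returned arrays carry the replacement value.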




Function13

to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'



Help on function to_parquet in module pandas.core.frame:
to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'
    Write a DataFrame to the binary parquet format.
    This function writes the dataframe as a `parquet file
    <https://parquet.apache.org/>`_. You can choose different parquet
    backends, and have the option of compression. See
    :ref:`the user guide <io.parquet>` for more details.
    Parameters
    ----------
    path : str or file-like object, default None
        If a string, it will be used as Root Directory path
        when writing a partitioned dataset. By file-like object,
        we refer to objects with a write() method, such as a file handle
        (e.g. via builtin open function) or io.BytesIO. The engine
        fastparquet does not accept file-like objects. If path is None,
        a bytes object is returned.
        .. versionchanged:: 1.2.0
        Previously this was "fname"
    engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
        Parquet library to use. If 'auto', then the option
        ``io.parquet.engine`` is used. The default ``io.parquet.engine``
        behavior is to try 'pyarrow', falling back to 'fastparquet' if
        'pyarrow' is unavailable.
    compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
        Name of the compression to use. Use ``None`` for no compression.
    index : bool, default None
        If ``True``, include the dataframe's index(es) in the file output.
        If ``False``, they will not be written to the file.
        If ``None``, similar to ``True`` the dataframe's index(es)
        will be saved. However, instead of being saved as values,
        the RangeIndex will be stored as a range in the metadata so it
        doesn't require much space and is faster. Other indexes will
        be included as columns in the file output.
    partition_cols : list, optional, default None
        Column names by which to partition the dataset.
        Columns are partitioned in the order they are given.
        Must be None if path is not a string.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
    **kwargs
        Additional arguments passed to the parquet library. See
        :ref:`pandas io <io.parquet>` for more details.
    Returns
    -------
    bytes if no path argument is provided else None
    See Also
    --------
    read_parquet : Read a parquet file.
    DataFrame.to_csv : Write a csv file.
    DataFrame.to_sql : Write to a sql table.
    DataFrame.to_hdf : Write to hdf.
    Notes
    -----
    This function requires either the `fastparquet
    <https://pypi.org/project/fastparquet>`_ or `pyarrow
    <https://arrow.apache.org/docs/python/>`_ library.
    Examples
    --------
    >>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    >>> df.to_parquet('df.parquet.gzip',
    ...               compression='gzip')  # doctest: +SKIP
    >>> pd.read_parquet('df.parquet.gzip')  # doctest: +SKIP
       col1  col2
    0     1     3
    1     2     4
    If you want to get a buffer to the parquet content you can use a io.BytesIO
    object, as long as you don't use partition_cols, which creates multiple files.
    >>> import io
    >>> f = io.BytesIO()
    >>> df.to_parquet(f)
    >>> f.seek(0)
    0
    >>> content = f.read()



Function14

to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'

Help on function to_period in module pandas.core.frame:
to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
    Convert DataFrame from DatetimeIndex to PeriodIndex.
    Convert DataFrame from DatetimeIndex to PeriodIndex with desired
    frequency (inferred from index if not passed).
    Parameters
    ----------
    freq : str, default
        Frequency of the PeriodIndex.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        The axis to convert (the index by default).
    copy : bool, default True
        If False then underlying input data is not copied.
    Returns
    -------
    DataFrame with PeriodIndex
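
The help text for `to_period` has no example. Here is a small sketch, assuming a daily `DatetimeIndex` and a hypothetical `sales` column, that shows both an explicit monthly frequency and the frequency inferred from the index:

```python
import pandas as pd

# Illustrative frame indexed by three consecutive days.
idx = pd.date_range("2023-01-15", periods=3, freq="D")
df = pd.DataFrame({"sales": [10, 20, 30]}, index=idx)

# Explicit frequency: every timestamp is mapped to its month.
monthly = df.to_period(freq="M")

# No frequency given: it is inferred from the index (daily here).
daily = df.to_period()

print(monthly.index)
print(daily.index)
```

The data values are unchanged; only the index type changes from `DatetimeIndex` to `PeriodIndex`.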



Function15

to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'



Help on function to_pickle in module pandas.core.generic:
to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'
    Pickle (serialize) object to file.
    Parameters
    ----------
    path : str
        File path where the pickled object will be stored.
    compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
        A string representing the compression to use in the output file. By
        default, infers from the file extension in specified path.
        Compression mode may be any of the following possible
        values: {'infer', 'gzip', 'bz2', 'zip', 'xz', None}. If compression
        mode is 'infer' and path_or_buf is path-like, then detect
        compression mode from the following extensions:
        '.gz', '.bz2', '.zip' or '.xz'. (otherwise no compression).
        If dict given and mode is 'zip' or inferred as 'zip', other entries
        passed as additional compression options.
    protocol : int
        Int which indicates which protocol should be used by the pickler,
        default HIGHEST_PROTOCOL (see [1]_ paragraph 12.1.2). The possible
        values are 0, 1, 2, 3, 4, 5. A negative value for the protocol
        parameter is equivalent to setting its value to HIGHEST_PROTOCOL.
        .. [1] https://docs.python.org/3/library/pickle.html.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
    See Also
    --------
    read_pickle : Load pickled pandas object (or any object) from file.
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_sql : Write DataFrame to a SQL database.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    Examples
    --------
    >>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
    >>> original_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> original_df.to_pickle("./dummy.pkl")
    >>> unpickled_df = pd.read_pickle("./dummy.pkl")
    >>> unpickled_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> import os
    >>> os.remove("./dummy.pkl")
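
The default `compression='infer'` picks the codec from the file extension. A minimal sketch, assuming a temporary directory and a hypothetical file name ending in `.gz`, confirms this by checking the gzip magic bytes of the written file:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})

with tempfile.TemporaryDirectory() as tmp:
    # The '.gz' suffix makes compression='infer' choose gzip.
    path = os.path.join(tmp, "dummy.pkl.gz")
    df.to_pickle(path)

    # A gzip stream always starts with the two bytes 0x1f 0x8b.
    with open(path, "rb") as fh:
        magic = fh.read(2)

    # read_pickle infers the compression the same way.
    restored = pd.read_pickle(path)

print(magic)
print(restored.equals(df))
```

As with any pickle, only load files from sources you trust.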




Function16

to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'

Help on function to_records in module pandas.core.frame:
to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'
    Convert DataFrame to a NumPy record array.
    Index will be included as the first field of the record array if
    requested.
    Parameters
    ----------
    index : bool, default True
        Include index in resulting record array, stored in 'index'
        field or using the index label, if set.
    column_dtypes : str, type, dict, default None
        If a string or type, the data type to store all columns. If
        a dictionary, a mapping of column names and indices (zero-indexed)
        to specific data types.
    index_dtypes : str, type, dict, default None
        If a string or type, the data type to store all index levels. If
        a dictionary, a mapping of index level names and indices
        (zero-indexed) to specific data types.
        This mapping is applied only if `index=True`.
    Returns
    -------
    numpy.recarray
        NumPy ndarray with the DataFrame labels as fields and each row
        of the DataFrame as entries.
    See Also
    --------
    DataFrame.from_records: Convert structured or record ndarray
        to DataFrame.
    numpy.recarray: An ndarray that allows field access using
        attributes, analogous to typed columns in a
        spreadsheet.
    Examples
    --------
    >>> df = pd.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]},
    ...                   index=['a', 'b'])
    >>> df
       A     B
    a  1  0.50
    b  2  0.75
    >>> df.to_records()
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])
    If the DataFrame index has no label then the recarray field name
    is set to 'index'. If the index has a label then this is used as the
    field name:
    >>> df.index = df.index.rename("I")
    >>> df.to_records()
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('I', 'O'), ('A', '<i8'), ('B', '<f8')])
    The index can be excluded from the record array:
    >>> df.to_records(index=False)
    rec.array([(1, 0.5 ), (2, 0.75)],
              dtype=[('A', '<i8'), ('B', '<f8')])
    Data types can be specified for the columns:
    >>> df.to_records(column_dtypes={"A": "int32"})
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('I', 'O'), ('A', '<i4'), ('B', '<f8')])
    As well as for the index:
    >>> df.to_records(index_dtypes="<S2")
    rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
              dtype=[('I', 'S2'), ('A', '<i8'), ('B', '<f8')])
    >>> index_dtypes = f"<S{df.index.str.len().max()}"
    >>> df.to_records(index_dtypes=index_dtypes)
    rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
              dtype=[('I', 'S1'), ('A', '<i8'), ('B', '<f8')])
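
The "See Also" entries mention attribute access on a `numpy.recarray` and `DataFrame.from_records`. A short sketch of both, reusing the frame from the docstring examples:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [0.5, 0.75]}, index=["a", "b"])

# Drop the index so the record array has only the two data fields.
rec = df.to_records(index=False)

# Fields can be read as attributes or by name.
col_a = rec.A
col_b = rec["B"]

# Round trip back to a DataFrame (with a fresh default index).
back = pd.DataFrame.from_records(rec)

print(rec.dtype.names)
print(back)
```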



Function17

to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'



Help on function to_sql in module pandas.core.generic:
to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'
    Write records stored in a DataFrame to a SQL database.
    Databases supported by SQLAlchemy [1]_ are supported. Tables can be
    newly created, appended to, or overwritten.
    Parameters
    ----------
    name : str
        Name of SQL table.
    con : sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection
        Using SQLAlchemy makes it possible to use any DB supported by that
        library. Legacy support is provided for sqlite3.Connection objects. The user
        is responsible for engine disposal and connection closure for the SQLAlchemy
        connectable. See `here <https://docs.sqlalchemy.org/en/13/core/connections.html>`_.
    schema : str, optional
        Specify the schema (if database flavor supports this). If None, use
        default schema.
    if_exists : {'fail', 'replace', 'append'}, default 'fail'
        How to behave if the table already exists.
        * fail: Raise a ValueError.
        * replace: Drop the table before inserting new values.
        * append: Insert new values to the existing table.
    index : bool, default True
        Write DataFrame index as a column. Uses `index_label` as the column
        name in the table.
    index_label : str or sequence, default None
        Column label for index column(s). If None is given (default) and
        `index` is True, then the index names are used.
        A sequence should be given if the DataFrame uses MultiIndex.
    chunksize : int, optional
        Specify the number of rows in each batch to be written at a time.
        By default, all rows will be written at once.
    dtype : dict or scalar, optional
        Specifying the datatype for columns. If a dictionary is used, the
        keys should be the column names and the values should be the
        SQLAlchemy types or strings for the sqlite3 legacy mode. If a
        scalar is provided, it will be applied to all columns.
    method : {None, 'multi', callable}, optional
        Controls the SQL insertion clause used:
        * None : Uses standard SQL ``INSERT`` clause (one per row).
        * 'multi': Pass multiple values in a single ``INSERT`` clause.
        * callable with signature ``(pd_table, conn, keys, data_iter)``.
        Details and a sample callable implementation can be found in the
        section :ref:`insert method <io.sql.method>`.
    Raises
    ------
    ValueError
        When the table already exists and `if_exists` is 'fail' (the
        default).
    See Also
    --------
    read_sql : Read a DataFrame from a table.
    Notes
    -----
    Timezone aware datetime columns will be written as
    ``Timestamp with timezone`` type with SQLAlchemy if supported by the
    database. Otherwise, the datetimes will be stored as timezone unaware
    timestamps local to the original timezone.
    References
    ----------
    .. [1] https://docs.sqlalchemy.org
    .. [2] https://www.python.org/dev/peps/pep-0249/
    Examples
    --------
    Create an in-memory SQLite database.
    >>> from sqlalchemy import create_engine
    >>> engine = create_engine('sqlite://', echo=False)
    Create a table from scratch with 3 rows.
    >>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})
    >>> df
         name
    0  User 1
    1  User 2
    2  User 3
    >>> df.to_sql('users', con=engine)
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 1'), (1, 'User 2'), (2, 'User 3')]
    An `sqlalchemy.engine.Connection` can also be passed to `con`:
    >>> with engine.begin() as connection:
    ...     df1 = pd.DataFrame({'name' : ['User 4', 'User 5']})
    ...     df1.to_sql('users', con=connection, if_exists='append')
    This is allowed to support operations that require that the same
    DBAPI connection is used for the entire operation.
    >>> df2 = pd.DataFrame({'name' : ['User 6', 'User 7']})
    >>> df2.to_sql('users', con=engine, if_exists='append')
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 1'), (1, 'User 2'), (2, 'User 3'),
     (0, 'User 4'), (1, 'User 5'), (0, 'User 6'),
     (1, 'User 7')]
    Overwrite the table with just ``df2``.
    >>> df2.to_sql('users', con=engine, if_exists='replace',
    ...            index_label='id')
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 6'), (1, 'User 7')]
    Specify the dtype (especially useful for integers with missing values).
    Notice that while pandas is forced to store the data as floating point,
    the database supports nullable integers. When fetching the data with
    Python, we get back integer scalars.
    >>> df = pd.DataFrame({"A": [1, None, 2]})
    >>> df
         A
    0  1.0
    1  NaN
    2  2.0
    >>> from sqlalchemy.types import Integer
    >>> df.to_sql('integers', con=engine, index=False,
    ...           dtype={"A": Integer()})
    >>> engine.execute("SELECT * FROM integers").fetchall()
    [(1,), (None,), (2,)]
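
The docstring examples rely on `engine.execute`, which was removed in SQLAlchemy 2.0. As an alternative sketch, the legacy `sqlite3.Connection` support mentioned under `con` needs only the standard library. The table name and data below are illustrative:

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({"name": ["User 1", "User 2", "User 3"]})

conn = sqlite3.connect(":memory:")
try:
    # First call creates the table (if_exists defaults to 'fail').
    df.to_sql("users", con=conn, index=False)

    # 'append' adds rows; chunksize writes them in batches of 2.
    df.to_sql("users", con=conn, index=False,
              if_exists="append", chunksize=2)

    rows = conn.execute("SELECT name FROM users").fetchall()

    # Writing again with the default if_exists='fail' raises ValueError.
    try:
        df.to_sql("users", con=conn, index=False)
        raised = False
    except ValueError:
        raised = True
finally:
    conn.close()

print(len(rows), raised)
```

With `index=False` the DataFrame index is not written, so the table has a single `name` column.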



Function18

to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'



Help on function to_stata in module pandas.core.frame:
to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'
    Export DataFrame object to Stata dta format.
    Writes the DataFrame to a Stata dataset file.
    "dta" files contain a Stata dataset.
    Parameters
    ----------
    path : str, buffer or path object
        String, path object (pathlib.Path or py._path.local.LocalPath) or
        object implementing a binary write() function. If using a buffer
        then the buffer will not be automatically closed after the file
        data has been written.
        .. versionchanged:: 1.0.0
        Previously this was "fname"
    convert_dates : dict
        Dictionary mapping columns containing datetime types to stata
        internal format to use when writing the dates. Options are 'tc',
        'td', 'tm', 'tw', 'th', 'tq', 'ty'. Column can be either an integer
        or a name. Datetime columns that do not have a conversion type
        specified will be converted to 'tc'. Raises NotImplementedError if
        a datetime column has timezone information.
    write_index : bool
        Write the index to Stata dataset.
    byteorder : str
        Can be ">", "<", "little", or "big". default is `sys.byteorder`.
    time_stamp : datetime
        A datetime to use as file creation date.  Default is the current
        time.
    data_label : str, optional
        A label for the data set.  Must be 80 characters or smaller.
    variable_labels : dict
        Dictionary containing columns as keys and variable labels as
        values. Each label must be 80 characters or smaller.
    version : {114, 117, 118, 119, None}, default 114
        Version to use in the output dta file. Set to None to let pandas
        decide between 118 or 119 formats depending on the number of
        columns in the frame. Version 114 can be read by Stata 10 and
        later. Version 117 can be read by Stata 13 or later. Version 118
        is supported in Stata 14 and later. Version 119 is supported in
        Stata 15 and later. Version 114 limits string variables to 244
        characters or fewer while versions 117 and later allow strings
        with lengths up to 2,000,000 characters. Versions 118 and 119
        support Unicode characters, and version 119 supports more than
        32,767 variables.
        Version 119 should usually only be used when the number of
        variables exceeds the capacity of dta format 118. Exporting
        smaller datasets in format 119 may have unintended consequences,
        and, as of November 2020, Stata SE cannot read version 119 files.
        .. versionchanged:: 1.0.0
            Added support for formats 118 and 119.
    convert_strl : list, optional
        List of column names to convert to string columns to Stata StrL
        format. Only available if version is 117.  Storing strings in the
        StrL format can produce smaller dta files if strings have more than
        8 characters and values are repeated.
    compression : str or dict, default 'infer'
        For on-the-fly compression of the output dta. If string, specifies
        compression mode. If dict, value at key 'method' specifies
        compression mode. Compression mode must be one of {'infer', 'gzip',
        'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and
        `fname` is path-like, then detect compression from the following
        extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no
        compression). If dict and compression mode is one of {'zip',
        'gzip', 'bz2'}, or inferred as one of the above, other entries
        passed as additional compression options.
        .. versionadded:: 1.1.0
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to ``urllib`` as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        .. versionadded:: 1.2.0
    Raises
    ------
    NotImplementedError
        * If datetimes contain timezone information
        * Column dtype is not representable in Stata
    ValueError
        * Columns listed in convert_dates are neither datetime64[ns]
          or datetime.datetime
        * Column listed in convert_dates is not in DataFrame
        * Categorical label contains more than 32,000 characters
    See Also
    --------
    read_stata : Import Stata data files.
    io.stata.StataWriter : Low-level writer for Stata data files.
    io.stata.StataWriter117 : Low-level writer for version 117 files.
    Examples
    --------
    >>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon',
    ...                               'parrot'],
    ...                    'speed': [350, 18, 361, 15]})
    >>> df.to_stata('animals.dta')  # doctest: +SKIP
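
The docstring example only writes the file. The sketch below, which assumes a temporary directory and a made-up variable label, also reads the file back with `read_stata` and inspects the label through `StataReader`:

```python
import os
import tempfile

import pandas as pd
from pandas.io.stata import StataReader

df = pd.DataFrame({"animal": ["falcon", "parrot"],
                   "speed": [350, 18]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "animals.dta")

    # Skip the index and attach a label to the 'speed' variable.
    df.to_stata(path, write_index=False,
                variable_labels={"speed": "Top speed (km/h)"})

    restored = pd.read_stata(path)

    # The low-level reader exposes the stored variable labels.
    with StataReader(path) as reader:
        labels = reader.variable_labels()

print(restored)
print(labels)
```

Integer columns may come back with a narrower integer dtype than they had in the original frame, so compare values rather than dtypes.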



Function19

to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'



Help on function to_string in module pandas.core.frame:
to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'
    Render a DataFrame to a console-friendly tabular output.
    Parameters
    ----------
    buf : str, Path or StringIO-like, optional, default None
        Buffer to write to. If None, the output is returned as a string.
    columns : sequence, optional, default None
        The subset of columns to write. Writes all columns by default.
    col_space : int, list or dict of int, optional
        The minimum width of each column.
    header : bool or sequence, optional
        Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
    index : bool, optional, default True
        Whether to print index (row) labels.
    na_rep : str, optional, default 'NaN'
        String representation of ``NaN`` to use.
    formatters : list, tuple or dict of one-param. functions, optional
        Formatter functions to apply to columns' elements by position or
        name.
        The result of each function must be a unicode string.
        List/tuple must be of length equal to the number of columns.
    float_format : one-parameter function, optional, default None
        Formatter function to apply to columns' elements if they are
        floats. This function must return a unicode string and will be
        applied only to the non-``NaN`` elements, with ``NaN`` being
        handled by ``na_rep``.
        .. versionchanged:: 1.2.0
    sparsify : bool, optional, default True
        Set to False for a DataFrame with a hierarchical index to print
        every multiindex key at each row.
    index_names : bool, optional, default True
        Prints the names of the indexes.
    justify : str, default None
        How to justify the column labels. If None uses the option from
        the print configuration (controlled by set_option), 'right' out
        of the box. Valid values are
        * left
        * right
        * center
        * justify
        * justify-all
        * start
        * end
        * inherit
        * match-parent
        * initial
        * unset.
    max_rows : int, optional
        Maximum number of rows to display in the console.
    min_rows : int, optional
        The number of rows to display in the console in a truncated repr
        (when number of rows is above `max_rows`).
    max_cols : int, optional
        Maximum number of columns to display in the console.
    show_dimensions : bool, default False
        Display DataFrame dimensions (number of rows by number of columns).
    decimal : str, default '.'
        Character recognized as decimal separator, e.g. ',' in Europe.
    line_width : int, optional
        Width to wrap a line in characters.
    max_colwidth : int, optional
        Max width to truncate each column in characters. By default, no limit.
        .. versionadded:: 1.0.0
    encoding : str, default "utf-8"
        Set character encoding.
        .. versionadded:: 1.0
    Returns
    -------
    str or None
        If buf is None, returns the result as a string. Otherwise returns
        None.
    See Also
    --------
    to_html : Convert DataFrame to HTML.
    Examples
    --------
    >>> d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
    >>> df = pd.DataFrame(d)
    >>> print(df.to_string())
       col1  col2
    0     1     4
    1     2     5
    2     3     6
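
Several of the formatting parameters above work together. A brief sketch with illustrative data, showing `index`, `na_rep`, `float_format` and `buf`:

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({"item": ["a", "b"], "price": [1.5, np.nan]})

# Hide the index, show missing values as '-', and format floats
# with two decimals (float_format is applied to non-NaN values only).
text = df.to_string(index=False, na_rep="-",
                    float_format="{:.2f}".format)

# When a buffer is given, the text is written there and None is returned.
buf = io.StringIO()
result = df.to_string(buf=buf)

print(text)
print(result)
```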



Function20

to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'

Help on function to_timestamp in module pandas.core.frame:
to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
    Cast to DatetimeIndex of timestamps, at *beginning* of period.
    Parameters
    ----------
    freq : str, default frequency of PeriodIndex
        Desired frequency.
    how : {'s', 'e', 'start', 'end'}
        Convention for converting period to timestamp; start of period
        vs. end.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        The axis to convert (the index by default).
    copy : bool, default True
        If False then underlying input data is not copied.
    Returns
    -------
    DataFrame with DatetimeIndex
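
The help text for `to_timestamp` has no example either. A minimal sketch, assuming a monthly `PeriodIndex` and a hypothetical `sales` column, compares the default `how='start'` with `how='end'`:

```python
import pandas as pd

# Illustrative frame indexed by three monthly periods.
pidx = pd.period_range("2023-01", periods=3, freq="M")
df = pd.DataFrame({"sales": [10, 20, 30]}, index=pidx)

# Default: each period maps to its first instant.
start = df.to_timestamp()

# how='end': each period maps to its last instant.
end = df.to_timestamp(how="end")

print(start.index)
print(end.index)
```

With `how='end'` the timestamps fall at the very end of the last day of each month, so calling `normalize()` on the index is a convenient way to get plain dates.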



Function21

to_xarray(self)

Help on function to_xarray in module pandas.core.generic:
to_xarray(self)
    Return an xarray object from the pandas object.
    Returns
    -------
    xarray.DataArray or xarray.Dataset
        Data in the pandas structure converted to Dataset if the object is
        a DataFrame, or a DataArray if the object is a Series.
    See Also
    --------
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    Notes
    -----
    See the `xarray docs <https://xarray.pydata.org/en/stable/>`__
    Examples
    --------
    >>> df = pd.DataFrame([('falcon', 'bird', 389.0, 2),
    ...                    ('parrot', 'bird', 24.0, 2),
    ...                    ('lion', 'mammal', 80.5, 4),
    ...                    ('monkey', 'mammal', np.nan, 4)],
    ...                   columns=['name', 'class', 'max_speed',
    ...                            'num_legs'])
    >>> df
         name   class  max_speed  num_legs
    0  falcon    bird      389.0         2
    1  parrot    bird       24.0         2
    2    lion  mammal       80.5         4
    3  monkey  mammal        NaN         4
    >>> df.to_xarray()
    <xarray.Dataset>
    Dimensions:    (index: 4)
    Coordinates:
      * index      (index) int64 0 1 2 3
    Data variables:
        name       (index) object 'falcon' 'parrot' 'lion' 'monkey'
        class      (index) object 'bird' 'bird' 'mammal' 'mammal'
        max_speed  (index) float64 389.0 24.0 80.5 nan
        num_legs   (index) int64 2 2 4 4
    >>> df['max_speed'].to_xarray()
    <xarray.DataArray 'max_speed' (index: 4)>
    array([389. ,  24. ,  80.5,   nan])
    Coordinates:
      * index    (index) int64 0 1 2 3
    >>> dates = pd.to_datetime(['2018-01-01', '2018-01-01',
    ...                         '2018-01-02', '2018-01-02'])
    >>> df_multiindex = pd.DataFrame({'date': dates,
    ...                               'animal': ['falcon', 'parrot',
    ...                                          'falcon', 'parrot'],
    ...                               'speed': [350, 18, 361, 15]})
    >>> df_multiindex = df_multiindex.set_index(['date', 'animal'])
    >>> df_multiindex
                       speed
    date       animal
    2018-01-01 falcon    350
               parrot     18
    2018-01-02 falcon    361
               parrot     15
    >>> df_multiindex.to_xarray()
    <xarray.Dataset>
    Dimensions:  (animal: 2, date: 2)
    Coordinates:
      * date     (date) datetime64[ns] 2018-01-01 2018-01-02
      * animal   (animal) object 'falcon' 'parrot'
    Data variables:
        speed    (date, animal) int64 350 18 361 15




Function22

to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None'
    Help on function to_xml in module pandas.core.frame:
    to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None'
        Render a DataFrame to an XML document.
        .. versionadded:: 1.3.0
        Parameters
        ----------
        path_or_buffer : str, path object or file-like object, optional
            File to write output to. If None, the output is returned as a
            string.
        index : bool, default True
            Whether to include index in XML document.
        root_name : str, default 'data'
            The name of root element in XML document.
        row_name : str, default 'row'
            The name of row element in XML document.
        na_rep : str, optional
            Missing data representation.
        attr_cols : list-like, optional
            List of columns to write as attributes in row element.
            Hierarchical columns will be flattened with underscore
            delimiting the different levels.
        elem_cols : list-like, optional
            List of columns to write as children in row element. By default,
            all columns output as children of row element. Hierarchical
            columns will be flattened with underscore delimiting the
            different levels.
        namespaces : dict, optional
            All namespaces to be defined in root element. Keys of dict
            should be prefix names and values of dict corresponding URIs.
            Default namespaces should be given empty string key. For
            example, ::
                namespaces = {"": "https://example.com"}
        prefix : str, optional
            Namespace prefix to be used for every element and/or attribute
            in document. This should be one of the keys in ``namespaces``
            dict.
        encoding : str, default 'utf-8'
            Encoding of the resulting document.
        xml_declaration : bool, default True
            Whether to include the XML declaration at start of document.
        pretty_print : bool, default True
            Whether output should be pretty printed with indentation and
            line breaks.
        parser : {'lxml','etree'}, default 'lxml'
            Parser module to use for building of tree. Only 'lxml' and
            'etree' are supported. With 'lxml', the ability to use XSLT
            stylesheet is supported.
        stylesheet : str, path object or file-like object, optional
            A URL, file-like object, or a raw string containing an XSLT
            script used to transform the raw XML output. Script should use
            layout of elements and attributes from original output. This
            argument requires ``lxml`` to be installed. Only XSLT 1.0
            scripts and not later versions is currently supported.
        compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
            For on-the-fly compression of the output data. If 'infer', then use
            gzip, bz2, zip or xz if path_or_buffer is a string ending in
            '.gz', '.bz2', '.zip', or '.xz', respectively, and no compression
            otherwise. Set to None for no compression.
        storage_options : dict, optional
            Extra options that make sense for a particular storage connection, e.g.
            host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
            are forwarded to ``urllib`` as header options. For other URLs (e.g.
            starting with "s3://", and "gcs://") the key-value pairs are forwarded to
            ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.
        Returns
        -------
        None or str
            If ``path_or_buffer`` is None, returns the resulting XML format
            as a string. Otherwise returns None.
        See Also
        --------
        to_json : Convert the pandas object to a JSON string.
        to_html : Convert DataFrame to a html.
        Examples
        --------
        >>> df = pd.DataFrame({'shape': ['square', 'circle', 'triangle'],
        ...                    'degrees': [360, 360, 180],
        ...                    'sides': [4, np.nan, 3]})
        >>> df.to_xml()  # doctest: +SKIP
        <?xml version='1.0' encoding='utf-8'?>
        <data>
          <row>
            <index>0</index>
            <shape>square</shape>
            <degrees>360</degrees>
            <sides>4.0</sides>
          </row>
          <row>
            <index>1</index>
            <shape>circle</shape>
            <degrees>360</degrees>
            <sides/>
          </row>
          <row>
            <index>2</index>
            <shape>triangle</shape>
            <degrees>180</degrees>
            <sides>3.0</sides>
          </row>
        </data>
        >>> df.to_xml(attr_cols=[
        ...           'index', 'shape', 'degrees', 'sides'
        ...           ])  # doctest: +SKIP
        <?xml version='1.0' encoding='utf-8'?>
        <data>
          <row index="0" shape="square" degrees="360" sides="4.0"/>
          <row index="1" shape="circle" degrees="360"/>
          <row index="2" shape="triangle" degrees="180" sides="3.0"/>
        </data>
        >>> df.to_xml(namespaces={"doc": "https://example.com"},
        ...           prefix="doc")  # doctest: +SKIP
        <?xml version='1.0' encoding='utf-8'?>
        <doc:data xmlns:doc="https://example.com">
          <doc:row>
            <doc:index>0</doc:index>
            <doc:shape>square</doc:shape>
            <doc:degrees>360</doc:degrees>
            <doc:sides>4.0</doc:sides>
          </doc:row>
          <doc:row>
            <doc:index>1</doc:index>
            <doc:shape>circle</doc:shape>
            <doc:degrees>360</doc:degrees>
            <doc:sides/>
          </doc:row>
          <doc:row>
            <doc:index>2</doc:index>
            <doc:shape>triangle</doc:shape>
            <doc:degrees>180</doc:degrees>
            <doc:sides>3.0</doc:sides>
          </doc:row>
        </doc:data>
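
As a quick check on the parameters described above, here is a minimal sketch that renders a small DataFrame with custom root and row names and parses the result back with the standard library. The names "shapes" and "item" are made up for illustration, and parser="etree" is chosen only so that lxml does not have to be installed.

```python
import xml.etree.ElementTree as ET

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "shape": ["square", "circle", "triangle"],
    "degrees": [360, 360, 180],
    "sides": [4, np.nan, 3],
})

# parser="etree" relies on the standard library only (no lxml needed).
# "shapes" and "item" are illustrative names, not pandas defaults.
xml_text = df.to_xml(
    index=False,
    root_name="shapes",
    row_name="item",
    xml_declaration=False,
    parser="etree",
)

# Parse the string back and turn every row element into a dict.
root = ET.fromstring(xml_text)
rows = [{child.tag: child.text for child in row} for row in root]

print(root.tag, len(rows))
print(rows[0])
```

Every value comes back as text (or None for the empty element), so any numeric conversion is left to the reader of the XML.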


df.to_excel

pd.DataFrame.to_excel() is the counterpart of pd.read_excel(), so it is worth a closer look here.

Parameters and their default values:

to_excel(
  excel_writer,
  sheet_name: 'str' = 'Sheet1',
  na_rep: 'str' = '',
  float_format: 'Optional[str]' = None,
  columns=None,
  header=True,
  index=True,
  index_label=None,
  startrow=0,
  startcol=0,
  engine=None,
  merge_cells=True,
  encoding=None,
  inf_rep='inf',
  verbose=True,
  freeze_panes=None,
  storage_options: 'StorageOptions' = None)


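Notes:

To write a single object to an Excel .xlsx file, it is enough to specify the target file name. To write to multiple sheets, create an ExcelWriter object with the target file name and specify the sheet to write to in each call.


Multiple sheets can be written by giving each one a unique sheet_name.


Once all the data has been written to the file, the changes must be saved.

Note that creating an ExcelWriter object with a file name that already exists erases the contents of the existing file.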
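
The notes above can be sketched as follows. This is only an illustration: the file name report.xlsx and the sheet names are made up, and the openpyxl package is assumed to be installed. Using ExcelWriter as a context manager saves the workbook when the block exits.

```python
import os
import tempfile

import pandas as pd

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"c": ["x", "y"]})

# Hypothetical target file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "report.xlsx")

# One writer, one unique sheet_name per DataFrame.
# The workbook is saved and closed when the with-block exits.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    df1.to_excel(writer, sheet_name="numbers", index=False)
    df2.to_excel(writer, sheet_name="letters", index=False)

# sheet_name=None reads every sheet into a dict keyed by sheet name.
sheets = pd.read_excel(path, sheet_name=None)
print(list(sheets))
```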



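Parameter details:



01. excel_writer

excel_writer: path-like, file-like, or ExcelWriter object. The file path or an existing ExcelWriter.


02. sheet_name: 'str' = 'Sheet1'

sheet_name: str, default 'Sheet1'

Name of the sheet that will contain the DataFrame.


03. na_rep: 'str' = ''

na_rep: str, default ''

Missing data representation.


04. float_format: 'Optional[str]' = None

float_format: str, optional

Format string for floating point numbers. For example,

``float_format="%.2f"`` will format 0.1234 as 0.12.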
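
A small sketch of na_rep and float_format working together, assuming openpyxl is installed. The placeholder text "missing" and the file name are made up for this example; "missing" is used instead of something like "NA" so that read_excel does not turn it back into NaN.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [0.1234, np.nan, 2.5]})

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "formatted.xlsx")

df.to_excel(
    path,
    index=False,
    na_rep="missing",      # text written in place of NaN
    float_format="%.2f",   # floats are rounded to two decimals
    engine="openpyxl",
)

result = pd.read_excel(path)
print(result)
```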


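05. columns=None

columns: sequence or list of str, optional

Columns to write.

06. header=True

header: bool or list of str, default True

Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.


07. index=True

index: bool, default True

Write row names (index).


08. index_label=None

index_label: str or sequence, optional

Column label for the index column(s), if desired. If not specified, and header and index are True, the index names are used. A sequence should be given if the DataFrame uses a MultiIndex.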
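
The four parameters columns, header, index and index_label are easiest to understand together. The sketch below assumes openpyxl is installed; the aliases "Name" and "Top speed", the label "id" and the file name are invented for illustration.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {
        "name": ["falcon", "parrot"],
        "class": ["bird", "bird"],
        "max_speed": [389.0, 24.0],
    },
    index=["r1", "r2"],
)

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "subset.xlsx")

df.to_excel(
    path,
    columns=["name", "max_speed"],   # write only these two columns
    header=["Name", "Top speed"],    # aliases, one per written column
    index=True,                      # keep the row labels
    index_label="id",                # header text for the index column
    engine="openpyxl",
)

result = pd.read_excel(path)
print(result)
```

Note that the header list has to contain one alias for each column that is actually written, not for each column of the original DataFrame.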


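09. startrow=0

startrow: int, default 0

Row of the upper left cell where the DataFrame is written.


10. startcol=0

startcol: int, default 0

Column of the upper left cell where the DataFrame is written.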
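
Both offsets are zero-based, which is easy to get wrong. As a rough illustration (openpyxl assumed to be installed, file name made up), startrow=2 and startcol=1 place the header in cell B3:

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "offset.xlsx")

# Zero-based offsets: skip two rows and one column.
df.to_excel(path, index=False, startrow=2, startcol=1, engine="openpyxl")

# Inspect the raw cells with openpyxl to see where the data landed.
ws = load_workbook(path)["Sheet1"]
print(ws["B3"].value, ws["C3"].value)  # header cells
print(ws["B4"].value, ws["C4"].value)  # first data row
```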


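11. engine=None

engine: str, optional

Write engine to use, 'openpyxl' or 'xlsxwriter'. You can also set this via the options ``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and ``io.excel.xlsm.writer``.


.. deprecated:: 1.2.0

As the `xlwt <https://pypi.org/project/xlwt/>`__ package is no longer maintained, the 'xlwt' engine will be removed in a future version of pandas.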


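12. merge_cells=True

merge_cells: bool, default True

Write MultiIndex and hierarchical rows as merged cells.


13. encoding=None

encoding: str, optional

Encoding of the resulting Excel file. Only necessary for xlwt; the other writers support Unicode natively.


14. inf_rep='inf'

inf_rep: str, default 'inf'

Representation for infinity (there is no native representation for infinity in Excel).


15. verbose=True

verbose: bool, default True

Display more information in the error logs.


16. freeze_panes=None

freeze_panes: tuple of int (length 2), optional

Specifies the one-based bottommost row and rightmost column to freeze.


17. storage_options: 'StorageOptions' = None

storage_options: dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc., if using a URL that will be parsed by ``fsspec``, e.g. one starting with "s3://" or "gcs://". An error will be raised if this argument is provided with a non-fsspec URL.

See the fsspec and backend storage implementation docs for the set of allowed keys and values.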
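
To round off the parameter list, here is a tentative sketch of freeze_panes and inf_rep, again assuming openpyxl is installed. The replacement text "INF" and the file name are made up. With freeze_panes=(1, 0) only the header row is frozen, which openpyxl reports as the top-left unfrozen cell A2.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from openpyxl import load_workbook

df = pd.DataFrame({"x": [1.0, np.inf]})

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "frozen.xlsx")

df.to_excel(
    path,
    index=False,
    inf_rep="INF",          # text written in place of infinity
    freeze_panes=(1, 0),    # freeze the first row, no columns
    engine="openpyxl",
)

ws = load_workbook(path)["Sheet1"]
print(ws.freeze_panes)   # top-left cell of the scrollable area
print(ws["A3"].value)    # the cell that held np.inf
```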


To be continued...


