The to_* family of functions: 22 in total (this part covers functions 12-22)
Function12
to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'
Help on function to_numpy in module pandas.core.frame: to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray' Convert the DataFrame to a NumPy array. By default, the dtype of the returned array will be the common NumPy dtype of all types in the DataFrame. For example, if the dtypes are ``float16`` and ``float32``, the results dtype will be ``float32``. This may require copying data and coercing values, which may be expensive. Parameters ---------- dtype : str or numpy.dtype, optional The dtype to pass to :meth:`numpy.asarray`. copy : bool, default False Whether to ensure that the returned value is not a view on another array. Note that ``copy=False`` does not *ensure* that ``to_numpy()`` is no-copy. Rather, ``copy=True`` ensure that a copy is made, even if not strictly necessary. na_value : Any, optional The value to use for missing values. The default value depends on `dtype` and the dtypes of the DataFrame columns. .. versionadded:: 1.1.0 Returns ------- numpy.ndarray See Also -------- Series.to_numpy : Similar method for Series. Examples -------- >>> pd.DataFrame({"A": [1, 2], "B": [3, 4]}).to_numpy() array([[1, 3], [2, 4]]) With heterogeneous data, the lowest common type will have to be used. >>> df = pd.DataFrame({"A": [1, 2], "B": [3.0, 4.5]}) >>> df.to_numpy() array([[1. , 3. ], [2. , 4.5]]) For a mix of numeric and non-numeric types, the output array will have object dtype. >>> df['C'] = pd.date_range('2000', periods=2) >>> df.to_numpy() array([[1, 3.0, Timestamp('2000-01-01 00:00:00')], [2, 4.5, Timestamp('2000-01-02 00:00:00')]], dtype=object)
Function13
to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'
Help on function to_parquet in module pandas.core.frame: to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None' Write a DataFrame to the binary parquet format. This function writes the dataframe as a `parquet file <https://parquet.apache.org/>`_. You can choose different parquet backends, and have the option of compression. See :ref:`the user guide <io.parquet>` for more details. Parameters ---------- path : str or file-like object, default None If a string, it will be used as Root Directory path when writing a partitioned dataset. By file-like object, we refer to objects with a write() method, such as a file handle (e.g. via builtin open function) or io.BytesIO. The engine fastparquet does not accept file-like objects. If path is None, a bytes object is returned. .. versionchanged:: 1.2.0 Previously this was "fname" engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto' Parquet library to use. If 'auto', then the option ``io.parquet.engine`` is used. The default ``io.parquet.engine`` behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable. compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy' Name of the compression to use. Use ``None`` for no compression. index : bool, default None If ``True``, include the dataframe's index(es) in the file output. If ``False``, they will not be written to the file. If ``None``, similar to ``True`` the dataframe's index(es) will be saved. However, instead of being saved as values, the RangeIndex will be stored as a range in the metadata so it doesn't require much space and is faster. Other indexes will be included as columns in the file output. partition_cols : list, optional, default None Column names by which to partition the dataset. Columns are partitioned in the order they are given. Must be None if path is not a string. storage_options : dict, optional Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to ``urllib`` as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details. .. versionadded:: 1.2.0 **kwargs Additional arguments passed to the parquet library. See :ref:`pandas io <io.parquet>` for more details. Returns ------- bytes if no path argument is provided else None See Also -------- read_parquet : Read a parquet file. DataFrame.to_csv : Write a csv file. DataFrame.to_sql : Write to a sql table. DataFrame.to_hdf : Write to hdf. Notes ----- This function requires either the `fastparquet <https://pypi.org/project/fastparquet>`_ or `pyarrow <https://arrow.apache.org/docs/python/>`_ library. Examples -------- >>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]}) >>> df.to_parquet('df.parquet.gzip', ... compression='gzip') # doctest: +SKIP >>> pd.read_parquet('df.parquet.gzip') # doctest: +SKIP col1 col2 0 1 3 1 2 4 If you want to get a buffer to the parquet content you can use a io.BytesIO object, as long as you don't use partition_cols, which creates multiple files. >>> import io >>> f = io.BytesIO() >>> df.to_parquet(f) >>> f.seek(0) 0 >>> content = f.read()
Function14
to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
Help on function to_period in module pandas.core.frame: to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame' Convert DataFrame from DatetimeIndex to PeriodIndex. Convert DataFrame from DatetimeIndex to PeriodIndex with desired frequency (inferred from index if not passed). Parameters ---------- freq : str, default Frequency of the PeriodIndex. axis : {0 or 'index', 1 or 'columns'}, default 0 The axis to convert (the index by default). copy : bool, default True If False then underlying input data is not copied. Returns ------- DataFrame with PeriodIndex
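The help text above has no Examples section, so here is a minimal sketch of converting a DatetimeIndex-indexed frame to a PeriodIndex (the column name, values, and frequency are illustrative assumptions):

```python
import pandas as pd

# Hypothetical example: a small frame indexed by month-start dates
idx = pd.date_range("2021-01-01", periods=3, freq="MS")
df = pd.DataFrame({"sales": [10, 20, 30]}, index=idx)

# Convert the DatetimeIndex to a PeriodIndex with monthly frequency
pdf = df.to_period(freq="M")
print(pdf.index)  # monthly PeriodIndex: ['2021-01', '2021-02', '2021-03']
```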
Function15
to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'
Help on function to_pickle in module pandas.core.generic: to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None' Pickle (serialize) object to file. Parameters ---------- path : str File path where the pickled object will be stored. compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer' A string representing the compression to use in the output file. By default, infers from the file extension in specified path. Compression mode may be any of the following possible values: {'infer', 'gzip', 'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and path_or_buf is path-like, then detect compression mode from the following extensions: '.gz', '.bz2', '.zip' or '.xz'. (otherwise no compression). If dict given and mode is 'zip' or inferred as 'zip', other entries passed as additional compression options. protocol : int Int which indicates which protocol should be used by the pickler, default HIGHEST_PROTOCOL (see [1]_ paragraph 12.1.2). The possible values are 0, 1, 2, 3, 4, 5. A negative value for the protocol parameter is equivalent to setting its value to HIGHEST_PROTOCOL. .. [1] https://docs.python.org/3/library/pickle.html. storage_options : dict, optional Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to ``urllib`` as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details. .. versionadded:: 1.2.0 See Also -------- read_pickle : Load pickled pandas object (or any object) from file. DataFrame.to_hdf : Write DataFrame to an HDF5 file. DataFrame.to_sql : Write DataFrame to a SQL database. DataFrame.to_parquet : Write a DataFrame to the binary parquet format. Examples -------- >>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)}) >>> original_df foo bar 0 0 5 1 1 6 2 2 7 3 3 8 4 4 9 >>> original_df.to_pickle("./dummy.pkl") >>> unpickled_df = pd.read_pickle("./dummy.pkl") >>> unpickled_df foo bar 0 0 5 1 1 6 2 2 7 3 3 8 4 4 9 >>> import os >>> os.remove("./dummy.pkl")
Function16
to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'
Help on function to_records in module pandas.core.frame: to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray' Convert DataFrame to a NumPy record array. Index will be included as the first field of the record array if requested. Parameters ---------- index : bool, default True Include index in resulting record array, stored in 'index' field or using the index label, if set. column_dtypes : str, type, dict, default None If a string or type, the data type to store all columns. If a dictionary, a mapping of column names and indices (zero-indexed) to specific data types. index_dtypes : str, type, dict, default None If a string or type, the data type to store all index levels. If a dictionary, a mapping of index level names and indices (zero-indexed) to specific data types. This mapping is applied only if `index=True`. Returns ------- numpy.recarray NumPy ndarray with the DataFrame labels as fields and each row of the DataFrame as entries. See Also -------- DataFrame.from_records: Convert structured or record ndarray to DataFrame. numpy.recarray: An ndarray that allows field access using attributes, analogous to typed columns in a spreadsheet. Examples -------- >>> df = pd.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]}, ... index=['a', 'b']) >>> df A B a 1 0.50 b 2 0.75 >>> df.to_records() rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)], dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')]) If the DataFrame index has no label then the recarray field name is set to 'index'. If the index has a label then this is used as the field name: >>> df.index = df.index.rename("I") >>> df.to_records() rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)], dtype=[('I', 'O'), ('A', '<i8'), ('B', '<f8')]) The index can be excluded from the record array: >>> df.to_records(index=False) rec.array([(1, 0.5 ), (2, 0.75)], dtype=[('A', '<i8'), ('B', '<f8')]) Data types can be specified for the columns: >>> df.to_records(column_dtypes={"A": "int32"}) rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)], dtype=[('I', 'O'), ('A', '<i4'), ('B', '<f8')]) As well as for the index: >>> df.to_records(index_dtypes="<S2") rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)], dtype=[('I', 'S2'), ('A', '<i8'), ('B', '<f8')]) >>> index_dtypes = f"<S{df.index.str.len().max()}" >>> df.to_records(index_dtypes=index_dtypes) rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)], dtype=[('I', 'S1'), ('A', '<i8'), ('B', '<f8')])
Function17
to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'
Help on function to_sql in module pandas.core.generic: to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None' Write records stored in a DataFrame to a SQL database. Databases supported by SQLAlchemy [1]_ are supported. Tables can be newly created, appended to, or overwritten. Parameters ---------- name : str Name of SQL table. con : sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection Using SQLAlchemy makes it possible to use any DB supported by that library. Legacy support is provided for sqlite3.Connection objects. The user is responsible for engine disposal and connection closure for the SQLAlchemy connectable See `here <https://docs.sqlalchemy.org/en/13/core/connections.html>`_. schema : str, optional Specify the schema (if database flavor supports this). If None, use default schema. if_exists : {'fail', 'replace', 'append'}, default 'fail' How to behave if the table already exists. * fail: Raise a ValueError. * replace: Drop the table before inserting new values. * append: Insert new values to the existing table. index : bool, default True Write DataFrame index as a column. Uses `index_label` as the column name in the table. index_label : str or sequence, default None Column label for index column(s). If None is given (default) and `index` is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex. chunksize : int, optional Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once. dtype : dict or scalar, optional Specifying the datatype for columns. If a dictionary is used, the keys should be the column names and the values should be the SQLAlchemy types or strings for the sqlite3 legacy mode. If a scalar is provided, it will be applied to all columns. method : {None, 'multi', callable}, optional Controls the SQL insertion clause used: * None : Uses standard SQL ``INSERT`` clause (one per row). * 'multi': Pass multiple values in a single ``INSERT`` clause. * callable with signature ``(pd_table, conn, keys, data_iter)``. Details and a sample callable implementation can be found in the section :ref:`insert method <io.sql.method>`. Raises ------ ValueError When the table already exists and `if_exists` is 'fail' (the default). See Also -------- read_sql : Read a DataFrame from a table. Notes ----- Timezone aware datetime columns will be written as ``Timestamp with timezone`` type with SQLAlchemy if supported by the database. Otherwise, the datetimes will be stored as timezone unaware timestamps local to the original timezone. References ---------- .. [1] https://docs.sqlalchemy.org .. [2] https://www.python.org/dev/peps/pep-0249/ Examples -------- Create an in-memory SQLite database. >>> from sqlalchemy import create_engine >>> engine = create_engine('sqlite://', echo=False) Create a table from scratch with 3 rows. >>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']}) >>> df name 0 User 1 1 User 2 2 User 3 >>> df.to_sql('users', con=engine) >>> engine.execute("SELECT * FROM users").fetchall() [(0, 'User 1'), (1, 'User 2'), (2, 'User 3')] An `sqlalchemy.engine.Connection` can also be passed to `con`: >>> with engine.begin() as connection: ... df1 = pd.DataFrame({'name' : ['User 4', 'User 5']}) ... 
df1.to_sql('users', con=connection, if_exists='append') This is allowed to support operations that require that the same DBAPI connection is used for the entire operation. >>> df2 = pd.DataFrame({'name' : ['User 6', 'User 7']}) >>> df2.to_sql('users', con=engine, if_exists='append') >>> engine.execute("SELECT * FROM users").fetchall() [(0, 'User 1'), (1, 'User 2'), (2, 'User 3'), (0, 'User 4'), (1, 'User 5'), (0, 'User 6'), (1, 'User 7')] Overwrite the table with just ``df2``. >>> df2.to_sql('users', con=engine, if_exists='replace', ... index_label='id') >>> engine.execute("SELECT * FROM users").fetchall() [(0, 'User 6'), (1, 'User 7')] Specify the dtype (especially useful for integers with missing values). Notice that while pandas is forced to store the data as floating point, the database supports nullable integers. When fetching the data with Python, we get back integer scalars. >>> df = pd.DataFrame({"A": [1, None, 2]}) >>> df A 0 1.0 1 NaN 2 2.0 >>> from sqlalchemy.types import Integer >>> df.to_sql('integers', con=engine, index=False, ... dtype={"A": Integer()}) >>> engine.execute("SELECT * FROM integers").fetchall() [(1,), (None,), (2,)]
Function18
to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'
Help on function to_stata in module pandas.core.frame: to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None' Export DataFrame object to Stata dta format. Writes the DataFrame to a Stata dataset file. "dta" files contain a Stata dataset. Parameters ---------- path : str, buffer or path object String, path object (pathlib.Path or py._path.local.LocalPath) or object implementing a binary write() function. If using a buffer then the buffer will not be automatically closed after the file data has been written. .. versionchanged:: 1.0.0 Previously this was "fname" convert_dates : dict Dictionary mapping columns containing datetime types to stata internal format to use when writing the dates. Options are 'tc', 'td', 'tm', 'tw', 'th', 'tq', 'ty'. Column can be either an integer or a name. Datetime columns that do not have a conversion type specified will be converted to 'tc'. Raises NotImplementedError if a datetime column has timezone information. write_index : bool Write the index to Stata dataset. byteorder : str Can be ">", "<", "little", or "big". default is `sys.byteorder`. time_stamp : datetime A datetime to use as file creation date. Default is the current time. data_label : str, optional A label for the data set. Must be 80 characters or smaller. variable_labels : dict Dictionary containing columns as keys and variable labels as values. Each label must be 80 characters or smaller. version : {114, 117, 118, 119, None}, default 114 Version to use in the output dta file. Set to None to let pandas decide between 118 or 119 formats depending on the number of columns in the frame. Version 114 can be read by Stata 10 and later. Version 117 can be read by Stata 13 or later. Version 118 is supported in Stata 14 and later. Version 119 is supported in Stata 15 and later. Version 114 limits string variables to 244 characters or fewer while versions 117 and later allow strings with lengths up to 2,000,000 characters. Versions 118 and 119 support Unicode characters, and version 119 supports more than 32,767 variables. Version 119 should usually only be used when the number of variables exceeds the capacity of dta format 118. Exporting smaller datasets in format 119 may have unintended consequences, and, as of November 2020, Stata SE cannot read version 119 files. .. versionchanged:: 1.0.0 Added support for formats 118 and 119. convert_strl : list, optional List of column names to convert to string columns to Stata StrL format. Only available if version is 117. Storing strings in the StrL format can produce smaller dta files if strings have more than 8 characters and values are repeated. compression : str or dict, default 'infer' For on-the-fly compression of the output dta. If string, specifies compression mode. If dict, value at key 'method' specifies compression mode. Compression mode must be one of {'infer', 'gzip', 'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and `fname` is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no compression). 
If dict and compression mode is one of {'zip', 'gzip', 'bz2'}, or inferred as one of the above, other entries passed as additional compression options. .. versionadded:: 1.1.0 storage_options : dict, optional Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to ``urllib`` as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details. .. versionadded:: 1.2.0 Raises ------ NotImplementedError * If datetimes contain timezone information * Column dtype is not representable in Stata ValueError * Columns listed in convert_dates are neither datetime64[ns] or datetime.datetime * Column listed in convert_dates is not in DataFrame * Categorical label contains more than 32,000 characters See Also -------- read_stata : Import Stata data files. io.stata.StataWriter : Low-level writer for Stata data files. io.stata.StataWriter117 : Low-level writer for version 117 files. Examples -------- >>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon', ... 'parrot'], ... 'speed': [350, 18, 361, 15]}) >>> df.to_stata('animals.dta') # doctest: +SKIP
Function19
to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'
Help on function to_string in module pandas.core.frame: to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None' Render a DataFrame to a console-friendly tabular output. Parameters ---------- buf : str, Path or StringIO-like, optional, default None Buffer to write to. If None, the output is returned as a string. columns : sequence, optional, default None The subset of columns to write. Writes all columns by default. col_space : int, list or dict of int, optional The minimum width of each column. header : bool or sequence, optional Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names. index : bool, optional, default True Whether to print index (row) labels. na_rep : str, optional, default 'NaN' String representation of ``NaN`` to use. formatters : list, tuple or dict of one-param. functions, optional Formatter functions to apply to columns' elements by position or name. The result of each function must be a unicode string. List/tuple must be of length equal to the number of columns. float_format : one-parameter function, optional, default None Formatter function to apply to columns' elements if they are floats. This function must return a unicode string and will be applied only to the non-``NaN`` elements, with ``NaN`` being handled by ``na_rep``. .. versionchanged:: 1.2.0 sparsify : bool, optional, default True Set to False for a DataFrame with a hierarchical index to print every multiindex key at each row. index_names : bool, optional, default True Prints the names of the indexes. justify : str, default None How to justify the column labels. If None uses the option from the print configuration (controlled by set_option), 'right' out of the box. Valid values are * left * right * center * justify * justify-all * start * end * inherit * match-parent * initial * unset. max_rows : int, optional Maximum number of rows to display in the console. min_rows : int, optional The number of rows to display in the console in a truncated repr (when number of rows is above `max_rows`). max_cols : int, optional Maximum number of columns to display in the console. show_dimensions : bool, default False Display DataFrame dimensions (number of rows by number of columns). decimal : str, default '.' Character recognized as decimal separator, e.g. ',' in Europe. line_width : int, optional Width to wrap a line in characters. max_colwidth : int, optional Max width to truncate each column in characters. By default, no limit. .. versionadded:: 1.0.0 encoding : str, default "utf-8" Set character encoding. .. versionadded:: 1.0 Returns ------- str or None If buf is None, returns the result as a string. Otherwise returns None. See Also -------- to_html : Convert DataFrame to HTML. Examples -------- >>> d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]} >>> df = pd.DataFrame(d) >>> print(df.to_string()) col1 col2 0 1 4 1 2 5 2 3 6
Function20
to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
Help on function to_timestamp in module pandas.core.frame: to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame' Cast to DatetimeIndex of timestamps, at *beginning* of period. Parameters ---------- freq : str, default frequency of PeriodIndex Desired frequency. how : {'s', 'e', 'start', 'end'} Convention for converting period to timestamp; start of period vs. end. axis : {0 or 'index', 1 or 'columns'}, default 0 The axis to convert (the index by default). copy : bool, default True If False then underlying input data is not copied. Returns ------- DataFrame with DatetimeIndex
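This help text also has no Examples section; a minimal sketch of the round trip back to a DatetimeIndex (data and index are illustrative assumptions):

```python
import pandas as pd

# Hypothetical example: a small frame indexed by monthly periods
pidx = pd.period_range("2021-01", periods=3, freq="M")
df = pd.DataFrame({"sales": [10, 20, 30]}, index=pidx)

# Cast the PeriodIndex to a DatetimeIndex at the *start* of each period (default)
start = df.to_timestamp(how="start")
print(start.index[0])  # Timestamp('2021-01-01 00:00:00')

# ... or at the *end* of each period
end = df.to_timestamp(how="end")
print(end.index[0])    # last instant of January 2021
```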
Function21
to_xarray(self)
Help on function to_xarray in module pandas.core.generic: to_xarray(self) Return an xarray object from the pandas object. Returns ------- xarray.DataArray or xarray.Dataset Data in the pandas structure converted to Dataset if the object is a DataFrame, or a DataArray if the object is a Series. See Also -------- DataFrame.to_hdf : Write DataFrame to an HDF5 file. DataFrame.to_parquet : Write a DataFrame to the binary parquet format. Notes ----- See the `xarray docs <https://xarray.pydata.org/en/stable/>`__ Examples -------- >>> df = pd.DataFrame([('falcon', 'bird', 389.0, 2), ... ('parrot', 'bird', 24.0, 2), ... ('lion', 'mammal', 80.5, 4), ... ('monkey', 'mammal', np.nan, 4)], ... columns=['name', 'class', 'max_speed', ... 'num_legs']) >>> df name class max_speed num_legs 0 falcon bird 389.0 2 1 parrot bird 24.0 2 2 lion mammal 80.5 4 3 monkey mammal NaN 4 >>> df.to_xarray() <xarray.Dataset> Dimensions: (index: 4) Coordinates: * index (index) int64 0 1 2 3 Data variables: name (index) object 'falcon' 'parrot' 'lion' 'monkey' class (index) object 'bird' 'bird' 'mammal' 'mammal' max_speed (index) float64 389.0 24.0 80.5 nan num_legs (index) int64 2 2 4 4 >>> df['max_speed'].to_xarray() <xarray.DataArray 'max_speed' (index: 4)> array([389. , 24. , 80.5, nan]) Coordinates: * index (index) int64 0 1 2 3 >>> dates = pd.to_datetime(['2018-01-01', '2018-01-01', ... '2018-01-02', '2018-01-02']) >>> df_multiindex = pd.DataFrame({'date': dates, ... 'animal': ['falcon', 'parrot', ... 'falcon', 'parrot'], ... 'speed': [350, 18, 361, 15]}) >>> df_multiindex = df_multiindex.set_index(['date', 'animal']) >>> df_multiindex speed date animal 2018-01-01 falcon 350 parrot 18 2018-01-02 falcon 361 parrot 15 >>> df_multiindex.to_xarray() <xarray.Dataset> Dimensions: (animal: 2, date: 2) Coordinates: * date (date) datetime64[ns] 2018-01-01 2018-01-02 * animal (animal) object 'falcon' 'parrot' Data variables: speed (date, animal) int64 350 18 361 15
Function22
to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None' Help on function to_xml in module pandas.core.frame: to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None' Render a DataFrame to an XML document. .. versionadded:: 1.3.0 Parameters ---------- path_or_buffer : str, path object or file-like object, optional File to write output to. If None, the output is returned as a string. index : bool, default True Whether to include index in XML document. root_name : str, default 'data' The name of root element in XML document. row_name : str, default 'row' The name of row element in XML document. na_rep : str, optional Missing data representation. attr_cols : list-like, optional List of columns to write as attributes in row element. Hierarchical columns will be flattened with underscore delimiting the different levels. elem_cols : list-like, optional List of columns to write as children in row element. By default, all columns output as children of row element. Hierarchical columns will be flattened with underscore delimiting the different levels. namespaces : dict, optional All namespaces to be defined in root element. Keys of dict should be prefix names and values of dict corresponding URIs. Default namespaces should be given empty string key. For example, :: namespaces = {"": "https://example.com"} prefix : str, optional Namespace prefix to be used for every element and/or attribute in document. This should be one of the keys in ``namespaces`` dict. encoding : str, default 'utf-8' Encoding of the resulting document. xml_declaration : bool, default True Whether to include the XML declaration at start of document. pretty_print : bool, default True Whether output should be pretty printed with indentation and line breaks. parser : {'lxml','etree'}, default 'lxml' Parser module to use for building of tree. Only 'lxml' and 'etree' are supported. With 'lxml', the ability to use XSLT stylesheet is supported. stylesheet : str, path object or file-like object, optional A URL, file-like object, or a raw string containing an XSLT script used to transform the raw XML output. Script should use layout of elements and attributes from original output. This argument requires ``lxml`` to be installed. Only XSLT 1.0 scripts and not later versions is currently supported. 
compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer' For on-the-fly decompression of on-disk data. If 'infer', then use gzip, bz2, zip or xz if path_or_buffer is a string ending in '.gz', '.bz2', '.zip', or 'xz', respectively, and no decompression otherwise. If using 'zip', the ZIP file must contain only one data file to be read in. Set to None for no decompression. storage_options : dict, optional Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to ``urllib`` as header options. For other URLs (e.g. starting with "s3://", and "gcs://") the key-value pairs are forwarded to ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details. Returns ------- None or str If ``io`` is None, returns the resulting XML format as a string. Otherwise returns None. See Also -------- to_json : Convert the pandas object to a JSON string. to_html : Convert DataFrame to a html. Examples -------- >>> df = pd.DataFrame({'shape': ['square', 'circle', 'triangle'], ... 'degrees': [360, 360, 180], ... 'sides': [4, np.nan, 3]}) >>> df.to_xml() # doctest: +SKIP <?xml version='1.0' encoding='utf-8'?> <data> <row> <index>0</index> <shape>square</shape> <degrees>360</degrees> <sides>4.0</sides> </row> <row> <index>1</index> <shape>circle</shape> <degrees>360</degrees> <sides/> </row> <row> <index>2</index> <shape>triangle</shape> <degrees>180</degrees> <sides>3.0</sides> </row> </data> >>> df.to_xml(attr_cols=[ ... 'index', 'shape', 'degrees', 'sides' ... ]) # doctest: +SKIP <?xml version='1.0' encoding='utf-8'?> <data> <row index="0" shape="square" degrees="360" sides="4.0"/> <row index="1" shape="circle" degrees="360"/> <row index="2" shape="triangle" degrees="180" sides="3.0"/> </data> >>> df.to_xml(namespaces={"doc": "https://example.com"}, ... prefix="doc") # doctest: +SKIP <?xml version='1.0' encoding='utf-8'?> <doc:data xmlns:doc="https://example.com"> <doc:row> <doc:index>0</doc:index> <doc:shape>square</doc:shape> <doc:degrees>360</doc:degrees> <doc:sides>4.0</doc:sides> </doc:row> <doc:row> <doc:index>1</doc:index> <doc:shape>circle</doc:shape> <doc:degrees>360</doc:degrees> <doc:sides/> </doc:row> <doc:row> <doc:index>2</doc:index> <doc:shape>triangle</doc:shape> <doc:degrees>180</doc:degrees> <doc:sides>3.0</doc:sides> </doc:row> </doc:data>
df.to_excel
As the counterpart of pd.read_excel(), pd.DataFrame.to_excel() is worth studying as an extension:
Parameter list with default values:
to_excel( excel_writer, sheet_name: 'str' = 'Sheet1', na_rep: 'str' = '', float_format: 'Optional[str]' = None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None, storage_options: 'StorageOptions' = None)
Notes:
To write a single object to an Excel .xlsx file, you only need to specify the target file name. To write to more than one sheet, create an ExcelWriter object with the target file name and specify the sheet in the file to write to (see the sketch after these notes).
Multiple sheets can be written by giving each one a unique sheet_name.
Once all the data has been written, the changes need to be saved.
Note that creating an ExcelWriter object with a file name that already exists will erase the contents of the existing file.
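A minimal sketch of the workflow described in these notes (file and sheet names are illustrative; writing .xlsx requires an engine such as openpyxl to be installed):

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"C": [5, 6], "D": [7, 8]})

# Single object: just pass the target file name
df1.to_excel("output.xlsx", sheet_name="Sheet1")

# Multiple sheets: create an ExcelWriter with the target file name and give each
# frame a unique sheet_name. The context manager saves the changes on exit.
# Opening an existing file name this way discards its previous contents.
with pd.ExcelWriter("output.xlsx") as writer:
    df1.to_excel(writer, sheet_name="first")
    df2.to_excel(writer, sheet_name="second")
```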
Parameter details:
01. excel_writer
excel_writer : path-like, file-like, or ExcelWriter object. File path or existing ExcelWriter.
02. sheet_name: 'str' = 'Sheet1'
sheet_name : str, default 'Sheet1'
Name of the sheet that will contain the DataFrame.
03. na_rep: 'str' = ''
na_rep : str, default ''
Missing data representation.
04. float_format: 'Optional[str]' = None
float_format : str, optional
Format string for floating point numbers. For example,
``float_format="%.2f"`` will format 0.1234 as 0.12.
05. columns=None
columns : sequence or list of str, optional
Columns to write.
06. header=True
header : bool or list of str, default True
Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
07. index=True
index : bool, default True
Write row names (index).
08. index_label=None
index_label : str or sequence, optional
Column label(s) for the index column(s) if desired. If not specified, and both header and index are True, the index names are used. A sequence should be given if the DataFrame uses a MultiIndex.
09. startrow=0
startrow : int, default 0
Upper-left cell row at which to dump the DataFrame.
10. startcol=0
startcol : int, default 0
Upper-left cell column at which to dump the DataFrame.
11. engine=None
engine : str, optional
Write engine to use, 'openpyxl' or 'xlsxwriter'. You can also set this via the options ``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and ``io.excel.xlsm.writer``.
.. deprecated:: 1.2.0
As the `xlwt <https://pypi.org/project/xlwt/>`__ package is no longer maintained, the ``xlwt`` engine will be removed in a future version of pandas.
12. merge_cells=True
merge_cells : bool, default True
Write MultiIndex and hierarchical rows as merged cells.
13. encoding=None
encoding : str, optional
Encoding of the resulting Excel file. Only necessary for xlwt; other writers support Unicode natively.
14. inf_rep='inf'
inf_rep : str, default 'inf'
Representation for infinity (there is no native representation for infinity in Excel).
15. verbose=True
verbose : bool, default True
Display more information in the error logs.
16. freeze_panes=None
freeze_panes : tuple of int (length 2), optional
Specifies the one-based bottommost row and rightmost column that are to be frozen.
17. storage_options: 'StorageOptions' = None
storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc., for URLs that will be parsed by ``fsspec``, e.g. those starting with "s3://" or "gcs://". An error will be raised if this argument is provided with a non-fsspec URL.
See the fsspec and backend storage implementation docs for the set of allowed keys and values.
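A short sketch that combines several of the parameters explained above (the file name and values are illustrative assumptions):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [0.1234, np.nan, np.inf], "qty": [1, 2, 3]})

# float_format renders floats with two decimals, na_rep fills missing cells,
# inf_rep replaces infinities, startrow/startcol shift the upper-left cell,
# and freeze_panes freezes everything above and to the left of the given cell.
df.to_excel(
    "report.xlsx",
    sheet_name="data",
    float_format="%.2f",
    na_rep="missing",
    inf_rep="INF",
    index=False,
    startrow=1,
    startcol=1,
    freeze_panes=(2, 2),
)
```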
To be continued...