
How to split a dictionary column into 2 columns

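What is the best way to split the following column into a DataFrame that holds each country's name, plus two more columns containing the data from the history column (the dates and their values)?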

From this DataFrame:

+-----------------------------------------+----------------------------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------------------------------+
| coordinates                             | country                          | country_code   | history                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |   latest | province                     |
|-----------------------------------------+----------------------------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+------------------------------|
| {'lat': '15', 'long': '101'}            | Thailand                         | TH             | {'1/22/20': 0, '1/23/20': 0, '1/24/20': 0, '1/25/20': 0, '1/26/20': 0, '1/27/20': 0, '1/28/20': 0, '1/29/20': 0, '1/30/20': 0, '1/31/20': 0, '2/1/20': 0, '2/10/20': 0, '2/11/20': 0, '2/12/20': 0, '2/13/20': 0, '2/14/20': 0, '2/15/20': 0, '2/16/20': 0, '2/17/20': 0, '2/18/20': 0, '2/19/20': 0, '2/2/20': 0, '2/20/20': 0, '2/21/20': 0, '2/22/20': 0, '2/23/20': 0, '2/24/20': 0, '2/25/20': 0, '2/26/20': 0, '2/27/20': 0, '2/28/20': 0, '2/29/20': 0, '2/3/20': 0, '2/4/20': 0, '2/5/20': 0, '2/6/20': 0, '2/7/20': 0, '2/8/20': 0, '2/9/20': 0, '3/1/20': 1, '3/10/20': 1, '3/11/20': 1, '3/12/20': 1, '3/13/20': 1, '3/14/20': 1, '3/15/20': 1, '3/16/20': 1, '3/2/20': 1, '3/3/20': 1, '3/4/20': 1, '3/5/20': 1, '3/6/20': 1, '3/7/20': 1, '3/8/20': 1, '3/9/20': 1}                                                                                                                                            |        1 |                              |
| {'lat': '36', 'long': '138'}            | Japan                            | JP             | {'1/22/20': 0, '1/23/20': 0, '1/24/20': 0, '1/25/20': 0, '1/26/20': 0, '1/27/20': 0, '1/28/20': 0, '1/29/20': 0, '1/30/20': 0, '1/31/20': 0, '2/1/20': 0, '2/10/20': 0, '2/11/20': 0, '2/12/20': 0, '2/13/20': 1, '2/14/20': 1, '2/15/20': 1, '2/16/20': 1, '2/17/20': 1, '2/18/20': 1, '2/19/20': 1, '2/2/20': 0, '2/20/20': 1, '2/21/20': 1, '2/22/20': 1, '2/23/20': 1, '2/24/20': 1, '2/25/20': 1, '2/26/20': 2, '2/27/20': 4, '2/28/20': 4, '2/29/20': 5, '2/3/20': 0, '2/4/20': 0, '2/5/20': 0, '2/6/20': 0, '2/7/20': 0, '2/8/20': 0, '2/9/20': 0, '3/1/20': 6, '3/10/20': 10, '3/11/20': 15, '3/12/20': 16, '3/13/20': 19, '3/14/20': 22, '3/15/20': 22, '3/16/20': 27, '3/2/20': 6, '3/3/20': 6, '3/4/20': 6, '3/5/20': 6, '3/6/20': 6, '3/7/20': 6, '3/8/20': 6, '3/9/20': 10}                                                                                                                                    |       27 |                              

To this:

 country  days    values
Thailand  1/2/22     0
Thailand  2/2/22     0
Thailand  2/2/22     0
....
Sweden    3/4/55     0
Sweden    3/4/55     0

Question source: Stack Overflow

Asked by is大龙, 2020-03-24 22:39:47
1 answer
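  • If I understand correctly (IIUC), you can expand the dictionaries into a DataFrame and melt it into long format: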

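    import pandas as pd

    # Expand each history dict into one column per date, indexed by country,
    # then melt the date columns into (days, values) rows.
    new_df = (pd.DataFrame(df['history'].tolist(),
                           index=df['country'])
                .reset_index()
                .melt('country', var_name='days', value_name='values')
                .sort_values('country'))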
    

    Or alternatively:

    import numpy as np

    # Flatten every (day, value) pair and repeat each country once per pair.
    pd.DataFrame(data=np.concatenate([[(k, v) for k, v in d.items()]
                                      for d in df['history']]),
                 columns=['days', 'values'],
                 index=df['country'].repeat(df['history'].str.len())).reset_index()
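
One caveat with the np.concatenate variant: when the dictionary keys are strings, as with the dates in the question, NumPy stores each (day, value) pair in a single string-typed array, so the counts come back as text. A minimal sketch of a third option, not part of the original answer, that builds the rows with a plain list comprehension and keeps the integer dtype (the sample data here is illustrative):

```python
import pandas as pd

# Illustrative sample shaped like the question's data.
df = pd.DataFrame({
    "country": ["Thailand", "Japan"],
    "history": [
        {"3/15/20": 1, "3/16/20": 1},
        {"3/15/20": 22, "3/16/20": 27},
    ],
})

# One (country, day, value) tuple per dictionary entry.
rows = [
    (country, day, value)
    for country, history in zip(df["country"], df["history"])
    for day, value in history.items()
]

new_df = pd.DataFrame(rows, columns=["country", "days", "values"])
print(new_df)
print(new_df.dtypes)
```

Because the rows never pass through a NumPy array of mixed types, the values column stays numeric and can be summed or plotted directly.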
    

    print(df)
      country  country_code       history
    0       A             0  {1: 0, 2: 0}
    1       B             1  {1: 0, 2: 0}
    2       C             2  {1: 0, 2: 0}
    

    new_df = (pd.DataFrame(df['history'].tolist(), index=df['country'])
                .reset_index()
                .melt('country', var_name='days', value_name='values')
                .sort_values('country'))
    print(new_df)
      country days  values
    0       A    1       0
    3       A    2       0
    1       B    1       0
    4       B    2       0
    2       C    1       0
    5       C    2       0
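
In the question's data the day labels are strings such as '2/10/20' and '2/2/20', which do not sort chronologically as text. A minimal sketch of the melt approach followed by date parsing, assuming every key uses the month/day/two-digit-year format shown in the question (the sample values are illustrative):

```python
import pandas as pd

# Illustrative sample; keys are deliberately out of chronological order.
df = pd.DataFrame({
    "country": ["Thailand", "Japan"],
    "history": [
        {"2/10/20": 0, "2/2/20": 0, "3/1/20": 1},
        {"2/10/20": 0, "2/2/20": 0, "3/1/20": 6},
    ],
})

new_df = (
    pd.DataFrame(df["history"].tolist(), index=df["country"])
    .reset_index()
    .melt("country", var_name="days", value_name="values")
)

# Parse the labels so rows sort by calendar date rather than alphabetically.
new_df["days"] = pd.to_datetime(new_df["days"], format="%m/%d/%y")
new_df = new_df.sort_values(["country", "days"]).reset_index(drop=True)
print(new_df)
```

Sorting on both country and days keeps each country's rows together and in date order, which is usually what a time series plot or a cumulative calculation expects.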

    On this three-row sample, the second approach is faster:

    %%timeit
    pd.DataFrame(data = np.concatenate([[(k,v) for k,v in d.items()] 
                                        for d in df['history']]),
                 columns = ['days','values'],
                index = df['country'].repeat(df['history'].str.len())).reset_index()
    1.71 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    %%timeit
    new_df = (pd.DataFrame(df['history'].tolist(), index=df['country'])
                .reset_index()
                .melt('country', var_name='days')
                .sort_values('country'))
    new_df
    5.01 ms ± 272 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
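
The timings above were measured on the three-row sample, so the ranking may not hold for larger inputs. A rough sketch for re-measuring both approaches outside a notebook with the standard timeit module; the data is synthetic, and the helper names with_melt and with_concatenate are hypothetical:

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data: 200 made-up countries with 60 integer-keyed days each.
n_countries, n_days = 200, 60
df = pd.DataFrame({
    "country": [f"C{i}" for i in range(n_countries)],
    "history": [{d: i + d for d in range(n_days)} for i in range(n_countries)],
})


def with_melt(frame):
    """First approach: expand the dicts into columns, then melt."""
    return (pd.DataFrame(frame["history"].tolist(), index=frame["country"])
            .reset_index()
            .melt("country", var_name="days", value_name="values"))


def with_concatenate(frame):
    """Second approach: flatten the (day, value) pairs with NumPy."""
    return pd.DataFrame(
        data=np.concatenate([[(k, v) for k, v in d.items()]
                             for d in frame["history"]]),
        columns=["days", "values"],
        index=frame["country"].repeat(frame["history"].str.len()),
    ).reset_index()


# Average over a few calls; absolute numbers depend on the machine.
for func in (with_melt, with_concatenate):
    seconds = timeit.timeit(lambda: func(df), number=5) / 5
    print(f"{func.__name__}: {seconds * 1e3:.2f} ms per call")
```

Adjusting n_countries and n_days to match the real data gives a more reliable comparison than the toy example.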

    Answer source: Stack Overflow

    2020-03-24 22:39:56