COVID-19 Cases Prediction (Regression) (Part 1)

Summary: COVID-19 Cases Prediction (Regression)

Objectives:

  • Solve a regression problem with deep neural networks (DNN).
  • Understand basic DNN training tips.
  • Familiarize yourself with PyTorch.

Task Description

  • COVID-19 Cases Prediction
  • Source: Delphi group @ CMU
  • A daily survey conducted via Facebook since April 2020.


Searching for the original data and using it for training is forbidden.


  • Given survey results from the past 5 days in a specific U.S. state, predict the percentage of new positive test cases on the 5th day.

Data

Surveys were conducted via Facebook every day in every state. Survey topics: symptoms, COVID-19 testing, social distancing, mental health, demographics, economic effects, …

  • States (37, encoded to one-hot vectors)
  • COVID-like illness (4): cli, ili, …
  • Behavior Indicators (8): wearing_mask, travel_outside_state, …
  • Mental Health Indicators (3): anxious, depressed, …
  • Tested Positive Cases (1): tested_positive (this is what we want to predict)
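
As a rough sanity check on the feature layout, the sketch below tallies the input dimension implied by the list above. It assumes (this is not stated in the text) that the five days of indicators are simply concatenated after the state one-hot vector, and that the 5th day's tested_positive value is held out as the target. The variable names are illustrative, and the actual CSV files may contain extra columns such as a row id.

```python
# Per-day indicator counts, taken from the list above.
per_day = {
    "covid_like_illness": 4,
    "behavior": 8,
    "mental_health": 3,
    "tested_positive": 1,
}
n_states = 37  # one-hot state columns
n_days = 5     # days of survey results per sample

features_per_day = sum(per_day.values())

# Assumption: days are concatenated, and the last day's
# tested_positive is removed from the inputs because it is the target.
input_dim = n_states + features_per_day * n_days - 1

print(features_per_day, input_dim)  # 16 116
```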


Data – One-hot Vector

  • One-hot vectors:

   Vectors in which exactly one element equals one and all others are zero. They are usually used to encode discrete values.

For more details about one-hot vectors, see the blog post: One-Hot
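
The definition above can be sketched in a few lines of NumPy. The state labels here are hypothetical placeholders chosen for illustration; the real dataset encodes its 37 states in the same spirit, but its exact column order is not shown in this post.

```python
import numpy as np

# Hypothetical category labels, for illustration only.
states = ["AL", "AK", "AZ"]
index = {name: i for i, name in enumerate(states)}

def one_hot(name):
    """Return a vector with a 1 at the category's index and 0 elsewhere."""
    vec = np.zeros(len(states), dtype=np.float32)
    vec[index[name]] = 1.0
    return vec

print(one_hot("AK"))  # [0. 1. 0.]
```

Each category gets its own column, so no artificial ordering is implied between categories, which is what an integer code such as 0, 1, 2 would suggest.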

image.png

Evaluation Metric

  • Mean Squared Error (MSE)
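
MSE is the average of the squared differences between predictions and targets: MSE = (1/N) Σ (ŷ_i − y_i)², where ŷ_i is the predicted value and y_i the true value for sample i. A minimal NumPy sketch, with made-up numbers and a helper name (`mse`) chosen here for illustration:

```python
import numpy as np

def mse(pred, target):
    """Mean of squared differences between predictions and targets."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))

# Made-up values for illustration only.
# Squared errors are 1, 0 and 4, so the result is 5 / 3.
print(mse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0]))
```

In PyTorch training code the same quantity is usually obtained with nn.MSELoss, whose default reduction is the mean.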

Download data

If the Google Drive links below do not work, you can download the data from Kaggle and upload it manually to the workspace.

!gdown --id '1kLSW_-cW2Huj7bh84YTdimGBOJaODiOS' --output covid.train.csv
!gdown --id '1iiI5qROrAhZn-o4FPqsE97bMzDEFvIdg' --output covid.test.csv
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
  category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=1kLSW_-cW2Huj7bh84YTdimGBOJaODiOS
To: /content/covid.train.csv
100% 2.49M/2.49M [00:00<00:00, 238MB/s]
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
  category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=1iiI5qROrAhZn-o4FPqsE97bMzDEFvIdg
To: /content/covid.test.csv
100% 993k/993k [00:00<00:00, 137MB/s]

Import packages

# Numerical Operations
import math
import numpy as np
# Reading/Writing Data
import pandas as pd
import os
import csv
# For Progress Bar
from tqdm import tqdm
# PyTorch
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, random_split
# For plotting learning curve
from torch.utils.tensorboard import SummaryWriter

Some Utility Functions

You do not need to modify this part.

def same_seed(seed):
    '''Fixes random number generator seeds for reproducibility.'''
    # Use deterministic cuDNN kernels and disable auto-tuning,
    # which may otherwise select different algorithms between runs.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

def train_valid_split(data_set, valid_ratio, seed):
    '''Split provided training data into training set and validation set'''
    valid_set_size = int(valid_ratio * len(data_set))
    train_set_size = len(data_set) - valid_set_size
    # A seeded generator makes the split reproducible.
    train_set, valid_set = random_split(data_set, [train_set_size, valid_set_size], generator=torch.Generator().manual_seed(seed))
    return np.array(train_set), np.array(valid_set)

def predict(test_loader, model, device):
    '''Run the model on every batch of test_loader and return the predictions as a NumPy array.'''
    model.eval() # Set your model to evaluation mode.
    preds = []
    for x in tqdm(test_loader):
        x = x.to(device)
        with torch.no_grad(): # No gradients are needed for inference.
            pred = model(x)
            preds.append(pred.detach().cpu())
    preds = torch.cat(preds, dim=0).numpy()
    return preds
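
To illustrate why train_valid_split passes a seeded generator, the sketch below splits plain indices instead of the real data. `split_indices` is a hypothetical helper written for this example; it mirrors the size arithmetic above but is not part of the original code.

```python
import torch
from torch.utils.data import random_split

def split_indices(n, valid_ratio, seed):
    """Split range(n) into train/valid index lists with a seeded generator."""
    valid_size = int(valid_ratio * n)
    train_size = n - valid_size
    gen = torch.Generator().manual_seed(seed)
    train, valid = random_split(range(n), [train_size, valid_size], generator=gen)
    return list(train), list(valid)

a_train, a_valid = split_indices(10, 0.2, seed=0)
b_train, b_valid = split_indices(10, 0.2, seed=0)

print(len(a_train), len(a_valid))  # 8 2
print(a_valid == b_valid)          # True: same seed, same split
```

Keeping the split fixed across runs means that changes in validation loss come from changes to the model or hyperparameters rather than from a different validation set.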

Dataset

class COVID19Dataset(Dataset):
    '''
    x: Features.
    y: Targets, if none, do prediction.
    '''
    def __init__(self, x, y=None):
        if y is None:
            self.y = y
        else:
            self.y = torch.FloatTensor(y)
        self.x = torch.FloatTensor(x)

    def __getitem__(self, idx):
        if self.y is None:
            return self.x[idx]
        else:
            return self.x[idx], self.y[idx]

    def __len__(self):
        return len(self.x)
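
A dataset of this kind is normally wrapped in a DataLoader, which groups samples into batches. The sketch below uses a hypothetical stand-in class, `ToyDataset`, with the same two modes as the class above (with and without targets) and made-up data, so that it runs on its own.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical stand-in with the same two modes as the class above."""
    def __init__(self, x, y=None):
        self.x = torch.FloatTensor(x)
        self.y = None if y is None else torch.FloatTensor(y)

    def __getitem__(self, idx):
        if self.y is None:
            return self.x[idx]
        return self.x[idx], self.y[idx]

    def __len__(self):
        return len(self.x)

# Made-up data: 10 samples with 3 features each.
x = [[float(i), 0.0, 1.0] for i in range(10)]
y = [float(i) for i in range(10)]

train_loader = DataLoader(ToyDataset(x, y), batch_size=4, shuffle=False)
test_loader = DataLoader(ToyDataset(x), batch_size=4, shuffle=False)

# With targets, each batch is a (features, targets) pair.
train_shapes = [(tuple(bx.shape), tuple(by.shape)) for bx, by in train_loader]
# Without targets, each batch is a features tensor only.
test_shapes = [tuple(bx.shape) for bx in test_loader]

print(train_shapes)
print(test_shapes)
```

The last batch is smaller than batch_size because 10 is not divisible by 4; passing drop_last=True to DataLoader would discard it instead.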

Neural Network Model

Try out different model architectures by modifying the class below.

class My_Model(nn.Module):
    def __init__(self, input_dim):
        super(My_Model, self).__init__()
        # TODO: modify model's structure, be aware of dimensions. 
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 16),
            nn.ReLU(),
            nn.Linear(16, 8),
            nn.ReLU(),
            nn.Linear(8, 1)
        )
    def forward(self, x):
        x = self.layers(x)
        x = x.squeeze(1) # (B, 1) -> (B)
        return x
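
The squeeze in forward matters once a loss is computed. The sketch below, using made-up tensors, compares the MSE loss with and without it. It relies on the behavior of PyTorch releases known to me, where a (B, 1) prediction and a (B,) target are broadcast to (B, B) with a warning; treat that as an assumption to verify against your installed version.

```python
import warnings
import torch
import torch.nn.functional as F

# Shape (B, 1), like the output of a final nn.Linear(8, 1) layer.
pred_2d = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
# Shape (B,), like the targets returned by the dataset.
target = torch.tensor([1.0, 2.0, 3.0, 4.0])

# With squeeze: shapes match, and a perfect prediction gives zero loss.
loss_ok = F.mse_loss(pred_2d.squeeze(1), target).item()

# Without squeeze: (B, 1) vs (B,) broadcasts to (B, B),
# so every prediction is compared against every target.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the size-mismatch warning
    loss_bad = F.mse_loss(pred_2d, target).item()

print(loss_ok, loss_bad)  # 0.0 2.5
```

The second value is nonzero even though the predictions are exact, which is why the model returns a 1-D tensor of length B.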





