COVID-19 Cases Prediction (Regression)(一)

简介: COVID-19 Cases Prediction (Regression)

Objectives:

  • Solve a regression problem with deep neural networks (DNN).
  • Understand basic DNN training tips.
  • Familiarize yourself with PyTorch.

Task Description

  • COVID-19 Cases Prediction
  • Source: Delphi group @ CMU
  • A daily survey since April 2020 via facebook.


Try to find out the data and use it to your training is forbidden


image.png

  • Given survey results in the past 5 days in a specific state in U.S., then predict the percentage of new tested positive cases in the 5th day.

image.png

Data

image.png

Conducted surveys via facebook (every day & every state) Survey: symptoms, COVID-19 testing,social distancing, mental health, demographics, economic effects, …

  • States (37, encoded to one-hot vectors)
  • COVID-like illness (4)
  • cli、ili …
  • Behavior Indicators (8)
  • wearing_mask、travel_outside_state …
  • Mental Health Indicators (3)
  • anxious、depressed …
  • Tested Positive Cases (1)
  • tested_positive (this is what we want to predict)


Data – One-hot Vector

  • One-hot vectors:

   Vectors with only one element equals to one while others are zero. Usually used to encode discrete values.

the details about One-hot Vector please read the blog:One-Hot

image.png

Evaluation Metric

  • Mean Squared Error (MSE)

image.png

image.png

Download data

If the Google Drive links below do not work, you can download data from Kaggle, and upload data manually to the workspace.

!gdown --id '1kLSW_-cW2Huj7bh84YTdimGBOJaODiOS' --output covid.train.csv
!gdown --id '1iiI5qROrAhZn-o4FPqsE97bMzDEFvIdg' --output covid.test.csv
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
  category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=1kLSW_-cW2Huj7bh84YTdimGBOJaODiOS
To: /content/covid.train.csv
100% 2.49M/2.49M [00:00<00:00, 238MB/s]
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
  category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=1iiI5qROrAhZn-o4FPqsE97bMzDEFvIdg
To: /content/covid.test.csv
100% 993k/993k [00:00<00:00, 137MB/s]

Import packages

# Numerical Operations
import math
import numpy as np
# Reading/Writing Data
import pandas as pd
import os
import csv
# For Progress Bar
from tqdm import tqdm
# Pytorch
import torch 
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, random_split
# For plotting learning curve
from torch.utils.tensorboard import SummaryWriter

Some Utility Functions

You do not need to modify this part.

def same_seed(seed): 
    '''Fixes random number generator seeds for reproducibility.'''
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
def train_valid_split(data_set, valid_ratio, seed):
    '''Split provided training data into training set and validation set'''
    valid_set_size = int(valid_ratio * len(data_set)) 
    train_set_size = len(data_set) - valid_set_size
    train_set, valid_set = random_split(data_set, [train_set_size, valid_set_size], generator=torch.Generator().manual_seed(seed))
    return np.array(train_set), np.array(valid_set)
def predict(test_loader, model, device):
    model.eval() # Set your model to evaluation mode.
    preds = []
    for x in tqdm(test_loader):
        x = x.to(device)                        
        with torch.no_grad():                   
            pred = model(x)                     
            preds.append(pred.detach().cpu())   
    preds = torch.cat(preds, dim=0).numpy()  
    return preds

Dataset

class COVID19Dataset(Dataset):
    '''
    x: Features.
    y: Targets, if none, do prediction.
    '''
    def __init__(self, x, y=None):
        if y is None:
            self.y = y
        else:
            self.y = torch.FloatTensor(y)
        self.x = torch.FloatTensor(x)
    def __getitem__(self, idx):
        if self.y is None:
            return self.x[idx]
        else:
            return self.x[idx], self.y[idx]
    def __len__(self):
        return len(self.x)

Neural Network Model

Try out different model architectures by modifying the class below.

class My_Model(nn.Module):
    def __init__(self, input_dim):
        super(My_Model, self).__init__()
        # TODO: modify model's structure, be aware of dimensions. 
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 16),
            nn.ReLU(),
            nn.Linear(16, 8),
            nn.ReLU(),
            nn.Linear(8, 1)
        )
    def forward(self, x):
        x = self.layers(x)
        x = x.squeeze(1) # (B, 1) -> (B)
        return x






目录
相关文章
基于PyTorch的大语言模型微调指南:Torchtune完整教程与代码示例
**Torchtune**是由PyTorch团队开发的一个专门用于LLM微调的库。它旨在简化LLM的微调流程,提供了一系列高级API和预置的最佳实践
454 59
基于PyTorch的大语言模型微调指南:Torchtune完整教程与代码示例
一文搞懂 卷积神经网络 批归一化 丢弃法
这篇文章详细介绍了卷积神经网络中的批归一化(Batch Normalization)和丢弃法(Dropout),包括它们的计算过程、作用、优势以及如何在飞桨框架中应用这些技术来提高模型的稳定性和泛化能力,并提供了网络结构定义和参数计算的示例。
美食物管理与推荐系统Python+Django网站开发+协同过滤推荐算法应用【计算机课设项目推荐】
美食物管理与推荐系统Python+Django网站开发+协同过滤推荐算法应用【计算机课设项目推荐】
326 4
美食物管理与推荐系统Python+Django网站开发+协同过滤推荐算法应用【计算机课设项目推荐】
【AI系统】AI系统的组成
本文详细解析了AI系统的多层次架构,涵盖应用与开发层、AI框架层、编译与运行时及硬件体系结构等,阐述各部分如何协同支撑AI应用的开发与运行,提升整体性能与效率,并随著AI技术进步持续演进。从编程语言到AI芯片设计,每一层都对系统的最终表现起着至关重要的作用。
924 0
切割问题【数学规划的应用(含代码)】阿里达摩院MindOpt
本文主要讲述了使用MindOpt工具对切割问题进行优化的过程与实践。切割问题是指从一维原材料(如木材、钢材等)中切割出特定长度的零件以满足不同需求,同时尽可能减少浪费的成本。文章通过实例详细介绍了如何使用MindOpt云上建模求解平台及其配套的MindOpt APL建模语言来解决此类问题,包括数学建模、代码实现、求解过程及结果分析等内容。此外,还讨论了一维切割问题的应用场景,并对其进行了扩展,探讨了更复杂的二维和三维切割问题。通过本文的学习,读者能够掌握利用MindOpt工具解决实际切割问题的方法和技术。
开机自动挂载NTFS分区至Linux:分步指南
在Linux中自动挂载Windows NTFS分区,需创建挂载点(如`/media/c_win`),识别分区(如`/dev/sda1`),获取UUID,并编辑`fstab`文件添加挂载信息。推荐使用UUID以保持稳定性。在VMware环境中可能需添加`force`选项。完成这些步骤后,重启系统,NTFS分区将自动挂载。这对于双系统用户非常方便。
1117 1
【Redis】4、全局唯一 ID生成、单机(非分布式)情况下的秒杀和一人一单
【Redis】4、全局唯一 ID生成、单机(非分布式)情况下的秒杀和一人一单
325 0
【网安AIGC专题】46篇前沿代码大模型论文、24篇论文阅读笔记汇总
【网安AIGC专题】46篇前沿代码大模型论文、24篇论文阅读笔记汇总
483 0
AI助理

你好,我是AI助理

可以解答问题、推荐解决方案等

登录插画

登录以查看您的控制台资源

管理云资源
状态一览
快捷访问