COVID-19 Cases Prediction (Regression)(一)

简介: COVID-19 Cases Prediction (Regression)

Objectives:

  • Solve a regression problem with deep neural networks (DNN).
  • Understand basic DNN training tips.
  • Familiarize yourself with PyTorch.

Task Description

  • COVID-19 Cases Prediction
  • Source: Delphi group @ CMU
  • A daily survey since April 2020 via facebook.


Try to find out the data and use it to your training is forbidden


image.png

  • Given survey results in the past 5 days in a specific state in U.S., then predict the percentage of new tested positive cases in the 5th day.

image.png

Data

image.png

Conducted surveys via facebook (every day & every state) Survey: symptoms, COVID-19 testing,social distancing, mental health, demographics, economic effects, …

  • States (37, encoded to one-hot vectors)
  • COVID-like illness (4)
  • cli、ili …
  • Behavior Indicators (8)
  • wearing_mask、travel_outside_state …
  • Mental Health Indicators (3)
  • anxious、depressed …
  • Tested Positive Cases (1)
  • tested_positive (this is what we want to predict)


Data – One-hot Vector

  • One-hot vectors:

   Vectors with only one element equals to one while others are zero. Usually used to encode discrete values.

the details about One-hot Vector please read the blog:One-Hot

image.png

Evaluation Metric

  • Mean Squared Error (MSE)

image.png

image.png

Download data

If the Google Drive links below do not work, you can download data from Kaggle, and upload data manually to the workspace.

!gdown --id '1kLSW_-cW2Huj7bh84YTdimGBOJaODiOS' --output covid.train.csv
!gdown --id '1iiI5qROrAhZn-o4FPqsE97bMzDEFvIdg' --output covid.test.csv
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
  category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=1kLSW_-cW2Huj7bh84YTdimGBOJaODiOS
To: /content/covid.train.csv
100% 2.49M/2.49M [00:00<00:00, 238MB/s]
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
  category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=1iiI5qROrAhZn-o4FPqsE97bMzDEFvIdg
To: /content/covid.test.csv
100% 993k/993k [00:00<00:00, 137MB/s]

Import packages

# Numerical Operations
import math
import numpy as np
# Reading/Writing Data
import pandas as pd
import os
import csv
# For Progress Bar
from tqdm import tqdm
# Pytorch
import torch 
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, random_split
# For plotting learning curve
from torch.utils.tensorboard import SummaryWriter

Some Utility Functions

You do not need to modify this part.

def same_seed(seed): 
    '''Fixes random number generator seeds for reproducibility.'''
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
def train_valid_split(data_set, valid_ratio, seed):
    '''Split provided training data into training set and validation set'''
    valid_set_size = int(valid_ratio * len(data_set)) 
    train_set_size = len(data_set) - valid_set_size
    train_set, valid_set = random_split(data_set, [train_set_size, valid_set_size], generator=torch.Generator().manual_seed(seed))
    return np.array(train_set), np.array(valid_set)
def predict(test_loader, model, device):
    model.eval() # Set your model to evaluation mode.
    preds = []
    for x in tqdm(test_loader):
        x = x.to(device)                        
        with torch.no_grad():                   
            pred = model(x)                     
            preds.append(pred.detach().cpu())   
    preds = torch.cat(preds, dim=0).numpy()  
    return preds

Dataset

class COVID19Dataset(Dataset):
    '''
    x: Features.
    y: Targets, if none, do prediction.
    '''
    def __init__(self, x, y=None):
        if y is None:
            self.y = y
        else:
            self.y = torch.FloatTensor(y)
        self.x = torch.FloatTensor(x)
    def __getitem__(self, idx):
        if self.y is None:
            return self.x[idx]
        else:
            return self.x[idx], self.y[idx]
    def __len__(self):
        return len(self.x)

Neural Network Model

Try out different model architectures by modifying the class below.

class My_Model(nn.Module):
    def __init__(self, input_dim):
        super(My_Model, self).__init__()
        # TODO: modify model's structure, be aware of dimensions. 
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 16),
            nn.ReLU(),
            nn.Linear(16, 8),
            nn.ReLU(),
            nn.Linear(8, 1)
        )
    def forward(self, x):
        x = self.layers(x)
        x = x.squeeze(1) # (B, 1) -> (B)
        return x






目录
相关文章
|
机器学习/深度学习 人工智能 算法
图解机器学习 | KNN算法及其应用
KNN算法(K近邻算法)是一种很朴实的机器学习方法,既可以做分类,也可以做回归。本文详细讲解KNN算法相关的知识,包括:核心思想、算法步骤、核心要素、缺点与改进等。
4836 1
图解机器学习 | KNN算法及其应用
|
6月前
|
机器学习/深度学习 人工智能 计算机视觉
让AI真正"看懂"世界:多模态表征空间构建秘籍
本文深入解析多模态学习的两大核心难题:多模态对齐与多模态融合,探讨如何让AI理解并关联图像、文字、声音等异构数据,实现类似人类的综合认知能力。
2539 6
|
机器学习/深度学习 数据采集 算法
一文搞懂 卷积神经网络 批归一化 丢弃法
这篇文章详细介绍了卷积神经网络中的批归一化(Batch Normalization)和丢弃法(Dropout),包括它们的计算过程、作用、优势以及如何在飞桨框架中应用这些技术来提高模型的稳定性和泛化能力,并提供了网络结构定义和参数计算的示例。
|
UED
【HarmonyOS——ArkTS语言】计算器的实现【合集】
【ArkTS语言-HarmonyOS】计算器的实现【合集】组件,点击等号后计算函数高效解析表达式并给出准确结果,达成核心功能要求。错误提示不够详尽,难以助力用户快速定位输入错误;响应式设计不足,在不同屏幕规格下适配性差。总体而言,本次实验已成功构建起基本功能框架,后续将针对上述问题深入探究优化方案,不断打磨细节,在完善计算器功能与提升用户体验的道路上持续精进,进而提升自身编程与应用开发的综合能力水平。利用按钮组件顺利完成布局设计,数字、运算符及功能按钮排列有序,操作逻辑清晰直观。
700 2
|
11月前
|
安全 Windows
“由于启动计算机时出现了页面文件配置问题,Windows在你的计算机上创建了一个临时页面文件。。。”的问题解决
本文主要介绍了因清理电脑垃圾文件时误删虚拟内存导致的Windows页面文件配置问题,并提供了详细的解决步骤。问题表现为开机后出现临时页面文件创建的提示弹窗。解决方法包括通过控制面板或快捷键进入高级系统设置,进而调整虚拟内存设置:进入性能选项中的虚拟内存栏,选择自动管理所有驱动器的分页文件大小,最后确认并重启计算机以恢复正常运行。
8710 5
“由于启动计算机时出现了页面文件配置问题,Windows在你的计算机上创建了一个临时页面文件。。。”的问题解决
|
机器学习/深度学习 搜索推荐 算法
协同过滤算法
协同过滤算法
1577 0
|
机器学习/深度学习 PyTorch 算法框架/工具
【Transformer系列(5)】Transformer代码超详细解读(Pytorch)
【Transformer系列(5)】Transformer代码超详细解读(Pytorch)
2470 1
【Transformer系列(5)】Transformer代码超详细解读(Pytorch)
|
关系型数据库 MySQL 数据库
MySQL数据库:基础概念、应用与最佳实践
一、引言随着互联网技术的快速发展,数据库管理系统在现代信息系统中扮演着核心角色。在众多数据库管理系统中,MySQL以其开源、稳定、可靠以及跨平台的特性受到了广泛的关注和应用。本文将详细介绍MySQL数据库的基本概念、特性、应用领域以及最佳实践,帮助读者更好地理解和应用MySQL数据库。二、MySQL
1053 5
|
存储 人工智能 小程序
比赛须知【2024 年睿抗机器人开发者大赛CAIP-编程技能赛(国赛)】
该文章是关于2024年睿抗机器人开发者大赛CAIP-编程技能赛(国赛)的参赛通知,强调了比赛时间、阅读比赛须知的重要性,并列举了多项比赛期间禁止的行为以确保比赛的公平性。
 比赛须知【2024 年睿抗机器人开发者大赛CAIP-编程技能赛(国赛)】
|
数据采集 安全 测试技术
LabVIEW调用DLL时需注意的问题
LabVIEW调用DLL时需注意的问题
891 0

热门文章

最新文章