一起架构-某实时分析项目云原生 serverless 架构的设计思路和poc代码实现

简介: 一起架构-某实时分析项目云原生 serverless 架构的设计思路和poc代码实现

1. 前言 - 云原生与多云混合云的部署架构

大家好,我是明哥!

在数字化转型的大背景下,越来越多的企业不断将越来越多的应用部署到云上,应用的架构也更加倾向云原生,以支持多云和混合云的部署架构。

前段时间,笔者参与了某个实时分析项目在 AWS 上的架构设计和 POC 开发,该项目使用了 serverless 的云原生架构,在此跟大家分享下架构设计和 poc 代码的细节,希望大家喜欢。

2. 项目背景和目标 Background and goals

整个项目的背景和目标如下:

640.png


经提炼和概括,项目的背景,基本目标和额外目标如下:

  • 背景:Ingest, transform and prepare the netCDF data provided by UK Met Office, make them available for secure querying by our customer, as soon as it arrives in the S3 bucket.
  • 基本目标:Core capabilities include:
  • high availability (no downtime)
  • quick response
  • timely availability of new data.
  • 额外目标:Extra Goals:
  • Security
  • cost effectiveness

3. 整体架构图 Architecture overview

最终设计的完整的架构图如下:

640.png


4. 架构设计和技术选型 Architecture details and thought process

4.1 How to discover new available data ASAP? - SQS

  • UK Met Office prepares the original data in netCDF format and uploaded them to a S3 bucket, but as listing a bucket is both expensive and slow (file system vs object store), we can’t take this approach for quickly discover of new available data in s3;
  • We noticed that UK Met Office will also send a message to a SNS topic once new data is available in the S3 bucket, so we can use a SQS to scribe to the SNS topic, and got notified when new objects are created, this solution is latency-efficient, cost-effective and scalable;

4.2 Can we use the original S3 bucket used by the UK Met Office?

  • We noticed that the original data will be held in the bucket for 7 days after the notification is sent, then they will be deleted;
  • We can use our own S3 bucket to store the data, so we have full control of the data, including the data lifecycle, the data security policies, etc;

4.3 How to server our end users, with quick response and high availability? – API gateway + DynamoDB

  • Our end users typically ask questions like “how will the weather/humidity/temperature be like in city C1, at time T1? how about city C2 and C3? How about time T2?”, to answer that question, we have to first figure out which files in the S3 bucket contains forecast results for that specific time (all the files contains forecast results for all the cities in UK, so place should not be a problem);
  • So we can use a RDS or DynamoDB to store the metadata “which s3 file contains forecast results for which time”, then when we receive a specific question from our customer, we can first query the RDS/DynamoDB to find out the corresponding S3 file, then they can query the s3 file to get all the forecast details, including weather/humidity/temperature etc, for all the UK cities;
  • RDS is a relational database and is typically for well-formed structured data, while DynamoDB is a fully-managed Key-value NoSql data store, both can fulfill our functional requirements, but considering that we don’t have highly-structured data, and DynamoDB shines in Availability. Scalability and Performance, so we will go with DynamoDB;
  • We can use an API gateway as a proxy to the DynamoDB and answer the end user’s request directly, with out an extra lambda layer between API gateway and DynamoDB, hence the whole data pipeline is shorter, which will be more time-effective, cost-effective, and less issues will occur; Also API gateway provides many security mechanisms, including authentication, authorization,audit and encryption;
  • With api gateway, DynamoDB and S3, the whole serving layer will response quickly with high availability, and is also cost-effective and secured;

4.4 How to ingest, transform and prepare the original data - Lambda!

  • To consume messages in SQS queues, we normally follow the event-driven architecture and use streaming processing frameworks like spark streaming/flink/kafka stream, but to use them, you need to first provision ec2 servers and possibly use ecs/eks, but you need to deploy, monitor and scale(both up and down) your app all by yourself, this is cumbersome and not cost-effective;
  • You can consider using serverless Fargate, but you have to deal with the event-driven by yourself;
  • Lambda is both serverless and event-driven, it automatically scales according to your data volume, it integrates with other aws services like sqs, s3, DynamoDB, api gateway well, and it allow you to pay for what you use, so it is a perfect match for our case!
  • we can use lambda and create a sqs trigger, so right after events arrived in sqs, it will trigger the execution of lamba where we can do the transform and load into downstream DynamoDB table;

5 技术组件细节和示例代码 Component details and code samples

5.1 Component details and code samples – sqs and lambda

  • Sqs type:as there is no need for First-in-first-out message delivery and Exactly-once processing, we can stay with the standard type sqs,which offers better scalability;
  • Sqs encryption: Amazon SQS provides in-transit encryption by default, we also added at-rest encryption to our queue by enable server-side encryption (uses Amazon SQS key (SSE-SQS));
  • Lambda: lambda has an sqs trigger, and for performance consideration, we are using batch to writer into dynamodb;
  • Labmda permission: to follow the least-access polity, we created a new IAM role with basic Lambda permissions (with just polices like AWSLambdaSQSQueueExecutionRole/AWSLambdaExecute/AWSLambdaDynamoDBExecutionRole)

5.2 Component details and code samples - dynamoDB

  • dynamoDB is serverless and will auto scale based on data volume and query, so to avoid hot spot bottleneck, we used forecast_period as partition key/hash key and forecast_time as sort key;(forecast_period is the difference between forecast_reference_time and forecast_time);
  • As end users typically query based on time, so we created a secondary global indexes sgi, with partition key on the time field forecast_time;
  • Encryption: we turned on Encryption at rest, and used encryption keys stored in AWS Key Management Service, whch is managed by DynamoDB at no extra cost;
  • permission: for apis to query the dynamoDB, we followed the least-access polity and created an access control policy with only read policy on the table and index;

5.3 Component details and code samples – api gateway

  • Api: I created two methods and resources, and configured the integration request and integration response’s mapping template, to full fill the scan and query on the dynamoDB, with paths like /times and /times/{time}, the latter one will use the sgi we created for the table;
  • Api key: I configured the method request to use API Key;
  • permission: to follow the least-access polity, we created a new IAM role with only necessary permissions (with just polices like AmazonAPIGatewayPushToCloudWatchLogs, and the dynamodb read-only policy we created earlier)

5.4 Component details and code samples – lambda codes

640.png

5.5 Component details and code samples – api codes

640.png

6. 脚本与自动化 automation using script - cloudFormation

  • I believe in IaC (infrustructure As Code) and GitOps, humans will make mistakes and automation helps us on this (plus automation is more efficient and script is more repeatable);
  • So I tried to use cloudFormation template to simplify the infrastructure management (due to time constraint, I only finished the dynamodb template);
  • Below are part of the cloudFormation script for the dynamodb table creation;

640.png

7. 终端用户模拟访问效果 End user query simulation results

640.png

  • IAM user with read only permission – IAM user name: arn:aws:iam::000435319421:user/demo
  • IAM user with read only permission – IAM user password: demo123@aws
  • End user request url: https://jye2m0pw20.execute-api.us-east2.amazonaws.com/v1/times/2022-04-16T22:45:00Z
  • End user request sample path parameter: 2022-04-17T22:30:00Z/2022-04-16T22:45:00Z, etc;
  • End user request type: get
  • End user request Authorization Type: api key
  • Key: x-api-key
  • Value: kNKmXfQGNx802XU1f75Mu9vRAFBvWIdM5uT7NmHa
  • Add to: header

8. 总结 Wrap up

  • high availability (no downtime): The solution used components like sns,sqs,lambda,dynamodb,api gateway and s3, all of which are managed services which scaled well and scaled automatically, to ensure high availability (no downtime);
  • quick response: The solution used dynamoDB in the serving layer, which scales well and scales automatically, and with the careful design of hashkey,sortkey and sgi, it offers quick response time to end users;
  • timely availability of new data: The solution followed the event driven architecture, with sqs and lambda, and ensured the timely availability of new data;
  • cost effectiveness:The solution followed the server-less architecture and used aws serveless services, so we can pay only what we use, and hence is cost effective;
  • security:
  • Encryption:aws service used TLS to provide encryption between user application and the AWS service which offered data-in-motion/transit encryption, and we enabled data-at-rest encryption;
  • Authentication and Authorization:we also followed the least-access policy to create IAM roles and policyes. we also used an api key to protect our api gateway from malicious attacks
  • Audit: CloudWatch is used for the audit;
相关实践学习
函数计算部署PuLID for FLUX人像写真实现智能换颜效果
只需一张图片,生成程序员专属写真!本次实验在函数计算中内置PuLID for FLUX,您可以通过函数计算+Serverless应用中心一键部署Flux模型,快速体验超写实图像生成的魅力。
从 0 入门函数计算
在函数计算的架构中,开发者只需要编写业务代码,并监控业务运行情况就可以了。这将开发者从繁重的运维工作中解放出来,将精力投入到更有意义的开发任务上。
相关文章
|
7月前
|
运维 监控 Cloud Native
【云故事探索】NO.17:国诚投顾的云原生 Serverless 实践
国诚投顾携手阿里云,依托Serverless架构实现技术全面升级,构建高弹性、智能化技术底座,提升业务稳定性与运行效率。通过云原生API网关、微服务治理与智能监控,实现流量精细化管理与系统可观测性增强,打造安全、敏捷的智能投顾平台,助力行业数字化变革。
【云故事探索】NO.17:国诚投顾的云原生 Serverless 实践
|
7月前
|
运维 监控 Cloud Native
【云故事探索】NO.17:国诚投顾的云原生 Serverless 实践
通过与阿里云深度合作,国诚投顾完成了从传统 ECS 架构向云原生 Serverless 架构的全面转型。新的技术架构不仅解决了原有系统在稳定性、弹性、运维效率等方面的痛点,还在成本控制、API 治理、可观测性、DevOps 自动化等方面实现了全方位升级。
|
7月前
|
弹性计算 运维 Cloud Native
【云故事探索】NO.17:国诚投顾的云原生Serverless实践
简介: 通过与阿里云深度合作,国诚投顾完成了从传统 ECS 架构向云原生 Serverless 架构的全面转型。新的技术架构不仅解决了原有系统在稳定性、弹性、运维效率等方面的痛点,还在成本控制、API 治理、可观测性、DevOps 自动化等方面实现了全方位升级。
193 1
|
11月前
|
数据采集 运维 Serverless
云函数采集架构:Serverless模式下的动态IP与冷启动优化
本文探讨了在Serverless架构中使用云函数进行网页数据采集的挑战与解决方案。针对动态IP、冷启动及目标网站反爬策略等问题,提出了动态代理IP、请求头优化、云函数预热及容错设计等方法。通过网易云音乐歌曲信息采集案例,展示了如何结合Python代码实现高效的数据抓取,包括搜索、歌词与评论的获取。此方案不仅解决了传统采集方式在Serverless环境下的局限,还提升了系统的稳定性和性能。
343 0
|
11月前
|
存储 运维 Serverless
千万级数据秒级响应!碧桂园基于 EMR Serverless StarRocks 升级存算分离架构实践
碧桂园服务通过引入 EMR Serverless StarRocks 存算分离架构,解决了海量数据处理中的资源利用率低、并发能力不足等问题,显著降低了硬件和运维成本。实时查询性能提升8倍,查询出错率减少30倍,集群数据 SLA 达99.99%。此次技术升级不仅优化了用户体验,还结合AI打造了“一看”和“—问”智能场景助力精准决策与风险预测。
1056 69
|
10月前
|
存储 缓存 分布式计算
StarRocks x Iceberg:云原生湖仓分析技术揭秘与最佳实践
本文将深入探讨基于 StarRocks 和 Iceberg 构建的云原生湖仓分析技术,详细解析两者结合如何实现高效的查询性能优化。内容涵盖 StarRocks Lakehouse 架构、与 Iceberg 的性能协同、最佳实践应用以及未来的发展规划,为您提供全面的技术解读。 作者:杨关锁,北京镜舟科技研发工程师
StarRocks x Iceberg:云原生湖仓分析技术揭秘与最佳实践
|
10月前
|
数据采集 运维 监控
Serverless爬虫架构揭秘:动态IP、冷启动与成本优化
随着互联网数据采集需求的增长,传统爬虫架构因固定IP易封禁、资源浪费及扩展性差等问题逐渐显现。本文提出基于Serverless与代理IP技术的新一代爬虫方案,通过动态轮换IP、弹性调度任务等特性,显著提升启动效率、降低成本并增强并发能力。架构图与代码示例详细展示了其工作原理,性能对比数据显示采集成功率从71%提升至92%。行业案例表明,该方案在电商情报与价格对比平台中效果显著,未来有望成为主流趋势。
438 0
Serverless爬虫架构揭秘:动态IP、冷启动与成本优化
|
11月前
|
Cloud Native Serverless 流计算
云原生时代的应用架构演进:从微服务到 Serverless 的阿里云实践
云原生技术正重塑企业数字化转型路径。阿里云作为亚太领先云服务商,提供完整云原生产品矩阵:容器服务ACK优化启动速度与镜像分发效率;MSE微服务引擎保障高可用性;ASM服务网格降低资源消耗;函数计算FC突破冷启动瓶颈;SAE重新定义PaaS边界;PolarDB数据库实现存储计算分离;DataWorks简化数据湖构建;Flink实时计算助力风控系统。这些技术已在多行业落地,推动效率提升与商业模式创新,助力企业在数字化浪潮中占据先机。
601 12
|
Cloud Native 安全 Serverless
云原生应用实战:基于阿里云Serverless的API服务开发与部署
随着云计算的发展,Serverless架构日益流行。阿里云函数计算(Function Compute)作为Serverless服务,让开发者无需管理服务器即可运行代码,按需付费,简化开发运维流程。本文从零开始,介绍如何使用阿里云函数计算开发简单的API服务,并探讨其核心优势与最佳实践。通过Python示例,演示创建、部署及优化API的过程,涵盖环境准备、代码实现、性能优化和安全管理等内容,帮助读者快速上手Serverless开发。
|
存储 消息中间件 人工智能
基于 Apache RocketMQ 的 ApsaraMQ Serverless 架构升级
基于 Apache RocketMQ 的 ApsaraMQ Serverless 架构升级
320 0

热门文章

最新文章