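The tutorial below is a good starting point. The Supervision API is simple, but note that the tutorial is based on the earliest release, Supervision 0.1.0. Since then, many parts of the Supervision API have been refactored to offer a more streamlined and structured approach.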
Method 1: Jupyter Notebook
Method 2: Streamlit web app deployed to Streamlit Cloud
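- Confidence threshold: filters detections by confidence; only detections above the configured threshold are shown.
- Upload video: accepts MP4 videos up to 40MB.
- Select desired classes: chooses which classes to detect and track; the defaults are person and car.
- Source video information: shows metadata for the uploaded video, which you can use as a reference when configuring the line counter.
- Line counter configuration: sets the start and end points of the line counter as (x,y) coordinates.
- Box annotator configuration: configures the annotator for object detection boxes.
- Line annotator configuration: configures the annotator for the line counter.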
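To make the confidence threshold and class selection options concrete, here is a minimal pure-Python sketch of the filtering they imply. The `Detection` tuple, the `filter_detections` helper, and the sample values are hypothetical and exist only for illustration; the app itself applies the same two conditions to Supervision detections using NumPy masks.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    """Hypothetical stand-in for one detected object."""
    class_name: str
    confidence: float


def filter_detections(
    detections: Sequence[Detection],
    selected_classes: Sequence[str],
    threshold: float,
) -> List[Detection]:
    """Keep detections of a selected class whose confidence is strictly above the threshold."""
    selected = set(selected_classes)
    return [
        d for d in detections
        if d.class_name in selected and d.confidence > threshold
    ]


# Illustrative values only, not real model output.
frame_detections = [
    Detection("person", 0.91),
    Detection("car", 0.40),
    Detection("truck", 0.85),   # dropped: class not selected
    Detection("car", 0.10),     # dropped: below threshold
    Detection("person", 0.25),  # dropped: not strictly above 0.25
]

kept = filter_detections(frame_detections, ["person", "car"], threshold=0.25)
for detection in kept:
    print(detection.class_name, detection.confidence)
```

The comparison is strict, so a detection whose confidence equals the threshold is dropped, which matches the `>` comparison used in the app's callback.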
The app is hosted on Streamlit Cloud and is available at https://yolov8-object-counting.streamlit.app/.
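- Performance constraints: Streamlit Cloud does not provide a GPU environment, so processing runs on CPU only and is slow.
- Concurrency issues: under high concurrent usage the app may slow down noticeably, likely because it runs on shared cloud resources rather than scalable ones.
- Model selection: because of Streamlit Cloud's resource limits, I chose the YOLOv8n.pt model, which offers faster inference but weaker detection performance than other v8 models such as YOLOv8x.pt.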
Method 3: Local deployment with Streamlit
1. Create a conda environment
conda create -n yolov8-object-conting-local-deployment python=3.8
2. Activate the environment
conda activate yolov8-object-conting-local-deployment
3. Clone the repository
git clone https://github.com/grhaonan/yolov8-object-counting.git
4. Install all dependencies
cd yolov8-object-counting
pip install -r requirements.txt
5. Launch the Streamlit app; it should be served at http://localhost:8501/
streamlit run app.py
import streamlit as st
import os
import cv2
import supervision as sv
from ultralytics import YOLO
from time import sleep
from tqdm.notebook import tqdm
import numpy as np
from PIL import Image
import tempfile
import gdown
import glob

# Constants
MODEL_NAME = 'yolov8n.pt'
HOME = os.getcwd()
MODEL_DEFAULT_PATH = os.path.join(HOME, 'models', MODEL_NAME)
VIDEO_PATHS = {
    'source': os.path.join(HOME, 'data/raw', 'bgl.mp4'),
    'target': os.path.join(HOME, 'data/processed', 'temp_output_video.mp4')
}

# ------------------------ Helper Functions---------------------------------
# Helper function to download model from Google Drive
def download_from_gdrive(gdrive_url, output_path):
    gdown.download(gdrive_url, output_path, quiet=False)

# Helper function to load model, with caching for 60 minutes
@st.cache_resource(ttl=60*60)
def load_model(path):
    return YOLO(path)

# Helper function to remove all mp4 files in a directory except bgl.mp4 to clean up the space
def remove_mp4_except_bgl(directory):
    for filepath in glob.glob(f"{directory}/*.mp4"):
        if filepath != f"{directory}/bgl.mp4":
            os.remove(filepath)

# ------------------------ Streamlit App Description ---------------------------------
st.set_page_config(
    page_title="YOLOV8 Object Counting",
    page_icon="🧊",
    layout="wide",
    initial_sidebar_state="expanded",
    menu_items={
        'Get Help': 'https://github.com/grhaonan/yolov8-object-counting',
        'Report a bug': 'https://github.com/grhaonan/yolov8-object-counting',
        'About': 'Demo app for YOLOV8 Object Counting'
    }
)
st.title('YOLOV8 Object Counting')
st.subheader('A Streamlit App for Object Counting using YOLOV8 & Supervision')
st.markdown('Welcome to the YOLOV8 Object Counting Demo App! This app is built by '
            '[Dustin Liu](https://www.linkedin.com/in/dustin-liu/) - '
            'view project source code on '
            '[GitHub](https://github.com/grhaonan/yolov8-object-counting)')
st.markdown('A detailed explanation of this app can be found at [Medium](https://medium.com/@grdustin/yolov8-object-counting-19fa384a9cd3)')
st.markdown('Note: Please remember to adjust the line counter configuration to fit your video, otherwise it will raise an error when a point value is out of range!')
st.markdown('Streamlit Cloud is CPU only, so the processing speed is slow, and cloud resources can be constrained from time to time, so please be patient. '
            'I highly recommend you run this app locally; please refer to the README.md in the repo for more details')

def main():
    # ------------------------pre-loading---------------------------------
    # Downloading default source video from Google Drive if it doesn't exist
    if not os.path.exists(VIDEO_PATHS['source']):
        download_from_gdrive('https://drive.google.com/uc?id=1Zv-i5bj5wi22URGf3otlOSHJzdkSi7V_', VIDEO_PATHS['source'])

    # Initialize YOLO model
    placeholder = st.empty()
    if os.path.exists(MODEL_DEFAULT_PATH):
        model = load_model(MODEL_DEFAULT_PATH)
    else:
        placeholder.info('Model does not exist, downloading and this may take a while', icon="ℹ️")
        model = load_model(MODEL_NAME)
        # Make sure the models/ directory exists before moving the downloaded weights into it
        os.makedirs(os.path.dirname(MODEL_DEFAULT_PATH), exist_ok=True)
        os.rename(MODEL_NAME, MODEL_DEFAULT_PATH)
    placeholder.success('The model is loaded successfully')
    sleep(1)
    placeholder.empty()

    # Remove *.mp4 in data/raw/ except bgl.mp4
    remove_mp4_except_bgl("data/raw")
    # Remove *.mp4 in data/processed/
    remove_mp4_except_bgl("data/processed")

    # ------------------------ Streamlit Sidebar ---------------------------------
    # Show model size info in sidebar
    st.sidebar.markdown(f"Model Type: {MODEL_NAME[:-3]}")
    st.sidebar.markdown("<br>", unsafe_allow_html=True)

    # Video File Uploader and use a temp file to store the uploaded file and delete it after the process automatically
    uploaded_file = st.sidebar.file_uploader("Upload a video file", type=["mp4"])
    if uploaded_file is not None:
        with open("data/raw/temp_input_video.mp4", "wb") as f:
            f.write(uploaded_file.read())
        VIDEO_PATHS['source'] = "data/raw/temp_input_video.mp4"
    
    # Adding space between sidebar items
    st.sidebar.markdown("<br>", unsafe_allow_html=True)
    
    #Confidence threshold
    confidence_threshold = st.sidebar.slider("Confidence Threshold", min_value=0.0, max_value=1.0, value=0.25, step=0.01)
    st.sidebar.markdown("<br>", unsafe_allow_html=True)
    
    # Select desired detection classes
    class_names_dict = model.model.names
    selected_class_names = st.sidebar.multiselect("Select Desired Classes", list(class_names_dict.values()), ['car','person'])
    selected_class_ids = [k for k, v in class_names_dict.items() if v in selected_class_names]
    
    # Show video info
    st.sidebar.markdown("<br>", unsafe_allow_html=True)
    st.sidebar.markdown("Source Video Information:")
    video_info = sv.VideoInfo.from_video_path(VIDEO_PATHS['source'])
    col1, col2, col3 = st.sidebar.columns(3)
    with col1:
        st.write(f"width\n{video_info.width}")
    with col2:
        st.write(f"height\n{video_info.height}")
    with col3:
        st.write(f"#frames\n{video_info.total_frames}")
    
    # Point Configuration
    st.sidebar.markdown("<br>", unsafe_allow_html=True)
    with st.sidebar.expander("⚙️ Line Points Configuration: ", expanded=True):
        LINE_START = tuple(map(int, st.text_input("Starting Point (x,y)", "180,50").split(',')))
        LINE_END = tuple(map(int, st.text_input("Ending Point (x,y)", "180,1230").split(',')))
    
    # The x of LINE_START and LINE_END must not exceed video_info.width; otherwise show an error and stop
    if LINE_START[0] > video_info.width or LINE_END[0] > video_info.width:
        st.error('x from either LINE_START or LINE_END should be smaller or equal to video width')
        st.stop()
    
    # The y of LINE_START and LINE_END must not exceed video_info.height; otherwise show an error and stop
    if LINE_START[1] > video_info.height or LINE_END[1] > video_info.height:
        st.error('y from either LINE_START or LINE_END should be smaller or equal to video height')
        st.stop()

    # Box annotator
    # st.sidebar.markdown(f"Box Annotator Configuration:")
    with st.sidebar.expander("⚙️ Box Annotator Configuration: ", expanded=False):
        box_annotator_thickness = st.number_input("Box Thickness", min_value=1, value=1)
        box_annotator_text_thickness = st.number_input("Box Text Thickness", min_value=1, value=1)
        box_annotator_text_scale = st.number_input("Box Text Scale", min_value=0.1, max_value=1.0, value=0.5)
    
    # Line counter annotator
    # st.sidebar.markdown(f"Line Counter Annotator Configuration:")
    with st.sidebar.expander("⚙️ Line Annotator Configuration: ", expanded=False):
        line_thickness = st.number_input("Line Thickness", min_value=1, value=1)
        line_text_thickness = st.number_input("Line Text Thickness", min_value=1, value=1)
        line_text_scale = st.number_input("Line Text Scale", min_value=0.1, max_value=1.0, value=0.5)
    
    # ------------------------ Video Processing---------------------------------
    # Initialize Streamlit placeholder
    frame_placeholder = st.empty()
    
    def callback(frame: np.ndarray, index: int) -> np.ndarray:
        # model prediction on single frame and conversion to supervision Detections
        results = model(frame, verbose=False)[0]
        detections = sv.Detections.from_ultralytics(results)
        # only consider class id from selected_classes define above
        detections = detections[np.isin(detections.class_id, selected_class_ids)]
        detections = detections[detections.confidence > confidence_threshold]
        # tracking detections
        detections = byte_tracker.update_with_detections(detections)
        
        labels = [
            f"#{tracker_id} {model.model.names[class_id]} {confidence:0.2f}"
            for _, _, confidence, class_id, tracker_id
            in detections
        ]
        
        box_annotated_frame=box_annotator.annotate(scene=frame.copy(), detections=detections, labels=labels)
        
        # update line counter
        line_zone.trigger(detections)
        line_counter_annotated_frame = line_zone_annotator.annotate(box_annotated_frame, line_counter=line_zone)
        
        # display frame
        image_pil = Image.fromarray(cv2.cvtColor(line_counter_annotated_frame, cv2.COLOR_BGR2RGB))
        frame_placeholder.image(image_pil, use_column_width=True)
        
        return line_counter_annotated_frame

    # Video display
    LINE_START = sv.Point(*LINE_START)
    LINE_END = sv.Point(*LINE_END)
    
    # create BYTETracker instance
    byte_tracker = sv.ByteTrack(track_thresh=0.25, track_buffer=30, match_thresh=0.8, frame_rate=30)
    
    # create frame generator
    generator = sv.get_video_frames_generator(VIDEO_PATHS['source'])
    
    # create LineZone instance (previously called the LineCounter class)
    line_zone = sv.LineZone(start=LINE_START, end=LINE_END)
    
    # create instance of BoxAnnotator
    box_annotator = sv.BoxAnnotator(thickness=box_annotator_thickness, text_thickness=box_annotator_text_thickness, text_scale=box_annotator_text_scale)
    
    # create LineZoneAnnotator instance (previously called the LineCounterAnnotator class)
    line_zone_annotator = sv.LineZoneAnnotator(thickness=line_thickness, text_thickness=line_text_thickness, text_scale=line_text_scale)
    
    # process the whole video
    sv.process_video(
        source_path = VIDEO_PATHS['source'],
        target_path = VIDEO_PATHS['target'],
        callback=callback
    )

if __name__ == "__main__":
    # call main function
    main()
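1. Pre-loading: loads the YOLOv8 model, handles the uploaded video (or the default video), and cleans up disk space.
2. Streamlit sidebar: defines the UI controls of the app.
3. Video processing: processes the video frame by frame through the process_video function, driven by the logic in the callback function.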
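The counting step inside the callback can be easier to follow with a toy version. The sketch below is not Supervision's LineZone implementation. It is a simplified illustration, assuming each tracked object is reduced to a single point and that a crossing is counted whenever that point changes side of the line between frames. `SimpleLineCounter`, `side_of_line`, and the in/out naming convention are hypothetical; the real library may differ in detail, for example in which points of the bounding box it tests.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def side_of_line(start: Point, end: Point, point: Point) -> bool:
    """Return True when the 2D cross product of (end - start) and (point - start) is positive."""
    cross = (end[0] - start[0]) * (point[1] - start[1]) - (end[1] - start[1]) * (point[0] - start[0])
    return cross > 0


class SimpleLineCounter:
    """Toy line counter: counts tracked points that switch sides of a line."""

    def __init__(self, start: Point, end: Point) -> None:
        self.start = start
        self.end = end
        self.in_count = 0
        self.out_count = 0
        self._last_side: Dict[int, bool] = {}  # tracker_id -> side seen in the previous frame

    def update(self, tracked_points: Dict[int, Point]) -> None:
        """Process one frame of {tracker_id: point} and update the counters."""
        for tracker_id, point in tracked_points.items():
            side = side_of_line(self.start, self.end, point)
            previous = self._last_side.get(tracker_id)
            self._last_side[tracker_id] = side
            if previous is None or previous == side:
                continue  # first sighting, or no crossing since the last frame
            if side:
                self.in_count += 1   # assumption: moving to the positive side counts as "in"
            else:
                self.out_count += 1


# Vertical line using the app's default points (180,50) -> (180,1230).
counter = SimpleLineCounter(start=(180, 50), end=(180, 1230))
counter.update({1: (100, 600), 2: (300, 400), 3: (50, 50)})   # frame 1
counter.update({1: (260, 600), 2: (120, 400), 3: (60, 60)})   # frame 2
print(counter.in_count, counter.out_count)
```

Stable tracker IDs are what make this work: without the tracking step, there would be no way to tell that the point seen on one side in frame 1 is the same object seen on the other side in frame 2.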