OSSDC VisionAI

A set of computer vision and artificial intelligence algorithms for robotics and self-driving cars

This project supports the WebRTC-based Race.OSSDC.org platform, which allows quick and extensive testing of computer vision and neural network algorithms against live video (real or simulated) or streamed, non-live video (from YouTube or other datasets).

To contribute, follow the approach in the video_processing files to add your own algorithm, then create a PR to integrate it into this project.
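As a rough illustration of what a new processor file might look like, here is a minimal sketch. The file name `video_processing_example.py`, the transform name `example.threshold`, and the `init_model` / `process_image` function names and signatures are all assumptions made for this example; check an existing video_processing file for the interface the project actually expects. Frames are plain nested lists here so the sketch runs without OpenCV; a real processor would work on image arrays.

```python
"""Hypothetical video_processing_example.py (names and signatures are assumptions)."""


def init_model(transform):
    """Return the state the processor needs for the given transform name."""
    # A real processor would load network weights here; this one only
    # needs a brightness threshold.
    if transform == "example.threshold":
        return {"threshold": 128}
    raise ValueError(f"unsupported transform: {transform}")


def process_image(transform, model, frame):
    """Binarize a grayscale frame given as rows of 0-255 integers."""
    threshold = model["threshold"]
    return [
        [255 if pixel >= threshold else 0 for pixel in row]
        for row in frame
    ]
```

Splitting model setup from per-frame work keeps the expensive step out of the frame loop: `init_model` runs once, and `process_image` is then called for every incoming frame.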

OSSDC VisionAI Demo Reel - run the algorithms in Google Colab

Open Demo Reel In Colab

OAK-D Spatial AI camera - Demos

(Gaze estimation video can be found here)

(Pedestrian re-identification video can be found here)

(SSD object detection video can be found here)

MiDaS mono depth - Demos

(MiDaS mono-depth person demo video can be found here)

(MiDaS mono-depth night walk demo video can be found here)

(MiDaS mono-depth objects demo video can be found here)

Datasets and pretrained models are available in the https://github.com/OSSDC/OSSDC-VisionAI-Datasets project.

Install prerequisites

  • pip install opencv-python # required for all video processors
  • pip install opencv-contrib-python # required for video_processing_opencv
  • pip install aiortc aiohttp websockets python-engineio==3.14.2 python-socketio[client]==4.6.0 # required for WebRTC
  • pip install dlib # required for face_landmarks
  • pip install torch torchvision
  • pip install tensorflow-gpu
  • pip install youtube-dl # required for YouTube streaming sources

Install OSSDC VisionAI Android client app

Demos

  • Prerequisite steps every time before running the python video processing scripts

    • Run the VisionAI Android app, set up the room name and password, and start the WebRTC conference
    • Update the room info in signaling_race.py (every time the room name or password is modified in the VisionAI Android app)
  • SegFormer semantic segmentation with transformers demo

    • Install SegFormer https://github.com/NVlabs/SegFormer - see install steps in video_processing_SegFormer.py or OSSDC_VisionAI_demo_reel.ipynb notebook
    • run the SegFormer video processor on the video stream from VisionAI Android app
      • python race-ossdc-org_webrtc_processing.py -t SegFormer.b3-512-ade --room {your_room_name}
      • demo-reel.sh {your_room_name} (enable SegFormer line)
    • Demo videos SegFormer - semantic segmentation with transformers using OSSDC VisionAI platform https://www.youtube.com/watch?v=3ws-irF4dEQ
  • GANsNRoses demo

    • Install GANsNRoses https://github.com/mchong6/GANsNRoses - see install steps in video_processing_GANsNRoses.py or OSSDC_VisionAI_demo_reel.ipynb notebook
    • run the GANsNRoses video processor on the video stream from VisionAI Android app
      • python race-ossdc-org_webrtc_processing.py -t GANsNRoses --room {your_room_name}
      • demo-reel.sh {your_room_name} (enable GANsNRoses line)
    • Demo videos Have fun with GANsNRoses - using OSSDC VisionAI realtime video processing platform https://www.youtube.com/watch?v=YZTzjk_qh4w
  • DepthAI (OAK-D) stereo smart camera Side-By-Side 3D streaming demo

    • Install latest DepthAI API from https://github.com/luxonis/depthai-python
    • run the DepthAI video processor on the stereo or RGB video stream from OAK-D camera and stream it to VisionAI Android app
      • python race-ossdc-org_webrtc_processing.py -t depthai.sbs --room {your_room_name}
      • demo-reel.sh {your_room_name} (enable depthai.sbs line)
      • python race-ossdc-org_webrtc_processing.py -t depthai.rgb --room {your_room_name}
      • demo-reel.sh {your_room_name} (enable depthai.rgb line)
    • Demo videos Live 3D video streamed over internet from a DepthAI OAK-D with OSSDC VisionAI https://www.youtube.com/watch?v=28awrl5MipQ (use a VR headset to see the 3D depth)
  • Detectron2 demo

    • Install Detectron2 - see install steps in video_processing_detectron2.py or OSSDC_VisionAI_demo_reel.ipynb notebook
    • run the Detectron2 video processor on the video stream from VisionAI Android app
      • python race-ossdc-org_webrtc_processing.py -t detectron2 --room {your_room_name}
      • demo-reel.sh {your_room_name} (enable detectron2 line)
    • Demo videos TBD
  • DeepMind NFNets demo

  • MediaPipe Holistic demo

    • Install MediaPipe - see install steps in video_processing_mediapipe.py or OSSDC_VisionAI_demo_reel.ipynb notebook

    • run the MediaPipe holistic video processor on the video stream from VisionAI Android app

      • python race-ossdc-org_webrtc_processing.py -t mediapipe.holistic --room {your_room_name}
      • demo-reel.sh {your_room_name} (enable mediapipe.holistic line)
    • Demo video

      MediaPipe holistic demo

      Isn't this fun?! MediaPipe Holistic neural net model processed in real time on Google Cloud https://www.youtube.com/watch?v=0l9Bb5IC86E

  • OAK-D gaze estimation demo, the processing is done on the Luxonis OAK-D camera vision processing unit https://store.opencv.ai/products/oak-d

    • Install OAK-D DepthAI - see install steps in video_processing_oakd.py

    • run the OAK-D video processor on the video stream from VisionAI Android app

      • python race-ossdc-org_webrtc_processing.py -t oakd.gaze
    • Demo video

      Gaze estimation demo with processing done on Luxonis OAK-D camera processor (processing at 10 FPS on 486 x 1062 video, streamed at 30 FPS)

      https://www.youtube.com/watch?v=xMgNWRWytOk

  • OAK-D people reidentification demo, the processing is done on the Luxonis OAK-D camera vision processing unit https://store.opencv.ai/products/oak-d

    • Run the VisionAI Android app, set up the room, and start the WebRTC conference

    • Install OAK-D DepthAI - see install steps in video_processing_oakd.py

    • run the OAK-D video processor on the video stream from VisionAI Android app

      • python race-ossdc-org_webrtc_processing.py -t oakd.pre
    • Demo video

      People reidentification demo with processing done on Luxonis OAK-D camera processor (processing at 9 FPS on 486 x 1062 video, streamed at 30 FPS)

      https://www.youtube.com/watch?v=pB0BpHieu3Y

  • OAK-D age and gender recognition demo, the processing is done on the Luxonis OAK-D camera vision processing unit https://store.opencv.ai/products/oak-d

    • Install OAK-D DepthAI - see install steps in video_processing_oakd.py

    • run the OAK-D video processor on the video stream from VisionAI Android app

      • python race-ossdc-org_webrtc_processing.py -t oakd.age-gen
    • Demo video

      Upcoming

  • MiDaS mono depth, processing is done on Nvidia GPU

    • Run the VisionAI Android app, set up the room, and start the WebRTC conference

    • Install MiDaS - see install steps in video_processing_midas.py

    • run the MiDaS video processor on the video stream from VisionAI Android app

      • python race-ossdc-org_webrtc_processing.py -t midas
    • Demo Videos

      Mono depth over WebRTC using Race.OSSDC.org platform

      https://www.youtube.com/watch?v=6a6bqJiZuaM

      OSSDC VisionAI MiDaS Mono Depth - night demo

      https://www.youtube.com/watch?v=T0ZnW1crm7M

  • DLIB face landmarks, processing is done on CPU

    • Install DLIB and the face landmarks pretrained model - see install steps in video_processing_face_landmarks.py
    • run the DLIB face landmarks video processor on the video stream from VisionAI Android app
      • python race-ossdc-org_webrtc_processing.py -t face_landmarks
  • OpenCV edge detection, processing is done on CPU

    • run the OpenCV edges video processor on the video stream from VisionAI Android app
      • python race-ossdc-org_webrtc_processing.py -t opencv.edges
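
All of the demo commands above share one shape: the script name, a `-t` transform, and an optional `--room`. The sketch below builds that argument list so demos can be launched from another script. `build_command` and `processor_module` are hypothetical helpers, not part of the project, and the mapping from a transform name to a video_processing file is only inferred from the names used in this README (for example `oakd.gaze` and video_processing_oakd.py); it may not hold for every transform.

```python
import sys

SCRIPT = "race-ossdc-org_webrtc_processing.py"


def processor_module(transform):
    """Guess the video_processing file for a -t value (inferred naming)."""
    # "mediapipe.holistic" -> family "mediapipe"; "midas" has no variant part.
    family = transform.split(".", 1)[0]
    return f"video_processing_{family}.py"


def build_command(transform, room=None, python=sys.executable):
    """Return the argument list for one demo run, as shown in the list above."""
    command = [python, SCRIPT, "-t", transform]
    if room:
        # --room appears only in some of the README commands, so it is optional.
        command += ["--room", room]
    return command
```

For example, `subprocess.run(build_command("opencv.edges", room="my_room"), check=True)`, run from the repository root, would be equivalent to typing the corresponding command by hand. Passing the arguments as a list avoids shell quoting problems with room names.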