Parrot, one of the leaders in drone systems and technologies, produces a good collection of aerial videos that can be used to train aerial autonomous vehicles for Engineering, Construction, and Agriculture use cases.
Another good source of Autonomous Vehicle data, the Level 5 Dataset includes over 55,000 human-labeled 3D annotated frames, a surface map, and an underlying HD spatial semantic map that can be used to contextualize the data, captured from 7 cameras and up to 3 lidar sensors. Note that this dataset uses the nuScenes format.
One of the best-known Computer Vision benchmarks for Autonomous Driving, the KITTI dataset contains videos, Velodyne lidar scans, and GPS localization recordings of rural-area and highway driving in and around the city of Karlsruhe. It is a collaborative work of the Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago.
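KITTI's object labels are plain text files, one object per line, using a fixed 15-field layout (class, truncation, occlusion, observation angle, 2D bounding box, 3D dimensions, 3D location, and yaw). As a minimal sketch of reading that layout, assuming the standard published field order (the sample line below is illustrative, not taken from the dataset):

```python
from dataclasses import dataclass

@dataclass
class KittiLabel:
    # Standard KITTI object-label fields, in file order
    obj_type: str        # e.g. 'Car', 'Pedestrian', 'Cyclist'
    truncated: float     # 0 (fully visible) to 1 (leaving image bounds)
    occluded: int        # 0 = visible, 1 = partly, 2 = largely occluded, 3 = unknown
    alpha: float         # observation angle in radians
    bbox: tuple          # 2D box: (left, top, right, bottom) in pixels
    dimensions: tuple    # 3D size: (height, width, length) in metres
    location: tuple      # 3D centre: (x, y, z) in camera coordinates
    rotation_y: float    # yaw around the camera Y axis, in radians

def parse_kitti_label(line: str) -> KittiLabel:
    """Parse one line of a KITTI object-label file."""
    f = line.split()
    return KittiLabel(
        obj_type=f[0],
        truncated=float(f[1]),
        occluded=int(f[2]),
        alpha=float(f[3]),
        bbox=tuple(map(float, f[4:8])),
        dimensions=tuple(map(float, f[8:11])),
        location=tuple(map(float, f[11:14])),
        rotation_y=float(f[14]),
    )

# Illustrative label line in the KITTI format
sample = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
label = parse_kitti_label(sample)
print(label.obj_type, label.bbox)
```

In practice you would loop this parser over every line of each `label_2/*.txt` file; the 2D boxes pair with the camera images while the dimensions, location, and yaw describe the 3D box in camera coordinates.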
This is high-quality geospatial data pairing precision labels with high-resolution satellite imagery. It contains ~27,000 square km of very high-resolution imagery, 811,000 building footprints, and ~20,000 km of road labels, ensuring that adequate open-source data is available for geospatial machine learning research.
HACS (Human Action Clips and Segments) is a CV dataset well suited to building models for action Recognition and Temporal Localization, using videos of action segments. The dataset contains 1.55M clips drawn from 504K videos.
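Temporal localization models like those trained on HACS are typically evaluated with temporal intersection-over-union: how much a predicted (start, end) interval overlaps a ground-truth action segment. A minimal sketch of that metric (the function name and sample intervals are illustrative, not from HACS itself):

```python
def temporal_iou(seg_a, seg_b):
    """Intersection-over-union of two (start, end) time segments in seconds."""
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0

# Prediction covers 5-11 s, ground truth covers 2-8 s:
# 3 s of overlap over a 9 s union, i.e. IoU = 1/3
print(temporal_iou((2.0, 8.0), (5.0, 11.0)))
```

A prediction usually counts as correct when its temporal IoU against a ground-truth segment clears a threshold (commonly swept from 0.5 upward), mirroring how box IoU is used in object detection.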
COCO, short for Common Objects in Context, is a large image recognition/classification, object detection, segmentation, and captioning dataset. Volume: 330K images (200K+ annotated); more than 2M instances across 80 object categories, with 5 captions per image and 250,000 people annotated with keypoints.
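COCO ships its annotations as a single JSON file whose top-level `images`, `annotations`, and `categories` arrays are cross-referenced by id, with boxes stored as `[x, y, width, height]`. A minimal sketch of reading that structure, using a tiny hand-written annotation fragment in the COCO format (the file contents below are made up for illustration):

```python
import json

# Hypothetical miniature annotation file in the COCO format:
# "annotations" reference "images" via image_id and "categories" via category_id.
coco = json.loads("""
{
  "images": [{"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480}],
  "annotations": [{"id": 10, "image_id": 1, "category_id": 18,
                   "bbox": [73.5, 41.0, 210.0, 314.0], "iscrowd": 0}],
  "categories": [{"id": 18, "name": "dog", "supercategory": "animal"}]
}
""")

# Map category ids to names, then convert each box from
# [x, y, width, height] to (x_min, y_min, x_max, y_max) corners.
cat_names = {c["id"]: c["name"] for c in coco["categories"]}
for ann in coco["annotations"]:
    x, y, w, h = ann["bbox"]
    print(cat_names[ann["category_id"]], (x, y, x + w, y + h))
```

For the real dataset you would `json.load` the downloaded `instances_*.json` file instead of an inline string, or use the official `pycocotools` helpers, but the id-joining shown here is the same.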
Known as the de facto image dataset for CV algorithms, ImageNet is a large database of quality-controlled, human-annotated object images that aims to supply Computer Vision researchers and practitioners with the additional data they need.