Overview

This dataset comprises twenty-one consecutive days of video footage collected by a drone following three troops of wild olive baboons in Laikipia, northern Kenya, as they left known sleeping sites in the morning and returned at night. Over ten hours of footage were collected in a variety of environments, including at a sleeping tree, a river, a rock, during a river crossing, in open savannah, and on a cliff. Frames include up to 70 individuals tracked simultaneously, forming a complex, dense dataset of overlapping individuals viewed from an aerial perspective. Three fundamental subtasks have been identified from the original dataset: detection, tracking, and behavior recognition. The data contains significant visual noise from camera motion, varying light conditions, spectral contrast from shadows, and differing background environments. This is the first dataset to enable automated classification of non-human primate behavior from aerial video, supporting the study of inter-group interactions in the context of the natural environment and in relation to the behavior of other troop members, and providing insight into the social network of the group.

Automatically tracking the individual ethograms that make up collective behavior can help unravel consistent patterns of group organization. Tracking individual exchanges enables the identification of triggers for switches in group behavior, leading to a better understanding of why splinter groups emerge. Additionally, various formations in the attack and defense of territory can be mapped. Non-human primates are highly territorial and switch allegiances depending on resource availability and prior interactions. By automatically modeling these interactions, we can better understand group dynamics and tipping points in collective behavior, including the emergence of new group formations. The goal is to methodologically advance the analysis of social network structures and collective decision-making and to improve predictive capabilities. Existing video datasets of primates come from camera traps or hand-held video recorders, limiting the number of individuals visible in a frame at once and the spatial and temporal window in which behavior is observed. A mobile monitoring technique enables troop behavior to be observed collectively and at a scale relevant to the decisions being made.


Detection

Examples of baboon detection from drone videos:

We evaluate the YOLOv8-X model with an input resolution of 768x768 on our dataset and report mAP@50, Precision, and Recall (all values in %).

Model      mAP@50   Precision   Recall
YOLOv8-X   92.62    93.70       87.60
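
For reference, a detector trained on the Ultralytics-format split can be scored with the Ultralytics validation API. This is only a minimal sketch; the weights file and dataset YAML names below are placeholders, not files shipped with this dataset.

    # Minimal sketch: validate a YOLOv8-X detector on the Ultralytics-format split.
    # "yolov8x_baboons.pt" and "baboon_detection.yaml" are hypothetical names --
    # point them at the weights and dataset YAML you actually use.
    from ultralytics import YOLO

    model = YOLO("yolov8x_baboons.pt")            # detector fine-tuned on BaboonLand
    metrics = model.val(data="baboon_detection.yaml", imgsz=768)

    print(f"mAP@50:    {metrics.box.map50:.4f}")  # mean average precision at IoU 0.50
    print(f"Precision: {metrics.box.mp:.4f}")     # mean precision over classes
    print(f"Recall:    {metrics.box.mr:.4f}")     # mean recall over classes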

Tracking

An example of tracking over a cliff:

An example of tracking over a river:

An example of tracking over a tree:

An example of tracking over a rock:

We evaluate the SORT, DeepSORT, StrongSORT, ByteTrack, and BoT-SORT tracking algorithms on our dataset and report MOTA, MOTP, IDF1, Precision, and Recall (all values in %).

Tracker      MOTA    MOTP    IDF1    Precision   Recall
SORT         84.76   50.15   77.43   90.83       91.19
DeepSORT     84.40   87.22   81.38   90.26       91.57
StrongSORT   82.48   85.37   84.98   88.00       90.10
ByteTrack    63.55   34.10   77.01   96.32       64.90
BoT-SORT     63.81   34.31   78.24   97.21       66.16
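
The MOT metrics above can be computed with the motmetrics package once per-frame ground-truth tracks and tracker output are loaded. The sketch below uses two hand-made frames purely to illustrate the accumulation loop; it is not the evaluation pipeline used for the table.

    # Minimal sketch: accumulate MOTA/MOTP/IDF1/Precision/Recall with motmetrics.
    # The two hand-made frames below stand in for real ground truth and tracker
    # output; boxes are [x, y, width, height].
    import motmetrics as mm

    frames = [
        # (gt_ids, gt_boxes, hyp_ids, hyp_boxes) for one frame
        ([1, 2], [[10, 10, 20, 20], [50, 50, 20, 20]],
         [1, 2], [[12, 11, 20, 20], [49, 50, 20, 20]]),
        ([1, 2], [[14, 10, 20, 20], [55, 50, 20, 20]],
         [1],    [[15, 10, 20, 20]]),               # one missed detection
    ]

    acc = mm.MOTAccumulator(auto_id=True)
    for gt_ids, gt_boxes, hyp_ids, hyp_boxes in frames:
        # IoU-based distances; pairs with IoU below 0.5 count as no match
        dists = mm.distances.iou_matrix(gt_boxes, hyp_boxes, max_iou=0.5)
        acc.update(gt_ids, hyp_ids, dists)

    mh = mm.metrics.create()
    summary = mh.compute(acc, metrics=["mota", "motp", "idf1", "precision", "recall"],
                         name="toy_tracker")
    print(mm.io.render_summary(summary, formatters=mh.formatters,
                               namemap=mm.io.motchallenge_metric_names))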

Behavior Recognition

The dataset includes thirteen categories that describe various animal behaviors:

Walking/Running
Sitting/Standing
Fighting/Playing
Self-Grooming
Being Groomed
Grooming Somebody
Mutual Grooming
Infant-Carrying
Foraging
Drinking
Mounting
Sleeping
Occluded

We evaluate the I3D, SlowFast, and X3D models on our dataset and report micro-average (per-instance) and macro-average (per-class) accuracy (Top-1/3/5, all values in %).

Method     Weight Init.   Micro Top-1   Micro Top-3   Micro Top-5   Macro Top-1   Macro Top-3   Macro Top-5
I3D        Random         61.29         89.38         92.34         26.53         54.51         65.47
SlowFast   Random         61.71         90.35         93.11         27.08         56.73         67.61
X3D        Random         63.97         91.34         95.17         30.04         60.58         72.13
X3D        K-400          64.89         92.54         96.66         31.41         62.04         74.01
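
Micro-average accuracy weights every clip equally, whereas macro-average first computes accuracy per class and then averages over classes, so rare behaviors count as much as common ones. A small sketch of the difference on made-up labels:

    # Minimal sketch: micro- vs macro-averaged top-1 accuracy on made-up labels.
    # In the real evaluation, y_true/y_pred would come from a video model's
    # per-clip predictions on the Charades-format test split.
    import numpy as np

    y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])   # class 0 dominates
    y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0])   # rare classes mostly missed

    # Micro average: one pool of instances
    micro = (y_true == y_pred).mean()

    # Macro average: per-class accuracy, then mean over classes
    classes = np.unique(y_true)
    per_class = [(y_pred[y_true == c] == c).mean() for c in classes]
    macro = np.mean(per_class)

    print(f"micro top-1: {micro:.3f}")   # 0.778 -- inflated by the majority class
    print(f"macro top-1: {macro:.3f}")   # 0.500 -- rare classes pull it down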

Format

BaboonLand
    /charades -> The dataset converted to Charades format to train and evaluate behavior
                 recognition models. You can download the generated dataset from our webpage
                 or you can generate it yourself. See instructions below.
        ...
    /cvat_templates -> You can use these templates to back up projects in CVAT.
                       They allow you to explore and adjust the annotations in CVAT.
        /behavior.zip
        /tracking.zip
    /dataset -> The dataset is located here.
        /video_1
            /actions -> The behavior annotations are located here.
                /0.xml
                /1.xml -> Annotations of the behavior for an individual with ID=1.
                ...
                /n.xml
            /mini-scenes -> Generated mini-scenes from video.mp4 and tracks.xml. The name of
                            each mini-scene matches the ID of the track in tracks.xml and the
                            name of the behavior annotations file in the actions folder. For
                            example, a track with ID=1 will be extracted into mini-scenes/1.mp4,
                            and its behavior annotations are located in actions/1.xml (a minimal
                            parsing sketch follows this directory tree).
                /0.mp4
                /1.mp4
                ...
                /n.mp4
            /timeline.jpg -> A timeline of the original video and corresponding mini-scenes.
                             This file is generated for convenience only. You can use it to
                             look for a mini-scene with a specific length or relative
                             location in the video.
            /tracks.xml -> This file contains tracks and bounding boxes of baboons in
                           CVAT for video 1.1 format. Each track has a unique ID. This
                           number matches the name of the file in the actions folder.
                           For example, if you want to get the track and corresponding
                           bounding boxes of a baboon with ID=1, you can get this
                           information from the tracks.xml file. If you want to explore
                           the behavior of the baboon with ID=1, you can get this
                           information with the help of the actions/1.xml file.
            /video.mp4 -> The original video from a drone.
        /video_2
            /actions
                /0.xml
                /1.xml
                ...
                /n.xml
            /mini-scenes
                /0.mp4
                /1.mp4
                ...
                /n.mp4
            /timeline.jpg
            /tracks.xml
            /video.mp4
        ...
        /video_n
            /actions
                /0.xml
                /1.xml
                ...
                /n.xml
            /mini-scenes
                /0.mp4
                /1.mp4
                ...
                /n.mp4
            /tracks.xml
            /video.mp4
    /scripts
        /requirements.txt -> Install all the requirements to be able to run scripts.
        /tracks2mini-scenes.py -> Use this script to generate the mini-scenes from
                                  video.mp4 and tracks.xml files.
        /dataset2charades.py -> Use this script to generate a dataset for Baboon behavior
                                recognition in Charades format. The generated dataset can
                                be used to train a model with the SlowFast framework. 
        /charades2video.py -> Use this script if you want to combine images from the dataset
                              in Charades format back to videos. These videos can be used to
                              create demos of the model performance.
        /charades2visual.py -> Use this script if you want to combine images from the dataset
                               in Charades format back to videos and visualize corresponding
                               behavior annotations.
        /dataset2tracking.py -> Use this script to generate a data split for training and
                                evaluating tracking algorithms.
        /tracking2ultralytics.py -> Use this script to generate a Baboon detection dataset in
                                    Ultralytics (YOLO) format. The dataset can be used to
                                    train detection models with the Ultralytics (YOLOv8)
                                    framework.
        /ultralytics2pyramid.py -> Use this script to split the original 5.3K images in the
                                   Ultralytics dataset into tiles. The script creates a dataset
                                   with 2x2, 3x3, and 4x4 tiles, which helps train a model that
                                   is more robust to both small and large baboons.
    /tracking -> The dataset split into train and test sets for tracking, with the train split
                 also converted to Ultralytics format for training and evaluating detection
                 models. You can download the generated dataset from our webpage or generate
                 it yourself.
        ...
    /README.md
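
As a rough illustration of how tracks.xml, video.mp4, and the mini-scenes fit together, the sketch below reads one track and crops its frames with OpenCV. It assumes the standard CVAT for video 1.1 element names (track, box, xtl/ytl/xbr/ybr, outside); scripts/tracks2mini-scenes.py remains the authoritative implementation.

    # Minimal sketch: crop the frames of one track (ID=1) out of video.mp4,
    # mirroring what scripts/tracks2mini-scenes.py does. Element and attribute
    # names follow the standard "CVAT for video 1.1" export and are assumptions,
    # not verified against this repository's files.
    import xml.etree.ElementTree as ET
    import cv2

    TRACK_ID = "1"
    root = ET.parse("dataset/video_1/tracks.xml").getroot()
    track = next(t for t in root.iter("track") if t.get("id") == TRACK_ID)

    # frame index -> (xtl, ytl, xbr, ybr), skipping boxes marked outside the frame
    boxes = {
        int(b.get("frame")): tuple(float(b.get(k)) for k in ("xtl", "ytl", "xbr", "ybr"))
        for b in track.iter("box") if b.get("outside") == "0"
    }

    cap = cv2.VideoCapture("dataset/video_1/video.mp4")
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx in boxes:
            xtl, ytl, xbr, ybr = boxes[frame_idx]
            crop = frame[int(ytl):int(ybr), int(xtl):int(xbr)]
            cv2.imwrite(f"mini_scene_{TRACK_ID}_{frame_idx:06d}.jpg", crop)
        frame_idx += 1
    cap.release()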

Acknowledgments

ID was supported by the National Academy of Sciences Research Associate Program and the United States Army Research Laboratory while conducting this study. MK was supported by the National Science Foundation under Award No. 2118240 and Award No. 2112606 (AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE)). ID collected all the UAV data on a Civil Aviation Authority Drone License CAA NQE Approval Number: 0216/1365 in conjunction with authorization from a KCAA operator under a Remote Pilot License. The data was gathered at the Mpala Research Centre in Kenya, in accordance with Research License No. NACOSTI/P/22/18214. The data collection protocol adhered strictly to the guidelines set forth by the Institutional Animal Care and Use Committee under permission No. IACUC 1835F.


Citation

@misc{duporge2024baboonland,
  title={BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos}, 
  author={Isla Duporge and Maksim Kholiavchenko and Roi Harel and Dan Rubenstein and Meg Crofoot and Tanya Berger-Wolf and Stephen Lee and Scott Wolf and Julie Barreau and Jenna Kline and Michelle Ramirez and Chuck Stewart},
  year={2024},
  eprint={2405.17698},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}