Datasets:
Tasks:
Image Classification
Languages:
English
Size Categories:
10K<n<100K
Source Datasets:
original
License:
odbl
image
image
| labels
class label
15 classes
|
---|---|
11
(sitting) |
|
14
(using_laptop) |
|
7
(hugging) |
|
12
(sleeping) |
|
14
(using_laptop) |
|
12
(sleeping) |
|
4
(drinking) |
|
7
(hugging) |
|
1
(clapping) |
|
3
(dancing) |
|
2
(cycling) |
|
4
(drinking) |
|
1
(clapping) |
|
0
(calling) |
|
12
(sleeping) |
|
4
(drinking) |
|
0
(calling) |
|
8
(laughing) |
|
14
(using_laptop) |
|
14
(using_laptop) |
|
1
(clapping) |
|
5
(eating) |
|
6
(fighting) |
|
9
(listening_to_music) |
|
3
(dancing) |
|
4
(drinking) |
|
2
(cycling) |
|
8
(laughing) |
|
4
(drinking) |
|
3
(dancing) |
|
9
(listening_to_music) |
|
2
(cycling) |
|
11
(sitting) |
|
11
(sitting) |
|
12
(sleeping) |
|
5
(eating) |
|
10
(running) |
|
10
(running) |
|
7
(hugging) |
|
7
(hugging) |
|
0
(calling) |
|
14
(using_laptop) |
|
9
(listening_to_music) |
|
2
(cycling) |
|
12
(sleeping) |
|
7
(hugging) |
|
13
(texting) |
|
12
(sleeping) |
|
14
(using_laptop) |
|
10
(running) |
|
7
(hugging) |
|
0
(calling) |
|
14
(using_laptop) |
|
4
(drinking) |
|
7
(hugging) |
|
2
(cycling) |
|
8
(laughing) |
|
0
(calling) |
|
1
(clapping) |
|
8
(laughing) |
|
13
(texting) |
|
11
(sitting) |
|
8
(laughing) |
|
14
(using_laptop) |
|
12
(sleeping) |
|
9
(listening_to_music) |
|
2
(cycling) |
|
1
(clapping) |
|
13
(texting) |
|
13
(texting) |
|
5
(eating) |
|
2
(cycling) |
|
4
(drinking) |
|
0
(calling) |
|
10
(running) |
|
4
(drinking) |
|
13
(texting) |
|
11
(sitting) |
|
0
(calling) |
|
7
(hugging) |
|
5
(eating) |
|
12
(sleeping) |
|
14
(using_laptop) |
|
14
(using_laptop) |
|
10
(running) |
|
0
(calling) |
|
14
(using_laptop) |
|
14
(using_laptop) |
|
9
(listening_to_music) |
|
0
(calling) |
|
3
(dancing) |
|
8
(laughing) |
|
2
(cycling) |
|
1
(clapping) |
|
14
(using_laptop) |
|
0
(calling) |
|
3
(dancing) |
|
4
(drinking) |
|
13
(texting) |
|
6
(fighting) |
Dataset Summary
A dataset from kaggle. origin: https://dphi.tech/challenges/data-sprint-76-human-activity-recognition/233/data
Introduction
- The dataset features 15 different classes of Human Activities.
- The dataset contains about 12k+ labelled images including the validation images.
- Each image has only one human activity category and are saved in separate folders of the labelled classes
PROBLEM STATEMENT
- Human Action Recognition (HAR) aims to understand human behavior and assign a label to each action. It has a wide range of applications, and therefore has been attracting increasing attention in the field of computer vision. Human actions can be represented using various data modalities, such as RGB, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, radar, and WiFi signal, which encode different sources of useful yet distinct information and have various advantages depending on the application scenarios.
- Consequently, lots of existing works have attempted to investigate different types of approaches for HAR using various modalities.
- Your Task is to build an Image Classification Model using CNN that classifies to which class of activity a human is performing.
About Files
- Train - contains all the images that are to be used for training your model. In this folder you will find 15 folders namely - 'calling', ’clapping’, ’cycling’, ’dancing’, ‘drinking’, ‘eating’, ‘fighting’, ‘hugging’, ‘laughing’, ‘listeningtomusic’, ‘running’, ‘sitting’, ‘sleeping’, texting’, ‘using_laptop’ which contain the images of the respective human activities.
- Test - contains 5400 images of Human Activities. For these images you are required to make predictions as the respective class names -'calling', ’clapping’, ’cycling’, ’dancing’, ‘drinking’, ‘eating’, ‘fighting’, ‘hugging’, ‘laughing’, ‘listeningtomusic’, ‘running’, ‘sitting’, ‘sleeping’, texting’, ‘using_laptop’.
- Testing_set.csv - this is the order of the predictions for each image that is to be submitted on the platform. Make sure the predictions you download are with their image’s filename in the same order as given in this file.
- sample_submission: This is a csv file that contains the sample submission for the data sprint.
Data Fields
The data instances have the following fields:
image
: APIL.Image.Image
object containing the image. Note that when accessing the image column:dataset[0]["image"]
the image file is automatically decoded. Decoding of a large number of image files might take a significant amount of time. Thus it is important to first query the sample index before the"image"
column, i.e.dataset[0]["image"]
should always be preferred overdataset["image"][0]
.labels
: anint
classification label. Alltest
data is labeled 0.
Class Label Mappings:
{
'calling': 0,
'clapping': 1,
'cycling': 2,
'dancing': 3,
'drinking': 4,
'eating': 5,
'fighting': 6,
'hugging': 7,
'laughing': 8,
'listening_to_music': 9,
'running': 10,
'sitting': 11,
'sleeping': 12,
'texting': 13,
'using_laptop': 14
}
Data Splits
train | test | |
---|---|---|
# of examples | 12600 | 5400 |
Data Size
- download: 311.96 MiB
- generated: 312.59 MiB
- total: 624.55 MiB
>>> from datasets import load_dataset
>>> ds = load_dataset("Bingsu/Human_Action_Recognition")
>>> ds
DatasetDict({
test: Dataset({
features: ['image', 'labels'],
num_rows: 5400
})
train: Dataset({
features: ['image', 'labels'],
num_rows: 12600
})
})
>>> ds["train"].features
{'image': Image(decode=True, id=None),
'labels': ClassLabel(num_classes=15, names=['calling', 'clapping', 'cycling', 'dancing', 'drinking', 'eating', 'fighting', 'hugging', 'laughing', 'listening_to_music', 'running', 'sitting', 'sleeping', 'texting', 'using_laptop'], id=None)}
>>> ds["train"][0]
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=240x160>,
'labels': 11}
- Downloads last month
- 124