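When a large file download is interrupted, you have to start over from the beginning, which is a hassle. This code is handy in that situation.
It appears to work by splitting the download into multiple segments.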

https://github.com/spaceromany/resume_download_for_scamps

resume_download_for_scamps
SCAMPS (https://github.com/danmcduff/scampsdataset) consists of many video files, but its download URL does not support resuming an interrupted download. Our code downloads the SCAMPS dataset using Python and provides a resume function.
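
To make the idea concrete, here is a minimal sketch of resuming a download with an HTTP `Range` request. It is not the repository's code. The names `plan_segments` and `resume_download` are made up for illustration, and the sketch assumes the server honors `Range` headers. `plan_segments` shows how a file of known size could be cut into byte ranges, and `resume_download` continues a single file from wherever the previous attempt stopped.

```python
import os
import urllib.error
import urllib.request


def plan_segments(total_size, segment_size):
    """Split [0, total_size) into inclusive (start, end) byte ranges."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [
        (start, min(start + segment_size, total_size) - 1)
        for start in range(0, total_size, segment_size)
    ]


def resume_download(url, dest, chunk_size=1 << 20):
    """Append the missing bytes to dest (hypothetical helper, see above)."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url)
    if offset:
        # Ask only for the bytes after the ones already on disk.
        request.add_header("Range", f"bytes={offset}-")
    try:
        with urllib.request.urlopen(request) as response:
            # 206 means the range was honored, so append. Any other
            # status means the full file is being sent, so overwrite.
            mode = "ab" if response.status == 206 else "wb"
            with open(dest, mode) as out:
                while True:
                    chunk = response.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
    except urllib.error.HTTPError as err:
        # 416 with a non-zero offset usually means nothing is left to fetch.
        if err.code == 416 and offset:
            return offset
        raise
    return os.path.getsize(dest)
```

Calling `resume_download(url, "video.zip")` again after a dropped connection picks up from the current size of `video.zip`. If the server ignores `Range` and replies with the whole file, the sketch overwrites the partial copy rather than corrupting it.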

Posted by uniqueone

Finally a dataset for virtual hair editing and hairstyle classification! https://www.catalyzex.com/paper/arxiv:2102.06288

👇 Free extension to get code for ML papers (❤️'d by Andrew Ng)
Chrome: https://chrome.google.com/webstore/detail/find-code-for-research-pa/aikkeehnlfpamidigaffhfmgbkdeheil
Firefox: https://addons.mozilla.org/en-US/firefox/addon/code-finder-catalyzex

Posted by uniqueone

The COCO-WholeBody dataset is the first large-scale benchmark for whole-body pose estimation. It extends the COCO 2017 dataset and uses the same train/val split as COCO.

For project and code/API/expert requests: https://www.catalyzex.com/paper/arxiv:2007.11858

For each person, they annotate 4 types of bounding boxes (person box, face box, left-hand box, and right-hand box) and 133 keypoints (17 for body, 6 for feet, 68 for face and 42 for hands).
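
As a quick sanity check on those numbers, the sketch below adds up the keypoint groups and derives index ranges for each group in a flat 133-keypoint array. The group sizes come from the description above. The group order and the names `KEYPOINT_GROUPS` and `keypoint_slices` are assumptions for illustration, not the dataset's official API.

```python
# Group sizes as stated above; the ordering is an assumption.
KEYPOINT_GROUPS = [("body", 17), ("feet", 6), ("face", 68), ("hands", 42)]


def keypoint_slices(groups):
    """Return ({group name: slice}, total keypoint count)."""
    slices = {}
    start = 0
    for name, count in groups:
        # Each group occupies the next `count` consecutive positions.
        slices[name] = slice(start, start + count)
        start += count
    return slices, start


slices, total = keypoint_slices(KEYPOINT_GROUPS)
print(total)           # 133
print(slices["face"])  # slice(23, 91, None)
```

With a per-person list of 133 keypoints, `keypoints[slices["hands"]]` would pick out the 42 hand keypoints under this assumed ordering.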

Get the free ML code finder browser extension:
Chrome https://bit.ly/code_finder_chrome
Firefox https://bit.ly/code_finder_firefox

Posted by uniqueone

LandCover.ai: Dataset for Automatic Mapping of Buildings, Woodlands and Water from Aerial Imagery

For project and dataset: https://www.catalyzex.com/paper/arxiv:2005.02264

They collected aerial images covering 216.27 sq. km of land across Poland, a country in Central Europe: 39.51 sq. km at a resolution of 50 cm per pixel and 176.76 sq. km at a resolution of 25 cm per pixel. They then manually annotated, in fine detail, three classes of objects: buildings, woodlands, and water.

Posted by uniqueone

Great dataset recently released for the autonomous vehicle industry: Audi Autonomous Driving Dataset (A2D2)!

Link for project and dataset: https://www.catalyzex.com/paper/arxiv:2004.06320

The dataset consists of simultaneously recorded images and 3D point clouds, together with 3D bounding boxes, semantic segmentation, instance segmentation, and data extracted from the automotive bus.

Posted by uniqueone

Did you know that you can now search for datasets just as you search for images on Google? This makes it easier than ever to find data for training machine learning models.

PS: Remember that, as good practice in data science, you should always clean and prepare any dataset before using it!

https://toolbox.google.com/datasetsearch
#datascience
#machinelearning

Posted by uniqueone
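Hello, I built a simple Korean named entity recognizer using KoBERT, which SKTBrain recently released. If you are interested in NER, it may be worth a look.

It trains faster than the earlier CNN-BiLSTM and, perhaps because it builds on a language model, it also seems somewhat more robust to typos. (It is honestly a bit surprising that NER works this well even without morpheme tag features.) Adding a CRF also seems to improve performance a little further.

For the data, I used the dataset released by the Natural Language Processing Lab at Korea Maritime and Ocean University. (NER datasets are hard to come by, and this one seems decent.)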
https://www.facebook.com/groups/PyTorchKR/permalink/1519149218224754/?sfnsn=mo
Posted by uniqueone
This is NAVER LABS' open dataset for autonomous driving, introduced by CEO Sangok Seok in the DEVIEW2019 keynote. It is expected to be a big help to the growth of autonomous driving technology in Korea : )
https://www.facebook.com/groups/ReinforcementLearningKR/permalink/2324753411097219/?sfnsn=mo
Posted by uniqueone
Hi guys,

Do you want to build computer vision models for cattle monitoring?
I've made the COCO JSON, masks, and images freely available here: https://nsmb.me/aw0f

I'm planning on sharing more, maybe writing tutorials if anybody is interested. Would love to get your feedback on this. 😊
https://www.facebook.com/groups/1738168866424224/permalink/2415463898694714/?sfnsn=mo
Posted by uniqueone