119 posts in 'Deep Learning/Papers2read'

  1. 2020.05.18 Separate a target speaker's speech from a mixture of two speakers For project
  2. 2020.05.15 State of the art in lane detection! For project and code or API request: [https:
  3. 2020.05.15 Today's paper comes with an interesting application example. Lately I have mostly been reading recent ICLR/CVPR papers plus their real-w
  4. 2020.05.01 This week's AI Paper Club topic is Deepfakes. We'll cover the technical, philoso
  5. 2020.05.01 From CVPR: Reconstruct photorealistic 3D faces from a single "in-the-wild" imag
  6. 2020.05.01 From CVPR '20: Robust 3D Self-portraits in Seconds https://www.catalyzex.com/pa
  7. 2020.03.18 Softmax Splatting for Video Frame Interpolation (AI Short Paper Summary) Paper:
  8. 2020.03.13 Handwritten word generation with GANs! https://www.profillic.com/paper/arxiv:20
  9. 2020.02.11 Video from 1896 changed to 60fps and 4K! The paper that was used to bring to 60
  10. 2020.02.11 State of the art in image inpainting (see example in the picture!) https://www.
  11. 2020.02.08 State of the art in image translation (guided) https://www.profillic.com/paper/
  12. 2020.01.28 Latest from Microsoft researchers: ImageBERT (for image-text joint embedding) h
  13. 2020.01.21 Latest from Stanford, Adobe and IIT researchers: State of the art in Virtual Try
  14. 2020.01.13 State of the art- Photoshop faces with hand sketches! https://www.profillic.com
  15. 2020.01.09 Slack group for implementing and discussing this paper: https://www.profillic.co
  16. 2020.01.08 Slack group for anyone interested in implementing and discussing this paper: htt
  17. 2020.01.08 Anyone interested in implementing or discussing this paper: https://www.profilli
  18. 2020.01.07 Hello, this is Hoseong Lee from Cognex Deep Learning Lab KR (SUALAB). The accepted papers for ICLR 2020, held in Ethiopia in late April 2
  19. 2020.01.02 Open-source chatbot project from Microsoft! https://www.profillic.com/paper/arx 1
  20. 2019.12.31 Animorphing into Mario (thanks stylegan2 lol) relevant stylegan2 paper: https:/
  21. 2019.12.30 Hello! The 25th video of the Deep Learning Paper Reading Group, which brings a fresh and fun paper every Friday night, the last YouTube video of 2019 has been uplo
  22. 2019.12.26 State of the art in deblurring and generating realistic high-resolution facial i
  23. 2019.12.23 GAN that can fix ambient lighting in low-quality flash-lit images! https://www.
  24. 2019.12.23 State of the art in image processing using GANs https://www.profillic.com/paper
  25. 2019.12.16 State of the art in estimating 3D human pose and shape! https://www.profillic.co
  26. 2019.12.16 Hello! The 214th paper of the TensorFlow-KR paper reading group #PR12 is FlowNet: Learnin, presented at ICCV 2015
  27. 2019.12.11 ICYMI: Beautify your face using a GAN-based architecture! https://www.profillic.com/paper/arxiv:1912.03630
  28. 2019.12.11 Generate 3D Avatars from a Single Image! Wide range of applications from virtual/augmented reality (VR/AR) and telepsychiatry to human-computer interaction and social networks. https://www.profillic.com/paper/arxiv:1912.03455
  29. 2019.12.05 Our paper “Keratinocytic Skin Cancer Detection on the Face using Region-based Convolutional Neural Network” was published on JAMA Dermatology. To my knowledge, the performance of cancer detection was compared with that of dermatologists for the firs..
  30. 2019.12.02 We just released our #NeurIPS2019 Multimodal Model-Agnostic Meta-Learning (MMAML) code for learning few-shot image classification, which extends MAML to multimodal task distributions (e.g. learning from multiple datasets). The code contains #PyTorch imp..

Separate a target speaker's speech from a mixture of two speakers

For project and code or API request: https://www.catalyzex.com/paper/arxiv:2005.07074

(FaceFilter: Audio-visual speech separation using still images)

Done using a deep audio-visual speech separation network. Unlike previous works that used lip movements from video clips or pre-enrolled speaker information as an auxiliary conditional feature, this method uses a single face image of the target speaker.
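For intuition only, here is a minimal PyTorch sketch of the general idea of conditioning a speech-separation mask network on an embedding of a single face image. The module names, sizes, and the simple GRU masking network are assumptions for illustration, not the FaceFilter architecture.

```python
# Minimal sketch of conditioning speech separation on a still face image
# (names and sizes are illustrative, not from the paper).
import torch
import torch.nn as nn

class FaceConditionedSeparator(nn.Module):
    def __init__(self, n_freq=257, face_dim=512, hidden=256):
        super().__init__()
        # Stand-in face encoder: in practice a pretrained face-recognition CNN.
        self.face_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(face_dim), nn.ReLU())
        self.rnn = nn.GRU(n_freq + face_dim, hidden, batch_first=True, bidirectional=True)
        self.mask_head = nn.Sequential(nn.Linear(2 * hidden, n_freq), nn.Sigmoid())

    def forward(self, mixture_spec, face_image):
        # mixture_spec: (batch, time, freq) magnitude spectrogram of the 2-speaker mixture
        # face_image:   (batch, 3, H, W) still image of the target speaker
        face_emb = self.face_encoder(face_image)                      # (batch, face_dim)
        face_emb = face_emb.unsqueeze(1).expand(-1, mixture_spec.size(1), -1)
        h, _ = self.rnn(torch.cat([mixture_spec, face_emb], dim=-1))
        mask = self.mask_head(h)                                      # per-T-F mask in [0, 1]
        return mask * mixture_spec                                    # estimated target spectrogram
```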

Posted by uniqueone
,

State of the art in lane detection!
For project and code or API request:
[https://www.catalyzex.com/paper/arxiv:2004.10924](https://www.catalyzex.com/paper/arxiv:2004.10924)

A novel method for lane detection that takes as input an image from a forward-looking camera mounted in the vehicle and outputs polynomials representing each lane marking in the image, via deep polynomial regression.
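As a rough illustration of "lanes as polynomials", here is a hedged PyTorch sketch in which a CNN backbone regresses, for each of a fixed number of candidate lanes, polynomial coefficients plus a confidence score. The backbone, lane count, and polynomial degree are assumptions, not the paper's exact configuration.

```python
# CNN backbone -> per-lane polynomial coefficients + confidence (illustrative sketch).
import torch
import torch.nn as nn
import torchvision

class PolyLaneRegressor(nn.Module):
    def __init__(self, max_lanes=5, poly_degree=3):
        super().__init__()
        backbone = torchvision.models.resnet18()
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # global-pooled features
        # For each lane: (degree + 1) polynomial coefficients + 1 confidence logit.
        self.head = nn.Linear(512, max_lanes * (poly_degree + 2))
        self.max_lanes, self.poly_degree = max_lanes, poly_degree

    def forward(self, image):
        # image: (batch, 3, H, W) from the forward-looking camera
        f = self.features(image).flatten(1)
        out = self.head(f).view(-1, self.max_lanes, self.poly_degree + 2)
        coeffs, conf = out[..., :-1], out[..., -1]   # per-lane polynomial + confidence
        return coeffs, conf

# A predicted lane is then evaluated as x = sum_k coeffs[k] * y**k over image rows y.
```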

Posted by uniqueone
,

Today's paper comes with an interesting application example. Lately I have mostly been reading recent ICLR and CVPR papers together with their real-world applications, and this case was quite fun as well.
[Application example: AR Cut & Paste]
If you watch the first video, at first glance it looks like the kind of ARTag demo we already had ten years ago: recognize a tag, load a pre-stored image, and sync it with a MacBook. In reality, however, there is no ARTag; the system computes saliency maps (regions of interest), segments those regions precisely, and sends the cut-out picture to the MacBook.
Code : https://github.com/cyrildiagne/ar-cutpaste/tree/clipboard
[U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection]
The technique that removes the background around the main object in the demo above (salient object detection -> segmentation) is built on a paper called U^2-Net. Unfortunately, the paper is still under review and has not been released yet; only the concept diagram is public.
The code, however, has been released in advance, and with it the gun, the text, and the person in the attached examples were found and cleanly separated. The authors say the paper will be released soon, so this needs to be re-checked later, but the published per-algorithm performance table suggests it (probably) achieves SOTA, as shown in the figure below. Across the six datasets compared, it shows the most dominant performance on all of them except PASCAL-S.
The network architecture is also public; from the figure alone, it looks like several U-Nets are assembled into one larger U-Net, and the resulting outputs are fused into the final prediction. (The U-Net paper is discussed at the bottom of this post.)
[BASNet: Boundary-Aware Salient Object Detection]
Looking at the same authors' earlier BASNet (CVPR '19) paper (ninth picture), a Predict Module first produces a coarse map and a Residual Refinement Module (RRM) then produces the refined map. The Predict Module borrows heavily from U-Net; it is based on ResNet-34 but modifies some of the residual blocks, and the RRM stage uses a deeper model to obtain higher-level refinement. For the RRM, the very well-known paper "Large Kernel Matters: Improve Semantic Segmentation by Global Convolutional Network" (CVPR '17) is worth consulting.
[Additional papers]
To understand the salient object segmentation (detection) field, it helps to first read two additional important papers. The field has been around for quite a while (I also wrote a related paper during my master's).
Fully Convolutional Networks for Semantic Segmentation (CVPR '15)
This paper set a new direction for segmentation by replacing the final fully connected layers of the network with convolutional layers and proposing a skip architecture; it has been cited more than 15,000 times. The replacement was proposed because spatial information and image size matter for segmentation, while FC layers lose location information and force a fixed input size. The concept of the receptive field is worth reviewing alongside it.
U-Net: Convolutional Networks for Biomedical Image Segmentation (MICCAI 2015)
Somewhat unusually, this paper appeared at a medical imaging conference. It is also important, with more than 13,800 citations. If you look at the very last figure, you will see why it was mentioned in the U^2-Net explanation. The reason for the distinctive U-shaped architecture is that the left half, the contracting path, down-samples the input image and captures context (VGG-based), while the right half, the expanding path, up-samples for precise localization. The feature maps from the contracting path, taken just before max-pooling, are cropped and concatenated to connect the information from both sides. The paper also made contributions such as data augmentation. A minimal code sketch of this skip-concatenation idea follows below.
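As a rough sketch of the contracting/expanding paths with skip concatenation described above (using padded convolutions, so the cropping step of the original paper is unnecessary; channel sizes are arbitrary):

```python
# Tiny U-Net-style sketch: the contracting path down-samples, the expanding path
# up-samples, and pre-pooling feature maps are concatenated across the "U".
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self, in_ch=3, out_ch=1):
        super().__init__()
        self.enc1, self.enc2 = conv_block(in_ch, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)   # 64 (upsampled) + 64 (skip)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)    # 32 (upsampled) + 32 (skip)
        self.head = nn.Conv2d(32, out_ch, 1)

    def forward(self, x):
        e1 = self.enc1(x)                 # skip 1 (before pooling)
        e2 = self.enc2(self.pool(e1))     # skip 2 (before pooling)
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)              # e.g. a saliency / segmentation map
```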
I have a lot of notes organized just for myself, but re-summarizing them for sharing takes longer than expected. It is almost time to head to work, so I will wrap up here and share a new paper again in a week or two.

Posted by uniqueone
,

https://www.facebook.com/111227746026144/posts/850062715475973/?sfnsn=mo

This week's AI Paper Club topic is Deepfakes. We'll cover the technical, philosophical, legal, political, and social perspectives on it. Tutorial and video out in a few days. Discussion on Discord is Sun, 2-4pm ET. All are welcome to listen in or join the discussion. Join us: https://discord.gg/lex-ai

Specific paper in focus is: Siarohin et al. First Order Motion Model for Image Animation. 2019.
Link: https://aliaksandrsiarohin.github.io/first-order-model-website/

Posted by uniqueone
,

From CVPR: Reconstruct photorealistic 3D faces from a single "in-the-wild" image with an increasing level of detail https://www.catalyzex.com/paper/arxiv:2003.13845

AvatarMe outperforms the existing arts by a significant margin and reconstructs authentic, 4K by 6K-resolution 3D faces from a single low-resolution image that, for the first time, bridges the uncanny valley.

Posted by uniqueone
,

https://www.facebook.com/groups/1738168866424224/permalink/2601094500131652/?sfnsn=mo

From CVPR '20: Robust 3D Self-portraits in Seconds

https://www.catalyzex.com/paper/arxiv:2004.02460

The results and experiments show that the proposed method achieves more robust and efficient 3D self-portraits compared with state-of-the-art methods.

Posted by uniqueone
,

Softmax Splatting for Video Frame Interpolation (AI Short Paper Summary)
Paper: https://arxiv.org/pdf/2003.05534.pdf
Github: https://github.com/sniklaus/softmax-splatting
Short Summary: https://www.marktechpost.com/2020/03/14/softmax-splatting-for-video-frame-interpolation/
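For intuition, here is a heavily simplified, nearest-neighbor PyTorch sketch of the softmax-splatting idea: pixels are forward-warped along the flow, and colliding contributions are weighted by a softmax over an importance metric. The official repository implements a bilinear CUDA version; this toy function is only an assumption-laden approximation of it.

```python
# Simplified softmax splatting (nearest-neighbor, pure PyTorch). Where several source
# pixels land on the same target location, their contributions are weighted by
# exp(importance), i.e. a softmax over the colliding pixels.
import torch

def softmax_splat_nearest(frame, flow, importance):
    # frame:      (B, C, H, W) source frame
    # flow:       (B, 2, H, W) forward flow (dx, dy) in pixels
    # importance: (B, 1, H, W) metric Z; larger values dominate collisions
    B, C, H, W = frame.shape
    ys, xs = torch.meshgrid(torch.arange(H, device=frame.device),
                            torch.arange(W, device=frame.device), indexing="ij")
    tx = (xs[None] + flow[:, 0]).round().long().clamp(0, W - 1)      # target columns
    ty = (ys[None] + flow[:, 1]).round().long().clamp(0, H - 1)      # target rows
    idx = (ty * W + tx).view(B, 1, -1)                               # flat target indices

    w = importance.exp().view(B, 1, -1)                              # softmax numerator weights
    num = torch.zeros(B, C, H * W, dtype=frame.dtype, device=frame.device)
    den = torch.zeros(B, 1, H * W, dtype=frame.dtype, device=frame.device)
    num.scatter_add_(2, idx.expand(-1, C, -1), frame.view(B, C, -1) * w)
    den.scatter_add_(2, idx, w)
    return (num / den.clamp_min(1e-8)).view(B, C, H, W)              # splatted frame
```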

Posted by uniqueone
,

Handwritten word generation with GANs!

https://www.profillic.com/paper/arxiv:2003.02567

The results significantly advance over prior art, and the authors demonstrate with qualitative, quantitative, and human-based evaluations the realism of the synthetically produced images.

Posted by uniqueone
,

Video from 1896 changed to 60fps and 4K!

The paper that was used to bring it to 60fps:

https://www.profillic.com/paper/arxiv:1904.00830

Gigapixel AI was used to bring it to 4K

Posted by uniqueone
,

State of the art in image inpainting (see example in the picture!)

https://www.profillic.com/paper/arxiv:2002.02609

The authors propose a dense multi-scale fusion network with a self-guided regression loss and a geometrical alignment constraint.

Posted by uniqueone
,

State of the art in image translation (guided)

https://www.profillic.com/paper/arxiv:2002.01048

Applications include facial expression generation, hand gesture translation, person image generation, cross-view image translation, etc.

(The proposed SelectionGAN explicitly utilizes the semantic guidance information and consists of two stages)

Posted by uniqueone
,

Latest from Microsoft researchers: ImageBERT (for image-text joint embedding)

https://www.profillic.com/paper/arxiv:2001.07966

(They achieve new state-of-the-art results on both MSCOCO and Flickr30k datasets.)

Posted by uniqueone
,

Latest from Stanford, Adobe and IIT researchers: State of the art in Virtual Try on!

https://www.profillic.com/paper/arxiv:2001.06265

(An efficient framework for this is composed of two stages: (1) warping (transforming) the try-on cloth to align with the pose and shape of the target model, and (2) texture transfer)

Posted by uniqueone
,

State of the art- Photoshop faces with hand sketches!

https://www.profillic.com/paper/arxiv:2001.02890

(The researchers propose Deep Plastic Surgery, a novel sketch-based image editing framework to achieve both robustness on hand-drawn sketch inputs and the controllability on sketch faithfulness)

Posted by uniqueone
,

Slack group for implementing and discussing this paper: https://www.profillic.com/paper/arxiv:1911.08139 "MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets"

Here's your invite link to join the group: https://join.slack.com/t/machinelearningwiki/shared_invite/enQtNTM5NTIwNTk5OTIwLWFmNjY5MjI3YzY4ZjNmMmEwNjk4MDFjMGE2MzE1NTM1ZDBjZTU0YjlhY2Y1ODQyYjQyNWZmODhlNDZmOWU3MGQ

Once you've joined, here's the channel link for this paper: http://machinelearningwiki.slack.com/#marionette

Posted by uniqueone
,

Slack group for anyone interested in implementing and discussing this paper: https://www.profillic.com/paper/arxiv:1912.03455

"Digital Twin: Acquiring High-Fidelity 3D Avatar from a Single Image"

Here's your invite link to join the group:

https://join.slack.com/t/machinelearningwiki/shared_invite/enQtNTM5NTIwNTk5OTIwLWFmNjY5MjI3YzY4ZjNmMmEwNjk4MDFjMGE2MzE1NTM1ZDBjZTU0YjlhY2Y1ODQyYjQyNWZmODhlNDZmOWU3MGQ

Once you've joined, here's the channel link for this paper: http://machinelearningwiki.slack.com/#digitaltwin

Posted by uniqueone
,

Anyone interested in implementing or discussing this paper: https://www.profillic.com/paper/arxiv:1910.08761

"Component Attention Guided Face Super-Resolution Network: CAGFace"

Here's your invite link to join the group:

https://join.slack.com/t/machinelearningwiki/shared_invite/enQtNTM5NTIwNTk5OTIwLWFmNjY5MjI3YzY4ZjNmMmEwNjk4MDFjMGE2MzE1NTM1ZDBjZTU0YjlhY2Y1ODQyYjQyNWZmODhlNDZmOWU3MGQ

Once you've joined, here's the channel link for this paper:

http://machinelearningwiki.slack.com/#cagface

Posted by uniqueone
,

Hello, this is Hoseong Lee from Cognex Deep Learning Lab KR (SUALAB).

The accepted papers for ICLR 2020, which will be held in Ethiopia in late April 2020, have been announced, so I quickly picked out 21 papers related to my main interest, image recognition, read each of them, and wrote up a summary post.

https://hoya012.github.io/blog/iclr2020-paper-preview/

I wanted to keep it as concise as possible, but adding a line here and there for clearer explanations made the post somewhat long..!

If you want to read recent papers for the new year but are wondering which ones to pick, come by and browse!

Posted by uniqueone
,

Open-source chatbot project from Microsoft!

https://www.profillic.com/paper/arxiv:1912.08904

Macaw: A conversational bot that enables research for tasks such as document retrieval, question answering, recommendation, and structured data exploration

Posted by uniqueone
,

Animorphing into Mario (thanks stylegan2 lol)

relevant stylegan2 paper: https://www.profillic.com/paper/arxiv:1912.04958

Posted by uniqueone
,

Hello! Coming to you every Friday night with a fresh and fun paper..

The 25th video of the Deep Learning Paper Reading Group, our last YouTube video of 2019, has been uploaded, so I am sharing it!!!!!

This week's paper: as research on GANs has become very active lately, they are being used in many domains, but various constraints arise when they are applied in practice. One of the biggest issues is the lack of data.

To tackle this problem, we reviewed 'SinGAN', a paper that succeeds at generation using only a single natural image!

Jinho Ko from the image processing team gave a kind and detailed presentation, completed together with the group members!

Informative paper reviews will continue every week, so we would appreciate your interest. Link ->

[https://youtu.be/pgYIuA4O95E](https://www.youtube.com/watch?v=pgYIuA4O95E)

Posted by uniqueone
,

State of the art in deblurring and generating realistic high-resolution facial images

https://www.profillic.com/paper/arxiv:1912.10427

(An adversarial network comprising a generator and two discriminators is proposed)

Posted by uniqueone
,

GAN that can fix ambient lighting in low-quality flash-lit images!

https://www.profillic.com/paper/arxiv:1912.08813

Posted by uniqueone
,

State of the art in image processing using GANs

https://www.profillic.com/paper/arxiv:1912.07116

Posted by uniqueone
,

https://www.facebook.com/groups/1738168866424224/permalink/2482023208705449/?sfnsn=mo
State of the art in estimating 3D human pose and shape! https://www.profillic.com/paper/arxiv:1912.05656

Posted by uniqueone
,

Hello! The 214th paper of the TensorFlow-KR paper reading group #PR12 is FlowNet: Learning Optical Flow with Convolutional Networks, presented at ICCV 2015. Optical flow is a map that gives, for every position, the vector describing how far each pixel moved from the first frame to the second frame of two adjacent video frames. Since analyzing motion in video is very important, optical flow is one of its key ingredients. In this video I cover the various optical flow algorithms used in classical computer vision as well as FlowNet, a neural network that estimates optical flow with deep learning. Thank you!! :)

Youtube link: [https://www.youtube.com/watch?v=Z_t0shK98pM](https://www.youtube.com/watch?v=Z_t0shK98pM)

Paper link : http://openaccess.thecvf.com/content_iccv_2015/html/Dosovitskiy_FlowNet_Learning_Optical_ICCV_2015_paper.html

Slide link : https://www.slideshare.net/HyeongminLee3/pr213-flownet-learning-optical-flow-with-convolutional-networks
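Not FlowNet itself, but as a concrete example of the classical dense optical flow the talk contrasts it with, here is a short OpenCV Farnebäck snippet that produces exactly the kind of per-pixel displacement map described above (file names are placeholders):

```python
# Classical dense optical flow (Farnebäck) with OpenCV, as a point of comparison for
# FlowNet: the result is a per-pixel (dx, dy) vector map between two adjacent frames.
import cv2

prev = cv2.cvtColor(cv2.imread("frame_000.png"), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread("frame_001.png"), cv2.COLOR_BGR2GRAY)

# flow has shape (H, W, 2): horizontal and vertical displacement for every pixel.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
print(flow.shape, flow[0, 0])   # e.g. (H, W, 2) and the motion vector of pixel (0, 0)
```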

Posted by uniqueone
,
ICYMI: Beautify your face using a GAN-based architecture!

https://www.profillic.com/paper/arxiv:1912.03630
https://www.facebook.com/groups/DeepNetGroup/permalink/1028034027589479/?sfnsn=mo
Posted by uniqueone
,
Generate 3D Avatars from a Single Image! Wide range of applications from virtual/augmented reality (VR/AR) and telepsychiatry to human-computer interaction and social networks.

https://www.profillic.com/paper/arxiv:1912.03455
https://www.facebook.com/groups/TensorFlowKR/permalink/1059469937727397/?sfnsn=mo
Posted by uniqueone
,
Our paper “Keratinocytic Skin Cancer Detection on the Face using Region-based Convolutional Neural Network” was published in JAMA Dermatology. To my knowledge, this is the first time in dermatology that cancer-detection performance has been compared with that of dermatologists. Because most previous studies were classification studies, preselection by the end user was essential. In addition, there were numerous false positives because the training data set did not include enough common disorders and normal structures.
With the assistance of R-CNN, we trained neural networks on 1,106,886 image crops to localize and diagnose malignancy. The algorithm detects suspected lesions, shows a malignancy score, and predicts possible diagnoses (178 disease classes).
We used a region-based CNN (Faster R-CNN; backbone = VGG-16) as a region proposal module, a CNN (SE-ResNet-50) to choose adequate lesions, and a CNN (SE-ResNeXt-50 + SENet) to determine malignancy. We chose a multi-step approach to reduce the dimensionality of the problem (object detection -> classification).
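To make the multi-step structure concrete, here is a schematic PyTorch sketch of the detect-then-classify pipeline described above. The torchvision Faster R-CNN, the plain ResNet-50 classifiers, the 224x224 crop size, the 0.5 threshold, and the use of class index 1 as the positive class are all stand-ins for illustration, not the models or settings used in the paper.

```python
# Schematic of the multi-step approach: a detector proposes candidate lesions,
# one classifier filters adequate crops, another scores malignancy. These models
# stand in for the paper's Faster R-CNN (VGG-16), SE-ResNet-50 and SE-ResNeXt-50 + SENet.
import torch
import torchvision

detector = torchvision.models.detection.fasterrcnn_resnet50_fpn().eval()
lesion_filter = torchvision.models.resnet50().eval()      # "adequate lesion?" stage
malignancy_net = torchvision.models.resnet50().eval()     # malignancy scoring stage

def analyze(image):                       # image: (3, H, W) float tensor in [0, 1]
    with torch.no_grad():
        boxes = detector([image])[0]["boxes"]                      # stage 1: region proposals
        results = []
        for x1, y1, x2, y2 in boxes.round().long().tolist():
            crop = image[:, y1:y2, x1:x2]
            if crop.numel() == 0:
                continue
            crop = torch.nn.functional.interpolate(crop[None], size=(224, 224))
            # Class index 1 is used as a placeholder for the "positive" class here.
            if lesion_filter(crop).softmax(-1)[0, 1] < 0.5:         # stage 2: keep adequate lesions
                continue
            score = malignancy_net(crop).softmax(-1)[0, 1]          # stage 3: malignancy score
            results.append(((x1, y1, x2, y2), float(score)))
    return results
```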
The AUC for the validation dataset (2,844 images from 673 patients comprising 185 malignant, 305 benign, and 183 normal conditions) was 0.910. The algorithm’s F1 score and Youden index (sensitivity + specificity - 100%) were comparable with those of 13 dermatologists, while surpassing those of 20 non-dermatologists (325 images from 80 patients comprising 40 malignant, 20 benign, and 20 normal). We are performing additional work with a large-scale external validation data set. The pilot result is similar to this report, so I hope to submit it soon.
A web DEMO (https://rcnn.modelderm.com) of the model is accessible via smartphone or PC, to facilitate scientific communication. Sorry for the slowness of the DEMO; it runs on my personal computer despite multi-threading and parallel processing with one 2080 and one 1070.
Thank you.
Paper : https://jamanetwork.com/journals/jamadermatology/article-abstract/2756346
Screenshot : https://i.imgur.com/2TCkdHf.png
Screenshot : https://i.imgur.com/IEZLfOg.jpg
DEMO : https://rcnn.modelderm.com
https://m.facebook.com/groups/107107546348803?view=permalink&id=1021762028216679&sfnsn=mo
Posted by uniqueone
,
We just released our #NeurIPS2019 Multimodal Model-Agnostic Meta-Learning (MMAML) code for learning few-shot image classification, which extends MAML to multimodal task distributions (e.g. learning from multiple datasets). The code contains #PyTorch implementations of our model and two baselines (MAML and Multi-MAML) as well as the scripts to evaluate these models on five popular few-shot learning datasets: Omniglot, Mini-ImageNet, FC100 (CIFAR100), CUB-200-2011, and FGVC-Aircraft.

Code: https://github.com/shaohua0116/MMAML-Classification

Paper: https://arxiv.org/abs/1910.13616

#NeurIPS #MachineLearning #ML #code
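For readers unfamiliar with MAML, here is a minimal sketch of the plain MAML inner/outer loop that MMAML extends; the task-mode modulation network that makes it multimodal is omitted, and the tiny classifier, learning rates, and single inner step are arbitrary choices, not the released code.

```python
# Minimal MAML loop: adapt a copy of the parameters on a task's support set (inner loop),
# then update the shared initialization from the query-set loss (outer loop).
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 5))   # 5-way classifier
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr = 0.4

def adapted_forward(x, params):
    # Functional forward pass through the two linear layers with the given parameters.
    h = F.relu(F.linear(x, params[0], params[1]))
    return F.linear(h, params[2], params[3])

def meta_step(tasks):
    # tasks: list of (support_x, support_y, query_x, query_y) tensors
    meta_opt.zero_grad()
    for sx, sy, qx, qy in tasks:
        params = list(model.parameters())
        # Inner loop: one gradient step on the support set (create_graph for second-order MAML).
        loss = F.cross_entropy(adapted_forward(sx, params), sy)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(params, grads)]
        # Outer loop: query loss of the adapted parameters, backpropagated into the initialization.
        F.cross_entropy(adapted_forward(qx, fast), qy).backward()
    meta_opt.step()
```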
Posted by uniqueone
,