Building a Deep Learning Salient Image Cropping System in Python – 何明洋 (PyCon Taiwan 2021)


This talk uses three models: salient object detection (Zhao, T., & Wu, X. (2019). Pyramid feature attention network for saliency detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 3085-3094)), face recognition, and Chinese OCR with PSENet (Wang, W., Xie, E., Li, X., Hou, W., Lu, T., Yu, G., & Shao, S. (2019). Shape robust text detection with progressive scale expansion network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9336-9345)). All of these models are already trained and their weights are provided, so users only need to clone them from GitHub or install them from PyPI. No time is spent retraining, and the whole pipeline can be assembled quickly.

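The talk first points out that many social media platforms need to crop images, but it is not feasible to have a person crop every image a user uploads, so an automatic cropping system would help. However, using a salient object detection model alone (previous work) produces results that miss much of the text and focus on odd objects. Face and text detection are therefore added: the outputs of the three models are summed with different weights, and dynamic programming is then used to find the rectangular region with the largest summed weight, which gives the final crop. The talk also covers how to tune the three weights for different scenarios.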
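
The weighted overlay and region search described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the speaker's implementation: the function names `combine_maps` and `best_crop`, the default weights, and the fixed crop size are all assumptions, and the dynamic programming step is shown here as a summed-area table (one common way to score every candidate rectangle cheaply). Each input is assumed to be a 2D array of per-pixel scores with the same shape as the image.

```python
import numpy as np


def combine_maps(saliency, face, text, w_saliency=1.0, w_face=1.0, w_text=1.0):
    """Overlay the three per-pixel score maps with one weight each.

    The weights are hypothetical defaults; the talk discusses tuning them
    per scenario but the abstract gives no concrete values.
    """
    return w_saliency * saliency + w_face * face + w_text * text


def best_crop(score, crop_h, crop_w):
    """Return (top, left, total) of the crop_h x crop_w window with the
    largest summed score, using a summed-area table (assumed DP variant)."""
    h, w = score.shape
    if not (1 <= crop_h <= h and 1 <= crop_w <= w):
        raise ValueError("crop size must fit inside the score map")

    # sat[i, j] holds the sum of score[:i, :j]; the zero row and column
    # of padding remove special cases at the image border.
    sat = np.zeros((h + 1, w + 1), dtype=np.float64)
    sat[1:, 1:] = score.cumsum(axis=0).cumsum(axis=1)

    # Sum of every window at once: four table lookups per window.
    sums = (
        sat[crop_h:, crop_w:]
        - sat[:-crop_h, crop_w:]
        - sat[crop_h:, :-crop_w]
        + sat[:-crop_h, :-crop_w]
    )
    top, left = np.unravel_index(np.argmax(sums), sums.shape)
    return int(top), int(left), float(sums[top, left])
```

A fixed crop size is assumed because, with non-negative scores, an unconstrained search would always select the whole image. Raising `w_face` or `w_text` relative to `w_saliency` pulls the chosen window toward faces or text, which is the kind of per-scenario adjustment the talk refers to.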


Slides not uploaded by the speaker.

Speaker: 何明洋

A passionate data scientist and full stack developer who excels at solving practical problems, especially in 2D/3D CV, audio, and medical signals, by designing ML/DL algorithms and building full-stack web applications to provide services. In addition, I am also a graphic designer and clinical pharmacist familiar with psychiatry.


