Supercharge your Computer Vision models with the TensorFlow Object Detection API

2017-06-16 | admin | GoogleDevFeeds

Posted by Jonathan Huang, Research Scientist and Vivek Rathod, Software Engineer

(Cross-posted on the Google Open Source Blog)

At Google, we develop flexible state-of-the-art machine learning (ML) systems for computer vision that not only can be used to improve our products and services, but also spur progress in the research community. Creating accurate ML models capable of localizing and identifying multiple objects in a single image remains a core challenge in the field, and we invest a significant amount of time training and experimenting with these systems.

Detected objects in a sample image (from the COCO dataset) made by one of our models. Image credit: Michael Miley, original image.

Last October, our in-house object detection system achieved new state-of-the-art results, and placed first in the COCO detection challenge. Since then, this system has generated results for a number of research publications [1-7] and has been put to work in Google products such as NestCam, the similar items and style ideas feature in Image Search and street number and name detection in Street View.

Today we are happy to make this system available to the broader research community via the TensorFlow Object Detection API. This codebase is an open-source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models. Our goal in designing this system was to support state-of-the-art models while allowing for rapid exploration and research. Our first release contains the following:

  • A selection of trainable detection models, including:
    • Single Shot Multibox Detector (SSD) with MobileNets
    • SSD with Inception V2
    • Region-Based Fully Convolutional Networks (R-FCN) with Resnet 101
    • Faster RCNN with Resnet 101
    • Faster RCNN with Inception Resnet v2
  • Frozen weights (trained on the COCO dataset) for each of the above models to be used for out-of-the-box inference purposes.
  • A Jupyter notebook for performing out-of-the-box inference with one of our released models
  • Convenient local training scripts as well as distributed training and evaluation pipelines via Google Cloud
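
To give a feel for what out-of-the-box inference output looks like downstream, here is a minimal sketch of post-processing one image's detections. It assumes the model returns parallel lists of boxes, scores and class ids, with boxes in normalized [ymin, xmin, ymax, xmax] form. The function name `filter_detections`, the `Detection` type, the 0.5 threshold and the tiny label map are all illustrative assumptions, not part of the released API.

```python
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Detection(NamedTuple):
    """One kept detection (hypothetical helper type, not from the API)."""
    label: str
    score: float
    box_pixels: Tuple[int, int, int, int]  # (left, top, right, bottom)


def filter_detections(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    classes: Sequence[int],
    label_map: Dict[int, str],
    image_width: int,
    image_height: int,
    min_score: float = 0.5,  # assumed threshold, tune per application
) -> List[Detection]:
    """Keep detections scoring at least min_score and convert boxes to pixels.

    Assumes each box is [ymin, xmin, ymax, xmax] with values in [0, 1].
    """
    kept = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        kept.append(Detection(
            label=label_map.get(int(class_id), "unknown"),
            score=float(score),
            box_pixels=(
                round(xmin * image_width),
                round(ymin * image_height),
                round(xmax * image_width),
                round(ymax * image_height),
            ),
        ))
    return kept


# Made-up example values for a 640x480 image.
example = filter_detections(
    boxes=[[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]],
    scores=[0.9, 0.3],
    classes=[18, 1],
    label_map={1: "person", 18: "dog"},
    image_width=640,
    image_height=480,
)
for det in example:
    print(det.label, det.score, det.box_pixels)
```

Only the first box survives the threshold in this example. A lower `min_score` keeps more candidate boxes at the cost of more false positives.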

The SSD models that use MobileNet are lightweight, so they can comfortably run in real time on mobile devices. Our winning COCO submission in 2016 used an ensemble of the Faster RCNN models, which are more computationally intensive but significantly more accurate. For more details on the performance of these models, see our CVPR 2017 paper.

Are you ready to get started?
We’ve certainly found this code to be useful for our computer vision needs, and we hope that you will as well. Contributions to the codebase are welcome, and please stay tuned for our own further updates to the framework. To get started, download the code here and try detecting objects in some of your own images using the Jupyter notebook, or training your own pet detector on Cloud ML Engine!

Acknowledgements
The release of the TensorFlow Object Detection API and the pre-trained model zoo has been the result of widespread collaboration among Google researchers with feedback and testing from product groups. In particular we want to highlight the contributions of the following individuals:

Core Contributors: Derek Chow, Chen Sun, Menglong Zhu, Matthew Tang, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, Jasper Uijlings, Viacheslav Kovalevskyi, Kevin Murphy

Also special thanks to: Andrew Howard, Rahul Sukthankar, Vittorio Ferrari, Tom Duerig, Chuck Rosenberg, Hartwig Adam, Jing Jing Long, Victor Gomes, George Papandreou, Tyler Zhu

References

  1. Speed/accuracy trade-offs for modern convolutional object detectors, Huang et al., CVPR 2017 (paper describing this framework)
  2. Towards Accurate Multi-person Pose Estimation in the Wild, Papandreou et al., CVPR 2017
  3. YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video, Real et al., CVPR 2017 (see also our blog post)
  4. Beyond Skip Connections: Top-Down Modulation for Object Detection, Shrivastava et al., arXiv preprint arXiv:1612.06851, 2016
  5. Spatially Adaptive Computation Time for Residual Networks, Figurnov et al., CVPR 2017
  6. AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions, Gu et al., arXiv preprint arXiv:1705.08421, 2017
  7. MobileNets: Efficient convolutional neural networks for mobile vision applications, Howard et al., arXiv preprint arXiv:1704.04861, 2017


