Research

Some Papers Cited in My Research

  • Chen, T., Du, Z., Sun, N., Wang, J., Wu, C., Chen, Y., & Temam, O. (2014). DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. SIGARCH Comput. Archit. News, 42(1), 269-284. DOI:10.1145/2654822.2541967
  • Chen, Y. H., Krishna, T., Emer, J. S., & Sze, V. (2017). Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks. IEEE Journal of Solid-State Circuits, 52(1), 367-379. DOI:10.1109/JSSC.2016.2616357
  • Chen, Y.-H., Emer, J., & Sze, V. (2016). Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks. SIGARCH Comput. Archit. News, 44(3), 367-379. DOI:10.1145/3007787.3001177
  • Dettmers, T. (2015). 8-Bit Approximations for Parallelism in Deep Learning. International Conference on Learning Representations. Retrieved from http://arxiv.org/abs/1511.04561
  • Farabet, C., Martini, B., Akselrod, P., Talay, S., LeCun, Y., & Culurciello, E. (2010, May). Hardware Accelerated Convolutional Neural Networks for Synthetic Vision Systems. Paper presented at the 2010 IEEE International Symposium on Circuits and Systems (ISCAS).
  • Fu, Y., Wu, E., Sirasao, A., Attia, S., Khan, K., & Wittig, R. (2016). Deep Learning with INT8 Optimization on Xilinx Devices (White Paper WP486, v1.0.1), 1-11. Retrieved from www.xilinx.com
  • Fürer, M. (2007). Faster integer multiplication. Paper presented at the 39th Annual ACM Symposium on Theory of Computing (STOC '07), San Diego, California, USA.
  • Gangadharan, S., & Churiwala, S. (2013). Constraining designs for synthesis and timing analysis: A practical guide to Synopsys design constraints (SDC). Springer.
  • Garland, J., & Gregg, D. (2017). Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks. IEEE Computer Architecture Letters, 16(2), 132-135. DOI:10.1109/LCA.2017.2656880
  • Gupta, S., Agrawal, A., Gopalakrishnan, K., & Narayanan, P. (2015). Deep Learning with Limited Numerical Precision. Retrieved from http://arxiv.org/abs/1502.02551
  • Han, S., Liu, X., Mao, H., Pu, J., Pedram, A., Horowitz, M. A., & Dally, W. J. (2016). EIE: Efficient Inference Engine on Compressed Deep Neural Network. Proceedings – 2016 43rd International Symposium on Computer Architecture, ISCA 2016, 243-254. DOI:10.1109/ISCA.2016.30
  • Han, S., Mao, H., & Dally, W. J. (2016). Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding. CoRR, abs/1510.00149.
  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. DOI:10.1109/CVPR.2016.90
  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems, 1-9.
  • LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1(4), 541-551. DOI:10.1162/neco.1989.1.4.541
  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. DOI:10.1109/5.726791
  • Ma, Y., Cao, Y., Vrudhula, S., & Seo, J.-s. (2017). Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks. Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays – FPGA ’17, 45-54. DOI:10.1145/3020078.3021736
  • Sabeetha, S., Ajayan, J., Shriram, S., Vivek, K., & Rajesh, V. (2015, February 26-27). A study of performance comparison of digital multipliers using 22nm strained silicon technology. Paper presented at the 2015 2nd International Conference on Electronics and Communication Systems (ICECS).
  • Seide, F., Fu, H., Droppo, J., Li, G., & Yu, D. (2014, September). 1-Bit Stochastic Gradient Descent and Its Application to Data-Parallel Distributed Training of Speech DNNs. Paper presented at Interspeech 2014.
  • Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv e-prints. Retrieved from http://arxiv.org/abs/1409.1556
  • Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., . . . Rabinovich, A. (2015, June). Going deeper with convolutions. Paper presented at the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Zhang, C., Li, P., Sun, G., Guan, Y., Xiao, B., & Cong, J. (2015). Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks. Paper presented at the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, California, USA. Retrieved from https://dl.acm.org/citation.cfm?doid=2684746.2689060

Other Interesting / Useful Papers

  • H. Lee, R. Grosse, R. Ranganath, and A. Ng, "Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations," in Proc. 26th Int. Conf. Mach. Learn., Montreal, Canada, 2009, pp. 609-616. DOI: 10.1145/1553374.1553453
  • V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with Deep Reinforcement Learning," in NIPS Deep Learning Workshop, 2013.
  • L. Orseau, S. Armstrong, “Safely Interruptible Agents,” in Proc. 32nd Conf. on Uncertainty in Artificial Intelligence, Jun. 2016.
  • Y. Yang, Y. Li, C. Fermüller, and Y. Aloimonos, "Robot Learning Manipulation Action Plans by 'Watching' Unconstrained Videos from the World Wide Web," in Proc. 29th AAAI Conf. on Artificial Intelligence, 2015, pp. 3686-3692.
  • S. I. Venieris, A. Kouris, and C.-S. Bouganis, "Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions," ACM Comput. Surv., vol. 51, no. 3, Article 56, Jun. 2018, 39 pages. DOI: 10.1145/3186332
  • B. Baker, I. Kanitscheider, T. Markov, Y. Wu, G. Powell, B. McGrew, and I. Mordatch, "Emergent Tool Use From Multi-Agent Autocurricula," arXiv preprint arXiv:1909.07528, 2019.