2020

  1. A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matrix–Matrix Multiplication Accelerator Park, Dong-Hyeon, Pal, Subhankar, Feng, Siying, Gao, Paul, Tan, Jielun, Rovinski, Austin, Xie, Shaolin, Zhao, Chun, Amarnath, Aporva, Wesley, Timothy, Beaumont, Jonathan, Chen, Kuan-Yu, Chakrabarti, Chaitali, Taylor, Michael Bedford, Mudge, Trevor, Blaauw, David, Kim, Hun-Seok, and Dreslinski, Ronald G. IEEE Journal of Solid-State Circuits 2020

2019

  1. Evaluating Celerity: A 16-nm 695 Giga-RISC-V Instructions/s Manycore Processor With Synthesizable PLL Rovinski, Austin, Zhao, Chun, Al-Hawaj, Khalid, Gao, Paul, Xie, Shaolin, Torng, Christopher, Davidson, Scott, Amarnath, Aporva, Vega, Luis, Veluri, Bandhav, Rao, Anuj, Ajayi, Tutu, Puscar, Julian, Dai, Steve, Zhao, Ritchie, Richmond, Dustin, Zhang, Zhiru, Galton, Ian, Batten, Christopher, Taylor, Michael B., and Dreslinski, Ronald G. IEEE Solid-State Circuits Letters 2019
  2. A 1.4 GHz 695 Giga RISC-V Inst/s 496-Core Manycore Processor With Mesh On-Chip Network and an All-Digital Synthesized PLL in 16nm CMOS Rovinski, Austin, Zhao, Chun, Al-Hawaj, Khalid, Gao, Paul, Xie, Shaolin, Torng, Christopher, Davidson, Scott, Amarnath, Aporva, Vega, Luis, Veluri, Bandhav, Rao, Anuj, Ajayi, Tutu, Puscar, Julian, Dai, Steve, Zhao, Ritchie, Richmond, Dustin, Zhang, Zhiru, Galton, Ian, Batten, Christopher, Taylor, Michael B., and Dreslinski, Ronald G. In 2019 Symposium on VLSI Circuits 2019
  3. A 7.3 M Output Non-Zeros/J Sparse Matrix-Matrix Multiplication Accelerator using Memory Reconfiguration in 40 nm Pal, Subhankar, Park, Dong-Hyeon, Feng, Siying, Gao, Paul, Tan, Jielun, Rovinski, Austin, Xie, Shaolin, Zhao, Chun, Amarnath, Aporva, Wesley, Timothy, Beaumont, Jonathan, Chen, Kuan-Yu, Chakrabarti, Chaitali, Taylor, Michael, Mudge, Trevor, Blaauw, David, Kim, Hun-Seok, and Dreslinski, Ronald In 2019 Symposium on VLSI Technology 2019

2018

  1. Progress in a novel architecture for high performance processing Zhang, Zhiwei, Liu, Meng, Liu, Zijun, Du, Xueliang, Xie, Shaolin, Ma, Hong, Ding, Guangxin, Ren, Weili, Zhou, Fabiao, Sun, Wenqin, and others, Japanese Journal of Applied Physics 2018
  2. Fast and efficient deep sparse multi-strength spiking neural networks with dynamic pruning Chen, Ruizhi, Ma, Hong, Xie, Shaolin, Guo, Peng, Li, Pin, and Wang, Donglin In 2018 International Joint Conference on Neural Networks (IJCNN) 2018
  3. FBNA: A Fully Binarized Neural Network Accelerator Guo, Peng, Ma, Hong, Chen, Ruizhi, Li, Pin, Xie, Shaolin, and Wang, Donglin In 2018 28th International Conference on Field Programmable Logic and Applications (FPL) 2018
  4. The BaseJump Manycore Accelerator Network Xie, Shaolin, and Taylor, Michael Bedford arXiv preprint arXiv:1808.00650 2018
  5. Low Latency Spiking ConvNets with Restricted Output Training and False Spike Inhibition Chen, Ruizhi, Ma, Hong, Guo, Peng, Xie, Shaolin, Li, Pin, and Wang, Donglin In 2018 International Joint Conference on Neural Networks (IJCNN) 2018
  6. Extreme Datacenter Specialization for Planet-Scale Computing: ASIC Clouds Xie, Shaolin, Davidson, Scott, Magaki, Ikuo, Khazraee, Moein, Vega, Luis, Zhang, Lu, and Taylor, Michael B. ACM SIGOPS Operating Systems Review 2018
  7. Parallel Polar Encoding in 5G Communication Guo, Yang, Xie, Shaolin, Liu, Zijun, Yang, Lei, and Wang, Donglin In 2018 IEEE Symposium on Computers and Communications (ISCC) 2018
  8. Parallel filtering method and corresponding apparatus Wang, Donglin, Yin, Leizu, Yang, Yongyong, Xie, Shaolin, and Wang, Tao 2018
  9. The Celerity open-source 511-core RISC-V tiered accelerator fabric: Fast architectures and design methodologies for fast chips Davidson, Scott, Xie, Shaolin, Torng, Christopher, Al-Hawaj, Khalid, Rovinski, Austin, Ajayi, Tutu, Vega, Luis, Zhao, Chun, Zhao, Ritchie, Dai, Steve, and others, IEEE Micro 2018

2017

  1. Celerity: An open source RISC-V tiered accelerator fabric Ajayi, Tutu, Al-Hawaj, Khalid, Amarnath, Aporva, Dai, Steve, Davidson, Scott, Gao, Paul, Liu, Gai, Lotfi, Atieh, Puscar, Julian, Rao, Anuj, and others, In Hot Chips: A Symposium on High Performance Chips 2017
  2. Experiences Using the RISC-V Ecosystem to Design an Accelerator-Centric SoC in TSMC 16nm Ajayi, Tutu, Al-Hawaj, Khalid, Amarnath, Aporva, Dai, Steve, Davidson, Scott, Gao, Paul, Liu, Gai, Lotfi, Atieh, Puscar, Julian, Rao, Anuj, and others, 2017
  3. A self-indexed register file for efficient arithmetical computing hardware Yang, Lei, Xie, Shaolin, Liu, Zijun, Du, Xueliang, and Wang, Donglin In 2017 9th Computer Science and Electronic Engineering (CEEC) 2017
  4. A reconfigurable ASIC-like image polyphase interpolation implementation method Yang, Lei, Guo, Ruoshan, Xie, Shaolin, and Wang, Donglin In 2017 7th IEEE International Conference on Electronics Information and Emergency Communication (ICEIEC) 2017

2016

  1. MaPU: A novel mathematical computing architecture Wang, Donglin, Xie, Shaolin, and others, In 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA) 2016
  2. Methods and devices for multi-granularity parallel FFT butterfly computation Wang, Donglin, Wang, Tao, Xie, Shaolin, Hao, Jie, and Yin, Leizu 2016
  3. Data access method and device for parallel FFT computation Xie, Shaolin, Wang, Donglin, Lin, Xiao, Hao, Jie, Xue, Xiaojun, Wang, Tao, and Yin, Leizu 2016
  4. Parallel bit reversal devices and methods Xie, Shaolin, Wang, Donglin, Hao, Jie, Wang, Tao, and Yin, Leizu 2016

2015

  1. Multi-granularity parallel storage system Wang, Donglin, Liu, Zijun, Xue, Xiaojun, Zhang, Xing, Zhang, Zhiwei, and Xie, Shaolin 2015
  2. Multi-granularity parallel storage system and storage Wang, Donglin, Xie, Shaolin, Xue, Xiaojun, Liu, Zijun, and Zhang, Zhiwei 2015