Table 2 - uploaded by Trevor Spiteri
Bit rate and PSNR obtained for the pedestrian area sequence using different search algorithms with QP = 26, Lagrangian optimization, and different search ranges. 

Source publication
Conference Paper
This paper presents a reconfigurable processor and an associated toolset able to generate high-performance and fast motion estimation cores for high-definition video coding applications. The presented tools include a compiler, a cycle-accurate model and a design space exploration framework. With the help of these tools the designer can generate a p...

Context in source publication

Context 1
... results were obtained when using other search algorithms, and when coding different video sequences. Table 2 compares the results obtained when using 5 different motion estimation search algorithms: exhaustive (full) search, UMH search [5], hexagonal search [4], diamond search, and the search algorithm employed by the Xilinx motion estimation engine [9]. In HD sequences, a relatively small movement by an object translates to motion by a large number of pixels because of the high resolution, so performance suffers considerably when the search range is limited. ...
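The effect of a limited search range described in the context above can be illustrated with a minimal sketch of exhaustive (full) search block matching. This is not the paper's implementation: the function names (`sad`, `full_search`, `make_frame`), the synthetic 32x16 frame, the 4x4 block size, and the use of a plain sum of absolute differences (SAD) as the matching cost are all assumptions made for illustration. The Lagrangian rate term mentioned in the table caption is not modelled here.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive search: test every displacement within +/- search_range
    and return (best_sad, (dx, dy))."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


def make_frame(width, height, obj_x, obj_y, size):
    """Synthetic frame (hypothetical test data): black background with one
    textured square object whose top-left corner is at (obj_x, obj_y)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(size):
        for x in range(size):
            frame[obj_y + y][obj_x + x] = 50 + 10 * y + x
    return frame


ref = make_frame(32, 16, 2, 6, 4)    # object at x = 2 in the reference frame
cur = make_frame(32, 16, 14, 6, 4)   # object has moved 12 pixels to the right

# A +/-4 search range cannot reach the true displacement of -12 pixels.
small = full_search(cur, ref, 14, 6, 4, search_range=4)
# A +/-16 search range covers it and finds an exact match.
large = full_search(cur, ref, 14, 6, 4, search_range=16)

print("range 4: ", small)
print("range 16:", large)
```

In this toy case the wide search finds the displacement (-12, 0) with a SAD of 0, while the narrow search can only return a candidate with a nonzero residual, which a real encoder would have to spend bits on. Fast algorithms such as diamond or hexagonal search test far fewer candidates than the double loop above, at the risk of stopping at a local minimum.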

Citations

Article
The H.264/AVC deblocking filter is becoming the performance bottleneck of H.264/AVC parallelization on many-core platform. Efficient parallelization of the deblocking filter on a many-core platform is challenging, because the deblocking filter has complicated data dependencies, which provide insufficient parallelism for so many cores. Furthermore, parallelization may have significant synchronization and load imbalance overhead. At present, research on the parallelizing deblocking filter on a many-core platform is rare and focuses on data-level parallelization. In this paper, we propose a three-step framework considering task-level segmentation and data-level parallelization to efficiently parallelize the deblocking filter. First, we review the entire deblocking filter process in 4×4 block edge-level and divide it into two parts: 1) boundary strength computation (BSC) and 2) edge discrimination and filtering (EDF), which increases the parallelism. Then, we apply the Markov empirical transition probability matrix and Huffman tree (METPMHT) to the BSC, which alleviate the load imbalance problem. Finally, we use an independent pixel connected area parallelization (IPCAP) for the EDF, which increases the parallelism and reduces the synchronization. In experiments, we apply our parallel method to the deblocking filter of the H.264/AVC reference software JM15.1 on the Tile64 platform without any Tile64 platform-based optimizations. Compared to the well-known 2D-wavefront method, the proposed method achieves on average 14.85, 17.83, and 10.60 times speed-up for QCIF, CIF, and HD videos using 62 cores, respectively.
Conference Paper
Parallel implementations of motion estimation for high definition videos typically exploit various forms of parallelism (GOP, frame-, slice- and macroblock-level) to deliver real-time throughput. Although parallel implementations deliver real-time throughput, they often suffer from limited flexibility and scalability due to the form of parallelism and architecture used. In this work, we use Group Of MacroBlocks (GOMB) and Intra-MB (IMB) parallelism with a multi-ASIP (Application Specific Instruction set Processor) architecture to provide a flexible and scalable platform for motion estimation of high definition videos. Multiple GOMBs are processed by the ASIPs in parallel (GOMB-level) where each ASIP is equipped with custom instructions to process the pixels of an MB in parallel (IMB-level). The system is flexible and scalable as the number of ASIPs (number of GOMBs) and custom instructions are not fixed, and are determined through design space exploration. We evaluated the multi-ASIP architecture in Tensilica's commercial design environment with varying number of ASIPs (up to nine), and compared hand-coded and automatically generated custom instructions. The results illustrate that systems with three and seven ASIPs delivered real-time throughput of 30 and 60 fps respectively for “pedestrian”, “rush hour” and “tractor” HD1080p video sequences. In addition, the results indicate that the multi-ASIP platform can be extended for even higher resolutions such as Ultra High Definition (UHD) due to its flexibility and scalability.