1. 18 Apr, 2009 3 commits
    • PPC asm for AV_RL*() · 5d43440a
      mru authored
      PPC is normally big-endian but has special little-endian load/store
      instructions.  Using these avoids a separate byteswap.  This makes the
      vorbis decoder about 5% faster.  Not much else uses little-endian
      reads/writes extensively.
      
      GCC generates horrible PPC code for the default AV_[RW]B64 (which uses
      a packed struct), so we override it with a plain pointer cast.
      
      git-svn-id: file:///var/local/repositories/ffmpeg/trunk@18602 9553f0bf-9b14-0410-a0b8-cfaf0461ba5b
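As an illustration of the idea in this commit message, the following is a minimal sketch of a little-endian 32-bit read that uses the PPC byte-reversed load when built with GCC for a big-endian PowerPC target, and falls back to portable byte assembly elsewhere. The name `read_le32` is hypothetical and is not the macro from intreadwrite.h; the inline asm operand form is an assumption about how such a load could be written with GCC.

```c
#include <stdint.h>

/* Hypothetical helper (not FFmpeg's AV_RL32): read a 32-bit
 * little-endian value from a possibly unaligned address. */
static inline uint32_t read_le32(const void *p)
{
#if defined(__GNUC__) && defined(__powerpc__) && \
    defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    uint32_t v;
    /* lwbrx loads a word with its bytes reversed, so no separate
     * byteswap is needed.  "Z" is GCC's PowerPC constraint for an
     * indexed or indirect memory operand; %y1 prints it in the
     * "ra, rb" form that lwbrx expects. */
    __asm__ ("lwbrx %0, %y1" : "=r"(v) : "Z"(*(const uint32_t *)p));
    return v;
#else
    /* Portable fallback: assemble the value one byte at a time. */
    const uint8_t *b = p;
    return (uint32_t)b[0]       | (uint32_t)b[1] << 8 |
           (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
#endif
}
```

On any host, reading the bytes `78 56 34 12` through `read_le32` should yield `0x12345678`, regardless of which branch was compiled.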
    • ARM asm for AV_RN*() · b2ff3961
      mru authored
      ARMv6 and later support unaligned loads and stores for single
      word/halfword but not double/multiple.  GCC is ignorant of this and
      will always use bytewise accesses for unaligned data.  Casting to an
      int32_t pointer is dangerous since a load/store double or multiple
      instruction might be used (this happens with some code in FFmpeg).
      Implementing the AV_[RW]* macros with inline asm using only supported
      instructions gives fast and safe unaligned accesses.  ARM RVCT does
      the right thing with generic code.
      
      This gives an overall speedup of up to 10%.
      
      git-svn-id: file:///var/local/repositories/ffmpeg/trunk@18601 9553f0bf-9b14-0410-a0b8-cfaf0461ba5b
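The approach described in this commit message can be sketched roughly as follows: a native-endian 32-bit read that forces a single-word `ldr` through inline asm, so the compiler can neither split it into byte loads nor merge it into a load double/multiple. The name `read_ne32` is hypothetical, and the guard on `__ARM_FEATURE_UNALIGNED` (an ACLE feature macro) is an assumption about how one might detect unaligned-access support; it is not taken from the commit.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not FFmpeg's AV_RN32): read a native-endian
 * 32-bit value from a possibly unaligned address. */
static inline uint32_t read_ne32(const void *p)
{
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FEATURE_UNALIGNED)
    uint32_t v;
    /* A single-word ldr tolerates unaligned addresses on ARMv6 and
     * later.  Because the load is opaque asm, the compiler cannot
     * turn it into ldrd/ldm, which would fault on unaligned data. */
    __asm__ ("ldr %0, %1" : "=r"(v) : "m"(*(const uint32_t *)p));
    return v;
#else
    /* Portable fallback: memcpy is always safe for unaligned data. */
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
#endif
}
```

The contrast is with a plain `*(const uint32_t *)p` cast, which tells the compiler the pointer is aligned and leaves it free to pick any load instruction.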
    • Reorganise intreadwrite.h · 9e809645
      mru authored
      This changes intreadwrite.h to support per-arch implementations of the
      various macros, allowing us to take advantage of special instructions
      or other properties the compiler does not know about.
      
      git-svn-id: file:///var/local/repositories/ffmpeg/trunk@18600 9553f0bf-9b14-0410-a0b8-cfaf0461ba5b
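One common way to structure such per-arch overrides is sketched below: an architecture-specific section defines a macro only when it has a better implementation, and the generic section supplies a default for anything still undefined. This is a single-file illustration under stated assumptions, not the layout of intreadwrite.h itself. The names `EX_RB32` and `ex_rb32_bswap` are hypothetical, and the GCC builtin stands in for a real arch-specific instruction.

```c
#include <stdint.h>
#include <string.h>

/* 1. Per-arch section (would normally be a separate header).
 *    Here a little-endian GCC-compatible target stands in for
 *    "an architecture with a faster way to do this". */
#if defined(__GNUC__) && defined(__BYTE_ORDER__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
static inline uint32_t ex_rb32_bswap(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);      /* unaligned-safe native load */
    return __builtin_bswap32(v);  /* convert to big-endian value */
}
#define EX_RB32(p) ex_rb32_bswap(p)
#endif

/* 2. Generic section: only fills in what no arch section defined. */
#ifndef EX_RB32
#define EX_RB32(p)                                   \
    ((uint32_t)((const uint8_t *)(p))[0] << 24 |     \
     (uint32_t)((const uint8_t *)(p))[1] << 16 |     \
     (uint32_t)((const uint8_t *)(p))[2] <<  8 |     \
     (uint32_t)((const uint8_t *)(p))[3])
#endif
```

Callers use `EX_RB32(ptr)` without knowing which definition was selected, so adding a new architecture only requires a new section ahead of the generic one.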
  2. 17 Apr, 2009 37 commits