Releases: intel/intel-graphics-compiler

igc-1.0.6646

16 Mar 14:11

Fixed Issues / Improvements

  • Added a key to dump out WIA information into a unique file per each function invocation,
  • Added comments for stack callee function prolog to assist debugging,
  • Added indirect regioning restrictions,
  • Added ld cases for texture folding,
  • Added legalization checks for VxH regions for int-to-fp moves,
  • Added localization of live ranges to reduce accumulator usages,
  • Added shader dump for spec constants,
  • Added support for a new pattern in PushAnalysis::IsStatelessCBLoad() to detect,
  • Added VISA option -dumpintf to dump RA interference graph,
  • Added more passes to igc_opt,
  • Added reporting warnings inside IGC passes,
  • Added handles to plane coefficients,
  • Added reg keys for scheduled BB range in local scheduler,
  • Added additional DP emulation mode,
  • Allow coalescing of spill/fill in presence of stack calls,
  • Allow remat even for operations using NoMaskWA,
  • Applying renaming to linear scan RA spill/fill,
  • Change memory semantics to relaxed for OpenCL 1.x atomics,
  • Decouple VC debug options to allow emission of debug information without debuggable kernels,
  • Emit warning about an unsupported debuggability if ZeBin is requested,
  • Enabled Wa16012061344 for read suppression issue caused by predictor,
  • Enhancements in compiler output,
  • Extended FCL dumps with CMFE options and inputs,
  • Favoring the llvm::BasicBlock name for vISA labels,
  • Fixed build break in Fedora,
  • Fixed emission of debug information for implicit variable locations,
  • Fixed erroneous size calculation for DW_OP_bit_piece,
  • Fixes for media height support in Cisa Builder,
  • Fixes for PosDep MatchMad condition,
  • Fixed logic when LLVM name is the empty string,
  • Fixed missing barrier when inline ASM is used in a kernel,
  • Fixed non-deterministic Function->VisaModule lookup,
  • Fixed PushAnalysis to not create unaligned 64bit runtime value arguments,
  • Fixed the hybrid RA with spill,
  • Fixed the linear scan RA time status,
  • Fixed issue for multiple thread compilation of shaders,
  • Fixed GEP scalarized indexes calculation in CG_LowerGEPForPrivMem pass,
  • For optnone builtins, allow IGC to determine inline/noinline and stackcall/subroutine calls,
  • GenISA ibfe/ubfe constant literal offset may exceed 31,
  • Implemented by value argument linearization,
  • Implemented IGC_ASSERT in IGC/OCLFE,
  • Improved DebugInfo robustness by implementing naive error-handling,
  • Lifted 4K predicate variable restriction on vISA assembly,
  • Made LinearScan default in ForceFastestSIMD,
  • Misc. initial edits to the file parsing code in the global scope,
  • Moving BiF parsing tools to a separate file,
  • Renamed some VC options to have "-vc" prefix instead of "-genx",
  • Reworked setPredicateForDiscard() to not use a temporary register for flag storage,
  • Select phi input in non-overlapping region,
  • Support for function pointer builtins/intrinsics,
  • Support for Function pointer SIMD Variants,
  • Support for uniformly typed read,
  • Unify conditions for llvm::JumpThreading usage,
  • Updated copyright headers,
  • Updated DPEmu,
  • Updated the indirect call info check in SWSB,
  • VC: backend can lower lzd64,
  • VC: debug info fixes for non-standalone kernels,
  • VC: legacy messages legalization to vc-codegen,
  • vISA: add helper function for calla check,
  • vISA: add HWConformity::fixCalla for HW restriction,
  • Simplifying ldrawvector to ldrawindex when we have a case where only one element is being used and we know the offset is a constant integer value,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6410

02 Mar 13:36

Fixed Issues / Improvements

  • Consider a WA table entry before inserting a flush sampler instruction
  • Location expressions improvements
  • Do not split arithmetic instructions in IGC as vISA will handle it
  • Backing out Simple push algorithm Optimization
  • Fix reg number issue in translate math
  • Changes for -O2. Optimizing non-user functions to save compiling time.
  • Fix the SWSB when there is no send in kernel
  • Add support to generate thread IDs in 2x2 blocks.
  • Separate global and local variables to reduce compilation time.
  • Don't replace OpDecorate with OpGroupDecorate.
  • Add InferAddressSpacesPass only if needed.
  • Fix crash in SIMD32 mode caused by pseudo_ret instruction's source operand right bound computation.
  • Update DispatchGPGPUWalkerAlongYFirst lookup
  • Changed ldms and ldmcs conversion to 16bit. Fixed ldmcs usage in users other than 16bit ldms
  • Cleanup unnecessary dynamic allocations.
  • Avoid warning of implicit i64->i32 by forcing explicit conversion.
  • Optimization for signed remainder for constant power of 2 int32.
  • Switch TPM to SVM entirely.
  • Do not modify wrregion input in non-overlapping region optimization.
  • Simplify usage of IGC_BUILD__VC_ENABLED cmake option. Change IGC_VC_DISABLED macro to the more consistent IGC_VC_ENABLED
  • Removed external dependency on llvm_patches and improved llvm setup in project
  • Produce truncate instead of __builtin_spirv_OpUConvert for not rounded/saturated converts
  • Fix missing barrier when inline ASM is used in a kernel
  • Split uniform into thread uniform, work group uniform, and global uniform, which gives more detailed information that could be used to enable better optimization.
  • Set InlineAsm usage per function group, to create correct builder for multiple FGs.
  • Support for stackcalls with InlineAsm by parsing multiple functions in single text stream.
  • Broadcast uniform variables if 'rw' constraint was specified (Inline ASM)
  • Optimize generic pointer load for kernels not using local memory.
  • Bug fix for SWSB when comparing the footprint.
  • Extend GAS phi resolution to all loops, not only top level ones.
  • Remove the dependence between dummy csel instructions.
  • Adds custom iterator class for Function Group. Can iterate through the FunctionGroup class, which uses a 2D vector storage.
  • Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
  • Cast Base and Insert parameters to unsigned to avoid sign extension while shifting
  • Add check for compute shaders that may need XYZ walk of thread IDs.
  • ZEBinary: Fix scratch memory buffer creation.
  • If unmasked regions are nested, the innermost llvm.genx.GenISA.UnmaskedRegionEnd intrinsic switched off unmasked code generation, resulting in the enclosing regions being generated as masked code.
  • Extra flag has been added to WIAnalysis Runner to not mark some uniform instructions as random.
  • Added a field to implicit argument structure for stack calls. Modified layout of local ids based on SIMD size.
  • IGA: add disassembler option "--output-on-fail"
  • Fix discovery of inlined DISubprogram nodes
  • Implement support for both SPV-IR forms for BitFieldInsert builtins
  • Introduction of new entry in IGC constant folder for bfrev.
  • Update TracePointerSource() function to detect cases where two different resource pointer values describe the same resource.
  • Vector backend does not support creation of L0 module with external functions. Insert assert in GenXCisaBuilder, explaining that.
  • Take SpillMemOffset into consideration when reporting spill size.
  • Split send has a fourth argument (src3), which can be an address register. Make sure to check dependence on src3 as well.
  • Add case when propagating non-generic pointer to store.
  • Disable certain transformations when compiling code for debug.
  • Add -vc-promote-array-alloca-limit knob to control array promotion total size (2nd edition). Force array promotion for CMRT binary.
  • Replace strcat by compound assignment operator
  • Now appropriately handling shl instructions with unsupported types.
  • More fixes to get local RA to honor declare even-alignment.
  • Print SLMsize in compiler output file
  • IGA SWSB refactoring: Unify InstType getter function
  • Extract vc input handling into another function
  • Fix an assertion due to unexpected RAUW with a constant
  • Extend supported subtargets in VC
  • Solve the memory leak issue of SWSB
  • Add control to route some resources to LSC/HDC
  • Fix scratch surface allocation for VC
  • Remove addrspacecast only if there are no other uses.
  • Set alwaysinline on invoke kernels. Don't add stack call or indirect call attributes.
  • Add interface target for vc intrinsics headers
  • Move stepping into Options instead of a global variable.
  • Add DoNotSpill attribute for vISA variables.
  • ZEBinary: Support buffer_offset implicit argument
  • If all its operands are region invariant, an inst is region invariant.
  • Commit base data structures for implicit argument handling for bindless offsets. Changes in StatelessToBindless promotion will come later.
  • For optnone builtins, allow -O0 flag to determine if we should call them as subroutines or stackcalls.
  • Allow EnableA64WA env variable in Linux release mode.
  • BinaryEncodingIGA: fix math pipe instruction check
  • Upgraded error messages with source file locations and names of the kernel causing the error.
  • Implement support for both SPV-IR forms for conversion builtins
  • Prevent redundant lowering attempt during SIMD CF Conformance
  • Make sure trivial RA honors even-alignment.
  • ZEBinary: add regkey to enable .bss section for zero-initialized global variables
  • add -vc-promote-array-alloca-limit knob to control array promotion total size
  • Add simplify CFG pass to pass manager to simplify work of LICM
  • Add an option for GenXPromoteArray threshold
  • Debug location expression improvements
  • Reduce memory footprint in GraphColor
  • Fix binary encoding for simd2 align16 instructions
  • Filter out "endif" and "else" when inserting dummy mov.
  • Avoid localization of large data for oclbin, use relocation instead.
  • Add option for TPM memory placement.
  • Correct localization costs for global vectors.
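
The entry on signed remainder by a constant power of 2 does not say how IGC lowers the operation. The sketch below only illustrates the well-known bit trick such an optimization typically relies on; the function name `srem_pow2` and its range check are hypothetical and not taken from IGC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an IGC function): computes x % (1 << k) with the
// truncating semantics of C++ '%' and LLVM 'srem', using no division.
inline int32_t srem_pow2(int32_t x, unsigned k) {
    assert(k >= 1 && k <= 30);  // assumed range for this sketch

    const uint32_t mask = (uint32_t{1} << k) - 1u;
    const uint32_t ux   = static_cast<uint32_t>(x);
    const uint32_t sign = ux >> 31;             // 1 if x is negative, else 0
    const uint32_t bias = (0u - sign) & mask;   // mask for negative x, else 0

    // Adding the bias before masking and removing it afterwards makes the
    // result round toward zero instead of toward negative infinity.
    // All arithmetic is done on unsigned values, so wraparound is well defined.
    // The final cast relies on two's-complement conversion (guaranteed since C++20).
    return static_cast<int32_t>(((ux + bias) & mask) - bias);
}
```

For example, `srem_pow2(-7, 2)` yields `-3`, matching `-7 % 4`, whereas a plain `x & 3` would yield `1`.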

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6087

08 Feb 14:57

Fixed Issues / Improvements

  • Fix a bug when comparing two source regions as type was not considered.
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6083

26 Jan 12:29

Fixed Issues / Improvements

  • Process legalization of 64-bit moves on VC,
  • Added the support for sending the textureID and dimensions value to the driver,
  • Support for SPV_EXT_shader_atomic_float_min_max,
  • Update the read suppression WA to use less dummy instructions,
  • Move CMABI::doInitialization code in a separate helper analysis,
  • Do not hash label operands, CreateLabel() should always return a new label,
  • Modernize GraphColor code,
  • Optimize to generate mad by promoting src2 from :b to :w,
  • Fix for creating a dump directory for Linux,
  • Update configuration_flags.md,
  • Fix Phi handling in SIMD CF Conformance,
  • Introduction of new entry in IGC constant folder for bfi,
  • Extend ValueTracker to be able to track inside user functions,
  • Remove omitting zero/undef sample params for cube maps,
  • Bug fixes for O0 inlining heuristic,
  • Corrects a defect where the vISA asm parser erroneously used,
  • Move ocl runtime info to headers,
  • Added missing code for ADL_S and RKL,
  • Fix builtin mangling for OpReadClockKHR,
  • Make sure structurizer uses correct mask offset,
  • Setting FunctionControl to force indirect call now applies to all user functions,
  • Fixed logic error when there exists a reg key which is a substring of another reg key (i.e. ShaderDumpEnable and ShaderDumpEnableAll),
  • Add UMD control to disable higher Simds,
  • Report private memory usage in assembly dump,
  • Other minor fixes and improvements.
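
The reg key entry above describes a class of bug that is easy to reproduce: looking a key up by substring search also matches any longer key that contains it. The sketch below assumes a hypothetical `Key=Value,Key=Value` configuration string; the format and both function names are illustrative only and do not reflect how IGC stores or parses its reg keys.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Naive lookup (illustrative): reports a key as present whenever its name
// appears anywhere in the string, so "ShaderDumpEnable" also matches
// "ShaderDumpEnableAll".
inline bool has_key_naive(std::string_view config, std::string_view key) {
    return config.find(key) != std::string_view::npos;
}

// Exact lookup (illustrative): splits the hypothetical "Key=Value" entries on
// ',' and compares the whole key name.
inline std::optional<std::string> find_key_exact(std::string_view config,
                                                 std::string_view key) {
    assert(!key.empty());
    std::size_t pos = 0;
    while (pos <= config.size()) {
        std::size_t end = config.find(',', pos);
        if (end == std::string_view::npos) end = config.size();
        const std::string_view entry = config.substr(pos, end - pos);
        const std::size_t eq = entry.find('=');
        if (eq != std::string_view::npos && entry.substr(0, eq) == key) {
            return std::string(entry.substr(eq + 1));
        }
        pos = end + 1;  // continue with the next entry
    }
    return std::nullopt;
}
```

With the input `"ShaderDumpEnableAll=1"`, `has_key_naive` reports `ShaderDumpEnable` as present, while `find_key_exact` returns no value for it.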

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.5964

05 Jan 15:26

Fixed Issues / Improvements

  • Process legalization of 64-bit moves on VC backend side.
  • Renumber subroutines after removing unreachable code.
  • Add environment variable for OCL debugging options.
  • As urem needs positive operands, srcMod could generate negative operands. To be safe, disable srcMod for urem.
  • The existing I64 shift emulation produced a wrong result if the shift amount was not in [0, 31]. The problem was that the inner condition was generated incorrectly.
  • Passing context for code patching.
  • Search for genx.output.1 intrinsics not only in return blocks (some execution paths may have callable instead).
  • Fix for DumpToCustomDir flag and logic of ShaderDumpPidDisable flag on Linux.
  • Fix: Wrong pattern is matched in GenSpecificPattern.
  • Add support for SPV_INTEL_long_constant_composite.
  • Use simple token allocation algorithm in debug mode.
  • Disable stateless to stateful promotion after 32 promotion.
  • Adding /Ob3 inline expansion option to 64-bit release config for aggressive inlining within IGC.
  • Fix coalescing of output arguments after migration to genx.output.1 intrinsic.
  • Support for SPV_EXT_shader_atomic_float_add extension.
  • Guard the FC patch SWSB info generation to save compilation time
  • When a byte is promoted to word, its signedness should remain unchanged.
  • Temp WA to limit kernel name length
  • Initialize address register for indirect addressing if shader has indirect resources accesses i.e. a0 is used in send descriptor.
  • Add support for SPV_INTEL_fp_fast_math_mode in SPIRVReader.
  • Address register initial support.
  • Fix initialization of GenXTidyControlFlow
  • Fix push constant threshold for CFL GT3.
  • Fixed performance issues with subroutine inlining heuristic.
  • Move block push constants threshold setting from being the default IGC flag value to CPlatform.
  • Remove -hasRNEAndRenorm and its associated code.
  • int64 mul does not support srcMod
  • Refactored some conditions to make the source code more idiomatic and easier to read.
  • Change of stateless indirect access reporting mechanism.
  • Prevent unnecessary copies generation on GenXCoalescing.
  • Extract code that adds Compute Shader CodeGen passes to a separate function.
  • Change return type of createWrRegion and small refactoring in GenXBaling
  • Hybrid RA with spill
  • Maintain physical pred/succ during CFG BB insert/delete.
  • RA compilation time: remove unnecessary operations when building interference with local RA.
  • Workaround for sampler feedback bug.
  • Optimization for signed scalar division for constant power of 2 int as divisor.
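
The I64 shift emulation entry above concerns shift amounts of 32 or more. As a rough illustration of why that case needs its own branch, the sketch below emulates a 64-bit left shift on two 32-bit halves. The type `U64Parts` and the functions `split64`, `join64`, and `shl64_emulated` are hypothetical names for this example; the generated IGC code is not shown in these notes and may differ.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit halves (hypothetical type for this sketch).
struct U64Parts {
    uint32_t lo;
    uint32_t hi;
};

inline U64Parts split64(uint64_t v) {
    return { static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32) };
}

inline uint64_t join64(U64Parts p) {
    return (static_cast<uint64_t>(p.hi) << 32) | p.lo;
}

// Emulated 64-bit left shift. Assumes amt < 64; behaviour for larger amounts
// is deliberately left out of this sketch.
inline U64Parts shl64_emulated(U64Parts v, unsigned amt) {
    assert(amt < 64);
    if (amt == 0) return v;  // avoids the undefined 32-bit shift by 32 below
    if (amt < 32) {
        // Bits leaving the low half are carried into the high half.
        return { v.lo << amt, (v.hi << amt) | (v.lo >> (32u - amt)) };
    }
    // amt in [32, 63]: the low half moves entirely into the high half.
    return { 0u, v.lo << (amt - 32u) };
}
```

Getting the `amt < 32` condition wrong silently produces correct results for small shifts and wrong ones for large shifts, which matches the symptom described in the entry.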

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.5884

22 Dec 20:52

Fixed Issues / Improvements

  • Avoid read-modify-write when spilling scalar variables.
  • Open source ROCKETLAKE and ALDERLAKE_S
  • Refactor spill/fill intrinsic to not rely on the execution size passed in.
  • Replace a hot function with templated version for better compile time.
  • Move private memory allocations to SLM.
  • Add preserve CFG and WIA to AdvMemOpt to save time
  • Enable ForceInlineStackCallWithImplArg by default, and -O0 no longer force inlines all function calls.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.5819

16 Dec 15:33

Fixed Issues / Improvements

  • Moved FP64 math to separate bc, a second attempt,
  • Added debug printouts for DebugInfo,
  • Renamed createSrc to make it less verbose and aligned with createDst,
  • Added processing of new masked gather intrinsics (gather4_masked_scaled2 and gather_masked_scaled2)
  • Moved ModuleAllocaInfo to the header for reuse,
  • Unified inlining heuristic for stackcalls and subroutines,
  • Cleaned up IR debug dumps,
  • Fix for argument indirection with already indirected call,
  • Fixed a bug where spilled dst's size was incorrectly computed in debug mode,
  • VC: Support FP64 BiF after it was supported in scalar backend,
  • Added optimizations for signed division for constant power of 2,
  • Cleaned up in GenXCategory,
  • Fixed subregister offset for spilled destination,
  • IMF LA open-sourcing: Switch back to previous FP32 atan2 implementation,
  • Changed reduce implementation to remove extra barriers,
  • Corrected wrappers in llvm::DIBuilder,
  • Provide the ability to call one kernel from another,
  • Reduced time on LiveVar update,
  • Don't insert branches in loops and on a big amount of samples,
  • Enabled ForceInlineStackCallWithImplArg by default, and -O0 no longer force inlines all function calls,
  • Reduced the RA compilation time-use: replace push_back with emplace_back,
  • Handle optnone builtins with subroutines instead of stackcalls,
  • Other minor fixes and improvements.
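
The entry on signed division by a constant power of 2 gives no implementation details. A common way to do this without a divide instruction is to bias negative inputs and then shift arithmetically, sketched below. The names `asr32` and `sdiv_pow2` are hypothetical, and this is not claimed to be the sequence IGC emits.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right written without relying on implementation-defined
// behaviour of '>>' on negative values (pre-C++20).
inline int32_t asr32(int32_t v, unsigned k) {
    return v < 0 ? ~(~v >> k) : (v >> k);
}

// Hypothetical helper (not an IGC function): computes x / (1 << k) with the
// truncating semantics of C++ '/' and LLVM 'sdiv'.
inline int32_t sdiv_pow2(int32_t x, unsigned k) {
    assert(k >= 1 && k <= 30);  // assumed range for this sketch

    const uint32_t mask = (uint32_t{1} << k) - 1u;
    const uint32_t sign = static_cast<uint32_t>(x) >> 31;  // 1 if x < 0
    const uint32_t bias = (0u - sign) & mask;              // (2^k - 1) if x < 0

    // A plain arithmetic shift rounds toward negative infinity; adding
    // (2^k - 1) to negative inputs first makes it round toward zero.
    // The addition cannot overflow: bias is non-zero only when x is negative.
    const int32_t biased = x + static_cast<int32_t>(bias);
    return asr32(biased, k);
}
```

For example, `sdiv_pow2(-7, 2)` yields `-1`, matching `-7 / 4`, whereas an unbiased arithmetic shift would yield `-2`.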

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.5761

08 Dec 12:29

Fixed Issues / Improvements

  • Added padding between globals when encoding,
  • Added SPIRVDLL_SRC variable which takes prepared spirvdll sources,
  • Added support to emit relocations in debug info,
  • Improved LiveVar time by changing data-structure,
  • Improvements in VC debug info,
  • Increased per-thread stack size for SVM case,
  • Made GenXTidyControlFlow actually preserve liveness,
  • Moved splitStructPhis implementation to the proper place,
  • Optimized generic pointer load for kernels not using local memory,
  • Reduced the RA compilation time,
  • Reduced the redundant interferences caused by function call,
  • Specified type of pointer arithmetic to avoid tagging,
  • Updated patch token version,
  • Utilized genx.gaddr intrinsic for const/global tables,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.5723

09 Dec 11:19

Fixed Issues / Improvements

  • Considering uniformness during register pressure estimate,
  • Eliminated name length field restriction,
  • Enabled spill cleanup for fp based spill/fill,
  • Fixed extra option processing for CM online compilation,
  • Fixed image tracking for GetBufferPtr scenario,
  • Fixed spill code generation for spilled dest with non-zero subregister,
  • Fixed the assignment of BTI values in the case of multiple uses,
  • IMF LA open-sourcing,
  • Implemented SPV_INTEL_unstructured_loop_controls extension,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.5699

30 Nov 15:31

Fixed Issues / Improvements

  • Added IMF LA math function for FP32
  • Avoid OCL kernel recompilation if there is less than 2% spill/fill
  • Implement support for implicit arguments in stack call functions
  • Fix bug in SPIRV reader to correctly propagate flags
  • Local variables no longer optimized out in off-loaded functions
  • Use ValueTracker to track width and height media block read/write parameter
  • Add support for reading implicit arguments from stack call functions
  • Fix vISA parser error for fcall/fret
  • Optimize to generate mad by promoting src2 from :b to :w

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.