Releases: intel/intel-graphics-compiler

igc-1.0.7423

18 May 13:29

Fixed Issues / Improvements

  • API option to control per-thread memory,
  • Add GenXAggregatePseudoLowering to the list of GenX passes,
  • Add helper macro for tablegen,
  • Add missing dependencies on llvm libs for VC,
  • Add missing dl to vc driver link libraries,
  • Add missing library for vc codegen,
  • Add option 'EnableDivergentBarrierCheck' to check for barriers that may be in divergent control flow,
  • Add option to strip debug info from llvm IR,
  • Add support for function pointer relocation to global/constant buffer. Save relocation data to module metadata, to be patched with actual function address by runtime,
  • Added missing header in preparation of LLVM 11,
  • Added more passes to igc_opt,
  • After IGCInstructionCombiningPass, the address of an indirectly called function is used inside a 'combined' store instruction. Such constant expressions have to be split so that further passes can process them,
  • Allow cmd arg, registered in passInfo, to be used as pass names in IGC keys PrintAfter and PrintBefore,
  • Allow float to packed half-float move on select platforms, Nth try,
  • Avoid same space AS casts in LowerGPCallArg,
  • BCR tuning,
  • Bug fixes to enable function pointers directly passed by FE (Commit attempt #3),
  • Change default stateless private size,
  • Change encoding of return register location in each function's epilogue in FDE,
  • Cleanup old emulation functions to preserve compatibility with old ISPC,
  • Correction in LLD build,
  • Cosmetic fixes for VC emu boilerplate generator,
  • Detect local to generic pointer casts, second try,
  • Disable Read suppression with single IGC key,
  • Do not load zero values from genx.alloca,
  • Dump just after each pass execution,
  • Enable ZWDelta in Code Patch,
  • Enable immediate pool for cmp instruction,
  • Enabling preRA_Schedule in default ForceFastestSIMD due to the ACOdyssey regression in IGC-4149,
  • Expand BufferType: the buffer type range is set to 32 instead of 16,
  • Fix CSEL before EOT, there may be atomic URB inst,
  • Fix Travis environmental error, libc6,
  • Fix Ubuntu build instruction,
  • Fix bug where class type was encoded as struct type in dwarf,
  • Fix calculation of TPM address offset and emit warning if allocated space is not enough,
  • Fix debug info reader for static members,
  • Fix debug line info in GenericAddressDynamicResolution,
  • Fix erroneous unary "~" implementation for cm-cl vector,
  • Fix frame destruction boundary condition,
  • Fix incorrect CISA offset attached to EOT,
  • Fix lowering shader interpreted values (GS),
  • Fix packed immediate handling on platforms that don't have byte regioning,
  • Fix phi nodes coalescing. In the case of indirectbr instruction several phi nodes may have the same PHICPY segment. Coalescing analysis was not ready for this and it led to excess copies,
  • Fix stack mem option parsing,
  • Fix the bug in flag register spill/fill clean up,
  • Fix the optimization EnableMergeTransposeSLM,
  • Fixing DebugLoc's in VectorPreProcess pass,
  • Flip DispatchAlongY override setting to reflect new default,
  • For padding constant/global buffers, use stringstream width/fill instead of writing characters one by one,
  • Handle convert jmpi to goto correctly on platforms that don't support predCtrl width,
  • Handling ConstantInt and PtrToInt in evaluating constant address. Also adding a new helper function for retrieving constant address,
  • High-Level Load/Store G4IR support,
  • Implement OpTypeBufferSurfaceINTEL in the old SPIRV-LLVM-Translator,
  • Implement more efficient emulation for floating point global atomics,
  • Improve jump codegen by setting uniform if jump's flag is workgroup/global uniform under EU fusion,
  • Included lldELF library to IGC solution,
  • Initial implementation of 64-bit integer division routines for VC backend,
  • Introducing lowering diagnostics kind,
  • Keep pass disabled in this checkin,
  • Limit the display of the warning for the ShowFullVectorsInShaderDumps flag to one in the console,
  • Link debug info with required llvm libraries,
  • Make VC option parser IGC top level component,
  • Make emitStateRegID() inputs easier to read,
  • Make run()/reset() members private. Add inProgress flag to avoid recomputation when it is already in progress causing stack overrun,
  • Merge stores/loads from different SLM buffers,
  • Minor improvement to TGL workaround,
  • Minor improvements to split aligned scalar pass,
  • Move code to strip debug info before CheckInstrType pass,
  • Move common LLVM build setup to one place,
  • Move dependent instructions for ZWDelta into the payload section,
  • Move link libraries from plugin to codegen library,
  • Move linux backend plugin code to separate file,
  • Need to check Bti Value is constant first,
  • Optimize mergeScalar pass to benefit more cases, third try,
  • Option for explicit stateless private size,
  • Option for globals localization configuration. More modes of localization are supported,
  • Packetizer needs to handle the argument with an addrspacecast as its use,
  • Pass llvm options using compilation options,
  • Pre-allocate all R1Lo aliases,
  • Re-enable IGC registry key TotalGRFNum,
  • Remember backend config in genx module,
  • Remove debug llvm options parsing in VC,
  • Report unsupported SPIRV opcode,
  • Requested by debugger, copy "sret" argument to return register upon stack function exit, such that debugger can query the return value of the function. Also added the "sret" attribute to implicit vector pointer argument used to represent the return value as specified in the IGC call convention ABI,
  • Require mad dst to be aligned in split aligned scalar pass,
  • Resolve circular dependency in vc metadata headers,
  • Same code style of 'const' order,
  • Set Payload LiveOut as Output to prevent reallocation,
  • Set default globals localization to always localize vectors,
  • Simplify legalization of store instructions with constants,
  • Simplifying the geometry shader lowering pass,
  • Some clean up passes, which were added after LateInlineUnmaskedFunc=1, created complex or combined instructions. These instructions have to be split with the BreakConstantExpr pass,
  • Support build of spirv translator with prebuilt LLVM,
  • Support load inst in GenXAggregatePseudoLowering,
  • Switch to OCL conformant return value in VC printf,
  • Switch to use InstVisitor in GenXAggregatePseudoLowering,
  • Transform SPV_INTEL_optimization_hints into SPV_KHR_expect_assume,
  • Try to insert VectorUniform intrinsic to the lowest common dominator of all loads and stores,
  • Update addSamplerFlushBeforeEOT with additional HW requirements,
  • Update the insertInstLabel to fixEndifWhileLabels,
  • Use cmake argument parser in build bif function,
  • Use link wrapper to filter link command for vc plugin,
  • Use llvm source hook for SPIRV translator,
  • When LateInlineUnmaskedFunc is on, the following SROA pass created ShuffleVector instructions, which are not supported in code generation. Add a legalization pass to expand such instructions into supported ones,
  • When creating a new call, make sure call inst's calling convention matches its callee, otherwise the call would be deleted (by instcombine),
  • When promoting arrays to registers wrong assumption regarding support for fp64 and int64 is made,
  • Workaround for IMAGE_SUPPORT macro definition in Clang,
  • ZEBinary: Add sampler_index to payload arguments,
  • ZEBinary: fallback/assert when encountering an inline sampler,
  • ZEBinary: support Rela format in ZEELFObjectBuilder,
  • Other fixes and improvements.
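
The buffer padding item above (stringstream width/fill instead of writing characters one by one) can be illustrated with a minimal sketch. The helper name padToAlignment, its signature, and the choice of zero bytes as the fill character are assumptions made for illustration; they are not taken from the IGC sources.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not IGC code): returns `data` followed by enough
// zero bytes to make the total length a multiple of `alignment`.
std::string padToAlignment(const std::string& data, std::size_t alignment) {
    assert(alignment > 0 && "alignment must be non-zero");

    const std::size_t remainder = data.size() % alignment;
    const std::size_t padding = (remainder == 0) ? 0 : alignment - remainder;

    std::ostringstream out;
    out << data;
    if (padding > 0) {
        // setw applies only to the next insertion. Inserting an empty
        // string with width `padding` emits exactly `padding` fill
        // characters in a single formatted write, instead of a loop
        // that writes one character at a time.
        out << std::setfill('\0') << std::setw(static_cast<int>(padding)) << "";
    }
    return out.str();
}
```

Calling padToAlignment("abcde", 4) would yield an 8-byte string whose last three bytes are zero, while an input that is already aligned is returned unchanged.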

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.7181

10 May 14:54

Fixed Issues / Improvements

  • Avoid O(n^2) iteration over kernel declares.
  • Disable bundle conflict reduction if there is spill
  • Change the footprint for OWORD load.
  • Add check for IGC destruction. If IGC static objects are destructed then IGC returns error code to driver.
  • Copy clang sources to handle opencl-clang patching
  • Respect per instruction contraction flag in mad pattern match.
  • Add warning when -cmc option is used for SPIRV path
  • Add BCR support in RA for TGL
  • Enable Shader debug hash code on for dx and ogl adapters by default
  • Emulation inliner means to inline emulation functions only.
  • Instructions using acc operands are not candidates for this optimization as such instructions have alignment restrictions.
  • High-Level Load/Store G4IR support.
  • Fix CM FE Interface

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.7152

27 Apr 14:13

Fixed Issues / Improvements

  • Proper cleanup after matching reverse sqrt.
  • Set return location register to DW_CFA_undefined in kernel frame.
  • Limit Vec Element in ShaderDump
  • Check if value stored at BLOCK_INDEX_INVOKR_FUNC is NULL
  • FixAddressSpace for PHINode
  • Remove unnecessary legacy code that was creating lot of strings in dwarf.
  • Add globals to cache. These include function arguments.
  • Introduce llvm hooks for LLVM projects
  • Fix the dependence tracking for ACC register
  • Embed debug info in zebin
  • Fix creation of fshl and fshr
  • Fix: Build succeeded despite undefined builtin
  • Handle more patterns in dynamic buffer promotion.
  • Use helper function to handle LLVM components in IGC
  • Fix the bug of forceDebugSWSB
  • Remove dependence tracking for flag register.
  • Move LLVM prebuild handling to IGC cmakes
  • Add simple push for bindless buffers.
  • Fix crash in TypesLegalizationPass when an array is returned from a function call.
  • Create VCDriver library with compilation manager code
  • Noopt attribute no longer disables inlining without noinline attribute present.
  • Change assumed simd size in determining private memory size per physical thread.
  • Add getBuilder member function in G4_INST class.
  • Match inverse sqrt from division.
  • Refactor optimizing 3d ld instructions.
  • Update the acc sub algorithm to reduce compilation time.
  • Fix the csel instruction inserted after else.
  • Add scan for peephole opt of acc substitution.
  • Allow IGC keys PrintAfter/PrintBefore to take a list of pass names.
  • Redesign handling of spirv lib in IGC

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.7076

27 Apr 14:26

Fixed Issues / Improvements

  • Respect per instruction contraction flag in mad pattern match.
  • Add enable preemption to finalizer flags.
  • Support for SPV_KHR_linkonce_odr in SPIRV Reader.
  • When promoting arrays to registers wrong assumption regarding fp64 and int64 is made.
  • Enhance m_num1DAccesses lookup in CS
  • Enable partial emulation for fp64 div/sqrt for OCL
  • Change interface for revision id information
  • Add possibility to force bindless constant buffers to be untyped.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.7041

19 Apr 13:29

Fixed Issues / Improvements

  • Keep Fast Math Flags during memory operations simplifications.
  • Allow float to packed half-float move on select platforms, second try.
  • Fix handling saturation patterns.
  • Force private memory to global buffer when generic load/store are present
  • Optionally allow for compilation without payload header.
  • Fix bug with setting of global variable in kernel arg offsets.
  • Fix right bound computation for send destination.
  • Fix in NoMask WA for the last BB.
  • Change unroll threshold for high trip count, nested loops.
  • Support for SPV_INTEL_noopt in OCL adaptor.
  • Fix bugs in expandMulPostSchedule pass.
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6909

12 Apr 13:43

Fixed Issues / Improvements

  • Added ARF latency for scheduler,
  • Added more dep info in comment for SWSB,
  • Added option to disable VC BiF, disable by default prior to LLVM 9,
  • Added extra functionality to InlineLocalsResolution for identification and removal of unused global variables and all their successive recursive user nodes in the def-use tree,
  • As prebuiltin lib is meant to be neutral to flags, removing llvm.module.flags,
  • DebugInfo should emit several bit_pieces if a variable is larger than a register,
  • Enable float accumulator for sel,
  • Enable split memory fence operations,
  • Fixed build break on Clang, error: unknown warning group,
  • Fixed emission of debug info for VC in the presence of indirect calls,
  • Fixed emit of InsertElement of uniform vector,
  • Fixed indentation in CMakes,
  • Fixed issue where ResolveGAS pass caused removal of specific instructions,
  • Fixed legalization of stores operating on composite types,
  • Fixed missing debug info links when creating Gen specific intrinsics,
  • Fixed spill mem size calculation in VC,
  • Implemented support for both SPV-IR forms of atomic builtins and OpControlBarrier,
  • Improved readability of debug info codebase (NFC),
  • Initial support of CM-CL BiF, printf resolution in VC,
  • Made dump() no arg function. Add dumptofile(filename),
  • Made VC PressureTracker aware of DataLayout,
  • New, more accurate implementation of lgamma & tgamma,
  • Normalize BE_FP and BE_SP when interpreting them as they are in oword,
  • Re-enable memory fence scheduling and do not schedule it beyond branches,
  • Refactoring in GenX,
  • Removed a power-of-two lookup table,
  • Removed OpSource language check assert,
  • Renamed function attribute "IndirectlyCalled" to "referenced-indirectly" to match SPIRV FE,
  • Separate CMake utilities from main IGC list,
  • Speed up GenXLiveness analysis,
  • Support for Nontemporal MemoryAccess in SPIRVReader,
  • Support import/export SPIRV linkage for indirect calls,
  • Support legacy IR without constant addrspace for printf,
  • Support printf with args in GenXPrintfResolution,
  • Update copyright headers,
  • Use std::decay instead of custom functor in Frontend.h,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6812

30 Mar 12:40

Fixed Issues / Improvements

  • Implement support for both SPV-IR forms of OpIsNan, OpIsInf, OpIsFinite and OpIsNormal builtins.
  • Implement support for both SPV-IR forms of OpLessOrGreater, OpOrdered and OpUnordered builtins.
  • Add option to schedule fence commit move.
  • Fixed alignment processing in clone helper functions.
  • Adding framework for error flag for catching uninitialized variables.
  • Adding warning flag for uninitialized variables in Compiler project and cleaning up needed issues.
  • Unify tblgen detection in VC.
  • Move UnreachableHandling pass after all LowerSwitch pass runs.
  • Wrap CM-CL library to support clang-9.
  • fcl options string must start with "-cmc" to invoke CM frontend.
  • Incrementally apply pattern match transforms.
  • Dispatch along y optimization - phase one.
  • Added XeHP SDV to platform enum.
  • Support "%=" string format for labels in InlineAsm. Transforms this special format string into a unique label suffix for that asm block.
  • Add a key: EnableL3FlushForGlobal, to control L3 flush.
  • Redesign stackcalls codegen in VC.
  • Fix for optimized compilation with debug info.
  • Enable accumulator usage for sel instruction.
  • Skip step 5 in LowerGPCallArg only when processing function with variable number of arguments.
  • Reimplement workgroup reduce, scan_inclusive and scan_exclusive using subgroups.
  • Added new passes to igc_opt.
  • Changed the naming scheme used by VC to produce debug info dumps.
  • Implement support for both SPV-IR forms for OpAny/OpAll builtins.
  • Clone routine should make sure that alignment is set correctly.
  • Simplifying code related to sample and texel fetch instructions.
  • Remove unused included header.
  • ZEBinary: Add a regkey to disable printf support.
  • Do stateful transformation for non-gep ptr.
  • IGA: Add new kv apis and some refactoring.
  • Move cleanupBindless after LVN.
  • Initial CMCL Support library and tool implementation.
  • Do not promote svm gather/scatter w/ mismatched types.
  • Decide emission of pre-fills for spills based on presence of corresponding pseudo kill or def count of spill.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6748

30 Mar 12:36

Fixed Issues / Improvements

  • DebugInfo - changed code layout and added few asserts to line info emission
  • Add option to skip memory fence commit.
  • This is a minor change to allow stateful transformation for non-gep pointer.
  • Make the OCLInlineThreshold option available externally
  • Enable code patching by default (CodePatch=2)
  • Remove a bunch of outdated CMake code
  • Remove unused function from BiF
  • VC can now dump asm for indirectly-called functions
  • Add gen11 and gen12 bindless system routines
  • Added check for induction variable sext in Simd32Profitability
  • Change unreachable instructions to "return undef"
  • Apply the same skipping rules for step 1 and step 5 of LowerGPCallArg
  • Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
  • Initial implementation of cm-cl library
  • Link with LLVM target if dylib is required
  • Switch TPM to SVM entirely
  • Moving opencl-clang discovery code to outer scope to make it available for VC
  • Simplifying code related to sample and texel fetch instructions
  • Generate native sqrt for fast llvm sqrt operation and match reciprocal sqrt
  • Refactor Sub- and Work- group Scan and Reduce

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6712

22 Mar 17:23

Fixed Issues / Improvements

  • Simplifying PrivateMemoryResolution
  • Simplify SWSB fields in G4_INST
  • Fix debug info link in VISA for caller save/restore code.
  • Fixed stack call implicit arg mismatch between caller/callee.
  • Fix the pseudo kill for RA
  • Improve jump codegen by setting uniform if jump's flag is workgroup/global uniform
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6646

16 Mar 14:11

Fixed Issues / Improvements

  • Added a key to dump out WIA information into a unique file per each function invocation,
  • Added comments for stack callee function prolog to assist debugging,
  • Added indirect regioning restrictions,
  • Added ld cases for texture folding,
  • Added legalization checks for VxH regions for int-to-fp moves,
  • Added localization of live ranges to reduce accumulator usages,
  • Added shader dump for spec constants,
  • Added support for a new pattern in PushAnalysis::IsStatelessCBLoad() to detect,
  • Added VISA option -dumpintf to dump RA interference graph,
  • Added more passes to igc_opt,
  • Added reporting warnings inside IGC passes,
  • Added handles to plane coefficients,
  • Added reg keys for scheduled BB range in local scheduler,
  • Added additional DP emulation mode,
  • Allow coalescing of spill/fill in presence of stack calls,
  • Allow remat even for operations using NoMaskWA,
  • Applying renaming to linear scan RA spill/fill,
  • Change memory semantics to relaxed for OpenCL 1.x atomics,
  • Decouple VC debug options to allow emission of debug information without debuggable kernels,
  • Emit warning about an unsupported debuggability if ZeBin is requested,
  • Enabled Wa16012061344 for read suppression issue caused by predictor,
  • Enhancements in compiler output,
  • Extended FCL dumps with CMFE options and inputs,
  • Favoring the llvm::BasicBlock name for vISA labels,
  • Fixed build break in Fedora,
  • Fixed emission of debug information for implicit variable locations,
  • Fixed erroneous size calculation for DW_OP_bit_piece,
  • Fixes for media height support in Cisa Builder,
  • Fixes for PosDep MatchMad condition,
  • Fixed logic when LLVM name is the empty string,
  • Fixed missing barrier when inline ASM is used in a kernel,
  • Fixed non-deterministic Function->VisaModule lookup,
  • Fixed PushAnalysis to not create unaligned 64bit runtime value arguments,
  • Fixed the hybrid RA with spill,
  • Fixed the linear scan RA time status,
  • Fixed issue for multiple thread compilation of shaders,
  • Fixed GEP scalarized indexes calculation in CG_LowerGEPForPrivMem pass,
  • For optnone builtins, allow IGC to determine inline/noinline and stackcall/subroutine calls,
  • GenISA ibfe/ubfe constant literal offset may exceed 31,
  • Implemented by value argument linearization,
  • Implemented IGC_ASSERT in IGC/OCLFE,
  • Improved DebugInfo robustness by implementing naive error-handling,
  • Lifted 4K predicate variable restriction on vISA assembly,
  • Made LinearScan default in ForceFastestSIMD,
  • Misc. initial edits to the file parsing code in the global scope,
  • Moving BiF parsing tools to a separate file,
  • Renamed some VC options to have "-vc" prefix instead of "-genx",
  • Reworked setPredicateForDiscard() to not use a temporary register for flag storage,
  • Select phi input in non-overlapping region,
  • Support for function pointer builtins/intrinsics,
  • Support for Function pointer SIMD Variants,
  • Support for uniformly typed read,
  • Unify conditions for llvm::JumpThreading usage,
  • Updated copyright headers,
  • Updated DPEmu,
  • Updated the indirect call info check in SWSB,
  • VC: backend can lower lzd64,
  • VC: debug info fixes for non-standalone kernels,
  • VC: legacy messages legalization to vc-codegen,
  • vISA: add helper function for calla check,
  • vISA: add HWConformity::fixCalla for HW restriction,
  • Simplifying ldrawvector to ldrawindex when we have a case where only one element is being used and we know the offset is a constant integer value,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.