The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().
The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration).
With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults or preemption disables.
In drm/i915/gt/uc/intel_us_fw.c, the function intel_uc_fw_copy_rsa()
just use the mapping to do memory copy so it doesn't need to disable
pagefaults and preemption for mapping. Thus the local mapping without
atomic context (not disable pagefaults / preemption) is enough.
Therefore, intel_uc_fw_copy_rsa() is a function where the use of
memcpy_from_page() with kmap_local_page() in place of memcpy() with
kmap_atomic() is correctly suited.
Convert the calls of memcpy() with kmap_atomic() / kunmap_atomic() to
memcpy_from_page() which uses local mapping to copy.
[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com/T/#u
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231203132947.2328805-8-zhao1.liu@linux.intel.com
Due to a change in the auth flow on MTL, GuC 70.7.0 and newer will only
be able to authenticate HuC 8.5.1 and newer. The plan is to update the 2
binaries synchronously in linux-firmware so that the fw repo always has
a matching pair that works; still, it's better to check in the kernel so
we can print an error message and abort HuC loading if the binaries are
out of sync instead of failing the authentication.
v2: Add clarification comment, fix typo in commit msg, clean up variable
declaration (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> #v1
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230717234905.117114-1-daniele.ceraolospurio@intel.com
The release and security versions of the GSC binary are not used at
runtime to decide interface compatibility (there is a separate version
for that), but they're still useful for debug, so it is still worth
extracting them and printing them out in dmesg.
To get to these version, we need to navigate through various headers in
the binary. See in-code comment for details.
v2: fix and improve size checks when crawling the binary header, add
comment about the different version, wrap the partition base/offset
pairs in the GSC header in a struct (Alan)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230612181529.2222451-3-daniele.ceraolospurio@intel.com
In the previous patch we extracted the offset of the legacy-style HuC
binary located within the GSC-enabled blob, so now we can use that to
load the HuC via DMA if the fuse is set that way.
Note that we now need to differentiate between "GSC-enabled binary" and
"loaded by GSC", so the former case has been renamed to "has GSC headers"
for clarity, while the latter is now based on the fuse instead of the
binary format. This way, all the legacy load paths are automatically
taken (including the auth by GuC) without having to implement further
code changes.
v2: s/is_meu_binary/has_gsc_headers/, clearer logs (John)
v3: split check for GSC access, better comments (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-4-daniele.ceraolospurio@intel.com
The new binaries that support the 2-step authentication contain the
legacy-style binary, which we can use for loading the HuC via DMA. To
find out where this is located in the image, we need to parse the
manifest of the GSC-enabled HuC binary. The manifest consist of a
partition header followed by entries, one of which contains the offset
we're looking for.
Note that the DG2 GSC binary contains entries with the same names, but
it doesn't contain a full legacy binary, so we need to skip assigning
the dma offset in that case (which we can do by checking the ccs).
Also, since we're now parsing the entries, we can extract the HuC
version that way instead of using hardcoded offsets.
Note that the GSC binary uses the same structures in its binary header,
so they've been added in their own header file.
v2: fix structure names to match meu defines (s/CPT/CPD/), update commit
message, check ccs validity, drop old version location defines.
v3: drop references to the MEU tool to reduce confusion, fix log (John)
v4: fix log for real (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> #v2
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-3-daniele.ceraolospurio@intel.com
Now that each FW has its own reserved area, we can keep them always
pinned and skip the pin/unpin dance on reset. This will make things
easier for the 2-step HuC authentication, which requires the FW to be
pinned in GGTT after the xfer is completed.
Since the vma is now valid for a long time and not just for the quick
pin-load-unpin dance, the name "dummy" is no longer appropriare and has
been replaced with vma_res. All the functions have also been updated to
operate on vma_res for consistency.
Given that we pin the vma behind the allocator's back (which is ok
because we do the pinning in an area that was previously reserved for
thus purpose), we do need to explicitly re-pin on resume because the
automated helper won't cover us.
v2: better comments and commit message, s/dummy/vma_res/
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230531235415.1467475-2-daniele.ceraolospurio@intel.com
Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.
From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity. PAT is expected to work much like MOCS already works today,
and by design userspace is expected to select the index that exactly
matches the desired behavior described in the hardware specification.
For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For kernel objects, cache_level is used
for simplicity and backward compatibility. For Pre-gen12 platforms PAT can
have 1:1 mapping to i915_cache_level, so these two are interchangeable. see
the use of LEGACY_CACHELEVEL.
One consequence of this change is that gen8_pte_encode is no longer working
for gen12 platforms due to the fact that gen12 platforms has different PAT
definitions. In the meantime the mtl_pte_encode introduced specfically for
MTL becomes generic for all gen12 platforms. This patch renames the MTL
PTE encode function into gen12_pte_encode and apply it to all gen12. Even
though this change looks unrelated, but separating them would temporarily
break gen12 PTE encoding, thus squash them in one patch.
Special note: this patch changes the way caching behavior is controlled in
the sense that some objects are left to be managed by userspace. For such
objects we need to be careful not to change the userspace settings.There
are kerneldoc and comments added around obj->cache_coherent, cache_dirty,
and how to bypass the checkings by i915_gem_object_has_cache_level. For
full understanding, these changes need to be looked at together with the
two follow-up patches, one disables the {set|get}_caching ioctl's and the
other adds set_pat extension to the GEM_CREATE uAPI.
Bspec: 63019
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com
It was noticed that duplicate entries in the firmware table could cause
an infinite loop in the firmware loading code if that entry failed to
load. Duplicate entries are a bug anyway and so should never happen.
Ensure they don't by tweaking the table validation code to reject
duplicates.
For full m/m/p files, that can be done by simply tweaking the patch
level check to reject matching values. For reduced version entries,
the filename itself must be compared.
v2: Improve comment (review by Daniele)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-6-John.C.Harrison@Intel.com
The validation of the firmware table was being done inside the code
for scanning the table for the next available firmware blob. Which is
unnecessary. So pull it out into a separate function that is only
called once per blob type at init time.
Also, drop the CONFIG_SELFTEST requirement and make errors terminal.
It was mentioned that potential issues with backports would not be
caught by regular pre-merge CI as that only occurs on tip not stable
branches. Making the validation unconditional and failing driver load
on detecting of a problem ensures that such backports will also be
validated correctly.
This requires adding a firmware global flag to indicate an issue with
any of the per firmware tables. This is done rather than adding a new
state enum as a new enum value would be a much more invasive change -
lots of places would need updating to support the new error state.
Note also that this change means that a table error will cause the
driver to wedge even on platforms that don't require firmware files.
This is intentional as per the above backport concern - someone doing
backports is not guaranteed to test on every platform that they may
potential affect. So forcing a failure on all platforms ensures that
the problem will be noticed and corrected immediately.
v2: Change to unconditionally fail module load on a validation error
(review feedback/discussion with Daniele).
v3: Add a new flag to track table validation errors (review
feedback/discussion with Daniele).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230502234007.1762014-5-John.C.Harrison@Intel.com
When reduced version firmware files were added (matching major
component being the only strict requirement), the minor version was
still tracked and a notification reported if it was older. However,
the patch version should really be tracked as well for the same
reasons. The KMD can work without the change but if the effort has
been taken to release a new firmware with the change then there must
be a valid reason for doing so - important bug fix, security fix, etc.
And in that case it would be good to alert the user if they are
missing out on that new fix.
v2: Use correct patch version number and drop redunant debug print
(review by Daniele / CI results).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230504202252.1104212-2-John.C.Harrison@Intel.com
When HuC is loaded by GSC, there is no header definition for the kernel
to look at and firmware is just handed to GSC. However when reading the
version, it should still check the size of the blob to guarantee it's not
incurring into out-of-bounds array access.
If firmware is smaller than expected, the following message is now
printed:
# echo boom > /lib/firmware/i915/dg2_huc_gsc.bin
# dmesg | grep -i huc
[drm] GT0: HuC firmware i915/dg2_huc_gsc.bin: invalid size: 5 < 184
[drm] *ERROR* GT0: HuC firmware i915/dg2_huc_gsc.bin: fetch failed -ENODATA
...
Even without this change the size, header and signature are still
checked by GSC when loading, so this only avoids the out-of-bounds array
access.
Fixes: a7b516bd98 ("drm/i915/huc: Add fetch support for gsc-loaded HuC binary")
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230413200349.3492571-1-lucas.demarchi@intel.com
The GuC firmware includes an extra version number to specify the
submission API level. So use that rather than the main firmware
version number for submission related checks.
Also, while it is guaranteed that GuC version number components are
only 8-bits in size, other firmwares do not have that restriction. So
stop making assumptions about them generically fitting in a u16
individually, or in a u32 as a combined 8.8.8.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221129232031.3401386-4-John.C.Harrison@Intel.com
It was noticed that the table order verification step was only being
run once rather than once per firmware type. Fix that.
Note that the long term plan is to convert this code to be a mock
selftest. It is already only compiled in when selftests are enabled.
And the work involved in the conversion was estimated to be
non-trivial. So that conversion is currently low on the priority list.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221122233328.854217-1-John.C.Harrison@Intel.com
Our current FW loading process is the same for all FWs:
- Pin FW to GGTT at the start of the ggtt->uc_fw node
- Load the FW
- Unpin
This worked because we didn't have a case where 2 FWs would be loaded on
the same GGTT at the same time. On MTL, however, this can happen if both
GTs are reset at the same time, so we can't pin everything in the same
spot and we need to use separate offset. For simplicity, instead of
calculating the exact required size, we reserve a 2MB slot for each fw.
v2: fail fetch if FW is > 2MBs, improve comments (John)
v3: more comment improvements (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221108020600.3575467-3-daniele.ceraolospurio@intel.com
The current HuC status getparam return values are a bit confusing in
regards to what happens in some scenarios. In particular, most of the
error cases cause the ioctl to return an error, but a couple of them,
INIT_FAIL and LOAD_FAIL, are not explicitly handled and neither is
their expected return value documented; these 2 error cases therefore
end up into the catch-all umbrella of the "HuC not loaded" case, with
this case therefore including both some error scenarios and the load
in progress one.
The updates included in this patch change the handling so that all
error cases behave the same way, i.e. return an errno code, and so
that the HuC load in progress case is unambiguous.
The patch also includes a small change to the FW init path to make sure
we always transition to an error state if something goes wrong.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Tony Ye <tony.ye@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220928004145.745803-14-daniele.ceraolospurio@intel.com
The earlier update to support reduced versioning of firmware files
introduced an issue with the firmware override module parameter. A
self test would specify an invalid file name (invalid meaning not in
the table) both with and without setting the override flag. The
*non-override* case would cause an infinite loop. I.e. a situation
that is impossible to hit outside of the selftest because either the
file name has come from the table in first place or it came from an
override. However, the override case was also broken in that it would
bypass some of the later processing.
The first fix is to update the scanning loop code so that if an
invalid file is passed in, it will exit rather than loop forever. So
if the impossible situation did somehow occur in the future, it
wouldn't be such a big problem.
The second flips the logic on the override early exit to be negative
rather than positive. That way if an explicit override has been set,
then it won't try to scan for backup options (because there is no
point anyway - the user wanted X and if X is not available, that's
their problem). It also means that it won't skip code that still needs
to be run once a valid firmware file has been selected.
v2: Also remove ANSI colour codes that accidentally got left in an
error message in the original patch.
Fixes: 665ae9c9ca ("drm/i915/uc: Support for version reduced and multiple firmware files")
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220914005821.3702446-2-John.C.Harrison@Intel.com
With the move to un-versioned filenames, it becomes more difficult to
know exactly what version of a given firmware is being used. So add
the patch level version number to the debugfs output.
Also, support matching by patch level when selecting code paths for
firmware compatibility. While a patch level change cannot be backwards
breaking, it is potentially possible that a new feature only works
from a given patch level onwards (even though it was theoretically
added in an earlier version that bumped the major or minor version).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906230147.479945-2-daniele.ceraolospurio@intel.com
There was a misunderstanding in how firmware file compatibility should
be managed within i915. This has been clarified as:
i915 must support all existing firmware releases forever
new minor firmware releases should replace prior versions
only backwards compatibility breaking releases should be a new file
This patch cleans up the single fallback file support that was added
as a quick fix emergency effort. That is now removed in preference to
supporting arbitrary numbers of firmware files per platform.
The patch also adds support for having GuC firmware files that are
named by major version only (because the major version indicates
backwards breaking changes that affect the KMD) and for having HuC
firmware files with no version number at all (because the KMD has no
interface requirements with the HuC).
For GuC, the driver will report via dmesg if the found file is older than
expected. For HuC, the KMD will no longer require updating for any new
HuC release so will not be able to report what the latest expected
version is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906230147.479945-1-daniele.ceraolospurio@intel.com
This patch re-introduces support for GuC v69 in parallel to v70. As this
is a quick fix, v69 has been re-introduced as the single "fallback" guc
version in case v70 is not available on disk and only for platforms that
are out of force_probe and require the GuC by default. All v69 specific
code has been labeled as such for easy identification, and the same was
done for all v70 functions for which there is a separate v69 version,
to avoid accidentally calling the wrong version via the unlabeled name.
When the fallback mode kicks in, a drm_notice message is printed in
dmesg to inform the user of the required update. The existing
logging of the fetch function has also been updated so that we no
longer complain immediately if we can't find a fw and we only throw an
error if the fetch of both the base and fallback blobs fails.
The plan is to follow this up with a more complex rework to allow for
multiple different GuC versions to be supported at the same time.
v2: reduce the fallback to platform that require it, switch to
firmware_request_nowarn(), improve logs.
Fixes: 2584b3549f ("drm/i915/guc: Update to GuC version 70.1.1")
Link: https://lists.freedesktop.org/archives/intel-gfx/2022-July/301640.html
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220718230732.1409641-1-daniele.ceraolospurio@intel.com
On newer platforms (starting DG2 G10 B-step and G11 A-step), ownership of
HuC loading has been moved from the GuC to the GSC. As part of the
change, the header format of the HuC binary has been updated and does not
match the GuC anymore. The GSC will perform all the required checks on
the binary size, so we only need to check that the version matches.
Note that since we still haven't added any gsc-loaded FWs, the
loaded_via_gsc variable will always be kept to its initialization value
of zero.
v2: Add a note about loaded_via_gsc being zero (Alan)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220504204816.2082588-3-daniele.ceraolospurio@intel.com