Commit Graph

322 Commits

Author SHA1 Message Date
Rafael J. Wysocki
4649620d94 thermal: core: Make thermal_zone_device_unregister() return after freeing the zone
Make thermal_zone_device_unregister() wait until all of the references
to the given thermal zone object have been dropped and free it before
returning.

This guarantees that when thermal_zone_device_unregister() returns,
there is no leftover activity regarding the thermal zone in question
which is required by some of its callers (for instance, modular driver
code that wants to know when it is safe to let the module go away).

Subsequently, this will allow some confusing device_is_registered()
checks to be dropped from the thermal sysfs and core code.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-11 20:49:53 +01:00
Rafael J. Wysocki
44844db913 thermal: core: Add trip thresholds for trip crossing detection
The trip crossing detection in handle_thermal_trip() does not work
correctly in the cases when a trip point is crossed on the way up and
then the zone temperature stays above its low temperature (that is, its
temperature decreased by its hysteresis).  The trip temperature may
be passed by the zone temperature subsequently in that case, even
multiple times, but that does not count as the trip crossing as long as
the zone temperature does not fall below the trip's low temperature or,
in other words, until the trip is crossed on the way down.

|-----------low--------high------------|
             |<--------->|
             |    hyst   |
             |           |
             |          -|--> crossed on the way up
             |
         <---|-- crossed on the way down

However, handle_thermal_trip() will invoke thermal_notify_tz_trip_up()
every time the trip temperature is passed by the zone temperature on
the way up regardless of whether or not the trip has been crossed on
the way down yet.  Moreover, it will not call thermal_notify_tz_trip_down()
if the last zone temperature was between the trip's temperature and its
low temperature, so some "trip crossed on the way down" events may not
be reported.

To address this issue, introduce trip thresholds equal to either the
temperature of the given trip, or its low temperature, such that if
the trip's threshold is passed by the zone temperature on the way up,
its value will be set to the trip's low temperature and
thermal_notify_tz_trip_up() will be called, and if the trip's threshold
is passed by the zone temperature on the way down, its value will be set
to the trip's temperature (high) and thermal_notify_tz_trip_down() will
be called.  Accordingly, if the threshold is passed on the way up, it
cannot be passed on the way up again until its passed on the way down
and if it is passed on the way down, it cannot be passed on the way down
again until it is passed on the way up which guarantees correct
triggering of trip crossing notifications.

If the last temperature of the zone is invalid, the trip's threshold
will be set depending of the zone's current temperature: If that
temperature is above the trip's temperature, its threshold will be
set to its low temperature or otherwise its threshold will be set to
its (high) temperature.  Because the zone temperature is initially
set to invalid and tz->last_temperature is only updated by
update_temperature(), this is sufficient to set the correct initial
threshold values for all trips.

Link: https://lore.kernel.org/all/20220718145038.1114379-4-daniel.lezcano@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-20 16:59:55 +01:00
Rafael J. Wysocki
8c35b1f472 thermal: core: Pass trip pointer to governor throttle callback
Modify the governor .throttle() callback definition so that it takes a
trip pointer instead of a trip index as its second argument, adjust the
governors accordingly and update the core code invoking .throttle().

This causes the governors to become independent of the representation
of the list of trips in the thermal zone structure.

This change is not expected to alter the general functionality.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-20 19:26:37 +02:00
Rafael J. Wysocki
a26b452e83 Merge branch 'acpi-thermal'
The ACPI thermal driver changes include some thermal core modifications
that are depended on by subsequent thermal core changes, so merge them.

* acpi-thermal: (26 commits)
  thermal: trip: Drop lockdep assertion from thermal_zone_trip_id()
  thermal: trip: Remove lockdep assertion from for_each_thermal_trip()
  thermal: core: Drop thermal_zone_device_exec()
  ACPI: thermal: Use thermal_zone_for_each_trip() for updating trips
  ACPI: thermal: Combine passive and active trip update functions
  ACPI: thermal: Move get_active_temp()
  ACPI: thermal: Fix up function header formatting in two places
  ACPI: thermal: Drop list of device ACPI handles from struct acpi_thermal
  ACPI: thermal: Rename structure fields holding temperature in deci-Kelvin
  ACPI: thermal: Drop critical_valid and hot_valid trip flags
  ACPI: thermal: Do not use trip indices for cooling device binding
  ACPI: thermal: Mark uninitialized active trips as invalid
  ACPI: thermal: Merge trip initialization functions
  ACPI: thermal: Collapse trip devices update function wrappers
  ACPI: thermal: Collapse trip devices update functions
  ACPI: thermal: Add device list to struct acpi_thermal_trip
  ACPI: thermal: Fix a small leak in acpi_thermal_add()
  ACPI: thermal: Drop valid flag from struct acpi_thermal_trip
  ACPI: thermal: Drop redundant trip point flags
  ACPI: thermal: Untangle initialization and updates of active trips
  ...
2023-10-11 17:56:51 +02:00
Dan Carpenter
c99626092e thermal: core: prevent potential string overflow
The dev->id value comes from ida_alloc() so it's a number between zero
and INT_MAX.  If it's too high then these sprintf()s will overflow.

Fixes: 203d3d4aa4 ("the generic thermal sysfs driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-11 16:15:09 +02:00
Rafael J. Wysocki
4963e34ce7 thermal: core: Drop thermal_zone_device_exec()
Because thermal_zone_device_exec() has no users any more and there are
no plans to use it anywhere, revert commit 9a99a996d1 ("thermal: core:
Introduce thermal_zone_device_exec()") that introduced it.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-05 13:32:55 +02:00
Rafael J. Wysocki
d069ed6b75 thermal: core: Allow trip pointers to be used for cooling device binding
Add new helper functions, thermal_bind_cdev_to_trip() and
thermal_unbind_cdev_from_trip(), to allow a trip pointer to be used for
binding a cooling device to a trip point and unbinding it, respectively,
and redefine the existing helpers, thermal_zone_bind_cooling_device()
and thermal_zone_unbind_cooling_device(), as wrappers around the new
ones, respectively.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-28 12:57:10 +02:00
Rafael J. Wysocki
2c7b4bfade thermal: core: Store trip pointer in struct thermal_instance
Replace the integer trip number stored in struct thermal_instance with
a pointer to the relevant trip and adjust the code using the structure
in question accordingly.

The main reason for making this change is to allow the trip point to
cooling device binding code more straightforward, as illustrated by
subsequent modifications of the ACPI thermal driver, but it also helps
to clarify the overall design and allows the governor code overhead to
be reduced (through subsequent modifications).

The only case in which it adds complexity is trip_point_show() that
needs to walk the trips[] table to find the index of the given trip
point, but this is not a critical path and the interface that
trip_point_show() belongs to is problematic anyway (for instance, it
doesn't cover the case when the same cooling devices is associated
with multiple trip points).

This is a preliminary change and the affected code will be refined by
a series of subsequent modifications of thermal governors, the core and
the ACPI thermal driver.

The general functionality is not expected to be affected by this change.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-28 12:55:29 +02:00
Rafael J. Wysocki
9502108876 thermal: core: Drop trips_disabled bitmask
After recent changes, thermal_zone_get_trip() cannot fail, as invoked
from thermal_zone_device_register_with_trips(), so the only role of
the trips_disabled bitmask is struct thermal_zone_device is to make
handle_thermal_trip() skip trip points whose temperature was initially
zero.  However, since the unit of temperature in the thermal core is
millicelsius, zero may very well be a valid temperature value at least
in some usage scenarios and the trip temperature may as well change
later.  Thus there is no reason to permanently disable trip points
with initial temperature equal to zero.

Accordingly, drop the trips_disabled bitmask along with the code
related to it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-25 11:46:19 +02:00
Rafael J. Wysocki
fb2c10245f thermal: core: Fix disabled trip point check in handle_thermal_trip()
Commit bc840ea5f9 ("thermal: core: Do not handle trip points with
invalid temperature") added a check for invalid temperature to the
disabled trip point check in handle_thermal_trip(), but that check was
added at a point when the trip structure has not been initialized yet.

This may cause handle_thermal_trip() to skip a valid trip point in some
cases, so fix it by moving the check to a suitable place, after
__thermal_zone_get_trip() has been called to populate the trip
structure.

Fixes: bc840ea5f9 ("thermal: core: Do not handle trip points with invalid temperature")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:51:49 +02:00
Rafael J. Wysocki
edd220b33f thermal: core: Drop thermal_zone_device_register()
There are no more users of thermal_zone_device_register(), so drop it
from the core.

Note that thermal_zone_device_register_with_trips() may be renamed to
thermal_zone_device_register() in the future, but only after a grace
period allowing all of the possible work in progress that may be using
the latter to adjust.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-05 21:42:18 +02:00
Rafael J. Wysocki
d332db8fc1 thermal: core: Add function for registering tripless thermal zones
Multiple callers of thermal_zone_device_register() don't pass any trips
to it and they might use a shortened argument list for that, so add
a special function with fewer arguments for this purpose.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-05 21:42:18 +02:00
Rafael J. Wysocki
35d8dbbb25 thermal: core: Drop unused .get_trip_*() callbacks
After recent changes in the ACPI thermal driver and in the Intel DTS
IOSF thermal driver, all thermal zone drivers are expected to use trip
tables for initialization and none of them should implement
.get_trip_type(), .get_trip_temp() or .get_trip_hyst() callbacks, so
drop these callbacks entirely from the core.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-29 20:46:31 +02:00
Rafael J. Wysocki
0c2ec0f165 Merge branch 'acpi-thermal'
Merge ACPI thermal driver changes for 6.6-rc1:

 - Drop non-functional nocrt parameter from ACPI thermal (Mario
   Limonciello).

 - Clean up the ACPI thermal driver, rework the handling of firmware
   notifications in it and make it provide a table of generic trip point
   structures to the core during initialization (Rafael Wysocki).

* acpi-thermal:
  ACPI: thermal: Eliminate code duplication from acpi_thermal_notify()
  ACPI: thermal: Drop unnecessary thermal zone callbacks
  ACPI: thermal: Rework thermal_get_trend()
  ACPI: thermal: Use trip point table to register thermal zones
  thermal: core: Rework and rename __for_each_thermal_trip()
  ACPI: thermal: Introduce struct acpi_thermal_trip
  ACPI: thermal: Carry out trip point updates under zone lock
  ACPI: thermal: Clean up acpi_thermal_register_thermal_zone()
  thermal: core: Add priv pointer to struct thermal_trip
  thermal: core: Introduce thermal_zone_device_exec()
  thermal: core: Do not handle trip points with invalid temperature
  ACPI: thermal: Drop redundant local variable from acpi_thermal_resume()
  ACPI: thermal: Do not attach private data to ACPI handles
  ACPI: thermal: Drop enabled flag from struct acpi_thermal_active
  ACPI: thermal: Drop nocrt parameter
2023-08-25 20:44:26 +02:00
Rafael J. Wysocki
9a99a996d1 thermal: core: Introduce thermal_zone_device_exec()
Introduce a new helper function, thermal_zone_device_exec(), that can
be used by drivers to run a given callback routine under the zone lock.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-17 11:23:32 +02:00
Rafael J. Wysocki
bc840ea5f9 thermal: core: Do not handle trip points with invalid temperature
Trip points with temperature set to THERMAL_TEMP_INVALID are as good as
disabled, so make handle_thermal_trip() ignore them.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-08-10 20:57:35 +02:00
Ahmad Fatoum
80ddce5f2d thermal: core: constify params in thermal_zone_device_register
Since commit 3d439b1a2a ("thermal/core: Alloc-copy-free the thermal zone
parameters structure"), thermal_zone_device_register() allocates a copy
of the tzp argument and callers need not explicitly manage its lifetime.

This means the function no longer cares about the parameter being
mutable, so constify it.

No functional change.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-07-24 09:51:31 +02:00
Daniel Lezcano
7cefbaf081 thermal: core: Encapsulate tz->device field
There are still some drivers needing to play with the thermal zone
device internals. That is not the best but until we can figure out if
the information is really needed, let's encapsulate the field used in
the thermal zone device structure, so we can move forward relocating
the thermal zone device structure definition in the thermal framework
private headers.

Some drivers are accessing tz->device, that implies they need to have
the knowledge of the thermal_zone_device structure but we want to
self-encapsulate this structure and reduce the scope of the structure
to the thermal core only.

By adding this wrapper, these drivers won't need the thermal zone
device structure definition and are no longer an obstacle to its
relocation to the private thermal core headers.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-27 19:20:12 +02:00
Daniel Lezcano
3d439b1a2a thermal/core: Alloc-copy-free the thermal zone parameters structure
The caller of the function thermal_zone_device_register_with_trips()
can pass a thermal_zone_params structure parameter.

This one is used by the thermal core code until the thermal zone is
destroyed. That forces the caller, so the driver, to keep the pointer
valid until it unregisters the thermal zone if we want to make the
thermal zone device structure private the core code.

As the thermal zone device structure would be private, the driver can
not access to thermal zone device structure to retrieve the tzp field
after it passed it to register the thermal zone.

So instead of forcing the users of the function to deal with the tzp
structure life cycle, make the usage easier by allocating our own
thermal zone params, copying the parameter content and by freeing at
unregister time. The user can then create the parameters on the stack,
pass it to the registering function and forget about it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230404075138.2914680-3-daniel.lezcano@linaro.org
2023-04-07 18:36:28 +02:00
Zhang Rui
ded2d383b1 thermal/core: Remove thermal_bind_params structure
Remove struct thermal_bind_params because no one is using it for thermal
binding now.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230330104526.3196-1-rui.zhang@intel.com
2023-04-07 11:18:22 +02:00
Rafael J. Wysocki
75f74a9071 Merge tag 'thermal-v6.4-rc1-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal control material for 6.4-rc1 from Daniel Lezcano:

"- Add more thermal zone device encapsulation: prevent setting
   structure field directly, access the sensor device instead the
   thermal zone's device for trace, relocate the traces in
   drivers/thermal (Daniel Lezcano)

 - Use the generic trip point for the i.MX and remove the get_trip_temp
   ops (Daniel Lezcano)

 - Use the devm_platform_ioremap_resource() in the Hisilicon driver
   (Yang Li)

 - Remove R-Car H3 ES1.* handling as public has only access to the ES2
   version and the upstream support for the ES1 has been shutdown (Wolfram
   Sang)

 - Add a delay after initializing the bank in order to let the time to
   the hardware to initialze itself before reading the temperature
   (Amjad Ouled-Ameur)

 - Add MT8365 support (Amjad Ouled-Ameur)"

* tag 'thermal-v6.4-rc1-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/ti: Use fixed update interval
  thermal/drivers/stm: Don't set no_hwmon to false
  thermal/drivers/db8500: Use driver dev instead of tz->device
  thermal/core: Relocate the traces definition in thermal directory
  thermal/drivers/hisi: Use devm_platform_ioremap_resource()
  thermal/drivers/imx: Use the thermal framework for the trip point
  thermal/drivers/imx: Remove get_trip_temp ops
  thermal/drivers/rcar_gen3_thermal: Remove R-Car H3 ES1.* handling
  thermal/drivers/mediatek: Add delay after thermal banks initialization
  thermal/drivers/mediatek: Add support for MT8365 SoC
  thermal/drivers/mediatek: Control buffer enablement tweaks
  dt-bindings: thermal: mediatek: Add binding documentation for MT8365 SoC
2023-04-03 20:43:32 +02:00
Rafael J. Wysocki
cd246fa969 thermal: core: Clean up thermal_list_lock locking
Once thermal_list_lock has been acquired in
__thermal_cooling_device_register(), it is not necessary to drop it
and take it again until all of the thermal zones have been updated,
so change the code accordingly.

No expected functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-03 20:40:21 +02:00
Daniel Lezcano
32a7a02117 thermal/core: Relocate the traces definition in thermal directory
The traces are exported but only local to the thermal core code. On
the other side, the traces take the thermal zone device structure as
argument, thus they have to rely on the exported thermal.h header
file. As we want to move the structure to the private thermal core
header, first we have to relocate those traces to the same place as
many drivers do.

Cc: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20230307133735.90772-2-daniel.lezcano@linaro.org
2023-04-01 20:51:45 +02:00
Rafael J. Wysocki
ce07727aff Merge back thermal control material for 6.4-rc1. 2023-03-27 13:46:13 +02:00
Rafael J. Wysocki
6babf38d89 Merge branch 'thermal-acpi'
Merge a fix for a recent thermal-related regression in the ACPI
processor driver.

* thermal-acpi:
  ACPI: processor: thermal: Update CPU cooling devices on cpufreq policy changes
  thermal: core: Introduce thermal_cooling_device_update()
  thermal: core: Introduce thermal_cooling_device_present()
  ACPI: processor: Reorder acpi_processor_driver_init()
2023-03-24 17:11:27 +01:00
Ido Schimmel
f1b80a3878 thermal: core: Restore behavior regarding invalid trip points
Commit 7c3d5c20dc ("thermal/core: Add a generic thermal_zone_get_trip()
function") stopped marking trip points with a zero temperature as
disabled, behavior that was originally introduced in commit 81ad4276b5
("Thermal: Ignore invalid trip points").

When using the mlxsw driver we see that when such trip points are not
disabled, the thermal subsystem repeatedly tries to set the state of the
associated cooling devices to the maximum state.

Address this by restoring the original behavior and mark trip points
with a zero temperature as disabled.

Fixes: 7c3d5c20dc ("thermal/core: Add a generic thermal_zone_get_trip() function")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-22 19:59:08 +01:00
Rafael J. Wysocki
790930f442 thermal: core: Introduce thermal_cooling_device_update()
Introduce a core thermal API function, thermal_cooling_device_update(),
for updating the max_state value for a cooling device and rearranging
its statistics in sysfs after a possible change of its ->get_max_state()
callback return value.

That callback is now invoked only once, during cooling device
registration, to populate the max_state field in the cooling device
object, so if its return value changes, it needs to be invoked again
and the new return value needs to be stored as max_state.  Moreover,
the statistics presented in sysfs need to be rearranged in general,
because there may not be enough room in them to store data for all
of the possible states (in the case when max_state grows).

The new function takes care of that (and some other minor things
related to it), but some extra locking and lockdep annotations are
added in several places too to protect against crashes in the cases
when the statistics are not present or when a stale max_state value
might be used by sysfs attributes.

Note that the actual user of the new function will be added separately.

Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
2023-03-22 15:20:38 +01:00
Rafael J. Wysocki
c43198af05 thermal: core: Introduce thermal_cooling_device_present()
Introduce a helper function, thermal_cooling_device_present(), for
checking if the given cooling device is in the list of registered
cooling devices to avoid some code duplication in a subsequent
patch.

No expected functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
2023-03-22 15:20:38 +01:00
Daniel Lezcano
3034f859b9 thermal: Add a thermal zone id accessor
In order to get the thermal zone id but without directly accessing the
thermal zone device structure, add an accessor.

Use the accessor in the hwmon_scmi and acpi_thermal.

No functional change intented.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-03 20:45:02 +01:00
Daniel Lezcano
072e35c988 thermal/core: Add thermal_zone_device structure 'type' accessor
The thermal zone device structure is exposed via the exported
thermal.h header. This structure should stay private the thermal core
code. In order to encapsulate the structure, let's add an accessor to
get the 'type' of the thermal zone.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-03 20:45:02 +01:00
Daniel Lezcano
a6ff3c0021 thermal/core: Add a thermal zone 'devdata' accessor
The thermal zone device structure is exposed to the different drivers
and obviously they access the internals while that should be
restricted to the core thermal code.

In order to self-encapsulate the thermal core code, we need to prevent
the drivers accessing directly the thermal zone structure and provide
accessor functions to deal with.

Provide an accessor to the 'devdata' structure and make use of it in
the different drivers.

No functional changes intended.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-03 20:45:02 +01:00
ye xingchen
5bbafd4362 thermal: core: Use sysfs_emit_at() instead of scnprintf()
Follow the advice in Documentation/filesystems/sysfs.rst that show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 20:39:48 +01:00
Rafael J. Wysocki
9e0a9be24b thermal: Fail object registration if thermal class is not registered
If thermal_class is not registered with the driver core, there is no way
to expose the interfaces used by the thermal control framework, so
prevent thermal zones and cooling devices from being registered in
that case by returning an error from object registration functions.

For this purpose, use a thermal_class pointer that will be NULL if the
class is not registered.  To avoid wasting memory in that case, allocate
the thermal class object dynamically and if it fails to register, free
it and clear the thermal_class pointer to NULL.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-25 16:51:19 +01:00
Daniel Lezcano
5b8de18ee9 thermal/core: Move the thermal trip code to a dedicated file
The thermal_core.c files contains a lot of functions handling
different thermal components like the governors, the trip points, the
cooling device, the OF cooling device, etc ...

This organization does not help to migrate to a more sane code where
there is a better self-encapsulation as all the components' internals
can be directly accessed from a single file.

For the sake of clarity, let's move the thermal trip points code in a
dedicated thermal_trip.c file and add a function to browse all the
trip points like we do with the thermal zones, the govenors and the
cooling devices.

The same can be done for the cooling devices and the governor code but
that will come later as the current work in the thermal framework is
to fix the trip point handling and use a generic trip point structure.

No functional changes intended.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-25 16:40:39 +01:00
Daniel Lezcano
b57d62862d thermal/core: Remove unneeded ida_destroy()
As per documentation for the ida_destroy() function: "If the IDA is
already empty, there is no need to call this function."

The thermal framework is in the init sequence, so the ida was not yet
used and consequently it is empty in case of error.

There is no need to call ida_destroy(), let's remove the calls.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-25 16:40:39 +01:00
Daniel Lezcano
58d1c9fd0e thermal/core: Fix unregistering netlink at thermal init time
The thermal subsystem initialization miss an netlink unregistering
function in the error. Add it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-25 16:40:39 +01:00
Viresh Kumar
47e3f00074 thermal: core: Use device_unregister() instead of device_del/put()
Lets not open code device_unregister() unnecessarily.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-24 20:22:55 +01:00
Viresh Kumar
e398421fd0 thermal: core: Move cdev cleanup to thermal_release()
thermal_release() already frees cdev, let it do rest of the cleanup as
well in order to simplify the error paths in
__thermal_cooling_device_register().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-24 20:21:49 +01:00
Rafael J. Wysocki
a2c81dc59d Merge back thermal control material for 6.3. 2023-01-23 18:52:53 +01:00
Viresh Kumar
6c54b7bc8a thermal: core: call put_device() only after device_register() fails
put_device() shouldn't be called before a prior call to
device_register(). __thermal_cooling_device_register() doesn't follow
that properly and needs fixing. Also
thermal_cooling_device_destroy_sysfs() is getting called unnecessarily
on few error paths.

Fix all this by placing the calls at the right place.

Based on initial work done by Caleb Connolly.

Fixes: 4748f9687c ("thermal: core: fix some possible name leaks in error paths")
Fixes: c408b3d1d9 ("thermal: Validate new state in cur_state_store()")
Reported-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Frank Rowand <frowand.list@gmail.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Tested-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-19 21:06:41 +01:00
Johan Hovold
e6ec64f852 thermal/drivers/qcom: Fix set_trip_temp() deadlock
The set_trip_temp() callback is used when changing the trip temperature
through sysfs. As it is called with the thermal-zone-device lock held
it must not use thermal_zone_get_trip() directly or it will deadlock.

Fixes: 78c3e2429be8 ("thermal/drivers/qcom: Use generic thermal_zone_get_trip() function")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20221214131617.2447-2-johan+linaro@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-01-06 14:14:48 +01:00
Daniel Lezcano
2e38a2a981 thermal/core: Add a generic thermal_zone_set_trip() function
The thermal zone ops defines a set_trip callback where we can invoke
the backend driver to set an interrupt for the next trip point
temperature being crossed the way up or down, or setting the low level
with the hysteresis.

The ops is only called from the thermal sysfs code where the userspace
has the ability to modify a trip point characteristic.

With the effort of encapsulating the thermal framework core code,
let's create a thermal_zone_set_trip() which is the writable side of
the thermal_zone_get_trip() and put there all the ops encapsulation.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20221003092602.1323944-4-daniel.lezcano@linaro.org
2023-01-06 14:14:47 +01:00
Daniel Lezcano
7c3d5c20dc thermal/core: Add a generic thermal_zone_get_trip() function
The thermal_zone_device_ops structure defines a set of ops family,
get_trip_temp(), get_trip_hyst(), get_trip_type(). Each of them is
returning a property of a trip point.

The result is the code is calling the ops everywhere to get a trip
point which is supposed to be defined in the backend driver. It is a
non-sense as a thermal trip can be generic and used by the backend
driver to declare its trip points.

Part of the thermal framework has been changed and all the OF thermal
drivers are using the same definition for the trip point and use a
thermal zone registration variant to pass those trip points which are
part of the thermal zone device structure.

Consequently, we can use a generic function to get the trip points
when they are stored in the thermal zone device structure.

This approach can be generalized to all the drivers and we can get rid
of the ops->get_trip_*. That will result to a much more simpler code
and make possible to rework how the thermal trip are handled in the
thermal core framework as discussed previously.

This change adds a function thermal_zone_get_trip() where we get the
thermal trip point structure which contains all the properties (type,
temp, hyst) instead of doing multiple calls to ops->get_trip_*.

That opens the door for trip point extension with more attributes. For
instance, replacing the trip points disabled bitmask with a 'disabled'
field in the structure.

Here we replace all the calls to ops->get_trip_* in the thermal core
code with a call to the thermal_zone_get_trip() function.

The thermal zone ops defines a callback to retrieve the critical
temperature. As the trip handling is being reworked, all the trip
points will be the same whatever the driver and consequently finding
the critical trip temperature will be just a loop to search for a
critical trip point type.

Provide such a generic function, so we encapsulate the ops
get_crit_temp() which can be removed when all the backend drivers are
using the generic trip points handling.

While at it, add the thermal_zone_get_num_trips() to encapsulate the
code more and reduce the grip with the thermal framework internals.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20221003092602.1323944-2-daniel.lezcano@linaro.org
2023-01-06 14:14:47 +01:00
Yang Yingliang
4748f9687c thermal: core: fix some possible name leaks in error paths
In some error paths before device_register(), the names allocated
by dev_set_name() are not freed. Move dev_set_name() front to
device_register(), so the name can be freed while calling
put_device().

Fixes: 1dd7128b83 ("thermal/core: Fix null pointer dereference in thermal_release()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-25 19:51:41 +01:00
Guenter Roeck
b778b4d782 thermal/core: Protect thermal device operations against thermal device removal
Thermal device operations may be called after thermal zone device removal.
After thermal zone device removal, thermal zone device operations must
no longer be called. To prevent such calls from happening, ensure that
the thermal device is registered before executing any thermal device
operations.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck
05eeee2b51 thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
Protect access to thermal operations against thermal zone removal by
acquiring the thermal zone device mutex. After acquiring the mutex, check
if the thermal zone device is registered and abort the operation if not.

With this change, we can call __thermal_zone_device_update() instead of
thermal_zone_device_update() from trip_point_temp_store() and from
emul_temp_store(). Similar, we can call __thermal_zone_set_trips() instead
of thermal_zone_set_trips() from trip_point_hyst_store().

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck
1c439dec35 thermal/core: Introduce locked version of thermal_zone_device_update
In thermal_zone_device_set_mode(), the thermal zone mutex is released only
to be reacquired in the subsequent call to thermal_zone_device_update().

Introduce __thermal_zone_device_update(), which is similar to
thermal_zone_device_update() but has to be called with the thermal device
mutex held. Call the new function from thermal_zone_device_set_mode()
to avoid the extra thermal device mutex release/acquire sequence in that
function.

With the new function in place, re-implement thermal_zone_device_update()
as wrapper around __thermal_zone_device_update() to acquire and release
the thermal device mutex.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck
30b2ae07d3 thermal/core: Delete device under thermal device zone lock
Thermal device attributes may still be opened after unregistering
the thermal zone and deleting the thermal device.

Currently there is no protection against accessing thermal device
operations after unregistering a thermal zone. To enable adding
such protection, protect the device delete operation with the
thermal zone device mutex. This requires splitting the call to
device_unregister() into its components, device_del() and put_device().
Only the first call can be executed under mutex protection, since
put_device() may result in releasing the thermal zone device memory.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Guenter Roeck
d35f29ed9d thermal/core: Destroy thermal zone device mutex in release function
Accesses to thermal zones, and with it the thermal zone device mutex,
are still possible after the thermal zone device has been unregistered.
For example, thermal_zone_get_temp() can be called from temp_show()
in thermal_sysfs.c if the sysfs attribute was opened before the thermal
device was unregistered.

Move the call to mutex_destroy from thermal_zone_device_unregister()
to thermal_release() to ensure that it is only destroyed after it is
guaranteed to be no longer accessed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:04:37 +01:00
Dan Carpenter
e49a1e1ee0 thermal/core: fix error code in __thermal_cooling_device_register()
Return an error pointer if ->get_max_state() fails.  The current code
returns NULL which will cause an oops in the callers.

Fixes: c408b3d1d9 ("thermal: Validate new state in cur_state_store()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-28 19:59:51 +02:00