amdgpu >=4.11 ioctl auth/permission problems (Vulkan, OpenCL, Xwayland, OpenMW-on-Wayland) #33

Closed
valpackett opened this issue Feb 26, 2018 · 45 comments
Labels
4.15 amdgpu bug feedback requested Feedback has been requested from submitter

Comments

@valpackett
Contributor

valpackett commented Feb 26, 2018

$ vulkaninfo
===========
VULKAN INFO
===========

Vulkan API Version: 1.0.65


Instance Extensions:
====================
[…]
amdgpu_device_initialize: amdgpu_get_auth (1) failed (-1)
amdgpu_device_initialize: amdgpu_get_auth (1) failed (-1)
amdgpu_device_initialize: amdgpu_get_auth (1) failed (-1)
/usr/ports/graphics/vulkan-sdk/work/Vulkan-LoaderAndValidationLayers-sdk-1.0.65.1/demos/vulkaninfo.c:1670: failed with VK_ERROR_INITIALIZATION_FAILED
@valpackett changed the title from "amdgpu 4.11 broke Xwayland and Vulkan" to "amdgpu 4.11 broke Vulkan" on Feb 26, 2018
@iotamudelta
Member

Interesting. I think this is probably related to getting the wrong device name from the fd. To be more specific, I think the fd returns the /dev/dri prefix where it shouldn't (or vice versa).

@johalun can you check this?

@johalun
Member

johalun commented Feb 26, 2018

Where can I find vulkaninfo? (and how do I get the rest of vulkan installed?)

@johalun
Member

johalun commented Feb 26, 2018

Maybe you can enable drm debug and paste a dmesg output?
sysctl dev.drm.drm_debug=-1

@valpackett
Contributor Author

[drm:drm_ioctl] pid=101407, dev=0xe280, auth=0, DRM_IOCTL_GET_CLIENT
[drm:drm_ioctl_permit] unlikely(!(flags & DRM_RENDER_ALLOW) && drm_is_render_client(file_priv)))[drm:drm_ioctl] ret = -13

full dmesg: https://gist.github.com/myfreeweb/261f9cfd6002f4cf024553a08ed9434d
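
The second trace line names the check that rejected the call: the ioctl's flags lack DRM_RENDER_ALLOW while the file descriptor belongs to a render client. A minimal user-space sketch of that one check, for illustration only; the flag values and the `permit_render` helper are assumptions made for this example, not the kernel's definitions:

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits; the real values live in the DRM headers. */
#define DRM_UNLOCKED      0x10u
#define DRM_RENDER_ALLOW  0x20u

/*
 * Hypothetical stand-in for the render-client condition printed by
 * drm_ioctl_permit in the trace: a render client may only call ioctls
 * whose table entry carries DRM_RENDER_ALLOW.
 */
int permit_render(unsigned int flags, bool is_render_client)
{
    if (!(flags & DRM_RENDER_ALLOW) && is_render_client)
        return -EACCES; /* reported as ret = -13 in the trace */
    return 0;
}
```

With these definitions, `permit_render(DRM_UNLOCKED, true)` returns `-EACCES`, while `permit_render(DRM_UNLOCKED | DRM_RENDER_ALLOW, true)` returns 0.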

@valpackett
Contributor Author

also I broke Xwayland and for some reason couldn't fix it even by downgrading to previous kernel and 4.9 drm-next-kmod o_0

And the error there is also -13 (EACCES), but in amdgpu_query_info(ACCEL_WORKING)

@johalun
Member

johalun commented Feb 26, 2018

Thanks! I remember doing something with DRM_RENDER_ALLOW involved. Will check it out tomorrow (evening here now).

@johalun
Member

johalun commented Feb 27, 2018

-       DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED),

Here's one change that might affect this. However, DRM_RENDER_ALLOW hasn't been there since at least v4.0 (maybe never) upstream... It has been added on FreeBSD for some reason..

@johalun
Member

johalun commented Feb 27, 2018

But why does only Vulkan break? What does Vulkan do differently from other clients?

@valpackett
Contributor Author

This affects OpenCL too:

$ LD_PRELOAD=/lib/libthr.so.3 clinfo
amdgpu_device_initialize: amdgpu_get_auth (1) failed (-1)
amdgpu: amdgpu_device_initialize failed.
do_winsys_init: DRM version is 3.10.0 but this driver is only compatible with 2.12.0 (kernel 3.2) or later.
[…]
Number of devices                                 0

@valpackett
Contributor Author

Yep, adding DRM_RENDER_ALLOW fixed it!

@valpackett
Contributor Author

So!!! Two more problems turned out to be ioctl authentication/permission problems!

  • Xwayland not showing anything because of not detecting DRI3 (did work fine as root, turns out)
  • OpenMW NULL surface segfault under Wayland

Debug traces:

[drm:drm_ioctl] pid=101807, dev=0xe200, auth=0, AMDGPU_INFO
[drm:drm_ioctl_permit] unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) && !file_priv->authenticated))[drm:drm_ioctl] ret = -13
[drm:drm_ioctl] pid=101807, dev=0xe200, auth=0, DRM_IOCTL_PRIME_HANDLE_TO_FD
[drm:drm_ioctl_permit] unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) && !file_priv->authenticated))[drm:drm_ioctl] ret = -13
[drm:drm_ioctl] [drm:drm_ioctl] pid=101726, dev=0xe200, auth=0, AMDGPU_GEM_CREATE
pid=100801, dev=0xe200, auth=0, DRM_IOCTL_PRIME_FD_TO_HANDLE
[drm:drm_ioctl_permit] [drm:drm_ioctl] unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) && !file_priv->authenticated))pid=101726, dev=0xe200, auth=0, AMDGPU_GEM_VA
[drm:drm_ioctl] ret = -13

The interesting thing is that the other calls already have DRM_RENDER_ALLOW!

Here's my current workaround patch, which just removes DRM_AUTH (lol probably bad for security):

--- amd/amdgpu/amdgpu_kms.c.orig        2018-03-06 19:36:21 UTC
+++ amd/amdgpu/amdgpu_kms.c
@@ -882,20 +882,20 @@ int amdgpu_get_vblank_timestamp_kms(struct drm_device 
 }
 
 const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, DRM_RENDER_ALLOW),
        /* KMS */
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_WAIT_IDLE, amdgpu_gem_wait_idle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_CS, amdgpu_cs_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_INFO, amdgpu_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_CS, amdgpu_cs_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, amdgpu_cs_wait_fences_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_METADATA, amdgpu_gem_metadata_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_WAIT_IDLE, amdgpu_gem_wait_idle_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_CS, amdgpu_cs_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_INFO, amdgpu_info_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_CS, amdgpu_cs_wait_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, amdgpu_cs_wait_fences_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_METADATA, amdgpu_gem_metadata_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_RENDER_ALLOW),
 };
 const int amdgpu_max_kms_ioctl = ARRAY_SIZE(amdgpu_ioctls_kms);
 
--- drm/drm_ioctl.c.orig        2018-03-06 17:16:42 UTC
+++ drm/drm_ioctl.c
@@ -551,7 +551,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
        DRM_IOCTL_DEF(DRM_IOCTL_GET_MAGIC, drm_getmagic, DRM_UNLOCKED),
        DRM_IOCTL_DEF(DRM_IOCTL_IRQ_BUSID, drm_irq_by_busid, DRM_MASTER|DRM_ROOT_ONLY),
        DRM_IOCTL_DEF(DRM_IOCTL_GET_MAP, drm_legacy_getmap_ioctl, DRM_UNLOCKED),
-       DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED),
+       DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED|DRM_RENDER_ALLOW),
        DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, DRM_UNLOCKED),
        DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_UNLOCKED|DRM_RENDER_ALLOW),
        DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, DRM_UNLOCKED),
@@ -622,8 +622,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 
        DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, DRM_CONTROL_ALLOW|DRM_UNLOCKED),
 
-       DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
-       DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 
        DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, DRM_CONTROL_ALLOW|DRM_UNLOCKED),
        DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, DRM_CONTROL_ALLOW|DRM_UNLOCKED),
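
One way to read the traces above: the condition printed is the DRM_AUTH branch, which exempts only render clients. If the fd is not treated as a render client and was never authenticated, the presence of DRM_RENDER_ALLOW makes no difference. A small user-space sketch of that condition, with illustrative flag values and a hypothetical `struct client` standing in for `file_priv`:

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits; the real values live in the DRM headers. */
#define DRM_AUTH          0x01u
#define DRM_RENDER_ALLOW  0x20u

/* Hypothetical stand-in for the parts of file_priv the check reads. */
struct client {
    bool is_render_client;
    bool authenticated;
};

/* Mirrors the condition printed in the debug trace. */
int permit_auth(unsigned int flags, const struct client *c)
{
    if ((flags & DRM_AUTH) && !c->is_render_client && !c->authenticated)
        return -EACCES;
    return 0;
}
```

For a client with both fields false, `DRM_AUTH | DRM_RENDER_ALLOW` yields `-EACCES`, and `DRM_RENDER_ALLOW` alone yields 0. That is the effect of dropping DRM_AUTH in the patch above, and also why it weakens the authentication requirement.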

@valpackett changed the title from "amdgpu 4.11 broke Vulkan" to "amdgpu 4.11 ioctl auth/permission problems (Vulkan, OpenCL, Xwayland, OpenMW-on-Wayland)" on Mar 6, 2018
@johalun
Member

johalun commented Mar 6, 2018

@myfreeweb Does this make OpenCL work too? The DRM code matches upstream, so the problem should be elsewhere. Thanks for narrowing it down :)

@valpackett
Contributor Author

Yep, it does!

$ LD_PRELOAD=/lib/libthr.so.3 clinfo
Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 18.1.0-devel
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD Radeon (TM) RX 480 Graphics (POLARIS10 / DRM 3.10.0 / 12.0-CURRENT, LLVM 5.0.1)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 18.1.0-devel
  Driver Version                                  18.1.0-devel
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               36
  Max clock frequency                             1266MHz
…

@johalun
Member

johalun commented Mar 6, 2018

Can you try clpeak on i915? I'm getting an I/O error on the float compute benchmark.

@valpackett
Contributor Author

I haven't updated my i915 laptop to 4.11 yet. I'll test that (and other stuff) when I get around to that :)

@johalun
Member

johalun commented Mar 6, 2018

No problem. I can probably test i915 today.

@gldisater
Contributor

Have these patches to get radeon OpenCL working been committed yet?

@johalun
Member

johalun commented Mar 15, 2018

You mean the patches above? They are not a solution, just a workaround that will not be committed.
Other than that, no, I don't think the real issue has been worked on yet.

@valpackett
Contributor Author

hm, I wonder if ANV (Intel Vulkan) only working when run as root (even back in 4.9) is also an ioctl permission thing…

@valpackett changed the title from "amdgpu 4.11 ioctl auth/permission problems (Vulkan, OpenCL, Xwayland, OpenMW-on-Wayland)" to "amdgpu >=4.11 ioctl auth/permission problems (Vulkan, OpenCL, Xwayland, OpenMW-on-Wayland)" on Mar 23, 2018
@johalun
Member

johalun commented Mar 27, 2018

Let's tackle this once I'm done merging 4.15 (end of April). A stable release based on v4.15 is planned.

@johalun
Member

johalun commented May 27, 2018

I made this change

-       DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED),
+       DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENT, drm_getclient, DRM_UNLOCKED|DRM_RENDER_ALLOW),

and pushed it. At least for now it will work until we can figure out the proper solution.

@valpackett
Contributor Author

hm, do you see any errors with the DRM_IOCTL_PRIME_ and AMDGPU_ ones? Does Xwayland work for you as a regular user?

@johalun
Member

johalun commented May 27, 2018

Xwayland does not complain but does not render anything either. It seems the apps are running, just with no window. I know this happened before but was fixed somehow...
Will check the other stuff once my poudriere build is complete.

@valpackett
Contributor Author

Yeah, Xwayland not rendering means the PRIME ones are being denied, so you reproduced the bug :)

@johalun
Member

johalun commented May 27, 2018

Ok. I will play around with the ioctl permissions to see if I can get it working at least. Then we can figure out what the real reason is.

@fhajji

fhajji commented Aug 28, 2018

On Xorg, that helped too:

Adding DRM_RENDER_ALLOW against cc04340 made devel/clinfo show my AMD RX580 on Xorg again (running graphics/drm-next-kmod on 11.2-STABLE r337689).

I'm still experiencing opencl issues though like all kinds of segfaults, OpenCL 2.1/2.2 mismatches etc, but at least that's some progress.

https://forums.freebsd.org/threads/opencl-with-amd-radeon-rx580-segfaults.66789/

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230967

@valpackett
Contributor Author

all kinds of segfaults, OpenCL 2.1/2.2 mismatches

Yeah, clover is a really incomplete driver, you can't expect everything to work. OpenCL 2.x is not implemented at all…

@ahonecker76

I am working on FreeBSD with an RX 480 and am encountering exactly the same issue. How can I patch the system accordingly? My knowledge is limited :( Which files do I need to change to get the patch working?

Thank you
Andreas

@valpackett
Contributor Author

@ahonecker76 ah, sorry, the patch is a bit out of date, one of the fixes (a small one) is merged into the main branch, and the file names have moved from amd/amdgpu to drivers/gpu/drm/amd/amdgpu.

I'll post an update and instructions soon. I guess I can post a binary package as well

@ahonecker76

ahonecker76 commented Sep 18, 2018

@myfreeweb Ok, thank you. I did not know it had changed already. I was going mad trying to get OpenCL working with my Polaris GPU, and after trying nearly everything I found on Google, I ended up here. FreeBSD seems a bit tricky in this regard. Thank you for the reply and the help.

@valpackett
Contributor Author

diff --git i/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c w/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index bd6e9a40f..de38fba72 100644
--- i/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ w/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -1014,23 +1014,23 @@ void amdgpu_disable_vblank_kms(struct drm_device *dev, unsigned int pipe)
 }
 
 const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_VM, amdgpu_vm_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_VM, amdgpu_vm_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(AMDGPU_SCHED, amdgpu_sched_ioctl, DRM_MASTER),
-	DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_FENCE_TO_HANDLE, amdgpu_cs_fence_to_handle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_FENCE_TO_HANDLE, amdgpu_cs_fence_to_handle_ioctl, DRM_RENDER_ALLOW),
 	/* KMS */
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_WAIT_IDLE, amdgpu_gem_wait_idle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_CS, amdgpu_cs_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_INFO, amdgpu_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_CS, amdgpu_cs_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, amdgpu_cs_wait_fences_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_METADATA, amdgpu_gem_metadata_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW)
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_WAIT_IDLE, amdgpu_gem_wait_idle_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_CS, amdgpu_cs_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_INFO, amdgpu_info_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_CS, amdgpu_cs_wait_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, amdgpu_cs_wait_fences_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_METADATA, amdgpu_gem_metadata_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_RENDER_ALLOW)
 };
 const int amdgpu_max_kms_ioctl = ARRAY_SIZE(amdgpu_ioctls_kms);
 
diff --git i/drivers/gpu/drm/drm_ioctl.c w/drivers/gpu/drm/drm_ioctl.c
index a8e2c1341..9019d7766 100644
--- i/drivers/gpu/drm/drm_ioctl.c
+++ w/drivers/gpu/drm/drm_ioctl.c
@@ -634,8 +634,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 
 	DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, DRM_CONTROL_ALLOW|DRM_UNLOCKED),
 
-	DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_AUTH|DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 
 	DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETPLANERESOURCES, drm_mode_getplane_res, DRM_CONTROL_ALLOW|DRM_UNLOCKED),
 	DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETCRTC, drm_mode_getcrtc, DRM_CONTROL_ALLOW|DRM_UNLOCKED),

Here's the patch for v4.16 (should work for 4.17 but I haven't updated yet)

Clone this repo, apply the patch using git apply (you can e.g. copy the text and pipe using xclip: xclip -out | git apply, or save into a file), build the module (make -j8 where 8 is the number of parallel jobs i.e. cpu cores) and install (sudo make install / doas make install)

For the lazy way, here are my binaries: https://unrelentingtech.s3.dualstack.eu-west-1.amazonaws.com/kms-drm-416.txz (built for fairly recent CURRENT… I mean 12.0-ALPHA4 actually; my build is from September 6th, uname -K == 1200084). It might work fine on alpha6 or whatever it is now. You can extract it into / (cd / && sudo tar -xvf kms-drm-416.txz).

But I recommend building from source.

@ahonecker76

Thank you for your support. After applying it, I got an error when loading amdgpu into the kernel:

KLD drm.ko: depends on kernel - not available or version mismatch
linker_load_file: Unsupported file type
KLD amdgpu.ko: depends on drmn - not available or version mismatch
linker_load_file: Unsupported file type

It seems I now have some kind of version mismatch :( Before, it loaded without any problems, but OpenCL did not recognize it. I am on 11.2 stable at the moment.

@valpackett
Contributor Author

11.2 stable

Did you use my binary build? I said it's for 12.0-alpha, not 11.2!

@ahonecker76

All right, that's the problem then. I am on 11.2, which is why it's not working. Thank you.

@valpackett
Contributor Author

You should build it from source with the patch.

@ahonecker76

I am sorry to ask for your help again. When I try to build it, I get some errors:

test/kms-drm/drivers/gpu/drm/ttm/ttm_memory.c:595:36: error: implicit declaration of function 'vm_free_count' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
available = get_nr_swap_pages() + vm_free_count();
^
1 error generated.
*** [ttm_memory.o] Error code 1

make[1]: stopped in /test/kms-drm/drm
--- ttm_page_alloc.o ---
/test/kms-drm/drivers/gpu/drm/ttm/ttm_page_alloc.c:1167:8: error: use of undeclared identifier 'GFP_TRANSHUGE_LIGHT'
(GFP_TRANSHUGE_LIGHT | __GFP_NORETRY |
^
/test/kms-drm/drivers/gpu/drm/ttm/ttm_page_alloc.c:1168:8: error: use of undeclared identifier '__GFP_KSWAPD_RECLAIM'
__GFP_KSWAPD_RECLAIM) &
^
/test/kms-drm/drivers/gpu/drm/ttm/ttm_page_alloc.c:1173:8: error: use of undeclared identifier 'GFP_TRANSHUGE_LIGHT'
(GFP_TRANSHUGE_LIGHT | __GFP_NORETRY |
^
/test/kms-drm/drivers/gpu/drm/ttm/ttm_page_alloc.c:1174:8: error: use of undeclared identifier '__GFP_KSWAPD_RECLAIM'
__GFP_KSWAPD_RECLAIM) &
^
4 errors generated.
*** [ttm_page_alloc.o] Error code 1

make[1]: stopped in /test/kms-drm/drm
2 errors

make[1]: stopped in /test/kms-drm/drm
*** [all] Error code 2

make: stopped in /test/kms-drm
1 error

make: stopped in /test/kms-drm

@valpackett
Contributor Author

hm. I guess you could try going back to 4.15 (git reset --hard HEAD && git checkout drm-v4.15) and applying the… old version of the patch probably

@johalun does 4.17 work on stable rn?

@ahonecker76

ahonecker76 commented Sep 19, 2018

Well, bad luck. The make process does not finish there either; it stops at the Intel part. I tried every version starting with 4.11, which is the only one that completed. From 4.12 onward, the make process never finished successfully.

4 warnings generated.
1 error

make[1]: stopped in /test/kms-drm/i915
*** [all] Error code 2

The whole thing only works, with a successful kldload of amdgpu, when I use the current original port. As soon as I use this repository, even with the successfully built 4.11, I get:

KLD drm.ko: depends on kernel - not available or version mismatch
linker_load_file: Unsupported file type
KLD amdgpu.ko: depends on drmn - not available or version mismatch
linker_load_file: Unsupported file type

I give up for the moment. Getting OpenCL working with Polaris seems too tricky right now, even though the graphics driver itself seems to work.

@valpackett
Contributor Author

radeontop falls back to /dev/mem mode because some ioctls are denied, probably the ones I haven't even touched in my workaround:

openat(AT_FDCWD,"/dev/dri/card0",O_RDWR|O_CLOEXEC,00) = 4 (0x4)
ioctl(4,0xc0106407 { IORW 0x64('d'), 7, 16 },0x7fffffffd840) ERR#13 'Permission denied'
ioctl(4,0xc0106407 { IORW 0x64('d'), 7, 16 },0x7fffffffd840) ERR#13 'Permission denied'
ioctl(4,0xc0106401 { IORW 0x64('d'), 1, 16 },0x7fffffffd840) = 0 (0x0)
ioctl(4,0xc0106401 { IORW 0x64('d'), 1, 16 },0x7fffffffd840) = 0 (0x0)

possibly related to clbr/radeontop#72 ? (I am running Weston)
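
The request numbers in the truss output can be decoded by hand. A minimal sketch, assuming the BSD-style ioctl encoding (direction in the top bits, parameter length starting at bit 16, group in bits 8-15, command number in the low byte); the macro names and the 13-bit length mask are written out here as assumptions rather than taken from the system headers:

```c
#include <stdint.h>

/* Assumed BSD-style ioctl request layout. */
#define REQ_DIR_MASK  0xe0000000u  /* void / out / in bits */
#define REQ_DIR_INOUT 0xc0000000u  /* "IORW": parameters copied in and out */
#define REQ_LEN_MASK  0x1fffu      /* assumed 13-bit parameter length */

struct ioctl_fields {
    uint32_t dir;    /* direction bits */
    uint32_t len;    /* parameter size in bytes */
    char     group;  /* subsystem letter, 'd' in the trace above */
    uint8_t  nr;     /* command number within the group */
};

/* Split a raw request value into the fields truss prints in braces. */
struct ioctl_fields decode_ioctl(uint32_t req)
{
    struct ioctl_fields f;
    f.dir   = req & REQ_DIR_MASK;
    f.len   = (req >> 16) & REQ_LEN_MASK;
    f.group = (char)((req >> 8) & 0xffu);
    f.nr    = (uint8_t)(req & 0xffu);
    return f;
}
```

Applied to 0xc0106407, this gives group 'd', number 7 and length 16, matching the annotation in the trace. Number 7 in group 'd' appears to correspond to DRM_IOCTL_SET_VERSION in drm.h, and number 1 to DRM_IOCTL_GET_UNIQUE; treat that mapping as an assumption to verify against the installed headers.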

@nihil-2019

nihil-2019 commented Feb 6, 2019

Just noting here that if amdgpu is loaded through loader.conf, like so:

exec="load /boot/kernel/kernel"
exec="load /boot/modules/amdgpu_si58_mc_bin.ko"
exec="load /boot/modules/radeon_TAHITI_uvd_bin.ko"
exec="load /boot/modules/radeon_TAHITI_vce_bin.ko"
exec="load /boot/modules/radeon_pitcairn_ce_bin.ko"
exec="load /boot/modules/radeon_pitcairn_k_smc_bin.ko"
exec="load /boot/modules/radeon_pitcairn_me_bin.ko"
exec="load /boot/modules/radeon_pitcairn_pfp_bin.ko"
exec="load /boot/modules/radeon_pitcairn_rlc_bin.ko"
exec="load /boot/kernel/vmm.ko"
exec="load /boot/modules/amdgpu.ko"

then the patch is not required for everything to work. Obviously, replace the bin files with the firmware required for your GPU, and don't load anything after amdgpu.ko or it refuses to initialize the screen.

johalun pushed a commit that referenced this issue Feb 13, 2019
This patch prevents division by zero htotal.

In a follow-up mail Tina writes:

> > How did you manage to get here with htotal == 0? This needs backtraces (or if
> > this is just about static checkers, a mention of that).
> > -Daniel
>
> In GVT-g, we are trying to enable a virtual display w/o setting timings for a pipe
> (a.k.a htotal=0), then we met the following kernel panic:
>
> [   32.832048] divide error: 0000 [#1] SMP PTI
> [   32.833614] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc4-sriov+ #33
> [   32.834438] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.10.1-0-g8891697-dirty-20180511_165818-tinazhang-linux-1 04/01/2014
> [   32.835901] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.836004] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.836004] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.836004] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.836004] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.836004] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.836004] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.836004] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.836004] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.836004] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.836004] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.836004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.836004] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.836004] Call Trace:
> [   32.836004]  intel_mode_from_pipe_config+0x72/0x90
> [   32.836004]  intel_modeset_setup_hw_state+0x569/0xf90
> [   32.836004]  intel_modeset_init+0x905/0x1db0
> [   32.836004]  i915_driver_load+0xb8c/0x1120
> [   32.836004]  i915_pci_probe+0x4d/0xb0
> [   32.836004]  local_pci_probe+0x44/0xa0
> [   32.836004]  ? pci_assign_irq+0x27/0x130
> [   32.836004]  pci_device_probe+0x102/0x1c0
> [   32.836004]  driver_probe_device+0x2b8/0x480
> [   32.836004]  __driver_attach+0x109/0x110
> [   32.836004]  ? driver_probe_device+0x480/0x480
> [   32.836004]  bus_for_each_dev+0x67/0xc0
> [   32.836004]  ? klist_add_tail+0x3b/0x70
> [   32.836004]  bus_add_driver+0x1e8/0x260
> [   32.836004]  driver_register+0x5b/0xe0
> [   32.836004]  ? mipi_dsi_bus_init+0x11/0x11
> [   32.836004]  do_one_initcall+0x4d/0x1eb
> [   32.836004]  kernel_init_freeable+0x197/0x237
> [   32.836004]  ? rest_init+0xd0/0xd0
> [   32.836004]  kernel_init+0xa/0x110
> [   32.836004]  ret_from_fork+0x35/0x40
> [   32.836004] Modules linked in:
> [   32.859183] ---[ end trace 525608b0ed0e8665 ]---
> [   32.859722] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.860287] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.862680] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.863309] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.864182] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.865206] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.866359] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.867213] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.868075] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.868983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.869659] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.870599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.871598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.872549] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> Since drm_mode_hsync() has the logic to check mode->htotal, I just extend it to cover the case htotal==0.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
[danvet: Add additional explanations + cc: stable.]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1548228539-3061-1-git-send-email-tina.zhang@intel.com
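
The guard described in the commit message can be sketched in user space. This is an illustration, not the kernel source: `struct mode` is a hypothetical cut-down stand-in for the DRM display mode, and the rounding arithmetic is an assumption about how a horizontal sync rate in kHz is derived from a pixel clock in kHz and htotal.

```c
/* Hypothetical cut-down display mode. htotal is named in the commit text;
 * the other fields are assumptions made for this sketch. */
struct mode {
    int clock;   /* pixel clock in kHz */
    int htotal;  /* total horizontal pixels per line */
    int hsync;   /* cached horizontal sync in kHz, 0 if unset */
};

/* Returns the horizontal sync rate in kHz, or 0 if it cannot be computed. */
int mode_hsync(const struct mode *m)
{
    if (m->hsync)
        return m->hsync;

    /* Covers htotal == 0 as well as negative values, so no division by zero. */
    if (m->htotal <= 0)
        return 0;

    /* clock is in kHz: convert to Hz, divide, then round to the nearest kHz. */
    unsigned int hz = ((unsigned int)m->clock * 1000u) / (unsigned int)m->htotal;
    return (int)((hz + 500u) / 1000u);
}
```

A mode with htotal set to 0 and no cached hsync returns 0 here instead of dividing by zero.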
@valpackett
Contributor Author

With drm-v5.0 and Vega, looks like the only thing that still fails without the patch is Xwayland.

  • Vulkan works in RetroArch at least
  • clinfo doesn't quit early (just segfaults after it's done / after ICD loader properties)
  • OpenMW works fine
  • Xwayland still doesn't display anything:
[drm:drm_ioctl] pid=100886, dev=0xe200, auth=0, DRM_IOCTL_PRIME_HANDLE_TO_FD
[drm:drm_ioctl_permit] unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) && !file_priv->authenticated))[drm:drm_ioctl] pid=100886, ret = -13

@johalun
Member

johalun commented Jun 13, 2019

xwayland issues are solved with xorg 1.20

@zeising added the "feedback requested" label on Jul 16, 2020
@zeising
Member

zeising commented Jul 16, 2020

Is this still relevant after the update of xorg-server (and xwayland) to 1.20?

@valpackett
Contributor Author

No :)

8 participants