This accelerates spans operations using GPU-based geometry computation
-wellipse500 goes from about 4k/sec before the patch, to ~8k/sec in
the GLES2 fallback loop, to ~100k/sec in desktop mode.
v2: Rebase on skipping the prepare rewrite for now, and fix the GLSL
1.20 and GLES2 cases (changes by anholt).
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This currently computes the GLSL version in a fairly naïve fashion,
and leaves that in the screen private for other users. This will let
us update the version computation in one place later on.
v2: Drop an accidental rebase-squashed hunk (change by anholt).
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This accelerates poly point when possible by off-loading all geometry
computation to the GPU.
Improves x11perf -dot performance by 28109.5% +/- 1022.01% (n=3)
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This just adds a bunch of support code to construct shaders from
'facets', which bundle attributes needed for each layer of the
rendering system. At this point, that includes only the primitive and
the fill stuff.
v2: Correct comment in glamor transform about 1/2 pixel correction needed
for GL_POINT. (Eric Anholt)
v3: Rebase on Markus's cleanups (change by anholt)
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This adds a few helper functions to make pixmap fbo access symmetrical
between the single fbo and tiled cases.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
This lets code treat the one-fbo pixmaps more symmetrically with the
tiled pixmaps.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Glamor has a mode where pixmaps will be constructed from numerous
small FBOs. This allows testing of the tiled pixmap code without
needing to create huge pixmaps.
However, the render glyph code assumed that it could create a pixmap
large enough for the glyph atlas. Instead of attempting to fix that
(which would be disruptive and not helpful), I've added a new pixmap
creation usage, GLAMOR_CREATE_NO_LARGE which forces allocation of a
single large FBO.
Now that we have pixmaps with varying FBO sizes, I then went around
and fixed the few places using the global FBO max size and replaced
that with the per-pixmap FBO tiling sizes, which were already present
in each large pixmap.
Xephyr has been changed to pass GLAMOR_CREATE_NO_LARGE when it creates
the screen pixmap as it doesn't want to deal with tiling either.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
The mbr path was hard coded enabled for desktop gl and disabled for
gles. But there are both desktop without mbr and GLES with mbr.
v2: Don't forget to update the fini path, too (change by anholt)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This will help tools like fips, apitrace, or INTEL_DEBUG=shader_time
provide useful information about the shaders in use.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
Now that the core deals with that for us, we can avoid all this extra
carefulness.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
The common pattern is to do nested if statements making calls to
prepare_access() and then pop those mappings back off in each set of
braces. Some cases checked for src == dst to avoid leaking mappings,
but others didn't. Others didn't even do the nested mappings, so a
failure in the outer map would result in trying to umap the inner and
failing.
By allowing nested mappings, we can fix both problems by not requiring
the care from the caller, plus we can allow a simpler nesting of all
the prepares in one if statement.
v2: Add a comment about nested unmap behavior, and just reuse the
glamor_access_t enum.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
Nothing was using it, and it was going to complicate the
glamor_prepare_access bugfixing I'm going to do next.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
This unpacks the bitfield into an int size, but my experience has been
that packing bitfields doesn't matter for performance.
v2: Convert more comparisons against numbers or implicit bool
comparisons to comparisons against the enum names, and fix up some
comments.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
This hasn't actually been a problem, since the server hasn't allocated
any glyphs before our glyph private initialization during
CreateScreenResources. But it's generally not X Server style to do
things this way.
Now that glamor itself drives both parts of glyphs setup, DDX drivers
no longer need to tell glamor to initialize glyphs. We do retain the
old public symbol so they can keep running with no changes.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
There's no reason to hide EGL from the rest of glamor, now that we
have epoxy.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
v2:
- Make the default buffer size a #define. (by Markus Wick)
- Fix the return offset for mapping with buffer_storage. (oops!)
v3:
- Avoid GL error at first rendering from unmapping no buffer.
- Rebase on the glBindBuffer(GL_ARRAY_BUFFER, 0) change.
v4: Rebase on Markus's vbo init changes.
v5: Fix missing put_context() in the buffer_storage fallback path.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
We should be uploading any vertex data using this kind of upload
style, since it saves a bunch of extra copies of our vertex data.
v2:
- Add a simple comment about what the function does.
- Use get_vbo_space()'s return in trapezoids, instead of dereffing
glamor_priv->vb (by Markus Wick).
- Fix the double-unmapping by moving put_vbo_space() outside of
flush_composite_rects().
- Remove the rest of the composite_vbo_offset usage, and just always
use get_vbo_space()'s return value.
v3:
- Fix failure to put_vbo_space in traps when no prims were
generated.
- Unbind the VBO from put_vbo_space(). Keeps callers from
forgetting to do so.
v4:
- Split out some changes into the previous 3 commits while trying to
track down a regression.
- Fix regression due to rebase fail where glamor_priv->vbo_offset
wasn't incremented.
v5:
- Fix GLES2 VBO sizing.
- Add a comment about resize behavior.
- Move glamor_vbo.c init code to glamor_vbo.c from
glamor_render.c. (Derived from Markus's changes, but the GLES2 fix
dropped almost all of the code in the functions).
v6:
- Drop the initial BufferData on GLES2 (it happens at put() time).
- Don't forget to set vbo_offset to the size on GLES2.
- Use char * instead of void * in the cast to return the vbo_offset.
- Resize the default FBO to 512kb, to be similar to previous
behavior. +1.66124% +/- 0.284223% (n=679) on aa10text.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
I want to extract the VBO mapping code, and as part of that I need to
get the global vbo_offset munging to stop.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
It's only used in the nonantialiased, triangle-based trapezoids path.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
They never get reattached to any other program, so saving them to
unreference later is a waste of code.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
This is the last desktop-versus-ES2 build ifdef in core glamor.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
We only ask for GL_RGB on desktop GL as far as I can see, but now if
GLES2 did happen to ask for GL_RGB it would return a cache index
instead of -1.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
The GLX side just gets the context from the current state. That's
also something I want to do for EGL, so that the making a context is
separate from initializing glamor, but I think I need the modesetting
driver in the server before I think about hacking on that more.
The previous code was rather incestuous, along with pulling in xf86
dependencies to our dix code. The new code just initializes itself
from the current state.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Libepoxy hides all the GL versus GLES2 dispatch handling for us, with
higher performance.
v2: Squash in the later patch to drop the later of two repeated
glamor_get_dispatch()es instead (caught by keithp)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
We're not using the extension prototypes, since you have to dlsym them
anyway. Disabling their definitions prevents them from being defined
twice (once by gl.h, once by glext.h).
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
v2: Also edit the one in glamor_egl.c (by anholt)
v3: Also edit the one in glamor_eglmodule.c (by anholt)
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Xfuncproto.h has equivalents for these already.
v2: Adjust a couple more likelies after the rebase (anholt)
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
This implements some DRI3 helpers to help the DDXs using
glamor to support DRI3.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
When creating a window with recordmydesktop running, the following may happen:
create picture 0x1cd457e0, with drawable 0x1327d1f0
(SetWindowPixmap is called)
destroy picture 0x1cd457e0, with drawable 0x1cd65820
Obtaining format for pixmap 0x1327d1f0 and picture 0x1cd457e0
==7989== Invalid read of size 4
==7989== at 0x8CAA0CA: glamor_get_tex_format_type_from_pixmap (glamor_utils.h:1252)
==7989== by 0x8CAD1B7: glamor_download_sub_pixmap_to_cpu (glamor_pixmap.c:1074)
==7989== by 0x8CA8BB7: _glamor_get_image (glamor_getimage.c:66)
==7989== by 0x8CA8D2F: glamor_get_image (glamor_getimage.c:92)
==7989== by 0x29AEF2: miSpriteGetImage (misprite.c:413)
==7989== by 0x1E7674: compGetImage (compinit.c:148)
==7989== by 0x1F5E5B: ProcShmGetImage (shm.c:684)
==7989== by 0x1F686F: ProcShmDispatch (shm.c:1121)
==7989== by 0x15D00D: Dispatch (dispatch.c:432)
==7989== by 0x14C569: main (main.c:298)
==7989== Address 0x1cd457f0 is 16 bytes inside a block of size 120 free'd
==7989== at 0x4C2B60C: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==7989== by 0x228897: FreePicture (picture.c:1477)
==7989== by 0x228B23: PictureDestroyWindow (picture.c:73)
==7989== by 0x234C19: damageDestroyWindow (damage.c:1646)
==7989== by 0x1E92C0: compDestroyWindow (compwindow.c:590)
==7989== by 0x20FF85: DbeDestroyWindow (dbe.c:1389)
==7989== by 0x185D46: FreeWindowResources (window.c:907)
==7989== by 0x1889A7: DeleteWindow (window.c:975)
==7989== by 0x17EBF1: doFreeResource (resource.c:873)
==7989== by 0x17FC1B: FreeClientResources (resource.c:1139)
==7989== by 0x15C4DE: CloseDownClient (dispatch.c:3402)
==7989== by 0x2AB843: CheckConnections (connection.c:1008)
==7989==
(II) fail to get matched format for dfdfdfdf
The fix is to update the picture pointer when the window pixmap is changed,
so it moves the picture around with the window rather than the pixmap.
This makes FreePicture work correctly.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71088
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
This does YV12 and I420 for now, not sure if we can do packed without
a GL extension.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This commit will benefit vertex stressing cases such as
aa10text/rgb10text, and can get about 15% performance gain.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Acked-by: Junyan <junyan.he@linux.intel.com>
After increase to gcc4.7, it reports more warnings, now
fix them.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Junyan He<junyan.he@linux.intel.com>
The trapezoid generating speed of the shader is relatively
slower when the trapezoid area is big. We fallback when
the trapezoid's width and height is bigger enough.
The big traps number will also slow down the render because
of the VBO size. We fallback if ntrap > 256
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-By: Zhigang Gong <zhigang.gong@linux.intel.com>
As xorg 1.13 change the scrn interaces and remove those
global arrays. Some API change cause we can't build. Now
fix it.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Previous patch doesn't set the offset to zero for GLESv2
path. Now fix it.
This patch also fix a minor problem in pixmap uploading
preparation. If the revert is not REVERT_NORMAL, then we
don't need to prepare a fbo for it. As current mesa i965
gles2 driver doesn't support to set a A8 texture as a fbo
target, we must fix this problem. As some A1/A8 picture
need to be uploaded, this is the only place a A8 texture
may be attached to a fbo.
This patch also enable the shader gradient for GLESv2.
The reason we disable it before is that some glsl linker
doesn't support link different objects which have cross
reference. Now we don't have that problem.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Practically, for pure 2D blit, the blit copy is much faster
than textured copy. For the x11perf copywinwin100, it's about
3x faster. But if we have heavy rendering/compositing, then use
textured copy will get much better (>30%)performance for most
of the cases.
So we simply add a data element to track current state. For
rendering state we use textured copy, otherwise, we use blit
copy.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Don't call miCompositeRects. Use glamor_composite_clipped_region
to render those boxes at once.
Also add a new function glamor_solid_boxes to fill boxes at once.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
I met a problem with large texture (larger than 7000x7000)'s
uploading on SNB platform. The map_gtt get back a mapped VA
without error, but write to that virtual address triggers
BUS error. This work around is to avoid that direct uploading.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
For the componentAlpha with PictOpOver, we use two pass
rendering to implement it. Previous implementation call
two times the glamor_composite_... independently which is
very inefficient. Now we change the control flow, and do
the two pass internally and avoid duplicate works.
For the x11perf -rgb10text, this optimization can get about
30% improvement.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>