Specifically for xrgb2101010 format.
Tested on KDE Plasma-5 with XRender based composite
acceleration backend. Much smoother and faster.
(v2) Dropped argb2101010, because of depth 32 confusion with
argb8888, as pointed out by Eric. Thanks!
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Antoine Martin <antoine@nagafix.co.uk>
This plumbs the full width color for solid pictures through to fb, exa,
and glamor. External drivers and acceleration code may wish to make a
similar change for sufficiently new servers.
v2: Don't break ABI (Michel Dänzer)
v2.1: Use the (correct) full color in fb too (Michel Dänzer)
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
COMPOSITE_REGION() can pass NULL as a source picture, make sure we
handle that nicely in both glamor_composite_clipped_region() and
glamor_composite_choose_shader().
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=101894
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Unlike the previous two fixes, this one introduces new GL calls and
statechanges of the scissor. However, given that our Render drawing
already does CPU side transformation and inefficient box upload, this
shouldn't be a limiting factor for Render acceleration.
Surprisingly, it improves x11perf -comppixwin10 -repeat 1 -reps 10000
on i965 by 3.21191% +/- 1.79977% (n=50).
v2: Make the jump to the exit land after scissor disable.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
When selecting "CA_TWO_PASS" in glamor_composite_clipped_region() when
the hardware does not support "GL_ARB_blend_func_extended", we call
glamor_composite_choose_shader() twice in a row, which in turn calls
glamor_pixmap_ensure_fbo().
On memory pixmaps, the first call will set the FBO and the second one
will fail an assertion in glamor_upload_picture_to_texture() because
the FBO is already set.
Bail out earlier when the mask pixmap is in memory and the hardware
capabilities would require to use two pass, so that the assertion is not
failed and the rendering is correct.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99346
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
I don't think anybody has run this code since it was pulled into the
server.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
The extension came out in 2000, and all Mesa-supported hardware that
can do glamor supports it. We were already relying on the ARB version
being present on desktop.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Even if the pixmap's storage has alpha, it may have been uploaded with
garbage in the alpha channel, so we need to force the shader to set
alpha to 1. This was broken way back in
355334fcd9.
Fixes rendercheck -t composite -f x8r8g8b8.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
The copy optimization in d37329cba4
replicated a bug from last time we did a copy optimization: CopyArea
is only defined for matching depths. This is only a problem at 15 vs
16 depth right now (24 vs 32 would also have matching Render formats,
but they should work) but be strict in case we store other depths
differently in the future.
Fixes rendercheck -t blend -o src -f x4r4g4b4,x3r4g4b4
v2: Drop excessive src->depth == dst->depth check that snuck in.
v3: Switch back to src->depth == dst->depth
v4: Touch up commit message (s/bpp/depth).
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
v2: Fix "orignal" too (review feedback by ajax, change by anholt)_
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Commit b64108fa ("glamor: Check for composite operations which are
equivalent to copies") failed to copy conditions from exaComposite which
ensure that the composite operation doesn't access outside of the source
picture.
This fixes rendercheck regressions from the commit above.
Reviewed-by: Keith Packard <keithp@keithp.com>
Patch b64108fa30 added a short cut that
identifies composite operations that can be performed with a simple
copy instead.
glamor_copy works in absolute coordinates, so the dx and dy values
passed in need to be converted from drawable-relative to absolute by
adding the drawable x/y values.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
A1 and A8 pixmaps are usually stored in the Red channel to conform
with more recent GL versions. When using these pixmaps as mask values,
that works great. When using these pixmaps as source values, then the
value we want depends on what the destination looks like.
For RGBA or RGB destinations, then we want to use the Red channel
for A values and leave RGB all set to zero.
For A destinations, then we want to leave the R values in the Red
channel so that they end up in the Red channel of the output.
This patch adds a helper function, glamor_bind_texture, which performs
the glBindTexture call along with setting the swizzle parameter
correctly for the Red channel. The swizzle parameter for the Alpha
channel doesn't depend on the destination as it's safe to leave it
always swizzled from the Red channel.
This fixes incorrect rendering in firefox for this page:
https://gfycat.com/HoarseCheapAmericankestrel
while not breaking rendering for this page:
https://feedly.com
v2: Add change accidentally left in patch for missing
glDisable(GL_COLOR_LOGIC_OP).
Found by Emil Velikov <emil.l.velikov@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63397
Signed-off-by: Keith Packard <keithp@keithp.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
glamor_make_current is supposed to be called before any GL APIs.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
If the logic op gets left enabled, it overrides the blending
operation, causing incorrect contents on the display.
v2: Disable only on non-ES2, but disable even for PictOpSrc
v3: Found another place this is needed in
glamor_composite_set_shader_blend
v4: Remove change dependent on new glamor_set_composite_texture
API. This belongs in a different patch.
Found by Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Increases x11perf -compwinwin500 numbers by a factor of 10 for me with
radeonsi.
Conditions copied from exaComposite().
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
RENDER requires that sampling outside of any source/mask picture results
in alpha == 0.0.
The OpenGL border colour cannot set alpha = 0.0 if the texture format
doesn't have an alpha channel, so we have to use the RepeatFix handling
in that case.
Also, only force alpha = 1.0 when sampling inside of RGBx source/mask
pictures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94514
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
glamor_composite_choose_shader() may upload our scratch pixmaps to get
a Render operation completed. We don't want to hang onto GL memory
for our scratch pixmaps, since we'll just have to reallocate them at a
new w/h next time around, and the contents will be updated as well.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
glamor_upload_sub_pixmap_to_texture() only had the one caller, so we
can merge it in, fix its silly return value, and propagate a bunch of
constants.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
For hardware that doesn't do actual jumps for conditionals (i915,
current vc4 driver), this reduces the number of texture fetches
performed (assuming the driver isn't really smart about noticing that
the same sampler is used on each side of an if just with different
coordinates).
No performance difference on i965 with x11perf -magpixwin100 (n=40).
Improves -magpixwin100 by 12.9174% +/- 0.405272% (n=5) on vc4.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
All sorts of weird indentation, and some cuddled conditional
statements deep in the if tree.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
wh ratios are != 1.0 only when large, so with that we can simplify
down how we end up with RepeatFix being used.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is a step toward using glamor_program.c for Render acceleration.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We can just hand in a constant mask and the driver will optimize away
the multiplication for us.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
GL_RED is supported by core profiles while GL_ALPHA is not; use GL_RED
for one channel objects (depth 1 to 8), and then swizzle them into the
alpha channel when used as a mask.
[airlied: updated to master, add swizzle to composited glyphs and xv paths]
v2: consolidate setting swizzle into the texture creation code, it
should work fine there. Handle swizzle when setting color as well.
v3: Fix drawing to a8 with Render (changes by anholt, reviewed by airlied).
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
We only need it once at the top of the shader, so just put it
there.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
It's been on the list to add dual source blending support to avoid the
two pass componentAlpha code. Radeon has done this for a while in
EXA, so let's add support to bring glamor up to using it.
This adds dual blend to both render and composite glyphs paths.
Initial results show close to doubling of speed of x11perf -rgb10text.
v2: Fix breakage of all of CA acceleration for systems without
GL_ARB_blend_func_extended. Add CA support for all the ops we
support in non-CA mode when blend_func_extended is present. Clean
up some comments and formatting. (changes by anholt)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
I originally inherited this from the EXA code, without determining
whether it was really needed. Regular composite should end up doing
the same thing, since it's all just shaders anyway. To the extent
that it doesn't, we should fix composite.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reading and writing to 16-depth pixmaps using PICT_x1r5g5b5 ends up
failing, unless you're doing a straight copy at the same bpp where the
misinterpretation matches on both sides.
Fixes rendercheck/blend/over and renderhceck/blend/src in piglit.
Please cherry-pick this to active stable branches.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
If the source/mask pixmap is a pixmap that doesn't have an FBO
attached, and it doesn't match the Render operation's size, then we'll
composite it to a CPU temporary (not sure why). We would take the
PictFormatShort from the source Picture, make a pixmap of that depth,
and try to look up the PictFormat description from the depth and the
PictFormatShort. However, the screen's PictFormats are only attached
to the screen's visuals' depths. So, with an x2r10g10b10 short format
(depth 30), we wouldn't find the screen's PictFormat for it
(associated with depth 32).
Instead of trying to look up from the screen, just use the pFormat
that came from our source picture. The only time we need to look up a
PictFormat when we're doing non-shader gradients, which we put in
a8r8g8b8.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
I'm amazed we've made it as far as we have without these checks: if
you made an unusual format picture that wasn't the normal a8r8g8b8 or
x8r8g8b8 or a8, we'd go ahead and try to render with it, ignoring that
the sampler would fetch totally wrong bits.
Fixes 260 tests in rendercheck -t blend -o src -f a8r8g8b8,x2r10g10b10
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This gives the compiler a chance to optimize when the data is never
changed -- for example, with pict_format_combine_tab, the compiler
ends up inlining the 24 bytes of data into just 10 more bytes of code.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
We use this for all of our other performance-sensitive rendering, too.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
XRender defines this, GL really doesn't like it.
kwin 4.x and qt 4.x seem to make this happen for the
gradient in the titlebar, and on radeonsi/r600 hw
this draws all kinds of wrong.
v2: bump this up a level, and check it earlier.
(I assume the XXXX was for this case.)
v3: add same code to largepixmap paths (Keith)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jasper St. Pierre <jstpierre@mecheye.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
New composite glyphs code uses the updated glamor program
infrastructure to create efficient shaders for drawing render text.
Glyphs are cached in two atlases (one 8-bit, one 32-bit) in a simple
linear fashion. When the atlas fills, it is discarded and a new one
constructed.
v2: Eric Anholt changed the non-GLSL 130 path to use quads instead of
two triangles for a significant performance improvement on hardware
with quads. Someone can fix the GLES quads emulation if they want to
make it faster there.
v3: Eric found more dead code to delete
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
By dropping the unconditional logic op disable at the end of
rendering, this fixes GL errors being thrown in GLES2 contexts (which
don't have logic ops). On desktop, this also means a little less
overhead per draw call from taking one less trip through the
glEnable/glDisable switch statement of doom in Mesa.
The exchange here is that we end up taking a trip through it in the
XV, Render, and gradient-generation paths. If the glEnable() is
actually costly, we should probably cache our logic op state in our
screen, since there's no way the GL could make that switch statement
as cheap as the caller caching it would be.
v2: Don't forget to set the logic op in Xephyr's drawing.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
This will let us eliminate the pixmap types shortly
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Just embed the large elements in the regular pixmap private and
collapse the union to a single struct.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
There's no reason to waste memory storing redundant copies of the same
pointer all over the system; just pass in pointers as necessary to
each function.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
The core rendering code already requires that FBOs be allocated at
exactly the pixmap size so that tiling and stippling work
correctly. Remove the allocation support for that, along with the
render code.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>