Commit Graph

152 Commits

Author SHA1 Message Date
Eric Anholt
76bd0f9949 glamor: Don't bother keeping references to shader stages for gradients.
They never get reattached to any other program, so saving them to
unreference later is a waste of code.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2014-02-14 18:30:01 -08:00
Eric Anholt
f8d384fa8f glamor: Move shader precision stuff from build time to shader compile time.
This is the last desktop-versus-ES2 build ifdef in core glamor.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2014-02-14 18:30:01 -08:00
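
One way the move described in f8d384fa8f can work is to let the GLSL preprocessor decide about precision when the shader is compiled, so the C build needs no ifdef. The sketch below assumes that approach; `SKETCH_DEFAULT_PRECISION` and `sketch_build_shader_source` are hypothetical names, not glamor's. It relies on `GL_ES` being a macro predefined by GLSL ES compilers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical prefix: GL_ES is predefined by GLSL ES compilers, so the
 * precision statement is only seen when compiling for GLES2. */
#define SKETCH_DEFAULT_PRECISION \
    "#ifdef GL_ES\n"             \
    "precision mediump float;\n" \
    "#endif\n"

/* Prepend the precision prefix to a shader body.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int sketch_build_shader_source(const char *body, char *out,
                                      size_t out_size)
{
    int n = snprintf(out, out_size, "%s%s", SKETCH_DEFAULT_PRECISION, body);

    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

The same C binary can then hand the resulting string to either a desktop GL or a GLES2 compiler.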
Eric Anholt
0e4f341418 glamor: Unifdef the cache format indices.
We only ask for GL_RGB on desktop GL as far as I can see, but now if
GLES2 did happen to ask for GL_RGB it would return a cache index
instead of -1.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2014-02-14 18:30:01 -08:00
Eric Anholt
f3f4fc7a65 glamor: Add a screen argument to drop an ifdef from glamor_set_alu().
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2014-02-14 18:30:01 -08:00
Eric Anholt
c3c8a5f360 glamor: yInverted is a boolean value, so use the Bool type.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2014-02-14 18:30:01 -08:00
Eric Anholt
4afe15d8bf glamor: Put in a pluggable context switcher for GLX versus EGL.
The GLX side just gets the context from the current state.  That's
also something I want to do for EGL, so that making a context is
separate from initializing glamor, but I think I need the modesetting
driver in the server before I think about hacking on that more.

The previous code was rather incestuous, along with pulling in xf86
dependencies to our dix code.  The new code just initializes itself
from the current state.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-02-14 18:30:01 -08:00
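
The "pluggable context switcher" in 4afe15d8bf can be pictured as a small struct of function pointers that each backend fills in. The sketch below is only an illustration of that pattern: every name is hypothetical, and the "current context" is simulated with a global instead of real GLX or EGL calls.

```c
/* Simulated "current context" so the sketch runs without GL. */
enum sketch_backend { SKETCH_NONE, SKETCH_GLX, SKETCH_EGL };
static enum sketch_backend sketch_current = SKETCH_NONE;

/* Hypothetical switcher: core code only ever calls make_current(). */
struct sketch_context {
    void (*make_current)(void);
};

/* Backend hooks; a real GLX or EGL backend would bind its context here. */
static void sketch_glx_make_current(void) { sketch_current = SKETCH_GLX; }
static void sketch_egl_make_current(void) { sketch_current = SKETCH_EGL; }

/* Backend-independent entry point used before issuing GL commands. */
static void sketch_prepare(const struct sketch_context *ctx)
{
    ctx->make_current();
}
```

With this shape, the core code carries no GLX or EGL dependency of its own; only the hook implementations do.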
Eric Anholt
0373b3f4f7 glamor: Convert to using libepoxy.
Libepoxy hides all the GL versus GLES2 dispatch handling for us, with
higher performance.

v2: Squash in the later patch to drop the later of two repeated
    glamor_get_dispatch()es instead (caught by keithp)

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-02-14 18:28:56 -08:00
Eric Anholt
b98e49379c glamor: Remove more out-of-tree compat code.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Eric Anholt
54e78ec31e glamor: Convert use of the old "pointer" typedef to "void *".
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Eric Anholt
9af66851e2 glamor: Disable definitions of GL extension prototypes to avoid warnings.
We're not using the extension prototypes, since you have to dlsym them
anyway.  Disabling their definitions prevents them from being defined
twice (once by gl.h, once by glext.h).

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-01-27 09:30:47 -08:00
Adam Jackson
b3acb47e98 glamor: Use dix-config.h not project config.h
v2: Also edit the one in glamor_egl.c (by anholt)
v3: Also edit the one in glamor_eglmodule.c (by anholt)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Eric Anholt
0c5a7c2086 glamor: Remove compat code for building out of tree.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Adam Jackson
82efb90efb glamor: Remove copy of sna's compiler.h
Xfuncproto.h has equivalents for these already.

v2: Adjust a couple more likelies after the rebase (anholt)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Eric Anholt
714926b090 glamor: Fix up some indentation damage on header prototypes.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Eric Anholt
7f6e865359 glamor: Fix some indent damage of putting a ' ' after the '*' for pointers.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Eric Anholt
d84d71029a glamor: Apply x-indent.sh.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
2014-01-27 09:30:47 -08:00
Axel Davy
7cfd9cc232 Add DRI3 support to glamor
This implements some DRI3 helpers to help the DDXs using
glamor to support DRI3.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:54 -08:00
Maarten Lankhorst
842cd7eb43 fixup picture in SetWindowPixmap
When creating a window with recordmydesktop running, the following may happen:

create picture 0x1cd457e0, with drawable 0x1327d1f0
(SetWindowPixmap is called)
destroy picture 0x1cd457e0, with drawable 0x1cd65820

Obtaining format for pixmap 0x1327d1f0 and picture 0x1cd457e0
==7989== Invalid read of size 4
==7989==    at 0x8CAA0CA: glamor_get_tex_format_type_from_pixmap (glamor_utils.h:1252)
==7989==    by 0x8CAD1B7: glamor_download_sub_pixmap_to_cpu (glamor_pixmap.c:1074)
==7989==    by 0x8CA8BB7: _glamor_get_image (glamor_getimage.c:66)
==7989==    by 0x8CA8D2F: glamor_get_image (glamor_getimage.c:92)
==7989==    by 0x29AEF2: miSpriteGetImage (misprite.c:413)
==7989==    by 0x1E7674: compGetImage (compinit.c:148)
==7989==    by 0x1F5E5B: ProcShmGetImage (shm.c:684)
==7989==    by 0x1F686F: ProcShmDispatch (shm.c:1121)
==7989==    by 0x15D00D: Dispatch (dispatch.c:432)
==7989==    by 0x14C569: main (main.c:298)
==7989==  Address 0x1cd457f0 is 16 bytes inside a block of size 120 free'd
==7989==    at 0x4C2B60C: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==7989==    by 0x228897: FreePicture (picture.c:1477)
==7989==    by 0x228B23: PictureDestroyWindow (picture.c:73)
==7989==    by 0x234C19: damageDestroyWindow (damage.c:1646)
==7989==    by 0x1E92C0: compDestroyWindow (compwindow.c:590)
==7989==    by 0x20FF85: DbeDestroyWindow (dbe.c:1389)
==7989==    by 0x185D46: FreeWindowResources (window.c:907)
==7989==    by 0x1889A7: DeleteWindow (window.c:975)
==7989==    by 0x17EBF1: doFreeResource (resource.c:873)
==7989==    by 0x17FC1B: FreeClientResources (resource.c:1139)
==7989==    by 0x15C4DE: CloseDownClient (dispatch.c:3402)
==7989==    by 0x2AB843: CheckConnections (connection.c:1008)
==7989==
(II) fail to get matched format for dfdfdfdf

The fix is to update the picture pointer when the window pixmap is changed,
so it moves the picture around with the window rather than the pixmap.

This makes FreePicture work correctly.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71088
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:54 -08:00
Dave Airlie
e3d1d4e3ca glamor: add initial Xv support
This does YV12 and I420 for now, not sure if we can do packed without
a GL extension.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-12-18 11:23:54 -08:00
Zhigang Gong
e846f48f48 Increase vbo size to 64K verts.
This commit benefits vertex-heavy cases such as
aa10text/rgb10text, with about a 15% performance gain.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Acked-by: Junyan <junyan.he@linux.intel.com>
2013-12-18 11:23:53 -08:00
Zhigang Gong
b8f0a21882 Silence compilation warnings.
After upgrading to gcc 4.7, more warnings are reported. Now
fix them.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Junyan He <junyan.he@linux.intel.com>
2013-12-18 11:23:53 -08:00
Junyan He
c3096c5a56 Fallback to pixman when trapezoid mask is big.
The shader generates trapezoids relatively slowly
 when the trapezoid area is big. We fall back when
 the trapezoid's width and height are big enough.
 A large number of traps also slows down rendering because
 of the VBO size. We fall back if ntrap > 256

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-By: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:53 -08:00
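
A minimal sketch of the fallback decision described in c3096c5a56. Only the `ntrap > 256` limit comes from the commit message; the width and height thresholds and all names are hypothetical placeholders, since the message does not state them.

```c
#include <stdbool.h>

#define SKETCH_MAX_TRAPS       256  /* from the commit message */
#define SKETCH_MAX_MASK_WIDTH  512  /* hypothetical threshold */
#define SKETCH_MAX_MASK_HEIGHT 512  /* hypothetical threshold */

/* Return true when the trapezoid mask should be generated on the CPU
 * (pixman) rather than by the shader. */
static bool sketch_trap_should_fallback(int ntrap, int width, int height)
{
    if (ntrap > SKETCH_MAX_TRAPS)
        return true;    /* too many traps for the VBO */

    /* Both dimensions must be large for the area check to trigger. */
    return width > SKETCH_MAX_MASK_WIDTH && height > SKETCH_MAX_MASK_HEIGHT;
}
```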
Zhigang Gong
bc1b412b3b Synch with xorg 1.13 change.
xorg 1.13 changed the scrn interfaces and removed those
global arrays. Some of the API changes broke our build. Now
fix it.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:53 -08:00
Zhigang Gong
4c27ca4700 gles2: Fixed the compilation problem and some bugs.
The previous patch doesn't set the offset to zero for the GLESv2
path. Now fix it.

This patch also fixes a minor problem in pixmap uploading
preparation. If the revert is not REVERT_NORMAL, then we
don't need to prepare an fbo for it. As the current mesa i965
gles2 driver doesn't support setting an A8 texture as an fbo
target, we must fix this problem. As some A1/A8 pictures
need to be uploaded, this is the only place an A8 texture
may be attached to an fbo.

This patch also enables the shader gradient for GLESv2.
The reason we disabled it before is that some glsl linkers
don't support linking different objects which have cross
references. Now we don't have that problem.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:53 -08:00
Zhigang Gong
32a7438bf7 glamor_copyarea: Use blitcopy if current state is not render.
Practically, for a pure 2D blit, the blit copy is much faster
than the textured copy. For the x11perf copywinwin100, it's about
3x faster. But if we have heavy rendering/compositing, then using
the textured copy gets much better (>30%) performance in most
cases.

So we simply add a data element to track current state. For
rendering state we use textured copy, otherwise, we use blit
copy.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:53 -08:00
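
The state tracking described in 32a7438bf7 amounts to one field plus a selection function. The sketch below uses hypothetical names throughout and shows only the decision, not the copies themselves.

```c
/* Hypothetical state values: what the screen was last used for. */
enum sketch_state { SKETCH_STATE_IDLE, SKETCH_STATE_RENDER };

enum sketch_copy_path { SKETCH_COPY_BLIT, SKETCH_COPY_TEXTURED };

struct sketch_screen {
    enum sketch_state state;    /* the added "data element" */
};

/* Textured copy while rendering/compositing, blit copy otherwise. */
static enum sketch_copy_path
sketch_choose_copy(const struct sketch_screen *screen)
{
    return screen->state == SKETCH_STATE_RENDER ? SKETCH_COPY_TEXTURED
                                                : SKETCH_COPY_BLIT;
}
```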
Zhigang Gong
4d1a2173f2 glamor_compositerects: Implement optimized version.
Don't call miCompositeRects. Use glamor_composite_clipped_region
to render those boxes at once.
Also add a new function glamor_solid_boxes to fill boxes at once.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:52 -08:00
Zhigang Gong
dd79243398 optimize: Use likely and unlikely.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:52 -08:00
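
For readers unfamiliar with the branch hints named in dd79243398, here is a minimal sketch of how such macros are commonly defined. The `sketch_` names are hypothetical stand-ins, and the definitions assume the GCC/Clang `__builtin_expect` builtin, degrading to plain expressions elsewhere.

```c
#include <stddef.h>

/* Hypothetical stand-ins for likely()/unlikely(). */
#if defined(__GNUC__)
#define sketch_likely(x)   __builtin_expect(!!(x), 1)
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define sketch_likely(x)   (!!(x))
#define sketch_unlikely(x) (!!(x))
#endif

/* Returns -1 on the rare error path, otherwise the doubled value. */
static int sketch_double_checked(const int *value)
{
    if (sketch_unlikely(value == NULL))
        return -1;
    return *value * 2;
}
```

The hint changes only how the compiler lays out the branches, never the result.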
Zhigang Gong
682f5d2989 glamor_largepixmap: Workaround for large texture's upload.
I hit a problem uploading large textures (larger than 7000x7000)
on the SNB platform. map_gtt returns a mapped VA
without error, but writing to that virtual address triggers
a BUS error. This workaround avoids that direct upload.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:52 -08:00
Zhigang Gong
37d4022f01 glamor_render: Optimize the two pass ca rendering.
For componentAlpha with PictOpOver, we use two-pass
rendering. The previous implementation called
glamor_composite_... twice independently, which is
very inefficient. Now we change the control flow, do
the two passes internally, and avoid duplicate work.

For the x11perf -rgb10text, this optimization can get about
30% improvement.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:52 -08:00
Zhigang Gong
c1bd50d58d glamor_glyphs: Detect fake or real glyphs overlap.
Split a glyph's extent region into three sub-boxes
as below.

left box   2 x h
center box (w-4) x h
right box  2 x h

Take a simple glyph A as an example:
     *
  __* *__
   *****
  *     *
  ~~   ~~

The left box and right boxes are both 2 x 2. The center box
is 2 x 4.

The left box has two bitmaps 0001'b and 0010'b to indicate
the real inked area.
The right box also has two bitmaps 0010'b and 0001'b.

And then we can check the inked area in left and right boxes with
previous glyph. If the direction is from left to right, then we
need to check the previous right bitmap with current left bitmap.

And if we find that the center box overlaps, or that we overlap with
more than just the previous glyph, we treat it as a real overlap
and render the glyphs via a mask.

If we only intersect with the previous glyph on the left/right edge,
then we further compute the real overlapped bits. We set a loose
check criterion here: if fewer than two pixels overlap, we
treat it as non-overlapping.

With this patch, aa10text gets a boost from 1660000 to 320000.
Almost double the performance! And the cairo test result is the
same as without this patch.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:52 -08:00
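
The edge check in c1bd50d58d can be sketched as a bitwise AND of the previous glyph's right-edge bitmaps with the current glyph's left-edge bitmaps, followed by a bit count. This is a simplified illustration: it assumes the two 2-pixel-wide edge boxes coincide exactly and are row-aligned, and every name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* One edge box: two pixel columns, one bit per row (1 = inked). */
struct sketch_edge {
    uint32_t column[2];
};

/* Count set bits without relying on compiler builtins. */
static int sketch_popcount(uint32_t v)
{
    int n = 0;

    while (v) {
        v &= v - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Number of pixels inked in both glyphs where the edges meet. */
static int sketch_overlap_bits(const struct sketch_edge *prev_right,
                               const struct sketch_edge *cur_left)
{
    return sketch_popcount(prev_right->column[0] & cur_left->column[0]) +
           sketch_popcount(prev_right->column[1] & cur_left->column[1]);
}

/* Loose criterion from the message: fewer than two shared pixels
 * counts as non-overlapping. */
static bool sketch_real_overlap(const struct sketch_edge *prev_right,
                                const struct sketch_edge *cur_left)
{
    return sketch_overlap_bits(prev_right, cur_left) >= 2;
}
```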
Junyan He
8f31aed48c Use the direct render path for A1
Because when the mask depth is 1 there is no anti-aliasing at all,
 in this case the direct render works well and is faster.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2013-12-18 11:23:52 -08:00
Junyan He
fa74a213ad Add the trapezoid direct render logic
We first get the render area by clipping the trapezoid
 with the clip rect, then split the clipped area into small
 triangles and use the composite logic to generate the result
 directly. This approach is fast but has the problem that
 some GL implementations do not implement anti-aliasing
 for triangle fills, so the edge is sometimes jagged. That is
 not acceptable when trapezoids are used to approximate circles and
 wide lines.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2013-12-18 11:23:52 -08:00
Junyan He
5f1560c84a Modify the composite logic to two phases
We separate the composite into two phases: first we
 select the shader according to source type and logic
 op and set the right parameters. Then we emit the
 vertex array to generate the dest result.
 The reason for this is that the shader may be
 used to composite not only rects; the trapezoid and triangle
 render functions can also use it to render triangles and
 polygons. The old function glamor_composite_with_shader
 does both phases at once and cannot meet the
 new requirement.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2013-12-18 11:23:52 -08:00
RobinHe
bd180be619 Use shader to generate the temp trapezoid mask
The old trapezoid render path uses pixman to
 generate a mask pixmap and upload it to the GPU.
 This hurts performance. We now use a shader to
 generate the temp trapezoid mask to avoid
 uploading this pixmap.
 We implement anti-aliasing in the shader
 following pixman: for every pixel, calculate the area
 inside the trapezoid divided by the pixel's total area
 and assign it to the alpha value of that pixel.
 Pixman uses an int-to-fixed approach to approximate this but
 the shader uses float, so the results may differ
 slightly.
 Because arrays in the shader have an optimization problem,
 we need to emit the vertices of every trapezoid every
 time, which affects performance a lot. This needs
 improvement.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2013-12-18 11:23:52 -08:00
RobinHe
6dd81c5939 Create the file glamor_triangles.c
Create the file glamor_trapezoid.c, extract the logic
 relating to trapezoid from glamor_render.c to this file.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2013-12-18 11:23:52 -08:00
Zhigang Gong
bf38ee407b Enable large pixmap by default.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:52 -08:00
Zhigang Gong
8ca16754f7 largepixmap: Fix the selfcopy issue.
If the source and destination are the same pixmap/fbo and we
need to split the copy into small pieces, then we need to
consider the order of the small pieces when the copy areas
overlap. This commit takes reverse/upsidedown into account in
the clipping function, so it generates the correct order
and avoids corruption when self copying.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
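
The ordering problem in 8ca16754f7 is the same one memmove solves: pieces must be visited so that no source piece is overwritten before it is read. The sketch below is an illustration under simplifying assumptions (a regular grid of pieces, hypothetical names), with dx and dy defined as destination minus source.

```c
/* Fill order[] with piece indices (row * cols + col) in a safe copy order.
 * dx > 0: the destination lies to the right, so walk right to left.
 * dy > 0: the destination lies below, so walk bottom to top.
 * order must have room for cols * rows entries. */
static void sketch_copy_order(int cols, int rows, int dx, int dy, int *order)
{
    int reverse = dx > 0;
    int upsidedown = dy > 0;
    int n = 0;

    for (int j = 0; j < rows; j++) {
        int row = upsidedown ? rows - 1 - j : j;

        for (int i = 0; i < cols; i++) {
            int col = reverse ? cols - 1 - i : i;

            order[n++] = row * cols + col;
        }
    }
}
```

For a 2 x 2 grid shifted right, the pieces come out as 1, 0, 3, 2; shifted down, as 2, 3, 0, 1.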
Zhigang Gong
5325c800f7 largepixmap: Support self composite for large pixmap.
The simplest way to support a large pixmap's self compositing
is to just clone the pixmap private data structure and change
the fbo and box to point to the correct positions. There is no need
to copy a new box.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
Zhigang Gong
e96ea02010 largepixmap: Implement infrastructure for large pixmap.
Added infrastructure for largepixmap. This commit implements:
1. Create/Destroy large pixmap.
2. Upload/Download large pixmap.
3. Basic repeat normal support.
4. tile/fill/copyarea support for large pixmaps.

The most complicated part, glamor_composite, is still not implemented.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
Zhigang Gong
ace35e408c glamor_largepixmap: first commit for large pixmap.
This is the first commit to add support for large pixmaps.
"Large" here means a pixmap is larger than the texture
size limit and thus can't fit into one single texture.

The previous implementation simply falls back to an
in-memory pixmap to hold the large pixmap, which is
very slow in practice.

The basic idea here is to use an array of textures to hold
the large pixmap. And when we need to get a specific area
of the pixmap, we just need to compute/clip the correct
region and find the corresponding fbo.

We need to implement some auxiliary routines to clip every
rendering operation into small pieces which can fit into
one texture.

The complex part is transformation/repeat/repeatReflect
and repeat pad and their combination. We will support all of
them step by step.

This commit just adds the necessary data structures to represent
the large pixmap, and doesn't change any rendering process.
This commit doesn't add real large pixmap support.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
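
The "array of textures" idea in ace35e408c implies a mapping from a pixel coordinate to the block (texture/fbo) that holds it. The sketch below assumes a regular row-major grid of equally sized blocks; the struct layout and names are hypothetical, not glamor's data structures.

```c
/* Hypothetical description of a large pixmap split into blocks. */
struct sketch_large_pixmap {
    int width, height;      /* full pixmap size in pixels */
    int block_w, block_h;   /* size of one texture block */
};

/* Number of blocks per row, rounding up for a partial last block. */
static int sketch_blocks_wide(const struct sketch_large_pixmap *p)
{
    return (p->width + p->block_w - 1) / p->block_w;
}

/* Row-major index of the block containing pixel (x, y),
 * or -1 if the pixel lies outside the pixmap. */
static int sketch_block_index(const struct sketch_large_pixmap *p,
                              int x, int y)
{
    if (x < 0 || y < 0 || x >= p->width || y >= p->height)
        return -1;
    return (y / p->block_h) * sketch_blocks_wide(p) + x / p->block_w;
}
```

For a 10000 x 9000 pixmap with 4096 x 4096 blocks, the grid is three blocks wide and pixel (5000, 8500) falls in block 7.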
Junyan He
d900f553c2 Extract the gradient related code out.
1. Extract the gradient logic from glamor_render.c
 to the file glamor_gradient.c.
 2. Modify the logic of the gradient pixmap gl draw. Use
 logic like composite did before, but the gradient always
 has just one rect to render, so there is no need to set the VB and EB;
 just call glDrawArrays instead.
 3. Kill all the warnings in glamor_render.c

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
Zhigang Gong
8169280464 glamor_set_destination_pixmap_priv_nc: set drawable's width x height.
The previous implementation set the whole fbo's width and height as the
viewport. This may increase the numerical error, as only a partial
region may be the valid pixmap. So add a new macro
pixmap_priv_get_dest_scale to get the proper scale factor for the
destination pixmap. For the source/mask pixmap, we still need to
consider the whole fbo's size.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
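
To illustrate the distinction drawn in 8169280464, the sketch below computes a destination scale from the drawable's size and a source scale from the whole fbo's size. It assumes the scale factor is the reciprocal of the relevant dimension; the struct and function names are hypothetical and stand in for the macro named in the message.

```c
/* Hypothetical view of a pixmap whose drawable occupies only part
 * of a (possibly larger, reused) fbo. */
struct sketch_pixmap {
    int fbo_w, fbo_h;           /* size of the backing fbo */
    int drawable_w, drawable_h; /* size of the valid pixmap region */
};

/* Destination: scale by the drawable size, matching the viewport. */
static void sketch_get_dest_scale(const struct sketch_pixmap *p,
                                  float *xscale, float *yscale)
{
    *xscale = 1.0f / (float) p->drawable_w;
    *yscale = 1.0f / (float) p->drawable_h;
}

/* Source/mask: texture coordinates still span the whole fbo. */
static void sketch_get_src_scale(const struct sketch_pixmap *p,
                                 float *xscale, float *yscale)
{
    *xscale = 1.0f / (float) p->fbo_w;
    *yscale = 1.0f / (float) p->fbo_h;
}
```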
Zhigang Gong
7f55e48499 Remove the texture cache code.
Caching texture objects is not necessary based on previous testing.
To keep the code simple, we remove it.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
Zhigang Gong
1035fc72b9 Fixed all unused variables warnings.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
Zhigang Gong
0d846d9569 Added --enable-debug configuration option.
For release version, we disable asserts.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:51 -08:00
Zhigang Gong
6e50ee9c10 glamor_fbo: Added a threshold value for the fbo cache pool.
Currently set to 256MB. If the cache pool watermark reaches
this value, then don't push any fbo to this pool; purge
the fbo directly.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:50 -08:00
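
A minimal sketch of the threshold check described in 6e50ee9c10. The 256MB limit comes from the commit message; the pool structure, the byte accounting, and the names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_FBO_POOL_LIMIT ((size_t) 256 * 1024 * 1024) /* 256MB */

struct sketch_fbo_pool {
    size_t watermark;   /* bytes currently held by cached fbos */
};

/* Try to cache an fbo of the given size. Returns true if it was added
 * to the pool, false if the caller should purge it instead. */
static bool sketch_pool_try_push(struct sketch_fbo_pool *pool,
                                 size_t fbo_bytes)
{
    if (pool->watermark >= SKETCH_FBO_POOL_LIMIT)
        return false;
    pool->watermark += fbo_bytes;
    return true;
}
```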
Zhigang Gong
9f53cc1c33 glamor_render.c: Fixed repeatPad and repeatReflect.
We should use a different calculation for these two repeat modes
when we are a sub region within one texture.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:50 -08:00
Zhigang Gong
6b664dda69 gradient: Disable gradient for gles2.
The PVR glsl compiler doesn't seem to support external fragment
functions and fails to compile the gradient shader. Disable it
for now. We may need to modify the gradient shader so it doesn't use
external functions.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:50 -08:00
Junyan He
3d96929596 Fix the problem of memory leak in gradient pixmap generating.
Fix the memory leak in gradient pixmap
 generation. The problem is caused by not calling
 glDeleteShader when destroying a shader program. This patch
 splits gradient pixmap generation into three
 categories. If nstops < 6, we use the no-array version
 of the shader, which has the best performance. Else if
 nstops < 16, we use the array version of the shader, which is
 compiled and linked at the screen init stage. Else if nstops >
 16, we dynamically create a new shader program, and this
 program is cached until a bigger nstops is needed.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:50 -08:00
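
The three categories in 3d96929596 can be sketched as a simple selection on nstops. The message leaves nstops == 16 unspecified; this sketch assigns it to the dynamic category as an assumption. All names are hypothetical.

```c
enum sketch_gradient_shader {
    SKETCH_GRADIENT_NO_ARRAY,       /* nstops < 6: fastest variant */
    SKETCH_GRADIENT_STATIC_ARRAY,   /* nstops < 16: built at screen init */
    SKETCH_GRADIENT_DYNAMIC         /* otherwise: built on demand, cached */
};

/* Pick the shader category for a gradient with nstops stops.
 * Assumption: nstops == 16 is treated as the dynamic case. */
static enum sketch_gradient_shader sketch_select_gradient_shader(int nstops)
{
    if (nstops < 6)
        return SKETCH_GRADIENT_NO_ARRAY;
    if (nstops < 16)
        return SKETCH_GRADIENT_STATIC_ARRAY;
    return SKETCH_GRADIENT_DYNAMIC;
}
```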
Zhigang Gong
9bcddff93b pending_op: Remove the pending operations handling.
We have disabled this feature for a long time, and previous
testing shows that this (pending fill) does not bring any observable
performance gain. Now remove it.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-12-18 11:23:50 -08:00