We don't care that much about startup time to write different code paths...
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
We don't use fixed function rendering, so there is no need to reset
the program at all. This lets the driver avoid checking for state
changes between draw calls when we rebind the same program.
Improves xephyr x11perf -f8text performance by 6.03062% +/- 1.64928%
(n=20)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
This will help tools like fips, apitrace, or INTEL_DEBUG=shader_time
provide useful information about the shaders in use.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
Now that the core deals with that for us, we can avoid all this extra
carefulness.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
The old Xephyr codebase was using the GL window system framebuffer for
the screen pixmap, but that meant you couldn't texture from it to do
operations sourcing from the screen, so in the version that landed I
instead had the screen just be a plain texture.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
This unpacks the bitfield into an int size, but my experience has been
that packing bitfields doesn't matter for performance.
v2: Convert more comparisons against numbers or implicit bool
comparisons to comparisons against the enum names, and fix up some
comments.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
The argument to setup_composte_vbo is the number of verts.
v2: Drop the now-unused vert_stride value.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
We should be uploading any vertex data using this kind of upload
style, since it saves a bunch of extra copies of our vertex data.
v2:
- Add a simple comment about what the function does.
- Use get_vbo_space()'s return in trapezoids, instead of dereffing
glamor_priv->vb (by Markus Wick).
- Fix the double-unmapping by moving put_vbo_space() outside of
flush_composite_rects().
- Remove the rest of the composite_vbo_offset usage, and just always
use get_vbo_space()'s return value.
v3:
- Fix failure to put_vbo_space in traps when no prims were
generated.
- Unbind the VBO from put_vbo_space(). Keeps callers from
forgetting to do so.
v4:
- Split out some changes into the previous 3 commits while trying to
track down a regression.
- Fix regression due to rebase fail where glamor_priv->vbo_offset
wasn't incremented.
v5:
- Fix GLES2 VBO sizing.
- Add a comment about resize behavior.
- Move glamor_vbo.c init code to glamor_vbo.c from
glamor_render.c. (Derived from Markus's changes, but the GLES2 fix
dropped almost all of the code in the functions).
v6:
- Drop the initial BufferData on GLES2 (it happens at put() time).
- Don't forget to set vbo_offset to the size on GLES2.
- Use char * instead of void * in the cast to return the vbo_offset.
- Resize the default FBO to 512kb, to be similar to previous
behavior. +1.66124% +/- 0.284223% (n=679) on aa10text.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
I want to extract the VBO mapping code, and as part of that I need to
get the global vbo_offset munging to stop.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
It's only used in the nonantialiased, triangle-based trapezoids path.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
We don't need any current contents of the buffer, and this allows an
implementation to make a temporary BO for a streamed upload if it
wants to.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus at selfnet.de>
Those calls are only for enabling texture handling in the fixed
function pipeline, while everything we do is with shaders.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
It used to be the thing that returned your dispatch table and happeend
to set up the context, but now it just sets up the context.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Libepoxy hides all the GL versus GLES2 dispatch handling for us, with
higher performance.
v2: Squash in the later patch to drop the later of two repeated
glamor_get_dispatch()es instead (caught by keithp)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Add Fast/Good/Best and appropriately map to Nearest and
Bilinear. Additionally, add a fallback path for unsupported filters.
Notably, this fixes window shadow rendering with Compiz, which uses
PictFilterConvolution for some odd reason.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
This lets us explicitly specify the range of vertices that are used,
which the OpenGL driver can use for optimization. Particularly,
it results in lower CPU overhead with Mesa-based drivers.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
After increase to gcc4.7, it reports more warnings, now
fix them.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Junyan He<junyan.he@linux.intel.com>
In some cases we allocate the VBO but have no vertex to
emit, which cause the VBO fail to be released. Fix it.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Previous patch doesn't set the offset to zero for GLESv2
path. Now fix it.
This patch also fix a minor problem in pixmap uploading
preparation. If the revert is not REVERT_NORMAL, then we
don't need to prepare a fbo for it. As current mesa i965
gles2 driver doesn't support to set a A8 texture as a fbo
target, we must fix this problem. As some A1/A8 picture
need to be uploaded, this is the only place a A8 texture
may be attached to a fbo.
This patch also enable the shader gradient for GLESv2.
The reason we disable it before is that some glsl linker
doesn't support link different objects which have cross
reference. Now we don't have that problem.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Fixes incorrectly clipped rendering. E.g. the cursor in Evolution
composer windows became invisible.
Signed-off-by: Michel Daenzer <michel.daenzer@amd.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Practically, for pure 2D blit, the blit copy is much faster
than textured copy. For the x11perf copywinwin100, it's about
3x faster. But if we have heavy rendering/compositing, then use
textured copy will get much better (>30%)performance for most
of the cases.
So we simply add a data element to track current state. For
rendering state we use textured copy, otherwise, we use blit
copy.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We can reuse the last one if the last one is big enough
to contain current vertext data. In the meantime, Use
MapBufferRange instead of MapBuffer.
Testing shows, this patch brings some benefit for
aa10text/rgb10text. Not too much, but indeed faster.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
For the componentAlpha with PictOpOver, we use two pass
rendering to implement it. Previous implementation call
two times the glamor_composite_... independently which is
very inefficient. Now we change the control flow, and do
the two pass internally and avoid duplicate works.
For the x11perf -rgb10text, this optimization can get about
30% improvement.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
One case we need force clip when download/upload a drm_texture
pixmap. Actually, this is only meaningful for testing purpose.
As we may set the max_fbo_size to a very small value, but the
drm texture may exceed this value but the drm texture pixmap
is not largepixmap. This is not a problem with OpenGL. But for
GLES2, we may need to call glamor_es2_pixmap_read_prepare to
create a temporary fbo to do the color conversion. Then we have
to force clip the drm pixmap here to avoid large pixmap handling
at glamor_es2_pixmap_read_prepare.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We change some macros to put the vert to the vertex buffer
directly when we cacluating it. This way, we can get about
4% performance gain.
This commit also fixed one RepeatPad bug, when we RepeatPad
a not eaxct size fbo. We need to calculate the edge. The edge
should be 1.0 - half point, not 1.0.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We seperate the composite to two phases, firstly to
select the shader according to source type and logic
op, setting the right parameters. Then we emit the
vertex array to generate the dest result.
The reason why we do this is that the shader may be
used to composite no only rect, trapezoid and triangle
render function can also use it to render triangles and
polygens. The old function glamor_composite_with_shader
do the whole two phases work and can not match the
new request.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Create the file glamor_trapezoid.c, extract the logic
relating to trapezoid from glamor_render.c to this file.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
The simplest way to support large pixmap's self compositing
is to just clone a pixmap private data structure, and change
the fbo and box to point to the correct postions. Don't need
to copy a new box.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
This commit implement almost all the needed functions for
the large pixmap support. It's almost complete.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Now we start to enable glamor_composite on large pixmap.
We need to do a three layer clipping to split the dest/source/mask
to small pieces. This commit only support non-transformation and
repeat normal case.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Added infrastructure for largepixmap, this commit implemented:
1. Create/Destroy large pixmap.
2. Upload/Download large pixmap.
3. Implement basic repeat normal support.
3. tile/fill/copyarea large pixmap get supported.
The most complicated part glamor_composite still not implemented.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
This is the first commit to add support for large pixmap.
The large here means a pixmap is larger than the texutre's
size limitation thus can't fit into one single texutre.
The previous implementation will simply fallback to use a
in memory pixmap to contain the large pixmap which is
very slow in practice.
The basic idea here is to use an array of texture to hold
the large pixmap. And when we need to get a specific area
of the pixmap, we just need to compute/clip the correct
region and find the corresponding fbo.
We need to implement some auxiliary routines to clip every
rendering operations into small pieces which can fit into
one texture.
The complex part is the transformation/repeat/repeatReflect
and repeat pad and their comination. We will support all of
them step by step.
This commit just add some necessary data structure to represent
the large pixmap, and doesn't change any rendering process.
This commit doesn't add real large pixmap support.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
1. Extract the logic of gradient from the glamor_render.c
to the file glamor_gradient.c.
2. Modify the logic of gradient pixmap gl draw. Use the
logic like composite before, but the gradient always just
have one rect to render, so no need to set the VB and EB,
replace it with just call glDrawArrays. 3.Kill all the
warning in glamor_render.c
Reviewed-by: Zhigang Gong<zhigang.gong@linux.intel.com>
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Previous implementation set the whole fbo's width and height as the
viewpoint. This may increase the numerical error as we may only has
a partial region as the valid pixmap. So add a new marco
pixmap_priv_get_dest_scale to get proper scale factor for the
destination pixmap. For the source/mask pixmap, we still need to
consider the whole fbo's size.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We miss the strict warning flags for a long time, now add it back.
This commit also fixed most of the warnings after enable the strict
flags.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>