We don't use fixed function rendering, so there is no need to reset
the program at all. This lets the driver avoid checking for state
changes between draw calls when we rebind the same program.
Improves xephyr x11perf -f8text performance by 6.03062% +/- 1.64928%
(n=20)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
From the GL_ARB_vertex_buffer_object spec:
After the client has specified the contents of a mapped data store,
and before the data in that store are dereferenced by any GL commands,
the mapping must be relinquished by calling
boolean UnmapBufferARB(enum target);
Our mappings were only getting reaped at PBO destroy time, after the
upload. If the GL implementation wasn't coherent, it would have used
stale data to do the texture upload.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
Nothing was using it, and it was going to complicate the
glamor_prepare_access bugfixing I'm going to do next.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
This unpacks the bitfield into an int size, but my experience has been
that packing bitfields doesn't matter for performance.
v2: Convert more comparisons against numbers or implicit bool
comparisons to comparisons against the enum names, and fix up some
comments.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Markus Wick <markus@selfnet.de>
A pair of 150 lines of inlined switch statements in a header file is
crazy.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
v2: Just pass in the PicturePtr to glamor_pict_format_is_compatible()
(suggestion by keithp)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
gl_ModelViewProjection and friends aren't used in our shaders, so this
setup didn't do anything.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Those calls are only for enabling texture handling in the fixed
function pipeline, while everything we do is with shaders.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
It used to be the thing that returned your dispatch table and happened
to set up the context, but now it just sets up the context.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
Libepoxy hides all the GL versus GLES2 dispatch handling for us, with
higher performance.
v2: Squash in the later patch to drop the later of two repeated
glamor_get_dispatch()es instead (caught by keithp)
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Keith Packard <keithp@keithp.com>
After upgrading to gcc 4.7, more warnings are reported. Fix them.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Junyan He <junyan.he@linux.intel.com>
The previous patch didn't set the offset to zero for the GLESv2
path. Fix that now.
This patch also fixes a minor problem in pixmap upload
preparation. If the revert is not REVERT_NORMAL, we don't
need to prepare an fbo for it. As the current mesa i965
gles2 driver doesn't support using an A8 texture as an fbo
target, we must fix this problem. Since some A1/A8 pictures
need to be uploaded, this is the only place an A8 texture
may be attached to an fbo.
This patch also enables the shader gradient for GLESv2.
We disabled it before because some GLSL linkers don't
support linking different objects that cross-reference
each other. We no longer have that problem.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
I hit a problem uploading large textures (larger than 7000x7000)
on the SNB platform. map_gtt returns a mapped VA without error,
but writing to that virtual address triggers a bus error. This
workaround avoids that direct upload.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
There is one case where we need to force clipping when
downloading/uploading a drm_texture pixmap. Actually, this is only
meaningful for testing purposes: we may set max_fbo_size to a very
small value, and the drm texture may exceed it even though the drm
texture pixmap is not a largepixmap. This is not a problem with
OpenGL. But for GLES2, we may need to call
glamor_es2_pixmap_read_prepare to create a temporary fbo to do the
color conversion. So we have to force clip the drm pixmap here to
avoid large pixmap handling in glamor_es2_pixmap_read_prepare.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
If the source and destination are the same pixmap/fbo and we
need to split the copy into small pieces, we have to consider
the order of those pieces when the copy areas overlap. This
commit passes reverse/upsidedown into the clipping function so
that it generates the correct order and avoids corruption when
self copying.
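A minimal sketch of the ordering problem, reduced to one dimension.
copy_in_pieces and its parameters are hypothetical names for
illustration, not glamor functions. When the destination lies after
the source, the pieces are walked back to front so that no piece
overwrites data a later piece still has to read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy len ints from buf[src..] to buf[dst..] in
 * pieces of at most `piece` ints. The two ranges may overlap. */
void
copy_in_pieces(int *buf, size_t src, size_t dst, size_t len, size_t piece)
{
    assert(piece > 0);

    if (dst > src) {
        /* Destination is ahead of the source: go back to front. */
        size_t end = len;
        while (end > 0) {
            size_t n = end < piece ? end : piece;
            size_t start = end - n;
            memmove(buf + dst + start, buf + src + start, n * sizeof(*buf));
            end = start;
        }
    } else {
        /* Destination is behind the source: go front to back. */
        size_t start = 0;
        while (start < len) {
            size_t n = len - start < piece ? len - start : piece;
            memmove(buf + dst + start, buf + src + start, n * sizeof(*buf));
            start += n;
        }
    }
}
```

Walking the first case front to back would overwrite buf[2..3] before
the second piece had read them, which is the kind of corruption
described above.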
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Add infrastructure for largepixmap. This commit implements:
1. Create/Destroy large pixmap.
2. Upload/Download large pixmap.
3. Basic repeat normal support.
4. tile/fill/copyarea support for large pixmaps.
The most complicated part, glamor_composite, is still not implemented.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
This is the first commit to add support for large pixmaps.
"Large" here means a pixmap that is larger than the texture
size limit and thus can't fit into a single texture.
The previous implementation simply falls back to an in-memory
pixmap to hold the large pixmap, which is very slow in
practice.
The basic idea here is to use an array of textures to hold
the large pixmap. When we need a specific area of the
pixmap, we just compute/clip the correct region and find
the corresponding fbo.
We need to implement some auxiliary routines to clip every
rendering operation into small pieces, each of which fits
into one texture.
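A rough sketch of the clipping idea, assuming the large pixmap is cut
into a regular grid of equally sized blocks with non-negative
coordinates. struct box, clip_to_block and count_pieces are
hypothetical names for illustration, not the routines this series
adds.

```c
#include <assert.h>

/* Half-open rectangle: [x1, x2) x [y1, y2), in pixmap coordinates. */
struct box { int x1, y1, x2, y2; };

/* Intersect `area` with block (col, row) of a grid whose blocks are
 * block_w x block_h pixels. Returns 1 and fills *out on overlap. */
int
clip_to_block(struct box area, int block_w, int block_h,
              int col, int row, struct box *out)
{
    assert(block_w > 0 && block_h > 0);

    struct box b = { col * block_w, row * block_h,
                     (col + 1) * block_w, (row + 1) * block_h };

    out->x1 = area.x1 > b.x1 ? area.x1 : b.x1;
    out->y1 = area.y1 > b.y1 ? area.y1 : b.y1;
    out->x2 = area.x2 < b.x2 ? area.x2 : b.x2;
    out->y2 = area.y2 < b.y2 ? area.y2 : b.y2;
    return out->x1 < out->x2 && out->y1 < out->y2;
}

/* Number of blocks a non-empty area touches, i.e. how many pieces one
 * rendering operation would be split into. */
int
count_pieces(struct box area, int block_w, int block_h)
{
    int first_col = area.x1 / block_w, last_col = (area.x2 - 1) / block_w;
    int first_row = area.y1 / block_h, last_row = (area.y2 - 1) / block_h;
    int n = 0;

    for (int row = first_row; row <= last_row; row++)
        for (int col = first_col; col <= last_col; col++) {
            struct box piece;
            if (clip_to_block(area, block_w, block_h, col, row, &piece))
                n++;
        }
    return n;
}
```

Each clipped piece maps to exactly one block, so the block index
(col, row) is what would select the corresponding fbo.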
The complex part is transformation/repeat/repeatReflect/repeat
pad and their combinations. We will support all of them step
by step.
This commit just adds the data structures needed to represent
the large pixmap and doesn't change any rendering process, so
it doesn't add real large pixmap support yet.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
The previous implementation set the whole fbo's width and height as
the viewport. This may increase the numerical error, as only a
partial region of the fbo may be the valid pixmap. So add a new macro,
pixmap_priv_get_dest_scale, to get the proper scale factor for the
destination pixmap. For the source/mask pixmap, we still need to
consider the whole fbo's size.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We have been missing the strict warning flags for a long time. Add
them back. This commit also fixes most of the warnings that show up
after enabling the strict flags.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
It seems that mesa has bugs when uploading a bitmap to a texture.
Switch to converting the bitmap to a8 format and then uploading
the a8 texture.
Also add a helper routine to dump a 1bpp pixmap.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
There are two cases in which we may use a wrong texture size.
1. A pixmap is modified by the client side after it was created.
Then the pixmap's width may not match the original fbo/tex's
size. Thus we need to check this condition when preparing
to upload the pixmap.
2. We provide two APIs to download/upload a sub region of a
textured pixmap. The caller may pass in a larger width than
the original pixmap's size; this may happen in putimage
and setspans. We need to validate the width and height
when downloading/uploading.
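One way such a check could look, as a sketch only: clamp the requested
sub region to the pixmap's bounds before transferring anything.
clamp_sub_region is a hypothetical helper, and clamping (rather than
rejecting the request) is an assumption, not necessarily what this
commit does.

```c
#include <assert.h>

/* Hypothetical helper: shrink the region (x, y, *w, *h) so that it
 * stays inside a pix_w x pix_h pixmap. Returns 0 when nothing is left
 * to transfer, 1 otherwise. */
int
clamp_sub_region(int pix_w, int pix_h, int x, int y, int *w, int *h)
{
    assert(w && h);

    if (x < 0 || y < 0 || x >= pix_w || y >= pix_h || *w <= 0 || *h <= 0)
        return 0;
    if (*w > pix_w - x)
        *w = pix_w - x;
    if (*h > pix_h - y)
        *h = pix_h - y;
    return 1;
}
```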
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We have had this feature disabled for a long time, and previous
testing shows that pending fill does not bring any observable
performance gain. Remove it.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Currently, intel's mesa dri driver does not check the pbo for
TexSubImage2D. So we use glTexImage2D if we are doing a full
update.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
As the pixmap may be attached to a picture, we need to use
glamor_upload_sub_pixmap to process it. glamor_copy_n_to_n
will not consider the picture case.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
A textured_drm pixmap has a drm bo attached to it, and it is
the DDX layer that sets its stride value. In some cases,
the stride value is not equal to PixmapBytePad(w, depth),
which is what glamor uses.
In that case, we have two choices. One is to set
GL_PACK_ROW_LENGTH/GL_UNPACK_ROW_LENGTH when we need
to download or upload the pixmap. The other is to change
the pixmap's devKind to match the one glamor is using
when downloading the pixmap, and restore it to the drm stride
after uploading the pixmap.
We choose the second option, as GLES doesn't support the first
method.
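A sketch of the second option. struct fake_pixmap, padded_stride,
begin_cpu_access and end_cpu_access are hypothetical stand-ins, and
the 32-bit scanline padding is assumed to match what
PixmapBytePad(w, depth) yields on common configurations.

```c
#include <assert.h>

/* Hypothetical stand-in for the pixmap fields that matter here. */
struct fake_pixmap {
    int width;      /* pixels */
    int bpp;        /* bits per pixel */
    int devKind;    /* current stride in bytes */
};

/* Stride with scanlines padded to 32 bits (assumed padding). */
int
padded_stride(int width, int bpp)
{
    return ((width * bpp + 31) / 32) * 4;
}

/* Switch to the padded stride for the download. Returns the drm
 * stride so the caller can restore it later. */
int
begin_cpu_access(struct fake_pixmap *p)
{
    int drm_stride = p->devKind;

    p->devKind = padded_stride(p->width, p->bpp);
    return drm_stride;
}

/* Restore the drm stride once the upload is done. */
void
end_cpu_access(struct fake_pixmap *p, int drm_stride)
{
    assert(drm_stride > 0);
    p->devKind = drm_stride;
}
```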
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We already adjust the stride of the pixmap, so keeping the alignment
at 4 should be enough to let GL/GLES match the stride.
The previous version had an unbalanced PACK ROW length setting and
was buggy. Fix it.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
For the A1 to A8 conversion, the stride differs between the
source and the destination. The previous implementation used
the same stride for both and could allocate less memory than
required, which could crash the server when uploading an A1
pixmap. Fix it.
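A sketch of why the two strides differ, assuming scanlines padded to
32 bits and LSB-first bit order within each source byte.
scanline_bytes and a1_to_a8 are hypothetical names, not the glamor
routines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes per scanline, padded to 32 bits (assumed padding). */
size_t
scanline_bytes(int width, int bpp)
{
    assert(width > 0 && bpp > 0);
    return (((size_t) width * (size_t) bpp + 31) / 32) * 4;
}

/* Expand a 1bpp bitmap into a freshly allocated 8bpp buffer.
 * The destination is sized with its own stride, not the source's.
 * The caller frees the result. */
uint8_t *
a1_to_a8(const uint8_t *src, int width, int height)
{
    size_t src_stride = scanline_bytes(width, 1);
    size_t dst_stride = scanline_bytes(width, 8);
    uint8_t *dst = calloc((size_t) height, dst_stride);

    if (!dst)
        return NULL;
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            int bit = (src[y * src_stride + x / 8] >> (x % 8)) & 1;
            dst[y * dst_stride + x] = bit ? 0xff : 0x00;
        }
    return dst;
}
```

For a 10 pixel wide bitmap the source stride is 4 bytes while the
destination stride is 12, so sizing the destination with the source
stride would under-allocate.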
Tested-by: Peng Li <peng.li@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Just as on the downloading side, we can upload a sub region of data
to a specified region of a pixmap. The data can be in memory or in
a pbo buffer.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Introduce two functions, glamor_get_sub_pixmap/glamor_put_sub_pixmap,
which can easily be used to get and put a sub region of a big
textured pixmap. They can use a pbo if possible.
To support downloading a big textured pixmap's sub region to another
pixmap's pbo, we introduce a new type of pixmap, GLAMOR_MEMORY_MAP.
This type of pixmap has a valid devPrivate.ptr pointer, and that
pointer points to a pbo mapped address.
Now we are ready to refine those
glamor_prepare_access/glamor_finish_access pairs.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Prepare to optimize the fallback path. We chose the important
rendering paths and optimized them by using shaders. For other paths,
we have to fall back. We may continue to optimize more paths in
the future, but for now we have to face those fallbacks.
The original fallback is very slow and downloads/uploads the whole
pixmap. Starting with this commit, I will refine it to download/upload
just the needed part.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
As we don't need to allocate a new buffer when downloading a pixmap
to the CPU, we change the prototype of the color conversion functions
and let the caller provide the buffer that holds the result.
All the color conversion functions support storing the result
in the same place as the source.
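A sketch of what such a prototype can look like. swap_rb_8888 is a
hypothetical converter used for illustration, not one of the glamor
functions; the point is that the caller owns the destination and may
pass the source buffer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical converter: swap the red and blue bytes of 32-bit
 * pixels. dst == src is allowed, so the result can be stored in
 * place without any allocation. */
void
swap_rb_8888(const uint32_t *src, uint32_t *dst, size_t count)
{
    assert(src && dst);

    for (size_t i = 0; i < count; i++) {
        uint32_t p = src[i];    /* read fully before writing dst[i] */

        dst[i] = (p & 0xff00ff00u)
               | ((p & 0x00ff0000u) >> 16)
               | ((p & 0x000000ffu) << 16);
    }
}
```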
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
We don't need to issue glamor_fallback in the glamor_set_alu
routine, as the caller may support GXclear or other frequently
used ops. Leave it to the caller to decide whether to fall back.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
As GLES2 doesn't support LogicOp, we have to fall back
here. The GLES2 programming guide's statement is as below:
"In addition, LogicOp is removed as it is very
infrequently used by applications and the OpenGL ES
working group did not get requests from independent
software vendors (ISVs) to support this feature in
OpenGL ES 2.0."
So, I think falling back here may not be a big deal ;).
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Add color conversion code to support the 1555/2101010
formats. Now gles2 can pass the render check with all
formats.
We use 5551 to represent 1555, and revert the layout
whenever downloading/uploading is needed.
For 2101010, gles2 doesn't support reading the
identical format, so we have to use 8888 to represent it
and may lose some accuracy. But anyway, we can pass the
error checking in render check, so that may not be a big
problem.
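A sketch of the kind of conversion involved. The function names are
hypothetical, and the bit rotation for 1555 and the plain truncation
for 10-bit channels are assumptions about one reasonable way to do
it, not a copy of the glamor code.

```c
#include <assert.h>
#include <stdint.h>

/* a1r5g5b5 -> r5g5b5a1: move the alpha bit from bit 15 to bit 0. */
uint16_t
pixel_1555_to_5551(uint16_t p)
{
    return (uint16_t) ((p << 1) | (p >> 15));
}

/* r5g5b5a1 -> a1r5g5b5: move the alpha bit from bit 0 to bit 15. */
uint16_t
pixel_5551_to_1555(uint16_t p)
{
    return (uint16_t) ((p >> 1) | ((p & 1u) << 15));
}

/* Reduce a 10-bit channel to 8 bits. The two low bits are dropped,
 * which is where the accuracy loss for 2101010 comes from. */
uint8_t
channel_10_to_8(unsigned v)
{
    assert(v < 1024);
    return (uint8_t) (v >> 2);
}
```

The 1555 conversion is lossless in both directions, while distinct
10-bit values such as 512 and 513 collapse to the same 8-bit value.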
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
This patch fixes two major problems in the color conversion with
GLES2.
1. lack of necessary formats in FBO pool.
GLES2 has three different possible texture formats: GL_RGBA,
GL_BGRA and GL_ALPHA. The previous implementation only had one bucket
for all three formats, so a cache lookup could reuse a texture with
the wrong format. After this fix, we can safely enable the fbo pool
when running with GLES2.
2. Refine the format matching method in
glamor_get_tex_format_type_from_pictformat.
If both a revert and swap_rb are needed, for example when using
GL_RGBA to represent PICT_b8g8r8a8, then downloading and uploading
have to be handled differently.
The picture's format is PICT_b8g8r8a8,
Then the expecting color layout is as below (little endian):
0 1 2 3 : address
a r g b
In GLES2 the supported color format is GL_RGBA and the type is
GL_UNSIGNED_BYTE, so we need to shuffle the fragment
color as:
frag_color = sample(texture).argb;
before we use glReadPixel to get it back.
For the uploading process, the shuffle is the reverse shuffle.
We still use GL_RGBA, GL_UNSIGNED_BYTE to upload the color
to a texture. Then let's see:
0 1 2 3 : address
a r g b : correct colors
R G B A : GL_RGBA with GL_UNSIGNED_BYTE
Now we need to shuffle again, the mapping rule is
r = G, g = B, b = A, a = R. Then the uploading shuffle is as
below:
frag_color = sample(texture).gbar;
After this commit, gles2 version can pass render check with all
the formats except those 1555/2101010.
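The two shuffles can be checked with a small CPU-side model. This is
only a sketch: struct rgba and the helper names are hypothetical, and
store_rgba_bytes/load_rgba_bytes merely model how GL_RGBA with
GL_UNSIGNED_BYTE lays channels out in memory (R at address 0).

```c
#include <assert.h>
#include <stdint.h>

struct rgba { uint8_t r, g, b, a; };

/* Download shuffle: frag_color = sample(texture).argb */
struct rgba
swizzle_argb(struct rgba t)
{
    struct rgba f = { t.a, t.r, t.g, t.b };
    return f;
}

/* Upload shuffle: frag_color = sample(texture).gbar */
struct rgba
swizzle_gbar(struct rgba t)
{
    struct rgba f = { t.g, t.b, t.a, t.r };
    return f;
}

/* Model of reading back as GL_RGBA/GL_UNSIGNED_BYTE. */
void
store_rgba_bytes(struct rgba f, uint8_t out[4])
{
    assert(out);
    out[0] = f.r; out[1] = f.g; out[2] = f.b; out[3] = f.a;
}

/* Model of uploading four bytes as GL_RGBA/GL_UNSIGNED_BYTE. */
struct rgba
load_rgba_bytes(const uint8_t in[4])
{
    struct rgba t = { in[0], in[1], in[2], in[3] };
    return t;
}
```

Storing swizzle_argb(c) yields the bytes a r g b at addresses 0..3,
and applying swizzle_gbar to the reloaded bytes gives back the
original color, so the two shuffles are inverses of each other.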
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Rename glamor_priv->dispatch and wrap the access to
the dispatch table with a function that also ensures the
context is bound.
dispatch = glamor_get_dispatch(glamor_priv);
...
glamor_put_dispatch(glamor_priv);
This way we catch all places where we attempt to call into GL without
a context. As an optimisation we can then do glamor_get_context();
glamor_put_context() around the rendering entry points to reduce the
frequency of having to restore the old context. (Along with allowing
the context to be recursively acquired and making the old context part of
the glamor_egl state.)
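A small model of the recursive acquisition described above. Integers
stand in for GL contexts, and struct ctx_state, get_context and
put_context are hypothetical names, not the glamor_egl
implementation.

```c
#include <assert.h>

/* Hypothetical model state. */
struct ctx_state {
    int current;    /* context bound right now */
    int ours;       /* the context our GL calls need */
    int saved;      /* context to restore on the outermost put */
    int depth;      /* nesting depth of get/put pairs */
    int switches;   /* how many times the binding changed */
};

/* Bind our context on the outermost get only. */
void
get_context(struct ctx_state *s)
{
    if (s->depth++ == 0) {
        s->saved = s->current;
        if (s->current != s->ours) {
            s->current = s->ours;
            s->switches++;
        }
    }
}

/* Restore the old context on the outermost put only. */
void
put_context(struct ctx_state *s)
{
    assert(s->depth > 0);

    if (--s->depth == 0 && s->current != s->saved) {
        s->current = s->saved;
        s->switches++;
    }
}
```

With a get/put pair around a rendering entry point, every nested pair
inside it becomes a counter update, so the binding changes only twice
regardless of how many GL helpers are called in between.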
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>