This was an attempt to avoid scratch gc creation and validation for paintwin
because that was expensive. This is not the case in current servers, and the
danger of failure to implement it correctly (as seen in all previous
implementations) is high enough to justify removing it. No performance
difference detected with x11perf -create -move -resize -circulate on Xvfb.
Leave the screen hooks for PaintWindow* in for now to avoid ABI change.
Improve exaShmPutImage performance and reuse its core in exaPutImage as it
seems faster than the previous code when the driver doesn't provide an
UploadToScreen hook.
Make sure all damage records are notified of the damage incurred by actual
ShmPutImage calls.
Remove superfluous manual damage tracking for actual PutImage calls.
Initialize system and FB copy in exaFillRegionSolid and adapt
exaGetPixmapFirstPixel to the new migration infrastructure.
This should mostly eliminate migration overhead for these, whether they are
used for acceleration or fallbacks.
As we can't actually accelerate anything interesting here, just migrate out
once and call fbSolidBoxClipped instead of taking a round trip via offscreen
memory with exaSolidBoxClipped.
Reuse pending damage region for extents and to prevent any actual migration of
pixmap contents when we're overwriting the whole pending damage region.
Remove superfluous manual damage tracking.
We finally want to catch all cases where the pixmap pointer is dereferenced
outside of exaPrepare/FinishAccess.
Also fix a couple of such cases exposed by this change.
The initiator of migration can pass in a region that defines the relevant area
of each source pixmap or the irrelevant area of the destination pixmap. By
default, the pending damage region is assumed relevant for the destination
pixmap, and everything for source pixmaps.
Thanks to Jarno Manninen for reassuring me that my own ideas for this were
feasible and for providing additional ideas.
This is mainly to avoid wasting effort for XSync(), but just reading a single
pixel directly is probably faster than DownloadFromScreen anyway. Though in
light of the latter, even larger thresholds might be useful.
Also move the swappedOut check before the migration checks because migration
can't actually occur when swapped out.
Only migrate when appropriate. In particular, don't migrate to offscreen in the
no-fallback case as copying from system memory should usually be as fast if not
faster than DownloadFromScreen, in particular if the bits need to be uploaded
to offscreen first.
* Convert rects to region and use it for damage tracking.
* When possible, defer to exaFillRegion{Solid,Tiled} using converted region.
* Always migrate for fallbacks.
* Move damage tracking out of ExaCheckPolyFillRect.
* Support planemasks, different ALUs and arbitrary tile origin.
* Leave damage tracking and non-trivial fallbacks to callers.
* Always migrate for fallbacks.
This is in preparation for using these from more other functions.
Mostly due to exaDrawableDirty() now calculating the backing pixmap coordinates
internally, for cases where they aren't trivially known. There's a new
exaPixmapDirty() function for the other cases.
which still happen somewhat frequently and were cluttering up my
fallback debugging output. x11perf says it's a major performance win in
those cases (though probably irrelevant), and it passes Xlib9.
implementation to avoid unprepared access to the tile. Also, relocate
the fbGetDrawable to avoid using a stale dest pointer after
exaSolidBoxClipped() may have migrated it. Revealed by xtest.
devPrivate.ptr when pointing at offscreen memory, outside of
exaPrepare/FinishAccess(). This was used with fakexa to find (by NULL
dereference) many instances of un-Prepared CPU access to the
framebuffer:
- GC tiles used in several ops when fillStyle == FillTiled were never
Prepared.
- Migration could lead to un-Prepared access to mask data in render's
Trapezoids and Triangles
- PutImage's UploadToScreen failure fallback failed to Prepare.
FB_ALLONES, bitsPerPixel >= 8, GXcopy cases. With the radeon driver on
my machine, this gives about 10% speedup in PutImage
10x10 and 500x500, and 40% speedup for 10x10 ShmPutImage, up to 65%
improvement in 500x500 ShmPutImage. Also fixes a crasher in GetImage
that slipped in at the last minute.
ZPixmap, planeMask ~= FB_ALLONES, bitsPerPixel >= 8 case. I'm pretty
convinced that this is the only case that we care about at all. Tested
with xwd -root and xwd on a gnome-terminal, in a composited environment
or not.
manual conversion to allow for different migration schemes to be
implemented reasonably, but does include some minor improvements such
as accounting for pinned pixmaps not being acceleratable, and for our
current GetImage and GetSpans not being accelerated.
when extending the driver interface. The card and accel structures are
merged into the ExaDriverRec, which is to be allocated using
exaDriverAlloc(). The driver structure also grows exa_major and
exa_minor, which drivers fill in and have checked by EXA
(double-checking that the driver really did check that the EXA version
was correct). Removes exaInitCard(), which is replaced by the driver
filling in the rec by hand, and the exaGetVersion() and related
EXA_*VERSION which are replaced by always using the XFree86 loadable
module versioning.
dependencies. It was nearly abstract enough already to be used by
multiple DDXes. This will be useful for EXA development through
providing a fake acceleration implementation within Xephyr, so that
testing can be done on new EXA code without worrying about buggy
drivers.