Since commit f07f18231a ('EXA: Allow using
exaCompositeRects also when we can't use a mask in exaGlyphs.') we were
checking the wrong set of coordinates in the buffer where glyphs to be rendered
are accumulated when no mask is used in exaGlyphs.
This fixes occasional glyph corruption which can be corrected with redraws, in
particular with Qt4.
Thanks to Maarten Maathuis for asking the right question: 'where do we protect
against evicting glyphs that are still needed?'
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Preserve the EXA ABI by introducing a new driver flag EXA_SUPPORTS_PREPARE_AUX.
If the driver doesn't set this flag, we have to assume any Prepare/FinishAccess
driver hooks can't handle the EXA_PREPARE_AUX* indices, so we move out such
pixmaps at PrepareAccess time.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=18710 .
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
This should give the full benefits of the glyph cache even when we can't use a
mask.
This also means we no longer need to scan the glyphs to see if they overlap,
we can just use a mask or not as the client asks.
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=18710 .
As this can't work without new EXA_PREPARE_AUX* indices, this requires a major
version bump, so we can also drop the UploadToScratch driver hook and
ExaOffscreenSwap*(). So this also fixes
http://bugs.freedesktop.org/show_bug.cgi?id=20213 .
Moreover, introduce EXA_DRIVER_KNOWN_MAJOR to break compilation of drivers
which may not be able to handle EXA_PREPARE_AUX*, giving instructions how to
make them build again in the #error message.
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
This reverts commit 97c1cbc702.
- Sorry for the thinko, pending damage is often not fragmentated.
- Should the dst region become fragmentated, you actually want to copy more to unfragmentate it.
- The src optimisation is more aggressive and possibly harmful in light of the new initial state of pixmaps.
- There is now actually a performance improvement by almost always keeping the number of rects low.
- use DEST in the createPixmap wrapper, because stipple already takes MASK (in case someone uses swappers).
- Anticipate some of the less common situations when fbValidateDrawable will access tile related pixmaps.
- I did some testing with full fallbacks forced by the driver.
- I ran rendercheck, expedite and the (full) x11perf test suite.
- Thanks to ajax for pointing out this should be unneeded.
This is probably required, but apparently not sufficient, for making migration
heuristics other than "always" work correctly again. Not that I really care
about them...
A grep on xorg/* revealed there's no consumer of this define.
Quote Alan Coopersmith:
"The consumer was in past versions of the headers now located
in proto/x11proto - for instance, in X11R6.0's xc/include/Xproto.h,
all the event definitions were only available if NEED_EVENTS were
defined, and all the reply definitions required NEED_REPLIES.
Looks like Xproto.h dropped them by X11R6.3, which didn't have
the #ifdef's anymore, so these are truly ancient now."
Signed-off-by: Peter Hutterer <peter.hutterer@redhat.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Save in a few special cases, _X_EXPORT should not be used in C source
files. Instead, it should be used in headers, and the proper C source
include that header. Some special cases are symbols that need to be
shared between modules, but not expected to be used by external drivers,
and symbols that are accessible via LoaderSymbol/dlopen.
This patch also adds conditionally some new sdk header files, depending
on extensions enabled. These files were added to match pattern for
other extensions/modules, that is, have the headers "deciding" symbol
visibility in the sdk. These headers are:
o Xext/panoramiXsrv.h, Xext/panoramiX.h
o fbpict.h (unconditionally)
o vidmodeproc.h
o mioverlay.h (unconditionally, used only by xaa)
o xfixes.h (unconditionally, symbols required by dri2)
LoaderSymbol and similar functions now don't have different prototypes,
in loaderProcs.h and xf86Module.h, so that both headers can be included,
without the need of defining IN_LOADER.
xf86NewInputDevice() device prototype readded to xf86Xinput.h, but
not exported (and with a comment about it).
This is the biggest "visibility" patch. Instead of doing a "export"
symbol on demand, export everything in the sdk, so that if some module
fails due to an unresolved symbol, it is because it is using a symbol
not in the sdk.
Most exported symbols shouldn't really be made visible, neither
advertised in the sdk, as they are only used by a single shared object.
Symbols in the sdk (or referenced in sdk macros), but not defined
anywhere include:
XkbBuildCoreState()
XkbInitialMap
XkbXIUnsupported
XkbCheckActionVMods()
XkbSendCompatNotify()
XkbDDXFakePointerButton()
XkbDDXApplyConfig()
_XkbStrCaseCmp()
_XkbErrMessages[]
_XkbErrCode
_XkbErrLocation
_XkbErrData
XkbAccessXDetailText()
XkbNKNDetailMaskText()
XkbLookupGroupAndLevel()
XkbInitAtoms()
XkbGetOrderedDrawables()
XkbFreeOrderedDrawables()
XkbConvertXkbComponents()
XkbWriteXKBSemantics()
XkbWriteXKBLayout()
XkbWriteXKBKeymap()
XkbWriteXKBFile()
XkbWriteCFile()
XkbWriteXKMFile()
XkbWriteToServer()
XkbMergeFile()
XkmFindTOCEntry()
XkmReadFileSection()
XkmReadFileSectionName()
InitExtInput()
xf86CheckButton()
xf86SwitchCoreDevice()
RamDacSetGamma()
RamDacRestoreDACValues()
xf86Bpp
xf86ConfigPix24
xf86MouseCflags[]
xf86SupportedMouseTypes[]
xf86NumMouseTypes
xf86ChangeBusIndex()
xf86EntityEnter()
xf86EntityLeave()
xf86WrapperInit()
xf86RingBell()
xf86findOptionBoolean()
xf86debugListOptions()
LoadSubModuleLocal()
LoaderSymbolLocal()
getInt10Rec()
xf86CurrentScreen
xf86ReallocatePciResources()
xf86NewSerialNumber()
xf86RandRSetInitialMode()
fbCompositeSolidMask_nx1xn
fbCompositeSolidMask_nx8888x0565C
fbCompositeSolidMask_nx8888x8888C
fbCompositeSolidMask_nx8x0565
fbCompositeSolidMask_nx8x0888
fbCompositeSolidMask_nx8x8888
fbCompositeSrc_0565x0565
fbCompositeSrc_8888x0565
fbCompositeSrc_8888x0888
fbCompositeSrc_8888x8888
fbCompositeSrcAdd_1000x1000
fbCompositeSrcAdd_8000x8000
fbCompositeSrcAdd_8888x8888
fbGeneration
fbIn
fbOver
fbOver24
fbOverlayGeneration
fbRasterizeEdges
fbRestoreAreas
fbSaveAreas
composeFunctions
VBEBuildVbeModeList()
VBECalcVbeModeIndex()
TIramdac3030CalculateMNPForClock()
shadowBufPtr
shadowFindBuf()
miRRGetScreenInfo()
RRSetScreenConfig()
RRModePruneUnused()
PixmanImageFromPicture()
extern int miPointerGetMotionEvents()
miClipPicture()
miRasterizeTriangle()
fbPush1toN()
fbInitializeBackingStore()
ddxBeforeReset()
SetupSprite()
InitSprite()
DGADeliverEvent()
SPECIAL CASES
o defined as _X_INTERNAL
xf86NewInputDevice()
o defined as static
fbGCPrivateKey
fbOverlayScreenPrivateKey
fbScreenPrivateKey
fbWinPrivateKey
o defined in libXfont.so, but declared in xorg/dixfont.h
GetGlyphs()
QueryGlyphExtents()
QueryTextExtents()
ParseGlyphCachingMode()
InitGlyphCaching()
SetGlyphCachingMode()
This patch exports all symbols required by the compilable
(in a x86 linux computer) xorg/driver/* modules.
Still missing symbols worth mentioning are:
sunleo
miFindMaxBand no longer available
intel (uxa/uxa-accel.c)
fbShmPutImage no longer available (and should have been static)
mga
MGAGetClientPointer (should come from matrox's libhal)
This is not a definitive "visibility" patch, as all it does is to
export missing symbols, but the modules that current don't compile,
may require more symbols once fixed, and third party drivers should
also require more symbols exported.
A "definitive" patch should export symbols defined in the sdk.
We can get a case with gnome-terminal + links, where we get two arrays
of glyphs all with 0 width and 0 heights in them. If this happens
we manage to get to this case without any buffer setup and segfault.
- By adding a small hack to the xserver i was able to easily test the performance of the normally rare direct case (using cairo).
- It turned out to be 70% slower for me (large test on an otherwise idle computer), which seems enough of a reason to remove it.
- AddTraps could also use a 2nd look, but since noone is using that it's a bit hard and less useful to test.
- Redo damage naming for more consistency.
- Call post submission functions only where appropriate.
- EXA can now live without it's odd damage workarounds.
There's no reason to not just dispatch this straight into the GC. As a
bonus, if you do so, damage wraps correctly, and thus swcursor works.
The side effect is it's no longer possible to override ShmPutImage with
ShmRegisterFuncs().
Also remove the (broken) damage tracking for same from EXA, since it didn't
work right, and is now superfluous.
It's buggy without Composite acceleration (leading to cropped glyphs) and not
really useful in that case anyway. The bug probably still needs to be found and
fixed for drivers that provide a PrepareComposite hook but can't accelerate
text rendering though.
Recording damage from other operations (e.g. creating a client damage record)
may confuse the migration code resulting in corruption.
Option "EXAOptimizeMigration" appears safe now, so enable it by default. Also
remove it from the manpage, as it should only be necessary on request in the
course of bug report diagnostics anymore.
When possible, use UploadToScreen() rather than CompositePicture()
to upload glyphs onto the glyph cache pixmap. This avoids allocating
offscreen memory for each glyph making management of offscreen
areas much more efficient.
Add a function to composite multiple independent rectangles
from the same source to the same destination in a single
operation: this is useful for building a glyph mask.
Add back exaGlyphs(); the new version copies the glyph images
onto a single large glyph pixmap and draws from their to the
destination surface. This reduces the management of small
offscreen areas and will allow us to avoid texture unit setup
between each glyph.
* Make sure available areas are considered to have no eviction cost. This seems
to help for https://bugs.freedesktop.org/show_bug.cgi?id=15513 but I'm afraid
that may just be coincidence.
* Only calculate eviction cost of each area once for each eviction pass.
Safeguard against potential (though unlikely) division by zero.
* Cosmetic enhancements: Name eviction cost related variables 'cost' instead of
'score' to emphasize that smaller values are better, update Doxygen file
comment to the way eviction works now.
In some cases we can still do the copying in hardware even if the
dimensions of the pixmaps are out of range. This is true when the boxes
that we're to copy are all in the card's range.
Reduce the cost of the inner loop, by keeping a set of pointers to the
first and the last areas in the series, subtracting the cost of the first
area from the score, and adding the cost of the last area while walking
the list. This commit also moves the scanning loop from exaOffscreenAlloc
into a separate function.
Idea by Michel Dänzer.
Replace the current score keeping algorithm with a rolling counter that's
incremented in ExaOffscreenMarkUsed, with the previous value being stored
in the area. exaOffscreenAlloc uses the difference between the counter
value and the value in the area when deciding which area to evict.
It now also takes the size of the areas into account, and favors evicting
smaller areas.
The credit for these ideas goes to Michel Dänzer.
These hints allow an acceleration architecture to optimize allocation of certain
types of pixmaps, such as pixmaps that will serve as backing pixmaps for
redirected windows.
This reverts commit 1365aeff54.
It defeated the optimization for drivers that don't provide a CreatePixmap
hook. The optimization makes no sense for drivers that do anyway, so disable
it for them completely.
This adds hooks for the driver to access Create/DestroyPixmap and ModifyPixmapHe
ader.
It allocates a 0 sized pixmap using fb and calls the driver routine to do
work of allocating the actual memory.
ModifyPixmapHeader is mainly required for hooking the screen pixmap which
isn't create by normal methods
Not sure what I was thinking when I wrote this... it would cause the box
coordinates to be off for exaCopyNtoNTwoDir or fallbacks.
Thanks to Tilman Sauerbeck for pointing out the problem on IRC and testing the
fix.
If the driver isn't compatible to the server, all bets are off anyway wrt
the contents of the fields that we're validating, which can lead to bogus
error messages.
This was an attempt to avoid scratch gc creation and validation for paintwin
because that was expensive. This is not the case in current servers, and the
danger of failure to implement it correctly (as seen in all previous
implementations) is high enough to justify removing it. No performance
difference detected with x11perf -create -move -resize -circulate on Xvfb.
Leave the screen hooks for PaintWindow* in for now to avoid ABI change.
Improve exaShmPutImage performance and reuse its core in exaPutImage as it
seems faster than the previous code when the driver doesn't provide an
UploadToScreen hook.
Make sure all damage records are notified of the damage incurred by actual
ShmPutImage calls.
Remove superfluous manual damage tracking for actual PutImage calls.
Exclude bits that will be overwritten from migration.
Use exaGlyphs even when Composite can't be accelerated, to avoid PolyFillRect
roundtrip via offscreen memory.
Initialize mask pixmap in exaGlyphs in FB in addition to system if the driver
provides Composite hooks to avoid migration overhead.
Remove manual damage tracking where superfluous.
Initialize system and FB copy in exaFillRegionSolid and adapt
exaGetPixmapFirstPixel to the new migration infrastructure.
This should mostly eliminate migration overhead for these, whether they are
used for acceleration or fallbacks.
As we can't actually accelerate anything interesting here, just migrate out
once and call fbSolidBoxClipped instead of taking a round trip via offscreen
memory with exaSolidBoxClipped.
Reuse pending damage region for extents and to prevent any actual migration of
pixmap contents when we're overwriting the whole pending damage region.
Remove superfluous manual damage tracking.
Only migrate once in exaTrapezoids/Triangles instead of every time in
exaRasterizeTrapezoid/AddTriangles. Adapt manual damage tracking to new
infrastructure.
Also move definition of NeedsComponent() closer to where it's used.
We finally want to catch all cases where the pixmap pointer is dereferenced
outside of exaPrepare/FinishAccess.
Also fix a couple of such cases exposed by this change.
The initiator of migration can pass in a region that defines the relevant area
of each source pixmap or the irrelevant area of the destination pixmap. By
default, the pending damage region is assumed relevant for the destination
pixmap, and everything for source pixmaps.
Thanks to Jarno Manninen for reassuring me that my own ideas for this were
feasible and for providing additional ideas.
over to new system.
Need to update documentation and address some remaining vestiges of
old system such as CursorRec structure, fb "offman" structure, and
FontRec privates.
Composite's automatic redirection is a more general mechanism than the
ad-hoc BS machinery, so it's much prettier to implement the one in terms
of the other. Composite now wraps ChangeWindowAttributes and activates
automatic redirection for windows with backing store requested. The old
backing store infrastructure is completely gutted: ABI-visible structures
retain the function pointers, but they never get called, and all the
open-coded conditionals throughout the DIX layer to implement BS are gone.
Note that this is still not a strictly complete implementation of backing
store, since Composite will throw the bits away on unmap and therefore
WhenMapped and Always hints are equivalent.
The fb pointer would be left uninitialized when exaPixmapIsOffscreen
returned false. When it returned true and the pixmap was damaged,
fb would be initialized from the pixmap's devPrivate.ptr before the
exaDoMigration and exaPrepareAccess calls, at which point
devPrivate.ptr would still be pointing at offscreen memory.
miTrapezoids creates an alpha pixmap and initializes the contents
using PolyFillRect, which causes the pixmap to be moved in for
acceleration. The subsequent call to RasterizeTrapezoid won't be
accelerated by EXA, which causing the pixmap to be moved back out
again.
By wrapping Trapezoids and using ExaCheckPolyFillRect instead of
PolyFillRect to initialize the pixmap, we avoid this roundtrip.
This avoids some inefficiency in creating a temporary Picture
for every glyph at rendering time. My measurements with an i965
showed the previous patch causing a 10-15% slowdown for NoAccel
and XAA cases, (while providing an 18% speedup for EXA).
With this change, the NoAccel and XAA performance regression is
eliminated, and the overall EXA speedup, (before any of the
glyphs-as-pixmaps work), is now 32%.
Instead of system-memory data which prevents accelerated
compositing of glyphs, (at least without forcing an upload
of the glyph data before compositing).
If the pixel in framebuffer memory isn't modified since we uploaded it, we
can just read from the system memory copy, wihch avoids both a readback and
an accelerator stall.
In principle this function is still wrong, and all the framebuffer pixel
access should be going through (w)fb so we can get pixel layout corrections.
This is mainly to avoid wasting effort for XSync(), but just reading a single
pixel directly is probably faster than DownloadFromScreen anyway. Though in
light of the latter, even larger thresholds might be useful.
Also move the swappedOut check before the migration checks because migration
can't actually occur when swapped out.
Only migrate when appropriate. In particular, don't migrate to offscreen in the
no-fallback case as copying from system memory should usually be as fast if not
faster than DownloadFromScreen, in particular if the bits need to be uploaded
to offscreen first.
* Defer to simpler hooks in more cases (inspired by XAA behaviour).
* Move damage tracking from lower to higher level functions.
* Always migrate for fallbacks.
* Convert rects to region and use it for damage tracking.
* When possible, defer to exaFillRegion{Solid,Tiled} using converted region.
* Always migrate for fallbacks.
* Move damage tracking out of ExaCheckPolyFillRect.
* Support planemasks, different ALUs and arbitrary tile origin.
* Leave damage tracking and non-trivial fallbacks to callers.
* Always migrate for fallbacks.
This is in preparation for using these from more other functions.
Additionally, protect libcw setup behind checks for Render, to avoid
segfaulting if Render isn't available (xnest).
The previous setup was an ABI-preserving dance, which is better nuked now.
Now, anything that needs libcw must explicitly initialize it, and
miDisableCompositeWrapper (previously only called by EXA and presumably binary
drivers) is gone.
This is a new behavior for version 2.1 of EXA, and only takes effect if the
driver has requested that. Otherwise, the previous behavior remains the same.