v2:
- Require power-of-two bpp in ScreenInit
- Eliminate fbCreatePixmapBpp
v3
- Squash in the exa and glamor changes so we can remove pRotatedPixmap
in the same stroke.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
The other paths don't build or work, PCI and other buses are almost
always 32 bit data paths, and X doesn't really support pixels bigger
than that anyway.
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
The width parameter is used to disable the blit fast-path (memcpy) when
source and destination rows overlap in memory. This check was added in [0].
Unfortunately, the calculation to determine if source and destination
lines overlapped was incorrect:
(1) it converts width from pixels to bytes, but width is actually in
bits, not pixels.
(2) it adds this byte offset to dst/srcLine, which implicitly converts
the offset from bytes to sizeof(FbBits).
Fix both of these by converting addresses to byte pointers and width
to bytes and doing comparisons on the resulting byte address.
For example:
A 32-bpp 1366 pixel-wide row will have
width = 1366 * 32 = 43712 bits
bpp = 32
(bpp >> 3) = 4
width * (bpp >> 3) = 174848 FbBits
(FbBits *)width => 699392 bytes
So, "careful" was true if the destination line was within 699392 bytes,
instead of just within its 1366 * 4 = 5464 byte row.
This bug causes us to take the slow path for large non-overlapping rows
that are "close" in memory. As a data point, XGetImage(1366x768) on my
ARM chromebook was taking ~140 ms, but with this fixed, it now takes
about 60 ms.
XGetImage() -> exaGetImage() -> fbGetImage -> fbBlt()
[0] commit e32cc0b4c8
Author: Adam Jackson <ajax@redhat.com>
Date: Thu Apr 21 16:37:11 2011 -0400
fb: Fix memcpy abuse
The memcpy fast path implicitly assumes that the copy walks
left-to-right. That's not something memcpy guarantees, and newer glibc
on some processors will indeed break that assumption. Since we walk a
line at a time, check the source and destination against the width of
the blit to determine whether we can be sloppy enough to allow memcpy.
(Having done this, we can remove the check for !reverse as well.)
v3: Convert to byte units
This first checks to make sure the blt is byte aligned, converts all
of the data to byte units and then compares for byte address range
overlap between source and dest.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
fbblt.c: In function 'fbBlt':
fbblt.c:76:16: warning: declaration of 'src' shadows a previous local
fbblt.c:52:13: warning: shadowed declaration is here
fbblt.c:77:16: warning: declaration of 'dst' shadows a previous local
fbblt.c:52:19: warning: shadowed declaration is here
fbbltone.c: In function 'fbBltPlane':
fbbltone.c:742:13: warning: declaration of 'w' shadows a previous local
fbbltone.c:725:9: warning: shadowed declaration is here
Signed-off-by: Yaakov Selkowitz <yselkowitz@users.sourceforge.net>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
This is strictly the application of the script 'x-indent-all.sh'
from util/modular. Compared to the patch that Daniel posted in
January, I've added a few indent flags:
-bap
-psl
-T PrivatePtr
-T pmWait
-T _XFUNCPROTOBEGIN
-T _XFUNCPROTOEND
-T _X_EXPORT
The typedefs were needed to make the output of sdksyms.sh match the
previous output, otherwise, the code is formatted badly enough that
sdksyms.sh generates incorrect output.
The generated code was compared with the previous version and found to
be essentially identical -- "assert" line numbers and BUILD_TIME were
the only differences found.
The comparison was done with this script:
dir1=$1
dir2=$2
for dir in $dir1 $dir2; do
(cd $dir && find . -name '*.o' | while read file; do
dir=`dirname $file`
base=`basename $file .o`
dump=$dir/$base.dump
objdump -d $file > $dump
done)
done
find $dir1 -name '*.dump' | while read dump; do
otherdump=`echo $dump | sed "s;$dir1;$dir2;"`
diff -u $dump $otherdump
done
Signed-off-by: Keith Packard <keithp@keithp.com>
Acked-by: Daniel Stone <daniel@fooishbar.org>
Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>
The memcpy fast path implicitly assumes that the copy walks
left-to-right. That's not something memcpy guarantees, and newer glibc
on some processors will indeed break that assumption. Since we walk a
line at a time, check the source and destination against the width of
the blit to determine whether we can be sloppy enough to allow memcpy.
(Having done this, we can remove the check for !reverse as well.)
On an Intel Core i7-2630QM with an NVIDIA GeForce GTX 460M running in
NoAccel, the broken code and various fixes for -copywinwin{10,100,500}
gives (edited to fit in 80 columns):
1: Disable the fastpath entirely
2: Replace memcpy with memmove
3: This fix
4: The code before this fix
1 2 3 4 Operation
------ --------------- --------------- --------------- ------------
258000 269000 ( 1.04) 544000 ( 2.11) 552000 ( 2.14) Copy 10x10
21300 23000 ( 1.08) 43700 ( 2.05) 47100 ( 2.21) Copy 100x100
960 962 ( 1.00) 1990 ( 2.09) 1990 ( 2.07) Copy 500x500
So it's a modest performance hit, but correctness demands it, and it's
probably worth keeping the 2x speedup from having the fast path in the
first place.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
This was generated by:
cd fb
coan source --replace -DFB_SCREEN_PRIVATE -DFB_24BIT -DFB_24_32BIT -DFB_SCREEN_PRIVATE -UFBNOPIXADDR -UFBNO24BIT -UFBNO24_32 *.[ch]
A follow up patch readds the FB_24_32BIT define for Intel UXA.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Save in a few special cases, _X_EXPORT should not be used in C source
files. Instead, it should be used in headers, and the proper C source
include that header. Some special cases are symbols that need to be
shared between modules, but not expected to be used by external drivers,
and symbols that are accessible via LoaderSymbol/dlopen.
This patch also adds conditionally some new sdk header files, depending
on extensions enabled. These files were added to match pattern for
other extensions/modules, that is, have the headers "deciding" symbol
visibility in the sdk. These headers are:
o Xext/panoramiXsrv.h, Xext/panoramiX.h
o fbpict.h (unconditionally)
o vidmodeproc.h
o mioverlay.h (unconditionally, used only by xaa)
o xfixes.h (unconditionally, symbols required by dri2)
LoaderSymbol and similar functions now don't have different prototypes,
in loaderProcs.h and xf86Module.h, so that both headers can be included,
without the need of defining IN_LOADER.
xf86NewInputDevice() device prototype readded to xf86Xinput.h, but
not exported (and with a comment about it).
This is the biggest "visibility" patch. Instead of doing a "export"
symbol on demand, export everything in the sdk, so that if some module
fails due to an unresolved symbol, it is because it is using a symbol
not in the sdk.
Most exported symbols shouldn't really be made visible, neither
advertised in the sdk, as they are only used by a single shared object.
Symbols in the sdk (or referenced in sdk macros), but not defined
anywhere include:
XkbBuildCoreState()
XkbInitialMap
XkbXIUnsupported
XkbCheckActionVMods()
XkbSendCompatNotify()
XkbDDXFakePointerButton()
XkbDDXApplyConfig()
_XkbStrCaseCmp()
_XkbErrMessages[]
_XkbErrCode
_XkbErrLocation
_XkbErrData
XkbAccessXDetailText()
XkbNKNDetailMaskText()
XkbLookupGroupAndLevel()
XkbInitAtoms()
XkbGetOrderedDrawables()
XkbFreeOrderedDrawables()
XkbConvertXkbComponents()
XkbWriteXKBSemantics()
XkbWriteXKBLayout()
XkbWriteXKBKeymap()
XkbWriteXKBFile()
XkbWriteCFile()
XkbWriteXKMFile()
XkbWriteToServer()
XkbMergeFile()
XkmFindTOCEntry()
XkmReadFileSection()
XkmReadFileSectionName()
InitExtInput()
xf86CheckButton()
xf86SwitchCoreDevice()
RamDacSetGamma()
RamDacRestoreDACValues()
xf86Bpp
xf86ConfigPix24
xf86MouseCflags[]
xf86SupportedMouseTypes[]
xf86NumMouseTypes
xf86ChangeBusIndex()
xf86EntityEnter()
xf86EntityLeave()
xf86WrapperInit()
xf86RingBell()
xf86findOptionBoolean()
xf86debugListOptions()
LoadSubModuleLocal()
LoaderSymbolLocal()
getInt10Rec()
xf86CurrentScreen
xf86ReallocatePciResources()
xf86NewSerialNumber()
xf86RandRSetInitialMode()
fbCompositeSolidMask_nx1xn
fbCompositeSolidMask_nx8888x0565C
fbCompositeSolidMask_nx8888x8888C
fbCompositeSolidMask_nx8x0565
fbCompositeSolidMask_nx8x0888
fbCompositeSolidMask_nx8x8888
fbCompositeSrc_0565x0565
fbCompositeSrc_8888x0565
fbCompositeSrc_8888x0888
fbCompositeSrc_8888x8888
fbCompositeSrcAdd_1000x1000
fbCompositeSrcAdd_8000x8000
fbCompositeSrcAdd_8888x8888
fbGeneration
fbIn
fbOver
fbOver24
fbOverlayGeneration
fbRasterizeEdges
fbRestoreAreas
fbSaveAreas
composeFunctions
VBEBuildVbeModeList()
VBECalcVbeModeIndex()
TIramdac3030CalculateMNPForClock()
shadowBufPtr
shadowFindBuf()
miRRGetScreenInfo()
RRSetScreenConfig()
RRModePruneUnused()
PixmanImageFromPicture()
extern int miPointerGetMotionEvents()
miClipPicture()
miRasterizeTriangle()
fbPush1toN()
fbInitializeBackingStore()
ddxBeforeReset()
SetupSprite()
InitSprite()
DGADeliverEvent()
SPECIAL CASES
o defined as _X_INTERNAL
xf86NewInputDevice()
o defined as static
fbGCPrivateKey
fbOverlayScreenPrivateKey
fbScreenPrivateKey
fbWinPrivateKey
o defined in libXfont.so, but declared in xorg/dixfont.h
GetGlyphs()
QueryGlyphExtents()
QueryTextExtents()
ParseGlyphCachingMode()
InitGlyphCaching()
SetGlyphCachingMode()
Use the READ and WRITE macros to wrap memory accesses that could be in video
memory. Add MEMCPY_WRAPPED and MEMSET_WRAPPED macros to wrap memcpy and
memset, respectively.
Add XSERV_t, TRANS_SERVER, TRANS_REOPEN to quash warnings.
Add #include <dix-config.h> or <xorg-config.h>, as appropriate, to all
source files in the xserver/xorg tree, predicated on defines of
HAVE_{DIX,XORG}_CONFIG_H. Change all Xfont includes to
<X11/fonts/foo.h>.